csv-writing
This collection provides a simple set of routines for writing comma-separated-values (CSV) files.
There is no well-specified, generally agreed-upon format for CSV files, but there is a set of conventions, and this library tries to cleave to those conventions.
At the same time, it does provide a number of simple parameters that you can adjust in order to customize the less-well-agreed-upon aspects of serialization. One obvious design question is how to allow you to specify values for these parameters without decorating every procedure with a raft of optional parameters. I haven’t found a completely satisfactory answer, but—in somewhat the style of Neil Van Dyke’s csv-reading library—you get a reasonable default, and a way to specify a printing-parameters structure that customizes printing if you need it.
Also, users should note that in general, the CSV representation loses information; the string "TRUE", the symbol 'TRUE, and the boolean value #t all have (by default) the same printed representation, the string "TRUE". Users may certainly specify custom procedures for the printing of booleans, symbols, strings, or all values, in order to address this in whatever style makes sense for their particular application.
NOTE: In fact, you can use this package to write TSV files as well. See below for an example.
(require csv-writing)    package: csv-writing
The representation of a table is simply a list of lists of values.
The rows do not all need to be the same length.
procedure
(display-table table [port #:printing-params printing-params]) → void?
  table : table?
  port : output-port? = (current-output-port)
  printing-params : csv-printing-params? = default-csv-printing-params
For instance,
(display-table '((name title) ("joey" bottle-washer) ("margo" sign-painter 34)))
produces
name,title
joey,bottle-washer
margo,sign-painter,34
as output.
The documentation of the make-csv-printing-params procedure provides information on how to customize the printing.
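Because display-table accepts an output port, the same call can send a table to a string port or to a file. The following is a minimal sketch rather than part of the library: the names staff, captured, and write-staff-file! are made up for illustration, and the file path is left to the caller.

```racket
#lang racket
(require csv-writing)

;; Sample data, taken from the example above.
(define staff
  '((name title)
    ("joey" bottle-washer)
    ("margo" sign-painter 34)))

;; Capture the output in a string port instead of printing it.
(define captured
  (let ([out (open-output-string)])
    (display-table staff out)
    (get-output-string out)))

;; Write the same table to a file at the given (caller-chosen) path,
;; replacing any existing file.
(define (write-staff-file! path)
  (call-with-output-file path
    (λ (out) (display-table staff out))
    #:exists 'replace))
```

Calling (write-staff-file! "staff.csv") would then write the three rows shown above to a file of that name.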
procedure
(table->string table [#:printing-params printing-params]) → string?
  table : table?
  printing-params : csv-printing-params? = default-csv-printing-params
procedure
(table-row->string row [#:printing-params printing-params]) → string?
  row : list?
  printing-params : csv-printing-params? = default-csv-printing-params
So, for instance:
(table-row->string '(342 bc "def" #t))
... produces the string "342,bc,def,TRUE".
As before, the printing-params can be used to customize the printing of values.
procedure
(table-cell->string cell [printing-params]) → string?
cell : any?
printing-params : csv-printing-params? = default-csv-printing-params
procedure
(make-csv-printing-params [#:table-cell->string table-cell->string
                           #:string-cell->string string-cell->string
                           #:number-cell->string number-cell->string
                           #:boolean-cell->string boolean-cell->string
                           #:symbol-cell->string symbol-cell->string
                           #:quotes-only-when-needed? quotes-only-when-needed?
                           #:quoted-double-quote quoted-double-quote]
                          #:column-separator column-separator)
 → csv-printing-params?
  table-cell->string : procedure? = default-table-cell->string
  string-cell->string : procedure? = default-string-cell->string
  number-cell->string : procedure? = default-number-cell->string
  boolean-cell->string : procedure? = default-boolean-cell->string
  symbol-cell->string : procedure? = default-symbol-cell->string
  quotes-only-when-needed? : boolean? = #t
  quoted-double-quote : string? = "\"\""
  column-separator : ","
The table-cell->string procedure controls the translation of cell values to strings. Here’s a simple (and mostly useless) example:
(display-table '((a b) (c d)) #:printing-params (make-csv-printing-params #:table-cell->string (λ (str) "X")))
This produces the output
X,X
X,X
The default table-cell->string procedure dispatches to customizable printing procedures for strings, numbers, symbols, and booleans, and signals an error for all other kinds of data. If the user provides a different procedure for table-cell->string, then the values of procedures such as boolean-cell->string will be irrelevant, since they won’t be called.
Put differently, overriding table-cell->string is the “nuclear option”, indicating that you just want all of the default procedures to get the heck out of the way.
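A less drastic use of table-cell->string is to handle one extra kind of value and hand everything else back to the library. The sketch below assumes that calling the exported table-cell->string with a single argument applies the default parameters, as its signature above suggests; the missing struct and the "NULL" spelling are hypothetical choices, not part of csv-writing.

```racket
#lang racket
(require csv-writing)

;; `missing` is a hypothetical marker for absent data.
(struct missing ())

;; Handle the marker ourselves; delegate every other value to the
;; library's table-cell->string, which uses the default parameters.
(define (cell->string v)
  (if (missing? v)
      "NULL"
      (table-cell->string v)))

(define params-with-missing
  (make-csv-printing-params #:table-cell->string cell->string))

;; Expected result: "joey,NULL,34"
(define row-with-missing
  (table-row->string (list "joey" (missing) 34)
                     #:printing-params params-with-missing))
```

Since the fallback uses the default parameters, any other customizations would have to be passed to table-cell->string explicitly inside cell->string.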
The string-cell->string procedure is called by the default table-cell->string procedure to map strings to CSV values. So, for instance:
(display-table '(("ab" 34) ("cd" 2)) #:printing-params (make-csv-printing-params #:string-cell->string string-upcase))
...would produce the output:
AB,34
CD,2
The default string-cell->string procedure maps strings to themselves, unless they contain newlines, commas, or double-quotes, in which case it wraps them in double-quotes, and quotes double-quotes using the default quoted-double-quote.
The quotes-only-when-needed? parameter is true by default; if set to false, then all strings are wrapped in double-quotes, regardless of whether they contain dangerous characters. This parameter is ignored if you provide your own string-cell->string procedure.
The quoted-double-quote parameter is the string that is used in place of a double-quote that appears in a table cell. This parameter is ignored if you provide your own string-cell->string procedure.
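To make the quoting rules concrete, here is a small sketch that follows the description above. The names default-quoting and always-quoted are illustrative only, and the expected results in the comments are derived from that description rather than from separate documentation.

```racket
#lang racket
(require csv-writing)

;; Default behavior: only strings containing a newline, a comma, or
;; a double-quote are wrapped, and embedded double-quotes are doubled.
;; Expected result, written as CSV text:  "say ""hi""","a,b",plain
(define default-quoting
  (table-row->string '("say \"hi\"" "a,b" "plain")))

;; With quotes-only-when-needed? set to #f, every string is wrapped;
;; the number is not a string, so it is left alone.
;; Expected result, written as CSV text:  "ab",34
(define always-quoted
  (table-row->string
   '("ab" 34)
   #:printing-params
   (make-csv-printing-params #:quotes-only-when-needed? #f)))
```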
The number-cell->string procedure is called by the default table-cell->string procedure to effect the translation of numbers to CSV cells.
The default number-cell->string procedure uses ~r to produce strings for rational numbers, and signals an error otherwise.
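If a fixed number of decimal places is wanted, a custom number-cell->string can call ~r with an explicit precision. This is a sketch under the assumption that the procedure receives just the number, in the same way that string-cell->string receives just the string; two-decimals and price-row are names invented for the example.

```racket
#lang racket
(require csv-writing)

;; Format every number with exactly two digits after the decimal point.
(define (two-decimals n)
  (~r n #:precision '(= 2)))

(define price-params
  (make-csv-printing-params #:number-cell->string two-decimals))

;; Expected result: "price,3.14,2.00"
(define price-row
  (table-row->string '("price" 3.14159 2)
                     #:printing-params price-params))
```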
The symbol-cell->string procedure is called by the default table-cell->string procedure to effect the translation of symbols to CSV cells. By default, it simply maps symbols to strings using symbol->string and then calls the default string-cell->string procedure. (Specifying a custom string-cell->string procedure will not affect the behavior of the default symbol-cell->string procedure.)
The boolean-cell->string procedure is called by the default table-cell->string procedure to effect the translation of booleans to CSV cells. By default, it produces "TRUE" and "FALSE".
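The loss of information mentioned in the introduction is easy to see with booleans, and a custom boolean-cell->string is one way to reduce it. In this sketch the "#t"/"#f" spelling is an arbitrary choice and the names default-row, distinct-params, and distinct-row are illustrative.

```racket
#lang racket
(require csv-writing)

;; By default a string, a symbol, and a boolean can all print the same.
;; Expected result: "TRUE,TRUE,TRUE"
(define default-row
  (table-row->string '("TRUE" TRUE #t)))

;; Print booleans in a spelling that strings and symbols are unlikely
;; to share.
(define distinct-params
  (make-csv-printing-params
   #:boolean-cell->string (λ (b) (if b "#t" "#f"))))

;; Expected result: "TRUE,TRUE,#t"
(define distinct-row
  (table-row->string '("TRUE" TRUE #t)
                     #:printing-params distinct-params))
```

The string and the symbol still collide here; separating them would take a custom symbol-cell->string or string-cell->string as well.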
The column-separator string is used to separate columns. Typically, for a CSV file, this is the string "," (hence the name "comma-separated"). If you supply a tab character, you’ll get a TSV file instead. In fact, you might just want to use default-tsv-printing-params.
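Any string can serve as the separator. As a small sketch, the following uses a semicolon; semicolon-params and semicolon-row are names made up for the example.

```racket
#lang racket
(require csv-writing)

;; Use ";" between columns instead of ",".
(define semicolon-params
  (make-csv-printing-params #:column-separator ";"))

;; Expected result: "a;b c;14"
(define semicolon-row
  (table-row->string '("a" "b c" 14)
                     #:printing-params semicolon-params))
```

The description of the default string-cell->string mentions only newlines, commas, and double-quotes, so cells that contain the new separator may need a custom string-cell->string, much as the TSV parameters supply their own converter.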
procedure
(csv-printing-params? v) → Boolean
v : Any
(define default-tsv-printing-params
  (make-csv-printing-params
   #:string-cell->string tsv-string-converter
   #:column-separator "\t"))

;; strings with tabs cause errors, others are passed unchanged
(define (tsv-string-converter str)
  (match str
    [(regexp #px"\t") (error 'tsv-string-converter "no tabs allowed: ~e" str)]
    [other str]))
Here’s an example of using it:
(table->string '(("a" "b" 14) ("c" "d e" 278)) #:printing-params default-tsv-printing-params)
Note that it’s entirely possible that you’ll want to customize this, for instance by providing a special mapping for sql-null? or other values.
1 Suggestions Etc.
The goal is for this library to be useful to other people. Please, by all means report problems. Pull requests are especially welcome.