Problem solved, right? Not so fast: I not only had to parse these files, I also had to account for format differences between the systems reading and writing them, such as the delimiter, quote character, and line ending. Python's csv module conveniently has a Dialect class, which tells the CSV parser about these differences. clojure-csv stores some of these settings in dynamic variables, which means the code consuming the library can easily set them.
In an attempt to duplicate this functionality in Clojure, I came up with the idea of creating a format hash-map with these values in it and using a macro to bind its values to the dynamic variables in clojure-csv. Here's what I came up with:
(def input-dialect {:delimiter \,, :end-of-line "\n", :quote-char \"})
(def output-dialect {:delimiter \|, :end-of-line "\n\n", :quote-char \'})

(defmacro with-dialect [given-dialect & body]
  `(binding [*delimiter* (:delimiter ~given-dialect)
             *end-of-line* (:end-of-line ~given-dialect)
             *quote-char* (:quote-char ~given-dialect)]
     ~@body))
In this code, I'm using the binding macro to rebind these dynamic variables for the duration of the body, after which they revert to their original values.
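To see the rebind-and-restore behaviour in isolation, here is a minimal sketch that does not depend on clojure-csv at all. The var `*delimiter*` below is a hypothetical stand-in defined in the current namespace, not the library's own var, and `current-delimiter` is a helper invented for the illustration.

```clojure
;; Hypothetical stand-in for a library setting; NOT clojure-csv's own var.
(def ^:dynamic *delimiter* \,)

;; Reads the var at call time, so it sees whatever binding is active.
(defn current-delimiter []
  *delimiter*)

;; Inside binding, the var takes the new value for the calling thread.
(def inside
  (binding [*delimiter* \|]
    (current-delimiter)))

;; Once the binding form exits, the root value is visible again.
(def outside (current-delimiter))
```

Here `inside` holds the pipe character and `outside` holds the comma, which is the same mechanism with-dialect relies on.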
And here's the with-dialect macro in use:
(spit "presidents-reformatted.csv"
      (with-dialect output-dialect
        (write-csv
          (with-dialect input-dialect
            (parse-csv
              (slurp "presidents.csv"))))))
That code reads in the U.S. Presidents CSV sample found here and spits it back out in a different format, using a binding within a binding to do so.
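One thing to watch for with nested bindings is laziness. If the inner call returns a lazy sequence (I am assuming parse-csv may do so, which is worth checking against the version in use), the sequence can be realized after the inner binding has already exited, at which point the outer values apply. The sketch below shows the effect with a hypothetical `*delimiter*` var and a made-up `tag-lazily` function; neither comes from clojure-csv.

```clojure
;; Hypothetical stand-in for a library setting; NOT clojure-csv's own var.
(def ^:dynamic *delimiter* \,)

;; map is lazy, so *delimiter* is read when each element is realized,
;; not when tag-lazily is called.
(defn tag-lazily [xs]
  (map (fn [x] [x *delimiter*]) xs))

;; The unrealized sequence escapes the binding form.
(def leaked
  (binding [*delimiter* \|]
    (tag-lazily [1 2])))

;; doall forces realization while the binding is still in effect.
(def forced
  (binding [*delimiter* \|]
    (doall (tag-lazily [1 2]))))
```

Realizing `leaked` outside the binding yields pairs tagged with the root comma, while `forced` keeps the pipe. If the same applies to the parsing step, wrapping it in doall inside the inner with-dialect would keep the input dialect in effect while the file is parsed.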