Wednesday, January 11, 2012

clojure-csv and Python dialects

On a recent project, I had the choice of using either Clojure or Python to parse some CSV files. Python ships with a csv module in its standard library; Clojure has no built-in equivalent. However, being slightly enamored of Clojure, I started investigating my options. The best-known CSV-parsing library seems to be David Santiago's clojure-csv, which contains functions for reading and writing CSV files.

Problem solved, right? Not so fast. I not only had to parse these files, I also had to account for format differences between the systems reading and writing them: the delimiter, the quote character, the line terminator, and so on. Python conveniently has a Dialect class, which tells the CSV parser how a particular file is formatted. clojure-csv stores some of these settings in dynamic variables, which means the code consuming the library can easily rebind them.
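For reference, here is a minimal sketch of what the Python side can look like. The PipeDialect class, the parse helper, and the sample data are made up for illustration; only csv.Dialect, csv.reader, and the attribute names come from the standard library.

```python
import csv
import io


class PipeDialect(csv.Dialect):
    """Hypothetical dialect: pipe-delimited, single-quoted, LF line endings."""
    delimiter = "|"
    quotechar = "'"
    doublequote = True          # a doubled quote char inside a field means a literal quote
    skipinitialspace = False
    lineterminator = "\n"       # used by writers; readers detect line endings themselves
    quoting = csv.QUOTE_MINIMAL


def parse(text, dialect=PipeDialect):
    """Parse CSV text into a list of rows using the given dialect."""
    return list(csv.reader(io.StringIO(text, newline=""), dialect=dialect))


# Made-up sample: the second row has a quoted delimiter and an escaped quote.
sample = "name|motto\n'Smith|Jones'|'it''s fine'\n"
rows = parse(sample)
```

Every formatting decision lives in one named object, so supporting another system's files means defining another Dialect subclass rather than changing the parsing code.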

In an attempt to duplicate this functionality in Clojure, I came up with the idea of creating a format hash-map with these values in it and using a macro to bind its values to the dynamic variables in clojure-csv. Here's what I came up with:



In this code, I'm using the binding form to rebind these dynamic variables for the duration of the function body, after which they revert to their original values.

And here's a use of it:



That code reads in a sample CSV file of U.S. Presidents and spits it back out in a different format, using one binding nested inside another to do so.
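For comparison, the same read-in-one-format, write-in-another round trip might look roughly like this with Python's csv module. This is a sketch and not a translation of the Clojure code: the convert function, the two format dicts, and the sample rows are hypothetical, with plain dicts standing in for the format hash-map idea.

```python
import csv
import io


def convert(text, read_opts, write_opts):
    """Re-encode CSV text: parse with read_opts, write with write_opts."""
    rows = csv.reader(io.StringIO(text, newline=""), **read_opts)
    out = io.StringIO(newline="")
    csv.writer(out, **write_opts).writerows(rows)
    return out.getvalue()


# Hypothetical formats, expressed as plain dicts of formatting parameters.
comma_format = {"delimiter": ",", "quotechar": '"', "lineterminator": "\r\n"}
tab_format = {"delimiter": "\t", "quotechar": "'", "lineterminator": "\n"}

# Made-up sample data in the comma-separated format.
source = 'Presidency,President\r\n1,George Washington\r\n2,"Adams, John"\r\n'
converted = convert(source, comma_format, tab_format)
```

Here the formats are passed explicitly as arguments, whereas the Clojure version rebinds dynamic variables around the read and the write.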
