7.24.2010

PyMET: a simple API client for Trimet data

I've been hacking on this Trimet data wrapper on and off for about a year now, and I recently decided to give it a big push. My biggest point of contention with the project has been dealing with irregular XML API data and modeling it as Python objects. It's an annoying and frustrating task in any circumstance. You have to parse the XML, then figure out how to shove each value into your object's properties, one property at a time. This means you spend more time figuring out how to map the data than actually using it.

I am really tired of this situation; it flat out sucks. That's why I've gotten lazy and tried to make it better. My new Trimet classes are based on a lazy XML API class which handles this mapping for you by introspecting the returned data. Thanks to the BeautifulSoup project, discovering the attributes, text values and children of XML elements is very simple. My only job was to provide the mapping. For this, I recursively loop over the elements and make a few important choices:
  1. add an element's attributes directly, and descend into it if it has children
  2. put all elements which have siblings with the same tag name into a list
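To make those two choices concrete, here is a minimal sketch of the idea. It is not the PyMET code: it uses the standard library's xml.etree.ElementTree instead of BeautifulSoup so it runs with no extra installs, the function name element_to_data and the "text" key are made up for illustration, and the sample XML is invented rather than a real Trimet response.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def element_to_data(element):
    """Convert an element into nested dicts and lists (illustrative only)."""
    # Choice 1: start with the element's own attributes.
    data = dict(element.attrib)

    # Keep any non-whitespace text under a hypothetical "text" key.
    text = (element.text or "").strip()
    if text:
        data["text"] = text

    # Count tag names so repeated siblings can be grouped together.
    tag_counts = Counter(child.tag for child in element)

    for child in element:
        value = element_to_data(child)  # descend into children
        if tag_counts[child.tag] > 1:
            # Choice 2: siblings sharing a tag name go into a list.
            data.setdefault(child.tag, []).append(value)
        else:
            data[child.tag] = value
    return data


# Invented sample data, only loosely shaped like an arrivals feed.
SAMPLE = """
<resultSet queryTime="1279990000">
  <location desc="Example stop" locid="7777"/>
  <arrival route="9" status="estimated"/>
  <arrival route="17" status="scheduled"/>
</resultSet>
"""

result = element_to_data(ET.fromstring(SAMPLE.strip()))
```

Here result["location"] comes back as a single dict while result["arrival"] is a list of two dicts. This sketch doesn't handle an attribute and a child tag sharing the same name; a real version would need to pick a rule for that.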
With these goals for the algorithm, implementation was not too difficult. The main thing to keep track of is when to append data vs. when to just add a new dict object (I am converting the entire XML tree into a dict/list object). I'm also doing some fun stuff with __getattr__ so you can access properties of the returned object based on your schema and the data itself.
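The __getattr__ part might look something like the sketch below. Again, this is a guess at the general technique and not the actual PyMET class: the names LazyNode and wrap are hypothetical, and the dict being wrapped is invented sample data.

```python
class LazyNode:
    """Expose the keys of a dict as attributes (hypothetical name)."""

    def __init__(self, data):
        self._data = data

    def __getattr__(self, name):
        # __getattr__ only runs when normal attribute lookup fails.
        # Refuse private names so a missing _data can't recurse forever.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            value = self._data[name]
        except KeyError:
            raise AttributeError(name) from None
        return wrap(value)


def wrap(value):
    """Wrap nested dicts and lists so attribute access keeps working."""
    if isinstance(value, dict):
        return LazyNode(value)
    if isinstance(value, list):
        return [wrap(item) for item in value]
    return value


# Invented data in the dict/list shape described above.
data = {
    "queryTime": "1279990000",
    "arrival": [{"route": "9"}, {"route": "17"}],
}

result = LazyNode(data)
routes = [arrival.route for arrival in result.arrival]
```

Wrapping on access rather than up front keeps the underlying dict/list untouched, so you can still get at the raw data if the attribute style gets in the way.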

Now that I have the obvious bugs worked out, adding new XML APIs based on this should be trivial. I think civic apps has some which I might try out.

* Warning: this code is still pretty rough and needs some cleanup, but if you know what you're doing it should be easy to follow the examples in arrivals.py.