Showing posts with label Python. Show all posts
Showing posts with label Python. Show all posts

1.07.2011

Beginnings of a python analyzer

Analyzing the source code your running is a massive task. An enormous amount of effort goes into search for and fixing software faults. One thing I've always wondered is how would an analyzer (static or runtime) look if it were examining python. Being a dynamically typed language, python can be extremely tricky to analyze from just its syntax. However, thanks to Armin Ronacher it does have one excellent feature which could allow deeper analysis; direct access to a parsed ast. With this module you can examine, manipulate or mangle the language any way you see fit.
I've decided to attack the number one bug creator in my opinion, passing None to a function. If this were C, that would be like passing NULL. I can't really see why there would be a good place to do that and in my experiences its usually causing a bug. Beyond the basic reasons, its just semantically wrong to be passing around nothing. It sort of defeats the point of providing arguments at all. In python, we've got really nice default argument syntax so if you must have None for a argument, make that the default and don't pass anything. If you can't avoid this due to some API then I'm sorry.
Anyway, here's how I went after it; first I created a class which extends NodeTransformer and added this method:

def visit_Call(self, node):        
 self.generic_visit(node)        
 for x in chain(node.args, node.keywords):            
  x = x.value if hasattr(x, 'value') else x            
  if isinstance(x, Name) and x.id == 'None':            
   warn("Passing None Argument to function", RuntimeWarning)
 return node

Then I just use that method in a simple function to create the ast and walk it:

with open(sys.argv[1]) as f:    
 source = f.read()    
 ast = parse(source, sys.argv[1])    
 node = StaticTransformer().visit(ast)

Now when I call this script and pass in any compilable Python code, if there are any warnings I'll know about them and be able to fix them before the code gets used elsewhere. It is easy to add more methods to check ast nodes, you just follow the visit_<Node> pattern for naming the functions in your NodeTransformer class. I've created a gist with the code that I've written so far, feel free to fork and contribute!

10.26.2010

Being lazy in Python

One of my favorite features of Haskell is its lazy evaluation of list comprehensions. This makes it possible to take the nth position item of a list without first having to calculate the entire list. Pretty sweet if you want to reason of infinite data sets such as the Fibonacci numbers or primes. In a perfect world, I'd be writing Haskell all the time. However, that's just not possible. Python is far more popular, is installed by default on many OSes and can make some tasks very simple. So really, I want lazy list evaluations in Python. The only way to implement this without hacking up the actual interpreters object space is to mess with iterators. So I've implemented the Fibonacci sequence as an iterator. Essentially, I use the next parameter to move over the list of Fibonacci numbers and calculate the next one. I then store any numbers which I have already calculated to prevent extra work. I then use that pre-calculated list along with further calculation to implement list subscripting and contains. One thing to note is that itertools.takewhile will continue iteration for one through the failing condition, not up to. Hence the existence of a rollback function. Overall it was a fun exercise on using iterators and who knows, maybe it's actually useful.

from itertools import takewhile


class Fib(object):
    i = 0
    items = list()

    def __init__(self):
        self.n = 0
        self.n_1 = 1

    def __contains__(self, item):
        if item in self.items:
            return True
        else:
            self.items += list(takewhile(lambda x: x <= item, self))
            self._rollback()
            return self.items[-1]

    def __getitem__(self, key):
        if key <= len(self.items):
            return self.items[key - 1]
        else:
            self.items += list(takewhile(lambda x: self.i < key - 1, self))
            self._rollback()
            return self.items[-1]

    def __iter__(self):
        return self

    def next(self):
        next = self.n + self.n_1
        self.n = self.n_1
        self.n_1 = next
        self.i += 1
        return next

    def _rollback(self):
        """
        When iteration gets stopped by takewhile we've gone one too far
        so it needs to be backed up one
        """
        self.n = self.items[-2]
        self.n_1 = self.items[-1]
        self.i -= 1
Update: I've got a gist going of this which includes some bug fixes for slicing.

7.24.2010

PyMET: a simple api client for Trimet data

I've been hacking on and off this Trimet data wrapper for about a year now and I recently decided to give it a big push. My biggest point of contention with the project has been dealing with irregular XML api data and modeling that in python objects. It's an annoying and frustrating task in any circumstance. You must deal with parsing the xml, and then figure out how you'd like to shove it into your object properties for every single property. This means you spend more time figuring out how to map the data than actually using it.

I am really tired of this situation, it flat out sucks. That's why I've gotten lazy and tried to make it better. My new Trimet classes are based on a lazy xml api class which handles this mapping for you based on introspection of the returning data. Thanks to the BeautifulSoup project, discovering attributes, text values and children of XML objects is a very simple task. My only task was to provide the mapping. For this, I recursively loop over the elements and make a few important choices:
  1. choose to either add attributes and descend if any element has children
  2. put all elements which have identical tagname siblings into a list
With these goals for the algorithm, implementation was not too difficult. The main things to keep track of is when to append data vs. when to just add a new dict object (I am converting the entire xml tree into a dict/list object). I'm also doing so fun stuff with __getattr__ so you can acces properties of the object returned based on your schema and the data itself.

Now that I have the obvious bugs worked out, adding new XML apis based on this should be trivial. I think civic apps has some which I might try out.

* Warning, this code is still pretty rough and needs some cleanup, but if you know what you're doing it should be easy to follow the examples in arrivals.py

    7.11.2010

    2 Packages for Flask

    I've released 2 extensions for Flask today; Flask-Markdown and Flask-Jinja2Extender. Both are very simple, but handy when you're doing a lot of templating and working with Markdown. Also, it's import to note that both will be going under heavy development and should be considered unstable, blah, blah, blah. Otherwise enjoy!

    Edit:
    One more thing, you can also grab cockerel from pypi.

    7.06.2010

    Building out a webpp using Flask

    Flask is a new micro-framework by the folks at Pocoo. I have to say I'm super impressed. The build-out is really flexible which I find as refreshing compared to Django. You can organize the application in any fashion you like; I've chosen to do a standard layout for a wsgi application. I've got a module for models, and a webapp which has submodules for my views. A nice feature of Flask is being able to register these modules on boot and instantiate all the routing; I am also initializing and managing the db connection when starting the app using the Flask-SQLAlchemy extension. Finally, I've got Flatland taking care of my forms. This overall architecture seems to his the sweet spot for my application needs. I'm knocking out tasks and taking names! Naturally all my code is on github by now...

    6.30.2010

    Moar Parsing

    I burned the midnight oil a bit last night working on my output serializer/parser and got it working. The grammar is nearly complete, with only 1 shift/reduce conflict. The parser will take a coqtop response, such as
    1 subgoal
      A : Prop
      B : Prop
      C : Prop    
      ============================
       A -> ~ ~ A \/ B > C
    """
    

    and turn that into a mostly correct python object, which for now is just a dict:
    {   'goal': {   'expr': (   {   'expr': ('A', '~ ~ A'), 'operator': '->'},
                                {   'expr': ('B', 'C'), 'operator': '>'}),
                    'operator': '\\/'},
        'hyp': [   {   'name': 'A', 'type': 'Prop'},
                   [{   'name': 'B', 'type': 'Prop'}, {   'name': 'C', 'type': 'Prop'}]],
        'subgoal': '1 subgoal'}
    

    It's pretty cool to see it work correctly on my sample strings, but the next step will be to bring it into the request/response lifecycle and see how well these objects will serialize. Still for now, progress has been made!

    6.08.2010

    Why Twisted Changed My Life

    So at Open Source Bridge I ran into the folks at OSU's Open Source Lab. They were a huge help for flushing out the high level architecture of my connection server for Cockerel. Their project, pydra, was on my list of libraries for implementing a non-blocking i/o server. Had I done it that way, I would have been in for some pain. That is definitely not
    what pydra was designed for; its great for large processing jobs over networks. Instead, they suggested using Twisted.

    For those who don't know, Twisted is a Python library designed for building async-network servers. Internal classes include, protocols for connections, bare-bones ssh clients, irc clients/servers, and a variety of other utilities for building your async-server/client
    application. The maturity of the libraries allowed me to build out a simple connection protocol in a little over 30 minutes.

    Implimenting a server protocol in Twisted is very simple. You can begin with the base protocol.Protocol, for example, and then define a dataReceived() method that processes the data. That's pretty much it! Now, dealing with exceptions and other particulars is clearly going to take most of the dev time, but the basics of your server can be completed just like that. I highly recommend trying it out.

    Thanks again to OSL! Y'ALL ROCK!