Erin Call

You Can't Spell Engineering Without Erin

Posts tagged python

The Splinter Mystery

I've been wrestling with a mystery lately. The Catsnap tests use a browser-interaction library called Splinter to verify the JavaScript on the pages. After I added a few tests, I started to see some fairly bizarre performance problems.

While the browser was doing things, it was very quick indeed. But at what looked like random times, it would just sit and think about philosophy for a while. I asked about this on the Splinter mailing list and got no responses (I'm not sure how active that mailing list is), so I finally did some investigation on my own.

I created a repository that demonstrates the slow behavior. I've built a fairly minimal amount of behavior, so it's easy to read through. Unfortunately, even though it demonstrates the problem, it still doesn't explain it. I still have no idea why this particular set of steps leads to such a sudden drop-off in performance.

At this point I think I've exhausted my ability to debug this any further. I'm hoping to take it to the Cobrateam, uh, team, and see if they're interested in helping track down the root cause.

Posted on 2013-08-10T17:42:00Z
Posted in python.

Announcing Radlibs

For the last few days I've been eating, breathing, and sleeping a new project. Code has flown from my fingers like sparks from a millstone. Today I'd like to announce the result of my fevered efforts: Radlibs! At its core, Radlibs is a language for generating English text. It uses randomly-selected category-members to fill in the blanks in a given phrasal template.

The "Rad" in "radlibs" comes from its recursive nature: category-members may also be phrasal templates. Thus, with the right libraries Radlibs can generate some good clean fun:

crate and barrel
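
To make the recursive idea concrete, here is a minimal sketch in Python. It is not Radlibs itself: the `<Category>` blank syntax, the `LIBRARIES` contents, and the `expand` function are all assumptions made up for illustration.

```python
import random
import re

# Hypothetical libraries: category name -> list of members.
# A member may itself be a template containing more <Category> blanks.
LIBRARIES = {
    "Container": ["crate", "barrel"],
    "Pair": ["<Container> and <Container>"],
}

# Assumed blank syntax for this sketch only: <CategoryName>
BLANK = re.compile(r"<(\w+)>")


def expand(template, libraries, rng=random):
    """Fill each blank with a randomly chosen member, expanding recursively."""
    def fill(match):
        member = rng.choice(libraries[match.group(1)])
        # The chosen member may contain blanks of its own, so expand it too.
        return expand(member, libraries, rng)
    return BLANK.sub(fill, template)


print(expand("<Pair>", LIBRARIES))
```

Each run picks members at random, so the output varies. A library whose members refer back to their own category would recurse without end; this sketch does not guard against that.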

(Read the full post...)

Posted on 2013-07-07T21:09:00Z
Posted in python.

Catsnap 3.0 Beta

A beta version of Catsnap 3.0 is now available. 3.0 introduces a major architectural change: instead of accessing DynamoDB directly, the command-line script is now a client to a Postgres-backed web API. This has several advantages, including browser-based access, reduced operating costs (potentially all the way to $0.00), and improved security.

You'll need to run the server code somewhere. The instructions below assume you're using a VPS or EC2. However, you could also run Catsnap on your personal computer, or on a managed service like Heroku. In any case, you'll need to create a Postgres database for Catsnap to use. First-time Catsnap users will also need to create an S3 bucket.

(Read the full post...)

Posted on 2012-12-27T23:15:00Z
Posted in python, catsnap.

Oh, So That's Why You'd Use A Generator

So, I've been working on this Catsnap project of mine, improving the performance. The big problem is that it's spending an enormous amount of time on the wire, waiting to get information back from AWS. For example, when searching for images by tag, my first pass used a pretty naïve approach: loop through the images associated with a tag, asking DynamoDB about each one. Fortunately, the excellent boto library for interacting with AWS has a batched lookup mode that lets me grab all the images at once.

Only...not quite. DynamoDB has this thing going on where if you ask for some items, it'll give some or all of them back.[1] It's polite enough to tell you which keys it ignored, but you still have to ask for them again if you really wanted what you said you wanted.
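
The ask-again loop described above can be sketched as a generator. This is only an illustration under stated assumptions, not boto's API and not necessarily the code from the full post: `fetch_batch` is a hypothetical callable that takes a list of keys and returns a pair of (items, keys it skipped), and `fake_fetch` is a stand-in service for the demo.

```python
def batch_get_all(keys, fetch_batch):
    """Yield an item for every key, re-asking for keys the service skipped.

    fetch_batch is a hypothetical callable, not a real boto function:
    it takes a list of keys and returns (items, unprocessed_keys).
    """
    pending = list(keys)
    while pending:
        items, unprocessed = fetch_batch(pending)
        for item in items:
            yield item
        # Ask again, but only for the keys that were ignored this round.
        pending = list(unprocessed)


def fake_fetch(keys, limit=2):
    # Stand-in for a service that answers at most `limit` keys per request.
    answered, skipped = keys[:limit], keys[limit:]
    return [{"key": k} for k in answered], skipped


for item in batch_get_all(["a", "b", "c", "d", "e"], fake_fetch):
    print(item["key"])
```

The caller sees one flat stream of items and never has to know how many round trips it took. A real client would probably also want a backoff or a retry limit for the case where a request returns nothing at all.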

(Read the full post...)

Posted on 2012-07-18T06:19:00Z
Posted in python.

The System Pip On Lion Is Completely Broken??

I sat down to hack on Pullme for the first time on my new MacBook. I ran into unreasonable roadblocks:

(Read the full post...)

Posted on 2012-02-18T19:58:00Z
Posted in python, bugs.

Annotation Station: hanke.py

A week or so ago, I annotated a chunk of code for a co-worker who was complaining that open-source did him little good if he didn't understand the language. I was surprised--and pleased--to find that I learned a lot from the exercise myself. I resolved immediately to do it some more. This is the first in what will be several trillion annotated chunks of code.

(Read the full post...)

Posted on 2012-01-17T06:59:00Z
Posted in python, annotation station.

Python's Scoping System Will Hurt You If You Let It

So, I ran into a bit of Python's behavior that upset me. Here's some code that is wrong:

if some_function():
    x = 5
print x

Depending on the return value of some_function, this code may or may not execute successfully. Whether or not it happens to succeed, though, it's wrong. You have this exception sitting there, waiting to jump up and bite you. It is a problem waiting to happen. Python doesn't care, though! It will happily accept that code and let you catch the error in production. But why? Why doesn't Python just define a new lexical scope when entering an if or for block? Then it could tell you at compile time that your code is wrong, and you could fix it before it ever screwed you up.
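
Here is a small runnable sketch of that failure, plus one common way to defuse it. The names `risky` and `safer` and the `flag` parameter are made up for illustration. The snippet above runs at module level, where the error is a NameError; inside a function it is an UnboundLocalError, which is a subclass of NameError.

```python
def risky(flag):
    if flag:
        x = 5
    # When flag is falsy, x was never bound and this line raises.
    return x


def safer(flag):
    x = None  # bind the name on every path before the branch
    if flag:
        x = 5
    return x


print(risky(True))
try:
    risky(False)
except NameError as exc:  # also catches UnboundLocalError
    print("failed: %s" % exc)
print(safer(False))
```

Binding the name before the branch doesn't make Python check anything at compile time. It only guarantees the name exists on every path, so the worst case becomes a None you can test for instead of an exception.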

(Read the full post...)

Posted on 2012-01-10T04:57:00Z
Posted in python, harm reduction.