Performance Heisenbugs and Premature Optimisation

A heisenbug is a software bug that seems to disappear or alter its behaviour when one attempts to study it (definition from Wikipedia).

Recently we were in the midst of writing some heavy number-crunching code for a web app we are currently developing for a client. This system involves the collation of large numbers of surveys. Questions are stored in a database and scored against various options for each interviewee. Once the surveys are complete, different charts can be generated. These charts involve computing averages, ranges, differences, and all manner of other statistical magic.

The only problem with this level of flexibility is that it compares items in a combinatorial number of ways. Each question is rated against various options. Each question and option pair is created by an interviewee. Each survey has multiple interviewees. Each action taken in a survey has notes associated with it. There's a mind-boggling number of permutations.

When looking at a chart generated by our system, we were shocked to find that it took an unacceptable four seconds to render. People aren't going to like waiting that long to see a chart.

As usual, our first port of call when debugging a web application is the fantastic Django Debug Toolbar. Its SQL panel shows a beautifully-formatted list of all database queries run during a request, with a timeline of how long each one took. Popping open Django Debug Toolbar on the chart rendering views, we could immediately see that we had way too many queries.


Into the codebase we delved, looking for ways to optimise. We had already cached a large amount of the expensive calculations that were common to many of the charts. Django Debug Toolbar told us that, in fact, the majority of queries were coming from a particular place in the code: pulling out the notes attached to the survey in a tight loop. For each scoring pair, the set of notes attached to it was retrieved through a relation on its model, issuing one query per pair - the classic N+1 query problem. This looked like a good place to do some optimisation.
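To make the N+1 pattern concrete, here is a minimal sketch in plain Python - the model and names are illustrative stand-ins, not our real code - where a fake database counts one round-trip per notes lookup:

```python
# Hypothetical illustration of the N+1 pattern: one query per scoring
# pair when notes are fetched through a relation inside a tight loop.
class FakeDB:
    def __init__(self):
        self.query_count = 0

    def notes_for(self, pair_id):
        self.query_count += 1  # each call is a separate round-trip
        return [f"note for pair {pair_id}"]


db = FakeDB()
pairs = range(100)  # stand-in for 100 scoring pairs

for pair in pairs:           # tight loop over scoring pairs
    notes = db.notes_for(pair)  # N separate queries

print(db.query_count)  # → 100
```

With real ORM relations the shape is the same: the loop itself looks innocent, but every iteration quietly issues another query.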

As every note was to be rendered on the screen, and they all lived in one table, the most elegant way to fetch them was to instead gather them all at once. We extracted them all in one go, sorted them into a dictionary, and had the renderer pick them out as and when needed.
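The "fetch once, sort into a dictionary" step can be sketched in a few lines of plain Python - assuming, hypothetically, that each note row carries the id of the scoring pair it belongs to:

```python
from collections import defaultdict

# Imagine these rows came back from a single SELECT over the notes table.
rows = [
    {"pair_id": 1, "text": "first note"},
    {"pair_id": 2, "text": "second note"},
    {"pair_id": 1, "text": "third note"},
]

# Group all notes by their scoring pair in one pass.
notes_by_pair = defaultdict(list)
for row in rows:
    notes_by_pair[row["pair_id"]].append(row["text"])

notes_by_pair[1]  # → ["first note", "third note"]
```

The renderer then does a cheap dictionary lookup per pair instead of a database query, turning N queries into one.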

After this work was done, we reloaded the page to see how our efforts had paid off.

Down to 130 queries! Excellent! But... the time taken had only halved, to two seconds? Still too long. What more could be optimised? We were running out of things to speed up. Then we noticed something a little odd: Django Debug Toolbar reported that the SQL itself completed in 90 milliseconds - plenty fast enough. So where was this atrocious time to complete coming from? Were we going to have to write native code to speed things up? Were we to swap our trusty database for something NoSQL?

It was time for some lateral thinking. Perhaps Django Debug Toolbar was misreporting something? We dropped in a piece of profiling middleware based on hotshot and turned off the toolbar. We refreshed the page - and...
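(hotshot has since been removed from Python; the equivalent check today would use the standard library's cProfile. A minimal sketch, with a stand-in function in place of our chart view:)

```python
import cProfile
import io
import pstats


def render_chart():
    # Hypothetical stand-in for the view being profiled.
    return sum(i * i for i in range(100_000))


profiler = cProfile.Profile()
profiler.enable()
render_chart()
profiler.disable()

# Print the five most expensive calls by cumulative time.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

Profiling the request directly, rather than trusting a single tool's timings, is what exposed where the time was really going.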

It loaded almost instantaneously.

It turns out that all that beautiful syntax highlighting and collation of SQL queries that Django Debug Toolbar does isn't free. In logging each query and measuring all of the things it does, Django Debug Toolbar was using up lots of CPU time. Most of the multi-second request latency we were seeing wasn't in our code at all - it was in the code we were using to find out why our code was slow! In a production setting, with Django Debug Toolbar turned off, the problem simply wouldn't exist.

The moral of the story? We're certainly not criticising Django Debug Toolbar, which is one of the most powerful and useful weapons in our development armoury. Our advice is simply to take a step back and check the basics before diving too deep into optimisation that you don't really need to do.
