Python in Production at DabApps

Jamie Matthews

A few weeks ago, a blog post by software engineer and Python Software Foundation fellow Hynek Schlawack lamented the recent lack of conversation in the Python community around building and operating web-based services in production.

As Python has become the go-to language in so many areas of the industry (data science, AI, education and infrastructure automation to name just a few), the volume of published content in these new and exciting fields has grown. But that discussion has perhaps drowned out the voices of those of us who use Python for more prosaic (but no less exciting) purposes: building backends for database-driven business applications.

This is our attempt to redress that balance and talk a bit about how we use Python in production.

Technology choices

It’s now more than a decade since the idea that became DabApps was born in a pub in Brighton. From day one, we decided to base our core technology stack on Python and the Django web framework. We haven’t regretted that decision for a second.

As we’ve grown and matured, so have Python and Django. Our technical approach in 2020 is in many ways quite different to when we started, but even our older codebases are remarkably free of cruft and still fairly easy to maintain. This is in large part thanks to Python’s emphasis on readability, and the community’s unwavering focus on backwards compatibility and a responsible rate of change. Like thousands of other businesses, we know we can depend on Python and Django for the long term, and so we’re confident in recommending them to clients.

Django as an API server

Uniformity of approach has always been something we’ve strived for. As the web industry has moved towards single-page app frontends built with React, we’ve followed along, and soon afterwards we adopted React Native for our mobile applications.

This means that our Python backends, rather than rendering HTML to send to the browser, instead send JSON data which is consumed by our JavaScript (now TypeScript) frontends. Like most people building APIs with Django, we use Django REST framework to accomplish this: the framework’s creator Tom was our first employee at DabApps and so we have very strong links to this ubiquitous and beautifully designed library.

Our Python backends are responsible for all of our complex business logic, permissions checking, database reads and writes (we use PostgreSQL), caching, background job processing, interactions with external services and more. We try to keep our APIs smart, returning just the data the client needs, to minimise the amount of work that we need to do in JavaScript.

Wagtail

DabApps has historically focussed on complex, interactive, data-driven bespoke applications and steered away from CMS-backed marketing sites. However, our clients often need some aspect of CMS-driven content as part of their build, or they need a marketing site to complement their app. For the past few years we’ve been using Wagtail to fulfil these sorts of requirements.

Wagtail is a very nicely designed CMS. Its “streamfield” concept is a very good fit for representing long-form website content, and it strikes a good balance between developer customisation and friendly WYSIWYG editing. We particularly love the fact that it doesn’t take over your entire project, and instead allows your CMS content to sit nicely alongside your other Django apps.

Twelve Factor

We follow the Twelve-Factor App methodology, and so our mental model when building web applications is based on a fleet ("formation") of processes. Every web app needs a web process: our standard go-to WSGI server is currently Waitress because of its simplicity and predictability. We are evaluating Gunicorn for workloads where a process-based concurrency model makes more sense.
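To make the idea of a web process concrete, here is a minimal sketch of a WSGI application of the kind a server such as Waitress can run. The `application` name and the JSON payload are assumptions for illustration only; in a Django project the callable would normally come from the project's own WSGI module rather than being written by hand.

```python
import json


def application(environ, start_response):
    """A tiny WSGI callable that returns a JSON body (hypothetical example)."""
    payload = {"status": "ok", "path": environ.get("PATH_INFO", "/")}
    body = json.dumps(payload).encode("utf-8")
    # WSGI requires the status line and headers to be sent before the body.
    start_response(
        "200 OK",
        [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(body))),
        ],
    )
    # The return value is an iterable of bytes chunks.
    return [body]
```

Assuming this lives in a module called `example_app` (a made-up name), the web process could be started with something like `waitress-serve --port=8000 example_app:application`.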

Many applications also need a background worker process for performing long-running or asynchronous tasks: for this, we generally use django-db-queue due to its operational simplicity and lack of requirement for a separate queue broker.
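The general idea of a database-backed job queue, with no separate broker, can be sketched in a few lines. This is not django-db-queue's API or implementation: it is a simplified illustration using the standard library's `sqlite3` module, and the table layout, state names and function names are all assumptions.

```python
import json
import sqlite3


def create_schema(conn):
    """Create the (hypothetical) job table if it does not exist."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS job (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            name TEXT NOT NULL,
            payload TEXT NOT NULL,
            state TEXT NOT NULL DEFAULT 'NEW'
        )
        """
    )


def enqueue(conn, name, payload):
    """Store a job as a row; the web process would call this."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO job (name, payload) VALUES (?, ?)",
            (name, json.dumps(payload)),
        )
    return cur.lastrowid


def process_next(conn, handlers):
    """Claim the oldest NEW job, run its handler, and record the outcome.

    Returns the job id, or None if there is nothing to do.
    """
    with conn:
        row = conn.execute(
            "SELECT id, name, payload FROM job "
            "WHERE state = 'NEW' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        job_id, name, payload = row
        conn.execute("UPDATE job SET state = 'PROCESSING' WHERE id = ?", (job_id,))
    try:
        handlers[name](json.loads(payload))
        state = "COMPLETE"
    except Exception:
        # A missing handler or a failing task both mark the job as failed.
        state = "FAILED"
    with conn:
        conn.execute("UPDATE job SET state = ? WHERE id = ?", (state, job_id))
    return job_id
```

A worker process would call `process_next` in a loop, sleeping briefly when it returns `None`. This sketch assumes a single worker; with several workers on PostgreSQL, the claim step would need row locking (for example `SELECT ... FOR UPDATE SKIP LOCKED`) so that two workers cannot pick up the same job.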

Often there is a requirement for recurring scheduled tasks. Our standard approach has been to write a Django management command that starts an APScheduler instance, but we are gradually moving to running management commands with Heroku scheduler.

We configure our applications through environment variables and have well-separated build/release/run stages, meaning our code tends to be simple to operate in production. All log messages are written to standard output streams, and we aggregate them with a custom-written Heroku log drain that ships them to AWS CloudWatch for indexing, monitoring and alerting.
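As a rough sketch of what configuration through the environment and logging to standard output can look like, the example below uses only the standard library. The variable names (`DEBUG`, `DATABASE_URL`, `LOG_LEVEL`) and the `Settings` class are assumptions chosen for illustration, not a description of our actual settings modules.

```python
import logging
import os
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    debug: bool
    database_url: str
    log_level: str


def load_settings(environ=os.environ):
    """Build settings from environment variables (names are hypothetical)."""
    return Settings(
        debug=environ.get("DEBUG", "false").lower() in ("1", "true", "yes"),
        # Required value: a missing variable raises KeyError at startup.
        database_url=environ["DATABASE_URL"],
        log_level=environ.get("LOG_LEVEL", "INFO").upper(),
    )


def configure_logging(level):
    """Send all log records to stdout and leave aggregation to the platform."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    root = logging.getLogger()
    root.handlers = [handler]
    root.setLevel(level)
```

Failing fast on a missing required variable means a misconfigured release is caught when the process starts rather than on the first request that needs the value.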

A note on Python 3

The transition from Python 2 to Python 3 has caused some controversy in the Python world. We switched to Python 3 for all new projects in 2016 (starting with Python 3.4) and we’re currently nearing the end of the process of upgrading the last few projects that were still on Python 2 to Python 3.

This has taken a while (as technology upgrades tend to in an agency setting) but hasn’t been particularly painful for us. This is due to a few factors. First, even our earliest projects were on Python 2.7, which had already put the groundwork in place for the Python 3 switch. Secondly, most of our projects are fairly idiomatic Django, which enforces proper handling of Unicode data (one of the biggest changes in Python 3).
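For readers less familiar with the Unicode change, the short example below shows the strict separation between bytes and text that Python 3 enforces. The `to_text` helper is a hypothetical name used for illustration; it is not Django's own code.

```python
def to_text(value, encoding="utf-8"):
    """Decode bytes at the boundary; pass str through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Bytes as they might arrive from a socket or a file opened in binary mode.
raw = "caf\u00e9".encode("utf-8")
text = to_text(raw)

# The accented character takes two bytes in UTF-8 but is one code point.
byte_length = len(raw)
char_length = len(text)

# Python 3 refuses to mix the two types implicitly.
try:
    text + raw
    mixing_allowed = True
except TypeError:
    mixing_allowed = False
```

Code that already decodes input once at the edges and works with text internally tends to need few changes to run on Python 3.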

A note on async

Another hot topic in the Python community is the addition of first-class cooperative multitasking primitives into the language via the async/await keywords. This is a very exciting development, and promises huge steps forward in one of the few areas that Python hasn’t historically been known for: high-performance networking. Frameworks such as Starlette are already comfortably competing with Node.js in levels of throughput. Andrew Godwin and others are currently working on adding first-class async support to Django.

We’ve dabbled in async Python (particularly when making requests to APIs that may not respond in a timely fashion) but we don’t currently feel it will make a huge difference to the way we build most of our web backends. The reasoning behind this might make a good future post.
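To give a flavour of the kind of use mentioned above, here is a minimal sketch of calling several slow services concurrently with a timeout on each, using only `asyncio` from the standard library. `call_slow_api` is a stand-in that sleeps instead of making a real HTTP request, and all of the function names are assumptions for illustration.

```python
import asyncio


async def call_slow_api(delay):
    """Stand-in for an HTTP request; sleeps instead of doing network I/O."""
    await asyncio.sleep(delay)
    return {"delay": delay}


async def fetch_with_timeout(delay, timeout):
    """Return the response, or None if the call takes longer than `timeout`."""
    try:
        return await asyncio.wait_for(call_slow_api(delay), timeout=timeout)
    except asyncio.TimeoutError:
        return None


async def fetch_all(delays, timeout):
    """Run all calls concurrently; results come back in input order."""
    return await asyncio.gather(
        *(fetch_with_timeout(delay, timeout) for delay in delays)
    )
```

Calling `asyncio.run(fetch_all([0.01, 0.5], timeout=0.1))` runs both calls at once, so the total wait is bounded by the timeout rather than by the sum of the delays, and the slow call yields `None` instead of blocking the others.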

Summary

In 2010 it might have been considered a risky move for a web agency to base its tech stack on Python, especially in Brighton where most agencies were using either PHP or Microsoft technologies. Since then, Python’s popularity has skyrocketed, and our tech stack has proven itself as a solid, reliable basis for even the most complex client requirements.

As a company, we’ve written well over half a million lines of Python over the last ten years. The code we’ve written is (for the most part) clean, readable and maintainable, and we couldn’t be happier to carry on running Python in production for the next decade and beyond.