Programming and Cognitive Load

Jamie Matthews

This post is about how we can take inspiration from cognitive psychology in order to write better code. It gets a little technical in places (especially in the examples), but should be perfectly understandable by a non-technical reader.

It's a well-known saying in software engineering that "programs must be written for people to read, and only incidentally for machines to execute" (Abelson & Sussman, "Structure and Interpretation of Computer Programs", preface to the first edition). But what does this actually mean in practice?

On the surface, there are simple rules of thumb that all developers learn to follow, such as choosing good variable and function names, formatting code correctly, and writing relevant and non-redundant comments. But the quest for improved readability and maintainability can be interpreted at a deeper level than this, by considering how we process information when we read, and therefore how our minds operate. Conclusions drawn from this way of approaching programming may disagree with some conventional wisdom when it comes to structuring our code.

I should start by saying that I am not a psychologist, and this is not intended to be an academic article. Think of this argument as a metaphor - a way of looking at things that may be helpful, rather than a rigorous evidence-based exposition of a scientific position.

What is cognitive load?

Cognitive load is a concept from cognitive psychology, originally developed by John Sweller, that is generally discussed in relation to learning. It is a measure of the amount of effort being used in the working memory. Working memory is, loosely speaking, the part of the cognitive system that processes information related to what the individual is currently doing (in contrast to long-term memory, which retains structured knowledge). We know that working memory is quite limited in its capacity (try remembering a phone number after hearing it once) and is vulnerable to overload.

Programming and related tasks (like tracking down bugs) are very reliant on working memory. As programmers we're familiar with the feeling of “loading a program into your brain”, of holding a subset of an algorithm or data model in your mind so you can start to reason about it. This happens whenever we're asked to look at a new, unfamiliar piece of code (if we've just started working at a new company, say). But it also happens with software we've been maintaining for years. Any non-trivial software system is so complex that no individual developer can understand or remember how the whole thing works all of the time, so we have to re-familiarise ourselves with the inner workings of a particular part of the codebase before we can start making changes or fixing bugs. Maintaining software is essentially a continual learning process, so the theoretical underpinnings of cognitive load can be applied.


Cognitive load, according to the psychologists, falls into three types:

Intrinsic cognitive load is the difficulty of learning the task itself. When learning how to make a cake, say, it would be the effort required to weigh and combine all of the ingredients correctly, stir the batter just the right amount, preheat the oven, time the baking correctly and so on.

Germane cognitive load is effort spent formulating new lower-level representations of sub-parts of the task. Part of the recipe might be spent explaining how to make icing in great detail. Over time, however, an experienced baker will learn exactly how to make icing, and so that part of the overall cake-making task will be reduced to simply "make the icing".

Extrinsic cognitive load is additional effort imposed by the manner in which the learning takes place. In the cake example, imagine if all of the units are given in pounds and ounces but you are accustomed only to working in grams and kilograms. Extrinsic load would be the time spent converting between imperial and metric at every step of the way.

Simply speaking, intrinsic and germane load are considered "good" and extrinsic "bad". Too much germane load makes learning a new task initially harder and slower (while the learner picks up the underlying concepts), but this work is valuable because the next time you come to do the same task, it'll be easier. Intrinsic load is just the effort required to perform the task - but this should reduce over time as new lower-level knowledge and experience are formed. Extrinsic load is like friction, slowing you down and filling up your working memory with non-essential, wasted effort.

Hopefully, if you're a programmer, you'll already have an intuitive sense that these types of cognitive load have some relevance in your day-to-day tasks. The key argument of this article is that we can (and should) examine programming constructs and abstractions by considering the effects they have on intrinsic, extrinsic or germane cognitive load in future readers of the code.

Basic Example

Let's introduce a simple example, while bearing in mind the caveat that simple examples often run the risk of encouraging criticism of the example itself, rather than the underlying point. A "real world" example would require much more space than we have.

We're going to look at two different ways of implementing a Django REST framework view (i.e. the code that handles an HTTP request to a particular URL). The code samples below were taken directly from the Django REST framework documentation. Again, please remember that this is not, by any means, an argument against any particular bits of Django REST framework - it's just an example!

Here are the two approaches:

First, the "long-winded" version.

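from rest_framework import status
from rest_framework.decorators import api_view
from rest_framework.response import Response

# Project-specific imports; the module paths follow the layout used in the
# Django REST framework tutorial and may differ in your own project.
from snippets.models import Snippet
from snippets.serializers import SnippetSerializer


@api_view(['GET', 'POST'])
def snippet_list(request):
    """
    List all code snippets, or create a new snippet.
    """
    if request.method == 'GET':
        snippets = Snippet.objects.all()
        serializer = SnippetSerializer(snippets, many=True)
        return Response(serializer.data)

    elif request.method == 'POST':
        serializer = SnippetSerializer(data=request.data)
        if serializer.is_valid():
            serializer.save()
            return Response(serializer.data, status=status.HTTP_201_CREATED)
        return Response(serializer.errors, status=status.HTTP_400_BAD_REQUEST)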

If you have a basic understanding of the principles of HTTP (requests and responses, status codes, etc.) then this should be readable even if you don't know Django (or even Python!). This is one of the areas where Python shines: the code tends to be very readable even by non-experts.

The code here tells a story. The story is called "snippet list" (the function name) and this is an "api view" that accepts GET and POST requests (you might not know what a Python decorator is, but the fact that it's close to the name of the function might give you a hint that it configures the function somehow). The function takes an argument called "request" (which we might safely assume is some kind of representation of the HTTP request). The top-level if statement inside the function then branches based on the HTTP method.

Let's follow the GET branch. The first line inside there (paraphrased) says "get all the Snippet objects". You don't need to know anything about databases to understand what that does. We then use a thing called a "serializer" - this is a bit more of a stretch, but you might have a hunch that it's to do with converting the data to a format suitable for transferring over the network, and you'd be right. Finally, we return a Response - and we can safely guess this abstracts away the details of the HTTP response.

Now let's look at the second (much shorter) version using Django REST framework's class-based generic views:

class SnippetList(generics.ListCreateAPIView):
    queryset = Snippet.objects.all()
    serializer_class = SnippetSerializer

Superficially, this code contains many of the same concepts, but it's much denser. Crucially, rather than telling a story about how an HTTP request is going to be handled, it declares how the framework should be configured in order to handle those HTTP requests all by itself. That means that the reader must already have a concept of exactly how a ListCreateAPIView works before being able to follow the flow. A quick glance by a novice uncovers more questions than answers: which HTTP methods does this handle? What status codes does it return? What is a queryset, exactly? What is a "serializer class"? (the fact that the serializer was actually being created and used in the first example gave a lot of clues about its purpose that are absent here).


Let's frame the difference between these two in terms of cognitive load. To a beginner, the first example has a better ratio of intrinsic to germane load: there are fewer constituent concepts that need to be explained or understood in order to follow what's going on, so a learner will pick up the meaning of the code more quickly. In the second example, the reader must first take on board several larger concepts before they can form a real understanding of what's going on.

Making changes

Taken in isolation, a seasoned developer may not accept the argument that the increased germane load in the second example automatically makes it worse. There are of course other things to consider when assessing the quality of the code, such as the reduction in the number of lines (fewer lines of code means less room for bugs, right?). However, so far we've only considered the cognitive effort of learning what the code currently does, and haven't thought about what a developer would need to do in order to change its behaviour. The defining characteristic of all software under active development is that it changes all the time, in completely unpredictable ways. A big part of our job as developers should really be to make software that's not just correct, but is easy to change. Let's think about that next by considering another toy example.

Suppose we want to gather some statistics on how often snippets are created. In order to do that, we're going to use Python's logging system to log a message whenever a valid POST request is received. We know what the line of code should look like (logger.info("New snippet created!")), but where should that line of code go?

In the first code example, it's fairly clear even to a novice developer where to put it. Right before the Response is returned in the if serializer.is_valid() branch would seem like a sensible spot. Because each step of the execution flow is spelled out clearly, it's obvious exactly what is going on, and so it's easy to go from a mental model of what you're trying to do ("log right before returning the response") to a code representation ("add the logging line right before the line which returns the response").
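To make that concrete, here is a rough sketch of the first example with the logging line added. So that it can run without Django installed, Request, Response, SnippetSerializer, the SNIPPETS list and the status constants are all hypothetical stand-ins written for this sketch, not the real Django REST framework classes; only the shape of snippet_list mirrors the original view.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("snippets.explicit")

# Hypothetical stand-ins so the sketch runs without Django installed.
HTTP_201_CREATED, HTTP_400_BAD_REQUEST = 201, 400
SNIPPETS = []  # pretend database table


@dataclass
class Request:
    method: str
    data: dict = field(default_factory=dict)


@dataclass
class Response:
    data: object
    status: int = 200


class SnippetSerializer:
    """Toy serializer: valid only when a non-empty 'code' value is supplied."""

    def __init__(self, data):
        self.data = data
        self.errors = {}

    def is_valid(self):
        if not self.data.get("code"):
            self.errors = {"code": ["This field is required."]}
        return not self.errors

    def save(self):
        SNIPPETS.append(self.data)


def snippet_list(request):
    """List all code snippets, or create a new snippet."""
    if request.method == "GET":
        return Response(list(SNIPPETS))

    elif request.method == "POST":
        serializer = SnippetSerializer(data=request.data)
        if serializer.is_valid():
            serializer.save()
            # The new line: log right before returning the response.
            logger.info("New snippet created!")
            return Response(serializer.data, status=HTTP_201_CREATED)
        return Response(serializer.errors, status=HTTP_400_BAD_REQUEST)
```

The change is a single line, and it sits exactly where the mental model says it should: inside the valid branch, after the save, before the return.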

Now consider the second example. Where do we put the logging call now? There's no actual flow of execution visible to modify, but we know that somewhere, somehow, roughly the same logic must be encoded. So, we crack open the documentation and the source code for generic class-based views.

Eventually we figure out that a ListCreateAPIView uses Python's support for multiple inheritance to subclass three things: a ListModelMixin, a CreateModelMixin and a GenericAPIView (in that order). Which way does Python's method resolution order go again? A quick Google reveals it's essentially left-to-right so maybe we should start by looking at ListModelMixin? Hang on, that doesn't sound right - we want to do something when a model is created, so let's try CreateModelMixin. But wait, looking at the implementation of CreateModelMixin, there's no HTTP handling code in there. Where does the POST request actually get handled? Aha! That's back in ListCreateAPIView, which calls a method called create that's implemented on CreateModelMixin. create calls another method called perform_create (!) which is the part that actually calls the serializer's save method. So maybe we should add a perform_create method to our SnippetList and call super() first, followed by our logging call. But wait! How are errors handled? What if the serializer.save() call fails? Back to the source code…
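The chain of calls described above can be modelled with a toy class hierarchy. To be clear, this is not Django REST framework's source code: the class names match the real ones, but the bodies are heavily simplified stand-ins written for this sketch (requests are plain dicts, store replaces the queryset and serializer, perform_create receives the raw data rather than a serializer, and validation and error handling are left out entirely).

```python
import logging

logger = logging.getLogger("snippets.generic")


class ListModelMixin:
    def list(self, request):
        return 200, list(self.store)


class CreateModelMixin:
    def create(self, request):
        self.perform_create(request["data"])
        return 201, request["data"]

    def perform_create(self, data):
        # The hook: the only place where the object is actually saved.
        self.store.append(data)


class GenericAPIView:
    store = None  # stands in for the queryset / serializer_class configuration

    def dispatch(self, request):
        # Route the request to a method named after the HTTP verb.
        return getattr(self, request["method"].lower())(request)


class ListCreateAPIView(ListModelMixin, CreateModelMixin, GenericAPIView):
    def get(self, request):
        return self.list(request)

    def post(self, request):
        return self.create(request)


class SnippetList(ListCreateAPIView):
    store = []  # class-level list shared by all instances; fine for a sketch

    def perform_create(self, data):
        super().perform_create(data)
        logger.info("New snippet created!")


# Method resolution order: the subclass first, then its bases left to right.
print([cls.__name__ for cls in SnippetList.__mro__])
```

Even in this stripped-down form, a POST passes through dispatch, post, create and perform_create, spread across three classes, before the logging line runs. None of that path is visible in the three-line SnippetList we started with.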

Maybe I've overdramatised a little to make a point. But hopefully the point is made: the pile of abstractions in the second example, despite reducing slightly the amount of typing needed to do simple things, increases the amount of background knowledge and hoop-jumping needed to implement more complex things in the face of new requirements.

It could be argued that this extra required knowledge is germane load ("once you've learned the control flow for creating a model, you don't need to look at the documentation any more!"). I'd counter this by saying: the API surface area for generic views is huge (there are the implementations for list, detail, update and delete for a start, each with their own mixins and hook points, not to mention the various different ways of configuring authentication, authorization, serialization, parsing and so on). This is a lot of background knowledge to absorb. I've been working with REST framework since the very beginning, and I still don't remember much of this stuff without looking it up. I still have to constantly refer to the documentation and the source code when working with class-based generic views.

Abstraction

What if all this extra abstraction isn't actually germane cognitive load at all, but extrinsic load? What if the abstractions we've built, following good software engineering principles of implementation hiding and avoiding repetition, have actually resulted in a set of tools that are harder to use, harder to understand and reason about, and most importantly harder to change than the explicit, lower-level, step-by-step flow shown in the first example? Do the benefits of this implementation really outweigh the drawbacks?

So am I saying that all abstraction is bad? Of course not. If we didn't have any abstractions, we'd be writing all of our programs in assembler. Abstraction is the key insight that makes possible the staggeringly complex software ecosystem that the entire modern world couldn't function without.

Consider the Django ORM line in the examples above: snippets = Snippet.objects.all(). At the level of abstraction we're working at, the meaning of this is completely clear to the reader: it says "give me all the Snippet objects" and that's exactly what it does. It's lowering extrinsic cognitive load, because if we weren't using an ORM the code would be littered with lines about connecting to the database, creating a cursor, running a query (which would be in a different programming language to the surrounding code) and then hydrating model instances. The abstraction provided by Django's ORM lets you forget about all those details, allowing you to focus on the business logic.
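For a sense of what that one ORM line spares us, here is a minimal sketch of the same "give me all the snippets" operation written by hand with Python's built-in sqlite3 module. The table name, its columns and the Snippet dataclass are assumptions made up for this illustration; they are not what Django would generate for a real model.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Snippet:
    """Hypothetical plain-Python stand-in for a model class."""
    id: int
    title: str
    code: str


def fetch_all_snippets(connection):
    # Create a cursor, run a query written in SQL (a second language
    # embedded in the Python code), then close the cursor again.
    cursor = connection.cursor()
    cursor.execute("SELECT id, title, code FROM snippets ORDER BY id")
    rows = cursor.fetchall()
    cursor.close()
    # "Hydrate" each raw row tuple into a Snippet instance.
    return [Snippet(*row) for row in rows]


# Connect to a throwaway in-memory database and set up some example data.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE snippets (id INTEGER PRIMARY KEY, title TEXT, code TEXT)"
)
connection.execute(
    "INSERT INTO snippets (title, code) VALUES (?, ?)",
    ("hello", "print('hello')"),
)

snippets = fetch_all_snippets(connection)
print(snippets)
```

Every step here (connection, cursor, SQL string, hydration) is a detail the reader has to hold in mind without learning anything about the business logic, which is why hiding it behind Snippet.objects.all() feels like a good trade.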

Summary

Producing maintainable software depends on readable code, and readable code depends on controlling cognitive load. Successful abstractions minimise cognitive load - particularly, they avoid extrinsic load, and provide a smooth transition from germane to intrinsic load. Creating ever higher layers of abstraction just to reduce the number of keystrokes we need to make may not always be the right thing to do. Instead, by recognising that code reading is essentially a learning process, and considering how each line of code we write will be processed in the mind of the reader, we increase our chances of creating understandable, maintainable software.

DabApps is a leading software agency based in the UK (www.dabapps.com). Please comment below or get in touch directly if you want to find out more about how we work, our areas of technical expertise and our tech stack.