How We’re Using Architectural Decision Records to Move Fast and Not Break Things (as much)

Start with why. The famous leadership gambit by Simon Sinek.

Sinek says that to motivate people you need to give them a purpose. Do this first. Start with the why.

Then move on to the how, and finally the what. The mistake he says leaders in many businesses make is they get the order backwards. They’re quick to convey the what and, sometimes, the how, but generally forget the why, and it’s a dysfunction of leadership.

We at AO started to notice some parallels in our approach to software development.

Why not why?

When we build things, our teams always capture the what. It’s an epic or a feature or an item of work on a board. We’ll break down the what into slices of how. We move these slices along the board as we build and ship items.

We commit code and capture the what in some meaningful way in our commit messages. We have change logs as part of an automated change tracking process. These capture the what too. We do a lot of recording the what. Sometimes the how.

But, in software development, we don’t capture the why. Why did we build it like that? Why did we configure it in that way? Why did we remove that bit? Why did we choose that technology?

The more thought we gave this, the more we recognised it as a problem: less a technical one and, as usual, more a human one. Let’s try to visualise it.

Details Disappear

We have a dog that’s jumped into the pool. It’s jumped in to get its toy which has sunk to the bottom. Think of this as the feature our team are currently working on.

A week later we’ve shipped, we’ve moved on, we remember, but we’ve not retained all the information.

A month later. It’s still there. We remember the work. But it’s losing clarity. The dog is in the pool; can we still remember why?

A few months later and now it’s much fuzzier. We think it was a dog. But it could be a bear.

Several months later and this is all someone new to the team sees. They’re diligent enough to look at the code and the tests, but they can’t derive the context.

And ultimately it ends up looking something like this.*

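This phenomenon was neatly captured in a programming context by Alan Eagleson. Eagleson’s law states that any code of your own that you haven’t looked at for six or more months might as well have been written by someone else.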

Six months is somewhat optimistic (more like six weeks), but what the increasingly low fidelity images and Eagleson’s law illustrate is that detail disappears over time.

And it’s compounded when team members leave, because the detail and rationale behind the decisions they made leave with them. Gone forever. Without a mechanism to capture it, you end up with people asking things like:

“Anyone know what this is?”

Or “Why do we do it like this?”

Or even “How the hell did we end up here?”

Having to Make Choices Blind

And when we ask questions like those we’re generally faced with a choice. We either blindly accept it is how it is, slowly back away and just leave it be, or we blindly change it and hope we don’t break something.

It goes without saying that this isn’t a good place to be. Both options carry significant risk, though of different kinds. So how do we mitigate that risk? We need a mechanism to capture the why: a document that retains enough information for us to know why something is there, why we do it like that and why we got here.

Capturing the Why

Nowadays, documentation has something of a bad rep, and understandably so. It’s hard to write good documentation; technical writers are a rare breed, good ones even rarer. No one really wants to read long documentation. Code bases change frequently and docs, like code comments, need to keep up, which makes them a chore to maintain for a dubious payoff. Even the Agile manifesto says to prefer “working software over comprehensive documentation.” But, to quote Michael Nygard, the person credited with the seminal work on ADRs many years ago, “Agile isn’t against documentation, just valueless documentation.”

Enter Architecture Decision Records. A lightweight document to capture a decision on software architecture or technology choices. A snapshot of context at a moment in time. As Nygard puts it, “An architecture decision record is a short text file in a format similar to an Alexandrian pattern. Each record describes a set of forces and a single decision in response to those forces.”

ADRs are short, quick to produce, need little to no maintenance, and the detail they retain is valuable. A quick note of what you decided, when, and why.

Here’s the anatomy of a decision record outlined by Nygard many years ago.
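As a rough sketch of that anatomy in code, assuming the five sections of Nygard’s template (title, status, context, decision and consequences): the class name, the numbering scheme, the `render` method and the example wording below are all our own illustration, not part of any standard tool.

```python
from dataclasses import dataclass


@dataclass
class DecisionRecord:
    """One ADR, modelled on the five sections of Nygard's template."""
    number: int        # sequential ID (an assumed numbering scheme)
    title: str         # short phrase naming the decision
    status: str        # e.g. "Proposed", "Accepted", "Deprecated", "Superseded"
    context: str       # the forces at play when the decision was made
    decision: str      # the response to those forces
    consequences: str  # what becomes easier or harder as a result

    def render(self) -> str:
        """Render the record as a short plain-text document."""
        sections = [
            ("Status", self.status),
            ("Context", self.context),
            ("Decision", self.decision),
            ("Consequences", self.consequences),
        ]
        lines = [f"{self.number}. {self.title}", ""]
        for heading, body in sections:
            lines += [heading, body, ""]
        return "\n".join(lines).rstrip() + "\n"


# Illustrative content only, not copied from a real record.
adr = DecisionRecord(
    number=1,
    title="Record architecture decisions",
    status="Accepted",
    context="We need to remember why we made the decisions we made.",
    decision="We will write a short decision record for each one.",
    consequences="New team members can read the records to learn the why.",
)
print(adr.render())
```

The point of keeping it this small is that a record should take minutes to write, not hours.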

ADRs in the Wild

Let’s say, hypothetically, we have a friend who lives in the countryside and has the menagerie of family pets you see above. “How did you end up with a mini zoo?” we may ask. Well, it all started with a request from a child…

After consultation with the spouse, the decision to purchase a hamster was accepted. But before they did anything, they recorded the decision. Start with why.

And time goes on. Hamsters don’t live for ever. The decision they originally made is no longer the right one, for reasons outside their control.

Notice that the only thing changing here is the status; the rest of the record is immutable.

And now there is a rabbit.

A status can also be amended by another decision, in this case to adopt a dog.
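One way to sketch the “only the status changes” rule in code is with a frozen dataclass: the record cannot be edited in place, and a status change produces a copy that differs only in that one field. The record contents, the ADR numbers and the status wording (“Deprecated”, “Amended by ADR 3”) are invented here for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class DecisionRecord:
    number: int
    title: str
    status: str
    context: str
    decision: str


def with_status(record: DecisionRecord, new_status: str) -> DecisionRecord:
    """Return a copy that differs from the original only in its status."""
    return replace(record, status=new_status)


# Hypothetical records loosely following the pet story.
hamster = DecisionRecord(
    1, "Buy a hamster", "Accepted",
    "Our child has asked for a pet.", "We will buy a hamster.",
)
rabbit = DecisionRecord(
    2, "Adopt a rabbit", "Accepted",
    "The hamster is no longer with us.", "We will adopt a rabbit.",
)
dog = DecisionRecord(
    3, "Adopt a dog", "Accepted",
    "We have moved to a bigger house.", "We will adopt a dog.",
)

# The original decision stops being the right one: only its status moves on.
retired = with_status(hamster, "Deprecated")

# A later decision can change the status of an earlier one.
rabbit_amended = with_status(rabbit, f"Amended by ADR {dog.number}")

print(retired.status)         # Deprecated
print(rabbit_amended.status)  # Amended by ADR 3
```

The context and decision text are never rewritten, so the record still says what was believed at the time it was made.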

The family has grown, they’ve moved to a bigger house and now they have two pets. In the interest of brevity let’s summarise some of the other decisions made.

Our friends relocated to the countryside, bought a farmhouse and inherited two sheep (legacy infrastructure of an acquisition). There’s an orchard at the farmhouse and orchards need bees for maintenance and monitoring (swarm of microservices).

There was space for chickens and the prospect of free eggs was too tempting to turn down (low effort, high value quick win). The daughter saw the open fields and asked for a pony (as developers often want to use something new and shiny).

Grandma unfortunately passed away, leaving them with a parrot: a legacy system they had to inherit, and one that caused swearing with its integration issues. And then a llama was purchased to protect the sheep from foxes (yes, this is a thing: https://en.wikipedia.org/wiki/Guard_llama). Retrofitting security instead of shifting left.

And that’s how they ended up with a menagerie of pets. Each small step is logical in its own way.

And through the use of decision records they have an audit trail so anyone can understand how and why our friends ended up with all those pets. If they sold the farmhouse, along with most of the animals, the new owner might think to remove the bees, but they wouldn’t make that decision blindly because the decision records show them why the bees are there.

Real World ADRs

Hopefully all this helps to illustrate the value of retaining the detail behind decisions. So what about real-world ADRs, you may ask? Here’s a handful of examples from the GOV.UK GitHub https://github.com/alphagov/govuk-aws/tree/master/doc/architecture/decisions to show how they can be used on a technical project.

The first record is the decision to record decisions. Nice.

They have ADRs for infrastructural decisions.

ADRs on data storage.

APIs, back end services and front end too.

They use diagrams and images to convey context or decisions. Anything that helps to convey the detail is fair game, as long as it isn’t going to change, because an ADR represents a decision at a point in time.

So, the gov.uk team, like our friends at the farmhouse, have an audit trail of decisions from the past, to help them make informed decisions for the future.

Getting the Most Out of ADRs

But capturing context and starting with why is not the only reason we’re adopting ADRs at AO. There are plenty of other compelling reasons too.

Firstly, an ADR is a document, and the sheer act of writing something down forces you to think it through more. Jeff Bezos built a culture of the memo and documentation to great effect for that very reason at Amazon. And ADRs are a low-barrier-to-entry document, requiring little effort, so cynics of documentation are likely to be less resistant to giving them a try.

Secondly, there’s the benefit to the team. Some decisions will be made at organisational level, but the vast majority will be at team level. Even decisions for the organisation will likely originate in a single team. And ADRs are a great catalyst for team discussion. Anyone in the team can offer up a proposal, and the team can discuss, assess and decide as a collective whether to accept or reject it. Doing so promotes autonomy and ownership.

Thirdly, and maybe most important of them all, they help others by being discoverable and searchable. Other teams benefit from understanding and building on the decisions a team has made, raising awareness and spreading knowledge. We’re toying with adding an “inherited from” status to help us track when a team has made a decision based on one that originated from another team. The value of being able to share design decisions can’t be overstated; it reduces the likelihood of siloed working and encourages long-term compatibility, re-usability and scalability of software as systems grow and evolve.

And lastly, they’re a great onboarding tool. Reading through decision records is a great way to help new team members learn about your tech stack, codebase and why you do things the way you do.

We’re rolling ADRs out over the next several weeks at AO. We’ll follow this blog post up in a few months to document all we’ve learnt from introducing ADRs. If we didn’t, this post would be a bit hypocritical after all, wouldn’t it?

By Michael Chadwick and Alastair Brown (Principal Developer)

* Just a quick side note here to say we can’t take credit for the great idea of using increasingly low fidelity images to convey how detail disappears over time; we took that from a superb presentation on ADRs from Michael Keeling and Joe Runde of IBM https://resources.sei.cmu.edu/asset_files/Presentation/2017_017_001_497746.pdf