On "technical debt" - Donald Erik Schneider

The term “technical debt” is a useful one. Unfortunately, as with anything useful, it is overused and used sloppily. It’s also sometimes misused to disparage a historical decision somebody disagrees with. But more than that, I think it is broadly misinterpreted. The term is intended to help justify work that has no immediate customer benefit, but rather is expected to help us execute faster. The business analysis on the value of a new feature customers are clamoring for is easier to quantify than the value of a system re-architecture. It can be hard to justify investment in cleaning up the code.

But that’s not really what “debt” means.

Most of us think about debt from the perspective of personal finance – that is, largely as a convenient way to buy things and pay for them later. Or perhaps from a debtor’s prison perspective: something to be avoided, or certainly paid off as quickly as possible. But when engineers raise the topic of technical debt, they’re often communicating with business owners, who often have degrees in business or economics. To such a business person, this is simply not what “debt” means. When an engineer talks about technical debt to scare managers into worrying about operations (“We have a ton of technical debt, we need to fix it!”), it kinda sounds to sophisticated business people like… well, like what those same business people sound like when they try to justify new technologies to technologists: “I want you to blockchain the virtual. Right away. Because it seems like we don’t have enough AI, yet.”

A business leader may be thinking about things very differently. Many may think about debt in terms of leverage. Well-structured debt is a benefit; it provides extra money we can use today. So if you talk to an options trader about having a ton of technical debt, their reaction is likely to be, “That’s fantastic! How do we get more?”

It’s not about debt.

The term invites us to focus on the wrong part of the problem. The casual interpretation of debt suggests that problems accrue because we are borrowing time from the future and being lazy about paying off our debts, and that is inaccurate and unfair. In fact what we’re doing is making intentional, mindful decisions about opportunity cost.

It’s not debt; it’s future value. Those are related but different things.

I’ve often said that “every keystroke a developer makes is a business decision.” We are, every day, in almost everything we do, continuously trading off opportunity cost: should I do a thing now, or wait and do it later so we can do something else now? When I decide it’s not worth adding a comment to a piece of code, I’m fundamentally trading off the value of investing the time right now to clarify my intent against the time a future developer (usually me) needs to spend figuring out what the hell that code is doing. When I decide not to introduce a new abstraction to make the code more flexible, I’m trading off the value of investing time right now so it will be easier to do something else later. These are all opportunity cost decisions.

Sometimes… you just do it

There is a subtle consideration that is often overlooked in these conversations. In many cases, just evaluating the opportunity cost of doing something costs as much as or more than doing it. When an activity is small enough, trying to decide whether the investment is worth it can be more work than the work itself. When in doubt, we should just do it. The first question to ask is whether it is even worth worrying about the cost of an investment. Small changes, incremental changes, housekeeping, etc., are all things that often take less time to do than to measure. The concept of “refactoring” is partially founded on this principle! So for small or medium code refactoring:

I always clean as I go. It’s like housework. You never really need to vacuum, because if you leave it until you actually need to vacuum, your carpet is [elided].
If I have trouble understanding some code I need to work on, I’ll often refactor to help clarify. This requires good unit tests, but of course if you don’t have good unit tests what you’re doing isn’t refactoring. So I’ll add unit tests first, which also helps explain the code (“wait, it does what now?” is sometimes a very useful test result). Then if needed I’ll refactor to improve comprehensibility.

Continuous housekeeping tasks – especially operational hygiene, such as good logging for diagnostics, operational metrics (you should never ever fail because you run out of disk space!), continuous deployment, keeping operating system and libraries up to date, etc. – are things we should not waste time debating. If you have reached the point where you have to worry about the investment it takes to upgrade your servers, the failure happened a long time ago. That’s not technical debt, it’s poor operational hygiene.

Sometimes… you gotta think

But when times are hard and things are bad, sometimes you need to worry about the cost/benefit of an investment. When that happens, the considerations change. Such investments require more justification, but the underlying trade-off is pretty straightforward. As my current manager says, “If it saves us more time than it costs us, we make the investment. It’s just that simple.” Quantifying this isn’t easy, but that’s what we’re trading off. Large changes, whether it be factoring the code, re-architecting your system, or just replacing it outright, need real forethought about the return on the investment. The rubric is not “debt” – it’s simply trying to decide whether the investment will pay off in the long term or not.

We cannot get that completely right – there is always risk that we’ve overestimated the return, or that the technology simply won’t be around long enough to justify the investment – but that is the decision we’re trying to make. Once again, it’s not about debt; it’s about opportunity cost and return on investment.
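To make that trade-off concrete, here is a minimal sketch in Python. The function names, the choice of hours as the unit, and the single "confidence" discount for the risk of an overestimated return are all assumptions made for illustration; a real analysis would be messier.

```python
def net_hours_saved(cost_hours, hours_saved_per_month, months_in_service, confidence=1.0):
    """Expected hours gained over the system's remaining life, minus the up-front cost.

    confidence (0.0 to 1.0) is a hypothetical discount for the risk that
    the estimated savings turn out to be too optimistic.
    """
    expected_savings = hours_saved_per_month * months_in_service * confidence
    return expected_savings - cost_hours


def worth_doing(cost_hours, hours_saved_per_month, months_in_service, confidence=1.0):
    """True only if the investment is expected to save more time than it costs."""
    return net_hours_saved(
        cost_hours, hours_saved_per_month, months_in_service, confidence
    ) > 0


if __name__ == "__main__":
    # A 120-hour cleanup that saves the team 10 hours a month.
    print(worth_doing(120, 10, months_in_service=24))                  # system lives two more years
    print(worth_doing(120, 10, months_in_service=6))                   # system is retired in six months
    print(worth_doing(120, 10, months_in_service=24, confidence=0.5))  # we only half trust the estimate
```

The same 120-hour investment flips from worthwhile to not worthwhile when the system's remaining life shrinks or when we trust the estimate less, which is exactly the risk described above.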

So, many technical leaders at Amazon lean away from the term. It’s not that the concept is wrong; it’s that it puts all change into a single basket. The question is not whether the code has problems; it’s code, it has problems. The question is whether the value we get in return for fixing it justifies the investment in fixing it.

It’s not technical debt. It’s technical opportunity cost.

4 thoughts on “On “technical debt””

Jeff Martin says:

March 1, 2021 at 8:18 am

Great article, I never really thought about non-technical folks hearing the debt part of the problem and potentially thinking it’s not only something else, but something good instead of bad.

The way of thinking about opportunity cost for each line of code, or keystroke, is also good. Developers/Engineers need to have this in our heads all the time. It’s part of what makes good engineers good instead of just code typists.

1. Luke West says:
  
  March 1, 2021 at 12:21 pm
  
  This is a great way of looking at the subject. I’ve hated “technical debt” as a term. It’s as if the developers purposefully made rubbish code which you have to pay to get them to fix. So it’s a problem.
  Your view is that it’s an opportunity to make it better.
  I love it.
  
Ene says:

March 29, 2021 at 1:22 am

What’s up with the scroll on this page? For every single step I turn the mouse wheel, the view moves maybe twice as far as I expected. Also the heavily condensed font is hard to read. I had to copy the content to a text document to read it without annoying distractions.
