GlockOClock

Posted on Feb 26

Kaizen Master - Solution for Technical Debt (and Legacy Code)

#agile #programming #productivity #development

(a short cookbook is at the bottom of the page)

~70% of developers call technical debt the main obstacle in their job. The scale of the problem is estimated at a ~$2.4 trillion/year loss in the US alone (e.g. CISQ report, AEIdeas Post). Yeah, that much! Programmers and managers notice the pattern: for the first 1-2 years a product is fine, but then 30-60% of the time goes into fixing things. The company becomes nearly paralyzed.

As an engineering manager, I thought about it and, over the last 1.5 years, tried an approach based on Kaizen. Its goal is to gradually reduce technical debt without diverting many resources. And, chief, it gave results!

Speed-Quality Dilemma

Devs often don’t have time to study a case in depth and find a better solution. Instead, they commit a quick one. This causes a cumulative decline in quality, including a systemic effect, when several problems together cause more damage than the simple sum of them. You get a hole in your pocket through which resources leak away.

Gentler deadlines or maintenance procedures improve quality, but lower speed is bad for the business, which wants to outrun competitors and make a revolution (and pays salaries by the hour).

So, while the needs for both speed and quality are justified, they are rarely met together. But sometimes they are…

Kaizen

Though Kaizen is popular in IT, it’s mostly used in the sense of “continuous improvement”, which is not exactly correct. Short history lesson: it started during WWII, when resources and specialists in the US were scarce, and so was time. The rival’s Blitzkrieg strategy was all about high speed. American industry had to keep up and improvements were needed, but large modernizations were off the table. So the idea of small, periodic, zero-investment improvements came in. Over the long run, little steps had a strategic-level impact. A bit later the approach came to Japan, where, due to cultural features (Bushido, Shinto, Buddhism), it grew into Kaizen.

Kaizen is a methodology of small improvements, spread over time, that simplify, clean up and put things in order. Fixing bugs and adding new features is neither a simplification nor zero-investment. Code refactoring, process reorganization and shortening algorithms - this is Kaizen (philosophical objections possible). And, yes, Kaizen was designed for long-term endeavors with limited resources. Thus, it’s a bit more than just “continuous improvement”.

Transactional Kaizen

Kaizen is often criticized for stopping work whenever a chance to improve is found and pulling in resources to speed up the improvement. Indeed, this can disrupt development, especially in cross-functional teams and matrix-structured organizations, where resources are on demand. Plus, improvements can spin off other improvements, which leads to even bigger disruption. Here comes Transactional Kaizen - a less chaotic approach.

In my current team, we work in 6-week-long transactions (projects/epics). During a transaction, we stick to the planned features and bugs. If we see an opportunity for improvement, we don’t act on it immediately. Instead, we leave a TODO comment in the code (faster than opening the backlog and creating a task).

When a transaction ends, we run a Kaizen “blitz” - a week of improvements. No bugs fixed, no functionality added, only TODOs and refactoring. But that's not all!

What is Improvement

Say we read an object from a database and update it. Under the hood, this makes 2 DB requests, a SELECT and an UPDATE:

# SELECT: load the object from the database
attr = Attraction.objects.get(pk=1)
attr.name = "new name"
# UPDATE: write the object back
attr.save()

Though it’s a common way to update an object, it can be done with just one request:

# A single UPDATE, with no preceding SELECT
Attraction.objects.filter(pk=1).update(name="new name")

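This small modification saves a database round trip, and look at it! We don’t even need a new test case, as we only change existing code. One caveat: update() skips the model’s save() method and the pre_save/post_save signals, so check that nothing relies on them.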

Say we have tests, and we need test objects. E.g. to test a vacation plan, we need attractions (museums, beaches, etc.). Such tests quickly accumulate boilerplate. On a Kaizen blitz, we write an attraction init function with default values and flags for multi-step operations, so we cut the boilerplate and speed up writing future tests.
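To give a feel for such an init function, here is a minimal sketch. The `Attraction` dataclass is a hypothetical stand-in for the real model, and its field names, the default values and the `published` flag are all assumptions made for illustration, not our actual code.

```python
from dataclasses import dataclass


# Hypothetical stand-in for the real model; field names are assumptions.
@dataclass
class Attraction:
    name: str
    kind: str
    city: str
    ticket_price: float
    is_published: bool


def make_attraction(*, published: bool = True, **overrides) -> Attraction:
    """Build a test Attraction; a test overrides only the fields it cares about."""
    defaults = {
        "name": "Test Museum",
        "kind": "museum",
        "city": "Testville",
        "ticket_price": 10.0,
        # The flag stands for a multi-step operation (create, then publish).
        "is_published": published,
    }
    defaults.update(overrides)
    return Attraction(**defaults)


# Each test object now takes one line instead of a block of setup code.
beach = make_attraction(kind="beach", ticket_price=0.0)
draft = make_attraction(published=False)
```

Keyword-only arguments keep the calls readable: a reader of the test sees exactly which properties matter for that case.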

Say we have a Docker container based on some image. During a blitz, we switch it to the image-slim variant to reduce the container size and hence speed up its build.

These are "small zero-investment improvements". I guess, champ, you agree that each one doesn’t take much of your time, 10-30 min maybe. Do them on a periodic basis, and code can become art, though it’ll take some time.

Kaizen Master

Where we step away from classic Kaizen is that only one person does the Kaizen blitz. Why? Imagine you put an entire team (~5 devs) on it. Will the department stop for a week? Isn’t that a big diversion of resources in business terms? We could probably use a shorter blitz, say 1 day, but then we’d need to choose a day and sync the devs, or make each of them choose a day for themselves… And how do we prevent them from grabbing the same TODOs when we don’t use trackers, only code comments? A lot of “ifs”.

So, in each team we appoint a Kaizen Master (KM), who does the blitz after every transaction (project/epic). As you might have guessed, it’s the technical team lead. Here is why:

The lead reviews all pull requests and projects and is thus aware of what’s going on in the code. Others would spend more time studying code they haven’t seen yet.
The lead tracks all the code and thus has an eye on global inconveniences and patterns, like a lack of testing utilities or repeated boilerplate.
The lead is usually the most experienced employee, with architectural skills and strategic vision. These qualities and Kaizen complement each other.
After a transaction is finished, our leads usually spend time with stakeholders and managers, deciding what’s next, what to delegate to whom, etc. A blitz, with its comparatively low load and free-roaming time (there is no plan for a blitz), is a good time for meetings, brainstorming, chaotic calls and approvals.

With this in mind, the tech lead seems the most effective person for the blitz. They’ll make improvements quickly, accurately and with long-term consequences in mind.

When to Apply

Kaizen Master makes sense in fast-growing SaaS/PaaS projects, such as fintech, web services, cloud products, etc. They are long-term, have large multi-tenant codebases and often require work on already implemented functionality (e.g. adding new features to it). And project resources are limited. Huge amounts of work on tight deadlines happen in rich corporations all the time, so don’t take “limited resources” too literally.

KM fits well with teams that work in Shape Up, Scrum, Rapid Development and Extreme Programming methodologies. Basically, any transactional/iterative approach fits.

When NOT to Apply

If a team already has time to run improvement sessions, or deadlines are absent (you heard that right, I worked like this once), Kaizen doesn’t make much sense. Short-term (<1 year) projects, contract development and projects with generic logic (e.g. a REST API limited to CRUD and filtering) will not benefit.

Profits

It causes a systematic rise in code and product quality, though it may take ~8-10 months before the impact becomes significant (which isn’t much in business terms).
Resources are not drastically diverted - the team keeps working, while the KM is on blitz alone. Depending on transaction size, the KM will spend 10-15% of their time on improvements, which isn’t a lot.
No planning, tools or bureaucracy needed - improvements are small, spontaneous and informal.
It gives the KM time to review the codebase (not just out-of-context pull requests), which improves their knowledge of the project and the team, which in turn helps with architecture and strategy.
KM helps to avoid scope creep, as improvements are organized - separated into the blitz.
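The 10-15% figure can be checked with simple arithmetic. The sketch below assumes one 1-week blitz follows each transaction and that the blitz is counted on top of the transaction length; the function name `blitz_share` is made up for this example.

```python
def blitz_share(transaction_weeks: float, blitz_weeks: float = 1.0) -> float:
    """Fraction of the KM's total time spent on blitz, one blitz per transaction."""
    return blitz_weeks / (transaction_weeks + blitz_weeks)


for weeks in (6, 8):
    print(f"{weeks}-week transaction: {blitz_share(weeks):.1%}")
# 6-week transaction: 14.3%
# 8-week transaction: 11.1%
```

So with 6-8 week transactions the KM spends roughly 11-14% of their time on improvements, and the rest of the team spends none.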

In Anuran (a trip-planning web service, where I work), we have a ~4-year-old codebase. 1.5 years ago, our technical debt took ~40% of work time. That’s because we started the project with prototyping - rapidly developing pieces, dropping some, keeping the successful ones and then merging them into a single system. But today, we spend only ~6% of our time on technical debt. Sounds like success, huh?!

Cookbook

So, the finalized recipe of KM:

Transactions:
Split the work into large periods - transactions. In most teams, 6-8 weeks is enough for the code to grow significantly and accumulate problems.

Kaizen blitz:
1 week is enough for the KM to review a lot of code and improve it. No work is planned for the KM during the blitz, but it must not be treated as spare time. It’s reserved for quality work, and the KM should attend only to urgent problems during it.

(That’s why we run the blitz periodically rather than event-driven - there is always something more urgent to do, so an event-driven improvement gets delayed indefinitely. When time is reserved, even though it’s not much, it’s used efficiently.)

Simplify:
Only simplifications are done during a blitz. Bug fixes and features (even tiny ones) are not. Most activity should revolve around refactoring, removing unneeded code, reducing dependencies, writing reusable utilities, consolidating boilerplate, and lowering cyclomatic and time complexity.

Point improvements:
If an improvement takes more than a day, split it into smaller pieces. It’s fine to do a part and continue on the next blitz.

Choose KM:
It’s recommended to choose the tech lead as the KM because of their architectural and strategic skills, command of best practices and up-to-date knowledge of the codebase.

Write TODOs:
For KM to work, devs should write TODOs whenever they spot a potential improvement or an inconvenience.
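Since the TODOs live only in code comments, the KM needs a quick way to list them at the start of a blitz. Below is a minimal sketch of a plain scan, not a tracking tool. It assumes Python-style `# TODO` comments, and the names `find_todos_in_text` and `find_todos` are made up for this example.

```python
import re
from pathlib import Path
from typing import Dict, List, Tuple

# Matches '# TODO', '# TODO:' or '#TODO ...' and captures the note after it.
TODO_PATTERN = re.compile(r"#\s*TODO\b[:\s]*(.*)")


def find_todos_in_text(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, note) for every '# TODO' comment in the text."""
    found = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = TODO_PATTERN.search(line)
        if match:
            found.append((number, match.group(1).strip()))
    return found


def find_todos(root: str, suffix: str = ".py") -> Dict[str, List[Tuple[int, str]]]:
    """Scan files under root and map each file path to its TODO notes."""
    result = {}
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        todos = find_todos_in_text(path.read_text(encoding="utf-8", errors="ignore"))
        if todos:
            result[str(path)] = todos
    return result


sample = '''attr = Attraction.objects.get(pk=1)  # TODO: replace with a single update()
attr.save()
# TODO merge with the helper in utils
'''
for number, note in find_todos_in_text(sample):
    print(f"line {number}: {note}")
```

Calling `find_todos(".")` from the repository root gives the KM the whole list in one go; an ordinary text search for "TODO" in the editor does the same job.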

Do not:

Use complex tools for TODO tracking. KM is a lightweight methodology, not a new form of bureaucracy.
Treat KM as an excuse to write bad code because there will be time to fix it later. Technical debt piles up exactly because of this logic.
Make it complicated. A 3-day blitz instead of a week, or transactions of varying sizes, cause frustration. Keep the blitz 1 week long and choose the transaction length according to how fast the codebase grows.

Conclusion

That's it, that's the methodology we designed in Anuran to deal with technical debt. I think you'll agree it's very lightweight and informal (which is the goal) but may still bring results. If you're going to try it with your team, it'd be nice if you shared some feedback and observations. It may help to sharpen KM even more.

DEV Community