TL;DR: Every codebase drifts. Codebases rarely break because of one big problem; they erode through small changes (like an import added for analytics tracking) that compound into serious maintenance problems over time. ESLint and unit tests won't catch these issues. Six months later, you're afraid to touch anything. This article explains what causes drift in your codebase and what you can do about it.
I once submitted a three-line PR: a simple import statement that would let other developers track user activity.
Then the notification came in: 47 tests failed.
Not because the code was wrong, but because that one import created a circular dependency across four packages, and the test runner could no longer load them at all.
Sound familiar?
Production software written in JavaScript or TypeScript rarely falls apart all at once.
It collapses over time.
The tests keep passing. ESLint stays quiet.
And yet, with every release and every build, changing the code makes you a little more anxious.
Your developers start hinting that there are parts of the code they'd rather not touch.
"Don't touch the utils file" becomes a common refrain.
Every developer has lived through this loss of confidence in their own changes.
This is architectural drift: not the act of designing a bad architecture up front, but the gradual decay of an established one, until supporting and extending the system becomes harder and harder.
Some warning signs that architectural drift has set in:
- Files that started under 50 lines have grown by hundreds or thousands.
- Changing one property on a type definition forces you to modify 15 or more other files.
- Nearly every source file imports from @app/shared.
- Nobody on the team can explain why UI code is imported in the backend.
- Tests fail sporadically with errors like "Cannot access 'X' before initialization".
- Two developers working in completely separate areas keep hitting merge conflicts on the same file.
The anatomy of drift: real examples
If two or more of these apply to you, you're very likely experiencing architectural drift. Here's what it looks like in practice.
Example 1: The "Friday afternoon import"
It's 5:47 PM. You need to ship a feature. The orders module needs to call something from payments.
```ts
// orders/createOrder.ts
import { validatePaymentMethod } from '../payments/validation';

export async function createOrder(items: Item[], paymentMethod: string) {
  if (!validatePaymentMethod(paymentMethod)) {
    throw new Error('Invalid payment method');
  }
  // ... rest of the logic
}
```
Seems fine, right? But payments already imports from orders:
```ts
// payments/processPayment.ts
import { getOrderTotal } from '../orders/utils';

export async function processPayment(orderId: string) {
  const total = await getOrderTotal(orderId);
  // ...
}
```
Congratulations: you've just created a circular dependency.
Bundlers are forgiving, so your code may still work. But these two modules can no longer be separated. You can't test them independently. You can't extract either one into its own package. And sooner or later, import order starts to matter, and you get baffling 'undefined' values at runtime.
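One way out is to extract the shared piece into a module that both sides can depend on. A minimal sketch; the payments-core name and file layout are illustrative, not from the original PR:

```ts
// payments-core/validation.ts: depends on neither orders nor payments
export function validatePaymentMethod(method: string): boolean {
  return ['card', 'paypal', 'bank_transfer'].includes(method);
}

// orders/createOrder.ts: imports the shared core instead of payments
import { validatePaymentMethod } from '../payments-core/validation';

// payments/validation.ts: re-exports (or also imports) the core
export { validatePaymentMethod } from '../payments-core/validation';
```

Removing a single edge is enough: payments may still import getOrderTotal from orders, but orders no longer imports from payments, so the graph is acyclic again.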
Example 2: The Growing God Module
Every project has that one magical file developers reach for first. It usually starts innocently enough:
```ts
// shared/utils.ts
export function formatDate(date: Date): string { /* ... */ }
export function formatCurrency(amount: number): string { /* ... */ }
```
Six months later:
```ts
// shared/utils.ts - now 2847 lines
export function formatDate(date: Date): string { /* ... */ }
export function formatCurrency(amount: number): string { /* ... */ }
export function validateEmail(email: string): boolean { /* ... */ }
export function parseJWT(token: string): JWTPayload { /* ... */ }
export function calculateTax(amount: number, region: string): number { /* ... */ }
export function sendAnalytics(event: AnalyticsEvent): void { /* ... */ }
export function debounce<T extends Function>(fn: T, ms: number): T { /* ... */ }
export function deepClone<T>(obj: T): T { /* ... */ }
export function encryptPassword(password: string): string { /* ... */ }
// ... 200 more exports
```
The file now has:
- 47 modules depending on it (fan-in)
- 23 dependencies of its own (fan-out)
- 89 commits in the last quarter (churn)
It is no longer a utility module. It is a god module: a single point of failure that makes your entire codebase fragile.
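You don't need heavy tooling to get a first read on those numbers. Below is a rough fan-in counter: a sketch that assumes a src/ tree, static ESM imports, and a literal module specifier (the fan-in.ts file name and the naive regex are simplifications):

```ts
// fan-in.ts: count how many source files import a given module (rough sketch)
import { readdirSync, readFileSync, statSync } from 'node:fs';
import { join } from 'node:path';

// Recursively yield all JS/TS source files under a directory.
function* walk(dir: string): Generator<string> {
  for (const entry of readdirSync(dir)) {
    const full = join(dir, entry);
    if (statSync(full).isDirectory()) yield* walk(full);
    else if (/\.(ts|tsx|js|jsx)$/.test(full)) yield full;
  }
}

// Count files whose import specifiers mention the given module path.
function fanIn(root: string, specifier: string): number {
  let count = 0;
  for (const file of walk(root)) {
    const source = readFileSync(file, 'utf8');
    // Naive match: import ... from '<anything containing the specifier>'
    if (new RegExp(`from\\s+['"][^'"]*${specifier}[^'"]*['"]`).test(source)) {
      count += 1;
    }
  }
  return count;
}

console.log(fanIn('./src', 'shared/utils')); // e.g. 47: time to worry
```

A module that scores high on fan-in, fan-out, and churn at the same time is exactly the kind of file this example describes.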
Example 3: Layer Violation Cascade
You designed a clean architecture: the domain logic stays pure, and infrastructure handles all the dirty work.
Then someone added analytics:
```ts
// domain/userService.ts
import { trackEvent } from '@app/ui/analytics'; // ❌ domain importing from UI

export function registerUser(data: UserData) {
  const user = createUser(data);
  trackEvent('user_registered', { userId: user.id });
  return user;
}
```
What's the harm? It's just one extra import!
Except now:
- Your domain depends on the UI.
- Infrastructure, which depends on the domain, now transitively depends on the UI as well.
- Your "pure" domain needs the entire React dependency tree underneath it just to function.
- You can no longer extract the domain logic into a shared library.
- You have to mock UI analytics just to test domain logic.
One extra import, and the damage cascades.
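The classic fix is dependency inversion: the domain declares an interface for what it needs, and an outer layer supplies the implementation. A minimal sketch, reusing UserData and createUser from the example above; the DomainEvents name and file layout are illustrative:

```ts
// domain/events.ts: the domain declares the capability it needs, nothing more
export interface DomainEvents {
  userRegistered(userId: string): void;
}

// domain/userService.ts: depends only on the domain-owned interface
import type { DomainEvents } from './events';
import { createUser, type UserData } from './user';

export function registerUser(data: UserData, events: DomainEvents) {
  const user = createUser(data);
  events.userRegistered(user.id);
  return user;
}

// infra/analyticsEvents.ts: the outer layer wires analytics in
import { trackEvent } from '@app/ui/analytics';
import type { DomainEvents } from '../domain/events';

export const analyticsEvents: DomainEvents = {
  userRegistered: (userId) => trackEvent('user_registered', { userId }),
};
```

The domain now compiles and tests without any knowledge of the UI; analytics gets plugged in at the edge of the application.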
Example 4: Shotgun Surgery
Adding the "Middle Name" field for users was an incredibly painless modification.
Files modified: 23
Lines added: 147
Lines deleted: 12
You had to update:
- Domain
  - User type definition
- Transport / API
  - UserDTO
  - CreateUserRequest
  - UpdateUserRequest
  - UserResponse
- Infrastructure
  - Database migration
  - 2 seed scripts
- Application Layer
  - 3 API endpoints
  - 5 validation functions
- UI
  - 2 form components
- Tests
  - 4 test files
This is called shotgun surgery.
A single logical change resulted in modifications scattered across the entire codebase.
The entities are so tightly coupled that it becomes impossible to change one without touching many others.
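You can't always avoid touching the migration or the form, but the type-level duplication is fixable: define the shape once and derive the transport types from it. A sketch with illustrative field names, not the project's actual types:

```ts
// domain/user.ts: one definition; the new field is added exactly once
export interface User {
  id: string;
  firstName: string;
  middleName?: string; // the new field
  lastName: string;
  email: string;
}

// transport/dto.ts: request/response shapes derived from the domain type
import type { User } from '../domain/user';

export type CreateUserRequest = Omit<User, 'id'>;
export type UpdateUserRequest = Partial<Omit<User, 'id'>>;
export type UserResponse = User;
```

With derived types, adding middleName no longer means editing four hand-maintained DTOs; the change propagates from the single source of truth.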
JavaScript and TypeScript projects are especially prone to architectural drift, precisely because the ecosystem makes everything so easy.
Easy Imports
Imports require no thought and trigger no compiler pushback. Just type import { something } from 'wherever' and you're good to go.
Boundaries Are Only Suggestions
Module boundaries aren't enforced the way Java packages or .NET assemblies are; everything rests on conventions, and conventions are easy to ignore.
Monorepos Grow Unchecked
What started out as 3 packages has grown to 27. The original maintainers have all left, and nobody remembers why the @app/legacy-utils package exists.
Loss of Context
The comment that says "TEMPORARY - remove after Q3 migration" is still there, written in 2021. Whatever context existed is gone.
ESLint is a Great Tool
ESLint is excellent at what it was designed for:
- Code quality
- Style consistency
- Catching common bugs
But it works one file at a time, on the current snapshot of the code. It has no notion of the module graph or its history.
Even no-restricted-imports falls short:
- You have to know up front what to restrict.
- The rules don't adapt as the architecture evolves.
- It can't answer the question that matters: does this PR make the architecture better or worse?
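For concreteness, here is what the static-rules approach looks like using the no-restricted-paths rule from eslint-plugin-import (the rule is real; the directory layout and zones are illustrative). Every boundary has to be enumerated by hand, up front:

```js
// .eslintrc.cjs: illustrative layer guards
module.exports = {
  plugins: ['import'],
  rules: {
    'import/no-restricted-paths': [
      'error',
      {
        zones: [
          // files in domain/ must not import from ui/ or infra/
          { target: './src/domain', from: './src/ui' },
          { target: './src/domain', from: './src/infra' },
        ],
      },
    ],
  },
};
```

Useful, but static: the zone list only knows what you told it, and it never learns from how the codebase actually evolves.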
Architectural drift is temporal. ESLint is not.
What about code review? In theory, reviewers can catch architectural drift. In practice, review is good at judging code cleanliness and quality but far weaker at catching drift, because:
- Reviewers rotate, so nobody carries the historical context of how the code was built.
- "I'll fix this later" becomes the norm, and later never comes.
- Nobody remembers what the architecture was supposed to look like, so the old diagrams and documents quietly become fiction.
And drift happens while all the code keeps working correctly. Without automated tooling, it is easy to miss that the current structure no longer resembles the intended one.
The real problem: architecture has no baseline
The central problem is:
We do not treat architecture as a versioned artifact.
Files are tracked in Git, and CI checks their current state.
Architecture, meanwhile, lives in:
- Architecture Decision Records that no one ever reads
- Diagrams that are 3 versions out of date
- The minds of departed developers
Because of this, teams have no way of reliably answering this question:
Will this PR introduce an architectural regression?
With no baseline, every change is made in a vacuum.
A different approach: regression-based architecture
Instead of chasing a "perfect" architecture, you can flip the goal:
- Don't redesign it.
- Don't refactor everything.
- Accept the modules as they are today, and allow no new architectural regressions.
The process has four steps:
1. Analyze your current design and module structure.
2. Record that structure as the initial baseline.
3. Compare every pull request (PR) against the baseline.
4. Fail CI only when a new architectural violation appears.
Think of it the same way you think of:
- Regression Testing
- Snapshot Testing
- Performance Budgets
You are not asserting that the architecture is good. You are asserting that it has not gotten worse.
What this looks like in practice
Consider a typical monorepo:
```
packages/
  ui/      → depends on domain
  domain/  → depends on shared
  infra/   → depends on domain
  shared/  → depends on nothing
```
Everything is reasonably layered. You capture this as a baseline.
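A baseline can be as simple as a checked-in snapshot of the allowed dependency edges. The format below is hypothetical, purely to make the idea concrete:

```json
{
  "modules": ["shared", "domain", "infra", "ui"],
  "allowedDependencies": {
    "ui": ["domain"],
    "domain": ["shared"],
    "infra": ["domain"],
    "shared": []
  }
}
```

Because the snapshot lives in Git, any change to the architecture shows up in a diff and gets reviewed like any other change.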
Now someone opens a PR:
```ts
// domain/userService.ts
import { trackEvent } from '@repo/ui/analytics';
```
With regression-based checking, CI responds:
```
❌ Architectural regression detected

New violation: domain → ui
  domain/userService.ts imports @repo/ui/analytics

This dependency did not exist in the baseline.
It introduces a layer violation (domain importing from ui).
```
Crucially, there is nothing to argue about. The report is purely factual: this dependency did not exist before, and here is what introducing it implies.
The developer now has two options:
- Fix the problem (for example, move the analytics call out of the domain, as in Example 3).
- Deliberately update the baseline after agreeing on it with the team.
Either way, the change is a conscious decision instead of an accident.
Tools that can help
A few tools support this kind of approach.
Archlint is one option, built specifically for JavaScript and TypeScript projects. It:
- Analyzes how modules depend on each other
- Creates and maintains an architectural baseline
- Compares pull requests against that baseline
- Flags smells such as circular dependencies, god modules, and layer violations
It won't redesign your system for you. It makes architectural drift visible early, while fixing it is still cheap.
Key takeaways
Architectural drift happens gradually; no single PR is to blame. The existing tools (ESLint, automated tests, manual code review) cannot track it, and hand-written static architecture rules fall short because they do not evolve with the codebase the way drift does. The practical approach is to stop chasing an "ideal" architecture and track regressions instead.
We should treat architecture the way we treat behavior: it can regress over time, so it deserves regression checks of its own, with a baseline to compare against and a deliberate process for changing that baseline.
You may never have a perfect architecture, but you can be confident that it is not quietly getting worse.
What's the worst architectural drift you've seen in a codebase? Drop a comment below; I'd love to hear your horror stories.