David Montoya

Posted on Jan 28, 2022 • Edited on Feb 1, 2022

How to approach a codebase a la DevOps

#devops #productivity #beginners #sre

We devs often have to jump from repo to repo as we work through implementing a new feature or making a change to an app or API (hello SREs). Approaching a complex codebase that we haven't touched before (or recently) can be a daunting task. Having a systematic approach for getting acquainted with a codebase before rushing to introduce change will give you a more encompassing view of the code, help you put the required change in context, and save you from shaving the wrong yak.

Whether solo or pair programming, small or large codebase, open source or proprietary code, follow these steps before you start hacking away.

1. Start from the README

A README is the de facto index page of a program or codebase for users and future maintainers. Good READMEs welcome developers to self-service code changes in open organizations. Codebase owners should ensure the most important things a maintainer needs to know about the app are documented here, along with a quick try-it-yourself guide and one-liners to build, test, or set up the app. At a minimum, the README should serve as an index that points to more detailed documents and diagrams.

Questions to ponder: What does this codebase do? Does it have tests? Can I install it? Does it have diagrams?

2. Poke at the CI pipeline

Looking at a codebase from the perspective of the CI pipeline gives you insights into the change frequency, stability, and overall health of the codebase. Confirming the codebase is in a healthy state and "ready for change" before making a code change can save you from going down rabbit holes troubleshooting errors unrelated to your change.

When exploring a CI pipeline, look for common failures and signs of flaky tests so you know what to expect when running the tests locally. Browsing through recent builds (and commits) can reveal patterns about the common types of change, the average size of a change, or major refactorings or features that have just been introduced. From the list of releases, you can tell the "release cadence" and when to expect your change to make it to production. Lastly, rerun the most recent job to confirm the pipeline is idempotent, meaning the build artifacts it outputs are consistent on every run.
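To make the "signs of flaky tests" idea concrete, here is a minimal sketch. It assumes you have exported build history into simple records yourself; the `BuildResult` shape and the sample data are hypothetical and not tied to any particular CI tool. A test that both passed and failed on the same commit is treated as a flakiness suspect.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildResult:
    """One test outcome from one CI build (hypothetical record shape)."""
    commit: str   # revision the build ran against
    test: str     # test name
    passed: bool


def find_flaky_tests(results):
    """Return tests that both passed and failed on the same commit."""
    outcomes = defaultdict(set)
    for result in results:
        outcomes[(result.commit, result.test)].add(result.passed)
    # Two distinct outcomes for the same (commit, test) pair hints at flakiness.
    return sorted({test for (_, test), seen in outcomes.items() if len(seen) == 2})


def failure_rate(results):
    """Fraction of recorded outcomes that failed (0.0 when there are none)."""
    results = list(results)
    if not results:
        return 0.0
    return sum(not r.passed for r in results) / len(results)


# Made-up history for illustration only.
history = [
    BuildResult("abc123", "test_login", True),
    BuildResult("abc123", "test_login", False),    # same commit, different outcome
    BuildResult("abc123", "test_checkout", True),
    BuildResult("def456", "test_checkout", False),  # failed after a code change
]

print(find_flaky_tests(history))  # ['test_login']
print(failure_rate(history))      # 0.5
```

Note that `test_checkout` is not flagged: it failed on a different commit, which points to a real regression rather than a flaky test.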

If the pipeline is red, adding new revisions would only increase noise and make it harder for others to troubleshoot. Hold off pushing your changes until the pipeline is green again.

Questions to ponder: Is the pipeline green? When was the last time it ran? Does it fail often? Does it perform linting? Does it have flaky tests? Does it have e2e tests? Who was a recent contributor I could reach out to for help?

3. Run the tests from your local

Running the test suite from your machine gives you a baseline for when you start hacking away on code and iterating on the new test case. Running the tests can yield some insights into the level of test coverage, testing patterns used by maintainers, potential external dependencies, and the overall maintainability of the codebase. Codebases with consistent test patterns and sensible test coverage make it safe and efficient to introduce change.

Questions to ponder: Are the tests passing? Does it even have tests? Can the tests run on my machine? Do I have the required dependencies? Does it have external dependencies? Can the tests run with my Wi-Fi off? Can I add a new test?
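If the codebase happens to be Python, a rough stand-in for switching the Wi-Fi off is to make outgoing connections fail loudly while the tests run. The sketch below is only an illustration: `network_disabled`, `NetworkBlockedError` and `ping_service` are made-up names, and it only intercepts `socket.socket.connect`, so it will not catch every way a program can reach the network.

```python
import socket
from contextlib import contextmanager
from unittest import mock


class NetworkBlockedError(RuntimeError):
    """Raised when code under test tries to open a network connection."""


@contextmanager
def network_disabled():
    """Make every socket.connect() call fail while the block is active."""
    def refuse(self, address, *args, **kwargs):
        raise NetworkBlockedError(f"network access attempted: {address!r}")

    # Temporarily replace connect() on the socket class; restored on exit.
    with mock.patch.object(socket.socket, "connect", refuse):
        yield


def ping_service():
    """Hypothetical code under test with a hidden network dependency."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.connect(("127.0.0.1", 9))
    finally:
        sock.close()


with network_disabled():
    try:
        ping_service()
    except NetworkBlockedError as err:
        print(f"hidden dependency found: {err}")
```

Wrapping a test run this way turns a silent external dependency into an explicit failure that names the address being contacted.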

4. Identify the entry-point

The entry-point in a software program determines how it is initiated and executed. Knowing where the entry-point is gives you an idea of how to consume and test the code you're about to change. Most apps perform some form of configuration task upon start. Any configuration required to run the app is likely read and validated near the entry-point. When introducing new configuration options to an app, like a new environment variable, the entry-point is a good place to start.
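As an illustration of configuration being read and validated near the entry-point, here is a minimal sketch. The variable names `APP_PORT` and `APP_LOG_LEVEL`, their defaults, and the `Config` type are assumptions invented for this example; a real codebase will have its own.

```python
import os
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    port: int
    log_level: str


def load_config(env):
    """Read and validate settings from an environment-like mapping, failing fast."""
    raw_port = env.get("APP_PORT", "8080")  # hypothetical variable and default
    try:
        port = int(raw_port)
    except ValueError:
        raise ValueError(f"APP_PORT must be an integer, got {raw_port!r}") from None
    if not 1 <= port <= 65535:
        raise ValueError(f"APP_PORT must be between 1 and 65535, got {port}")

    log_level = env.get("APP_LOG_LEVEL", "INFO").upper()  # hypothetical variable
    if log_level not in {"DEBUG", "INFO", "WARNING", "ERROR"}:
        raise ValueError(f"APP_LOG_LEVEL is not a known level: {log_level!r}")

    return Config(port=port, log_level=log_level)


def main():
    """Entry-point: configuration is the first thing that happens."""
    try:
        config = load_config(os.environ)
    except ValueError as err:
        print(f"configuration error: {err}", file=sys.stderr)
        return 1
    print(f"starting on port {config.port} with log level {config.log_level}")
    return 0


print(load_config({"APP_PORT": "9000", "APP_LOG_LEVEL": "debug"}))
```

Passing the environment in as a plain mapping keeps `load_config` easy to test without touching the real process environment, which is handy when you add a new option.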

What the entry-point looks like depends on what the program does and how it's consumed. In HTTP-based programs like web apps or JSON APIs, the entry-point is an "HTTP server" that opens a port and accepts TCP connections. You'd then need a client to consume it. Search the codebase for occurrences of that port or references to HTTP resource paths. In the case of libraries, the entry-point would be a set of public interfaces or methods that expose a certain functionality. Tests are a good place to start when digging into library code. For command line apps (or CLIs), the entry-point would be a "command" function that's meant to be invoked from a terminal once installed.

In Dockerized apps, start by looking at the Dockerfile or docker-compose.yaml files. If an entry-point is not explicitly configured, one will be required when launching the container on the target platform. If running in Kubernetes, the entry-point would then be found in the command (and args) fields of the container definition in the Pod specification.
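For a quick first pass over a Dockerfile, a small script can pull out the instructions that define how the container starts. This is a sketch under simplifying assumptions: `find_entrypoints` is a hypothetical helper, it ignores backslash line continuations, and it treats the whole file as one stage by keeping the last `ENTRYPOINT` and `CMD` it sees. The sample Dockerfile and the `myapp` module are made up.

```python
import re

# Dockerfile instruction names are case-insensitive; commented-out lines won't match.
_INSTRUCTION = re.compile(r"^\s*(ENTRYPOINT|CMD)\s+(.+?)\s*$", re.IGNORECASE)


def find_entrypoints(dockerfile_text):
    """Return the last ENTRYPOINT and CMD values found in the Dockerfile text."""
    found = {"ENTRYPOINT": None, "CMD": None}
    for line in dockerfile_text.splitlines():
        match = _INSTRUCTION.match(line)
        if match:
            # Later instructions overwrite earlier ones.
            found[match.group(1).upper()] = match.group(2)
    return found


sample = """\
FROM python:3.12-slim
COPY . /app
# CMD ["old", "command"]
ENTRYPOINT ["python", "-m", "myapp"]
CMD ["--port", "8080"]
"""

print(find_entrypoints(sample))
```

If both values come back as `None`, the start command is probably inherited from the base image or supplied by the platform that launches the container.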

Questions to ponder: How does it run? How is it initialized? Does it need configuration? How is it executed in production?

5. Read up. Spot the patterns.

At this point, we should have a better view of the state and form of the codebase. The next step is to look at its structure and the actual business code. Every codebase has patterns set forth by the early or main maintainers. Depending on the size and type of change, you may need to emulate or adapt those patterns. Spotting the patterns, practices, and overall structure of the code puts the required code change in context and keeps you focused on introducing only the necessary code modifications. For consistency's sake, practices and conventions already established across the codebase should prevail over individual preferences.

If the domain model is not too anemic, scan for types (or classes) and their publicly scoped methods.
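Scanning for types and their public methods can be partly automated. The sketch below assumes a Python codebase and uses the standard `ast` module. `public_api` is a hypothetical helper, the `Invoice` class is made up, and "public" here simply means the method name does not start with an underscore.

```python
import ast


def public_api(source):
    """Map each top-level class name to its public method names, in source order."""
    tree = ast.parse(source)
    api = {}
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            api[node.name] = [
                item.name
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
                and not item.name.startswith("_")  # skip private and dunder methods
            ]
    return api


sample = '''
class Invoice:
    def __init__(self, total):
        self.total = total

    def apply_discount(self, percent):
        self.total *= 1 - percent / 100

    def _audit(self):
        pass

    async def send(self):
        pass
'''

print(public_api(sample))  # {'Invoice': ['apply_discount', 'send']}
```

Because the source is parsed rather than imported, this works even when the code cannot run on your machine yet.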

Questions to ponder: Where does my code change fit in all this? Can I change the code with the least impact to existing features? Can the codebase support the required change or will it require a refactoring?

At this point I'd strongly encourage you to practice TDD and write a unit test first before actually jumping into changing the code, but that's a subject for another post :)

And you? How do you approach a codebase?
