Don't let that huge codebase scare you! Tips and tools to make sense of other people's code

Mohammadjavad Raadi (@mjraadi) ・4 min read
Image: Cherlat.com, Getty Images/iStockphoto

There comes a time in every developer's life when you start a new job, join a development team, or decide to contribute to an open source project, and you're faced with a large, unfamiliar codebase and a bug to fix. The codebase may be bigger than anything you've seen before, but don't worry: I'm going to share a few tips and tools for making sense of other people's code without going crazy.

  1. You Don't Necessarily Need to Know Everything 🤯

    It was a game changer for me to realize that I simply can’t understand everything. Lots of beginner engineers are ambitious and want to read everything. The spirit is good; however, no, you simply can’t, period. It is important to prioritize what code you want to understand, and what code to skip.

  2. Read the Docs 📖

    The first place I start with a new project is any available documentation or README files. This lets me get familiar with the setup, functions, style, and other important parts of the codebase. Some parts of the documentation might be out of date, but even then, seeing how the code has evolved helps you understand the project's history. Unfortunately, documentation is sometimes incomplete or simply wrong, so don't rely on it alone.

  3. Use Command Line Search Tools 🔍

    It is not always easy to be immediately effective the first time you dig into a codebase containing several thousand lines. My secret weapon for finding my way through that much code is command line search tools like grep and ack. Searching for a unique string or keyword is an excellent way to find out where a piece of functionality lives without opening a text editor. I'll demonstrate how I use this handy method to find out which file(s) I need to look at to fix a bug.

    A Practical Takeaway

    I've recently started contributing to the DEV source code, and I highly recommend that everyone do the same. It has quite a large codebase and the maintainers are extremely welcoming. I believe it is a perfect place for junior developers to contribute to open source and put their knowledge into practice.
    I identified a bug, reported the issue, and wanted to try fixing it myself. DEV's backend is written in Ruby and I am not a Ruby developer, but what I needed to fix was the HTML returned to the browser. I had no idea which file I needed to look at or where the relevant function was.

    Grep

    I opened up the dev tools and realized that the <div> element in question has a class name of ltag__twitter-tweet__video. Running the string through grep I found three files I needed to look at:

$ grep -iRl "ltag__twitter-tweet__video" app/
app/assets/stylesheets/ltags/TweetTag.scss
app/views/liquids/_tweet.html.erb
app/liquid_tags/tweet_tag.rb

Cool, right? The nice thing about grep is that it's available on pretty much any *nix distribution you might use, and it's useful in many different contexts. To learn more about grep, make sure to check out this post.
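For anyone new to grep, each of the three flags in that command does one job. The sketch below rebuilds the search in a throwaway directory so it can be run anywhere. The file names and contents are made up for illustration and are not DEV's real files.

```shell
# Build a tiny throwaway tree (hypothetical files, not DEV's actual source).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/app/views" "$demo_dir/app/assets"
printf '%s\n' "<div class='ltag__twitter-tweet__video'>" > "$demo_dir/app/views/_tweet.html.erb"
printf '%s\n' ".ltag__twitter-tweet__video { display: none; }" > "$demo_dir/app/assets/tweet.scss"
printf '%s\n' "nothing relevant here" > "$demo_dir/app/assets/other.scss"

# -i  ignore case when matching
# -R  recurse into subdirectories
# -l  print only the names of files that contain a match
matches=$(cd "$demo_dir" && grep -iRl "ltag__twitter-tweet__video" app | sort)
printf '%s\n' "$matches"
```

This should list the two files that mention the class name and skip the unrelated one. Dropping -l shows the matching lines themselves instead of just the file names.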

Ack

Ack is "a tool like grep, optimized for programmers." It searches recursively by default (i.e., your whole project) while ignoring VCS directories like .git, and it has convenient options that help you explore code with fewer keystrokes.
Taking the same grep example, here's how we would search for "ltag__twitter-tweet__video":

$ ack "ltag__twitter-tweet__video" app/
app/assets/stylesheets/ltags/TweetTag.scss
44:  .ltag__twitter-tweet__video{

app/views/liquids/_tweet.html.erb
9:        <div class='ltag__twitter-tweet__video'>

app/liquid_tags/tweet_tag.rb
30:          el.getElementsByClassName("ltag__twitter-tweet__video")[0]   .style.display = "block";

Ack is my trusty search tool of choice and I think you will get a lot of value in using it as a grep replacement. I would highly recommend learning how to use ack. Consider reading this to learn more.
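Ack isn't installed everywhere. As a rough stand-in (not a full replacement for ack), plain grep can produce similar line-numbered output, and its --include option can narrow the search to one kind of file. The directory and files below are hypothetical and exist only to make the sketch runnable.

```shell
# Throwaway tree with one template and one unrelated text file (both made up).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/app/views"
printf '%s\n' "<section>" "  <div class='ltag__twitter-tweet__video'>" "</section>" \
  > "$demo_dir/app/views/_tweet.html.erb"
printf '%s\n' "ltag__twitter-tweet__video mentioned in notes" > "$demo_dir/app/views/notes.txt"

# -r         recurse into subdirectories
# -n         prefix each matching line with its line number
# --include  only search files whose name matches the glob
numbered=$(cd "$demo_dir" && grep -rn --include='*.erb' "ltag__twitter-tweet__video" app)
printf '%s\n' "$numbered"
```

The result is one line per match in file:line:text form, which is usually enough to jump straight to the right spot in an editor.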

  4. Ask! 🙋

    Finally, if you're still stuck… then ask! A great way to gain knowledge of a project when starting out is to pair with a developer on your team who is more familiar with the codebase. This person can provide high-level insight into the design patterns, testing, processes, and third-party code that are relevant to the project. They can also give you more historical context about the project and why certain choices were originally made.

And those are my tips! Hopefully they help with the daunting task of working in unfamiliar code. Please try not to stress. Breathe, remember these tips, and you'll be fine!
What tips do you have? I'd love to hear your tricks as well.

About Me

I am a full stack web developer and co-founder of Bits n Bytes Dev Team, a small group of highly talented and professional freelance developers. We provide custom web application development services based on cutting-edge technologies, tailored to each client's unique business needs.

I'm available for hire and you can check out my portfolio website at https://www.bitsnbytes.ir/portfolio or contact me at raadi@bitsnbytes.ir.

Discussion

Personally, if I land in front of a complex application I haven't seen before, I like to attach a debugger, put a breakpoint in the main method (or a controller action method), and then step line by line, inspecting what changes and where the code goes.

It can take a while, but it means you are getting the literal truth of what the program does.

This approach particularly helps with badly laid-out code or 'lightly commented' code.

 

I think a debugger is a de facto tool. Combining the awk / feel solutions with a debugger has a genuine benefit for everyone.

 

Thanks for sharing. I too find myself doing this every so often.

 

Congrats on finding the bug and helping resolve it! I want to start doing that too. I've reported issues but haven't tried to fix the code; I just need to take that first step.

also, consider using ripgrep instead of ack, it will be much faster, see blog.burntsushi.net/ripgrep/ for details

Slightly off-topic: I published a book on "GNU grep and ripgrep" last week. It is still free for a few more hours, so check it out if you are interested.

 

Thanks, I appreciate the time and effort you put into publishing this book and providing it for free.

 

Is there any advice you would give if there is no documentation available at all, not even a README? 🤯😪

 

You can always help with documentation. It may mean you need to dig deeper in the codebase, but the documentation that you add will help future developers, including yourself. 😉 I wrote about this earlier this year.

 

Static analysis tools can be a huge help. You can get a pretty good idea of where most of the code is tied together via dependencies and which files are most volatile (number of commits vs. lines in the file), easily reformat things to your preferred style, and generate dependency diagrams. I have been using JetBrains ReSharper for years as a C# developer and recently used NDepend on an architectural assessment, where it was a huge help.

 

ReSharper and NDepend are both great tools! I would recommend a tool like ReSharper to every dev. It's also a great way to learn about best practices and enforce standards.

My only complaint with ReSharper (besides resource usage, but that's largely on Microsoft for VS still being a 32-bit app) is the number of options available. You can spend hours trying to get your coding style set up, and the documentation explaining what some of the options are or why they matter is pretty poor. I wish they had some presets that would let you start off with a common coding style instead of having to tinker around so much.

I feel like there is so much more ReSharper can do that I don't know about, but their documentation seems to be slipping. I have experienced very poor interactions with JB customer service as of late and I hope they can resolve that.

I know what you mean. It does take a lot of resources, especially for very big codebases. I think the code templates can get as complex as you want, but it does have some basic formatting built in. It would be great if they included more built-in templates for styles and best practices. The newer versions of Visual Studio seem to include more and more of the features that used to be found only in ReSharper.

 

Reading the commit messages for the files you're working on, and the unit tests (if there are any), could help you understand the code better.
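As a minimal sketch of the commit-message idea: git log accepts a path after "--" and then shows only the commits that touched that file. The repository, file names, and commit messages below are invented so the example can run on its own, and it assumes git is installed.

```shell
# Create a scratch repository with a short, made-up history.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name "Demo User"
git -C "$repo" config user.email "demo@example.com"

printf '%s\n' "v1" > "$repo/tweet_tag.rb"
git -C "$repo" add tweet_tag.rb
git -C "$repo" commit -q -m "Add tweet tag"

printf '%s\n' "v2" > "$repo/tweet_tag.rb"
git -C "$repo" commit -q -am "Fix video toggle in tweet tag"

printf '%s\n' "x" > "$repo/other.rb"
git -C "$repo" add other.rb
git -C "$repo" commit -q -m "Add unrelated file"

# Show only the subject lines of commits that changed tweet_tag.rb, newest first.
history=$(git -C "$repo" log --format=%s -- tweet_tag.rb)
printf '%s\n' "$history"
```

In a real project, running the same git log command against the file you are about to change gives a quick summary of why it looks the way it does.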

 

Well, in my case I was handed code that was not documented and didn't use any kind of version control

Guess, I will die

Anyway I figured out the things to change & it worked

Oh boy, I totally get what you went through. Glad you worked that out.

 

Find in files... That is all 🤣

 

Shift + CMD + F - my favourite shortcut in Sublime 😂

 

Great read! 😀

One of the things I find myself doing, depending on the codebase, is to look for the tests OR write some tests around the code.

I like tests that express some of the business requirements and show any thought patterns and logic that have formed the rules around what the code should and should not be doing.

Sometimes it's nicer if there are no tests at all so I can experiment more and see the boundaries myself.

Either way tests are so valuable!!

 

Your article just saved me a lifetime's worth of "where is that file".

I'll employ its use in my next source code venture.

 
 

Ack really seems interesting, it will save lots of effort

 

Adding to all the above points and comments, you can also use a tool like Eclipse to search across all the codebases (they should be imported into a common workspace) with find in all files. I generally import the project into Eclipse and search by keyword. I also use git history to search the commit comments for keywords. You can even use JIRA to search for the story or task that contains the keyword.
My personal strategy is:
1) Use JIRA first to find all the stories and tasks.
2) Then use the commit IDs to look into the files that got checked in.
3) Look in Eclipse to find the files.
4) Trace the static code flow.
5) Then set a debug point to find the correct flow.
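The git part of this strategy (search commit messages for a keyword, then list the files that commit touched) can be sketched on the command line as well. The ticket key PROJ-42, the repository, and the file names are all hypothetical, and the sketch assumes git is installed and that commit messages mention the ticket key.

```shell
# Scratch repository with made-up commits; one message carries a ticket key.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name "Demo User"
git -C "$repo" config user.email "demo@example.com"

printf '%s\n' "hello" > "$repo/readme.txt"
git -C "$repo" add readme.txt
git -C "$repo" commit -q -m "Initial commit"

printf '%s\n' "fix" > "$repo/tweet_tag.rb"
git -C "$repo" add tweet_tag.rb
git -C "$repo" commit -q -m "PROJ-42 Fix tweet video toggle"

# Find the commit whose message mentions the ticket key.
commit_id=$(git -C "$repo" log --grep="PROJ-42" --format=%H)

# List the files that commit checked in (an empty --format hides the commit header).
touched=$(git -C "$repo" show --name-only --format= "$commit_id")
printf '%s\n' "$touched"
```

From there, the listed files are the ones to open in the IDE before tracing the flow or setting a breakpoint.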

 

Docs are for suckers!

Reading the source and implementing a bunch of methods that are still in flux, or intended as private, and then weeping when they break is the way to go! :D