zvone187

Posted on Jun 29, 2023

How to kickstart an automated test suite when there are 0 tests written and the codebase is already huge

You know, there’s a certain anxiety that creeps in when you take a look at a codebase you’re just starting to work on, only to realize it’s a vast uncharted wilderness. Not a single test has been written to guard against the unexpected. It’s like walking on a tightrope over a chasm, knowing a single misstep could send your entire project plummeting into chaos.

If you’ve worked on a codebase with 0 tests, you know how daunting it is to think about covering the entire codebase with tests from scratch. The process demands an almost Herculean effort: you’d have to pore over every function, method, and component, brainstorm all potential edge cases, structure the test suite code, and get it all running smoothly. And that’s not even touching on the time it takes to reach meaningful coverage. We’re talking weeks, perhaps months, before you can sit back and say, “Yes, we’ve hit 80% or 90% coverage.”

This is why I’m excited to share what I’ve been working on for the past couple of months. This journey takes us to a place where the realm of automated testing meets the magical world of AI. Meet Pythagora, an open-source dev tool that’s about to become your new best friend.

Throughout this blog post, I’m going to show you how to kickstart automated testing with Pythagora, which harnesses the power of AI to generate tests for your entire codebase with a single CLI command, and hopefully gets your codebase to 80% – 90% code coverage in a single day.

Creating a test suite from scratch

We all know the saying, “Rome wasn’t built in a day”. The same could be said for a comprehensive, effective test suite. It’s a meticulous, demanding process, but once you’ve traversed this rocky road, the sense of accomplishment is profound. Let’s journey together through the necessary steps involved in creating a test suite from scratch and reaching that coveted 80% – 90% code coverage.

Laying the groundwork

In the first stage, you’re like a painter in front of a blank canvas. The world is full of possibilities, and you’re free to create a masterpiece. Your masterpiece, in this case, involves choosing the types of tests you want to write, finding the right testing framework to use, and adopting the best practices suited for your specific environment. Are you considering unit tests, integration tests, E2E tests, or a blend of all three?

While this initial setup is often viewed as the “easy” part, it is by no means a walk in the park. Time, research, and perhaps a few cups of coffee are required to make informed decisions.

[Image: Pythagora - this is harder than I thought]

Diving into the details

Once you’ve got your basic structure in place, it’s time to roll up your sleeves and delve deep into the nitty-gritty. Now, you’ll need to go through your entire codebase, one function at a time, and write tests for each. Your task here is to ensure that your tests touch all lines of code within each function, method, or component. This task is akin to exploring an intricate labyrinth. You need to traverse every path, turn every corner, and ensure no stone is left unturned.

Writing these tests is a detailed, time-intensive step. It’s not just about writing a few lines of code; it’s about understanding the function’s purpose, its expected output, and how it interacts within your application.
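To make “touching all lines” concrete, here’s a minimal sketch. The shippingCost function is hypothetical (it’s not from any real codebase), and the checks use Node’s built-in assert module so the snippet runs without a test runner. In a Jest suite, each check would become an expect call.

```javascript
const { strictEqual, throws } = require('node:assert');

// Hypothetical function under test, with three branches.
function shippingCost(orderTotal, isMember) {
  if (orderTotal <= 0) {
    throw new RangeError('orderTotal must be positive');
  }
  if (isMember || orderTotal >= 50) {
    return 0; // free shipping for members and for large orders
  }
  return 4.99;
}

// One check per path through the function.
throws(() => shippingCost(0, false), RangeError); // invalid total
strictEqual(shippingCost(20, true), 0);           // member
strictEqual(shippingCost(50, false), 0);          // large order
strictEqual(shippingCost(20, false), 4.99);       // everyone else
```

Three branches need four calls here, because the middle condition can be true for two different reasons. Multiply that by every function in a large codebase and you can see where the weeks go.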

Exploring the edge cases

After the initial round of testing, you might breathe a sigh of relief. Hold on, though; there’s still an important piece of the puzzle left. It’s time to dive into the wild, unpredictable world of edge cases. This part might not increase your code coverage percentage, but it’s crucial in testing the robustness and resilience of your code.

These so-called negative tests help evaluate how your code reacts to various inputs, particularly those on the fringes of expected behavior. From empty inputs to values that push the limits of your data types, these tests are designed to mimic user behavior in the real world, where users often have a knack for pushing your code in directions you never thought possible.
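Here’s a rough sketch of what a handful of negative tests might look like. The parseQuantity function is made up for illustration, and the checks again use Node’s built-in assert module instead of Jest so the snippet runs on its own.

```javascript
const { strictEqual, throws } = require('node:assert');

// Hypothetical function under test: turns a form field into a quantity.
function parseQuantity(input) {
  if (typeof input !== 'string' || input.trim() === '') {
    throw new TypeError('quantity must be a non-empty string');
  }
  const value = Number(input);
  if (!Number.isSafeInteger(value) || value < 0) {
    throw new RangeError('quantity must be a non-negative safe integer');
  }
  return value;
}

// The happy path, for reference.
strictEqual(parseQuantity('3'), 3);

// Negative tests: inputs on the fringes of expected behavior.
throws(() => parseQuantity(''), TypeError);                  // empty input
throws(() => parseQuantity('   '), TypeError);               // whitespace only
throws(() => parseQuantity(undefined), TypeError);           // missing value
throws(() => parseQuantity('abc'), RangeError);              // not a number
throws(() => parseQuantity('-1'), RangeError);               // below the minimum
throws(() => parseQuantity('1.5'), RangeError);              // not an integer
throws(() => parseQuantity('9007199254740993'), RangeError); // beyond the safe integer range
```

None of these extra checks execute a line that the happy path and one failing input haven’t already reached, which is why the coverage number barely moves while the confidence in the code goes up.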

Creating a test suite from scratch is a Herculean task. But rest assured, every effort you put in is a step towards creating a more robust, reliable, and resilient application. And remember, you’re not alone. We’ve all been there, and with a tool like Pythagora, the journey is not as daunting as it may seem.

Generating Tests with one CLI command

With Pythagora, on the other hand, all you need to do is run:

npx pythagora --unit-tests --path ./path/to/repo

Pythagora will navigate through all files in all folders, conjuring up unit tests for each function it encounters. Now, you can sit back and relax or go grab lunch and let it run for a while until it finishes writing tests.

Ok, but wait, what the hell is Pythagora??

What is Pythagora

I’ve always dreamed of a world where automated tests could be created for me. But the reality isn’t that simple. No one knows your code quite like you do, which makes it hard for anyone else to draft effective automated tests for it. The results often fall short of what you’d achieve yourself.

However, everything changed when ChatGPT entered the scene. As I tinkered with this technology, I found myself wondering, “Could we harness the power of ChatGPT for writing automated tests?” Curiosity piqued, I delved deeper, experimenting with its capabilities, and what I discovered blew me away. ChatGPT demonstrated an incredible ability to comprehend code, offering a glimpse of a promising new avenue in automated testing.

And thus, an idea for Pythagora was born.

Pythagora is an open-source dev tool, crafted with one mission in mind: making automated testing autonomous. I envision a world where developers, such as you and me, can focus on creating features without getting bogged down in the mire of test writing and maintenance. To achieve this vision, Pythagora uses GPT-4.

Currently, Pythagora has the prowess to write both unit and integration tests. However, for the purposes of this blog post, we’ll concentrate on its ability to generate unit tests.

Installation

To install Pythagora, you just need to run npm i pythagora. That’s it! Pythagora is now at your service.

Configuration

Once Pythagora is installed, you’ll need to configure it with an API key. This can either be an OpenAI API key or a Pythagora API key.

To use an OpenAI API key, you should run the following command:

npx pythagora --config --openai-api-key <OPENAI_API_KEY>

It’s important to note that, if you choose to use your own OpenAI API key, you must have access to GPT-4.

Alternatively, you can obtain a Pythagora API key from this link. Once you have it, set it up with the following command:

npx pythagora --config --pythagora-api-key <PYTHAGORA_API_KEY>

Commands

If you prefer to generate tests for a specific file, use:

npx pythagora --unit-tests --path ./path/to/file.js

And if you have a particular function in mind, use:

npx pythagora --unit-tests --func <FUNCTION_NAME>

How does Pythagora work

Let’s peel back the curtain and take a peek into the engine room. What makes Pythagora tick?

At its core, Pythagora functions as an intrepid explorer, delving into the intricate labyrinth of your codebase. First, it maps all functions that are exported from your files so that it can call them from within the tests. Obviously, if a function is not exported, it cannot be called from outside its file. By the way, after generating tests a couple of times, you’ll start thinking about your codebase and how you can structure it better so that more tests can be generated.

Once it identifies the exported functions, Pythagora takes another step into the rabbit hole: it investigates each function in turn, hunting down any additional functions called within. Picture it as the archaeologist of your codebase, gently brushing away layers of dust to expose the hidden connections and dependencies. In other words, it looks for all functions that are called from within the function being tested so that GPT can get a better understanding of what the function under test actually does.
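To give a feel for the idea of “hunting down” called functions, here’s a deliberately naive sketch. It is not how Pythagora is implemented: it just scans a function’s source text for calls to other known functions, and every name in it (findCalledFunctions included) is made up for this example. A real tool would more likely parse the code into a syntax tree.

```javascript
// Hypothetical functions standing in for a codebase.
function normalize(text) {
  return text.trim().toLowerCase();
}

function slugify(title) {
  return normalize(title).replace(/\s+/g, '-');
}

function unrelated() {
  return 42;
}

const known = { normalize, slugify, unrelated };

// Naive text scan: returns the names of known functions that appear
// as calls in the source of `target`. It can be fooled by comments and
// strings, and assumes names contain no regex special characters.
function findCalledFunctions(target, knownFunctions) {
  const source = target.toString();
  return Object.keys(knownFunctions).filter(
    (name) =>
      name !== target.name &&
      new RegExp('\\b' + name + '\\s*\\(').test(source)
  );
}

console.log(findCalledFunctions(slugify, known));   // [ 'normalize' ]
console.log(findCalledFunctions(normalize, known)); // []
```

With that list in hand, the source of slugify and of normalize could be sent along together, so the model sees what the function under test depends on.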

Armed with this information, Pythagora prepares to utilize the power of AI. It packages the collected code and dispatches it to the Pythagora API. Here, the actual magic happens: a prompt is meticulously crafted and handed over to the GPT model. This interaction between the code, the API, and the AI model results in generating a comprehensive set of unit tests, ready to be deployed and put to work.

Both the API server and the prompts used are open-source. They’re available for you to delve into, scrutinize, and even contribute to if you so desire. You can find the Pythagora API server here while the prompts, key ingredients in the creation of unit tests, are housed in this folder.

Reviewing Tests

Once Pythagora writes all requested tests, it’s time for you to jump in and start reviewing them. This is a vital step in the process; it’s important to know what has been created and ensure everything aligns with your expectations.

Remember, Pythagora creates Jest-based tests. So, to run all the generated tests, you can just run:

npx jest ./pythagora_tests/

Now, a word of caution: Pythagora is still in its early stages. As with all young projects, it’s bound to have some hiccups along the way. So, you might encounter failing tests in your initial runs. Don’t be disheartened; consider this a part of the journey. With your review and the continuous improvements to Pythagora, these failed tests will soon be a thing of the past.

And let’s not forget the bright side. Even with these early-stage teething problems, Pythagora can get you to a place where your codebase has substantial test coverage, potentially up to 90%.

Committing Tests

The review process, especially for larger codebases, may take a few hours. Remember, you’re not only looking at the tests that passed but also at those that failed. It’s crucial to understand every test you’re committing to your repository. Knowledge is power, after all.

After a thorough review and potential tweaks, you’re ready to make your final move: committing the generated tests to your repository. With this last step, you will have successfully integrated a robust unit test suite into your project. And all of this is achieved with the power of Pythagora and a few commands in your terminal.

Example Tests on Lodash Repo

Alright, now that I’ve got your interest piqued, let’s delve into the real stuff – tangible examples of Pythagora in action. For the purpose of our demonstration, we selected a well-known open source project, Lodash.

Running just one Pythagora command was enough to generate a whopping 1604 tests, achieving an impressive 91% code coverage of the entire Lodash repository. But it’s not just the quantity of tests that’s impressive. Out of these, 13 tests unearthed actual bugs within the Lodash master branch.

If you’re curious to check these out yourself, we’ve forked the Lodash repository and added the tests generated by Pythagora. Feel free to explore them here.

Now let’s take a closer look at one of the tests that caught a sneaky bug:

test(`size({ 'a': 1, 'b': 2, 'length': 9 })`, () => {
  expect(size({ 'a': 1, 'b': 2, 'length': 9 })).toBe(3); // test returns 9
});

In this test, the size function of Lodash is supposed to return the size of a plain JavaScript object, that is, the number of its keys. But GPT added a key named length, a little trick to see if Lodash might return the value of that key instead of the true size of the object. It appears that Lodash fell for this ruse, as the test failed by returning ‘9’ instead of the expected ‘3’.
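One plausible way a size function ends up behaving like this is by treating anything with a numeric length property as array-like and trusting that property. The sketch below is a simplified illustration of that pattern, not Lodash’s actual source; looksArrayLike and lengthFirstSize are names invented for this example.

```javascript
// Simplified sketch, NOT Lodash's actual implementation.

// Treats any non-function value with a non-negative integer `length`
// as array-like.
function looksArrayLike(value) {
  return (
    value != null &&
    typeof value !== 'function' &&
    Number.isInteger(value.length) &&
    value.length >= 0
  );
}

// Trusts `length` for array-like values, counts keys otherwise.
function lengthFirstSize(collection) {
  if (collection == null) return 0;
  if (looksArrayLike(collection)) return collection.length;
  return Object.keys(collection).length;
}

const input = { a: 1, b: 2, length: 9 };

console.log(lengthFirstSize(input));     // 9
console.log(Object.keys(input).length);  // 3
console.log(lengthFirstSize([1, 2, 3])); // 3
```

The length-first shortcut is convenient for arrays and strings, but a plain object that happens to carry a length key takes the same path, which is exactly the input the generated test tried.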

This is a fantastic example of how Pythagora, powered by GPT, excels at uncovering tricky edge cases that could easily slip under the radar. By generating a large number of such intricate test cases automatically, Pythagora can be your trusty sidekick, helping you discover and fix bugs you might never have anticipated.

Conclusion

Well, there we have it, fellow developers. We’ve embarked on quite a journey today, traversing through the uncharted territories of a substantial codebase devoid of tests, and returning with an automated suite of tests crafted by our trusty AI-powered tool, Pythagora.

You’ve learned that even in the face of a daunting, test-less codebase, there’s no need for despair. The task of creating a substantial suite of tests need not be an uphill slog anymore. We’ve witnessed the magic of Pythagora as it examined a well-known open-source library, Lodash, and generated 1604 tests that covered a jaw-dropping 91% of the codebase.

We saw how Pythagora isn’t just about quantity, but also the quality of tests. It isn’t just creating tests for the sake of it, but intelligently finding edge cases and bugs that may have otherwise slipped through unnoticed. Pythagora unmasked 13 real bugs in the Lodash master branch – a testament to the power of AI in software testing.

Now, you should have a clearer understanding of why AI-powered testing tools like Pythagora are not just a luxury, but a necessity in today’s fast-paced development landscape.

So whether you’re dealing with an existing project with zero tests or starting a new one and looking to establish a solid testing framework from the outset, remember that you’re not alone. Pythagora is here to take the reins, helping you generate meaningful tests with ease, and saving you valuable time that can be better spent on developing great features.

Thank you for joining me on this journey, and I can’t wait to see how you utilize Pythagora in your projects. Happy coding!

P.S. If you found this post helpful, it would mean a lot to me if you starred the Pythagora GitHub repo. And if you try Pythagora out, please let us know how it went at hi@pythagora.ai.

DEV Community