Clément Morisset for Potloc


How to optimize factory creation.

At Potloc we have a test stack that is pretty standard in the Rails ecosystem: we run tests with RSpec, set up test data with FactoryBot, drive user interactions with Capybara, use GitHub Actions for CI, and so on.

These great tools allow us to code at a fast pace with good test coverage. But this pace comes at a cost. The more the team grows, the bigger the codebase gets and the more tests get written.

🤕 The issue

As developers, we usually take care to optimize our own code and queries, and we are used to testing new implementations against all the edge cases. Test optimization, however, tends to sit at the bottom of the list, if we think about it at all.
That holds until the moment when, slowly but surely, you end up with a CI that takes ages to run and a whole test suite to speed up.

Let’s see how we took on this challenge at Potloc.

For the purpose of demonstration we will rely on this simple test file:

RSpec.describe Purging::QuestionnaireWorker, type: :worker do
  let(:questionnaire) { create(:questionnaire) }

  describe "#perform" do
    it "destroys a survey form" do
      expect { subject.perform(questionnaire.id) }.to change(Questionnaire, :count).by(-1)
    end

    context "given associations" do
      it "destroys a questionnaire and its associations" do
        create(:question, :postal_code, questionnaire: questionnaire)

        expect { subject.perform(questionnaire.id) }.to change(SurveyQuestion, :count).by(-1)
      end
    end
  end
end

🧑‍🚒 The solutions

The factory_bot gem is used in almost all of our spec files, and it makes our setup much easier than when we use fixtures.
Here is the tradeoff: the easier the gem is to use, the harder it becomes to keep its usage under control. When the time comes to tackle slow tests, the best bet is to start digging into your factories, because they are likely the primary reason your test suite is slowing down.

  • Avoiding the factory cascades

To quote Evil Martians, a factory cascade is an

uncontrollable process of generating excess data through nested factory invocations.

Our test uses two factories, questionnaire and question, which can be represented as trees:

questionnaire
|
|---- survey

question
|
|---- survey

In this simple example each factory calls a nested factory. This means that every time we create a questionnaire (or a question), the factory also creates a survey.

To get a better view of which objects are created in our spec file, we can use test-prof, a powerful gem that provides a collection of tools for analysing your test suite's performance. One of these tools is particularly useful for identifying factory cascades: the factory profiler.

If we run the factory profiler with FPROF=1 bundle exec rspec, it generates the following report:

[TEST PROF INFO] Factories usage

 Total: 6
 Total top-level: 3
 Total time: 00:01.005 (out of 00:26.087)
 Total uniq factories: 3

   total   top-level     total time      time per call      top-level time               name

       3           0        0.6911s            0.2304s             0.0000s             survey
       2           2        0.6397s            0.3199s             0.6397s             questionnaire
       1           1        0.3658s            0.3658s             0.3658s             question

The most interesting insight is the difference between the total and top-level columns. The bigger this difference, the more factory cascades you have, meaning that you are creating records your test never asked for.

Take the survey: we never explicitly create a survey in our spec file, yet 3 are created during the run.

The easiest workaround is to create a survey at the top level and associate the other factories with it.

RSpec.describe Purging::QuestionnaireWorker, type: :worker do
  let(:survey) { create(:survey) }
  let(:questionnaire) { create(:questionnaire, survey: survey) }
  ....

  it "destroys a questionnaire and its associations" do
    create(:question, :postal_code, questionnaire: questionnaire, survey: survey)
    ....

If we run the factory profiler we now have a different report:

[TEST PROF INFO] Factories usage

 Total: 5
 Total top-level: 5
 Total time: 00:00.868 (out of 01:09.067)
 Total uniq factories: 3

   total   top-level     total time      time per call      top-level time               name

       2           2        0.6986s            0.3493s             0.6986s             survey
       2           2        0.0973s            0.0487s             0.0973s             questionnaire
       1           1        0.0729s            0.0729s             0.0729s             question

Nice! No more factory cascades: the total and top-level columns are the same. We now create 5 records instead of 6, and the time spent creating factories drops from 1.005s to 0.868s, a decrease of about 14%.
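
To see where the counts in both reports come from, here is a minimal sketch in plain Ruby. It is a toy model, not FactoryBot or test-prof: the ToyFactories class and its define/create methods are hypothetical names invented for this illustration. It only counts how many records each factory creates in total and how many of those were requested directly (top-level).

```ruby
# Toy model of factory cascades. NOT FactoryBot: every name here is hypothetical.
class ToyFactories
  attr_reader :total, :top_level

  def initialize
    @definitions = {}
    @total = Hash.new(0)      # every record created, nested or not
    @top_level = Hash.new(0)  # records created directly by the spec
    @depth = 0
  end

  # Register a factory and the associations it builds on its own.
  def define(name, associations: [])
    @definitions[name] = associations
  end

  # Create a record. Any association that is not passed in explicitly is
  # created by a nested call, which is what produces a cascade.
  def create(name, **overrides)
    @total[name] += 1
    @top_level[name] += 1 if @depth.zero?
    @depth += 1
    record = { factory: name, **overrides }
    @definitions.fetch(name).each do |association|
      record[association] ||= create(association)
    end
    record
  ensure
    @depth -= 1
  end
end

def build_factories
  factories = ToyFactories.new
  factories.define(:survey)
  factories.define(:questionnaire, associations: [:survey])
  factories.define(:question, associations: [:survey])
  factories
end

# First version of the spec: each factory creates its own survey.
cascading = build_factories
cascading.create(:questionnaire)                          # example 1
questionnaire = cascading.create(:questionnaire)          # example 2
cascading.create(:question, questionnaire: questionnaire)

# Second version: each example creates one survey and passes it down.
explicit = build_factories
survey = explicit.create(:survey)                         # example 1
explicit.create(:questionnaire, survey: survey)
survey = explicit.create(:survey)                         # example 2
questionnaire = explicit.create(:questionnaire, survey: survey)
explicit.create(:question, questionnaire: questionnaire, survey: survey)

puts cascading.total[:survey]      # 3 surveys created...
puts cascading.top_level[:survey]  # ...0 of them requested by the spec
puts explicit.total.values.sum     # 5 records, all top-level
```

The gap between total and top_level in the first scenario is exactly the set of surveys created behind the spec's back. Passing the survey explicitly closes that gap.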

The caveat of this method is that it can be tedious to maintain. Thankfully, test-prof has a recipe called FactoryDefault. Removing factory cascades manually is good enough most of the time, but if you want to go further you can follow the documentation.

That being said, test-prof has even more to offer. It's time to introduce an awesome helper named let_it_be.

  • Reuse the factory you need

Let's bring in a little bit of magic and introduce a new way to set up shared test data.

let_it_be is a helper that lets you reuse the same record across all the examples in a spec file. In our example we don't need to create 2 surveys and 2 questionnaires; we can reuse the same ones throughout the file.

RSpec.describe Purging::QuestionnaireWorker, type: :worker do
  let_it_be(:survey) { create(:survey) }
  let_it_be(:questionnaire) { create(:questionnaire, survey: survey) }
  ...
end

If we run the factory profiler we now have a different report:

[TEST PROF INFO] Factories usage

 Total: 3
 Total top-level: 3
 Total time: 00:00.272 (out of 00:24.264)
 Total uniq factories: 3

   total   top-level     total time      time per call      top-level time               name

       1           1        0.2024s            0.2024s             0.2024s             survey
       1           1        0.0323s            0.0323s             0.0323s             questionnaire
       1           1        0.0375s            0.0375s             0.0375s             question

So now we only create the factories we need, by reusing the same ones throughout our file.
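
The difference between let and let_it_be can be sketched in plain Ruby as well. Again, this is a toy model with hypothetical names (ToyGroup, run_example), not RSpec or test-prof, and the real helper works differently under the hood. It only illustrates the caching scope: a let-style value is rebuilt for every example, while a let_it_be-style value is built once and reused.

```ruby
# Toy model of memoization scope. NOT RSpec or test-prof: names are hypothetical.
class ToyGroup
  attr_reader :runs

  def initialize
    @definitions = {}   # name => [scope, block]
    @group_cache = {}   # values shared by every example in the group
    @runs = Hash.new(0) # how many times each setup block was executed
  end

  # Like `let`: the value is memoized for a single example only.
  def let(name, &block)
    @definitions[name] = [:example, block]
  end

  # Like `let_it_be`: the value is built once and reused by all examples.
  def let_it_be(name, &block)
    @definitions[name] = [:group, block]
  end

  # Run one example; the yielded lambda resolves a definition by name.
  def run_example
    example_cache = {}
    resolve = lambda do |name|
      scope, block = @definitions.fetch(name)
      cache = scope == :group ? @group_cache : example_cache
      cache.fetch(name) do
        @runs[name] += 1
        cache[name] = block.call(resolve)
      end
    end
    yield resolve
  end
end

# Define survey and questionnaire with the given helper, then run two examples.
def count_runs(helper)
  group = ToyGroup.new
  group.public_send(helper, :survey) { "survey" }
  group.public_send(helper, :questionnaire) { |get| "questionnaire of #{get.call(:survey)}" }
  group.run_example { |get| get.call(:questionnaire) } # example 1
  group.run_example { |get| get.call(:questionnaire) } # example 2
  group.runs
end

with_let = count_runs(:let)
with_let_it_be = count_runs(:let_it_be)

puts with_let[:survey]        # 2: rebuilt for each example
puts with_let_it_be[:survey]  # 1: built once, shared by both examples
```

The sketch also hints at the tradeoff: because every example receives the same shared object, a change made in one example can be visible in the next one.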

Be aware that let_it_be comes with a caveats section. I strongly encourage you to read the documentation and use this powerful helper in accordance with your needs.

🚀 Conclusion

Let’s take a step back and relish our improvements:

                          Initial     Without cascades   With let_it_be
Factories creation time   00:01.00    00:00.868          00:00.272

Numbers look nice for this simple example. But what is the impact in real life at Potloc?

So far we have only applied this recipe to one specific folder of our codebase. Below are the results from profiling that folder locally:

Before, we spent 3.50 min creating factories; now we spend 2 min (~ -50%).
Before, we created 6824 factories; now we create 4378 (~ -35%).

test-prof is the Swiss Army knife we needed to speed up our test suite. It's still a long journey, but by embracing this topic we have already taken an important step!

Want to go further? Watch the "99 problems of slow tests" talk by Vladimir Dementyev.

Interested in what we do at Potloc? Come join us! We are hiring 🚀
