DEV Community

Lucian Ghinda
Lucian Ghinda

Posted on • Originally published at notes.ghinda.com

Flaky Tests, Courtesy of AI

When AI is writing tests in agentic mode, it's crucial to monitor its output.
Sometimes, it uses tricks to pass the tests or creates unnecessary tests.

Diff from a change made by an agentic LLM:

Diff from a change made by LLMs


For example, consider this diff that has at least two issues:

  1. The test is unnecessary for a medium-risk, medium-impact feature. It merely checks that the counter cache in Rails functions correctly. This test was written in an earlier iteration that I haven't reviewed yet.

  2. The Category model has the following constraint:

index_categories_on_position_and_parent_id (position, parent_id) UNIQUE
Enter fullscreen mode Exit fullscreen mode

Notice how it added a position: rand(1..1000) to make some tests to pass because they lacked the position?

This approach is just inviting flaky tests!

I added this to my Flakiness section:

- Use consistent data across tests to avoid flakiness. Prefer existing fixtures instead of generating data randomly.
- Do not use `rand` or anything similar when generating data for tests.
- Use `travel_to` for all time-sensitive tests (assert time or need time to make the test work)
- Use `.sort` when the results are not guaranteed to be in a specific order.
Enter fullscreen mode Exit fullscreen mode

Top comments (0)