Vasily Polovnyov

Posted on May 3, 2022 • Edited on May 5, 2022

Fake data in tests

#programming #ruby #testing

Tests are not only a tool to automatically check the code and its correctness, but also excellent documentation. Whenever I don't understand something in the documentation of a library, I look at the specs and quickly find an answer to my question.

In tests it is best to use data as close to real life as possible: "User" → "Ivan Pavlov", "ip" → "82.100.200.3", empty file → JPEG of your cat. This way we get good documentation and good examples of how to use our API.
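A small sketch of the difference, using the IP address from above. The `anonymize_ip` helper is hypothetical and only exists for illustration; it relies on `IPAddr` from Ruby's standard library.

```ruby
require "ipaddr"

# Hypothetical helper: zeroes the last octet of an IPv4 address.
def anonymize_ip(ip)
  IPAddr.new(ip).mask(24).to_s
end

# A placeholder such as "ip" is not even a valid address, so it documents nothing.
# A realistic address turns the call into a readable usage example:
puts anonymize_ip("82.100.200.3")
```

With the realistic value, a reader can see at a glance what kind of input the helper expects and what it does to it.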

To make the tests consistent, I prefer to use a single domain for the test data. For example, I often use:

1. Characters and quotes from The Simpsons:

user = User.new(name: "Bart Simpson", email: "bart@simpson.dev")

expect(user.to).to eq "Bart Simpson <bart@simpson.dev>"

2. Characters and quotes from '90s action movies:

comment: "Dead or alive... you're coming with me"

3. Lyrics of Eminem's or Beyoncé's songs (please don't ask):

do_request(text: "In my shoes, just to see what it's like to be me")
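For reference, here is a minimal sketch of a class that the first example could be exercising. The post shows only the spec, so the `User` class below is an assumption: `to` is taken to format the user as a mail recipient string.

```ruby
# Hypothetical User model; the post shows only the spec, not the class itself.
class User
  attr_reader :name, :email

  def initialize(name:, email:)
    @name = name
    @email = email
  end

  # Formats the user as a mail recipient, e.g. for a "To:" header.
  def to
    "#{name} <#{email}>"
  end
end

user = User.new(name: "Bart Simpson", email: "bart@simpson.dev")
puts user.to
```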

P.S. Some of you might mention Faker. I don't use it because I think it's an anti-pattern. I'll elaborate on this in the next post.

Oldest comments (8)

Simon Egersand 🎈 • May 4 '22

I respectfully disagree 🙂 I think you should use meaningful values. That is, values that tell you something about the test. Looking at your examples, if I instead use "Homer Simpson" as a name, will the test still pass? Well, I don't know because the value doesn't tell me.

I recently wrote a post about using meaningful test values in tests: dev.to/simeg/stop-using-meaningles...

Vasily Polovnyov • May 4 '22

I think you should use meaningful values

That's what I'm talking about :-) "Bart Simpson" or "Homer Simpson" is much better than "user", "123" or "test".

Looking at your examples, if I instead use "Homer Simpson" as a name, will the test still pass? Well, I don't know because the value doesn't tell me.

What would you use?

Simon Egersand 🎈 • May 4 '22

"Bart Simpson" or "Homer Simpson" is much better than "user", "123" or "test".

Why? :) They don't say anything about the test. Are these exact values important for the test to pass, or can I change them to "test"? I would not know, because "Bart Simpson" doesn't have a meaning in the test context. That's my point.

I tend to use values such as irrelevant or specific-name. That way I know when a value is irrelevant for the test, or when the value is specific and required for the test to pass :)

Happy to discuss further, this topic is very interesting!

Vasily Polovnyov • May 4 '22

I see. I think irrelevant data should not be in tests at all. That's why we use factories:
thoughtbot.com/blog/why-factories

Why? :) They don't say anything about the test

Oh, I'm sorry for the confusion. I put it there because it was actually used in the test:

user = described_class.new(name: "Bart Simpson", email: "bart@simpson.dev")

expect(user.to).to eq "Bart Simpson <bart@simpson.dev>"

What do you think?

Simon Egersand 🎈 • May 4 '22

I understand. Factories can be useful, but the problem I see with them is that they add additional logic to your test and force you to keep additional context in your head when you read the test. I try to avoid helper functions unless they are very obvious.

The example you provide is clear because the test is so small, but what if the user had 15 fields instead of 2?

Here's an interesting blog post on why you should avoid keeping logic in your tests.

Curious what you think!

Vasily Polovnyov • May 5 '22

The example you provide is clear because the test is so small, but what if the user had 15 fields instead of 2?

It depends. If all these 15 fields are important to the test, I'll leave them as is. If not, I'll move them to some sort of helper or factory. See:
thoughtbot.com/blog/the-self-conta...
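As a rough illustration of that idea (all names here are hypothetical, not code from this thread or from a factory library), a hand-rolled helper can supply realistic defaults so that the test spells out only the field it cares about:

```ruby
# Hypothetical record with several fields; only `country` matters to the check below.
Profile = Struct.new(:name, :email, :city, :country, :phone, keyword_init: true)

# Hand-rolled factory: fills in realistic defaults and lets a test override
# just the fields that are relevant to it.
def build_profile(**overrides)
  defaults = {
    name: "Lisa Simpson",
    email: "lisa@simpson.dev",
    city: "Springfield",
    country: "US",
    phone: "+1 555 0100"
  }
  Profile.new(**defaults.merge(overrides))
end

# Hypothetical behavior under test.
def domestic?(profile)
  profile.country == "US"
end

# The only value visible at the call site is the one the check depends on.
profile = build_profile(country: "CA")
puts domestic?(profile)
```

The trade-off raised above still applies: the defaults live outside the test, so the helper has to stay simple enough to read at a glance.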

Here's an interesting blog post on why you should avoid keeping logic in your tests.

Awesome post. I agree completely with it: we should always optimize tests not for code brevity but for understanding.

Simon Egersand 🎈 • May 5 '22

It depends. If all these 15 fields are important to the test, I'll leave them as is. If not, I'll move them to some sort of helper or factory.

If you have 15 fields and you want to have tests for all of them, would that not leave you with a lot of factories?

You have convinced me factories are useful, but I'm still scared of abusing them. I need to experiment more to understand the trade-offs and learn how to balance.

[..] we should always optimize tests not for code brevity but for understanding.

Could not agree more!

How to write helpful, understandable and maintainable tests is often overlooked, in my opinion, and I really enjoy learning more about it. Far too many times I've had to spend many brain cycles just to understand what a test is actually verifying.

If you have more test-related blog posts to share, please send them my way, friend!

shrey • May 4 '22

As a tester, I strongly disagree with your approach. Your test should never be dependent on just one set or type of test data.