Ulisses Almeida for AppSignal

Originally published at blog.appsignal.com

Test Data Libraries for Elixir

In part one of this series, we introduced Elixir test factories and fixtures. Then in part two, we explored using data generation functions.

Now we'll look at some of the best Elixir libraries to use for your test data.

But before we do, let's quickly discuss why test data libraries can be helpful.

Why Choose a Test Data Library for Your Elixir App?

Elixir's built-in language features are more than sufficient for writing simple helpers to build your application test data.

For example, refer to the Ecto Guide to write a factory method pattern API like the one provided by ExMachina.
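As a rough illustration of how little code such a helper needs, here is a minimal sketch of a hand-rolled factory. The GitHub.Repo struct, the MyApp.TestFactory module, and its build/2 and build_list/3 functions are hypothetical names chosen for this example, not code from the Ecto guide or ExMachina:

```elixir
defmodule GitHub.Repo do
  # Hypothetical struct standing in for the application's own data type.
  defstruct [:id, :name, :full_name, private: false]
end

defmodule MyApp.TestFactory do
  # Returns a struct filled with test defaults; `attrs` overrides any field.
  def build(factory_name, attrs \\ %{})

  def build(:github_repo, attrs) do
    # System.unique_integer/1 keeps the generated names distinct across calls.
    name = "repo-#{System.unique_integer([:positive])}"

    %GitHub.Repo{id: 1, name: name, full_name: "octocat/#{name}"}
    |> struct!(attrs)
  end

  # Builds `count` structs, applying the same overrides to each one.
  def build_list(count, factory_name, attrs \\ %{}) do
    Stream.repeatedly(fn -> build(factory_name, attrs) end)
    |> Enum.take(count)
  end
end

repo = MyApp.TestFactory.build(:github_repo, private: true)
repos = MyApp.TestFactory.build_list(3, :github_repo)
```

Because struct!/2 raises on unknown keys, a typo in an override fails loudly instead of being silently ignored.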

However, existing test data libraries are convenient as you don't have to write them yourself.

Elixir Libraries for Your Test Data

Here's a quick overview of existing test data libraries and why you might want to use them. We'll look at ExMachina, ExZample, Faker, and StreamData.

Let's start with the most famous one: ExMachina.

ExMachina

The factory library ExMachina, created by thoughtbot, derives an atom from each factory function's name, and you use that atom to call the factory. For example:

defmodule MyApp.Factory do
  use ExMachina

  def github_repo_factory do
    repo_name = sequence(:github_repo_name, &"repo-#{&1}")

    %GitHub.Repo{
      id: 1296269,
      name: repo_name,
      full_name: "octocat/#{repo_name}",
      description: "This your first repo!",
      owner_id: 1,
      owner_url: "https://api.github.com/users/octocat",
      private: false,
      html_url: "https://github.com/octocat/#{repo_name}",
      url: "https://api.github.com/repos/octocat/#{repo_name}"
    }
  end
end

You can invoke that factory using the functions provided by ExMachina. For example, here's a function that generates a list of resources:

MyApp.Factory.build_list(3, :github_repo)
[
  %GitHub.Repo{
    name: "repo-0"
    # ...
  },
  %GitHub.Repo{
    name: "repo-1"
    # ...
  },
  %GitHub.Repo{
    name: "repo-2"
    # ...
  }
]

The github_repo_factory/0 function is referenced by the :github_repo atom, which you can pass to any of the utility functions that ExMachina provides and the use ExMachina macro injects into your factory module.

One of the most powerful features of ExMachina is the sequence function, which guarantees you'll get the next increment of a sequence each time you call it, and the number will always be unique. It simplifies the caller's job by eliminating the need to explicitly pass a unique name. Using the sequence function automatically avoids unique database constraint issues when calling the same factory multiple times.
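To make the idea concrete, here is a minimal sketch of how a named sequence can be built on top of an Agent. The MyApp.TestSequence module and its next/2 function are hypothetical and only illustrate the concept; this is not ExMachina's actual implementation:

```elixir
defmodule MyApp.TestSequence do
  # Illustrative only: a tiny named-counter store, not ExMachina's code.
  use Agent

  # Starts a named process holding a map of sequence name => next counter.
  def start_link(_opts \\ []) do
    Agent.start_link(fn -> %{} end, name: __MODULE__)
  end

  # Formats the current counter and increments it in a single Agent call,
  # so concurrent callers never receive the same number.
  def next(name, formatter) when is_function(formatter, 1) do
    Agent.get_and_update(__MODULE__, fn counters ->
      current = Map.get(counters, name, 0)
      {formatter.(current), Map.put(counters, name, current + 1)}
    end)
  end
end

{:ok, _pid} = MyApp.TestSequence.start_link()

first = MyApp.TestSequence.next(:github_repo_name, &"repo-#{&1}")
second = MyApp.TestSequence.next(:github_repo_name, &"repo-#{&1}")
```

Each sequence name keeps its own counter, so two factories can share the store without affecting each other's numbering.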

ExMachina also lets you add more building strategies, such as a JSON strategy that transforms your data example into JSON. The documentation suggests that you create smaller modules (with macros defining your factories) to break down a large factory module and still serve everything under the same module API. However, I wouldn't necessarily recommend this approach: it's harder to track the factory code definition if it breaks. I would rather have multiple factory modules when using ExMachina.

ExMachina is a well-established, actively maintained library that's widely used in the Elixir community. If invoking factories using the atom naming mechanism fits your preference, it's a solid choice!

Now let's take a look at another tool: ExZample.

ExZample

ExZample is a factory library I wrote that lets you define examples directly in your struct modules. My goal was to enable developers to organize their factories in any way they want while still having access to convenient functions. For example, if you don't define an example function in GitHub.Repo, you can still build the struct with its default values:

ExZample.build_list(3, GitHub.Repo)
[
  %GitHub.Repo{
    name: nil
    # ...
  },
  %GitHub.Repo{
    name: nil
    # ...
  },
  %GitHub.Repo{
    name: nil
    # ...
  }
]

You can define an example function in the struct module, and ExZample will automatically use it if it's available.

# github/repo.ex
  def example do
    repo_name = sequence(:github_repo_name)

    %GitHub.Repo{
      id: 1296269,
      name: repo_name,
      full_name: "octocat/#{repo_name}",
      description: "This your first repo!",
      owner_id: 1,
      owner_url: "https://api.github.com/users/octocat",
      private: false,
      html_url: "https://github.com/octocat/#{repo_name}",
      url: "https://api.github.com/repos/octocat/#{repo_name}"
    }
  end

# test/support/test_helper.exs
# Sequences are created on app start
ExZample.create_sequence(:github_repo_name, &"repo-#{&1}")

# then you can invoke the data example with ExZample module:
ExZample.build_list(3, GitHub.Repo)
[
  %GitHub.Repo{
    name: "repo-1"
    # ...
  },
  %GitHub.Repo{
    name: "repo-2"
    # ...
  },
  %GitHub.Repo{
    name: "repo-3"
    # ...
  }
]

If you prefer the atom naming mechanism of ExMachina, ExZample.DSL has you covered. You can define your factory in a module like this:

defmodule MyApp.Factory do
  use ExZample.DSL

  factory :github_repo do
    example do
      repo_name = sequence(:github_repo_name)
      %GitHub.Repo{
        id: 1296269,
        name: repo_name,
        full_name: "octocat/#{repo_name}",
        description: "This your first repo!",
        owner_id: 1,
        owner_url: "https://api.github.com/users/octocat",
        private: false,
        html_url: "https://github.com/octocat/#{repo_name}",
        url: "https://api.github.com/repos/octocat/#{repo_name}"
      }
    end
  end

  def_sequence :github_repo_name, return: &"repo-#{&1}"
end

# Then use the aliased factory in your tests:
ExZample.build_list(3, :github_repo)
[
  %GitHub.Repo{
    name: "repo-1"
    # ...
  },
  %GitHub.Repo{
    name: "repo-2"
    # ...
  },
  %GitHub.Repo{
    name: "repo-3"
    # ...
  }
]

Here, we use macros provided by ExZample.DSL to define factories using the factory directive. We need to explicitly define the atom name :github_repo. Inside the factory body, we define the example block that builds a struct example.

Although ExZample may not be as well known or as extensively tested as ExMachina, it was designed so that you can pick up the exact features you need and ignore the rest. You may want to consider giving it a try. If you have any feedback or suggestions for improvements, feel free to contribute.

Next up, let's see what Faker has to offer.

Faker

Faker generates sample data that looks realistic but is fake. So, instead of using a sequence like those provided by ExMachina or ExZample, we can use Faker to generate false but genuine-looking data:

# test/support/test_helper.exs
# Ensure you start Faker before running your tests
Faker.start()

Faker.Internet.slug()
"sit_et"

Faker.Internet.slug()
"deleniti-consequatur"

Faker.Internet.slug()
"foo-bar"

The random nature of data generation can sometimes result in duplicate names, which may cause flaky tests. However, this is likely to occur only in rare situations, so it's still worth relying on Faker. Faker has an impressive collection of data samples and generation algorithms, including IP addresses, e-mails, URLs, addresses, and even Pokémon names. It's a great library to use in combination with the others presented here.

Finally, let's take a quick look at StreamData.

StreamData

StreamData generates a bunch of data samples for you. Unlike Faker, StreamData doesn't aim to generate realistic values. It creates raw, random data. You ask for a raw data type, and you get it. For example:

Enum.take(StreamData.string(:alphanumeric), 3)
#=> ["AcT", "9Ac", "TxY"]

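StreamData returns a data generator that produces an infinite amount of random data. It implements the Enumerable protocol, so you can pipe it through the Stream and Enum functions. Because the generator never ends, filter it lazily with Stream.filter/2 (an eager Enum.filter/2 would never return) and then take as many values as you need:

StreamData.string(:alphanumeric)
|> Stream.filter(&(String.length(&1) >= 5 and String.length(&1) <= 10))
|> Enum.take(1)
#=> ["hygT78ch"]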

In the example above, we're only interested in grabbing strings between 5 and 10 characters long. StreamData was designed to be used with property testing, but you can also use its powerful data samples to build data examples for regular tests.
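The same pattern of drawing one value from an endless generator can be tried without any dependency. The sketch below does not use StreamData at all: a hypothetical MyApp.RandomData.strings/0 function built on Stream.repeatedly/1 stands in for the generator, purely to show lazy filtering of an infinite stream in a regular test helper:

```elixir
defmodule MyApp.RandomData do
  # Hypothetical stand-in for a StreamData generator, using only the standard library.
  @alphanumeric Enum.concat([?a..?z, ?A..?Z, ?0..?9])

  # Infinite lazy stream of random alphanumeric strings, 1 to 15 characters long.
  def strings do
    Stream.repeatedly(fn ->
      for _ <- 1..:rand.uniform(15), into: "", do: <<Enum.random(@alphanumeric)>>
    end)
  end
end

# Stream.filter/2 stays lazy, so Enum.take/2 stops at the first match.
[name] =
  MyApp.RandomData.strings()
  |> Stream.filter(&(String.length(&1) >= 5 and String.length(&1) <= 10))
  |> Enum.take(1)
```

The resulting name can then be passed to whichever struct or function a test needs, which keeps the test independent of any specific hard-coded value.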

General Thoughts and Suggestions

I often wonder if it's worth deviating too far from the Elixir standard library when using a certain pattern. So I would like to invite you to aim for the following in your next Elixir project:

  • Try using your application's functions and rules to build test data. Create wrappers to simplify their usage if needed, with convenient defaults for testing.
  • When your application doesn't control the accuracy of your data, try to create example functions on the struct modules and use them to test your code.
  • Use Faker or StreamData to add some randomness, enriching your test data and making your tests less dependent on specific data.
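The first suggestion could look something like the sketch below. MyApp.Accounts.register_user/1 is a hypothetical application function invented for this example, and MyApp.AccountsFixtures.user_fixture/1 is an equally hypothetical test wrapper around it:

```elixir
defmodule MyApp.Accounts do
  # Hypothetical application function that enforces its own rules.
  def register_user(%{email: email, name: name}) when is_binary(email) and is_binary(name) do
    if String.contains?(email, "@") do
      {:ok, %{email: email, name: name}}
    else
      {:error, :invalid_email}
    end
  end
end

defmodule MyApp.AccountsFixtures do
  # Test wrapper: valid defaults, overridable per test, built through the real function
  # so the test data always satisfies the application's rules.
  def user_fixture(attrs \\ %{}) do
    defaults = %{
      email: "user-#{System.unique_integer([:positive])}@example.com",
      name: "Test User"
    }

    {:ok, user} =
      defaults
      |> Map.merge(Map.new(attrs))
      |> MyApp.Accounts.register_user()

    user
  end
end

user = MyApp.AccountsFixtures.user_fixture(name: "Ada")
```

If the application's rules change, the fixture fails at the pattern match on {:ok, user}, which points straight at the test data that no longer fits.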

If these three points still aren't enough for your data testing needs, embrace a factory library to bring more convenience and structure to your test suite.

If you try the lean suggested approaches, you might realize how little you really need to be able to generate test data and have a healthy test suite.

Wrapping Up

In part one of this series, we summarised the ins and outs of Elixir test factories and fixtures. In the second part, we focused on data generation functions.

Finally, in this third and last part, we explored some test data libraries you can use for your Elixir application, including ExMachina, ExZample, Faker, and StreamData.

I hope you found this series helpful.

Good luck with your testing!

