Ulisses Almeida for AppSignal

Posted on • Originally published at blog.appsignal.com

Generating Data Functions in Your Elixir App

In the first part of this series, we explored the ins and outs of Elixir test factories and fixtures. However, test factories bypass the rules of your Elixir application. Let's turn our attention to how we can avoid this by using data generation functions.

Let's dive straight in!

Creating Data With Your Elixir App's Public APIs

We'll use application code to generate test data. One of the biggest benefits of this method is that the generated data will be consistent with your application's rules. However, you need to write an extra layer of functions that your tests can easily use.

The Test Pyramid Strategy for Your Elixir App

Before jumping into the code, it's important to think about your application's public APIs and how data will be generated for each test strategy.

Saša Jurić wrote about how to keep a maintainable test suite in Elixir. In that post, he explains that the most important public API to test in his app was the HTTP API layer, because that is how his users interact with the app. In his context, it was worth having many tests that make HTTP requests, even at some cost to test speed. In his test pyramid strategy, Saša favored interface-level tests over lower-level ones.

[Image: test pyramid strategies]

The image above shows different strategies you might choose for your app.

You might want a balanced pyramid with a good amount of tests in all layers or favor unit tests above everything else.

The tests in the top layer are slower and more difficult to maintain, and they overlap with lower-layer tests.

The high-level tests ensure that all modules work together, while the lower-level tests can tell you exactly which module or function is not working properly.

Your application's test pyramid strategy might have more layers with different names — for example, the term "unit" might mean completely different things depending on who you ask. Each strategy has its own trade-offs, and discussing them is out of the scope of this post.

The most important thing to take away, though, is that you need to understand the layers of your Elixir application and the type of data needed to write accurate functions that generate test data.

I'll give you some examples, but ultimately, it's up to you to decide what works best for your team and source code.

Extract Business Logic from Your Web Code

It's a well-known and common practice to split business logic from web code in web applications built with the Phoenix framework. In Phoenix, we usually call these business modules "context" modules.

These modules typically interact with databases, external services, other context modules, and a lot of other functions. The functions in context modules are usually the public API for your web layer.

An Example Using Helpers in Elixir

If you want to invest in a lot of tests on the context level, you can create helpers for the most-needed resources. For example, let's say you have these modules:

# lib/my_app/accounts.ex
defmodule MyApp.Accounts do
  def signup_user(username, password) do
    # ...
  end
end

# lib/my_app/profiles.ex
defmodule MyApp.Profiles do
  def create_author_profile(user) do
    # ...
  end
end

# lib/my_app/news.ex
defmodule MyApp.News do
  def create_post(post_contents, author) do
    # ...
  end
end

Here, we have the Accounts, Profiles, and News contexts. Suppose we want to test the MyApp.News.create_post/2 function. We first need to create an Author profile. To create an Author profile, we need a User account.

If we have more functions in the News context, or if more contexts need an Author, we always have the tedious task of creating these chained struct relationships.

We can create helpers and make them available to our tests that need valid Author and User structs in the system. To make these helpers only available in the :test environment, we need to tell Mix, Elixir's project build tool, where to find these files. You can update mix.exs with the following configuration:

elixirc_paths: (
  case Mix.env() do
    :test -> ["lib", "test/support"]
    _ -> ["lib"]
  end
)

We tell Mix to compile the files in the test/support directory along with the lib directory when building for the :test environment. Of course, you can use any directory name you want, but test/support is a widely used convention. This way, you can create modules that are only available in the :test environment.
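To show where that option lives, here is a minimal sketch of a complete mix.exs. The app name, version, and empty dependency list are placeholders, not taken from the original project. It uses a small private function with one clause per environment, a common alternative to the inline case expression.

```elixir
# mix.exs (sketch: app name, version, and deps are placeholders)
defmodule MyApp.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_app,
      version: "0.1.0",
      # Pick the source directories based on the current Mix environment.
      elixirc_paths: elixirc_paths(Mix.env()),
      deps: deps()
    ]
  end

  # Compile test/support only when building for :test.
  defp elixirc_paths(:test), do: ["lib", "test/support"]
  defp elixirc_paths(_env), do: ["lib"]

  defp deps, do: []
end
```

Both forms produce the same result. The function clauses simply keep the project keyword list easier to scan.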

Here's an example of some helpers:

# test/support/helpers.ex
defmodule MyApp.Helpers do
  def create_user(opts \\ []) do
    username = Keyword.get(opts, :username, "test_user")
    password = Keyword.get(opts, :password, "p4ssw0rd")

    # Fail loudly if signup doesn't succeed, then return the bare user.
    {:ok, user} = MyApp.Accounts.signup_user(username, password)
    user
  end

  def create_author(opts \\ []) do
    # Only create a user when the caller didn't provide one.
    user = Keyword.get_lazy(opts, :user, &create_user/0)

    {:ok, author} = MyApp.Profiles.create_author_profile(user)
    author
  end
end

In the MyApp.Helpers module above, we create wrappers over our application's core API with convenient defaults to use in tests. In real life, we wouldn't create users with the password "p4ssw0rd" by default, but for our test scripts, it's fine.

We use Keyword.get/3 so the caller can override important attributes of the created resources. In the create_author/1 helper, we use Keyword.get_lazy/3 instead: get_lazy will only invoke the create_user function if the :user key doesn't exist. This avoids an unnecessary side effect (creating a second user) when the caller already provides a user to create_author/1.
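The difference between the eager and lazy defaults can be seen in a small standalone sketch. The LazyDefaultDemo module and its function names are hypothetical, made up for this illustration. The "expensive" default sends a message to the current process each time it runs, so we can check whether it was evaluated.

```elixir
defmodule LazyDefaultDemo do
  # Stand-in for a helper with side effects, such as creating a user.
  # It notifies the calling process every time it runs.
  def expensive_default do
    send(self(), :default_built)
    :default_user
  end

  # Eager: expensive_default/0 runs before Keyword.get/3 is even called.
  def eager(opts), do: Keyword.get(opts, :user, expensive_default())

  # Lazy: expensive_default/0 runs only if :user is missing from opts.
  def lazy(opts), do: Keyword.get_lazy(opts, :user, &expensive_default/0)

  # Returns true if the default was built since the last check.
  def default_built? do
    receive do
      :default_built -> true
    after
      0 -> false
    end
  end
end

LazyDefaultDemo.eager(user: :given_user)
IO.inspect(LazyDefaultDemo.default_built?(), label: "eager built the default")

LazyDefaultDemo.lazy(user: :given_user)
IO.inspect(LazyDefaultDemo.default_built?(), label: "lazy built the default")
```

With a :user supplied, the eager version still builds the default (and throws it away), while the lazy version never calls the function.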

Invoking and Customizing Helpers in Elixir

We can also pattern match strictly on the {:ok, ...} results here, since we don't expect the creation of users or authors to fail. If it does fail, the match raises and the test stops right at the cause. The caller can invoke and customize these helpers for the needs of the test. For example:

user = create_user(username: "test_user_2")

create_author(user: user)

When writing your tests, you can use the examples above and write something similar:

# test/news/news_test.exs

# module definition and stuff

alias MyApp.Helpers

# maybe other tests in the middle

test "creates a new post with the given contents" do
  author = Helpers.create_author()

  assert {:ok, post} = News.create_post("My first post!", author)
end

In the code above, we use alias MyApp.Helpers to make our helper module functions accessible with a few characters. It makes the use of these functions as convenient as the test factories provided by ExMachina.

A cool advantage of this approach is that if your editor uses a language server like ElixirLS, you can quickly discover and navigate to these functions. With the build(:user) pattern, today's editors can't jump directly to the factory definition.

Breaking the Helpers Module Down Into Other Modules

As the Helpers module grows and gets more complex, you might want to break it down into different modules.

For example, you could break it down by context: AccountsHelpers, ProfilesHelpers, etc. The best naming and file organization depends on each application's needs, so I'll leave it up to you.
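As a rough sketch of what a split by context might look like, the helpers could be divided as follows. The module and file names are only one possible choice. The two small context modules at the top are hypothetical stand-ins that return plain maps, included only so the sketch runs on its own.

```elixir
# Hypothetical stand-ins for the real context modules (assumption: the real
# ones return {:ok, struct} tuples in the same way).
defmodule MyApp.Accounts do
  def signup_user(username, _password), do: {:ok, %{username: username}}
end

defmodule MyApp.Profiles do
  def create_author_profile(user), do: {:ok, %{user: user}}
end

# test/support/accounts_helpers.ex
defmodule MyApp.AccountsHelpers do
  def create_user(opts \\ []) do
    username = Keyword.get(opts, :username, "test_user")
    password = Keyword.get(opts, :password, "p4ssw0rd")

    {:ok, user} = MyApp.Accounts.signup_user(username, password)
    user
  end
end

# test/support/profiles_helpers.ex
defmodule MyApp.ProfilesHelpers do
  alias MyApp.AccountsHelpers

  def create_author(opts \\ []) do
    # Reuse the accounts helper as the lazy default for :user.
    user = Keyword.get_lazy(opts, :user, &AccountsHelpers.create_user/0)

    {:ok, author} = MyApp.Profiles.create_author_profile(user)
    author
  end
end
```

Each helper module then mirrors one context, and a helper that needs data from another context calls that context's helper instead of duplicating the setup.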

The Helpers example discussed here might have left you thinking: "This looks like factories." And you're not wrong!

This is a different implementation of test factories. Instead of creating examples of data decoupled from your application's rules, here we tie your application's rules and your test examples together.

It works like the factory pattern because your tests aren't coupled with the way the underlying struct is built.

This example satisfies the demands of the context modules layer, but what about the other layers? In the next section, we'll explore when using public APIs isn't enough and how to generate data examples for these cases.

Beyond Public APIs: Data Examples

One of the main advantages of using your Elixir app's public API to generate data for your tests is that the data will comply with your app's rules and database constraints.

While the speed and coupling with the database may not always be ideal, this is a small price to pay to ensure your data's validity.

As your app grows in complexity, you may need to add a layer of tests that run in memory to improve performance. It is also not uncommon for systems to talk to other systems through network APIs. In these cases, your application likely won't have enough control over the rules to build valid data. You may need to rely on API specifications provided by the remote system and snapshot some examples to use in your tests.

Data examples built in memory and decoupled from database or application rules can be extremely helpful.

Let's explore a lightweight — and somewhat controversial — method of generating data examples within your application modules.

Writing Data Examples in Data Definition Modules

Years ago, a colleague invited me to watch an episode of Ruby Tapas presented by Avdi Grimm.

Inspired by the book Practical Object-Oriented Design by Sandi Metz, Grimm talks about writing data examples in the modules where the data definition lives. This can serve as executable documentation and also be used in tests.

Some people might not be big fans of mixing test data with application code. But, as long as it is clear that the data is meant as an example and not for production use, it can be a useful technique.

For example, let's say you're writing a GitHub client, and you're defining the Repo struct:

# github/repo.ex
defmodule GitHub.Repo do
  defstruct [
    :id,
    :name,
    :full_name,
    :description,
    :owner_id,
    :owner_url,
    :private,
    :html_url,
    :url
  ]

  @type t :: %__MODULE__{
    id: pos_integer(),
    name: String.t(),
    full_name: String.t(),
    description: String.t(),
    owner_id: pos_integer(),
    owner_url: String.t(),
    private: boolean(),
    html_url: String.t(),
    url: String.t()
  }
end

Here, we define the GitHub.Repo struct and document the types of its keys with a typespec. While this provides a lot of information, extra documentation can help readers understand the data's nuances.

In this example, can you tell the difference between the name and full_name? Or url and html_url? It's hard to tell, right?

We can make it clearer. Let's add an example of the values:

# github/repo.ex
def example do
  %GitHub.Repo{
    id: 1296269,
    name: "Hello-World",
    full_name: "octocat/Hello-World",
    description: "This your first repo!",
    owner_id: 1,
    owner_url: "https://api.github.com/users/octocat",
    private: false,
    html_url: "https://github.com/octocat/Hello-World",
    url: "https://api.github.com/repos/octocat/Hello-World"
  }
end

Defining the example/0 function, which returns an example of the data structure, can be useful in various ways.

For example, if you're exploring the code in IEx (the Elixir interactive shell), you could invoke the function to quickly experiment with a complex function call. Livebook material (interactive documents that allow users to run code) could use these functions to show an example of the data shape. In production, the example values could be used as hints for form fields.

Customize Key Values of the example Function

One of the main purposes of an example function, however, is to create data for tests. You can make the example function even more useful by allowing the caller to customize the key values. Here's an example:

def example(attributes \\ []) do
  struct!(
    %GitHub.Repo{
      id: 1296269,
      name: "Hello-World",
      full_name: "octocat/Hello-World",
      description: "This your first repo!",
      owner_id: 1,
      owner_url: "https://api.github.com/users/octocat",
      private: false,
      html_url: "https://github.com/octocat/Hello-World",
      url: "https://api.github.com/repos/octocat/Hello-World"
    },
    attributes
  )
end

We use Elixir's Kernel.struct!/2 function to build structs dynamically. The great thing about struct! is that it raises a KeyError if the caller passes a key the struct doesn't define. Now we can invoke the example/1 function and customize the data in any way we want:

test "renders anchor tag to the repository" do
  repo = GitHub.Repo.example(html_url: "http://test.com")

  assert hyperlink_tag(repo) =~ "<a href=\"http://test.com\""
end

In the example above, we can easily create a GitHub.Repo and customize its html_url key for our test.
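To see why the strictness of struct!/2 matters, consider the following small sketch. RepoSketch is a hypothetical, trimmed-down stand-in for GitHub.Repo with only two keys. A misspelled key raises immediately instead of silently producing a struct with the default value.

```elixir
defmodule RepoSketch do
  defstruct [:name, :html_url]

  def example(attributes \\ []) do
    struct!(
      %__MODULE__{
        name: "Hello-World",
        html_url: "https://github.com/octocat/Hello-World"
      },
      attributes
    )
  end
end

# A known key overrides the default value.
repo = RepoSketch.example(html_url: "http://test.com")
IO.inspect(repo.html_url)

# A misspelled key (:html_ulr) raises a KeyError instead of being ignored.
result =
  try do
    RepoSketch.example(html_ulr: "http://test.com")
  rescue
    error in KeyError -> {:error, error.key}
  end

IO.inspect(result)
```

By contrast, Kernel.struct/2 (without the bang) discards unknown keys, so a typo in a test would go unnoticed and the test would run against the default value.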

Similarities to Test Factories for Elixir

"Wait, isn't this just factories again?" you might be wondering. And you're right! It's similar to the factory mechanism found in other libraries.

Because tests call GitHub.Repo.example/1 directly, the functions are easy to find and navigate to in your editor. The trade-off is that if you rename the GitHub.Repo module, you'll need to find and replace it across a bunch of tests. However, modern editors are usually powerful enough to handle the task in a single command, so this shouldn't be a big issue.

Another interesting aspect of providing data examples in your struct modules is that you don't need to organize your factory files, as they are organized together within your application code.

The popular ExMachina factories, your own helper functions that use your application's rules, and data example functions in your structs are all examples of factory pattern implementations.

Using ExMachina

Using ExMachina to create your factories helps you separate test data from production code and gives you convenient functions. For example, when you define a factory using ExMachina, you can use that definition to generate test data with different strategies, like:

  • build to generate in-memory structs
  • insert to load data in a database
  • build_list to generate multiple items in a list
  • string_params_for to create map params with keys as strings like you would receive in a Phoenix controller

These are only a few of the functions ExMachina offers. Their convenience is debatable, though. For example, having the insert function so readily available could lead you to insert things into a database unnecessarily. And how often do your controller parameters' keys and value formats match the schema attributes generated by string_params_for closely enough to make it worth using? Still, these functions do the job for simple cases and offer a foundation for your entire test suite.
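As a minimal sketch of the in-memory strategies, the script below defines one factory and calls build and build_list. It assumes network access so Mix.install can fetch ex_machina from Hex, and the FactorySketch modules are hypothetical names for this illustration. The insert and string_params_for functions come from ExMachina.Ecto and need a configured repo, so they are left out here.

```elixir
# Assumption: run as a standalone script with access to Hex.
Mix.install([{:ex_machina, "~> 2.7"}])

defmodule FactorySketch.User do
  defstruct [:username, :role]
end

defmodule FactorySketch.Factory do
  use ExMachina

  # ExMachina finds factories by name: build(:user) calls user_factory/0.
  def user_factory do
    %FactorySketch.User{username: "test_user", role: :reader}
  end
end

alias FactorySketch.Factory

# One in-memory struct, with the :role default overridden.
user = Factory.build(:user, role: :author)
IO.inspect(user)

# Three in-memory structs built from the same definition.
users = Factory.build_list(3, :user)
IO.inspect(length(users))
```

Nothing here touches a database, which is the kind of usage the paragraph above suggests reaching for first.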

Up Next: Elixir Libraries for Test Data

Now that you understand the fundamental techniques for generating test data in Elixir, you should be able to do this for your own project without much trouble.

In the third and final part of this series, we'll dive into some Elixir libraries for your test data, including ExMachina, ExZample, Faker, and StreamData.

Happy coding!

