Zoey de Souza Pessanha

Parse, Don’t Validate: Embracing Data Integrity in Elixir

Introduction

In functional programming, data integrity is a recurring concern, and one effective way to address it is the principle of "Parse, Don’t Validate", popularized by Alexis King's 2019 essay of the same name. The approach transforms raw input into structured, well-defined data early in the application flow, which improves reliability and maintainability. The concept is not new, but applying it in Elixir, a dynamically typed, functional, and concurrent language, brings its own benefits and challenges. This article covers the theory behind parsing over validation and how it aligns with Elixir's paradigms.

Theoretical Foundations

Parsing vs. Validation

Validation involves checking if data meets certain criteria, often at multiple points in an application. This can lead to redundancy and inconsistencies, as the same checks are repeated, and errors may not be handled uniformly.

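Parsing, on the other hand, transforms data into a structured format that satisfies the required criteria by construction. Once data has been parsed successfully, downstream code can rely on the resulting structure instead of repeating the same checks. In a dynamically typed language like Elixir this is a convention rather than a compiler-enforced guarantee, so it holds only as long as values are built through the parser.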
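To make the contrast concrete, here is a minimal sketch. The `Email` module and its `local` and `domain` fields are hypothetical names chosen for illustration, and the "exactly one @" rule is a deliberate simplification, not a full email grammar.

```elixir
defmodule Email do
  # Hypothetical struct: holding one means the address was already split successfully.
  @enforce_keys [:local, :domain]
  defstruct [:local, :domain]

  # Validation: answers yes/no and discards what it learned along the way.
  def valid?(input) when is_binary(input) do
    match?([local, domain] when local != "" and domain != "", String.split(input, "@"))
  end

  def valid?(_), do: false

  # Parsing: returns a richer value on success, or a tagged error.
  def parse(input) when is_binary(input) do
    case String.split(input, "@") do
      [local, domain] when local != "" and domain != "" ->
        {:ok, %Email{local: local, domain: domain}}

      _ ->
        {:error, :invalid_email}
    end
  end

  def parse(_), do: {:error, :invalid_email}
end
```

A caller of `valid?/1` still holds a raw string and has to trust that the check was done. A caller of `parse/1` holds an `%Email{}` whose parts are already separated, so later code has no reason to inspect the string again.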

Why Parsing over Validation?

  1. Early Error Detection: Parsing catches errors at the boundaries of your system, preventing invalid data from entering the core logic.
  2. Simplified Code: By transforming data into a well-defined structure upfront, the core application logic becomes simpler and more focused on business requirements rather than data validation.
  3. Enhanced Maintainability: Centralizing data integrity checks in parsing functions makes the system easier to understand and maintain.

Functional Programming and Parsing

In functional programming, functions are first-class citizens, and immutability is a core principle. Parsing fits naturally into this paradigm as it allows data to be transformed in a pure, deterministic manner. Once data is parsed into a well-defined structure, it remains immutable, ensuring consistency and reliability.
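As a small illustration of both properties, the sketch below uses a hypothetical `Point` module: `parse/1` is a pure function of its argument, and "updating" a parsed value produces a new struct rather than changing the original.

```elixir
defmodule Point do
  @enforce_keys [:x, :y]
  defstruct [:x, :y]

  # Pure: the result depends only on the argument, with no I/O or hidden state.
  def parse({x, y}) when is_number(x) and is_number(y), do: {:ok, %Point{x: x, y: y}}
  def parse(_), do: {:error, :invalid_point}

  # Returns a new struct; the struct passed in is left untouched.
  def move_right(%Point{} = point, dx) when is_number(dx) do
    %{point | x: point.x + dx}
  end
end

{:ok, origin} = Point.parse({0, 0})
moved = Point.move_right(origin, 5)
IO.inspect({origin.x, moved.x})
# prints {0, 5}
```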

Concurrency and Data Integrity in Elixir

Elixir runs on the Erlang VM (BEAM) and excels at building concurrent, distributed systems. Processes on the BEAM share no memory and communicate by passing messages, so a malformed value that slips past the boundary can be copied to many processes before anything fails. Parsing data at the boundaries means every process receives valid, consistent data, which keeps failures close to their source.
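One way this might look in practice is sketched below. The `Reading`, `ReadingStore`, and `Boundary` modules are hypothetical, and an `Agent` stands in for whatever process holds the state in a real system. The point is only that the process-facing function accepts a parsed struct and nothing else.

```elixir
defmodule Reading do
  @enforce_keys [:sensor_id, :celsius]
  defstruct [:sensor_id, :celsius]

  def parse(%{"sensor_id" => id, "celsius" => celsius})
      when is_binary(id) and is_number(celsius) do
    {:ok, %Reading{sensor_id: id, celsius: celsius}}
  end

  def parse(_), do: {:error, :invalid_reading}
end

defmodule ReadingStore do
  # State lives in a separate process; here it is just a list of readings.
  def start_link, do: Agent.start_link(fn -> [] end)

  # Accepts only an already-parsed %Reading{}; raw maps do not match this clause.
  def put(store, %Reading{} = reading), do: Agent.update(store, &[reading | &1])

  def all(store), do: Agent.get(store, & &1)
end

defmodule Boundary do
  # Raw input is parsed here, before anything is sent to another process.
  def ingest(store, raw) do
    with {:ok, reading} <- Reading.parse(raw) do
      ReadingStore.put(store, reading)
    end
  end
end
```

With this shape, `Boundary.ingest/2` returns `:ok` for well-formed input and `{:error, :invalid_reading}` otherwise, and the store process never sees an unparsed payload.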

Applying the "Parse, Don’t Validate" Principle in Elixir

Conceptual Approach

  1. Define Data Structures: Use Elixir structs or maps to define the shape of your data.
  2. Parse Input Data: Transform raw input data into these well-defined structures at the earliest possible point in your application.
  3. Centralize Parsing Logic: Encapsulate parsing logic in dedicated modules or functions to ensure uniformity and reuse.
  4. Leverage Pattern Matching: Utilize Elixir’s powerful pattern matching to simplify the parsing process and handle different data shapes effectively.
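The steps above can be sketched with a small, hypothetical `Address` parser. It assumes the address may arrive either as a string-keyed map (as decoded from JSON) or as a single "street, city" string; both shapes are handled by separate function clauses and end up in the same struct.

```elixir
defmodule Address do
  # Step 1: define the shape of the data.
  @enforce_keys [:street, :city]
  defstruct [:street, :city]

  # Shape 1: a map with string keys.
  def parse(%{"street" => street, "city" => city})
      when is_binary(street) and is_binary(city) do
    build(street, city)
  end

  # Shape 2: a single "street, city" string.
  def parse(line) when is_binary(line) do
    case String.split(line, ",", parts: 2) do
      [street, city] -> build(street, city)
      _ -> {:error, :invalid_address}
    end
  end

  # Anything else is rejected at the boundary.
  def parse(_), do: {:error, :invalid_address}

  # Shared final step: trim whitespace and reject empty parts.
  defp build(street, city) do
    case {String.trim(street), String.trim(city)} do
      {"", _} -> {:error, :invalid_address}
      {_, ""} -> {:error, :invalid_address}
      {s, c} -> {:ok, %Address{street: s, city: c}}
    end
  end
end
```

Keeping every accepted shape in one module means callers only ever deal with `%Address{}` or an error tuple, regardless of how the input arrived.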

Example Scenario

Consider an API endpoint that accepts user registration data. Instead of validating fields individually, parse the entire payload into a User struct.

Defining the Data Structure

defmodule User do
  defstruct [:name, :email, :age, :address]
end
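A struct definition can carry a little more of the contract. The sketch below is a variation, not the article's `User` module: `StrictUser` is a hypothetical name, and the choice of which keys are required is an assumption. `@enforce_keys` makes construction fail when a required key is missing, and matching on the struct in a function head lets core logic skip field checks.

```elixir
defmodule StrictUser do
  # Construction fails unless these keys are given.
  @enforce_keys [:name, :email]
  defstruct [:name, :email, :age, :address]

  @type t :: %__MODULE__{
          name: String.t(),
          email: String.t(),
          age: pos_integer() | nil,
          address: map() | nil
        }

  # Core logic matches on the struct instead of re-checking individual fields.
  def greeting(%StrictUser{name: name}), do: "Hello, " <> name
end

# struct!/2 raises ArgumentError if an enforced key is missing.
user = struct!(StrictUser, name: "Ana", email: "ana@example.com")
IO.puts(StrictUser.greeting(user))
```

Note that `@enforce_keys` only checks that the keys are present. It does not check their values, which is why a parsing function is still needed in front of the struct.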

Parsing the Input Data

defmodule UserParser do
  # Returns {:ok, %User{}} when every field parses, or the first {:error, reason}.
  # A non-matching clause in `with` is returned as-is, so no `else` block is needed.
  def parse(params) do
    with {:ok, name} <- validate_name(params["name"]),
         {:ok, email} <- validate_email(params["email"]),
         {:ok, age} <- validate_age(params["age"]),
         {:ok, address} <- validate_address(params["address"]) do
      {:ok, %User{name: name, email: email, age: age, address: address}}
    end
  end

  defp validate_name(name) when is_binary(name) and byte_size(name) > 0, do: {:ok, name}
  defp validate_name(_), do: {:error, "Invalid name"}

  # String.contains?/2 is not allowed in a guard, so the check lives in the function body.
  defp validate_email(email) when is_binary(email) do
    if String.contains?(email, "@"), do: {:ok, email}, else: {:error, "Invalid email"}
  end

  defp validate_email(_), do: {:error, "Invalid email"}

  defp validate_age(age) when is_integer(age) and age > 0, do: {:ok, age}
  defp validate_age(_), do: {:error, "Invalid age"}

  defp validate_address(address) when is_map(address), do: {:ok, address}
  defp validate_address(_), do: {:error, "Invalid address"}
end
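The parser above accepts an age only when it is already an integer. Values submitted through forms or query strings usually arrive as strings, and this is where parsing differs most visibly from validating: the function converts the input instead of just checking it. The sketch below is a hypothetical standalone `AgeParser`, not part of the article's module, and it assumes that surrounding whitespace should be tolerated.

```elixir
defmodule AgeParser do
  # Already an integer, for example decoded from a JSON body.
  def parse(age) when is_integer(age) and age > 0, do: {:ok, age}

  # A string, for example from a form field: convert it and reject trailing garbage.
  def parse(age) when is_binary(age) do
    case Integer.parse(String.trim(age)) do
      {n, ""} when n > 0 -> {:ok, n}
      _ -> {:error, "Invalid age"}
    end
  end

  def parse(_), do: {:error, "Invalid age"}
end
```

Whatever the input looked like, callers receive a positive integer or an error, so the rest of the application never has to ask whether the age is "42" or 42.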

Using the Parser in Your Application

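defmodule UserController do
  # Assumes a Phoenix application; `json/2` comes from Phoenix.Controller.
  import Phoenix.Controller, only: [json: 2]

  def register_user(conn, params) do
    case UserParser.parse(params) do
      {:ok, user} ->
        # Proceed with business logic using the parsed user.
        # A struct is not JSON-encodable by default, so convert it to a map first.
        json(conn, %{status: "success", user: Map.from_struct(user)})

      {:error, reason} ->
        # Handle parsing errors
        json(conn, %{status: "error", reason: reason})
    end
  end
end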

Parsing with Peri

While the above example demonstrates a manual approach to parsing, the Peri library offers a more structured way to define and enforce schemas in Elixir.

Defining a Schema with Peri

defmodule MySchemas do
  import Peri

  defschema :user, %{
    name: :string,
    email: {:required, :string},
    age: :integer,
    address: %{
      street: :string,
      city: :string
    },
    role: {:required, {:enum, [:admin, :user]}}
  }
end

Parsing Data with Peri

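defmodule UserController do
  # Assumes a Phoenix application; `json/2` comes from Phoenix.Controller.
  import Phoenix.Controller, only: [json: 2]

  def register_user(conn, params) do
    # `MySchemas` is the module defined above, so it is referenced directly.
    case MySchemas.user(params) do
      {:ok, user} ->
        # Proceed with business logic using the parsed user
        json(conn, %{status: "success", user: user})

      {:error, errors} ->
        # Handle parsing errors
        json(conn, %{status: "error", errors: errors})
    end
  end
end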

Conclusion

Adopting the "Parse, Don’t Validate" principle in Elixir ensures data integrity, simplifies code, and enhances maintainability. By transforming raw input data into structured, well-defined data at the system's boundaries, you create a robust foundation for your application.

Elixir's functional and concurrent nature makes it an ideal language for embracing this approach. While manual parsing is effective, libraries like Peri offer powerful tools to define and enforce schemas, ensuring consistency and reliability throughout your application.

Embrace parsing in Elixir, and let your code benefit from cleaner, more maintainable, and more predictable data handling.

Top comments (2)

Nelson Estevão

One lib I am using is hexdocs.pm/goal/Goal.html. Similar to Peri but I like it better :)

Zoey de Souza Pessanha

Although Goal and Peri seem to tackle the same problem, the approaches they take to solve it are very different.

Goal appears to focus more on the data than on the type structure. Goal also only handles map structures, while Peri can be used to validate any data structure, from the simplest integer or string, through tuples, lists, and composable types, to the most complex nested schema.

The idea was borrowed from Plumatic Schema, a Clojure library that defines schema validations based on raw data structure definitions. So yes, Goal is very pleasant to use, but Peri seems to tackle plenty of other issues that Goal can't.