Matheus de Camargo Marques

Guide to Parsing Financial Protocols and the FEBRABAN Standard with Elixir Binaries

My name is Matheus de Camargo Marques. I’m not sure how my last article on Cognitive Architectures relates to this one; all I know is that while abstract theory is fascinating, the real world runs on protocols that haven't changed in decades.

Let’s start at the beginning. How I arrived at this guide isn't the point right now. For years, I’ve been using Elixir and Erlang to build resilient systems, often focusing on high-level design and architecture.

Recently, however, I found myself deep in the trenches of the FEBRABAN standard—the backbone of financial data exchange in Brazil. Back then, looking at those massive technical manuals, I didn't fully grasp the elegance that could be hidden within those rigid lines. I had parsed files before, and while I managed to make them work, I often felt I was fighting against the legacy formats rather than working with them.

But as I dug deeper, I began to comprehend that handling these protocols doesn't have to be a chaotic struggle of substrings and manual mapping. There are techniques that make immense sense—transforming a tedious obligation into a clean, robust solution.

However, for something to be practical, we must be open to a paradigm shift: the way we view binary data, the architecture of our parsers, and the design of our integration layers—everything changes.

1. Introduction: The Intersection of Functional Programming and Financial Legacy

In the modern landscape of software engineering, few domains present a more distinct juxtaposition of eras than the processing of Brazilian financial documents. On one side, we have the "Boleto Bancário," a payment instrument standardized in the early 1990s, reliant on fixed-width text layouts, check digits derived from modular arithmetic, and a physical barcode standard (Interleaved 2 of 5) designed for laser scanners. On the other side, we have Elixir, a dynamic, functional language running on the Erlang VM (BEAM), built for concurrency, fault tolerance, and—crucially for this report—unrivaled prowess in binary manipulation.

This report is a technical deep dive into using Elixir’s binary pattern matching to parse, validate, and process FEBRABAN (Federação Brasileira de Bancos) compliant barcodes and CNAB (Centro Nacional de Automação Bancária) data streams. We address a specific, real-world architecture: a system receives a raw octet stream from a JavaScript client, representing a barcode scan, which must be deconstructed, validated against the banking rules, and parsed into a structured financial object without relying on an external parsing library.

Furthermore, this document addresses the "Maturity Factor Reset" of February 2025, a "Y2K-like" event for the Brazilian banking sector in which the 4-digit date factor in boletos overflowed and reset for the first time since 1997. We will explore the arithmetic behind this event and implement a pure-Elixir solution that resolves the ambiguity of factors such as "1000" from February 2025 onward.

1.1 The Power of the BEAM for Binary Parsing

To understand why Elixir is the superior tool for this task, one must look beneath the surface syntax and into the Erlang Virtual Machine (BEAM). In most high-level languages (like Python, Java, or JavaScript), strings are opaque objects. Parsing a fixed-width protocol in these languages typically involves extensive substring operations (slice, substring), which often generate new string objects on the heap, leading to memory churn.

The BEAM, however, treats binaries as a first-class primitive. It employs a sophisticated memory management strategy for binaries:

  • Heap Binaries: Small binaries (up to 64 bytes) are stored directly on the process heap.
  • Refc Binaries: Larger binaries are stored outside the process heap in a shared, reference-counted area (hence "Refc"), and the process holds only a small reference to the data.
  • Sub-binaries: When we pattern match to extract a segment of a binary (e.g., the first 3 bytes for a Bank ID), the BEAM generally does not copy the payload. Instead, it creates a lightweight "sub-binary": a window onto the original data with a specific offset and length.

With this architecture, extracting a field from a 44-byte boleto or a 400-byte CNAB record by pattern matching allocates a small, constant-size reference rather than a copy of the payload, so the memory cost per field is O(1) regardless of the field's length. We can deconstruct a multi-megabyte stream of remittance data into thousands of fields with far less memory overhead than string-slicing approaches that copy every field.
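
As a rough illustration of the sub-binary behaviour described above, the sketch below matches a field out of a synthetic 240-byte line and asks the runtime how many bytes that field refers to. The module name SubBinaryDemo and the sample line are assumptions made for this example; :binary.referenced_byte_size/1 and :binary.copy/1 are standard Erlang functions. The numbers printed can differ between OTP releases, so none are shown here.

```elixir
defmodule SubBinaryDemo do
  # Builds a synthetic 240-byte record: a 3-byte bank code plus zero padding.
  def sample_line, do: "001" <> String.duplicate("0", 237)

  # Extracts the bank code by pattern matching on the first 3 bytes.
  def bank_code(<<bank::binary-size(3), _rest::binary>>), do: bank

  # Copies a field into a binary of its own, detaching it from its parent.
  def detach(field), do: :binary.copy(field)
end

line = SubBinaryDemo.sample_line()
bank = SubBinaryDemo.bank_code(line)

IO.inspect(bank, label: "bank code")
# How many bytes the matched field refers to in the underlying binary.
IO.inspect(:binary.referenced_byte_size(bank), label: "referenced bytes (matched field)")
# The same measurement after an explicit copy.
IO.inspect(:binary.referenced_byte_size(SubBinaryDemo.detach(bank)), label: "referenced bytes (copy)")
```

One practical consequence: a small field that is still a sub-binary keeps its whole parent binary alive, so long-lived values extracted from large files are often passed through :binary.copy/1 first.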

2. The Protocol: Anatomy of the FEBRABAN Barcode

The "Boleto Bancário" is governed by rigid standards set by FEBRABAN and the Central Bank of Brazil (BACEN). While the printed document contains a "Digitable Line" (Linha Digitável) of 47 or 48 digits with formatting and internal checksums, the machine-readable barcode is a contiguous string of 44 decimal digits.

It is critical to distinguish between the Digitable Line (human-readable, entered in apps) and the Barcode (machine-readable, scanned by lasers/cameras). This report focuses on parsing the Barcode (44 digits), which is the raw data captured by the client-side scanner.

Example Barcode: 00193373700000001000500940144816060680935031

  • Bank Code (001-003): 001 (Banco do Brasil)
  • Currency (004): 9 (Brazilian Real)
  • Verifier (005): 3 (Modulo 11 checksum)
  • Maturity Factor (006-009): 3737 (day 3737 counted from 1997-10-07, i.e. 2007-12-31 in the first cycle)
  • Amount (010-019): 0000000100 (R$ 1.00, expressed in cents)
  • Free Field (020-044): 0500940144816060680935031 (Banco do Brasil specific data)
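
The breakdown above can be reproduced with a single pattern match. This is a minimal sketch: the module name BarcodeAnatomy is hypothetical, and the fields are left as ASCII text apart from the two conversions at the end.

```elixir
defmodule BarcodeAnatomy do
  # Splits a 44-byte barcode into its positional fields (all still ASCII text).
  def split(<<
        bank::binary-size(3),
        currency::binary-size(1),
        dv::binary-size(1),
        factor::binary-size(4),
        amount::binary-size(10),
        free_field::binary-size(25)
      >>) do
    %{bank: bank, currency: currency, dv: dv, factor: factor, amount: amount, free_field: free_field}
  end
end

fields = BarcodeAnatomy.split("00193373700000001000500940144816060680935031")
IO.inspect(fields)

# The amount is expressed in cents.
IO.inspect(String.to_integer(fields.amount), label: "amount in cents")
# => amount in cents: 100

# The factor counts days from 1997-10-07 (first-cycle interpretation).
IO.inspect(Date.add(~D[1997-10-07], String.to_integer(fields.factor)), label: "due date")
# => due date: ~D[2007-12-31]
```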

2.1 The 44-Digit Layout Structure

The 44-digit string is a positional protocol. Every byte has a specific meaning depending on its index. Unlike JSON or XML, there are no delimiters, tags, or keys. Meaning is derived solely from position.

The layout is divided into two logical sections: the Universal Header (Standard across all banks) and the Free Field (Defined by each bank).

Table 1: FEBRABAN Standard 44-Digit Barcode Layout

| Position (1-based) | Length (bytes) | Field Name | Description |
|---|---|---|---|
| 01 - 03 | 3 | Bank ID (Código do Banco) | The unique identifier of the issuing bank (e.g., 001 for Banco do Brasil, 237 for Bradesco). |
| 04 | 1 | Currency Code (Código da Moeda) | 9 for Real (BRL). 0 usually implies a variable index or generic reference. |
| 05 | 1 | Global Verifier (DV) | A Modulo 11 checksum of the other 43 digits. This is the integrity check for the scan. |
| 06 - 09 | 4 | Maturity Factor (Fator de Vencimento) | A count of days since the epoch (07/10/1997). This dictates the due date. |
| 10 - 19 | 10 | Value (Valor Nominal) | The document value in cents. Fixed point (no decimal separator). |
| 20 - 44 | 25 | Free Field (Campo Livre) | Bank-specific data. Its structure varies entirely by the Bank ID found in positions 01-03. |

2.2 The Challenge of the "Campo Livre"

The parser cannot treat the barcode as a monolithic structure. It requires a dispatch strategy. Once the parser reads the first 3 bytes (Bank ID), it must switch context to interpret the final 25 bytes (Free Field).

  • Banco do Brasil (001): Uses the Free Field to store the "Convenio" (Merchant ID), "Nosso Número" (Boleto ID), and "Carteira" (Wallet). The layout of these internal fields shifts depending on the size of the Convenio (4, 6, or 7 digits).
  • Bradesco (237): Uses the Free Field for Agency, Wallet, Nosso Número, and Account.
  • Itaú (341): Encodes the Carteira, Nosso Número, and Agency/Account, often with internal Modulo 10 check digits embedded within the Free Field.

A robust parser must implement these specific layouts to be useful. A generic parser that only returns the "Free Field" as a raw string is insufficient for a financial system that needs to identify the payer or the specific invoice number.

3. The 2025 Maturity Factor Reset: The "Y2K" of Boletos

A central requirement of any boleto parser written today is the correct handling of the Maturity Factor. This is a 4-digit field (Positions 06-09).

3.1 The Epoch and the Mechanism

The system uses a daily counter starting from an established epoch.

  • Base Date: October 7, 1997 (07/10/1997).
  • Operation: The factor represents Base Date + Factor (days).
  • Example: On July 3, 2000, the factor was 1000.
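
The mechanism is plain date arithmetic, which a minimal sketch can confirm. The module name MaturityEpoch is hypothetical; Date.add/2 and Date.diff/2 come from Elixir's standard library.

```elixir
defmodule MaturityEpoch do
  @base_date ~D[1997-10-07]

  # Factor -> date: add the factor, in days, to the base date.
  def to_date(factor) when is_integer(factor), do: Date.add(@base_date, factor)

  # Date -> days elapsed since the base date (equal to the factor in the first cycle).
  def days_since_base(%Date{} = date), do: Date.diff(date, @base_date)
end

IO.inspect(MaturityEpoch.to_date(1000))
# => ~D[2000-07-03]
IO.inspect(MaturityEpoch.days_since_base(~D[2025-02-21]))
# => 9999
```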

3.2 The Overflow and Reset (February 2025)

Because the field allows only 4 digits, the maximum value is 9999.

  • The Limit: The factor reaches 9999 on February 21, 2025.
  • The Reset: On February 22, 2025, the factor does not roll over to 0000 or 0001. It resets to 1000.

This creates a deliberate ambiguity. A boleto with factor 1000 could mathematically refer to:

  1. July 3, 2000 (Cycle 1)
  2. February 22, 2025 (Cycle 2)
  3. A future date around October 2049 (Cycle 3), if the same 9000-day reset rule is applied again
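
To make the ambiguity concrete, the sketch below lists the dates a single factor can stand for. It assumes that each cycle after the first spans the 9000 factors from 1000 to 9999 and that the same rule repeats for a third cycle, which is an extrapolation rather than something stated in this article. The module name FactorCycles is hypothetical.

```elixir
defmodule FactorCycles do
  @base_date ~D[1997-10-07]
  # Factors 1000..9999 are reused, so the same factor recurs every 9000 days.
  @cycle_length 9000

  # Lists the candidate due dates for `factor` over the first `cycles` cycles.
  def candidates(factor, cycles \\ 3) when factor in 1000..9999 and cycles >= 1 do
    for cycle <- 0..(cycles - 1) do
      Date.add(@base_date, factor + cycle * @cycle_length)
    end
  end
end

IO.inspect(FactorCycles.candidates(1000))
# => [~D[2000-07-03], ~D[2025-02-22], ~D[2049-10-14]]
```

A system that only ever computes the first element of this list sees every boleto issued after the reset as 9000 days (about 24.6 years) overdue.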

3.3 The Sliding Window Solution

Legacy systems that rely on simple date addition (1997-10-07 + factor) break on February 22, 2025: they interpret new boletos as being almost 25 years overdue.

The solution is a sliding window. Since boletos typically have a validity of a few months (or years for long-term financing), a factor of 1000 presented to a system in 2025 should be interpreted as a current date, not as a date in the year 2000.

The parser must therefore consult the current date. If the date calculated from the 1997 base lies too far in the past (e.g., more than 7300 days), the parser shifts it forward by one cycle, that is, by the 9000 days that separate two occurrences of the same factor.

In our implementation, we will define a calculate_maturity_date/2 function that encapsulates this logic. With a single shift and a 7300-day threshold, it resolves factors correctly for roughly the next two decades.

4. Ingesting the Data: The Client-Side Octet Stream

The scenario here is a barcode that arrives from client-side JavaScript as a stream of octets. This implies a browser-based application that uses the Streams API to send data to the Elixir backend.

4.1 The JavaScript Stream

In a modern web app, a barcode scanner (USB HID mode or Camera) might feed data into a ReadableStream. The browser might chunk this data and send it via fetch (POST) or a WebSocket (Phoenix Channel).

// Conceptual JS client. getReaderStream() is a placeholder for whatever
// produces the scanner's ReadableStream; `channel` is a joined Phoenix Channel.
const barcodeReader = getReaderStream(); // Returns ReadableStream
const reader = barcodeReader.getReader();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // 'value' is a Uint8Array (octets)
  // Send 'value' to Elixir via Channel or fetch
  channel.push("barcode_chunk", value);
}

4.2 The Elixir Receptor

In Elixir, this data arrives as a binary. If the client sends a Uint8Array containing the ASCII codes for "341...", Elixir sees <<51, 52, 49...>>.

However, network packets might fragment the message. A robust solution must handle streaming aggregation. The pattern matching parser works best on the complete message. Therefore, the first step in our module will be an accumulator or a buffer handling strategy, though for the sake of the parsing tutorial, we will assume the stream has been aggregated into a full binary payload before the parsing function is called.

The data type we are dealing with is explicitly binary (a sequence of bytes), representing ASCII characters.
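
The buffering step mentioned above can be sketched in a few lines. This is only one possible shape, assuming the chunks arrive in order and that a barcode is complete once 44 bytes have accumulated; the module name BarcodeBuffer and the return tuples are hypothetical.

```elixir
defmodule BarcodeBuffer do
  @barcode_size 44

  # Appends a chunk to the buffer and reports whether a full barcode is available.
  # Any bytes beyond the first 44 are returned as the start of the next buffer.
  def push(buffer, chunk) when is_binary(buffer) and is_binary(chunk) do
    case buffer <> chunk do
      <<barcode::binary-size(@barcode_size), rest::binary>> -> {:complete, barcode, rest}
      partial -> {:incomplete, partial}
    end
  end
end

# The example barcode split across two network chunks (19 + 25 bytes).
{:incomplete, buffer} = BarcodeBuffer.push("", "0019337370000000100")
{:complete, barcode, rest} = BarcodeBuffer.push(buffer, "0500940144816060680935031")
IO.inspect({barcode, rest})
# => {"00193373700000001000500940144816060680935031", ""}
```

In a channel or GenServer, the buffer would typically live in the process state (for example in the socket assigns) between messages.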

4.2.1 Integration Example with Phoenix Channels

In a Phoenix application, a WebSocket channel can receive the barcode data and hand it to the parser:

# In a Phoenix Channel Handler
defmodule MyApp.BarcodeChannel do
  use Phoenix.Channel

  alias FebrabanProtocol.FinancialParser

  def join("barcode:lobby", _message, socket) do
    {:ok, socket}
  end

  def handle_in("scan", %{"data" => barcode_data}, socket) do
    # barcode_data arrives as a string from the JavaScript client
    case FinancialParser.parse_barcode(barcode_data) do
      {:ok, parsed} ->
        broadcast!(socket, "barcode_scanned", %{
          bank: parsed.metadata.bank_name,
          amount: parsed.amount,
          maturity_date: parsed.maturity_date
        })
        {:noreply, socket}

      {:error, reason} ->
        # reason may be a tuple such as {:invalid_length, n}, which is not
        # JSON-encodable, so it is rendered as a string first.
        broadcast!(socket, "barcode_error", %{error: inspect(reason)})
        {:noreply, socket}
    end
  end
end

The Elixir backend validates the barcode immediately upon receipt, allowing the client to receive feedback in real-time.

5. Implementation: The Pure Elixir Parser

We will now construct the solution. We adhere to the constraints: pure Elixir, no external libraries for parsing, and extensive use of binary pattern matching. The one dependency we do accept is the Decimal library, used to represent monetary values precisely.

5.1 Module Structure

We define a module FebrabanProtocol.FinancialParser with a primary public API parse_barcode/1.

defmodule FebrabanProtocol.FinancialParser do
  @moduledoc """
  A pure Elixir parser for FEBRABAN financial standards.
  Handles 44-digit barcodes using binary pattern matching.
  """

  # Base Epoch for Maturity Factor (07/10/1997)
  @base_date ~D[1997-10-07]
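  # The factor value the counter restarts from after the reset (1000)
  @reset_factor_floor 1000
  # The value at which the 4-digit counter would overflow (9999 + 1).
  # @days_in_cycle - @reset_factor_floor = 9000 days between two uses of the same factor.
  @days_in_cycle 10000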

  defstruct [
    :bank_code,
    :currency_code,
    :verifier_digit,
    :maturity_factor,
    :amount,
    :free_field,
    :barcode_raw,
    :valid_checksum?,
    :maturity_date,
    :metadata
  ]
end

5.2 The Binary Matcher (The Core)

This is the heart of the system. We use Elixir’s bitstring syntax <<...>> to decompose the 44 bytes. We assume input is ASCII digits.

The syntax binary-size(N) is used to grab N bytes. Since we are parsing a string of digits, each digit is 1 byte (UTF-8/ASCII).

  @doc """
  Parses a 44-byte binary representing a FEBRABAN barcode.
  Input: binary (ASCII digits).
  """
  def parse_barcode(raw_input), do: parse_barcode(raw_input, Date.utc_today())

  def parse_barcode(<<
        bank_id_bin       :: binary-size(3),
        currency_bin      :: binary-size(1),
        verifier_bin      :: binary-size(1),
        factor_bin        :: binary-size(4),
        amount_bin        :: binary-size(10),
        free_field_bin    :: binary-size(25)
      >> = raw_input, today) do

    with {bank_code, ""} <- Integer.parse(bank_id_bin),
         {currency, ""}  <- Integer.parse(currency_bin),
         {verifier, ""}  <- Integer.parse(verifier_bin),
         {factor, ""}    <- Integer.parse(factor_bin),
         {amount_cts, ""} <- Integer.parse(amount_bin) do

      # 1. Logic: Calculate Maturity Date (Handling 2025 Reset)
      maturity_date = calculate_maturity_date(factor, today)

      # 2. Logic: Validate Checksum (Modulo 11)
      is_valid = validate_global_checksum(raw_input, verifier)

      # 3. Logic: Parse Free Field (Bank Specific Strategy)
      parsed_free_field = parse_free_field(bank_code, free_field_bin)

      {:ok, %__MODULE__{
        bank_code: bank_code,
        currency_code: currency,
        verifier_digit: verifier,
        maturity_factor: factor,
        amount: Decimal.new(amount_cts) |> Decimal.div(100),
        free_field: parsed_free_field,
        barcode_raw: raw_input,
        valid_checksum?: is_valid,
        maturity_date: maturity_date,
        metadata: generate_metadata(bank_code)
      }}
    else
      _ -> {:error, :parsing_failed}
    end
  end

  # Catch-all for invalid lengths
  def parse_barcode(binary, _today) when is_binary(binary) do
    {:error, {:invalid_length, byte_size(binary)}}
  end

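  defp generate_metadata(1), do: %{bank_name: "Banco do Brasil"}
  defp generate_metadata(237), do: %{bank_name: "Bradesco"}
  defp generate_metadata(341), do: %{bank_name: "Itaú"}
  # Without this clause, an unlisted bank code raises FunctionClauseError.
  defp generate_metadata(_other), do: %{bank_name: "Unknown"}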

Here, we introduce a new function generate_metadata/1, which enriches our parsed data with the bank's name. Unknown bank codes need a catch-all clause in generate_metadata/1 as well as in parse_free_field/2 (see Section 7); otherwise the call raises a FunctionClauseError instead of returning a result. We also make a crucial choice: for the :amount field, we use the Decimal library. Standard floats can introduce precision errors in financial calculations, so Decimal is the correct tool for representing monetary values.
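
One gap worth noting: Integer.parse/1 accepts a leading sign, so Integer.parse("-01") returns {-1, ""} and a field such as "-01" would pass the with chain above. A digits-only check closes that gap. The sketch below is one way to write it with the same binary matching style; the module name DigitGuard is hypothetical and it is not part of the parser shown in this article.

```elixir
defmodule DigitGuard do
  # Returns true only when every byte is an ASCII digit ("0".."9").
  # An empty binary returns true, so callers should check the length separately.
  def all_digits?(<<>>), do: true
  def all_digits?(<<byte, rest::binary>>) when byte in ?0..?9, do: all_digits?(rest)
  def all_digits?(_other), do: false
end

IO.inspect(DigitGuard.all_digits?("00193373700000001000500940144816060680935031"))
# => true
IO.inspect(DigitGuard.all_digits?("-01"))
# => false
IO.inspect(Integer.parse("-01"))
# => {-1, ""}
```

Calling such a check on the raw 44 bytes before the integer conversions keeps signs, spaces, and letters out of the parsed struct.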

5.3 Deep Dive: The Pattern Matching Explanation

In the function head above, binary-size(3) tells the BEAM to look for exactly 3 bytes. If the input is "0019...", bank_id_bin binds to "001" (as a sub-binary).

Why is this superior to String.slice(binary, 0, 3)?

  1. Safety: If the binary is not exactly 44 bytes long, the function clause simply doesn't match. We don't get an "Index out of bounds" error; we get a clean flow control mechanism.
  2. Performance: bank_id_bin is not a copy. It is a reference to the first 3 bytes of raw_input. When we pass free_field_bin (25 bytes) to the parsing logic, we are passing a tiny reference, not copying 25 bytes of memory. This efficiency scales massively when processing high-throughput streams (e.g., parsing thousands of CNAB records per second).
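
There is also a correctness difference that matters for fixed-width layouts: String.slice/3 counts graphemes, while binary-size(N) and binary_part/3 count bytes, which is what a positional specification describes. The short sketch below shows the two diverging on text that contains a multi-byte UTF-8 character; the sample string is made up for the illustration.

```elixir
text = "Itaú123"

# String.slice/3 counts graphemes: "ú" is one grapheme but two bytes in UTF-8.
IO.inspect(String.slice(text, 0, 4))
# => "Itaú"

# binary_part/3 counts bytes, so the fourth byte is only half of "ú".
IO.inspect(binary_part(text, 0, 4))
# => <<73, 116, 97, 195>>

# binary-size(4) in a pattern behaves like binary_part/3.
<<head::binary-size(4), _rest::binary>> = text
IO.inspect(head == binary_part(text, 0, 4))
# => true
```

For pure ASCII digits the two agree, but fields such as payer names in CNAB files may contain accented characters, and there the byte-based view is the one that matches the layout manual.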

6. Detailed Implementation: The 2025 Maturity Logic

As established in Section 3, we cannot simply use Date.add(@base_date, factor). We must implement the "Sliding Window."

The logic implemented in calculate_maturity_date/2 works as follows:

  1. Calculate a "naive date" by adding the factor to the 1997-10-07 epoch.
  2. Compare this naive_date to the current system date (today, which is injected for testability).
  3. If the naive_date is more than a certain threshold in the past (e.g., 7300 days, or ~20 years), we assume the boleto belongs to the next cycle.
  4. To correct this, we "slide" the date forward by the number of days that separate two occurrences of the same factor. Factor 1000 first fell on day 1000 after the epoch and was reused on day 10000 (the day after factor 9999), so we must add (@days_in_cycle - @reset_factor_floor), which is 9000 days.

  def calculate_maturity_date(0, _today), do: nil # Factor 0 implies "Contra-Apresentação"

  def calculate_maturity_date(factor, today) when factor >= @reset_factor_floor do
    naive_date = Date.add(@base_date, factor)

    # Days between two occurrences of the same factor (10000 - 1000 = 9000).
    days_to_add = @days_in_cycle - @reset_factor_floor

    # If the naive date is more than 7300 days in the past (approx 20 years),
    # assume the boleto belongs to the second cycle (post-2025).
    # 7300 is a safe threshold, as boletos rarely have validity > 10-15 years.
    # Note: only one shift is applied, so this covers the second cycle only.
    if Date.diff(today, naive_date) > 7300 do
      # Shift to the next cycle by adding the calculated day difference.
      Date.add(naive_date, days_to_add)
    else
      naive_date
    end
  end

  # Factors < 1000 are technically legacy/invalid for standard boletos
  # but might appear in testing or specific internal uses.
  def calculate_maturity_date(factor, _today), do: Date.add(@base_date, factor)

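This code is pure and deterministic (thanks to the injected today date), and it handles the February 2025 transition. Because it applies at most one shift, it would need revisiting if the factor wraps a second time.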
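
If the single shift is a concern, the same idea generalizes to any number of cycles. The sketch below is an alternative formulation, not the article's function: it keeps shifting by 9000 days until the date is no older than a configurable limit. The module name MaturityWindow, the function resolve/3, and the assumption that later cycles reuse factors 1000 to 9999 in the same way are all introduced here for illustration.

```elixir
defmodule MaturityWindow do
  @base_date ~D[1997-10-07]
  # Factors 1000..9999 give 9000 distinct values, so a factor repeats every 9000 days.
  @cycle_days 9000

  # Returns the earliest candidate date that is not older than `max_overdue_days`.
  def resolve(factor, today, max_overdue_days \\ 7300) when factor in 1000..9999 do
    oldest_allowed = Date.add(today, -max_overdue_days)
    @base_date |> Date.add(factor) |> shift_forward(oldest_allowed)
  end

  # Moves the date forward one cycle at a time until it enters the window.
  defp shift_forward(date, oldest_allowed) do
    if Date.compare(date, oldest_allowed) == :lt do
      shift_forward(Date.add(date, @cycle_days), oldest_allowed)
    else
      date
    end
  end
end

IO.inspect(MaturityWindow.resolve(1000, ~D[2026-01-03]))
# => ~D[2025-02-22]
IO.inspect(MaturityWindow.resolve(1000, ~D[2050-01-01]))
# => ~D[2049-10-14]
```

The window is always 9000 days wide, so the look-back limit also fixes the look-ahead: with 7300 days back, only due dates up to 1700 days ahead are resolved to the future. Passing a smaller max_overdue_days trades tolerance for old boletos against reach into the future.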

7. The Strategy Pattern: Parsing the "Campo Livre"

The "Free Field" is where the binary pattern matching truly shines. Instead of complex if/else chains inside a single function, we define a private function parse_free_field/2 with multiple clauses pattern-matching on the bank_code.

7.1 Banco do Brasil (001) Layouts

Banco do Brasil is notoriously complex. It has distinct layouts depending on the length of the "Convenio" (Merchant ID). For this report, we will implement the common 6-digit Convenio layout.

  # Banco do Brasil (001)
  defp parse_free_field(1, binary) do
    # Match the 6-digit convenio layout. Note that any 25-byte field satisfies
    # this pattern, so telling the 4-, 6-, and 7-digit convenio variants apart
    # requires extra rules from the bank's specification.
    case binary do
       <<convenio::binary-size(6), nosso_numero::binary-size(17), carteira::binary-size(2)>> ->
         %{
           bank: "Banco do Brasil",
           convenio: convenio,
           nosso_numero: nosso_numero,
           carteira: carteira,
           layout: :bb_convenio_6
         }
       # Reached only if the field is not 25 bytes long
       _ -> %{bank: "Banco do Brasil", raw: binary, error: :unknown_layout}
    end
  end

7.2 Bradesco (237) Layout

Bradesco uses a very consistent layout:

  # Bradesco (237)
  defp parse_free_field(237, <<
        agency        :: binary-size(4),
        wallet        :: binary-size(2),
        nosso_numero  :: binary-size(11),
        account       :: binary-size(7),
        _zero         :: binary-size(1)
      >>) do
    %{
      bank: "Bradesco",
      agency: agency,
      wallet: wallet,
      nosso_numero: nosso_numero,
      account: account
    }
  end

7.3 Itaú (341) Layout

Itaú introduces Modulo 10 check digits inside the free field for validation of the Nosso Número and Account.

  # Itaú (341)
  defp parse_free_field(341, <<
        wallet        :: binary-size(3),
        nosso_numero  :: binary-size(8),
        dac_nn        :: binary-size(1),
        agency        :: binary-size(4),
        account       :: binary-size(5),
        dac_ac        :: binary-size(1),
        _zeros        :: binary-size(3)
      >>) do
    %{
      bank: "Itaú",
      wallet: wallet,
      nosso_numero: "#{nosso_numero}-#{dac_nn}",
      agency: agency,
      account: "#{account}-#{dac_ac}"
    }
  end

7.4 Fallback for Unknown Banks

When the parser encounters a bank code not explicitly handled, it falls back to a generic handler:

  # Fallback for unknown banks
  defp parse_free_field(_bank_code, binary) do
    %{
      bank: "Unknown",
      raw: binary,
      error: :unknown_bank
    }
  end

This design follows the Open-Closed Principle: the system is open for extension (add new bank clauses) but closed for modification (the main parsing logic remains unchanged). New banks can be supported by simply adding new function heads to parse_free_field/2.

8. Algorithms of Trust: Modulo 10 and 11

No financial parser is complete without validation. The FEBRABAN standard relies on Modulo 10 and Modulo 11.

  • Modulo 10: Used for the Digitable Line fields and internal bank check digits (like Itaú's).
  • Modulo 11: Used for the Global Verifier (Position 5) of the barcode.
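
Modulo 10 is not implemented elsewhere in this article, so here is a minimal sketch of the usual form: weights alternate 2, 1 starting from the rightmost digit, two-digit products contribute the sum of their digits, and the check digit is the distance to the next multiple of 10. The module name Mod10Sketch is hypothetical, and which digits feed a given bank's internal check digit (such as Itaú's) is bank-specific and not covered here.

```elixir
defmodule Mod10Sketch do
  # Computes the Modulo 10 check digit for a binary of ASCII digits.
  def check_digit(digits) when is_binary(digits) do
    sum = sum_digits(digits, byte_size(digits) - 1, 2, 0)
    rem(10 - rem(sum, 10), 10)
  end

  defp sum_digits(_digits, -1, _weight, acc), do: acc

  defp sum_digits(digits, index, weight, acc) do
    product = (:binary.at(digits, index) - ?0) * weight
    # Products above 9 contribute the sum of their two digits (e.g. 18 -> 1 + 8 = 9).
    reduced = if product > 9, do: product - 9, else: product
    next_weight = if weight == 2, do: 1, else: 2
    sum_digits(digits, index - 1, next_weight, acc + reduced)
  end
end

# First field of a Digitable Line: bank (001), currency (9), first 5 free-field digits.
IO.inspect(Mod10Sketch.check_digit("001905009"))
# => 5
```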

8.1 The Modulo 11 Implementation (Global Verifier)

The algorithm for the global verifier is:

  1. Takes 43 digits (the whole barcode, excluding the DV at pos 5).
  2. Multipliers range from 2 to 9, cycling: 2, 3, 4, 5, 6, 7, 8, 9, 2, 3...
  3. Direction: Right to Left.
  4. Sum = sum(digit * multiplier)
  5. Remainder = Sum % 11
  6. DV = 11 - Remainder.
  7. Exception: If DV is 0, 10, or 11, the result is 1.

8.2 Functional Implementation

We avoid loop constructs. We use recursion on the binary.

  defp validate_global_checksum(barcode, expected_dv) do
    # Remove the DV at index 4 (5th byte)
    <<prefix::binary-size(4), _dv::binary-size(1), suffix::binary-size(39)>> = barcode

    # Concatenate to form the 43-digit payload
    payload = prefix <> suffix

    calculated = calculate_mod11(payload)
    calculated == expected_dv
  end

  defp calculate_mod11(binary) do
    sum = sum_mod11(binary, byte_size(binary) - 1, 2, 0)
    remainder = rem(sum, 11)
    result = 11 - remainder

    # Exception handling for Barcode DV
    if result in [0, 10, 11], do: 1, else: result
  end

  defp sum_mod11(_binary, -1, _mult, acc), do: acc

  defp sum_mod11(binary, index, mult, acc) do
    # Extract ASCII byte at index and convert to integer
    digit = :binary.at(binary, index) - 48
    new_acc = acc + (digit * mult)
    # Cycle multiplier: 2..9
    new_mult = if mult == 9, do: 2, else: mult + 1

    sum_mod11(binary, index - 1, new_mult, new_acc)
  end

This implementation is fast: it reads each byte directly with :binary.at/2 (an O(1) operation on binaries) and builds no intermediate lists. The only allocation is the 43-byte payload created by the concatenation in validate_global_checksum/2.
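
As a cross-check of the algorithm, the sketch below recomputes the verifier of the example barcode from Section 2 using a different formulation (Enum and Stream.cycle/1 instead of index-based recursion). The module name Mod11Check is hypothetical; it is meant for verification by hand, not as a replacement for the function above.

```elixir
defmodule Mod11Check do
  # Computes the barcode DV from the 43 digits that exclude position 5.
  def dv(<<head::binary-size(4), _dv, tail::binary-size(39)>>) do
    payload = head <> tail
    digits = for <<byte <- payload>>, do: byte - ?0

    # Weights 2..9 are applied from right to left and repeat as needed.
    sum =
      digits
      |> Enum.reverse()
      |> Enum.zip(Stream.cycle(2..9))
      |> Enum.reduce(0, fn {digit, weight}, acc -> acc + digit * weight end)

    case 11 - rem(sum, 11) do
      result when result in [0, 10, 11] -> 1
      result -> result
    end
  end
end

IO.inspect(Mod11Check.dv("00193373700000001000500940144816060680935031"))
# => 3
```

For this barcode the weighted sum is 712, the remainder of 712 / 11 is 8, and 11 - 8 = 3, which matches the digit at position 5.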

9. Beyond the Barcode: Parsing CNAB 240/400

The same techniques apply to the "Padrão CNAB". While the barcode is used for payment, CNAB interchange files are used for remittance and reconciliation. These are fixed-width text files (240 or 400 bytes per line, depending on the standard).

Binary pattern matching is a natural way to parse CNAB in Elixir.

9.1 The CNAB Structure

A CNAB file is a sequence of lines (records).

  • Header File: First line.
  • Header Lote: (CNAB 240 only) Batch header.
  • Detail Records: The actual transactions (Segment P, Q, R, etc.).
  • Trailers: Footers.

9.2 Parsing a CNAB File Stream

We can process a file line by line using recursive binary matching in a separate module, FebrabanProtocol.CnabParser.

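defmodule FebrabanProtocol.CnabParser do
  @line_length 240 # For CNAB 240

  # Lines terminated by CRLF or by LF
  def parse_file(<<line::binary-size(@line_length), "\r\n", rest::binary>>),
    do: [parse_record(line) | parse_file(rest)]

  def parse_file(<<line::binary-size(@line_length), "\n", rest::binary>>),
    do: [parse_record(line) | parse_file(rest)]

  # Last line without a trailing newline, then end of input
  def parse_file(<<line::binary-size(@line_length)>>), do: [parse_record(line)]
  def parse_file(<<>>), do: []

  # In CNAB 240 the record type is the 8th byte, after bank code (3) and batch (4)
  defp parse_record(<<_bank::binary-size(3), _batch::binary-size(4), record_type, _rest::binary>> = line) do
    case record_type do
      ?0 -> parse_header(line)
      ?3 -> parse_detail(line)
      _ -> {:error, :unknown_record_type}
    end
  end

  # Minimal header handling; field extraction is omitted here
  defp parse_header(_line), do: %{type: :header}

  # Detail Segment Parsing (Segment P example)
  defp parse_detail(<<
    _code_bank      :: binary-size(3),
    _batch          :: binary-size(4),
    _type           :: binary-size(1),
    _id             :: binary-size(5),
    segment         :: binary-size(1), # 'P', 'Q', etc.
    _filler         :: binary-size(1),
    _movement_code  :: binary-size(2),
    #... huge list of fields (they must be matched here so that amount lands on its documented position)...
    amount          :: binary-size(15),
    _rest           :: binary
  >>) do
    %{
      type: :detail,
      # segment is already a 1-byte binary such as "P"
      segment: segment,
      amount: parse_money(amount)
    }
  end

  defp parse_money(bin) do
    {int, _} = Integer.parse(bin)
    Decimal.new(int) |> Decimal.div(100)
  end
end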

9.3 Advantages for CNAB

Using binary-size(N) allows us to map the official PDF documentation of the CNAB standard directly to Elixir code. If the spec says "Positions 18 to 22: Agência", we write _skip::binary-size(17), agency::binary-size(5), .... This readability drastically reduces bugs compared to line.substring(17, 22) where off-by-one errors are common.
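
When a layout has many fields, a small helper that takes the manual's 1-based, inclusive positions can complement the pattern-matching style. The sketch below is one such helper; the module name FixedWidth and the sample line are assumptions for this example, and binary_part/3 is a standard Kernel function.

```elixir
defmodule FixedWidth do
  # Extracts the bytes between two 1-based, inclusive positions,
  # as they are written in the layout manuals.
  def field(line, from, to)
      when is_binary(line) and from >= 1 and to >= from and to <= byte_size(line) do
    binary_part(line, from - 1, to - from + 1)
  end
end

# A synthetic 240-byte line with "01234" at positions 18 to 22.
line = String.duplicate(" ", 17) <> "01234" <> String.duplicate(" ", 218)
IO.inspect(FixedWidth.field(line, 18, 22))
# => "01234"
```

Positions 18 to 22 become an offset of 17 and a length of 5, the same numbers that appear in the _skip::binary-size(17), agency::binary-size(5) pattern.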

10. Performance and Verification

Elixir performs well on this task. Binary pattern matching on the BEAM typically outperforms regex and string slicing because it avoids copying each field, although the margin depends on the workload and should be measured. For a "Mega-Parser" processing millions of CNAB records (a common banking workload), this approach helps keep the Garbage Collector from becoming a bottleneck.
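
A quick way to check this claim on your own machine is a rough timing loop. The sketch below uses :timer.tc/1 from the Erlang standard library; the module name SliceBench, the sample line, and the iteration count are assumptions, and no timings are quoted because they vary by hardware and OTP release. For serious measurements, a dedicated benchmarking library such as Benchee is a better fit.

```elixir
defmodule SliceBench do
  # A synthetic 240-byte record used as input for both approaches.
  @line "001" <> String.duplicate("0", 237)

  def by_match(<<bank::binary-size(3), _rest::binary>>), do: bank
  def by_slice(line), do: String.slice(line, 0, 3)

  # Calls `fun` on the sample line `iterations` times and returns elapsed microseconds.
  def time(fun, iterations \\ 100_000) do
    {micros, _result} =
      :timer.tc(fn -> Enum.each(1..iterations, fn _ -> fun.(@line) end) end)

    micros
  end
end

IO.puts("pattern match: #{SliceBench.time(&SliceBench.by_match/1)} microseconds")
IO.puts("String.slice:  #{SliceBench.time(&SliceBench.by_slice/1)} microseconds")
```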

10.1 Deterministic Date Testing

A critical aspect of testing date-sensitive logic is to remove dependency on the system clock, which makes tests flaky and non-repeatable. Our parse_barcode/2 function is designed for this.

# Instead of this, which can fail depending on when it's run:
FinancialParser.parse_barcode(some_barcode)

# We do this in our tests:
today = ~D[2026-01-03]
FinancialParser.parse_barcode(some_barcode, today)

By explicitly passing the "current" date into the function, we can reliably test how the parser behaves at any point in time, especially around the February 2025 reset date.

10.2 Property-Based Testing

To further ensure the parser handles the "2025 Bug" and other edge cases correctly, Property-Based Testing with a library like StreamData is invaluable. Instead of writing dozens of individual examples, we can define a "property" that must hold true for all possible inputs.

For example, we can assert that for any valid factor, the calculated maturity date will never be in the distant past.

# In your test file (e.g., test/febraban_protocol/financial_parser_test.exs)
# Requires adding {:stream_data, "~> 0.5", only: :test} to mix.exs

defmodule FebrabanProtocol.FinancialParserTest do
  use ExUnit.Case, async: true
  use ExUnitProperties

  alias FebrabanProtocol.FinancialParser

  property "maturity date calculation is always logical" do
    check all factor <- StreamData.integer(1000..9999),
              # Generate a "today" between 2024 and 2043, the span a single cycle shift covers
              years <- StreamData.integer(0..20) do
      today = Date.add(~D[2024-01-01], years * 365)
      calculated_date = FinancialParser.calculate_maturity_date(factor, today)

      # The property: the calculated date is never more than 7300 days (~20 years) in the past.
      # This ensures our sliding window logic is working.
      assert Date.diff(today, calculated_date) <= 7300
    end
  end
end

This single property test gives us much higher confidence than dozens of hand-picked examples, as it runs hundreds of variations automatically, searching for edge cases that might break our logic.

11. Conclusion

We have constructed a comprehensive parser for Brazilian financial protocols whose only external dependency is Decimal. By leveraging Elixir's binary pattern matching, we transformed a complex integration problem involving legacy layouts, ambiguous dates (2025), and bank-specific polymorphism into a clean, declarative, and high-performance system.

Key Takeaways:

  1. Binary > String: Treating the barcode as a binary stream allows for safe, explicit, and memory-efficient parsing.
  2. 2025 Readiness: The "Maturity Factor" reset logic is critical. The "Sliding Window" implementation, made testable with date injection, ensures the system continues to operate correctly after February 22, 2025.
  3. Extensibility: The pattern-matching strategy pattern used for the "Free Field" allows the parser to support new banks by simply adding new function heads, adhering to the Open-Closed Principle.
  4. Precision Matters: Using the Decimal library for monetary values is non-negotiable for preventing floating-point rounding errors.
  5. Robust Testing: Combining deterministic date injection and property-based testing provides strong guarantees of correctness for complex business logic.

This architecture provides a robust foundation for any fintech application operating in the Brazilian market, ready to handle the scale and the legacy intricacies of the banking system.
