Where is my ETS table!?

Sergey Fedorov

Foreword

In a perfect world, everyone would read the documentation beforehand.

TL;DR

While working on the Elixir project Avrora (convenient encoding and decoding of Avro messages), I received an issue report about the test environment.

It looked odd to me: some ETS references became invalid during a simple test case.

A very straightforward Phoenix controller

defmodule AvroraIssueWeb.PageController do
  use AvroraIssueWeb, :controller

  def index(conn, _params) do
    {:ok, _} = Avrora.encode(%{"login" => "fxn"}, schema_name: "com.example.User") |> IO.inspect()
    json(conn, %{})
  end
end

fails these dead-simple test cases:

defmodule AvroraIssueWeb.PageControllerTest do
  use AvroraIssueWeb.ConnCase

  test "1", %{conn: conn} do
    get(conn, "/")
    assert true
  end

  test "2", %{conn: conn} do
    get(conn, "/")
    assert true
  end
end

If we look inside the function Avrora.encode/2, we find that it triggers JSON string parsing, which creates an Elixir struct holding a reference to a newly created ETS table. The result of Avrora.encode/2 is cached.

But how does the ETS reference in the cache become invalid? And why?

Let's take a look at what ETS is.

ETS is Erlang's built-in term storage. It provides the ability to store very large quantities of data in an Erlang runtime system and to have constant access time to the data (depending on the table type).

In Elixir you can access it by using the :ets module.
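As a minimal sketch of what that access looks like, the snippet below creates a table, inserts one object, and looks it up. The table name :users and the stored data are made up for illustration and have nothing to do with Avrora's internals.

```elixir
# Create an unnamed table; :ets.new/2 returns a table identifier (a reference).
table = :ets.new(:users, [:set, :protected])

# Objects are tuples; by default the first element is the key.
true = :ets.insert(table, {"fxn", %{"login" => "fxn"}})

# :ets.lookup/2 returns a list of matching objects (empty if the key is absent).
found = :ets.lookup(table, "fxn")
missing = :ets.lookup(table, "nobody")

IO.inspect(found, label: "found")
IO.inspect(missing, label: "missing")
```

Note that the table belongs to whichever process called :ets.new/2, which matters for everything that follows.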

If we continue reading the official Erlang documentation, we will find one important sentence:

Notice that there is no automatic garbage collection for tables. Even if there
are no references to a table from any process, it is not automatically destroyed
unless the owner process terminates
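The consequence for a cached reference can be sketched in a few lines. This is only an illustration under my own assumptions (a plain spawned process, a made-up table name :short_lived), not the code from the reported issue: a short-lived process creates an unnamed table, hands the reference to another process, and then exits.

```elixir
parent = self()

# A short-lived process creates an unnamed table and sends its reference back.
pid =
  spawn(fn ->
    table = :ets.new(:short_lived, [:public])
    :ets.insert(table, {:key, "value"})
    send(parent, {:table, table})

    # Stay alive until told to stop, so the table can be inspected first.
    receive do
      :stop -> :ok
    end
  end)

table =
  receive do
    {:table, table} -> table
  end

# While the owner is alive, the reference works from other processes.
alive_lookup = :ets.lookup(table, :key)

# Stop the owner and wait until it has terminated.
monitor = Process.monitor(pid)
send(pid, :stop)

receive do
  {:DOWN, ^monitor, :process, ^pid, _reason} -> :ok
end

# We still hold the reference, but the table behind it no longer exists.
info_after_exit = :ets.info(table)

lookup_after_exit =
  try do
    :ets.lookup(table, :key)
  rescue
    ArgumentError -> :table_is_gone
  end
```

The reference is an ordinary term, so nothing stops us from keeping it around; it simply stops pointing at anything once the owner is gone.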

ETS must have an owner process

To illustrate this ETS behavior, let's take a look at this example:

# file: where_is_my_ets.ex

# This is a wrapper around the :ets module.
# We will use named tables to visualize it later.
defmodule Table do
  def new(name) do
    :ets.new(name, [:named_table])
  end
end

# This is a GenServer which can create named tables
# by calling Table.new/1
defmodule Owner do
  use GenServer

  @impl true
  def init(state) do
    {:ok, state}
  end

  @impl true
  def handle_call({:new, name}, _from, state) do
    {:reply, Table.new(name), state}
  end
end

Now let's start an IEx session and load this code via iex where_is_my_ets.ex. Notice that the IEx session is also a process, so every table created directly in the REPL belongs to that process. Let's give it a try:

# Let's remember the PID of a REPL session
iex(1)> self
#PID<0.113.0>

# Now let's check what kind of ETS tables already exist
iex(2)> :ets.all
[:logger, :ac_tab, #Reference<0.1096493602.1531314178.170593>, ...,
 :elixir_config, :elixir_modules, IEx.Config, IEx.Pry]

# Creating new table with name :repl
iex(3)> Table.new(:repl)
:repl

# Who is the owner?
iex(4)> :ets.info(:repl)
[
  id: #Reference<0.1096493602.1531314178.171287>,
  read_concurrency: false,
  write_concurrency: false,
  compressed: false,
  memory: 311,
  owner: #PID<0.113.0>, # <--- This is our REPL session!
  heir: :none,
  name: :repl,
  size: 0,
  node: :nonode@nohost,
  named_table: true,
  type: :set,
  keypos: 1,
  protection: :protected
]

# Now we are able to see :repl table in the list of all tables
iex(5)> :ets.all
[:logger, :ac_tab, #Reference<0.1096493602.1531314178.170593>, ...,
 :elixir_config, :elixir_modules, IEx.Config, IEx.Pry, :repl]

# Starting Owner process without linking to REPL process
iex(6)> {:ok, pid} = GenServer.start(Owner, [])
{:ok, #PID<0.123.0>}

# Create a new table from the REPL, but through a different process
iex(7)> GenServer.call(pid, {:new, :hello})
:hello

# Who is the owner of that table?
iex(8)> :ets.info(:hello)
[
  id: #Reference<0.1096493602.1531314178.171427>,
  read_concurrency: false,
  write_concurrency: false,
  compressed: false,
  memory: 311,
  owner: #PID<0.123.0>, # <--- This is our Owner process
  heir: :none,
  name: :hello,
  size: 0,
  node: :nonode@nohost,
  named_table: true,
  type: :set,
  keypos: 1,
  protection: :protected
]

# We can see :hello table too
iex(9)> :ets.all
[:logger, :ac_tab, #Reference<0.1096493602.1531314178.170593>, ...,
 :elixir_config, :elixir_modules, IEx.Config, IEx.Pry, :repl, :hello]

# What happens if the Owner process dies?
iex(10)> Process.exit(pid, [])
true

# The :hello table is gone ...
iex(11)> :ets.info(:hello)
:undefined

# It has also disappeared from the "all" list
iex(12)> :ets.all
[:logger, :ac_tab, #Reference<0.1096493602.1531314178.170593>, ...,
 :elixir_config, :elixir_modules, IEx.Config, IEx.Pry, :repl]

So what was the problem with tests?

In the Phoenix framework, every request is handled in its own short-lived process (and in ExUnit, each test runs in its own process as well). Inside that process, an ETS table was created and its reference was stored in a shared cache.

But once the request is done and its process terminates, the ETS table is destroyed and the cached reference becomes invalid. To solve this issue, I delegated ETS table creation to a separate long-lived process, which becomes the owner of the tables instead of the short-lived process that handles the request.
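The idea behind the fix can be sketched roughly like this. The TableOwner module, its new_table/1 function, and the table name are hypothetical names chosen for this illustration; the actual implementation in Avrora may look different. A Task stands in for the short-lived request process.

```elixir
defmodule TableOwner do
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts)
  end

  # Asks the owner process to create a table and returns its reference.
  def new_table(server) do
    GenServer.call(server, :new_table)
  end

  @impl true
  def init(:ok), do: {:ok, []}

  @impl true
  def handle_call(:new_table, _from, tables) do
    # :public lets the short-lived caller write into a table it does not own.
    table = :ets.new(:owned_by_table_owner, [:public])
    {:reply, table, [table | tables]}
  end
end

{:ok, owner} = TableOwner.start_link()

# Simulate a short-lived request process that needs a table.
task =
  Task.async(fn ->
    table = TableOwner.new_table(owner)
    :ets.insert(table, {:schema, "com.example.User"})
    table
  end)

table = Task.await(task)

# The task has returned, yet the table lives on because TableOwner owns it.
owner_pid = :ets.info(table, :owner)
cached_lookup = :ets.lookup(table, :schema)
```

Because :ets.new/2 runs inside handle_call/3, the GenServer is the owner, and the table lives as long as that process does. In a real application such a process would typically sit under a supervisor so that it outlives individual requests and tests.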

Some useful links

  1. Avrora library
  2. Repository with reported issue
  3. Discussion around processes and ETS
