Pau Riosa

Posted on Apr 8, 2022

Elixir Today: Combining a List of Map via Specific Field

#elixir #data #programming #webdev

References

Intro

Recently, I've been working on a similar question about How to combine a list of map via specific field.

I looked at the elixir forum and gladly I found something. I saw that one of the answers there is using Enum.group_by + Enum.map.

Given my curiosity, How can I come up with a solution using Enum.reduce?

Curiosity Kicks In

So I started working out using Enum.reduce.


 @list [
    %{id: 1, list: [1]},
    %{id: 1, list: [2]},
    %{id: 2, list: [1]},
    %{id: 3, list: [1]},
    %{id: 1, list: [3, 4, 5]},
    %{id: 2, list: [2, 3]},
    %{id: 3, list: [2, 3, 4, 5]},
    %{id: 4, list: [2, 3, 4, 5]},
    %{id: 4, list: [1]}
  ]

    @list
    |> Enum.reverse()
    |> Enum.reduce(%{}, fn %{id: id, list: _list} = f, acc ->
      case acc do
        %{^id => existing} -> Map.put(acc, id, [f | existing])
        %{} -> Map.put(acc, id, [f])
      end
    end)
    |> Enum.map(fn {id, list} -> %{id: id, list: Enum.flat_map(list, & &1.list)} end)

In this way, I created a sample list with specific id and their list.
I reverse the order of the list
I used Enum.reduce to sort out and combine the list of the ids accordingly.
Finally, used Enum.map to transform the data.

Benchmarking!

I started to compare the result between using group_by and reduce. In order to that, I used Benchee.

To install, add benchee to your dependencies.

{:benchee, "~> 1.0", only: :dev}

Given the setup

 @list [
    %{id: 1, list: [1]},
    %{id: 1, list: [2]},
    %{id: 2, list: [1]},
    %{id: 3, list: [1]},
    %{id: 1, list: [3, 4, 5]},
    %{id: 2, list: [2, 3]},
    %{id: 3, list: [2, 3, 4, 5]},
    %{id: 4, list: [2, 3, 4, 5]},
    %{id: 4, list: [1]}
  ]

def groupby(list) do
    Enum.group_by(list, & &1.id)
    |> Enum.map(fn {id, list} -> %{id: id, list: Enum.flat_map(list, & &1.list)} end)
end

def reduce(list) do
    list
    |> Enum.reverse()
    |> Enum.reduce(%{}, fn %{id: id, list: _list} = f, acc ->
      case acc do
        %{^id => existing} -> Map.put(acc, id, [f | existing])
        %{} -> Map.put(acc, id, [f])
      end
    end)
    |> Enum.map(fn {id, list} -> %{id: id, list: Enum.flat_map(list, & &1.list)} end)
end

Then run

Benchee.run(
   %{
        "groupby_enum_map" => fn -> PracticeElixir.groupby(@list) end,
        "reduce_enum_map" => fn -> PracticeElixir.reduce(@list) end
      },
      print: [fast_warning: false]
)

Benchmark Result

Given my dev environment, here is the result.

available_memory: "16 GB",
cpu_speed: "Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz",
elixir: "1.12.3",
erlang: "24.0.6",
num_cores: 16,
os: :macOS

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 14 s

Benchmarking groupby_enum_map ...
Benchmarking reduce_enum_map ...

Name                       ips        average  deviation         median         99th %
reduce_enum_map         1.08 M        0.93 μs   ±208.21%        0.80 μs           5 μs
groupby_enum_map        0.97 M        1.03 μs   ±251.03%        0.90 μs           4 μs

Comparison:
reduce_enum_map         1.08 M
groupby_enum_map        0.97 M - 1.12x slower +0.109 μs

You can view the full implementation here!

Happy Coding!

Top comments (3)

Alim Nastaev • Jun 27 '22 • Edited

You can do it even cleaner:

    @list
    |> Enum.reverse()
    |> Enum.reduce(%{}, fn %{id: id, list: list}, acc ->
      case acc do
        %{^id => existing} ->
          Map.put(acc, id, list ++ existing)

       # underscore here is much more explicit than just %{}
        _ ->
          Map.put(acc, id, list)
      end
    end)
    |> Enum.map(fn {id, list} -> %{id: id, list: list} end)

Usually, people religiously complain about ++ performance, but it's not always true:
The Seven Myths of Erlang Performance - 2.2 Myth: Operator "++" is Always Bad
erlang.org/doc/efficiency_guide/my...

P.S. Also, what's the purpose of Enum.reverse/1 before the reduce function?

Pau Riosa • Jun 28 '22

Thank you for this one will take a look at it. Reverse preserved the original ordering of the list.

Alim Nastaev • Jun 28 '22

Hmmmm...?!
Your example returns this:

[
  %{id: 1, list: [1, 2, 3, 4, 5]},
  %{id: 2, list: [1, 2, 3]},
  %{id: 3, list: [1, 2, 3, 4, 5]},
  %{id: 4, list: [2, 3, 4, 5, 1]} <--- list is not ordered (if that is what you've meant)
]