DEV Community

Samuel Nitsche
Samuel Nitsche

Posted on • Originally published at cleandatabase.wordpress.com on

Advent Of Code 2020 in SQL – Day 9: “Encoding Error”

The Day 9 puzzle wants us to find certain anomalies in a stream of numbers: Each number must be the sum of two previous numbers in a range of 25 (between line-1 and line-25). We should find the first number that doesn’t meet this requirement.

For databases are great at adding numbers, I have no problem doing a lot of additional work (well, let the machine do a lot of additional work) and calculate the sum of each of these possibilities.

I also added a config-subquery so I can easily switch my actual data to the data provided in the example. The puzzles really get so challenging that I usually start with the example-data.

with
  config as (
    select
      25 as preamble_length,
      25 as search_range
    from dual
  ),
  base_data as (
    select
      rownum line,
      to_number(column_value) num
    from table (
        aoc_file_loader.file_as_stringlist(
          aoc_file_loader.local_url('day_9_input.txt')
       )
    )
  ),
  calc_cube as (
    select
      d1.line line_1,
      d1.num num_1,
      d2.line line_2,
      d2.num num_2,
      d1.num+d2.num sum
    from base_data d1
      cross join base_data d2
      cross join config
    where
      d1.line-d2.line between 1 and config.search_range
      or d1.line-d2.line between 1 and config.search_range
  )
select * from calc_cube
Enter fullscreen mode Exit fullscreen mode

From there I can check whether there is such a sum in the requested range line by line, starting with the first line after the preamble.

with
  config as (
    select
      25 as preamble_length,
      25 as search_range
    from dual
  ),
  base_data as (
    select
      rownum line,
      to_number(column_value) num
    from table (
        aoc_file_loader.file_as_stringlist(
          aoc_file_loader.local_url('day_9_input.txt')
       )
    )
  ),
  calc_cube as (
    select
      d1.line line_1,
      d1.num num_1,
      d2.line line_2,
      d2.num num_2,
      d1.num+d2.num sum
    from base_data d1
    cross join base_data d2
    cross join config
    where
      d1.line-d2.line between 1 and config.search_range
      or d1.line-d2.line between 1 and config.search_range
  ),
  valid_sequence( line, num, parents_found) as (
    select
      line,
      num,
      (select count(*) from calc_cube cube
        where
          cube.sum = cur.num
          and cube.line_2 between cur.line-config.search_range and cur.line-1
          and cube.line_1 between cur.line-config.search_range and cur.line-1
      )
    from base_data cur
      cross join config
    where line = config.preamble_length+1 -- First line after preamble
    union all
    select
      cur.line,
      cur.num,
      (select count(*) from calc_cube cube
        where
          cube.sum = cur.num
          and cube.line_2 between cur.line-config.search_range and cur.line-1
          and cube.line_1 between cur.line-config.search_range and cur.line-1
      )
    from base_data cur
      inner join valid_sequence prev on cur.line = prev.line+1
      cross join config
    where prev.parents_found > 0
  )
select
  line,
  num
from valid_sequence
where parents_found = 0
Enter fullscreen mode Exit fullscreen mode

So far, so good – my database does this in like 1,5 seconds, which is okay.

Now Part 2. Phew. I really have no idea how to solve this in a set-based way, so I decided to take a “walk” approach with recursive subqueries.

After limiting the possible input data, I decide in each step what the next line to walk through will be. The logic is basically:

  • add to run_total as long as it’s not greater than invalid_number
  • reduce run_total by the amount of min_line until it’s lower than invalid_number again
    • Also increase min_line in that case
  • stop when run_total of the previous walk is equal to invalid_number
with
  config as (
    select
      41682220 as invalid_num
    from dual
  ),
  base_data as (
    select
      rownum line,
      to_number(column_value) num
    from table (
        aoc_file_loader.file_as_stringlist(
          aoc_file_loader.local_url('day_9_input.txt')
       )
    )
  ),
  possible_input as (
    select
      line,
      num
    from base_data
      cross join config
    where
      line < (select line from base_data where num = config.invalid_num)
  ),
  walk (line, num, run_total, min_line, max_line, action, next_line) as (
    select
      line,
      num,
      num,
      line,
      line,
      'add',
      line+1
    from possible_input
    where line = 1
    union all
    select
      cur.line,
      cur.num,
      case when prev.run_total+cur.num > config.invalid_num then
        prev.run_total - (select num from possible_input where line = prev.min_line)
      else
        prev.run_total + cur.num
      end run_total,
      case when prev.run_total+cur.num > config.invalid_num then
        prev.min_line+1
      else
        prev.min_line
      end min_line,
      cur.line max_line,
      case when prev.run_total+cur.num > config.invalid_num then
        'rem'
      else
        'add'
      end action,
      case when prev.run_total+cur.num > config.invalid_num then
        cur.line
      else
        cur.line+1
      end next_line
    from possible_input cur, walk prev, config
    where cur.line = prev.next_line
      and prev.run_total != config.invalid_num
  ) --cycle line, run_total set is_loop to 'Y' default 'N'
select
  (select min(num) from possible_input where line between walk.min_line and walk.max_line) min_num,
  (select max(num) from possible_input where line between walk.min_line and walk.max_line) max_num,
  (select min(num)+max(num) from possible_input where line between walk.min_line and walk.max_line) sum_num
from walk, config
where run_total = config.invalid_num
Enter fullscreen mode Exit fullscreen mode

The commented-out line was pretty helpful to check at which point I messed up and produced an infinite loop.

While the solution is unusual for a SQL-query, it’s very, very fast: ~250ms on my crappy dockerized database.

As always, you can find the code on my github-repository.

Heroku

Simplify your DevOps and maximize your time.

Since 2007, Heroku has been the go-to platform for developers as it monitors uptime, performance, and infrastructure concerns, allowing you to focus on writing code.

Learn More

Top comments (0)

Billboard image

The Next Generation Developer Platform

Coherence is the first Platform-as-a-Service you can control. Unlike "black-box" platforms that are opinionated about the infra you can deploy, Coherence is powered by CNC, the open-source IaC framework, which offers limitless customization.

Learn more

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay