DEV Community

Samuel Nitsche
Samuel Nitsche

Posted on • Originally published at cleandatabase.wordpress.com on

Advent Of Code 2020 in SQL – Day 7: “Handy Haversacks”

Day 7’s puzzle comes with a bunch of rules about bags that contain other bags – and a lot of different colors (very cool puzzle setting btw!)

As always, the first thing we want to do is to normalize the input in a way that gives us a workable result set. In this case I aim to have a row for each child-rule, groupable by the bag’s name.

If we want to split a column into several rows, one of best approaches (if we cannot easily unpivot) is to use recursive subqueries. They seemed complicated to me at first, but once I used them several times I really get to like them:

with
  base_data as (
    select
      column_value line
    from table(
      aoc_file_loader.file_as_stringlist(
        aoc_file_loader.local_url('day_7_input.txt')
      )
    )
  ),
  bag_children as (
    select
      replace(
        regexp_replace(line, '^(.+) contain (.+)$', '\1'),
        'bags',
        'bag'
        ) bag,
      regexp_replace(line, '^(.+) contain (.+)$', '\2') children
    from base_data
  ),
  children_in_rows (bag, child, children) as (
    select
      bag,
      nvl(substr(children, 1, instr(children, ', ')-1), children),
      case when instr(children, ', ') > 0 then
        substr(children, instr(children, ', ')+2)
      else null end
    from bag_children
    union all
    select
      bag,
      nvl(substr(children, 1, instr(children, ', ')-1), children),
      case when instr(children, ', ') > 0 then
        substr(children, instr(children, ', ')+2)
      else null end
    from children_in_rows where children is not null
  )
select * from children_in_rows;
Enter fullscreen mode Exit fullscreen mode

It’s important we normalize the “bags” in the bagcolumn to “bag”, because we have rules that say “contain 1 other bag.” and we won’t be able to match that with the “bags”.

The first task now is to wander upwards and count how many different (parent) bag-colors we have.

This might be possible with connect by prior, too, but I noticed I have way more control about how exactly my recursion behaves if I use recursive subquery again:

with
  base_data as (
    select
      column_value line
    from table(
      aoc_file_loader.file_as_stringlist(
        aoc_file_loader.local_url('day_7_input.txt')
      )
    )
  ),
  bag_children as (
    select
      replace(
        regexp_replace(line, '^(.+) contain (.+)$', '\1'),
        'bags',
        'bag'
        ) bag,
      regexp_replace(line, '^(.+) contain (.+)$', '\2') children
    from base_data
  ),
  children_in_rows (bag, child, children) as (
    select
      bag,
      nvl(substr(children, 1, instr(children, ', ')-1), children),
      case when instr(children, ', ') > 0 then
        substr(children, instr(children, ', ')+2)
      else null end
    from bag_children
    union all
    select
      bag,
      nvl(substr(children, 1, instr(children, ', ')-1), children),
      case when instr(children, ', ') > 0 then
        substr(children, instr(children, ', ')+2)
      else null end
    from children_in_rows where children is not null
  ),
  bag_hierarchy( bag, child, hierarchy_level, path ) as (
    select
      bag,
      child,
      1,
      bag
    from children_in_rows
    where child like '%shiny gold bag%'
    union all
    select
      cur_bag.bag,
      cur_bag.child,
      parent_bag.hierarchy_level+1,
      cur_bag.bag || '/' || parent_bag.path
    from children_in_rows cur_bag
      inner join bag_hierarchy parent_bag on cur_bag.child like '%'||parent_bag.bag||'%'
  )
select count(distinct bag)
from bag_hierarchy
Enter fullscreen mode Exit fullscreen mode

Part 2 now wants us to walk down the tree and count all the bags inside of the shiny gold bag – and the bags inside the bags etc.

Therefore we need to extract the number from the text – easy with regular expressions. We also replace “no other bags.” with 0 so we can calculate properly.

I found it easier to calculate the cumulated children of each bag (on parent-level) rather than the cumulated number on child-level.

with
  base_data as (
    select
      column_value line
    from table(
      aoc_file_loader.file_as_stringlist(
        aoc_file_loader.local_url('day_7_input.txt')
      )
    )
  ),
  bag_children as (
    select
      replace(
        regexp_replace(line, '^(.+) contain (.+)$', '\1'),
        'bags',
        'bag'
        ) bag,
      regexp_replace(line, '^(.+) contain (.+)$', '\2') children
    from base_data
  ),
  children_in_rows (bag, child, children) as (
    select
      bag,
      nvl(substr(children, 1, instr(children, ', ')-1), children),
      case when instr(children, ', ') > 0 then
        substr(children, instr(children, ', ')+2)
      else null end
    from bag_children
    union all
    select
      bag,
      nvl(substr(children, 1, instr(children, ', ')-1), children),
      case when instr(children, ', ') > 0 then
        substr(children, instr(children, ', ')+2)
      else null end
    from children_in_rows where children is not null
  ),
  children_with_count as (
    select
      bag,
      to_number(
        replace(
          regexp_replace(child, '^([0-9]+) (.*)$', '\1'),
          'no other bags.',
          '0'
          )
        )child_count,
      regexp_replace(child, '^([0-9]+) (.*)$', '\2') child_name
    from children_in_rows
  ),
  bag_hierarchy( bag, child_name, child_count, cumulated_child_count, hierarchy_level, path ) as (
    select
      bag,
      child_name,
      child_count,
      child_count,
      1,
      bag
    from children_with_count
    where bag like '%shiny gold bag%'
    union all
    select
      cur_bag.bag,
      cur_bag.child_name,
      cur_bag.child_count,
      parent_bag.cumulated_child_count*cur_bag.child_count,
      parent_bag.hierarchy_level+1,
      parent_bag.path || '/' || cur_bag.bag
    from children_with_count cur_bag
      inner join bag_hierarchy parent_bag on parent_bag.child_name like '%'||cur_bag.bag||'%'
  )
select sum(cumulated_child_count)
from bag_hierarchy
Enter fullscreen mode Exit fullscreen mode

This was quite a challenge and I had several errors (mainly from the missing “one some-colored bag.” connection) – hierarchical queries are something I don’t do that often. But it was very funny and I learned and refreshed a lot!

You can find the whole code on my github-repository.

Speedy emails, satisfied customers

Postmark Image

Are delayed transactional emails costing you user satisfaction? Postmark delivers your emails almost instantly, keeping your customers happy and connected.

Sign up

Top comments (0)

Billboard image

Create up to 10 Postgres Databases on Neon's free plan.

If you're starting a new project, Neon has got your databases covered. No credit cards. No trials. No getting in your way.

Try Neon for Free →

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay