⚠️ **SPOILER ALERT** ⚠️

This is a post with my solutions and learning from the puzzle. Don't continue reading if you haven't tried the puzzle on your own yet.

If you want to do the puzzle, visit adventofcode.com/2020/day/7.

My programming language of choice is `python`

and all examples below are in python.

## TOC

## Key learning

- Depth first search
- Possibly on part 1: Breadth first search

This puzzle can be viewed as a "graph" problem with nodes and edges. A simple way for beginners to start is utilizing a queue or recursive function.

## Puzzle

The puzzle describes relationship between bags in different colors. A bag in one color can contain a certain amount of bags in other colors.

Example input:

```
light red bags contain 1 bright white bag, 2 muted yellow bags.
dark orange bags contain 3 bright white bags, 4 muted yellow bags.
bright white bags contain 1 shiny gold bag.
muted yellow bags contain 2 shiny gold bags, 9 faded blue bags.
shiny gold bags contain 1 dark olive bag, 2 vibrant plum bags.
dark olive bags contain 3 faded blue bags, 4 dotted black bags.
vibrant plum bags contain 5 faded blue bags, 6 dotted black bags.
faded blue bags contain no other bags.
dotted black bags contain no other bags.
```

## Part 1

The puzzle is to find all bags that can directly or indirectly contain a `shiny gold bag`

. This can be viewed as a graph where bags are nodes and their relationships are edges.

An visualisation with example data:

```
-> green bag -> grey bag
/ \
blue bag - -> shiny gold bag
\ /
-> purple bag /
/
white bag -------------------
```

Our goal is to traverse from `shiny gold`

to the left and counting all nodes it finds. In this visualisation the answer is: `white bag, blue, green, grey`

### Parsing input

A big part of the problem is deciding on data structure and how to parse the input into that structure.

For part 1 I aimed for this data-structure:

```
bags = {
"light red": "1 bright white bag, 2 muted yellow bags.",
"dark orange": "3 bright white bags, 4 muted yellow bags.",
"bright white": "1 shiny gold bag.",
"muted yellow": "2 shiny gold bags, 9 faded blue bags.",
...
}
```

To keep track of which bags that contain a `shiny gold`

bag I used a `set`

to avoid duplicates. Below is the code to parse in the data. This approach does the simple parsing for bare minimum for part 1:

```
# Extract a list of lines from file
lines = [x.strip() for x in open('07.input', 'r').readlines()]
bags = {}
for l in lines:
bag, contains = l.split('contain')
bag = bag.replace(' bags','')
bags[bag] = contains
```

### Solution: Breadth first

Some bags will directly contain `shiny gold`

. Those bags can in turn be contained in other bags, which will indirectly contain `shiny gold`

.

This solution uses a work queue to traverse the "graph". Once we find a bag that can contain `shiny gold`

we add it to the queue to check if it can be contained by another bag. We keep on doing so until the work queue is empty. For every bag we find we store it in the `answer`

-set

This is a so called breadth first search. In this case I use a `queue`

.

```
answer = set() # Bags containing 'shiny gold'
q = ['shiny gold'] # Work queue for traversing bags
while len(q) != 0:
current = q.pop(0)
for bag in bags:
if bag in answer: # Skip if already counted.
continue
if current in bags[bag]: # If bag contains current-bag,
q.append(bag) # add to queue and answer
answer.add(bag)
print "Answer part 1: %d" % len(answer)
```

### Solution: Depth first

Another solution is the depth first search. Both algorithms do fine in this puzzle and input-size. But in coming advent-of-code-puzzles, it is possible that only one will be sufficient enough. Therefore we practice both now.

This example is a `depth first`

where I use a `recursive function`

:

```
bags = {}
for l in lines:
bag, contains = l.split('contain')
bag = bag.replace(' bags','')
bags[bag] = contains
def deep_search(bag, bags):
contains = []
for b in bags:
if bag in bags[b]:
# Add b to our list
contains.append(b)
# Add all children to b in our list
contains.extend(deep_search(b, bags))
# Remove duplicates
return set(contains)
print "Answer: ", len(deep_search('shiny gold', bags))
```

## Part 2

### Parsing input

For part 2 I aimed for this data-structure:

```
bags = {
"light red": { "bright white": 1, "muted yellow": 2 },
"dark orange": { "bright white": 3, "muted yellow": 4 },
"bright white": { "shiny gold": 1 },
"muted yellow": { "shiny gold": 2, "faded blue": 9 },
...
}
```

In this way I can call `bags['muted yellow']['shiny gold']`

to find out how many `shiny gold`

bags `muted yellow`

contain.

This is a ugly "simple" way of solving it. Further down is an regex example which is cleaner.

```
bags = {}
q = []
for l in lines:
# Clean line from unnecessary words.
l = l.replace('bags', '').replace('bag', '').replace('.','')
bag, contains = l.split('contain')
bag = bag.strip()
if 'no other' in contains: # If bag doesn't contain any bag
bags[bag] = {}
continue
contains = contains.split(',')
contain_dict = {}
for c in [c.strip() for c in contains]:
amount = c[:2]
color = c[2:]
contain_dict[color] = int(amount)
bags[bag] = contain_dict
```

### Solution

A good data structure enables us to do a clean solution. Here is an example with

a recursive function to count all bags:

```
def recursive_count(bag, bags):
count = 1 # Count the bag itself
contained_bags = bags[bag]
for c in contained_bags:
multiplier = contained_bags[c]
count += multiplier * recursive_count(c, bags)
return count
# Minus one to not count the shiny gold bag itself
result = recursive_count('shiny gold', bags) - 1
print "Result part 2: %d " % result
```

## Alternative: With regex

A lot easier way to parse the data is using regex. Here is an example:

```
import re
from collections import defaultdict
bags = defaultdict(dict)
for l in lines:
bag = re.match(r'(.*) bags contain', l).groups()[0]
for count, b in re.findall(r'(\d+) (\w+ \w+) bag', l):
bags[bag][b] = int(count)
def part1():
answer = set()
def search(color):
for b in bags:
if color in bags[b]:
answer.add(b)
search(b)
search('shiny gold')
return len(answer)
def part2():
def search(bag):
count = 1
for s in bags[bag]:
multiplier = bags[bag][s]
count += multiplier * search(s)
return count
return search('shiny gold' ) - 1 # Rm one for shiny gold itself
```

## Alternative: Network X

NetworkX is a good library for handling graphs. Haven't used it myself so much, but it appears quite often in the solutions megathread once there is a graph-problem.

There is a lot of documentation, but by reading solutions on the megathread you'll learn what common features to use.

Make sure you understand the above example with regex first before reading this. Here is an example solution with networkx:

```
import re
from collections import defaultdict
import networkx as nx
# Parse data
bags = defaultdict(dict)
for l in lines:
bag = re.match(r'(.*) bags contain', l).groups()[0]
for count, b in re.findall(r'(\d+) (\w+ \w+) bag', l):
bags[bag][b] = { 'weight': int(count)}
# Create a graph in networkx
G = nx.DiGraph(bags)
def part1():
# Reverse edges
RG = G.reverse()
# Get predecessors
predecessors = nx.dfs_predecessors(RG, 'shiny gold')
# Count predecessors
return len(predecessors)
print(part1())
def part2():
def search(node):
count = 1
# Iterate neighbors for node
for n in G.neighbors(node):
# Multiply weight with recursive search
count += G[node][n]['weight'] * search(n)
return count
return search('shiny gold') - 1
print(part2())
```

Some benefit in using networkx is that it gives us more powerful tools than just traversing the graph. If we would like to find the shortest path, then this one-liner would help:

```
print(nx.shortest_path(G, source='bright red', target='shiny gold'))
```

Links:

- networkx.org/documentation/networkx-2.3/reference/classes/digraph.html#
- [networkx.org/documentation/networkx-2.3/reference/algorithms/generated/networkx.algorithms.shortest_paths.generic.shortest_path.html](https://networkx.org/documentation/networkx-2.3/reference/algorithms/generated/networkx.algorithms.shortest_paths.generic.shortest_path.html#networkx.algorithms.shortest_paths.generic.shortest_path

## More python tools

These are tools worth having in back of your head while doing advent of code.

defaultdict: Useful when you don't want to think about initiating all values first.

deque: Useful when you want to pop left and right from a queue.

heapq: Amazing for tool or solving problems that involve extremes such as largest, smallest, optimal, etc.

## Thanks for reading!

I hope these solutions were helpful for you.

Complete code can be found at: github.com/cNille/AdventOfCode/blob/master/2020/07.py

## Discussion (1)

This is an awesome post! Thanks for sharing your insights!