Advent of Code 2020 - Day 23

#adventofcode #python #puzzles

https://adventofcode.com/2020/day/23 ('Crab Cups')

After reading part 1 the first thing that came to mind was pretty much a literal translation of the problem into code using list slicing:

def part1():
    cups = [2, 4, 7, 8, 1, 9, 3, 5, 6]
    num_cups = len(cups)

    for _ in range(100):
        # Keep things arranged so the current cup is always at 
        # index 0
        current_cup = cups[0]

        # This means we always want to extract the cups at 
        # indices 1, 2 and 3.
        # The cup at index 4 will always be the next current 
        # cup, so should end up at the start.
        # The existing current cup should be moved to the end.
        extracted = cups[1:4]
        cups = cups[4:] + cups[:1]

        destination_cup = current_cup
        while True:
            destination_cup -= 1
            if destination_cup < 1:
                destination_cup += num_cups
            if destination_cup not in extracted:
                break

        destination_index = cups.index(destination_cup) + 1
        cups = cups[:destination_index] + extracted + cups[destination_index:]

    one_index = cups.index(1)
    cups = cups[(one_index + 1):] + cups[:one_index]
    return ''.join(str(cup) for cup in cups)

This ran in about 0.2ms, which seemed reasonably fast until I read part 2! It was immediately clear that the approach in part 1 wasn't going to cut it with an input 100,000 times as big over 100,000 times as many rounds:

I was creating lots of new list objects each round as a result of all the list slicing and concatenation operations.
Apart from the churn of creating lots of short-lived objects, the slicing and concatenation operations are O(n) as copies are being made of parts of the list data each time.
The destination cup is found through a linear search - also O(n).

Extending the list to a million items in the code above and running it for 100 iterations takes around 6s, or about 30,000 times as long as for a 9-element list. Extending the run from 100 to 10,000,000 iterations would therefore take about 7 days to complete!

There seemed to be two ways forward:

Remove as many of the O(n) operations as possible so the size of the list of cups isn't a factor when calculating each iteration, or
Spot a pattern in the way the cups move so that the two cups directly after cup 1 can be predicted without having to evaluate the whole problem.

Approach 2 has worked in the past for some previous AoC problems - 2019 Day 16 is an example where looking for a pattern collapses the problem down into something more manageable. However on this occasion after a fair bit of time staring at some examples I couldn't see any obvious way to simplify things.

This leaves option 1. As we're constantly adding/removing sublists from within a list, a linked list seems like a better option as splitting and concatenating a linked list is O(1).

Python doesn't have a linked list in its standard library, though it does have a double-ended queue or deque. I considered going down this route as adding/removing items to/from either end of a deque is an O(1) operation. However using a deque doesn't actually help us much: while removing and re-adding the three elements would be O(1), hunting for where to reinsert by repeatedly rotating the deque would still be O(n).

Surely a linked list will have the same problem? Searching in a linked list is also an O(n) operation so figuring out where to insert the extracted elements will still take a long time for a large list. After a bit of thought I realised that we can simply create an index! The linked list would consist of n objects, one for each cup, each with two properties: the cup number and a pointer to the next cup - something like this:

class Cup:
    def __init__(self, number, next_cup=None):
        self.number = number
        self.next_cup = next_cup # reference to next Cup object

The index would be a dictionary from cup number to cup object. This should give O(1) slicing/concatenation and O(1) search!

In fact, we can simplify this more:

As the index is a dictionary from cup number to Cup object and the cup numbers are a contiguous set of integers, we can just use an array instead of a dictionary as the index. The object for cup n would be stored at array index n.
As the index contains references to all the Cup objects, the Cup objects themselves don't have to point to other Cup objects - they can just contain the number of the following cup and the code can go via the index to get the corresponding cup object.
But doing this makes the Cup objects themselves redundant: we can just have an array where index n contains the number of the cup that follows cup n.

All this leads to the following for part 2:

def part2():
    MAX_CUP = 1000000

    # Create a 1,000,001-entry array where cups[n] gives the 
    # number of the cup following cup n.
    # As there's no cup zero we can set it to point to itself.
    cups = [0]
    # The cups in order are [2, 4, 7, 8, 1, 9, 3, 5, 6, 10, 11, 12, 13...]
    # So cup 9 follows cup 1, cup 4 follows cup 2, cup 5
    # follows cup 3 etc.
    cups.extend([9, 4, 5, 7, 6, 10, 8, 1, 3])
    # For cups[10] to cups[MAX_CUP - 1], each cup points to the
    # one one number higher
    cups.extend(range(11, MAX_CUP + 1))
    # The last cup (cups[MAX_CUP]) points back to cup 2
    cups.append(2)

    current_cup = 2

    for _ in range(10000000):
        # Cups to remove are the three following the current cup
        e1 = cups[current_cup]
        e2 = cups[e1]
        e3 = cups[e2]
        next_cup = cups[e3]

        # Close the gap by making the cup after the current cup
        # be the cup that was after the block being removed.
        cups[current_cup] = next_cup

        # Work out where to reinsert the extracted section
        destination_cup = current_cup
        while True:
            destination_cup -= 1
            if destination_cup < 1:
                destination_cup += MAX_CUP
            if not (destination_cup == e1 or destination_cup == e2 or destination_cup == e3):
                break

        # Move the current cup along one.
        current_cup = next_cup

        # Reinsert the extracted section by changing the
        # following index of the destination cup to point at
        # the extracted section, and the following index of the
        # extracted section to point at the original following 
        # index of the destination.
        next_cup = cups[destination_cup]
        cups[destination_cup] = e1
        cups[e3] = next_cup

    c2 = cups[1]
    c3 = cups[c2]

    return c2 * c3

During the loop there are no new lists being allocated and no linear searches through lists, so this ought to run equally fast for a million cups as it would for 9 cups. In fact it churns through 100,000,000 rounds in about 8.5 seconds - and it's more than twice as quick as the original implementation when applied to part 1.

This is one of those problems where choosing the right representation of the data makes an enormous difference. Looking back, I probably could have predicted that part 2 would be a scaled-up version of part 1 and thought a bit more before diving in and implementing the naive version.

Top comments (2)

Neil Gall • Dec 23 '20 • Edited

I really like your analysis and break down to a very simple representation so it works even in Python. Linked lists are the way to go. Circular ones even better. I've been doing AoC in Rust this year but resorted to C for part 2 today. Runs in under a second. github.com/neilgall/advent-of-code...

There are a few tricks in here from my old embedded C days. I allocate all the objects in one big contiguous buffer then set up the links between them using a pointer-to-pointer to track the previous link in the chain. Changing the order is only ever a few pointer moves.

For fun I did it again in unsafe Rust, but to be honest the C version looks nicer! github.com/neilgall/advent-of-code...

matju • Dec 27 '20

Yeah - linked lists with pointers are notoriously hard in Rust! As it's just using an array of numbers, I think the Python solution above is pretty much line-by-line translatable to a safe Rust solution though.