Discussion on: Daily Challenge #17 - Double Trouble

View post

Iterating over the whole queue is slow when the number is really large. I wasn't able to find an exact formula, but I was able to find a good upper bound from which I can find the exact solution in linear time.

The solution is in Perl:

#!/usr/bin/perl
use warnings;
use strict;

sub slow {
    my ($queue, $n) = @_;
    push @$queue, (shift @$queue) x 2 while --$n;
    return $queue->[0]
}

sub fast {
    my ($queue, $n) = @_;
    return $queue->[0] if 1 == $n;
    my $upper_bound = 5 + int(log($n / @$queue) / log(2));

    my $x;
    do {
        --$upper_bound;
        $x = (2 ** $upper_bound - 1) * @$queue + 1;
    } until $x <= $n;
    return $queue->[($n - $x) / (2 ** $upper_bound)]
}

use Test::More;

is slow([qw[ Sheldon Leonard Penny Rajesh Howard ]], 3), 'Penny';
is slow([qw[ Sheldon Leonard Penny Rajesh Howard ]], 7), 'Sheldon';
is slow([qw[ Sheldon Leonard Penny Rajesh Howard ]], 8), 'Leonard';

my @queue;
for my $char ('a' .. 'z') {
    push @queue, $char;
    is fast(\@queue, $_), slow([@queue], $_),
        "t_$_\_" . scalar @queue for 2 .. 2000;
}
done_testing();

Benchmarking is easy using the Benchmark module:

use Benchmark qw{ cmpthese };
cmpthese(-2, {
    slow_1000 => sub { slow(['a' .. 'h'], 1_000) },
    fast_1000 => sub { fast(['a' .. 'h'], 1_000) },
    slow_10_000 => sub { slow(['a' .. 'h'], 10_000) },
    fast_10_000 => sub { fast(['a' .. 'h'], 10_000) },
});

The output shows a table ordered from the slowest to the fastest.

                Rate slow_10_000   slow_1000   fast_1000 fast_10_000
slow_10_000    271/s          --        -90%       -100%       -100%
slow_1000     2845/s        952%          --        -99%        -99%
fast_1000   415531/s     153512%      14505%          --         -0%
fast_10_000 415531/s     153512%      14505%          0%          --

Unfortunately, it breaks for longer sequences, as the rounding error becomes too large. Switching to arbitrary precision is too slow, increasing the upper bound helps - the failures are easy to detect as the function returns undef instead of an element of the queue.

sub fast {
    my ($queue, $n, $correction) = @_;
    return $queue->[0] if 1 == $n;
    $correction ||= 4;
    my $upper_bound = $correction
        + int((log $n - log @$queue) / log 2);

    my $x;
    do {
        $x = (2 ** --$upper_bound - 1) * @$queue + 1;
    } until $x <= $n;
    my $r = $queue->[($n - $x) / (2 ** $upper_bound)]
        // fast($queue, $n, $correction + 2);
    return $r
}

Yozen Hernandez • Jul 18 '19 • Edited

So I think I was able to get this with an exact calculation. I thought it was going to turn out to be pretty convoluted, but I guess it's not. I think got it right, at least:

#!/usr/bin/perl

use v5.24;
use strict;
use warnings;
use feature qw(signatures);
no warnings "experimental::signatures";
use Carp;
use List::Util qw(sum0);
use POSIX qw(ceil);

sub cola ( $n, @queue ) {
    my $q_len    = @queue;
    my $i        = 0;
    my $last_idx = @queue;

    $last_idx += ( ( 2**++$i ) * @queue ) while ( $n > $last_idx );

    my $prev_last_idx = $last_idx - ( ( 2**($i) ) * @queue );
    my $nth_can       = ceil( ( $n - $prev_last_idx ) / 2**$i ) - 1;
    return $queue[$nth_can];
}

my @queue = qw(Sheldon Leonard Penny Rajesh Howard);

use Test::More tests => 4;
is( cola( 4,  @queue ), "Rajesh" );
is( cola( 6,  @queue ), "Sheldon" );
is( cola( 12, @queue ), "Rajesh" );
is( cola( 27, @queue ), "Penny" );

Benchmarking with timethis in Benchmark got me this result:

use Benchmark qw(timethis);

timethis( -2, sub{ cola( 12, @queue ) } );

#timethis for 2:  2 wallclock secs ( 2.00 usr +  0.00 sys =  2.00 CPU) @ 859529.50/s (n=1719059)

Basically, I just keep track of how many nodes there are in a tree structure which represents the doubling of the people in the queue. In each level of the tree, you double the number of nodes in the level before it, but we want the running total so far (which I call $last_index above because that is the index of the last node in the level where the $nth cola would be).

Once you figure out which "level" of the tree the $nth cola would be, you get the last index of the level before so that you can zero out the current level and any indices there. That makes it easy to figure out "where" on the current level of the tree the $nth cola is, and you just divide by the depth of the tree so far to turn that into an index that works on the initial array. Subtract one to get the 0-index value into the original array.

Edit Looks like Corey figured out the same thing, but with more math.