DEV Community 👩‍💻👨‍💻

Cover image for Remove terrible bus routes (find an algorithm)
Håvard Krogstie
Håvard Krogstie

Posted on

Remove terrible bus routes (find an algorithm)

Here is a challenge for those who want to do some algorithms programming, similar to what you would find in algorithms competitions, such as the IOI. It is not terribly difficult, but a great way to combine pen and paper solving with programming. This task uses 2D sorting, and my python solution is ~20 lines and runs in O(n*log n) time

Problem statement

You are taking the bus from A to B, sometime today. You have a list of all possible trips, with a trip being represented as the tuple (leave_time, arrive_time). The time you leave A, and the time you get to B, respectively. Both times are represented as integers.

You don't care about what the route looks like, so even really bad choices are included. There may be arbitrarily many ways of getting from A to B, some worse than others.

Turns out many of the trips are so bad that you would never want to take them. Therefore you would want to tidy up the list. Duplicates should be removed, and any trip that is objectively worse than another trip should be removed.

A trip X is objectively worse than another trip Y if X.leave_time <= Y.leave_time && X.arrive_time >= Y.arrive_time. Basically if you can leave from A later, and still arrive at B earlier, there is no point in ever taking trip X.

Given the list, write a function that returns a sorted and tidied up list.

Input

Input is read from a file or the console.
The first line contains an integer N (1 < N <= 1e5), the number of trips given.
The following N lines contain two integers leave_time and arrive_time (1 <= leave_time <= arrive_time <= 1e6), denoting a trip in the list.

Output

Output is printed out in the console or to a file.
One the first line, print the number of trips in the tidied list M.
Followed by M lines, with the integers arrive_time and leave_time for each trip in sorted order.

Sample input 1

8
15 30
12 28
15 32
8 42
15 30
10 30
25 40
28 38

Expected output:

3
12 28
15 30
28 38

Explanation

Example 1

Each interval represents a possible trip. The (leave_time, arrival_time) tuple is written on the box. If the trip is objectively worse than another, the box is red, with an explanation written on it.

The routes that are left (green) are then sorted.

Sample input 2

30
23 59
17 82
85 90
76 95
44 87
78 78
51 88
73 80
10 31
84 95
38 56
92 96
66 71
77 98
94 98
94 98
91 99
83 98
91 94
63 77
33 69
3 63
13 54
37 80
27 40
52 92
90 98
41 91
11 96
16 65

Expected output:

9
10 31
27 40
38 56
66 71
78 78
85 90
91 94
92 96
94 98

If you want a solution, I will publish a commented one written in python in the comments. However, the solution is probably easier to understand if you get to it on your own.

Top comments (9)

Collapse
citizen428 profile image
Michael Kohl • Edited on

Here we go for a simple Ruby solution (I couldn't be bothered to deal with files, so I just dumped your sample input 2 into a data section at the end of the file):

require 'set'

trips = DATA.readlines[1..-1].map { |l| l.split(' ').map(&:to_i) }
solutions = Set.new

trips.sort_by(&:first).each do |trip|
  next if solutions.include?(trip)
  solutions.reject! { |depart, arrive| depart <= trip.first && arrive >= trip.last }
  solutions.add(trip)
end

puts solutions.length
solutions.each { |solution| puts solution.join(' ') }

__END__
30
23 59
17 82
85 90
76 95
44 87
78 78
51 88
73 80
10 31
84 95
38 56
92 96
66 71
77 98
94 98
94 98
91 99
83 98
91 94
63 77
33 69
3 63
13 54
37 80
27 40
52 92
90 98
41 91
11 96
16 65

Output:

9
10 31
27 40
38 56
66 71
78 78
85 90
91 94
92 96
94 98
Collapse
detunized profile image
Dmitry Yakimenko

Have you tested on the maximal N = 1e6 as stated in the problem description? Your solution seems to be O(N*N) so it will be really slow.

Collapse
citizen428 profile image
Michael Kohl

Hence “simple” not “efficient‘ 😉

Collapse
choroba profile image
E. Choroba

Perl solution, inserting each line into the final array right after having read it. Implementing the binary search myself.

#! /usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

<>; # Ignore the size, just process all the lines.

my @good;
while (<>) {
    my ($leave, $arrive) = split;

    # Binary search.
    my ($from, $to) = (0, $#good);
    while ($from < $to) {
        my $middle = int(($from + $to) / 2);
        if ($good[$middle][0] < $leave) {
            $from = $middle + 1;
        } else {
            $to = $middle;
        }
    }
    my $new = $from + (@good && $good[-1][0] < $leave);

    if ($new < $#good && $good[$new][0] == $leave) {
        $good[$new][1] = $arrive if $arrive < $good[$new][1];
        next
    }

    if ($new > $#good || $arrive < $good[$new][1]) {
        splice @good, $new, 0, [$leave, $arrive];

        if ($new) {
            my $before = $new;
            do { --$before } while $before && $good[$before][1] >= $arrive;
            splice @good, $before + 1, $new - $before - 1;
        }
    }
}

say scalar @good;
say "@$_" for @good;
Collapse
hkrogstie profile image
Håvard Krogstie Author

Python

Complexity O(nlog n)

#!/bin/env python3

n = int(input())
trips = []
for _ in range(n):
    times = input().split(' ')
    trips.append((int(times[0]), int(times[1])))

# Sort by decreasing arrive_time when the leave_time is the same
trips.sort(key=lambda x:x[1], reverse=True)
# By increasing leave_time
trips.sort(key=lambda x:x[0])

# Remove duplicates
trips = [trips[0]] + [b for a,b in zip(trips, trips[1:]) if a!=b]
stack = []
for trip in trips:
    while stack and stack[-1][1] >= trip[1]:
        #The trip on the stack is worse than trip
        stack.pop()
    stack.append(trip)

print(len(stack))
for a, b in stack:
    print(a, b)

Explanation

The stack contains trips not yet determined to be bad. The trips are processed in ascending leave_time. This means any trip already on the stack, has a lower or equal leave_time than the trip we are currently looking at.

If the trip on the top of the stack arrives later than the current trip,
that means the trip on the stack starts earlier and ends later
than the current trip. Therefore the stack trip is worse, and is popped off the stack.

This is repeated until the trip on the top of the stack ends before the current trip.
At that point, all trips down the stack are safe, because
no trip further down the stack ends after the top trip.
After the stack-popping is done, we add the current trip.

If multiple trips start at the same time, the one
with the larger arrive_time comes first, because
it needs to be popped by the better trip, when it comes later.
The sorting we do at the start makes sure
that a trip always comes before the trip it is objectively worse than.
This means the last trip is guaranteed to be good.

Collapse
choroba profile image
E. Choroba

Why is 27 40 not part of the solution 2?

Collapse
hkrogstie profile image
Håvard Krogstie Author • Edited on

Ok, there is not an issue with the program, but the test input, rather. The first line says 20, when there are 30 trips to read. Fixed.

Collapse
hkrogstie profile image
Håvard Krogstie Author

That's a very good question! I'll debug my program right after breakfast.

Collapse
hkrogstie profile image
Håvard Krogstie Author

One possibility is comparing every pair of trips to see if one is objectively worse than another, but that would take a long time if N were the theoretical maximum of 100000.

🌚 Browsing with dark mode makes you a better developer by a factor of exactly 40.

It's a scientific fact.