DEV Community

Cover image for Using Python To Merge Array of Ranges
Grant Riordan
Grant Riordan

Posted on

Using Python To Merge Array of Ranges

So I was doing an Advent of Code 2025 challenge the other day, and had an issue where I needed to count all the unique numbers between multiple ranges. It reminded me of some code I'd written previously for a booking system.

The problem was that the ranges overlapped in many cases. Now, looping over all the numbers and using a set() was not feasible, nor performant as the ranges could be in the millions.

The Naïve Approach

unique_values = set()
for start, end in ranges:
    unique_values.update(range(start, end + 1))
Enter fullscreen mode Exit fullscreen mode

For a range of say 1-100_000_000_000 you'd create 1 billion numbers in memory, that's huge! You're also materialising data that doesn't need to exist.

Solution - Merge First, Count Later

Instead of creating every number in memory (expensive), we merge overlapping ranges into fewer, larger ranges, then calculate the count mathematically.

Lets look at a smaller range for the purpose of this tutorial:

Let's say we have the following ranges:

3-5, 
10-14, 
16-20, 
12-18
Enter fullscreen mode Exit fullscreen mode

some of them overlap or touch each other.
10-14 and 12-18 overlap (they share 12, 13, 14)

If merged, they become one range: 10-18. Without merging, you'd count numbers twice or waste memory storing every individual number.

How To Merge Overlapping Ranges

First Thing - Let's Get The Ranges Sorted

Sorting the arrays by their start position makes the merging a lot easier - we read from left-to-right on the number line.

sorted_ranges = sorted(fresh_ranges)
# Before: [(3,5), (10,14), (16,20), (12,18)]
# After:  [(3,5), (10,14), (12,18), (16,20)]
Enter fullscreen mode Exit fullscreen mode

How does this help though? When you encounter a new range, you only need to check if it overlaps with the last merged range. There's no need to check all previous ranges. If ranges are sorted, overlaps always happen with the most recent range, saving time and effort checking other ranges.

Sorted number line view:
   3---5      10----14
              12------18
                  16----20

# You can see overlaps flow left-to-right!
Enter fullscreen mode Exit fullscreen mode

Second - Merge the overlapping ranges

merged = []
for start, end in sorted_ranges:
    if merged and start <= merged[-1][1] + 1:
        # Overlaps or touches - merge!
        merged[-1] = (merged[-1][0], max(merged[-1][1], end))
    else:
        # Doesn't overlap - add as new range
        merged.append((start, end))
Enter fullscreen mode Exit fullscreen mode

Why merged and start <= merged[-1][1] + 1?
This condition checks two things:

merged → Is there at least one range already merged?

First iteration: merged = [] → False, so add first range

Later: merged = [(3,5)] → True, check overlap
start <= merged[-1][1] + 1 → Does current range touch/overlap the last merged range?

Lets take a look at some examples:

# Case 1: Adjacent (touching)
Last merged: (10, 14)  # ends at 14
Current:     (15, 20)  # starts at 15
Check: 15 <= 14 + 1?  15 <= 15?  YES! They touch

# Case 2: Gap between ranges
Last merged: (3, 5)    # ends at 5
Current:     (10, 14)  # starts at 10
Check: 10 <= 5 + 1?  10 <= 6?  NO! Gap exists

Enter fullscreen mode Exit fullscreen mode

Why + 1?

Ranges for example like (5,10) and (11,15) are adjacent. They shouldn't stay separate (gap at position 10.5). The + 1 is about merging adjacent ranges that touch, not just overlapping ones.

Example:

Range A: (5, 9)   →  5, 6, 7, 8, 9
Range B: (10, 15) →  10, 11, 12, 13, 14, 15

Number line:
5--6--7--8--9--10--11--12--13--14--15
[======A======][========B========]
              ↑
          They TOUCH at position 9→10

Without +1:
Check: 10 <= 9? → NO ❌
Result: Keep separate [(5,9), (10,15)]
This is WRONG - there is no gap! They should be one range: (5,15)

With +1:
Check: 10 <= 9+1? → 10 <= 10? → YES ✅
Result: Merge to (5, 15)
This is CORRECT!

# Actual Gap
Range A: (5, 9)   →  5, 6, 7, 8, 9
Range B: (11, 15) →  11, 12, 13, 14, 15

Number line:
5--6--7--8--9--(10)--11--12--13--14--15
[======A======]  GAP  [=======B=======]
              ↑      ↑
          Ends at 9  Starts at 11
          10 is missing!

With +1:
Check: 11 <= 9+1? → 11 <= 10? → NO ✅
Result: Keep separate [(5,9), (11,15)]
This is CORRECT - there's a gap at position 10
Enter fullscreen mode Exit fullscreen mode

Simply, sum each range to get the total number of items within the ranges.

total_count = sum(end - start + 1 for start, end in merged)
    return total_count
Enter fullscreen mode Exit fullscreen mode

Final Code:


# Sort ranges by start position
    sorted_ranges = sorted(fresh_ranges)

    # Merge overlapping ranges
    merged = []
    for start, end in sorted_ranges:
        print(merged)
        if merged and start <= merged[-1][1] + 1:
            # Overlapping or adjacent - merge with last range
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            # No overlap - add new range
            merged.append((start, end))

    # Calculate total count from merged ranges
    total_count = sum(end - start + 1 for start, end in merged)
    return total_count
Enter fullscreen mode Exit fullscreen mode

Real Life Usages:

There are many scenarios in real life where you may need to handle multiple ranges needing merged, or to be counted;

For example:

Appointment Booking Systems:

Meeting A: 9:00 AM - 10:30 AM
Meeting B: 10:15 AM - 11:00 AM  
Meeting C: 2:00 PM - 3:30 PM
Meeting D: 3:30 PM - 5:00 PM

Sorted: [(9:00, 10:30), (10:15, 11:00), (2:00, 3:30), (3:30, 5:00)]

Merged: [(9:00, 11:00), (2:00, 5:00)]
         ↑               ↑
   Meetings A+B overlap  Meetings C+D touch (adjacent)

Free times: Before 9:00, 11:00-2:00, After 5:00
Enter fullscreen mode Exit fullscreen mode

When is the room actually free?

IP Address Blocking

Block: 192.168.1.0 - 192.168.1.50
Block: 192.168.1.45 - 192.168.1.100
Block: 192.168.1.200 - 192.168.1.255

Without merging, check 3 rules for every connection.
With merging -
[(192.168.1.0, 192.168.1.100), (192.168.1.200, 192.168.1.255)]

Only 2 rules to check = Faster firewall!

Maintenance Downtime

Server or Machine scheduled for maintenance:

Maintenance A: Hour 0-5
Maintenance B: Hour 4-8
Maintenance C: Hour 20-24
Enter fullscreen mode Exit fullscreen mode

Merged downtime: [(0, 8), (20, 24)]
Total downtime: 8 + 4 = 12 hours (not 17 if counted individually)

Why This Matters in All Cases:

  • Avoid double-counting overlapping periods
  • Optimise checks - fewer ranges to search
  • Find gaps - available time slots
  • Accurate reporting - real busy vs. free time
  • Performance - checking fewer merged ranges vs. many overlapping ones

Conclusion

The above pattern / method is used by multiple companies around the globe, handling calendar invites, database indexing, network traffic analysis and much more.

Go ahead, use the above code to increase your applications performance.

If you want to chat, or reach out don't hesitate to comment below, or drop me a follow on x/twitter.

Top comments (0)