DEV Community

JosephAkayesi
JosephAkayesi

Posted on

Prefix sums and range queries

Discovering prefix sums is one of those moments where you go aha!

It’s staggering how memory-efficient this technique is.

Prefix sums are used when you want to compute values over a range —

i.e. range queries.


Example

Let’s say you want to compute the number of YouTube views over different time periods.

views   = [7, 1, 2, 5, 2, 6, 8]
periods = [[0, 1], [0, 5], [1, 4], [3, 3]]

output = [8, 23, 10, 5]
Enter fullscreen mode Exit fullscreen mode
  • i represents the i-th day in the views array
  • views[i] is the number of views on that day
  • Each element in periods represents a range over which we want the cumulative sum

For example, periods[0] = [0, 1] means:

compute the cumulative views from day 0 to day 1 (inclusive).


A naïve approach

A valid approach would be:

  1. For each period, find the sub-array for that range
  2. Iterate through it and sum all the elements
  3. Append the result to an output array

The problem with this approach is that we recompute many overlapping ranges without keeping track of previous work.

Can we do better?


Prefix sums to the rescue 🚀

Yes. An optimal approach is to use a prefix sum.

The idea is simple:

  • Compute a cumulative sum array once
  • Each element at index i stores the sum of values from index 0 to i

Let’s call this array prefix.

prefix[0] = views[0]
prefix[1] = views[0] + views[1]
prefix[2] = views[0] + views[1] + views[2]
prefix[3] = views[0] + views[1] + views[2] + views[3]
...
Enter fullscreen mode Exit fullscreen mode

For our example:

prefix = [7, 8, 10, 15, 17, 23, 31]
Enter fullscreen mode Exit fullscreen mode

Querying ranges with prefix sums

For any period [l, r]:

  • If the range starts at day 0, the answer is simply:
sum = prefix[r]
Enter fullscreen mode Exit fullscreen mode

Because each prefix[r] already represents the cumulative sum from day 0 to day r.


What if the range does not start at 0?

Let’s say the period is [1, 4].

We know:

prefix[4] = views[0] + views[1] + views[2] + views[3] + views[4]
prefix[0] = views[0]
Enter fullscreen mode Exit fullscreen mode

So to get the sum from day 1 to day 4, we subtract what came before the range:

sum = prefix[r] - prefix[l - 1]
Enter fullscreen mode Exit fullscreen mode

In general:

sum = prefix[r] - prefix[l - 1]
Enter fullscreen mode Exit fullscreen mode

This works because:

  • prefix[r] gives the total up to the end of the range
  • prefix[l - 1] gives everything before the range
  • Subtracting removes what we don’t care about

Intuition (what’s really happening)

prefix[4] = views[0] + views[1] + views[2] + views[3] + views[4]
prefix[0] = views[0]
-----------------------------------------------
result    =          views[1] + views[2] + views[3] + views[4]
Enter fullscreen mode Exit fullscreen mode

Everything before the range is discarded. We only keep what’s inside [l, r].


Single-day ranges

For a period like [3, 3]:

prefix[3] = views[0] + views[1] + views[2] + views[3]
prefix[2] = views[0] + views[1] + views[2]
Enter fullscreen mode Exit fullscreen mode
sum = prefix[3] - prefix[2]
sum = 15 - 10
sum = 5
Enter fullscreen mode Exit fullscreen mode

Which is exactly views[3].


That’s the magic of prefix sums:

precompute once, answer every range query in O(1).

Top comments (0)