The Math Behind O(log n): Binary Search, log , and Why Halving Wins

#algorithms #computerscience #beginners #maths

You have read O(log n) a hundred times. Binary search is O(log n). A balanced BST lookup is O(log n). Heap insert is O(log n). We nod along — log n means "fast" — and move on.

But what is that logarithm actually counting? Once it clicks, a whole family of algorithms stops being trivia you memorized and becomes one idea you understand.

A logarithm counts halvings

Forget the textbook definition for a second. Here's the one that matters for algorithms:

log₂(n) is the number of times you can halve n before you reach 1.

That's it. Start with n, keep dividing by 2, count the steps:

64 → 32 → 16 → 8 → 4 → 2 → 1     = 6 steps   →  log₂(64) = 6
1000 → 500 → ... → ~1            ≈ 10 steps  →  log₂(1000) ≈ 9.97
1,000,000 → ...                 ≈ 20 steps  →  log₂(1,000,000) ≈ 19.93

The headline number every engineer should have burned into memory: log₂ of a million is about 20. A billion is about 30. The input grew by 1000×, the work grew by 10. That gap is the entire reason O(log n) feels like magic.

Binary search is just "halve until found"

Binary search is the canonical halving algorithm. Each comparison throws away half of what's left:

def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid, steps
        elif arr[mid] < target:
            lo = mid + 1      # discard the lower half
        else:
            hi = mid - 1      # discard the upper half
    return -1, steps

Run it on a sorted array of one million integers and steps never exceeds 20 — no matter which element you search for. A linear scan would average 500,000 comparisons. Same data, same machine, a 25,000× difference in worst-case work. The only thing that changed is that binary search halves while the scan decrements.

That's the whole trick: decrementing gives you O(n); halving gives you O(log n).

Where halving shows up once you see it

The moment you read O(log n) as "halving," you start spotting the same move everywhere:

Balanced trees (AVL, red-black, B-trees): each level splits the remaining keys in two, so the tree is ~log₂(n) deep. A B-tree with a high branching factor b is log_b(n) deep — same idea, fatter halving. That's why a 4-level B-tree can index millions of rows.
Binary heaps: sift-up and sift-down walk one root-to-leaf path. Path length = tree height = O(log n).
Divide and conquer: merge sort splits the array in half each level, giving log₂(n) levels of O(n) work → O(n log n). The log factor is the number of splits.
Bits: the number of bits to represent n is ⌈log₂(n+1)⌉. "How many bits?" and "how many halvings?" are literally the same question.
Exponential search / doubling: when you double a buffer or probe 1, 2, 4, 8, …, you reach n in log₂(n) steps. Halving, run backwards.

"But my data isn't a power of two"

Real inputs aren't clean powers of 2, and the base isn't always 2 — a ternary split is log₃, a B-tree with branching factor 256 is log₂₅₆. When you need the actual value for a back-of-the-envelope estimate, you lean on the change-of-base formula:

log_b(n) = ln(n) / ln(b) = log₁₀(n) / log₁₀(b)

So the depth of a B-tree holding 10 million keys with a branching factor of 256 is log₂₅₆(10,000,000) = ln(10,000,000) / ln(256) ≈ 16.1 / 5.55 ≈ 2.9 — three levels. You can punch that into any logarithm calculator that supports an arbitrary base instead of reaching for a language's math.log(x, base) mid-discussion. The point isn't the tool; it's that base is just branching factor, and change-of-base lets you compare a binary split against a 256-way split on the same axis.

One more identity worth keeping: in Big-O, the base doesn't matter. log₂(n) and log₁₀(n) differ only by the constant factor 1/ln(2) vs 1/ln(10), and Big-O eats constants. That's why we write O(log n) with no base at all — the existence of halving is what we're claiming, not its flavor.

The takeaway

Next time you read O(log n), don't translate it to "fast." Translate it to "this algorithm halves the problem every step," and ask where the halving happens — the comparison, the tree level, the recursive split, the doubling buffer. That single reframing turns binary search, balanced trees, heaps, and divide-and-conquer into variations on one theme instead of four things to memorize.

Halving is the cheapest superpower in computer science. Logarithms are just how we count it.