Big O Notation Explained: Derive It, Don't Memorize It

#leetcode #career #algorithms #interview

You can recite that binary search is O(log n) and nested loops are O(n²). That recitation stops helping the moment an interviewer writes an unfamiliar function on the whiteboard and asks you to analyse it. The useful version of Big O is the one you can derive, not the one you can look up.

TL;DR: Big O describes how an algorithm's work grows as input size grows. The interview skill isn't naming complexities from memory. It's counting how many times the dominant operation runs as a function of input size and watching what dominates as the input gets large.

What Big O actually measures

Big O describes the upper bound on how an algorithm's running time or space usage grows as its input grows. That word, grows, is the one that matters. It doesn't tell you that binary search takes exactly 20 comparisons on an array of a million elements. It tells you that doubling the array adds roughly one more comparison. The relationship between input and work is the thing being measured.

There are two simplifications Big O does on purpose. It drops constants, so O(3n) becomes O(n). And it drops lower-order terms, so O(n² + n) becomes O(n²). Both simplifications stop mattering at scale, which is the only scale the notation is built for. If n is 10, the constants matter. If n is 10 million, the highest-order term swallows everything else.

The "upper bound" piece is the worst case. An algorithm marked O(n²) might finish faster on certain inputs. It just won't do worse than quadratic. Worst-case behaviour is what interviewers care about because best-case behaviour doesn't protect you from the input that breaks the solution.

The five growth classes you'll actually see

Every algorithm you'll meet in an interview lives in one of these.

Complexity	Name	n=10	n=1,000	n=1,000,000
`O(1)`	Constant	1	1	1
`O(log n)`	Logarithmic	3	10	20
`O(n)`	Linear	10	1,000	1,000,000
`O(n log n)`	Linearithmic	33	10,000	20,000,000
`O(n²)`	Quadratic	100	1,000,000	1,000,000,000,000

The jump from O(n) to O(n²) is where most interview problems live. A brute-force solution that checks every pair in an array is quadratic. The optimised version using a hash map or two pointers drops it to linear. At a million elements, that's the difference between a million operations and a trillion.

O(n log n) is worth a side note. It's the theoretical floor for comparison-based sorting. Merge sort, heapsort, well-implemented quicksort all sit there. The number isn't arbitrary. It falls out of information theory.

Where the derivation skill lives

Most Big O guides hand you a lookup table. Binary search is logarithmic, merge sort is linearithmic, hash insert is constant amortised. Fine for reference. Useless when the interviewer hands you a function you've never seen.

The interview skill is derivation. Counting the dominant operation as a function of input size. The cleanest way to build that skill is to use Big O on code where you don't already know the answer.

Take a single loop. One pass through n elements does n units of work, so it's O(n). Two nested loops over the same array multiply: outer runs n times, inner runs up to n times, total work is n × n which is O(n²). The inner loop in something like a pair-check usually doesn't run the full n each time. The first outer pass runs the inner n-1 times, the next runs n-2, and so on. The sum is n(n-1)/2, which simplifies to O(n²) because the constant 1/2 and the linear term drop out.

The halving pattern, derived from scratch

Take binary search. It's the cleanest example of how O(log n) actually shows up.

def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1

Each loop iteration cuts the search space in half. Start with n elements, one comparison leaves n/2, two leave n/4, three leave n/8. You stop when the space hits 1. The question is: how many times can you halve n before reaching 1?

The answer is log₂(n). An array of a million elements needs at most 20 comparisons. A billion needs at most 30. That's why binary search feels almost instant on large inputs.

The shape generalises. Any loop where the search space shrinks by a constant fraction each step is logarithmic. You'll see the same shape when an index doubles (i *= 2), when a range halves, when a balanced-tree walk descends one level per iteration. The derivation isn't memorised. It's derived from the mechanic.

I keep coming back to this one because it's the easiest place to feel the difference between memorising and deriving. Once you've drawn the halving five or six times on a whiteboard, recognising it in unfamiliar code stops being effortful. The searching lesson on my own site walks through the halving invariant frame by frame, then extends the same derivation to predicate search problems where you're binary searching on an answer space instead of an array. Same mechanic, different surface.

A five-step derivation protocol

The five steps below work on any code, whether you've seen it before or not. They're the steps I use myself when an interviewer drops a function I don't recognise.

Identify the dominant operation. Inside the inner loop, the recursive call, the comparison. Whatever you'd count if you were ticking marks on paper.
Express the iteration count as a function of n. A single loop is n. Nested loops multiply. A halving loop is log n. A recursive call that branches twice and reduces by 1 each level builds a tree of depth n, giving 2ⁿ.
Sum across the levels. Single loop: n × O(1) = O(n). Recursive tree: branching factor b, depth d, total nodes roughly b^d. Multiply by the work per node.
Drop constants and lower-order terms. 3n + 5 becomes O(n). n² + n becomes O(n²). The simplification only matters because the notation is about behaviour at large n, where the highest-order term dominates everything.
Do the same thing for space. Track auxiliary memory. Count call-stack frames in recursion. Count the size of any data structure that grows with input.

The protocol works on hash-map resize, on BFS over a graph, on string concatenation in a loop, on dynamic programming with memoisation. It works on code you've never seen. That's the whole point.

A worked example on recursion

Take naive Fibonacci.

def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

The dominant operation is the recursive call. Each call to fib(k) makes two more calls. The call tree branches at every level. Depth is roughly n. Total nodes are roughly 2ⁿ. That's O(2ⁿ), which is why naive Fibonacci is unusably slow even for moderate n.

Adding memoisation collapses the tree to O(n) because each subproblem fib(k) is computed exactly once. The work per subproblem is constant, and there are n of them. Same recursion, different complexity, derived from the same counting.

The space complexity of the naive version is O(n) from the call stack, not O(2ⁿ) from the tree. At any moment, only one root-to-leaf path is on the stack. The branches aren't all simultaneously in memory. Easy mistake to make. Easy to catch if you're explicit about which operations are happening at the same time versus across the whole run.

Space complexity, including the parts that get forgotten

Time complexity gets most of the attention. Space gets asked about often enough that ignoring it loses interview points.

Space complexity measures the additional memory your algorithm uses beyond the input. The input array itself doesn't count toward auxiliary space. Sorting in place is O(1) auxiliary. Copying the array first is O(n) auxiliary for the same sort.

The piece that catches even experienced engineers is the recursive call stack. Every recursive invocation adds a frame. A function that recurses n levels deep uses O(n) stack space, even if it doesn't allocate any explicit data structure.

def sum_recursive(arr, i=0):
    if i == len(arr):
        return 0
    return arr[i] + sum_recursive(arr, i + 1)
    # Time: O(n), Space: O(n) due to call stack

The iterative version of the same logic uses constant space. That's the concrete reason converting recursion to iteration is sometimes worth the extra code.

Common space-complexity traps worth knowing by name:

Forgetting to count the recursive call stack.
Counting the input array size as auxiliary space.
Ignoring that hash maps and sets grow with input.
Missing that repeated string concatenation in a loop creates O(n²) total memory because each new string is a fresh allocation.

Where this shows up in interviews

Interviewers don't ask you to recite that merge sort is O(n log n). They hand you a function and ask you to analyse it, or they ask you to improve a brute-force O(n²) solution to O(n) or O(n log n). Derivation, not recall.

That's where a lot of self-taught preparation falls apart. You can grind hundreds of problems and memorise their complexities. If you haven't drilled the reasoning process of counting operations as a function of input, novel code still feels opaque. Learning science has a name for the right drill here: elaborative interrogation. Asking yourself why each operation contributes the work it does, rather than accepting the labelled answer.

Before Big O is derivable, you read a solution and think that looks fast. After, you look at the same solution and know it's O(n log n) because you can point at the divide step and the merge step independently. You can explain why adding a hash map drops a nested loop from quadratic to linear. You can evaluate code you've never seen and tell the interviewer how it scales.

At that point, Big O stops being a label you apply after the fact. It becomes a tool you reason with.

I wrote a longer version on my own blog that includes amortised analysis (the hash-map resize derivation in particular) and the recursion-tree shortcuts for divide-and-conquer.

What's a problem where the textbook complexity made sense to you only after you traced the work by hand? I'm curious which derivations clicked first for people, and which ones stayed stubborn.