If you've ever wondered why arr[0] is the first element and not arr[1], the answer isn't pedantry or tradition for its own sake. It comes straight from how a computer finds an element in memory.
An Index Is an Offset, Not a Count
An array is a contiguous block of memory. When you create one, the program knows a single thing about where it lives: the address of its first byte, called the base address. Every element after that is found by doing arithmetic from that base.
To locate any element, the machine computes:
address = base + index * elementSize
If you have an array of 4-byte integers starting at address 1000, then element 0 lives at 1000 + 0 * 4 = 1000, element 1 lives at 1000 + 1 * 4 = 1004, and so on. The index isn't answering "which element in counting order" — it's answering "how many elements past the start." The first element is zero elements past the start, so its index is 0.
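As a rough illustration of that formula, the Python sketch below uses the standard ctypes module to lay out five 4-byte integers contiguously and then finds each one by arithmetic from the base address. The helper name element_address and the sample values are made up for this example, and the real base address will differ on every run, so only the offsets are meaningful.

```python
import ctypes

# Five 4-byte integers stored contiguously, like a C array.
values = (ctypes.c_int32 * 5)(10, 20, 30, 40, 50)

base = ctypes.addressof(values)               # address of the first byte
element_size = ctypes.sizeof(ctypes.c_int32)  # 4 bytes


def element_address(base, index, element_size):
    """Illustrative helper: address = base + index * elementSize."""
    return base + index * element_size


for index in range(len(values)):
    address = element_address(base, index, element_size)
    # Read the integer stored at the computed address.
    value = ctypes.c_int32.from_address(address).value
    print(f"index {index}: offset {address - base:2d} bytes -> {value}")
```

The printed offsets climb in steps of 4 bytes (0, 4, 8, 12, 16), and index 0 lands exactly on the base address.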
Once you see it this way, zero-based indexing stops looking like a quirk. Under a count-from-one scheme, the same lookup becomes base + (index - 1) * elementSize: an extra subtraction whose only job is to undo a human convention. (Compilers for one-based languages can often fold that adjustment into the base address, but the correction still has to live somewhere.) With zero-based indexing the index is the offset, with nothing to adjust. The convention is really just the address arithmetic showing through.
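To make the extra step concrete, here is a small sketch that puts the two formulas side by side. Both function names are hypothetical, and the base address 1000 and 4-byte element size are the same illustrative numbers as in the earlier example.

```python
def address_zero_based(base, index, element_size):
    # The index is already the offset, measured in elements.
    return base + index * element_size


def address_one_based(base, index, element_size):
    # The index is a position, so it must be turned into an offset first.
    return base + (index - 1) * element_size


# The first element sits at the base address under either convention.
print(address_zero_based(1000, 0, 4))  # 1000
print(address_one_based(1000, 1, 4))   # 1000
```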
Dijkstra and the Case for Half-Open Ranges
There's a second, more subtle reason zero-based indexing tends to win, and it's about counting ranges rather than addresses. In a well-known 1982 note titled "Why numbering should start at zero," Edsger Dijkstra argued for writing ranges as half-open intervals: include the lower bound, exclude the upper one. Written mathematically, that's [start, end).
The payoff is that the length of such a range is simply end - start, with no +1 or -1 correction anywhere. The elements 0, 1, 2, 3, 4 are exactly the range [0, 5), which has length 5 - 0 = 5. Two adjacent ranges like [0, 3) and [3, 6) join cleanly with no gap and no overlap, because the end of one is the start of the next. This is exactly the shape of nearly every loop you write:
for (int i = 0; i < n; i++) { ... } // runs n times, indices 0..n-1
The loop touches n elements, the last index is n - 1, and the bound i < n reads as "while still inside the range." If arrays started at one and ranges were closed on both ends, you'd be sprinkling +1 and -1 adjustments through your code — and off-by-one errors thrive in exactly those adjustments.
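The same bookkeeping can be checked in a few lines of Python, whose built-in range() is itself half-open. This is only a sketch: the two length helpers are hypothetical names introduced here to contrast the half-open and closed conventions.

```python
def half_open_length(start, end):
    # [start, end): the length needs no correction.
    return end - start


def closed_length(first, last):
    # [first, last]: the +1 is easy to forget.
    return last - first + 1


n = 5
indices = list(range(0, n))  # range() is half-open: 0, 1, 2, 3, 4
print(indices, half_open_length(0, n))

# Adjacent half-open ranges share a boundary and cover every index once.
left = list(range(0, 3))
right = list(range(3, 6))
print(left + right == list(range(0, 6)))
```

Describing the same five indices as a closed range means writing 0 and 4, and the length calculation then needs the +1 that the half-open form avoids.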
Zero-based indexing's biggest practical win isn't the address math; it's that it pairs naturally with half-open ranges [start, end). Because the length is just end - start, slicing arr[2:5] gives you exactly 5 - 2 = 3 elements, and adjacent slices arr[0:3] and arr[3:6] tile perfectly without overlap. That leaves far fewer places for a stray off-by-one to hide.
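Here is a minimal Python sketch of those slices, using a made-up six-element list. The chunks helper is hypothetical and is included only to show how half-open slices split a list into consecutive pieces.

```python
arr = ["a", "b", "c", "d", "e", "f"]

middle = arr[2:5]           # elements at indices 2, 3, 4
print(middle, len(middle))  # the length is 5 - 2

# Adjacent slices tile the list: nothing is dropped or repeated.
print(arr[0:3] + arr[3:6] == arr)


def chunks(items, size):
    """Hypothetical helper: split items into consecutive half-open slices."""
    return [items[start:start + size] for start in range(0, len(items), size)]


# The last chunk is simply shorter; no bounds fix-up is needed.
print(chunks(arr, 4))
```

Each chunk ends where the next one begins, so joining the chunks back together reproduces the original list.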
A Convention, Not a Law of Nature
It's worth being precise: zero-based indexing is a convention, cemented largely by C and the languages that inherited its memory model, not a rule the universe enforces. Plenty of languages chose differently. Fortran arrays default to starting at one, MATLAB indexes from one, and Lua's idiomatic arrays (tables) conventionally start at one. R and several mathematics-oriented languages do the same, because they prioritize matching mathematical notation, where a vector's first component is usually written with subscript one.
None of these choices is wrong. They reflect a different priority: closeness to human and mathematical convention over closeness to the machine. The reason zero-based indexing feels "default" to most working programmers today is simply that so many of the most widely used languages (C, C++, Java, JavaScript, Python, Go, Rust) adopted it, and that lineage traces back to the offset-from-base model that hardware uses anyway.
So the next time someone calls zero-based indexing confusing, you can hand them the one-line answer: the index is the distance from the start, and the start is zero away from itself.
Originally published at pickuma.com.