Doogal Simpson

Posted on • Originally published at doogal.dev

Why Dynamic Arrays Aren't Actually Dynamic

TL;DR: Standard arrays are fixed-size by design to ensure O(1) access. Dynamic arrays manage this constraint by over-allocating memory and periodically resizing. By growing the backing array geometrically—usually 1.5x or 2x—the expensive O(n) copy operations are rare enough that the average cost per insertion remains O(1), a concept known as amortized constant time.

I often find that one of the first abstractions we take for granted as engineers is the dynamic array. Whether you are using an ArrayList in Java or a standard array in JavaScript, it is easy to assume these structures just "grow" naturally. In reality, memory is still a rigid series of fixed slots. I want to look at how we maintain the abstraction of contiguous growth while staying within the physical limits of memory allocation.

Why are standard arrays fixed-size?

I look at standard arrays as fixed-size blocks because O(1) random access depends on the data occupying one contiguous chunk of memory. The address of any element is simply the base address plus the index times the element size, so if I want the fifth element, the CPU knows exactly how far to offset from the start without checking for gaps.
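That offset calculation is plain arithmetic. Here is a tiny sketch; the base address and element size are made-up values for illustration, not anything a real allocator hands back:

```python
ELEMENT_SIZE = 8       # bytes per element, e.g. a 64-bit integer
BASE_ADDRESS = 0x1000  # pretend starting address of the array

def element_address(index: int) -> int:
    # One multiply and one add, regardless of index: O(1).
    return BASE_ADDRESS + index * ELEMENT_SIZE

print(hex(element_address(4)))  # address of the fifth element: 0x1020
```

No matter how large the array gets, locating element ten million costs the same two operations as locating element zero. That is the property fixed sizing buys.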

When I allocate an array, the system reserves a specific range of addresses. If I need to add an eleventh item to a ten-item array, I cannot simply expand the block in-place. There is no guarantee that the memory address immediately following my array isn't already claimed by another process or variable. To grow, I am forced to move the entire data set to a new, larger location that can accommodate the new size.

How does the resizing logic work during an overflow?

In my experience, the most critical part of this process is the growth factor. When the backing array hits its capacity limit, the system allocates a brand-new, larger buffer and executes a linear O(n) copy operation to move all existing elements from the old memory block to the new one.
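The allocate-copy-swap cycle can be sketched as a toy class. Python lists already grow on their own, so a fixed-length list stands in for the raw memory block here; `DynamicArray` and its `copies` counter are illustrative names I've made up, not a real library API:

```python
class DynamicArray:
    """Minimal sketch of a dynamic array over a fixed-size backing block."""

    def __init__(self, capacity: int = 4):
        self._backing = [None] * capacity  # fixed-size "memory block"
        self._size = 0
        self.copies = 0  # elements moved during resizes, for demonstration

    def append(self, value):
        if self._size == len(self._backing):
            self._grow()
        self._backing[self._size] = value
        self._size += 1

    def _grow(self):
        # Allocate a brand-new, larger buffer, then do the O(n) copy.
        new_backing = [None] * (len(self._backing) * 2)
        for i in range(self._size):
            new_backing[i] = self._backing[i]
            self.copies += 1
        self._backing = new_backing  # the old block is released

    def __len__(self):
        return self._size

arr = DynamicArray()
for i in range(5):
    arr.append(i)
print(len(arr), arr.copies)  # 5 4 — the fifth append triggered a copy of 4 elements
```

The key detail is that `append` only pays the copy cost when the backing block is exactly full; every other call is a single write.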

I see a lot of logic where the new array is scaled by a factor of 1.5x or 2x rather than just adding a single slot. This geometric growth is intentional. If I only increased the size by one slot every time I ran out of room, I would be performing an O(n) copy on every single addition. By doubling the capacity, I ensure that as the dataset grows, the intervals between these expensive reallocations become significantly longer.

| Current Capacity | Elements Added | Resize Triggered? | New Capacity | Copy Cost (Elements) |
| ---------------- | -------------- | ----------------- | ------------ | -------------------- |
| 4                | 4              | No                | 4            | 0                    |
| 4                | 5              | Yes               | 8            | 4                    |
| 8                | 8              | No                | 8            | 0                    |
| 8                | 9              | Yes               | 16           | 8                    |
| 16               | 17             | Yes               | 32           | 16                   |
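The resize events in the table can be reproduced with a short simulation; the starting capacity of 4 and the variable names are assumptions chosen to match the table:

```python
capacity, size, events = 4, 0, []

for n in range(1, 18):  # add elements 1 through 17
    if size == capacity:  # overflow: allocate double, copy everything
        events.append((capacity, n, capacity * 2, size))
        capacity *= 2
    size += 1

for old_cap, trigger, new_cap, copied in events:
    print(f"element {trigger}: resize {old_cap} -> {new_cap}, copied {copied}")
```

Only three of the seventeen additions trigger any copying at all, which is the pattern the next section formalizes.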

What is the performance impact of amortized constant time?

I use the term amortized constant time to describe an operation that is occasionally expensive but usually very cheap. In the context of dynamic arrays, the O(n) cost of a resize is spread out across a large number of O(1) insertions, making the average cost per operation effectively constant.
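One way to see this is to total up the copy work across many appends. `total_copy_cost` is a hypothetical helper assuming a doubling strategy; with doubling, the total copy work for n appends is always under 2n, so the average per append stays below a small constant:

```python
def total_copy_cost(n: int, initial_capacity: int = 1) -> int:
    """Count elements moved by resizes while appending n items (doubling growth)."""
    capacity, size, copies = initial_capacity, 0, 0
    for _ in range(n):
        if size == capacity:  # overflow: O(size) copy into a doubled buffer
            copies += size
            capacity *= 2
        size += 1
    return copies

for n in (1_000, 100_000):
    print(n, total_copy_cost(n) / n)  # average copy work per append stays below 2
```

The resizes at sizes 1, 2, 4, ..., n/2 sum to less than n, so even though individual appends occasionally cost O(n), the total over the whole sequence is O(n), and the average is O(1).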

I like to explain this through the lens of a gym membership. I make one large payment of $600 at the start of the year (representing the expensive O(n) copy operation). For the next 364 days, I walk into the gym for free (representing the O(1) insertion). If I analyze the cost of a single day, it is either $600 or nothing. However, the amortized cost, averaged over the whole year, is only about $1.64 per day. For all intents and purposes, I can treat the operation as O(1) because the "spikes" in latency are so infrequent.

FAQ

Why do some implementations use a 1.5x growth factor instead of 2x?
I've noticed that 1.5x (used by Java's ArrayList and some C++ STL implementations) is often preferred for memory efficiency. With a 2x growth factor, each new buffer is larger than the sum of all the buffers freed by earlier resizes, so the allocator can never fit a new allocation into that previously released space. A factor below the golden ratio (about 1.618), such as 1.5x, eventually lets the freed chunks add up to more than the next request, which allows memory reuse and can reduce fragmentation in long-running processes.
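The arithmetic behind that claim can be checked with a quick simulation. `freed_vs_next` is a made-up helper, and the starting size of 16 and six resizes are arbitrary choices:

```python
def freed_vs_next(factor: float, resizes: int, start: int = 16):
    """Compare the total of all freed blocks against the next allocation request."""
    sizes = [start]
    for _ in range(resizes):
        sizes.append(int(sizes[-1] * factor))
    freed = sum(sizes[:-1])                 # every block released so far
    next_request = int(sizes[-1] * factor)  # size of the next resize
    return freed, next_request

print(freed_vs_next(2.0, 6))  # freed total stays below the next request
print(freed_vs_next(1.5, 6))  # freed total eventually exceeds it
```

With 2x growth the freed blocks sum to less than the next request at every step, so the released space is never big enough to hold the new buffer; with 1.5x it catches up after a handful of resizes.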

Can I avoid the O(n) copy cost if I know my data size?
Yes. If I know ahead of time that I need to store 10,000 items, I always initialize the dynamic array with that specific capacity. This pre-allocation skips all intermediate resizes and copies, ensuring every single addition stays at a strict O(1) without the amortized spikes.
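The saving can be quantified with a small doubling simulation; `total_copy_cost` and its `initial_capacity` parameter are names I've invented for the sketch, standing in for constructors like Java's `new ArrayList<>(10000)`:

```python
def total_copy_cost(n: int, initial_capacity: int = 4) -> int:
    """Count elements moved by resizes while appending n items (doubling growth)."""
    capacity, size, copies = initial_capacity, 0, 0
    for _ in range(n):
        if size == capacity:  # overflow: O(size) copy into a doubled buffer
            copies += size
            capacity *= 2
        size += 1
    return copies

print(total_copy_cost(10_000))                           # default capacity: 16380 elements copied
print(total_copy_cost(10_000, initial_capacity=10_000))  # pre-sized: 0 elements copied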

Is there a downside to using very large growth factors?
While a higher growth factor like 3x or 4x would make resizes even more infrequent, I find it creates too much memory waste. You end up with a massive backing array that is mostly empty space. The 1.5x to 2x range is the industry standard because it strikes the best balance between memory overhead and CPU performance.

Cheers!

