DEV Community

Discussion on: Hit Me Baby One More Time - What Are Cache Hits and Why Should You Care?

Collapse
 
rcosteira79 profile image
Ricardo Costeira

Great post. I completely agree with Karim Manaouil on this one. However, when explaining why mult2 was faster than mult1, it might have been worth it to talk about spatial locality and how arrays are stored in memory. Still, thank you for writing about this.

Collapse
 
frosnerd profile image
Frank Rosner

Hey Ricardo,

Glad you enjoyed the post :)

You are right there are many details I did not talk about but they are equally important. Would you mind to elaborate a bit on the spatial locality and how arrays are stored in memory? I think that'd be super valuable for the discussion!

Thank you for commenting!

Collapse
 
rcosteira79 profile image
Ricardo Costeira

Sure :)

Simply put, spatial locality states that the data elements near an element that was just accessed have a high probability of being accessed as well. As for arrays, single dimension ones are stored in memory in a contiguous fashion (for arrays with more dimensions, you need to know the size at compile time, i.e. they have to be fixed size. If you don't know, the rows' elements are still stored contiguously, but there's no guarantee that the rows themselves will be close to each other. There are some tricks you can do with pointers to bypass this, though).

So, the caching process takes both of these ideas into consideration, being that contiguous array elements are usually cached together. This is why mult2 had a better performance - You were traversing the columns as if they were rows (which, as said above, are single dimension arrays). In other words, you were accessing contiguous elements in memory, and not jumping who knows how many memory addresses to access each element like in mult1. Hence the cache hits and cache misses on both examples.

Knowing this, some people even choose to represent their matrices through single dimension arrays. For instance, with a 2D matrix[10][10], you would have a single dimension matrix[100]. You would access each element through matrix[width * row + col], and thus, due to spatial locality and contiguous storage, take full advantage of your cache.

Hope that was clear :)