Author Topic:   Coding Cache friendly
posted October 08, 1999 04:34 PM            

While trying to eke the most performance out of my design
I have come across the idea of cache performance. While I
understand how and when the cache works, I don't understand how
to write cache-friendly code. Can anyone give me some
insight and maybe a little snippet?



LDA Seumas
posted October 09, 1999 11:50 AM           
Unless you want to get into processor-specific tricks, cache-friendly code is generally code that accesses memory as little as possible, and when it does access memory, does so in closely connected blocks. You get the most performance if your code and data fit within the L1 caches, which are usually quite small (8 or 16 KB each, now up to 32 or 64 KB on some CPUs).
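To make "closely connected blocks" concrete, here is a minimal sketch in C. The function names and the flat one-byte-per-pixel image layout are assumptions made up for this illustration, not anything from a particular engine. Both functions compute the same sum; only the order in which they touch memory differs.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration. img is a width*height byte
   image stored row after row in one contiguous block. */

/* Row-by-row walk: the inner loop touches adjacent bytes, so every
   byte of a fetched cache line is used before the next line is needed. */
unsigned long sum_rows_first(const unsigned char *img,
                             size_t width, size_t height)
{
    unsigned long total = 0;
    size_t x, y;

    for (y = 0; y < height; y++)
        for (x = 0; x < width; x++)
            total += img[y * width + x];
    return total;
}

/* Column-by-column walk: the inner loop jumps ahead by width bytes on
   each step, so on a wide image successive accesses tend to land on
   different cache lines. */
unsigned long sum_columns_first(const unsigned char *img,
                                size_t width, size_t height)
{
    unsigned long total = 0;
    size_t x, y;

    for (x = 0; x < width; x++)
        for (y = 0; y < height; y++)
            total += img[y * width + x];
    return total;
}
```

On an image much larger than the L1 cache, the first version should generally be the faster of the two, though how much faster depends on the CPU and the image size, so it is worth timing on your own target.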

As one example of writing code to make the best use of cache, one way to speed up software texture mapping routines (and even hardware texture mapping) is to "swizzle" your texture maps into little square boxes, so that a contiguous run of 32 bytes holds the 16 texels (assuming 2-byte texels) for a 4x4 block of image data in 2D. In this way, when reading the texture vertically, a single 32-byte cache line read gives you fast access to 4 texels in a vertical row. With a normal image layout, vertical texturing causes a cache miss for every row of texels you advance, whereas horizontal texturing can use the cache quite well. In this example it's a matter of making the best case a little slower in order to make the worst case much faster.
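A rough sketch of the 4x4 swizzle described above, in C. The names swizzled_index and swizzle_texture are hypothetical, the code assumes unsigned short is 2 bytes (true on common compilers but not guaranteed), and it assumes the texture width and height are multiples of 4. Treat it as one possible layout, not the only way to tile a texture.

```c
#include <stddef.h>

#define TILE 4  /* 4x4 texels per tile: 16 texels * 2 bytes = 32 bytes */

/* Position (in texels) of texel (x, y) inside the tiled texture.
   Tiles are stored one after another, left to right, top to bottom;
   within a tile the 16 texels are stored row by row. */
size_t swizzled_index(size_t x, size_t y, size_t width)
{
    size_t tiles_per_row = width / TILE;
    size_t tile   = (y / TILE) * tiles_per_row + (x / TILE);
    size_t within = (y % TILE) * TILE + (x % TILE);

    return tile * (TILE * TILE) + within;
}

/* One-time conversion from a normal row-by-row texture to the tiled
   layout. linear and tiled must not overlap. */
void swizzle_texture(const unsigned short *linear, unsigned short *tiled,
                     size_t width, size_t height)
{
    size_t x, y;

    for (y = 0; y < height; y++)
        for (x = 0; x < width; x++)
            tiled[swizzled_index(x, y, width)] = linear[y * width + x];
}
```

With this layout, texels (0,0) through (0,3) sit at indices 0, 4, 8 and 12, all inside the same 32-byte run, so if the tiled buffer starts on a cache line boundary a vertical step within a tile stays in the line already loaded. The texture mapper then reads tiled[swizzled_index(u, v, width)] instead of linear[v * width + u], which is the small extra cost in the horizontal case.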

-- Seumas McNally, Lead Programmer, Longbow Digital Arts