
Understanding CPU caching and performance
by Hellazon

Caching is one of the most important concepts in computer systems. Understanding how caching works is the key to understanding system performance on all levels. On a PC, caching is everywhere. We'll focus our attention here on CPU-related cache issues; specifically, how code and data are cached for maximum performance. We'll start at the top and work our way down, stopping just before we get to the RAM.

Feeding the beast
L1, L2, RAM, page files, etc. . . . why all these caches? The answer is simple: you’ve got to feed the beast if you want it to work. When there’s a large disparity in access time or transfer rate between the producer (the disk or RAM) and the consumer (the CPU), we need caches to speed things along. The CPU needs code and data to crunch, and it needs them now. Two overriding factors govern any caching scheme: cost and speed. Faster storage costs more (add that to the list of things that are certain, along with death and taxes). As you move down the storage hierarchy from on-chip L1 to the hard disk’s swap file, you’ll notice that access time and latency increase as the cost per bit decreases. So we put the fastest, most expensive, and lowest-capacity storage closest to the CPU, and work our way out by adding slower, cheaper, and higher-capacity storage. What you wind up with is a pyramid-shaped caching hierarchy.

So what it all boils down to is this: if the CPU needs something, it checks its fastest cache. If what it needs isn’t there, it checks the next fastest cache. If that’s no good, then it checks the next fastest cache . . . all the way down the storage hierarchy. The trick is to make sure that the data that’s used most often is closest to the top of the hierarchy (in the smallest, fastest, and most expensive cache), and the data that’s used the least often is near the bottom (in the largest, slowest, and cheapest cache).
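The lookup described above — check the fastest cache first, and on a miss fall through to the next level down — can be sketched in a few lines of Python. To be clear, this is a toy model, not how any real CPU is built: the level names, latencies, and cached addresses below are illustrative round numbers I’ve made up, chosen only to show how misses at the top of the pyramid add up.

```python
# Toy model of a storage hierarchy: fastest and smallest level first.
# Latencies are illustrative round numbers, not real measurements.
HIERARCHY = [
    ("L1 cache", 1),
    ("L2 cache", 10),
    ("RAM", 100),
    ("disk swap file", 10_000_000),
]

# Which addresses each level currently holds (hypothetical contents;
# note each level contains everything the levels above it do).
contents = {
    "L1 cache": {0x10},
    "L2 cache": {0x10, 0x20},
    "RAM": {0x10, 0x20, 0x30},
    "disk swap file": {0x10, 0x20, 0x30, 0x40},
}

def lookup(address):
    """Walk the hierarchy top-down; return (level found in, total latency)."""
    total = 0
    for level, latency in HIERARCHY:
        total += latency           # every level we check costs its access time
        if address in contents[level]:
            return level, total    # hit: stop searching
    raise KeyError(address)        # not even on disk

print(lookup(0x10))  # hit in L1: cheap
print(lookup(0x40))  # misses all the way down to disk: ruinously expensive
```

The point of the toy numbers is the asymmetry: a hit in the top level costs almost nothing, while a request that falls all the way to disk pays the access time of every level it missed in on the way down — which is exactly why keeping the most frequently used data near the top of the pyramid matters so much.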

Most discussions on caching break things down according to issues in cache design, e.g. “cache coherency and consistency”, “caching algorithms”, etc.  I’m not going to do that here.  If you want to read one of those discussions, then you should buy a good textbook.  The approach I’ll take here is to start at the top of the cache hierarchy and work my way down, explaining in as much detail as I can each cache’s role in enhancing system performance.  In particular, I’ll focus on how code and data work with and against this caching scheme.

One more thing: I look forward to getting feedback on this article.  As always, if I’m out of line, then please feel free to correct me.  And as always, leave the ‘tude at the door.  We all try to be professional and courteous here; we’ll respect you if you respect us.  That having been said, let’s get on with the show.
