Wednesday, May 19, 2010

Memory Cache - Overview

1. General Aspects
The access time of a cache is at least 10 times shorter than that of main memory. Because the technology is very expensive, the cache size is limited. To achieve high performance, the requested addresses must be found in the cache most of the time. The measure used for cache efficiency is the “hit rate”: the ratio between the number of accesses served from the cache and the total number of accesses during that time. Its counterpart, the “miss rate”, is given by the formula Miss Rate = 1 – Hit Rate.

Fig.1 Data Transfer

Fig.2 Speed Access

2. How to Evaluate Cache Performance?
Access -> a read/write operation
Hit -> the data was found in cache
Miss -> the data wasn’t found in cache; it is fetched from a slower memory
Miss_rate = #misses / #accesses; Hit_rate = #hits / #accesses
Example:
Taccess(cache) = 1ns, Taccess(main memory) = 10ns,
Hit_Rate = 90% (out of 100 accesses, 90 will be served from cache)
Taccess(with cache) = 90*1 + 10*10 = 190ns
Taccess(without cache) = 100*10 = 1000ns
Speed up = 1000/190 ≈ 5.3
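These numbers can be checked with a few lines of C. The values are the ones assumed above (1ns cache, 10ns main memory, 90% hit rate), and the model is the usual one in which a miss falls through to main memory:

#include <stdio.h>

int main(void)
{
    double t_cache = 1.0, t_mem = 10.0;   /* access times in ns */
    double hit_rate = 0.90;
    int    accesses = 100;

    /* hits are served from cache, misses fall through to main memory */
    double with_cache = accesses * (hit_rate * t_cache +
                                    (1.0 - hit_rate) * t_mem);
    double without    = accesses * t_mem;

    printf("with cache:    %.0f ns\n", with_cache);          /* 190 ns  */
    printf("without cache: %.0f ns\n", without);             /* 1000 ns */
    printf("speed-up:      %.1fx\n", without / with_cache);  /* 5.3x    */
    return 0;
}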
Higher Hit Rate -> Better Cache Performance -> Lower Power Consumption -> Faster Execution.

3. Types of Cache Misses
Compulsory miss – the cache is initially empty, so the first access to any block is always a miss. After the program has run for a while and the cache has warmed up, this type of miss no longer occurs.
Conflict miss – occurs in direct-mapped (and set-associative) caches when several memory blocks map to the same cache line and keep evicting each other, even though other lines may be free (see the sketch after this list).
Capacity miss – occurs because the cache is full and room must be made for the current data by evicting another block, in most cases an old one.
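The conflict case is easy to reproduce with a tiny direct-mapped cache model. This is a minimal sketch, not a real controller; the geometry (8 lines of 64 bytes) and the function names are illustrative:

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define NUM_LINES 8    /* 8 lines x 64 bytes = 512-byte cache */
#define LINE_SIZE 64

static struct { bool valid; uint32_t tag; } cache[NUM_LINES];

/* Returns true on a hit; on a miss the line is (re)filled. */
bool lookup(uint32_t addr)
{
    uint32_t index = (addr / LINE_SIZE) % NUM_LINES;
    uint32_t tag   = (addr / LINE_SIZE) / NUM_LINES;

    if (cache[index].valid && cache[index].tag == tag)
        return true;
    cache[index].valid = true;   /* miss: evict whatever was here */
    cache[index].tag   = tag;
    return false;
}

int main(void)
{
    /* 0x000 and 0x200 are 512 bytes apart: same index, different tag */
    printf("%d ", lookup(0x000));   /* 0: compulsory miss           */
    printf("%d ", lookup(0x200));   /* 0: compulsory miss, evicts   */
    printf("%d ", lookup(0x000));   /* 0: conflict miss             */
    printf("%d\n", lookup(0x200));  /* 0: conflict miss             */
    return 0;
}

Although seven of the eight lines stay empty, the two addresses keep evicting each other from the same line: a conflict miss, not a capacity miss.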

Fig.3 CPU Die

4. Cache Architectures
Cache granularity is given by the “line”, which has 4 fields:
- Valid Bit: set to 1 when the line holds valid data.
- Dirty Bit: set to 1 when the data has been modified in cache but not yet written back to main memory.
- Tag: identifies which memory address is stored in that line.
- Data: the data block fetched from main memory.

Fig.4 Data Block
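The four fields above map naturally onto a struct. This is only a sketch; the 64-byte block and 32-bit tag are illustrative, since the real widths depend on the cache geometry:

#include <stdint.h>
#include <stdbool.h>

#define LINE_SIZE 64

typedef struct {
    bool     valid;            /* 1 when the line holds valid data         */
    bool     dirty;            /* 1 when modified but not yet written back */
    uint32_t tag;              /* identifies which address is in the line  */
    uint8_t  data[LINE_SIZE];  /* the block fetched from main memory       */
} cache_line_t;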

Direct Mapped – each block from main memory maps to exactly one cache line, whether that line is empty or not. This organization does not require a replacement policy.
Fully Associative – each block from main memory can be placed in any cache line. If the cache is full, a replacement policy must decide which data to evict in order to make room for the new block.
Set Associative – the cache is split into several sets. A new block can be mapped only to the lines of one particular set, selected by some bits of the address field. This organization is the most widely used in practice because it offers a very good ratio between implementation cost and efficiency.
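For the set-associative case, the set selection can be sketched in a few lines of C. The geometry (32 KB, 4-way, 64-byte lines) is an illustrative assumption:

#define CACHE_SIZE (32 * 1024)
#define WAYS       4
#define LINE_SIZE  64
#define SETS       (CACHE_SIZE / (WAYS * LINE_SIZE))   /* 128 sets */

/* the middle bits of the address pick the set, the rest form the tag */
unsigned set_index(unsigned addr) { return (addr / LINE_SIZE) % SETS; }
unsigned tag_bits(unsigned addr)  { return (addr / LINE_SIZE) / SETS; }

A new block may then go into any of the 4 lines (ways) of its set, and the replacement policy chooses among only those 4.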

5. Cache Management
Cache memory is small compared to the main memory of any system. Since the cache access time is at least 10 times shorter than that of external memory, fetching data from cache is a big advantage. The policy that decides which data to evict from the cache to make room for new data is called a “cache replacement algorithm”.
Many such algorithms are implemented in cache controllers today. Some of them are:
- LRU (Least Recently Used) – replaces the data that has not been used for the longest time. LRU is usually implemented with a linked list.
- LFU (Least Frequently Used) – every cache line has a counter that is incremented each time that data is accessed from cache; the line with the lowest count is replaced.
- FIFO (First In First Out) – very simple to implement, but enlarging the cache can paradoxically increase the miss rate (Belady’s Anomaly).
- ARC (Adaptive Replacement Cache) – combines the LRU and LFU approaches and dynamically adjusts the balance between them.
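As a concrete illustration, here is a minimal sketch of the LRU idea. For brevity it uses a per-line timestamp instead of the linked list mentioned above; the names and the 4-line size are illustrative:

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define NUM_LINES 4

typedef struct {
    bool     valid;
    uint32_t tag;
    uint64_t last_used;   /* timestamp of the most recent access */
} line_t;

static line_t   cache[NUM_LINES];
static uint64_t now;      /* global access counter */

/* Returns true on a hit; on a miss, replaces the least recently used line. */
bool access_lru(uint32_t tag)
{
    int victim = 0;
    now++;
    for (int i = 0; i < NUM_LINES; i++) {
        if (cache[i].valid && cache[i].tag == tag) {
            cache[i].last_used = now;   /* hit: refresh recency */
            return true;
        }
        /* remember the first invalid or the oldest line as the victim */
        if (!cache[i].valid ||
            (cache[victim].valid && cache[i].last_used < cache[victim].last_used))
            victim = i;
    }
    cache[victim] = (line_t){ true, tag, now };   /* miss: replace victim */
    return false;
}

int main(void)
{
    uint32_t trace[] = { 1, 2, 3, 4, 1, 5 };   /* 5 must evict 2, the LRU tag */
    for (int i = 0; i < 6; i++)
        printf("tag %u: %s\n", trace[i], access_lru(trace[i]) ? "hit" : "miss");
    return 0;
}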

6. Cache Optimization
Beyond the replacement algorithms, cache performance can be meaningfully increased by exploiting two more principles:
a. Spatial Locality – the data requested next tends to be located near the previously used data in physical memory.
b. Temporal Locality – recently used data tends to be used again soon.
Some solutions frequently used to exploit these two properties are prefetching, loop blocking and fusion, and array padding and merging.
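As an example of loop blocking, here is a sketch of a blocked matrix transpose in C. The matrix size and the block size are illustrative assumptions; the block is chosen so that one tile fits in the cache:

#define N     512
#define BLOCK 64

void transpose_blocked(double dst[N][N], double src[N][N])
{
    for (int ii = 0; ii < N; ii += BLOCK)
        for (int jj = 0; jj < N; jj += BLOCK)
            /* work on one BLOCK x BLOCK tile at a time, so the cache
               lines touched by dst stay resident while they are reused */
            for (int i = ii; i < ii + BLOCK; i++)
                for (int j = jj; j < jj + BLOCK; j++)
                    dst[j][i] = src[i][j];
}

Without blocking, the column-wise writes to dst touch a new cache line on every iteration and evict lines before they are reused; with blocking, both spatial and temporal locality are preserved within each tile.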
