Adaptive replacement cache algorithm


I'm trying to implement the Adaptive Replacement Cache algorithm, but even after reading the literature I can't understand it. Can anyone explain the algorithm to me? I see that it uses two lists, L1 for recency and L2 for frequency, but I can't understand the T1, B1 and T2, B2 parts of the L1 and L2 lists.

This is the paper where I found that information: ftp://paranoidbits.com/ebooks/Outperforming%20LRU%20with%20an%20Adaptive%20Replacement%20Cache.pdf



BEST ANSWER

T1 and T2 hold the actual data that is being cached. T1 holds all the pages that have been referenced exactly once, while T2 holds pages that have been referenced two or more times. B1 is a ghost list that remembers which pages were once in the T1 cache; B2 is the same, only for the T2 cache. When you add T1 and B1 together, you get a set (called L1) of references to pages that were, or currently are, cached because they were referenced once. The same goes for L2, except that it tracks pages that have been referenced at least twice.

T1 and T2 share a fixed-size set of slots (each slot can hold one page), and we use B1 and B2 to decide how to share those slots between T1 and T2. This is what makes ARC adaptive: the automatic adjustment of who gets the most slots. If we keep seeing requests for pages that are in B1 (pages that were recently evicted from T1), it means we aren't letting T1 hold on to its pages long enough, and it should be given more slots to counter this. The same goes for B2.
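To make the relationship between the lists concrete, here is a minimal sketch of just the bookkeeping described above. The class name ARCLists, the field names, and the use of OrderedDict are my own choices for illustration, not anything from the paper.

from collections import OrderedDict

class ARCLists:
    """The four ARC lists; names here are mine, not the paper's."""
    def __init__(self, capacity):
        self.capacity = capacity      # c: total page slots shared by T1 and T2
        self.p = 0                    # target size of T1, adjusted at runtime
        self.t1 = OrderedDict()       # cached pages referenced exactly once (LRU at the front)
        self.t2 = OrderedDict()       # cached pages referenced two or more times
        self.b1 = OrderedDict()       # ghost entries: keys recently evicted from T1
        self.b2 = OrderedDict()       # ghost entries: keys recently evicted from T2

    def sizes(self):
        # |T1| + |T2| <= c, while L1 = T1 + B1 and L2 = T2 + B2
        return {"T1": len(self.t1), "T2": len(self.t2),
                "B1": len(self.b1), "B2": len(self.b2)}

Only T1 and T2 cost real memory for page data; B1 and B2 store keys (metadata) only, which is why ARC can afford to remember a history as large as the cache itself.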

I won't try and explain the entire algorithm here, but this is at least my understanding of the ARC lists.

ANSWER

I would like to highlight some of the key ideas and help you follow along the paper's narrative. This should help you develop some intuition about ARC.

We start with an LRU cache of size c. In addition to the cached pages themselves, we also maintain a cache directory T of size c, which is just an array of metadata describing the pages in the cache. Page metadata contains, for example, the physical address of the page on the storage medium.
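As a starting point, here is a minimal sketch of that plain LRU cache, assuming Python's OrderedDict as the recency-ordered structure; the class name LRUCache and its methods are hypothetical helpers, not part of the paper.

from collections import OrderedDict

class LRUCache:
    """Plain LRU cache of size c: most recently used at the end, evict from the front."""
    def __init__(self, c):
        self.c = c
        self.pages = OrderedDict()          # key -> page data; the keys double as the directory

    def get(self, key):
        if key in self.pages:
            self.pages.move_to_end(key)     # a hit refreshes recency
            return self.pages[key]
        return None                         # a miss: the caller fetches from storage

    def put(self, key, page):
        self.pages[key] = page
        self.pages.move_to_end(key)
        if len(self.pages) > self.c:
            self.pages.popitem(last=False)  # evict the least recently used page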

Now, observe that LRU is not scan-resistant. If a process pulls a sequence of c pages, every page in the cache will be evicted. This is not very efficient, and we want to be able to optimize for both recency and frequency.
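Using the hypothetical LRUCache sketch above, the weakness is easy to demonstrate: one pass over c new pages flushes everything that was hot.

cache = LRUCache(c=3)
for k in ("a", "b", "c"):
    cache.put(k, page="hot-" + k)    # warm the cache with pages we care about

for k in ("x", "y", "z"):            # a one-time scan of c new pages...
    cache.put(k, page="cold-" + k)

print(cache.get("a"))                # ...has evicted every hot page: prints None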

Key Idea #1: split the cache into 2 lists: T1 and T2. T1 contains recently used pages, and T2 contains frequently used pages. This is scan-resistant, because a scan will cause T1 to be emptied, but leave T2 mostly unaffected.
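A rough sketch of the hit path for this split, assuming both lists are OrderedDicts with the MRU end last; the function name on_reference is mine, not the paper's:

from collections import OrderedDict

def on_reference(key, t1, t2):
    """Hit path for the T1/T2 split."""
    if key in t1:
        t1.pop(key)                  # second reference: leave the recency list...
        t2[key] = True               # ...and enter the frequency list at the MRU end
        return "hit, promoted to T2"
    if key in t2:
        t2.move_to_end(key)          # repeated reference: just refresh its position in T2
        return "hit, refreshed in T2"
    return "miss"                    # insertion and eviction are handled separately

t1, t2 = OrderedDict(), OrderedDict()
t1["page-7"] = True
print(on_reference("page-7", t1, t2))   # "hit, promoted to T2"

A scan only ever produces first references, so it churns T1 and never reaches T2.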

When the cache is full, the algorithm has to choose to evict a victim from T1, or from T2. Observe that these lists do not have to have the same size, and we can intelligently give more of the cache to recent pages or frequent pages depending on access patterns. You can invent your own heuristic here.
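One possible form of that eviction decision, loosely following the shape of the paper's REPLACE routine but simplified; it assumes the cache is full and that t1 and t2 are OrderedDicts:

from collections import OrderedDict

def choose_victim(t1, t2, p, key_in_b2=False):
    """Decide which list gives up a page, given T1's target size p."""
    if t1 and (len(t1) > p or (key_in_b2 and len(t1) == p)):
        victim, _ = t1.popitem(last=False)     # T1 is over its target: evict its LRU page
        return victim, "T1"
    victim, _ = t2.popitem(last=False)         # otherwise evict the LRU page of T2
    return victim, "T2"

t1 = OrderedDict([("a", 1), ("b", 2)])
t2 = OrderedDict([("c", 3)])
print(choose_victim(t1, t2, p=1))   # ('a', 'T1'): T1 holds more pages than its target of 1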

Key Idea #2: keep extra history. Let B1 and B2 track the metadata of evicted pages from T1 and T2 respectively. If we observe many cache misses in T1 that are hits in B1, we regret the eviction, and give more of the cache to T1.
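A small sketch of that extra history, assuming the ghost lists are OrderedDicts of keys only; the bound used for trimming here is a simplification of the paper's exact invariants:

from collections import OrderedDict

def remember_eviction(victim_key, evicted_from, b1, b2, c):
    """Record an evicted page's key in the matching ghost list (metadata only, no page data)."""
    ghost = b1 if evicted_from == "T1" else b2
    ghost[victim_key] = True
    ghost.move_to_end(victim_key)
    if len(ghost) > c:
        ghost.popitem(last=False)    # forget the oldest ghost entry

b1, b2 = OrderedDict(), OrderedDict()
remember_eviction("a", "T1", b1, b2, c=4)
print("a" in b1)                     # True: a later request for "a" counts as a hit in B1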

ARC keeps a single number p, the target size of T1, to split the cache between T1 and T2; T2 then aims for the remaining c - p slots.

Key Idea #3: On a cache miss that hits in B1, ARC increases p by |B2| / |B1|. This is the "delta", or "regret ratio", which compares the probability of a hit in B1 to the probability of a hit in B2, assuming a uniform distribution across all pages.
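The paper's rule is symmetric: a hit in B2 decreases p by |B1| / |B2|, and each step is clamped to be at least 1 and to keep p within [0, c]. A minimal sketch of one adaptation step, with the function name adapt being my own:

def adapt(p, c, len_b1, len_b2, hit_in_b1):
    """One adaptation step for p, the target size of T1 (kept within [0, c]).
    A hit in B1 grows p; a hit in B2 shrinks it."""
    if hit_in_b1:
        delta = max(1.0, len_b2 / len_b1)   # len_b1 > 0 whenever we just hit in B1
        return min(p + delta, c)
    delta = max(1.0, len_b1 / len_b2)       # len_b2 > 0 whenever we just hit in B2
    return max(p - delta, 0)

print(adapt(p=2, c=8, len_b1=1, len_b2=4, hit_in_b1=True))   # 6.0: strong regret, big step

Intuitively, the rarer the ghost list we just hit, the larger the step, so the cache shifts quickly toward whichever of recency or frequency the current workload rewards.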