On Redis cache miss, get data only executes once

We currently use LazyCache in our .NET Core microservice, but we want to move to a distributed cache, so we are looking at a Redis-based solution.

I am looking into the cache-miss scenario. Let's assume 100 requests (all on the same server) come in that all need the same data. The data comes from a database call, but it is a complex query, so we cache the result with LazyCache. If the data is not in the cache, LazyCache calls the database to get the data and blocks all other requesters until there is a response, so the database call is only made once.
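For reference, our current usage looks roughly like this (names are illustrative; IAppCache and GetOrAddAsync come from the LazyCache package, and ComplexQueryAsync stands in for our expensive database call):

// LazyCache runs the factory delegate only once per key, even under
// concurrent requests; all other callers wait for the same result.
var data = await appCache.GetOrAddAsync("product-list", () => ComplexQueryAsync());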

This is an advantage of LazyCache over the traditional MemoryCache from Microsoft. The problem is explained very well in this article: https://blog.novanet.no/asp-net-core-memory-cache-is-get-or-create-thread-safe/

Now we want the same behavior with Redis, but I wonder whether this is possible. We use the StackExchange.Redis client.

Effectively we want to do this:

public async Task<string> GetOrAddAsync(string key, Func<Task<string>> dataFactory)
{
    // Fast path: return the value if it is already cached.
    var result = await cache.GetStringAsync(key);

    if (result != null)
    {
        return result;
    }

    // Cache miss: run the expensive factory and store the result.
    result = await dataFactory();

    await cache.SetStringAsync(key, result);

    return result;
}

Let's assume we have 100 requests at the same time. Until SetStringAsync has been called by one thread, all the other threads will call dataFactory() to get the data as well, which could be a huge performance problem. What I would like is for only the first thread for a given key to call dataFactory, while all other threads block there and wait until the first request has completed and SetStringAsync has been called. This is the same behavior as in LazyCache.
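To make the desired behavior concrete, here is a rough sketch of the per-key gating we are after, with one SemaphoreSlim per key (my own illustration, not LazyCache's internals; note it is in-process only, so it deduplicates calls per node, not across the cluster):

using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// One gate per cache key; waiters re-check the cache once the winning
// thread finishes, so dataFactory runs only once per key per node.
private static readonly ConcurrentDictionary<string, SemaphoreSlim> Locks =
    new ConcurrentDictionary<string, SemaphoreSlim>();

public async Task<string> GetOrAddAsync(string key, Func<Task<string>> dataFactory)
{
    var result = await cache.GetStringAsync(key);
    if (result != null)
    {
        return result;
    }

    var gate = Locks.GetOrAdd(key, _ => new SemaphoreSlim(1, 1));
    await gate.WaitAsync();
    try
    {
        // Double-check: the first thread may have populated the cache
        // while we were waiting on the semaphore.
        result = await cache.GetStringAsync(key);
        if (result != null)
        {
            return result;
        }

        result = await dataFactory();
        await cache.SetStringAsync(key, result);
        return result;
    }
    finally
    {
        gate.Release();
    }
}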

This seems like a pretty common scenario to me, so it surprises me that I couldn't find anything online about it. Does anyone know whether this is possible with the StackExchange.Redis client, or do I have to implement this locking myself?

2 Answers

Answer 1:

I have faced the same issue; we solved it by using an in-memory cache on top of our distributed cache (Guava provides this out of the box). That seems to solve the problem.
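In .NET terms (the question already uses LazyCache), a sketch of that two-level setup could look like this; appCache (LazyCache's IAppCache) and distributedCache (IDistributedCache) are assumed to be injected, and LazyCache supplies the per-node, run-once semantics:

public async Task<string> GetOrAddAsync(string key, Func<Task<string>> dataFactory)
{
    // L1: LazyCache ensures this delegate runs only once per key per node.
    return await appCache.GetOrAddAsync(key, async () =>
    {
        // L2: check the shared, Redis-backed cache before hitting the database.
        var cached = await distributedCache.GetStringAsync(key);
        if (cached != null)
        {
            return cached;
        }

        var result = await dataFactory();
        await distributedCache.SetStringAsync(key, result);
        return result;
    });
}

You would want a short expiry on the in-memory layer, since it can otherwise keep serving stale data after another node updates the Redis entry.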

Answer 2:

"I wonder if this is possible?"

Not generally. This is relatively easy with an in-memory cache, but a distributed cache requires a distributed lock. A distributed lock can also be abandoned (for example, if the holder crashes before releasing it), so you have to define some kind of recovery semantics for that case.
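To make that concrete, here is a rough sketch of a lock-guarded cache fill using StackExchange.Redis's LockTakeAsync/LockReleaseAsync (the lock key prefix, token, expiry, and retry delay are all illustrative; the lock expiry is what crudely recovers from an abandoned lock):

using System;
using System.Threading.Tasks;
using StackExchange.Redis;

public async Task<string> GetOrAddWithLockAsync(
    IDatabase db, string key, Func<Task<string>> dataFactory)
{
    var cached = await db.StringGetAsync(key);
    if (cached.HasValue)
    {
        return cached;
    }

    var lockKey = "lock:" + key;
    var lockToken = Guid.NewGuid().ToString();

    // The expiry bounds how long an abandoned lock can block other nodes.
    if (await db.LockTakeAsync(lockKey, lockToken, TimeSpan.FromSeconds(30)))
    {
        try
        {
            // Re-check: another node may have filled the cache while we raced.
            cached = await db.StringGetAsync(key);
            if (cached.HasValue)
            {
                return cached;
            }

            var result = await dataFactory();
            await db.StringSetAsync(key, result);
            return result;
        }
        finally
        {
            await db.LockReleaseAsync(lockKey, lockToken);
        }
    }

    // Lost the race: back off and retry; real recovery semantics (timeouts,
    // retry limits) are up to you.
    await Task.Delay(100);
    return await GetOrAddWithLockAsync(db, key, dataFactory);
}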

In summary, it's a pretty big problem to solve, and a distributed lock adds its own performance problems: every request has to coordinate distributed locking just to insert into or remove from the cache. Ironically, distributed locks are commonly implemented on top of distributed caches these days.

So, while you can implement it yourself, I recommend doing some serious testing with realistic scenarios to see whether performance is better or worse with a distributed locking cache.

It's not possible to predict what such testing would show, but I suspect it would be faster to keep the in-memory cache (you would have one dataFactory call per node rather than one per request), or possibly to use a two-level cache (where both caches are checked and updated, but only the in-memory one is locked).