Thread safe Dictionary-like collection with upper bound

1.8k Views Asked by At

I am after a collection with the following properties:

  • Thread safe: It will be used in asp.net and multiple clients could try to add, remove and access members concurrently
  • Max elements: I want to be able to set an upper bound, a maximum number of elements, at construction time
  • TryAdd: A method that works the same as BlockingCollection<T>.TryAdd(T) would be perfect, i.e. it would return false if the maximum number of elements has been reached
  • Dictionary-like: In most other respects a ConcurrentDictionary would be perfect, i.e. ability to identify elements by a key, remove any item (not just the first or last, which I think would be the limitation with BlockingCollection)

Before I attempt to roll my own, my questions are:

  1. have I missed a built in type that would put a safe ceiling on the number of elements in a collection?
  2. Is there a way to achieve this functionality with BlockingCollection somehow?

Finally, if I do need to try and make my own, what approach should I think about? Is it as simple as a wrapped Dictionary with locks?

Example use: A chat room with a defined limit on number of participants could store the connection information of participants and reject new entrants until there is room to enter when full

6

There are 6 best solutions below

4
On BEST ANSWER

The simplest solution is just make a wrapper class that uses a normal dictionary and uses a ReaderWriterLockSlim to control thread safe access.

public class SizeLimitedDictionary<TKey, TValue> : IDictionary<TKey, TValue>
{
    private readonly int _maxSize;
    private readonly IDictionary<TKey, TValue> _dictionary;
    private readonly ReaderWriterLockSlim _readerWriterLock;

    public SizeLimitedDictionary(int maxSize)
    {
        _maxSize = maxSize;
        _dictionary = new Dictionary<TKey, TValue>(_maxSize);
        _readerWriterLock = new ReaderWriterLockSlim();
    }

    public bool TryAdd(TKey key, TValue value)
    {
        _readerWriterLock.EnterWriteLock();
        try
        {
            if (_dictionary.Count >= _maxSize)
                return false;

            _dictionary.Add(key, value);
        }
        finally
        {
            _readerWriterLock.ExitWriteLock();
        }

        return true;
    }

    public void Add(TKey key, TValue value)
    {
        bool added = TryAdd(key, value);
        if(!added)
            throw new InvalidOperationException("Dictionary is at max size, can not add additional members.");
    }

    public bool TryAdd(KeyValuePair<TKey, TValue> item)
    {
        _readerWriterLock.EnterWriteLock();
        try
        {
            if (_dictionary.Count >= _maxSize)
                return false;

            _dictionary.Add(item);
        }
        finally
        {
            _readerWriterLock.ExitWriteLock();
        }

        return true;
    }

    public void Add(KeyValuePair<TKey, TValue> item)
    {
        bool added = TryAdd(item);
        if (!added)
            throw new InvalidOperationException("Dictionary is at max size, can not add additional members.");
    }

    public void Clear()
    {
        _readerWriterLock.EnterWriteLock();
        try
        {
            _dictionary.Clear();
        }
        finally
        {
            _readerWriterLock.ExitWriteLock();
        }

    }

    public bool Contains(KeyValuePair<TKey, TValue> item)
    {
        _readerWriterLock.EnterReadLock();
        try
        {
            return _dictionary.Contains(item);
        }
        finally
        {
            _readerWriterLock.ExitReadLock();
        }

    }

    public void CopyTo(KeyValuePair<TKey, TValue>[] array, int arrayIndex)
    {
        _readerWriterLock.EnterReadLock();
        try
        {
            _dictionary.CopyTo(array, arrayIndex);
        }
        finally
        {
            _readerWriterLock.ExitReadLock();
        }
    }

    public bool Remove(KeyValuePair<TKey, TValue> item)
    {
        _readerWriterLock.EnterWriteLock();
        try
        {
            return _dictionary.Remove(item);
        }
        finally
        {
            _readerWriterLock.ExitWriteLock();
        }
    }

    public int Count
    {
        get
        {
            _readerWriterLock.EnterReadLock();
            try
            {
                return _dictionary.Count;
            }
            finally
            {
                _readerWriterLock.ExitReadLock();
            }
        }
    }

    public bool IsReadOnly
    {
        get
        {
            _readerWriterLock.EnterReadLock();
            try
            {
                return _dictionary.IsReadOnly;
            }
            finally
            {
                _readerWriterLock.ExitReadLock();
            }
        }
    }

    public bool ContainsKey(TKey key)
    {
        _readerWriterLock.EnterReadLock();
        try
        {
            return _dictionary.ContainsKey(key);
        }
        finally
        {
            _readerWriterLock.ExitReadLock();
        }
    }

    public bool Remove(TKey key)
    {
        _readerWriterLock.EnterWriteLock();
        try
        {
            return _dictionary.Remove(key);
        }
        finally
        {
            _readerWriterLock.ExitWriteLock();
        }
    }

    public bool TryGetValue(TKey key, out TValue value)
    {
        _readerWriterLock.EnterReadLock();
        try
        {
            return _dictionary.TryGetValue(key, out value);
        }
        finally
        {
            _readerWriterLock.ExitReadLock();
        }
    }

    public TValue this[TKey key]
    {
        get
        {
            _readerWriterLock.EnterReadLock();
            try
            {
                return _dictionary[key];
            }
            finally
            {
                _readerWriterLock.ExitReadLock();
            }
        }
        set
        {
            _readerWriterLock.EnterUpgradeableReadLock();
            try
            {
                var containsKey = _dictionary.ContainsKey(key);
                _readerWriterLock.EnterWriteLock();
                try
                {
                    if (containsKey)
                    {
                        _dictionary[key] = value;
                    }
                    else
                    {
                        var added = TryAdd(key, value);
                        if(!added)
                            throw new InvalidOperationException("Dictionary is at max size, can not add additional members.");
                    }
                }
                finally
                {
                    _readerWriterLock.ExitWriteLock();
                }
            }
            finally
            {
                _readerWriterLock.ExitUpgradeableReadLock();
            }
        }
    }

    public ICollection<TKey> Keys
    {
        get
        {
            _readerWriterLock.EnterReadLock();
            try
            {
                return _dictionary.Keys;
            }
            finally
            {
                _readerWriterLock.ExitReadLock();
            }
        }
    }

    public ICollection<TValue> Values
    {
        get
        {
            _readerWriterLock.EnterReadLock();
            try
            {
                return _dictionary.Values;
            }
            finally
            {
                _readerWriterLock.ExitReadLock();
            }
        }
    }

    public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator()
    {
        return _dictionary.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return ((IEnumerable)_dictionary).GetEnumerator();
    }
}

This class implements the full IDictionary<Tkey,TValue> interface. The way this works is all insertions pass through TryAdd, if you are at or above the max size and try to insert a new member you get a false from TryAdd and a InvalidOperationException from methods that do not return bool.

The reason I did not use a ConcurrentDictionary is there is no good way to try to check the count before adding a new member in an atomic way, so you would need to lock anyway. You could potentially use a concurrent dictionary and remove all of my EnterReadLock's and replace the EnterWriteLock's with normal lock calls, but you would need to do performance testing to see which would do better.

If you want methods like GetOrAdd it would not be hard to implement yourself.

2
On

You'll end up with custom implementation anyways, that said there's no built in type that behaves dictionary-like and has capacity limitations...

To make it completely custom, you may go for ConcurrentHashSet limiting amount of entries will work for you.

Concurrent HashSet<T> in .NET Framework?

1
On

If you have all these additional requirements isn't it better to create a class that composes a List rather than is one? Put the list inside the class you're making.

For example, I would say a chat room contains a list rather than being a special type of list. I would have all the max number, get chatter by name etc logic separate from the actual list. Then I would use a lock around interactions with the list, or some threadsafe collection like ConcurrentBag. As far a whether you want a dictionary, it really depends on teh detail of the data and how you're going to be accessing it.

0
On

If you need to create something like a ConcurrentDictionary with some extra features(e.g. max elements) I'd go for an Adaptor that will hold a private ConcurrentDictionary and expand it where you need to expand it.

A lot of method calls will stay with no change(you will simple call your private ConcurrentDictionary and do nothing).

0
On

Here is a simple implementation for this:

public class ConcurrentDictionaryEx<TKey, TValue>
{
    private readonly object _lock = new object();
    private ConcurrentDictionary<TKey, TValue> _dic;
    public int Capacity { get; set; }
    public int Count { get; set; }
    public ConcurrentDictionaryEx(int capacity, int concurrencyLevel = 2)
    {
        this.Capacity = capacity;
        _dic = new ConcurrentDictionary<TKey, TValue>(concurrencyLevel, capacity);
    }

    public bool TryAdd(TKey key, TValue value)
    {
        lock (_lock)
        {
            if (this.Count < this.Capacity && _dic.TryAdd(key, value))
            {
                this.Count++;
                return true;
            }
            return false;

        }
    }

    public bool TryRemove(TKey key, out TValue value)
    {
        lock (_lock)
        {
            if (_dic.TryRemove(key, out value))
            {
                this.Count--;
                return true;
            }
            return false;
        }
    }

    public bool TryGetValue(TKey key, out TValue value)
    {
        lock (_lock)
        {
            return _dic.TryGetValue(key, out value);
        }
    }

    public bool TryUpdate(TKey key, TValue newValue, TValue comparisonValue)
    {
        lock (_lock)
        {
            return _dic.TryUpdate(key, newValue, comparisonValue);
        }
    }
}
0
On

Scott Chamberlain's answer covers nicely the scenario of frequent readers and infrequent writers, by using a ReaderWriterLockSlim. But what if writing is as frequent as reading? The ReaderWriterLockSlim won't help much in this case.

My idea for mitigating this scenario is to move out of the protected section the calculation of the hashcode, and protect only the operations that involve shared state. This should be beneficial in case the cost of calculating the hashcode of the values is not trivial, for example when the values are long strings. Below is an implementation of this idea, for building a bounded concurrent HashSet<T>. This collection uses a HashSet<(T, int)> as underlying storage, with the int property being the precalculated hashcode of the T value:

public class BoundedConcurrentHashSet<T>
{
    private readonly HashSet<(T Value, int HashCode)> _set;
    private readonly int _boundedCapacity;
    private readonly IEqualityComparer<T> _comparer;

    public BoundedConcurrentHashSet(int boundedCapacity,
        IEqualityComparer<T> comparer = default)
    {
        _boundedCapacity = boundedCapacity;
        _comparer = comparer ?? EqualityComparer<T>.Default;
        _set = new(new _Comparer(_comparer));
    }

    // A comparer that returns the precalculated HashCode
    private class _Comparer : IEqualityComparer<(T, int)>
    {
        private readonly IEqualityComparer<T> _comparer;
        public _Comparer(IEqualityComparer<T> comparer) { _comparer = comparer; }
        public bool Equals((T, int) x, (T, int) y) => _comparer.Equals(
            x.Item1, y.Item1);
        public int GetHashCode((T, int) obj) => obj.Item2;
    }

    public int Count { get { lock (_set) return _set.Count; } }
    public bool IsEmpty => Count == 0;
    public int BoundedCapacity => _boundedCapacity;

    public bool Contains(T value)
    {
        int hashCode = _comparer.GetHashCode(value);
        lock (_set) return _set.Contains((value, hashCode));
    }

    public bool TryGetValue(T equalValue, out T actualValue)
    {
        int hashCode = _comparer.GetHashCode(equalValue);
        lock (_set)
        {
            if (_set.TryGetValue((equalValue, hashCode), out var existing))
            {
                actualValue = existing.Value; return true;
            }
            actualValue = default; return false;
        }
    }

    public bool TryAdd(T value)
    {
        int hashCode = _comparer.GetHashCode(value);
        lock (_set)
        {
            if (_set.Count < _boundedCapacity) return _set.Add((value, hashCode));
            return false;
        }
    }

    public bool TryGetOrAdd(T equalValue, out T actualValue)
    {
        int hashCode = _comparer.GetHashCode(equalValue);
        lock (_set)
        {
            if (_set.TryGetValue((equalValue, hashCode), out var existing))
            {
                actualValue = existing.Value; return true;
            }
            if (_set.Count < _boundedCapacity)
            {
                bool added = _set.Add((equalValue, hashCode));
                Debug.Assert(added);
                actualValue = equalValue; return true;
            }
            actualValue = default; return false;
        }
    }

    public bool TryRemove(T value)
    {
        int hashCode = _comparer.GetHashCode(value);
        lock (_set) return _set.Remove((value, hashCode));
    }

    public bool TryRemove(T equalValue, out T actualValue)
    {
        int hashCode = _comparer.GetHashCode(equalValue);
        lock (_set)
        {
            if (_set.TryGetValue((equalValue, hashCode), out var existing))
            {
                bool removed = _set.Remove((equalValue, hashCode));
                Debug.Assert(removed);
                actualValue = existing.Value; return true;
            }
            actualValue = default; return false;
        }
    }

    public T[] ToArray()
    {
        lock (_set) return _set.Select(e => e.Value).ToArray();
    }
}

The public members of this collection are:

  • Properties: Count, IsEmpty and BoundedCapacity.
  • Methods: Contains, TryGetValue, TryAdd, TryGetOrAdd, TryRemove and ToArray.

Using internally a HashSet<T> with a different T has implications regarding the protection against HashDoS attacks. In case you plan to use this collection with string keys of potentially hostile origin, check out this GitHub issue before proceeding.

Below are some benchmarks involving four different implementations, and with different lengths for the string values.

  • Lock-Optimized: is the implementation above.
  • Lock-Simple: is a simple HashSet<T>+lock implementation that doesn't precalculate the hashcodes.
  • ReaderWriterLockSlim: is Scott Chamberlain's implementation.
  • Native: is an implementation that wraps a ConcurrentDictionary<T, object>. It's not bounded, so it's not a fair contender. It is included here just for reference.

The scenario is the same for all tests: 4 worker threads that randomly and concurrently read (50%) or add (25%) or remove (25%) values from the same hashset. The reported metric is the total number of operations by all workers per second.

String length Lock-Optimized Lock-Simple ReaderWriterLockSlim Native (not bounded)
10 3,833,272 4,564,383 1,978,146 10,369,830
30 4,196,021 4,419,448 2,023,593 10,697,999
100 4,024,539 3,417,730 1,913,043 8,365,800
300 3,952,180 2,128,451 1,551,082 4,644,652
1000 1,994,425 1,018,591 839,897 2,110,027

The Lock-Simple outperforms the Lock-Optimized for short strings. The Lock-Optimized starts to shine for strings with length 100 and more.