Why is ConcurrentDictionary always faster than normal Dictionary?

Question

Why is ConcurrentDictionary always faster than normal Dictionary?

434 Views Asked by Zyo At 17 August 2025 at 17:30

I was wondering if I should use a List or Dictionary knowing that my structure would have very few items. So I did a simple benchmark to test for 1, 3, 10, 20 and 50 what was fastest data structure to find the element.

To my surprise, the ConcurrentDictionary was FASTER than the normal Dictionary. I had the feeling that it would be slower because of locks to allow multi-thread access.

Is there a problem with my benchmark?
Is there a logical reason?
Should I always use the ConcurrentDictionary

BenchmarkDotNet=v0.13.5, OS=Windows 10 (10.0.19045.3803/22H2/2022Update)
AMD Ryzen 9 5900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK=8.0.100
  [Host]     : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2

Method	N	Mean	Error	StdDev	Allocated
ListLoopSearch	1	1.416 ns	0.0153 ns	0.0143 ns	-
DictionarySearch	1	4.253 ns	0.1054 ns	0.1128 ns	-
ConcurrentDictionarySearch	1	2.891 ns	0.0144 ns	0.0127 ns	-
ListLoopSearch	3	6.237 ns	0.0281 ns	0.0234 ns	-
DictionarySearch	3	10.849 ns	0.1968 ns	0.1745 ns	-
ConcurrentDictionarySearch	3	6.225 ns	0.0760 ns	0.0711 ns	-
ListLoopSearch	10	36.174 ns	0.5519 ns	0.5162 ns	-
DictionarySearch	10	33.303 ns	0.6607 ns	0.7608 ns	-
ConcurrentDictionarySearch	10	18.787 ns	0.1348 ns	0.1261 ns	-
ListLoopSearch	20	118.554 ns	1.5678 ns	1.4665 ns	-
DictionarySearch	20	67.028 ns	0.4722 ns	0.3943 ns	-
ConcurrentDictionarySearch	20	36.108 ns	0.6998 ns	0.6546 ns	-
ListLoopSearch	50	979.931 ns	6.9250 ns	5.4066 ns	-
DictionarySearch	50	161.658 ns	0.5295 ns	0.4953 ns	-
ConcurrentDictionarySearch	50	87.737 ns	1.2800 ns	1.0689 ns	-

Here is the full code for the benchmark.

using BenchmarkDotNet.Attributes;
using System.Collections.Concurrent;

public class Item {
    public int Id { get; set; }
    public int Value { get; set; }
}

public class SearchBenchmark {
    [Params(1, 3, 10, 20, 50)] // Number of elements in the structure
    public int NumberOfElements;
    private List<Item> _itemList;
    private Dictionary<int, Item> _itemDictionary;
    private ConcurrentDictionary<int, Item> _itemConcurrentDictionary;
    private List<int> _searchIds; // List of IDs to search

    [GlobalSetup]
    public void Setup() {
        _itemList = Enumerable.Range(1, NumberOfElements).Select(i => new Item { Id = i, Value = i }).ToList();
        _itemDictionary = _itemList.ToDictionary(item => item.Id, item => item);
        _itemConcurrentDictionary = new ConcurrentDictionary<int, Item>(_itemList.Select(item => new KeyValuePair<int, Item>(item.Id, item)));

        // Prepare a list of IDs to search for, covering all elements
        _searchIds = Enumerable.Range(1, NumberOfElements).ToList();
    }

    [Benchmark]
    public void ListLoopSearch() {
        foreach (var id in _searchIds) {
            foreach (var item in _itemList) {
                if (item.Id == id)
                    break; // Found the item, move to the next ID
            }
        }
    }

    [Benchmark]
    public void DictionarySearch() {
        foreach (var id in _searchIds) {
            _itemDictionary.TryGetValue(id, out var result);
        }
    }
    [Benchmark]
    public void ConcurrentDictionarySearch() {
        foreach (var id in _searchIds) {
            _itemConcurrentDictionary.TryGetValue(id, out var result);
        }
    }
}

@jonasH asked if the fact the elements were in consecutive order in the structure and search could be related. So I've modified the code to randomized the elements in the structures and the search. Also added a case with 1000 items. I've get pretty much the same result.

Result with randomized

Method	NumberOfElements	Mean	Error	StdDev
ListLoopSearch	1	1.784 ns	0.0163 ns	0.0153 ns
DictionarySearch	1	4.178 ns	0.0557 ns	0.0465 ns
ConcurrentDictionarySearch	1	3.361 ns	0.0110 ns	0.0102 ns
ListLoopSearch	10	42.138 ns	0.1538 ns	0.1363 ns
DictionarySearch	10	33.073 ns	0.0994 ns	0.0929 ns
ConcurrentDictionarySearch	10	18.996 ns	0.1257 ns	0.1176 ns
ListLoopSearch	50	1,007.447 ns	4.6698 ns	4.1397 ns
DictionarySearch	50	161.110 ns	0.5328 ns	0.4984 ns
ConcurrentDictionarySearch	50	87.332 ns	1.4004 ns	1.3100 ns
ListLoopSearch	1000	324,126.481 ns	1,032.4962 ns	965.7976 ns
DictionarySearch	1000	3,233.511 ns	11.0090 ns	10.2978 ns
ConcurrentDictionarySearch	1000	1,825.452 ns	19.3568 ns	18.1064 ns

Original Q&A

There are 3 best solutions below

ThatGuy On 22 December 2023 at 01:09

It's because a concurrent dictionary's TryGetValue is a basic linked list enumeration of nodes on a null check while a base dictionary's is an array that involves multiple checks and explicit type casting. The difference for retrieving is negligible.

Your benchmarks are flawed. You do nothing with the results such as handing them off to a Consumer object. This means .NET could be doing anything and everything to optimize the code in release configuration. It is best practice to make sure tests consume what they produce properly to avoid .NET from optimizing what you're trying to test. I fixed it and increased the element count to 10,000,000 elements in both dictionaries. The results are not as drastic as you think. Here is the code:

BenchmarkDotNet v0.13.11, Windows 11 (10.0.22621.2861/22H2/2022Update/SunValley2)
AMD Ryzen 5 Microsoft Surface Edition, 1 CPU, 12 logical and 6 physical cores
.NET SDK 8.0.100
  [Host]     : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2

Method	Mean	Error	StdDev
DictionarySearch	67.74 ms	1.222 ms	1.083 ms
ConcurrentDictionarySearch	53.68 ms	0.610 ms	0.540 ms

Remember that this is a 14ms difference over 10,000,000 elements

public class Benchmarks
{
    public class Item {
        public int Id { get; set; }
        public int Value { get; set; }
    }
    
    private const int ItemCount = 10_000_000;
    private Item[] _itemList = new Item[ItemCount];
    private int[] _searchIds = new int[ItemCount]; // List of IDs to search

    private Dictionary<int, Item> _itemDictionary = new();
    private ConcurrentDictionary<int, Item> _itemConcurrentDictionary = new();

    private Consumer _consumer = new();
    
    [GlobalSetup]
    public void Setup() {
        var rand = new Random();
        for (var i = 0; i < _itemList.Length; i++)
        {
            var item = new Item
            {
                Id = i,
                Value = Random.Shared.Next()
            };

            _searchIds[i] = item.Id;
            _itemList[i] = item;
            _itemDictionary.Add(item.Id, item);
            _itemConcurrentDictionary.TryAdd(item.Id, item);
        }

        rand.Shuffle(_itemList);
    }

    [Benchmark]
    public void DictionarySearch() {
        // Modified to search for each ID
        foreach (var id in _searchIds)
        {
            if (!_itemDictionary.TryGetValue(id, out var result))
                continue;
            
            _consumer.Consume(result);
        }
    }
    
    [Benchmark]
    public void ConcurrentDictionarySearch() {
        // Modified to search for each ID
        foreach (var id in _searchIds) {
            if (!_itemConcurrentDictionary.TryGetValue(id, out var result))
                continue;
            
            _consumer.Consume(result);
        }
    }
}

While reading from a concurrent dictionary is a very small amount faster, writing to one is far slower than a base dictionary on a single thread. There's not much reason to use a concurrent dictionary if it's not a concurrent environment because of how slow it will become. I modified the same benchmark to clear both dictionaries after every iteration and reduced the amount of elements to 1,000,000. After running:

Method	Mean	Error	StdDev
DictionaryWrite	98.84 ms	1.413 ms	1.252 ms
ConcurrentDictionaryWrite	490.79 ms	3.616 ms	3.206 ms

[GlobalSetup]
public void GlobalSetup() {
    var rand = new Random();
    for (var i = 0; i < _itemList.Length; i++)
    {
        var item = new Item
        {
            Id = i,
            Value = Random.Shared.Next()
        };

        _searchIds[i] = item.Id;
        _itemList[i] = item;
    }

    rand.Shuffle(_searchIds);
    rand.Shuffle(_itemList);
}

[IterationSetup]
public void IterationSetup()
{
    _itemDictionary.Clear();
    _itemConcurrentDictionary.Clear();
}

[Benchmark]
public void DictionaryWrite() { 
    foreach (var i in _itemList)
    {
        var added = _itemDictionary.TryAdd(i.Id, i);
        _consumer.Consume(added);
    }
}

[Benchmark]
public void ConcurrentDictionaryWrite() {
    foreach (var i in _itemList) {
        var added = _itemConcurrentDictionary.TryAdd(i.Id, i);
        _consumer.Consume(added);
    }
}

For the purpose of practicality to be at least a little relevant to a real scenario, here's a two sum benchmark because if you're just enumerating it, you're probably using the wrong data structure:

Method	TwoSumValCount	Mean	Error	StdDev
TwoSum	50	867.6 ns	11.55 ns	11.34 ns
TwoSumConcurrent	50	2,549.8 ns	25.00 ns	22.16 ns
TwoSum	10000000	5,949,825.4 ns	94,161.74 ns	128,889.65 ns
TwoSumConcurrent	10000000	66,325,225.3 ns	1,321,310.23 ns	2,577,114.13 ns

public class Benchmarks
{
    [Params(50, 10_000_000)]
    public int TwoSumValCount;
    private long[] _twoSumValues = Array.Empty<long>();
    private long _twoSumTarget = 0;

    private Consumer _consumer = new();
    
    [GlobalSetup]
    public void Setup()
    {
        Trace.Assert(TwoSumValCount > 2);
        var nums = new HashSet<long>();
        while (nums.Count < TwoSumValCount)
        {
            var l = Random.Shared.NextInt64(-1_000_000_000, 5_000_000_000);
            nums.Add(l);
        }

        _twoSumValues = nums.ToArray();
        Random.Shared.Shuffle(_twoSumValues);
        _twoSumTarget = _twoSumValues[^1] + _twoSumValues[^2];
    }

    [Benchmark]
    public void TwoSum()
    {
        var map = new Dictionary<long, long> { { _twoSumValues[0], 0 } };

        long[]? result = null;
        for (var i = 1; i < _twoSumValues.Length; i++)
        {
            var n = _twoSumValues[i];
            if (map.TryGetValue(_twoSumTarget - n, out var neededI))
            {
                result = new[] { neededI, i };
                break;
            }
            
            map.TryAdd(n, i);
        }

        Trace.Assert(result != null);
        _consumer.Consume(result);
    }
    
    [Benchmark]
    public void TwoSumConcurrent()
    {
        var map = new ConcurrentDictionary<long, long>();
        map.TryAdd(_twoSumValues[0], 0);

        long[]? result = null;
        for (var i = 1; i < _twoSumValues.Length; i++)
        {
            var n = _twoSumValues[i];
            if (map.TryGetValue(_twoSumTarget - n, out var neededI))
            {
                result = new[] { neededI, i };
                break;
            }
            
            map.TryAdd(n, i);
        }

        Trace.Assert(result != null);
        _consumer.Consume(result);
    }
}

Like I said in the comments of your question, definitely look into a frozen dictionary if all you need to do is read.

Ian Kemp On 22 December 2023 at 19:21

I was wondering if I should use a List or Dictionary knowing that my structure would have very few items.

I've been developing software for over two decades now and haven't yet encountered a situation where using a different data structure made a critical difference to performance. Don't "wonder" and especially don't waste time micro-optimising before you even know you need to do so: use whatever makes your program easiest to read, and therefore maintain, first; optimise second, and only if you absolutely need to.

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

— Donald Knuth, 1974 Turing Award lecture, pg. 671

**Guru Stron** · Accepted Answer

I had the feeling that it would be slower because of locks to allow multi-thread access.

As written in the ConcurrentDictionary<TKey,TValue>:

For modifications and write operations to the dictionary, ConcurrentDictionary<TKey,TValue> uses fine-grained locking to ensure thread safety. (Read operations on the dictionary are performed in a lock-free manner)

So since you are testing read-only operation locking should not have effect (thought note that there are so called lock-free approaches to synchronization which can induce some performance cost if used).

As for performance - after quick scanning through the current implemntation I was not able to find something which could clearly explain the difference, at least at the first glance. Dictionary.TryGetValue has an shared method call which maybe not inlined which can have some minor effect, also it uses slightly different approach to handle collisions buckets - it uses 2 arrays (one for array containing all dictionary records and another to store index of first element in the bucket, if I got everything right, see more here) while concurrent version uses linked list for every bucket (see more here).

Is there a problem with my benchmark?

Probably. It is quite synthetic, tests only one concrete key type with one fill scenario. Different key types and/or different collision percentage can (in theory) produce different results.

Should I always use the ConcurrentDictionary

By default - no. I would argue that following the principle of least surprise when developing code will outweigh potential performance benefits (if any) in most scenarios. If you don't expect dictionary usage to be concurrent - do not use concurrent dictionary. And do not forget about dangers of premature optimization.

If you consider a path involving the dictionary usage to be a "critical" one then first of all you need to verify that Dictionary.TryGet is actually that influential in it and then thoroughly test your actual scenario (with appropriate data and usage patterns especially including modification operations if any, and also actual hardware) and setup infrastructure to continually benchmark the corresponding path. Which can be unjustified from resources/costs point of view.

Also if you need readonly dictionary - you should consider FrozenDictionary<TKey,TValue>:

Provides an immutable, read-only dictionary optimized for fast lookup and enumeration.

Used @ThatGuy's benchmark as base to compare with FrozenDictionary. "On my machine" it produces the following results (for 50 items dictionary):

Method	Mean	Error	StdDev
DictionarySearch	355.7 ns	2.54 ns	2.12 ns
FrozenDictionarySearch	272.3 ns	1.82 ns	1.42 ns
ConcurrentDictionarySearch	206.9 ns	2.60 ns	2.43 ns

Note that with string keys in my benchmarks FrozenDictionary showed better performance than both alternatives.

P.S.

Submitted an issue @github for more knowledgeable people to reply.

Why is ConcurrentDictionary always faster than normal Dictionary?

There are 3 best solutions below

Related Questions in C#

Related Questions in PERFORMANCE

Related Questions in DICTIONARY

Related Questions in .NET-8.0

Related Questions in CONCURRENTDICTIONARY

Trending Questions

Popular # Hahtags

Popular Questions