I was wondering if I should use a List or Dictionary knowing that my structure would have very few items. So I did a simple benchmark to test for 1, 3, 10, 20 and 50 what was fastest data structure to find the element.
To my surprise, the ConcurrentDictionary
was FASTER than the normal Dictionary
. I had the feeling that it would be slower because of locks to allow multi-thread access.
- Is there a problem with my benchmark?
- Is there a logical reason?
- Should I always use the
ConcurrentDictionary
BenchmarkDotNet=v0.13.5, OS=Windows 10 (10.0.19045.3803/22H2/2022Update)
AMD Ryzen 9 5900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK=8.0.100
[Host] : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
DefaultJob : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
Method | N | Mean | Error | StdDev | Allocated |
---|---|---|---|---|---|
ListLoopSearch | 1 | 1.416 ns | 0.0153 ns | 0.0143 ns | - |
DictionarySearch | 1 | 4.253 ns | 0.1054 ns | 0.1128 ns | - |
ConcurrentDictionarySearch | 1 | 2.891 ns | 0.0144 ns | 0.0127 ns | - |
ListLoopSearch | 3 | 6.237 ns | 0.0281 ns | 0.0234 ns | - |
DictionarySearch | 3 | 10.849 ns | 0.1968 ns | 0.1745 ns | - |
ConcurrentDictionarySearch | 3 | 6.225 ns | 0.0760 ns | 0.0711 ns | - |
ListLoopSearch | 10 | 36.174 ns | 0.5519 ns | 0.5162 ns | - |
DictionarySearch | 10 | 33.303 ns | 0.6607 ns | 0.7608 ns | - |
ConcurrentDictionarySearch | 10 | 18.787 ns | 0.1348 ns | 0.1261 ns | - |
ListLoopSearch | 20 | 118.554 ns | 1.5678 ns | 1.4665 ns | - |
DictionarySearch | 20 | 67.028 ns | 0.4722 ns | 0.3943 ns | - |
ConcurrentDictionarySearch | 20 | 36.108 ns | 0.6998 ns | 0.6546 ns | - |
ListLoopSearch | 50 | 979.931 ns | 6.9250 ns | 5.4066 ns | - |
DictionarySearch | 50 | 161.658 ns | 0.5295 ns | 0.4953 ns | - |
ConcurrentDictionarySearch | 50 | 87.737 ns | 1.2800 ns | 1.0689 ns | - |
Here is the full code for the benchmark.
using BenchmarkDotNet.Attributes;
using System.Collections.Concurrent;
public class Item {
public int Id { get; set; }
public int Value { get; set; }
}
public class SearchBenchmark {
[Params(1, 3, 10, 20, 50)] // Number of elements in the structure
public int NumberOfElements;
private List<Item> _itemList;
private Dictionary<int, Item> _itemDictionary;
private ConcurrentDictionary<int, Item> _itemConcurrentDictionary;
private List<int> _searchIds; // List of IDs to search
[GlobalSetup]
public void Setup() {
_itemList = Enumerable.Range(1, NumberOfElements).Select(i => new Item { Id = i, Value = i }).ToList();
_itemDictionary = _itemList.ToDictionary(item => item.Id, item => item);
_itemConcurrentDictionary = new ConcurrentDictionary<int, Item>(_itemList.Select(item => new KeyValuePair<int, Item>(item.Id, item)));
// Prepare a list of IDs to search for, covering all elements
_searchIds = Enumerable.Range(1, NumberOfElements).ToList();
}
[Benchmark]
public void ListLoopSearch() {
foreach (var id in _searchIds) {
foreach (var item in _itemList) {
if (item.Id == id)
break; // Found the item, move to the next ID
}
}
}
[Benchmark]
public void DictionarySearch() {
foreach (var id in _searchIds) {
_itemDictionary.TryGetValue(id, out var result);
}
}
[Benchmark]
public void ConcurrentDictionarySearch() {
foreach (var id in _searchIds) {
_itemConcurrentDictionary.TryGetValue(id, out var result);
}
}
}
@jonasH asked if the fact the elements were in consecutive order in the structure and search could be related. So I've modified the code to randomized the elements in the structures and the search. Also added a case with 1000 items. I've get pretty much the same result.
Result with randomized
Method | NumberOfElements | Mean | Error | StdDev |
---|---|---|---|---|
ListLoopSearch | 1 | 1.784 ns | 0.0163 ns | 0.0153 ns |
DictionarySearch | 1 | 4.178 ns | 0.0557 ns | 0.0465 ns |
ConcurrentDictionarySearch | 1 | 3.361 ns | 0.0110 ns | 0.0102 ns |
ListLoopSearch | 10 | 42.138 ns | 0.1538 ns | 0.1363 ns |
DictionarySearch | 10 | 33.073 ns | 0.0994 ns | 0.0929 ns |
ConcurrentDictionarySearch | 10 | 18.996 ns | 0.1257 ns | 0.1176 ns |
ListLoopSearch | 50 | 1,007.447 ns | 4.6698 ns | 4.1397 ns |
DictionarySearch | 50 | 161.110 ns | 0.5328 ns | 0.4984 ns |
ConcurrentDictionarySearch | 50 | 87.332 ns | 1.4004 ns | 1.3100 ns |
ListLoopSearch | 1000 | 324,126.481 ns | 1,032.4962 ns | 965.7976 ns |
DictionarySearch | 1000 | 3,233.511 ns | 11.0090 ns | 10.2978 ns |
ConcurrentDictionarySearch | 1000 | 1,825.452 ns | 19.3568 ns | 18.1064 ns |
As written in the
ConcurrentDictionary<TKey,TValue>
:So since you are testing read-only operation locking should not have effect (thought note that there are so called lock-free approaches to synchronization which can induce some performance cost if used).
As for performance - after quick scanning through the current implemntation I was not able to find something which could clearly explain the difference, at least at the first glance.
Dictionary.TryGetValue
has an shared method call which maybe not inlined which can have some minor effect, also it uses slightly different approach to handle collisions buckets - it uses 2 arrays (one for array containing all dictionary records and another to store index of first element in the bucket, if I got everything right, see more here) while concurrent version uses linked list for every bucket (see more here).Probably. It is quite synthetic, tests only one concrete key type with one fill scenario. Different key types and/or different collision percentage can (in theory) produce different results.
By default - no. I would argue that following the principle of least surprise when developing code will outweigh potential performance benefits (if any) in most scenarios. If you don't expect dictionary usage to be concurrent - do not use concurrent dictionary. And do not forget about dangers of premature optimization.
If you consider a path involving the dictionary usage to be a "critical" one then first of all you need to verify that
Dictionary.TryGet
is actually that influential in it and then thoroughly test your actual scenario (with appropriate data and usage patterns especially including modification operations if any, and also actual hardware) and setup infrastructure to continually benchmark the corresponding path. Which can be unjustified from resources/costs point of view.Also if you need readonly dictionary - you should consider
FrozenDictionary<TKey,TValue>
:Used @ThatGuy's benchmark as base to compare with
FrozenDictionary
. "On my machine" it produces the following results (for 50 items dictionary):Note that with
string
keys in my benchmarksFrozenDictionary
showed better performance than both alternatives.P.S.
Submitted an issue @github for more knowledgeable people to reply.