I have the following use case. Multiple threads are creating data points which are collected in a ConcurrentBag. Every x ms a single consumer thread looks at the data points that came in since the last time and processes them (e.g. count them + calculate average).
The following code more or less represents the solution that I came up with:
private static ConcurrentBag<long> _bag = new ConcurrentBag<long>();
static void Main()
{
Task.Run(() => Consume());
var producerTasks = Enumerable.Range(0, 8).Select(i => Task.Run(() => Produce()));
Task.WaitAll(producerTasks.ToArray());
}
private static void Produce()
{
for (int i = 0; i < 100000000; i++)
{
_bag.Add(i);
}
}
private static void Consume()
{
while (true)
{
var oldBag = _bag;
_bag = new ConcurrentBag<long>();
var average = oldBag.DefaultIfEmpty().Average();
var count = oldBag.Count;
Console.WriteLine($"Avg = {average}, Count = {count}");
// Wait x ms
}
}
- Is a
ConcurrentBagthe right tool for the job here? - Is switching the bags the right way to achieve clearing the list for new data points and then processing the old ones?
- Is it safe to operate on
oldBagor could I run into trouble when I iterate overoldBagand a thread is still adding an item? - Should I use
Interlocked.Exchange()for switching the variables?
EDIT
I guess the above code was not really a good representation of what I'm trying to achieve. So here is some more code to show the problem:
public class LogCollectorTarget : TargetWithLayout, ILogCollector
{
private readonly List<string> _logMessageBuffer;
public LogCollectorTarget()
{
_logMessageBuffer = new List<string>();
}
protected override void Write(LogEventInfo logEvent)
{
var logMessage = Layout.Render(logEvent);
lock (_logMessageBuffer)
{
_logMessageBuffer.Add(logMessage);
}
}
public string GetBuffer()
{
lock (_logMessageBuffer)
{
var messages = string.Join(Environment.NewLine, _logMessageBuffer);
_logMessageBuffer.Clear();
return messages;
}
}
}
The class' purpose is to collect logs so they can be sent to a server in batches. Every x seconds GetBuffer is called. This should get the current log messages and clear the buffer for new messages. It works with locks but it as they are quite expensive I don't want to lock on every Logging-operation in my program. So that's why I wanted to use a ConcurrentBag as a buffer. But then I still need to switch or clear it when I call GetBuffer without loosing any log messages that happen during the switch.
Its the right tool for a job, this really depends on what you are trying to do, and why. The example you have given is very simplistic without any context so its hard to tell.
The answer is no, for probably many reasons. What happens if a thread writes to it, while you are switching it?
No, you have just copied the reference, this will achieve nothing.
Interlock methods are great things, however this will not help you in your current problem, they are for thread safe access to integer type values. You are really confused and you need to look up more thread safe examples.
However Lets point you in the right direction. forget about ConcurrentBag and those fancy classes. My advice is start simple and use locking so you understand the nature of the problem.
If you want multiple tasks/threads to access a list, you can easily use the
lockstatement and guard access to the list/array so other nasty threads aren't modifying it.Obviously the code you have written is a nonsensical example, i mean you are just adding consecutive numbers to a list, and getting another thread to average them them. This hardly needs to be consumer producer at all, and would make more sense to just be synchronous.
At this point i would point you to better architectures that would allow you to implement this pattern, e.g Tpl Dataflow, but i fear this is just a learning excise and unfortunately you really need to do more reading on multithreading and try more examples before we can truly help you with a problem.