ConcurrentDictionary missing values?

123 Views Asked by At

I have the following code

ConcurrentDictionary<string, XElement> DocumentsElement = new ConcurrentDictionary<string, XElement>();

        public void AddDocument(string docId)
        {
            bool added;
            try
            {
                var documentElement = new XElement("Document", new XAttribute("DocID", docId));

                lock (lockObject)
                {
                    added = DocumentsElement.TryAdd(docId, documentElement);

                    string path = $"{_batchNumber}";
                    if(!Directory.Exists(path)) { Directory.CreateDirectory(path); }
                    File.AppendAllText($"{path}/{docId}.txt", documentElement.ToString());

                }
            }
            catch (Exception ex)
            {
                added = false;
            }

            if (added)
            {
                Debug.WriteLine("Document added successfully.");
            }
            else
            {
                Debug.WriteLine("Failed to add document.");
            }

            lock (lockObject)
            {
                Debug.WriteLine("From AddDocument method " + DocumentsElement.Count);
            }
        }

The issue is that at the end of the entire process, DocumentsElement is missing some values., however the "File.AppendAllText" is correctly writing the XElement value into a text file and all files are there. There are multiple threads that access this AddDocument method.

e.g. DocumentsElement.Count = 4495 No. Text files = 4500 (correct count).

Is there some synchronization issue that I'm missing?

The calling code is in my main app, Process(ConcurrentDictionary<long, byte> idList)

private ConcurrentDictionary<string, DocumentLoadFile> loadfiles;

private async Task Process(ConcurrentDictionary<long, byte> idList)
{
     var tasks = idList.Select(async (id) =>
     {
          var metadata = FetchMetadataForId(id);
          var loadfile = loadfiles[metadata.BatchId];
          
          loadfile.AddDocument(metadata.DocumentId);

          if(metadata.HasAttachment)
              ProcessAttachment(metadata.AttachmentId)
     });

     await Task.WhenAll(tasks);

     //dostuff
}


private void ProcessAttachment(long attId)
{
          var metadata = FetchMetadataForId(attId);
          var loadfile = loadfiles[metadata.BatchId];
          
          loadfile.AddDocument(metadata.DocumentId);

}

2

There are 2 best solutions below

4
Fildor On

For a first approach, I'd recommend to make (and understand) these changes, which makes this a little smaller and tells you when there are dupes:

public void AddDocument(string docId)
{
    try
    {
        lock (lockObject)
        {
            var documentElement = new XElement("Document", new XAttribute("DocID", docId));
            if (!DocumentsElement.TryAdd(docId, documentElement))
                Debug.WriteLine("Duplicate : {0}", docId);

            // File I/O is slow. You may consider outsourcing this
            // to a queue and execute in parallel.
            string path = $"{_batchNumber}";
            if(!Directory.Exists(path)) { Directory.CreateDirectory(path); }
            File.AppendAllText($"{path}/{docId}.txt", documentElement.ToString());
            Debug.WriteLine("Document added successfully.");
            Debug.WriteLine("From AddDocument method " + DocumentsElement.Count);
        }
    }
    catch (Exception ex)
    {
         Debug.WriteLine("Failed to add document.");
    }
}
2
Guru Stron On

You are adding to the file "unconditionally", while dictionary will skip duplicates. From the ConcurrentDictionary<TKey,TValue>.TryAdd(TKey, TValue) docs:

Attempts to add the specified key and value to the ConcurrentDictionary<TKey,TValue>.
Returns true if the key/value pair was added to the ConcurrentDictionary<TKey,TValue> successfully; false if the key already exists.

Try adding to the file only in case if the item is not present in the dictionary:

var documentElement = new XElement("Document", new XAttribute("DocID", docId));
added = DocumentsElement.TryAdd(docId, documentElement);
if (added)
{
    lock (lockObject)
    {
        string path = $"{_batchNumber}";
        if(!Directory.Exists(path)) { Directory.CreateDirectory(path); }
        File.AppendAllText($"{path}/{docId}.txt", documentElement.ToString());
    }
}