Parallel.ForEach and WriteAsync to improve performance?


I have written the following C# code and am looking for tips on improving it.

I have used Parallel.ForEach() to help parallelize the work needed.

The code loops through a list of "students". For each student, it reads the text file associated with that student and does calculations on each line. It appends each result to "outputLines" and "outputLines2", and then writes out the whole result for that student at the end.

Two things I have tried that helped me identify the bottleneck:

  1. If I only do the calculation and add the results to "outputLines" and "outputLines2", the run takes approximately 3 seconds.

  2. If I also write "outputLines" and "outputLines2" to different files, as shown in the code below, the time taken increases significantly, to 1 minute 30 seconds.

The 3 seconds and 1.5 minutes here are only for this small sample of students. I actually need to go through many more students' files. In the actual project, the run without writing to file takes 5 minutes to complete, and the run with writing to file takes 1 hour.
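For reference, the two timings above could be taken with something like the sketch below. `PhaseTimer` and `Measure` are hypothetical names used only for illustration. The helper wraps `System.Diagnostics.Stopwatch` around whatever action is passed in.

```csharp
using System;
using System.Diagnostics;

public static class PhaseTimer
{
    // Hypothetical helper: runs the given action once and returns
    // the elapsed wall-clock time measured by a Stopwatch.
    public static TimeSpan Measure(Action action)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Calling `PhaseTimer.Measure(...)` once around the calculation-only version and once around the version that also writes the files gives the two numbers to compare.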

I am new to async and wonder if anyone can tell me where and how I could improve the performance of this program by writing the output asynchronously. In the code below, I have only used File.WriteAllLines() to do the writing.

Please let me know how I could amend the code to improve the performance.

    private void generateFiles()
    {
        List<string> students = new List<string>();

        students.Add("Peter");
        students.Add("James");
        students.Add("Sarah");

        // The real list has more than 100 students; the rest are removed here for brevity.

        Parallel.ForEach(students, student =>
        {
            string[] lines = File.ReadAllLines(Path.Combine(@"C:\", student + ".txt"));
            List<string> outputLines = new List<string>();
            List<string> outputLines2 = new List<string>();

            foreach (string line in lines)
            {
                // for each "line", we process something and then add the result to "outputLines" and "outputLines2"
                // outputLines.Add(result)
                // outputLines2.Add(result)

            }

            File.WriteAllLines(Path.Combine(@"C:\Output\", student + ".txt"), outputLines);
            File.WriteAllLines(Path.Combine(@"C:\Output2\", student + ".txt"), outputLines2);

        });

    }
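For comparison, here is a minimal sketch of what an asynchronous version might look like. It assumes .NET Core 2.0 or later, where `File.ReadAllLinesAsync` and `File.WriteAllLinesAsync` are available. The names `StudentFileWriter`, `ProcessStudentAsync`, and `GenerateFilesAsync`, and the directory parameters, are hypothetical. The per-line calculation is a placeholder that copies the line. Whether this is actually faster than the code above is not established here and would need to be measured.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class StudentFileWriter
{
    // Hypothetical helper: reads one student's file, processes each line,
    // and writes both output files asynchronously.
    public static async Task ProcessStudentAsync(
        string inputDir, string outputDir, string outputDir2, string student)
    {
        string[] lines = await File.ReadAllLinesAsync(
            Path.Combine(inputDir, student + ".txt"));

        var outputLines = new List<string>(lines.Length);
        var outputLines2 = new List<string>(lines.Length);

        foreach (string line in lines)
        {
            // Placeholder: the real calculation is not shown in the question.
            outputLines.Add(line);
            outputLines2.Add(line);
        }

        // Start both writes first, then await them together so they can overlap.
        Task write1 = File.WriteAllLinesAsync(
            Path.Combine(outputDir, student + ".txt"), outputLines);
        Task write2 = File.WriteAllLinesAsync(
            Path.Combine(outputDir2, student + ".txt"), outputLines2);
        await Task.WhenAll(write1, write2);
    }

    // Starts one task per student and completes when all of them have finished.
    public static Task GenerateFilesAsync(
        IEnumerable<string> students,
        string inputDir, string outputDir, string outputDir2)
    {
        Directory.CreateDirectory(outputDir);
        Directory.CreateDirectory(outputDir2);

        return Task.WhenAll(students.Select(
            s => ProcessStudentAsync(inputDir, outputDir, outputDir2, s)));
    }
}
```

A caller would use `await StudentFileWriter.GenerateFilesAsync(students, @"C:\", @"C:\Output\", @"C:\Output2\");` from an async method. This sketch starts all students at once with no limit on how many files are open at the same time, which may or may not suit a large list.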