Slow ReadLine by `\n` on BinaryReader

I am using a BinaryReader to read a file and split it on newline (\n) into ReadOnlySpan<byte> lines. For context, I want bytes rather than strings because I am using Utf8JsonReader and want to avoid copying from a string to a byte array.

The large buffer is deliberate: 16 kB is fine for the application, and the data is processed one buffer at a time.

However, compared to File.ReadAllBytes(filename), which completes in 1 second, the code below takes 30+ seconds on the same machine.

I naively assumed BinaryReader would read ahead and cache the data. That does not seem to be the case, or at least it is not enabled by default, and I can't find any flag for it.

How can I improve the performance, or implement the line splitting with an alternative class?

using System;
using System.IO;

public static class Program
{
    static void Main(string[] args)
    {
        using var fileStream = File.Open(args[0], FileMode.Open);
        using (var reader = new BinaryReader(fileStream))
        {
            var i = 0;
            ReadOnlySpan<byte> line = null;
            // ReadLine returns a default (null) span at end of stream, which ends the loop.
            while ((line = reader.ReadLine()) != null)
            {
                // Process the line here, one at a time.
                i++;
            }
            Console.WriteLine("Read line " + i);
        }
    }
}

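public static class BinaryReaderExtensions
{
    // Reads bytes up to and including the next '\n'.
    // Returns a default (null) span only when the stream is already at its end.
    public static ReadOnlySpan<byte> ReadLine(this BinaryReader reader)
    {
        if (reader.IsEndOfStream())
            return null;

        // Buffer size is deliberate, we process one line at a time.
        var buffer = new byte[16384];
        var i = 0;

        while (!reader.IsEndOfStream() && i < buffer.Length)
        {
            if ((buffer[i] = reader.ReadByte()) == '\n')
                return new ReadOnlySpan<byte>(buffer, 0, i + 1);

            i++;
        }

        // No '\n' found: return what was read (an unterminated last line or a
        // full buffer) instead of null, which would silently drop the data.
        return new ReadOnlySpan<byte>(buffer, 0, i);
    }

    // True when the stream position has reached its length (seekable streams only).
    public static bool IsEndOfStream(this BinaryReader reader)
    {
        return reader.BaseStream.Position == reader.BaseStream.Length;
    }
}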
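
For comparison, here is a minimal sketch of one alternative I could try. It has not been measured. It assumes the cost comes from the per-byte calls (ReadByte plus BaseStream.Position and BaseStream.Length for every byte) and from allocating a new 16 kB array per line. The sketch reads the stream in 16 kB chunks into one reusable buffer and finds each \n with Span.IndexOf. The names LineHandler, BufferedLineSplitter and ForEachLine are hypothetical, made up for this illustration.

```csharp
using System;
using System.IO;

// Hypothetical delegate: receives one line, including its trailing '\n' if present.
public delegate void LineHandler(ReadOnlySpan<byte> line);

public static class BufferedLineSplitter
{
    // Reads the stream in chunks and calls handler once per line.
    // Returns the number of spans handed to handler.
    public static int ForEachLine(Stream stream, LineHandler handler, int bufferSize = 16384)
    {
        var buffer = new byte[bufferSize]; // reused for the whole stream
        var filled = 0;                    // valid bytes currently in buffer
        var count = 0;

        while (true)
        {
            var read = stream.Read(buffer, filled, buffer.Length - filled);
            if (read == 0)
                break; // end of stream
            filled += read;

            // Emit every complete line currently in the buffer.
            var start = 0;
            while (true)
            {
                var idx = buffer.AsSpan(start, filled - start).IndexOf((byte)'\n');
                if (idx < 0)
                    break;
                handler(new ReadOnlySpan<byte>(buffer, start, idx + 1));
                count++;
                start += idx + 1;
            }

            // Assumption: a line longer than the buffer is emitted as a raw chunk.
            if (start == 0 && filled == buffer.Length)
            {
                handler(new ReadOnlySpan<byte>(buffer, 0, filled));
                count++;
                start = filled;
            }

            // Move the unfinished tail to the front so the next read appends to it.
            filled -= start;
            if (start > 0 && filled > 0)
                Array.Copy(buffer, start, buffer, 0, filled);
        }

        // Last line without a trailing '\n'.
        if (filled > 0)
        {
            handler(new ReadOnlySpan<byte>(buffer, 0, filled));
            count++;
        }
        return count;
    }
}

public static class SplitterDemo
{
    public static void Main(string[] args)
    {
        using var stream = File.OpenRead(args[0]);
        var lines = BufferedLineSplitter.ForEachLine(stream, line =>
        {
            // Process the line here; the span is only valid during this call.
        });
        Console.WriteLine("Read line " + lines);
    }
}
```

Because the buffer is reused, each span is only valid inside the callback and would need to be copied if it has to outlive it. That seems fine for feeding one line at a time into Utf8JsonReader.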