Read from a compressing GZipStream

I'm exploring how to implement an HTTP server in C#. (And before you ask, I know there is Kestrel (and nothing else that isn't obsolete), and I want a much, much smaller application.) The response body could be a Stream that is not seekable and has an unknown length. For this situation, chunked encoding can be used instead of sending a Content-Length header.

The response can also be compressed with gzip or br, as indicated by the client. This can be accomplished with e.g. the GZipStream class. I almost said "easily", but that's not really the case. I find the GZipStream API confusing every time I use it, and I usually bump into every exception there is until I finally get it right.

It seems like I can only write (push) to a GZipStream, and the compressed data will trickle out the other end into the specified "base" stream. But that's not desirable because I can't just let the compressed data flow to the client. It needs to be chunked. That is, each piece of compressed data needs to be prefixed with its chunk size. Of course the GZipStream cannot produce that format.

Instead, I'd like to read (pull) from the compressing GZipStream, but that doesn't seem to be possible. The documentation says it will throw an exception if I try that. But something has to bring the compressed bytes into the chunked format.

So how would I get the expected result? Can it even be achieved with this API? Why can't I pull from the compressing stream, only push?

I'm not trying to make up (non-functional) sample code because that would only be confusing.

PS: Okay, maybe this:

Stream responseBody = ...;
if (canCompress)
{
    responseBody = new GZipStream(responseBody, CompressionMode.Compress);   // <-- probably wrong
}
// not shown: add appropriate headers
while (true)
{
    int chunkLength = responseBody.Read(buffer);   // <-- not possible
    if (chunkLength == 0)
        break;
    response.Write($"{chunkLength:X}\r\n");
    response.Write(buffer.AsMemory()[..chunkLength]);
    response.Write("\r\n");
}
response.Write("0\r\n\r\n");

David L On

Your usage of GZipStream is incomplete. While responseBody is the correct target stream for the compressed output, you have to actually write the uncompressed bytes TO the GZipStream itself.

In addition, once you are done writing, you must close the GZipStream instance so that all compressed bytes are written to your target stream. This is the critical step: the compressor buffers data internally, and the gzip format ends with a trailer (a CRC-32 and the uncompressed length) that can only be written once all input has been seen. Until the GZipStream is closed, the target stream holds an incomplete gzip stream, so this MUST happen before you can treat the compressed response as complete.

Finally, you need to reset the position of your output stream so that you can read it into an intermediary response buffer.

// bytes: the uncompressed response body (byte[])
using MemoryStream responseBody = new MemoryStream();

if (canCompress)
{
    // leaveOpen: true keeps responseBody usable after the GZipStream is disposed
    using (var gzipStream = new GZipStream(responseBody, CompressionMode.Compress, leaveOpen: true))
    {
        gzipStream.Write(bytes, 0, bytes.Length);
    } // disposing writes the remaining compressed bytes and the gzip trailer
}
else
{
    // no compression: pass the bytes through unchanged
    responseBody.Write(bytes, 0, bytes.Length);
}

responseBody.Seek(0, SeekOrigin.Begin);  // rewind so that the loop below reads from the start

var buffer = new byte[20];

while (true)
{
    int chunkLength = responseBody.Read(buffer);

    if (chunkLength == 0)
        break;

    // write response
}

In my test example, my bytes input was 241 bytes, whereas the compressed bytes written to the buffer totaled 82 bytes.
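
If buffering the whole compressed body in memory is not acceptable, one possible alternative is to keep the push model and put the chunk framing into the base stream that GZipStream writes to. The following is only a minimal sketch under that assumption, not a tested server component: `ChunkedWriteStream`, `WriteLastChunk` and `ChunkedGzip.Send` are hypothetical names invented for this example (they are not part of .NET), and the chunk sizes are simply whatever block sizes GZipStream happens to write.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper (not part of .NET): a write-only stream that wraps
// every Write call in one HTTP/1.1 chunk: "<hex size>\r\n<data>\r\n".
public sealed class ChunkedWriteStream : Stream
{
    private static readonly byte[] CrLf = { (byte)'\r', (byte)'\n' };
    private readonly Stream _inner;

    public ChunkedWriteStream(Stream inner) => _inner = inner;

    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        if (count == 0)
            return; // a zero-length chunk would terminate the body early

        byte[] header = Encoding.ASCII.GetBytes($"{count:X}\r\n");
        _inner.Write(header, 0, header.Length);
        _inner.Write(buffer, offset, count);
        _inner.Write(CrLf, 0, CrLf.Length);
    }

    // Writes the terminating zero-length chunk; call once after the body.
    public void WriteLastChunk()
    {
        byte[] last = Encoding.ASCII.GetBytes("0\r\n\r\n");
        _inner.Write(last, 0, last.Length);
    }

    public override void Flush() => _inner.Flush();
    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}

public static class ChunkedGzip
{
    // Copies source to output as a gzip-compressed, chunk-encoded body.
    public static void Send(Stream source, Stream output)
    {
        var chunked = new ChunkedWriteStream(output);
        using (var gzip = new GZipStream(chunked, CompressionMode.Compress, leaveOpen: true))
        {
            source.CopyTo(gzip);
        } // disposing writes the remaining compressed data and the gzip trailer
        chunked.WriteLastChunk();
    }
}
```

With this arrangement the source stream never has to be seekable or fully loaded: `ChunkedGzip.Send(sourceStream, networkStream)` pushes data through the compressor, and every block the compressor emits reaches the client as its own chunk. The response headers (Content-Encoding: gzip and Transfer-Encoding: chunked) still have to be written before calling it.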