I am trying to implement block-by-block compression using the GZipStream class (.NET Core 3.1, Visual Studio 2019, console app, Windows 10). I believe this should be possible because, per the format specification, a gzip file can consist of independent blocks one after another. However, the resulting files are corrupted. Here's the code:
using System;
using System.Globalization;
using System.IO;
using System.IO.Compression;

namespace GzipToMemoryStreamExample
{
    class Program
    {
        private static int blockSize;
        private static string sourceFileName = "e:\\SomeFolder\\SomeFile.ext";
        private static byte[] currentBlock;
        private static FileStream readingStream;
        private static FileStream writingStream;

        static void Main(string[] args)
        {
            Console.WriteLine("Enter block size:");
            string blockSizeStr = Console.ReadLine();
            int.TryParse(blockSizeStr, out blockSize);
            readingStream = new FileStream(sourceFileName, FileMode.Open);
            string resultingFileName = Path.ChangeExtension(sourceFileName, ".gz");
            CreateAndOpenResultingFile(resultingFileName);
            while (ReadBlock())
            {
                byte[] processedBlock = ProcessBlock(currentBlock);
                writingStream.Write(processedBlock, 0, processedBlock.Length);
            }
            readingStream.Dispose();
            writingStream.Dispose();
            Console.WriteLine("Finished.");
            Console.ReadKey();
        }

        private static bool ReadBlock()
        {
            bool result;
            int bytesRead;
            currentBlock = new byte[blockSize];
            bytesRead = readingStream.Read(currentBlock, 0, blockSize);
            result = bytesRead > 0;
            return result;
        }

        private static byte[] ProcessBlock(byte[] sourceData)
        {
            byte[] result;
            using (var outputStream = new MemoryStream())
            {
                using var compressionStream = new GZipStream(outputStream, CompressionMode.Compress);
                compressionStream.Write(sourceData, 0, sourceData.Length);
                result = outputStream.ToArray();
            }
            return result;
        }

        private static void CreateAndOpenResultingFile(string fileName)
        {
            if (File.Exists(fileName))
            {
                File.Delete(fileName);
            }
            writingStream = File.Create(fileName);
        }
    }
}
When I look at the resulting files, I see that the result somehow depends on the block size I choose. If it's smaller than ~100 KB, each "compressed" block is only 10 bytes, which produces an extremely small, useless file. If the block size is greater than ~100 KB, the file becomes reasonably large, about 80% of the original, but it is still corrupted.
I also checked the block headers, and they look strange: the OS field is set to TOPS-20 (value 0x0a), and the ISIZE at the end of each block is always completely wrong.
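For reference, a check like this can be sketched as follows. This is a minimal illustration, not the exact code I used: the GzipInspector class and its method names are hypothetical, and it assumes the header carries no optional fields (FLG = 0), so the fixed part is exactly 10 bytes and the OS byte sits at offset 9.

```csharp
using System;

static class GzipInspector
{
    // True if the buffer starts with the gzip magic bytes 0x1f 0x8b
    // and is at least as long as the fixed 10-byte header.
    public static bool HasGzipMagic(byte[] member) =>
        member.Length >= 10 && member[0] == 0x1f && member[1] == 0x8b;

    // The OS field is the last byte (offset 9) of the fixed 10-byte header.
    public static byte ReadOsByte(byte[] member) => member[9];

    // ISIZE is the last four bytes of a complete member, stored little-endian:
    // the uncompressed length modulo 2^32.
    public static uint ReadIsize(byte[] member)
    {
        int n = member.Length;
        return (uint)(member[n - 4]
                      | member[n - 3] << 8
                      | member[n - 2] << 16
                      | member[n - 1] << 24);
    }
}
```

ReadIsize is only meaningful when the buffer really ends with a gzip trailer; on a truncated member it just returns whatever the last four bytes happen to be.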
What is my mistake?
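This was solved simply by moving the result = outputStream.ToArray(); line out of the scope of the compressionStream using, as Mark Adler suggested in the comments. GZipStream buffers its output and writes the remaining compressed data and the gzip trailer (CRC32 and ISIZE) only when it is disposed. Because compressionStream is declared with using var, it is not disposed until the end of the enclosing block, which is after ToArray() has already been called. For small blocks, presumably only the fixed 10-byte gzip header had reached outputStream at that point; for larger blocks, the output was truncated and had no trailer, which is why ISIZE looked wrong.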
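A minimal sketch of the corrected method, under a few assumptions of mine: the BlockCompressor class name, the count parameter, and the Decompress helper are not part of the original code. The count parameter lets the caller pass the number of bytes actually read, so that a short final block is not padded with zeros from the fixed-size buffer; the original ReadBlock does not expose that number and would need to be adapted.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class BlockCompressor
{
    // Compresses the first `count` bytes of sourceData into one complete gzip member.
    public static byte[] CompressBlock(byte[] sourceData, int count)
    {
        using var outputStream = new MemoryStream();
        // leaveOpen: true keeps outputStream open after the GZipStream is disposed.
        using (var compressionStream =
               new GZipStream(outputStream, CompressionMode.Compress, leaveOpen: true))
        {
            compressionStream.Write(sourceData, 0, count);
        } // Disposing here flushes the remaining data and writes the CRC32/ISIZE trailer.

        // Only now does outputStream contain the whole member.
        return outputStream.ToArray();
    }

    // Helper for checking a single compressed block by round-tripping it.
    public static byte[] Decompress(byte[] gzipData)
    {
        using var input = new MemoryStream(gzipData);
        using var gzip = new GZipStream(input, CompressionMode.Decompress);
        using var output = new MemoryStream();
        gzip.CopyTo(output);
        return output.ToArray();
    }
}
```

In the main loop this would be called as CompressBlock(currentBlock, bytesRead), and each returned array written to writingStream as before.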