Here is a JUnit test that demonstrates my issue:
package stream;
import static org.junit.jupiter.api.Assertions.*;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import org.junit.jupiter.api.Test;

class StreamTest {

    public static class LoopbackStream {
        private final byte[] END_MARKER = new byte[0];
        private final ArrayBlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(1024);

        public OutputStream getOutputStream() {
            return new OutputStream() {
                @Override
                public void write(int b) throws IOException {
                    this.write(new byte[] { (byte) b });
                }

                @Override
                public void write(byte[] b, int off, int len) {
                    try {
                        // copy the range [off, off + len), not [off, len - off)
                        queue.put(Arrays.copyOfRange(b, off, off + len));
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }

                @Override
                public void close() {
                    try {
                        queue.put(END_MARKER);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            };
        }

        public InputStream getInputStream() {
            return new InputStream() {
                private boolean finished = false;
                private ByteBuffer current = ByteBuffer.wrap(new byte[0]);

                @Override
                public int read() {
                    if (ensureData()) {
                        return Byte.toUnsignedInt(current.get());
                    } else {
                        return -1;
                    }
                }

                @Override
                public int read(byte[] b, int off, int len) {
                    if (ensureData()) {
                        int position = current.position();
                        current.get(b, off, Math.min(len, current.remaining()));
                        return current.position() - position;
                    } else {
                        return -1;
                    }
                }

                private boolean ensureData() {
                    if (!finished && !current.hasRemaining()) {
                        try {
                            byte[] data = queue.take();
                            current = ByteBuffer.wrap(data);
                            finished = data == END_MARKER;
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                            return false;
                        }
                    }
                    return !finished;
                }
            };
        }
    }

    @Test
    void testVanilla() throws IOException {
        LoopbackStream objectUnderTest = new LoopbackStream();
        PrintWriter pw = new PrintWriter(new OutputStreamWriter(objectUnderTest.getOutputStream()), true);
        BufferedReader br = new BufferedReader(new InputStreamReader(objectUnderTest.getInputStream()));
        pw.println("Hello World!");
        assertEquals("Hello World!", br.readLine());
    }

    @Test
    void testVanilla2() throws IOException {
        LoopbackStream objectUnderTest = new LoopbackStream();
        PrintWriter pw = new PrintWriter(new OutputStreamWriter(objectUnderTest.getOutputStream()), true);
        BufferedReader br = new BufferedReader(new InputStreamReader(objectUnderTest.getInputStream()));
        pw.println("Hello World!");
        assertEquals("Hello World!", br.readLine());
        pw.println("Hello Otherworld!");
        assertEquals("Hello Otherworld!", br.readLine());
    }

    @Test
    void testGzipped() throws IOException {
        LoopbackStream objectUnderTest = new LoopbackStream();
        PrintWriter pw = new PrintWriter(new OutputStreamWriter(new GZIPOutputStream(objectUnderTest.getOutputStream(), true)), true);
        BufferedReader br = new BufferedReader(new InputStreamReader(new GZIPInputStream(objectUnderTest.getInputStream())));
        pw.println("Hello World!");
        assertEquals("Hello World!", br.readLine());
    }
}
There are two kinds of tests: two that use the vanilla input and output streams directly (these pass) and one that wraps those streams in their gzip equivalents (where br.readLine() blocks).
I've used the GZIPOutputStream's syncFlush option, which I expect to automatically flush any remaining compressed bytes whenever the stream itself is flushed, and the PrintWriter's autoFlush option so that every println triggers such a flush.
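As a sanity check on that expectation, here is a small standalone sketch (the class and method names are mine, not part of the test) showing that a GZIPOutputStream constructed with syncFlush = true does push compressed bytes to its sink on flush():

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

class SyncFlushCheck {
    // Number of bytes the GZIPOutputStream has pushed to its sink
    // after a single write() followed by flush(), with syncFlush enabled.
    static int bytesAfterFlush() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        GZIPOutputStream gz = new GZIPOutputStream(sink, true); // syncFlush = true
        gz.write("Hello World!\n".getBytes(StandardCharsets.UTF_8));
        gz.flush(); // SYNC_FLUSH: the deflater emits everything buffered so far
        return sink.size(); // 10-byte gzip header plus the compressed line
    }
}
```

The sink ends up holding more than the 10-byte gzip header, so the line's compressed bytes do leave the GZIPOutputStream on flush.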
Is there a better way to force the GZIPOutputStream to flush its buffers after a println?
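For what it's worth, reading the GZIPInputStream directly, without an InputStreamReader/BufferedReader on top, does seem to return the flushed line. A minimal sketch, with piped streams standing in for the LoopbackStream (all names are illustrative; the pipe is used from a single thread only because the data is fully written before the read):

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipDirectReadCheck {
    // Writes one line through a sync-flushing GZIPOutputStream, then reads it
    // back from a GZIPInputStream with a plain read() instead of a Reader.
    static String roundTripOneLine() throws IOException {
        PipedInputStream pipeIn = new PipedInputStream(1024);
        PipedOutputStream pipeOut = new PipedOutputStream(pipeIn);

        GZIPOutputStream gz = new GZIPOutputStream(pipeOut, true); // syncFlush
        gz.write("Hello World!\n".getBytes(StandardCharsets.UTF_8));
        gz.flush(); // header + compressed line are now sitting in the pipe

        GZIPInputStream in = new GZIPInputStream(pipeIn);
        byte[] buf = new byte[64];
        int n = in.read(buf); // returns the flushed line without waiting for close()
        return new String(buf, 0, n, StandardCharsets.UTF_8);
    }
}
```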
I know that this is not the full answer to your question, but it is too long for a comment...
Update:
After further investigation it seems that it's not the GZIPOutputStream that doesn't flush (by adding System.out.println("xy"); statements to the public void write(byte[] b, int off, int len) method you can see that the GZIPOutputStream writes two byte arrays into your OutputStream: one is the gzip stream header, the other one is the encoded content of the first line of text).

It seems that the reading process blocks because of a bad interaction between the java.io.InputStreamReader (respectively the sun.nio.cs.StreamDecoder it uses) and the GZIPInputStream.

Basically, if the StreamDecoder needs bytes from the underlying stream, it tries to read as many bytes as possible, for as long as the underlying stream reports in.available() > 0, i.e. claims that it can yield some more bytes without blocking.

The problem with this check is that InflaterInputStream (the superclass of GZIPInputStream) always returns 1 for the number of available bytes, even if its source stream has no bytes available (see the source of InflaterInputStream.available()).

So it seems that while you can write line by line into a GZIPOutputStream, you cannot easily read line by line from a GZIPInputStream...

Original answer:

The problem is not the GZIPOutputStream; the problem is with the boolean ensureData() method, which refuses to read more than one block.

The following test fails with vanilla streams too: