I applied gzip compression (followed by Base64 encoding) to the string test-string. With Scala 2.13.8 on Java 11.0.13 (Java HotSpot(TM) 64-Bit Server VM), the result is H4sIAAAAAAAAACtJLS7RLS4pysxLBwCFdJByCwAAAA==.
However, when I perform the same operation with Scala 2.13.8 on Java 17.0.4.1 (OpenJDK 64-Bit Server VM), it yields H4sIAAAAAAAA/ytJLS7RLS4pysxLBwCFdJByCwAAAA==.
Both of these compressed strings decompress correctly to the original string test-string.
I assume this could depend on several factors:
Default compression levels: the default compression level might differ between Java 11 and Java 17, resulting in different output for the same input.
Algorithm improvements: the gzip implementation in Java 17 may have been optimized, leading to different compression results.
Internal implementation details: the internals of gzip compression may have changed between Java 11 and Java 17, affecting the compressed output.
What could be the reason behind this? I am attaching the code below.
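import java.io.ByteArrayOutputStream
import java.util.zip.GZIPOutputStream
// Assumed: Base64OutputStream comes from Apache Commons Codec (the original post shows no imports)
import org.apache.commons.codec.binary.Base64OutputStream

val bos = new ByteArrayOutputStream("test-string".length)
val b64os = new Base64OutputStream(bos)
val gzip = new GZIPOutputStream(b64os)
gzip.write("test-string".getBytes("UTF-8"))
// Closing the gzip stream writes the gzip trailer and closes b64os, which flushes the Base64 padding
gzip.close()
val compressed = new String(bos.toByteArray, "UTF-8")
bos.close()
compressed.trim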
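If we look at both of your outputs in hex (after Base64 decoding), we have these two pieces of data:
Java 11: 1f 8b 08 00 00 00 00 00 00 00 2b 49 2d 2e d1 2d 2e 29 ca cc 4b 07 00 85 74 90 72 0b 00 00 00
Java 17: 1f 8b 08 00 00 00 00 00 00 ff 2b 49 2d 2e d1 2d 2e 29 ca cc 4b 07 00 85 74 90 72 0b 00 00 00
Exactly one byte changed, from 0 to 255 (ff in hex). It is the tenth byte, the OS field of the gzip header, where 0 means FAT filesystem and 255 means "unknown". The value written by the JDK was changed from 0 to 255 in Java 16 according to this: https://bugs.openjdk.org/browse/JDK-8244706
The compressed data, the CRC and the length that follow the header are identical, which is why both strings decompress to the same text. So none of your three guesses applies: the compression level and the deflate output did not change, only a metadata byte in the header did.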
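The comparison above can be reproduced with a short sketch. It assumes Java 9 or later (for `readAllBytes`) and uses `java.util.Base64` from the JDK instead of the `Base64OutputStream` in the question. The object name `GzipHeaderDiff` and its helpers (`decode`, `hex`, `differingIndices`, `gunzip`) are hypothetical names chosen for illustration.

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets
import java.util.Base64
import java.util.zip.GZIPInputStream

object GzipHeaderDiff {
  // The two Base64 strings quoted in the question.
  val java11 = "H4sIAAAAAAAAACtJLS7RLS4pysxLBwCFdJByCwAAAA=="
  val java17 = "H4sIAAAAAAAA/ytJLS7RLS4pysxLBwCFdJByCwAAAA=="

  // Base64 text -> raw gzip bytes.
  def decode(s: String): Array[Byte] = Base64.getDecoder.decode(s)

  // Render bytes as space-separated two-digit hex.
  def hex(bytes: Array[Byte]): String =
    bytes.map(b => f"${b & 0xff}%02x").mkString(" ")

  // Positions at which two arrays differ (compares up to the shorter length).
  def differingIndices(a: Array[Byte], b: Array[Byte]): List[Int] =
    a.zip(b).zipWithIndex.collect { case ((x, y), i) if x != y => i }.toList

  // Decompress gzip bytes back to a UTF-8 string.
  def gunzip(bytes: Array[Byte]): String = {
    val in = new GZIPInputStream(new ByteArrayInputStream(bytes))
    try new String(in.readAllBytes(), StandardCharsets.UTF_8)
    finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val a = decode(java11)
    val b = decode(java17)
    println(hex(a))
    println(hex(b))
    // Index 9 (0-based) is the OS field of the gzip header.
    differingIndices(a, b).foreach { i =>
      println(f"byte $i: ${a(i) & 0xff}%02x vs ${b(i) & 0xff}%02x")
    }
    // Both payloads decompress to the same text.
    println(gunzip(a) == gunzip(b))
  }
}
```

If a test or cache key currently compares the compressed strings, comparing the decompressed content instead is likely the more robust choice, since the header byte depends on the JDK version.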