Byte array gzip and Base64 encoding results in OOM error upon retrieval and decode + unzip at high load

Question

101 Views Asked by Rajat Somani At 24 June 2025 at 23:46

We have an XML document of about 1.4 MB, which we gzip-compress, encode to Base64, and save in Cosmos. When we receive updates, we read the document from Cosmos, decode it from Base64, and unzip it to get the original string. What we observe is that, under high load, the slanted apostrophe character produces junk data when the document is saved back to Cosmos during update processing. The Base64-encoded data looks like this (the run of A characters continues well beyond what is shown):

/F9nYk3vKlhqHb65KybqXTJfLvTvuy24HFwOq1wOT55oEkdJ+0bmcuWJJisvbfanpsb7//2//w8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA........

Decoding and unzipping this data gives an OOM error: the decoded size grows into gigabytes and is filled with the junk character �, as in ��.

Logic for encoding and gzip

    String compressed;
    try (var baos = new ByteArrayOutputStream(); var gzipOut = new GZIPOutputStream(baos)) {
      // Always write the document as UTF-8 bytes before compressing.
      gzipOut.write(data.getBytes(StandardCharsets.UTF_8));
      // Finish the gzip stream so its trailer is written to baos before we read it.
      gzipOut.finish();
      compressed = Base64.getEncoder().encodeToString(baos.toByteArray());
    } catch (IOException e) {
      throw new FOSInjestorApplicationException(Errors.UNEXPECTED_ERROR
              , Errors.UNEXPECTED_ERROR.getDescription());
    }

Logic for decoding and unzip

    byte[] decodedBase64 = Base64.getDecoder().decode(arr);
    byte[] unzipped;
    // Decompress the Base64-decoded bytes, not the still-encoded input.
    try (var bais = new ByteArrayInputStream(decodedBase64); var gzipIn = new GZIPInputStream(bais)) {
      unzipped = gzipIn.readAllBytes();
    } catch (IOException e) {
      throw new FOSInjestorApplicationException(Errors.UNEXPECTED_ERROR
              , Errors.UNEXPECTED_ERROR.getDescription());
    }
    // No charset is passed here, so the JVM's default charset is used.
    return new String(unzipped);

When we place the same document, with the slanted apostrophe, in non-prod, it works fine.

I am using Java 11 and the Java Cosmos 4.x SDK. What could cause this to fail at high load?

We processed a large number of updates (one at a time) on a document that contained a special character, the slanted apostrophe. The updates should not corrupt the data, but after decoding and unzipping we found junk characters (��) that kept growing in size, up to 1 GB, and caused an OOM error.

There is 1 best solution below

**Rajat Somani** · Answer 1

We checked why decoding and decompressing, followed by compressing and encoding, caused a problem with the slanted apostrophe. The cause was the JVM's default charset: Java 11 does not default to UTF-8 but takes its default charset from the environment (the file.encoding setting), and our application VM was running with ISO-8859-1. ISO-8859-1 cannot represent the slanted apostrophe, so every time a document was decoded and then encoded again, the character was mangled further, and the volume of junk characters (??????) grew exponentially with each update. Eventually the decoded and decompressed size reached about 1 GB for a 2 MB junk payload saved in Cosmos, and the application hit an OOM error. The fix was to add the JVM parameter -Dfile.encoding=UTF-8 explicitly. After that, we observed that the special character (the slanted apostrophe) was read back correctly.
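The growth described above can be reproduced in isolation. The following is a minimal sketch, not the application's actual code: the class name CharsetRoundTripDemo and its helper methods are hypothetical, and it assumes that the compress step always writes UTF-8 bytes while the decompress step builds the String with ISO-8859-1 (standing in for the VM's default charset). Under that assumption, each non-ASCII character turns into two on every round trip, so the junk doubles per update.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; names are illustrative only.
public class CharsetRoundTripDemo {

    // Gzip-compress and Base64-encode, always writing UTF-8 bytes.
    static String encode(String data) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzipOut = new GZIPOutputStream(baos)) {
            gzipOut.write(data.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed here, so baos holds the complete output.
        return Base64.getEncoder().encodeToString(baos.toByteArray());
    }

    // Base64-decode and decompress, then build the String with the given charset.
    static String decode(String encoded, Charset charset) {
        byte[] compressed = Base64.getDecoder().decode(encoded);
        try (GZIPInputStream gzipIn =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzipIn.readAllBytes(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs the given number of decode/encode round trips (one per "update")
    // and returns the length of the last decoded document.
    static int lengthAfter(String doc, Charset decodeCharset, int cycles) {
        String stored = encode(doc);
        String current = doc;
        for (int i = 0; i < cycles; i++) {
            current = decode(stored, decodeCharset);
            stored = encode(current);
        }
        return current.length();
    }

    public static void main(String[] args) {
        String doc = "it\u2019s"; // U+2019 is the slanted apostrophe
        // Mismatched charsets: the 4-character document grows on every cycle.
        System.out.println(lengthAfter(doc, StandardCharsets.ISO_8859_1, 10)); // prints 1539
        // Matching charsets: the document is unchanged.
        System.out.println(lengthAfter(doc, StandardCharsets.UTF_8, 10)); // prints 4
    }
}
```

Passing StandardCharsets.UTF_8 explicitly to new String(...) in the decode step, as the second call does, would make the round trip independent of the JVM's default charset, so the code would behave the same with or without -Dfile.encoding=UTF-8.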
