Gmail 'full' body base64url encoding unexpected output


When decoding a Gmail message with the official googleapis client, I get unexpected output from the body data of a messages.get request made with format: 'full': characters are missing from the decoded text.

The code used to retrieve the message in full format:

// Fetch the email data
const gmailMessageData = await gmailService.users.messages.get({
    userId: 'me', id: message.id, format: 'full'
});

// The emails being received don't contain any parts, so I don't have any additional logic that combines parts together.
// Parse the response
const bodyData = gmailMessageData.data.payload.body.data
    .replace(/-/g, '+')
    .replace(/_/g, '/');

const decodedData = Buffer.from(bodyData, 'base64').toString('utf-8');

The value of decodedData is:

<br />mail is sent as a courtesy from [redacted]. <br /> reply to: [redacted]. <br />sent on behalf of [redacted] <br /> <br />ation: [redacted] <br /> <br />ration Type: [redacted] <br />val Action: [redacted] <br /> <br /> <br />ID: [redacted] <br />

However, retrieving the same email with format: 'raw', using the code below, gives a different result:

const gmailMessageData = await gmailService.users.messages.get({
    userId: 'me', id: message.id, format: 'raw'
});

const rawData = gmailMessageData.data.raw;

const decodedData = Buffer.from(rawData, 'base64url').toString('utf-8');

Here the value of decodedData is (focusing only on the body, and disregarding the formatting and replacing that still needs to take place):

This email is sent as a courtesy from [redacted].=0D<br />=0APlease rep= ly to: [redacted].=0D<br />=0AEmail sent on behalf = of [redacted]=0D<br />=0A=0D<br />=0AApplication: [redacted]= =0D<br />=0A=0D<br />=0ARegistration Type: [redacted] =0D<br />=0A= Retrieval Action: [redacted] =0D<br />=0A=0D<b= r />=0A=0D<br />=0AGuest ID:[redacted]=0D<br />=0A=0D<br />

The raw data body when decoded lines up with the actual data that is expected, while the full data seems to be missing characters after decoding.
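The formatting left over in the raw output looks like quoted-printable encoding: =0D and =0A escapes for CR and LF, plus soft line breaks marked by a trailing =. Below is a minimal sketch of undoing it. decodeQuotedPrintable is a hypothetical helper, not part of googleapis, and it assumes the body has already been separated from the headers and contains only ASCII characters.

```javascript
// Hypothetical helper: decode a quoted-printable body into a UTF-8 string.
// Assumes ASCII-only input that has already been split from the headers.
function decodeQuotedPrintable(input) {
    // A soft line break is "=" immediately followed by a line ending; drop it.
    const unfolded = input.replace(/=\r?\n/g, '');
    const bytes = [];
    for (let i = 0; i < unfolded.length; i++) {
        const hex = unfolded.slice(i + 1, i + 3);
        if (unfolded[i] === '=' && /^[0-9A-Fa-f]{2}$/.test(hex)) {
            // "=XX" is one byte written as two hex digits.
            bytes.push(parseInt(hex, 16));
            i += 2;
        } else {
            bytes.push(unfolded.charCodeAt(i));
        }
    }
    return Buffer.from(bytes).toString('utf-8');
}

const sampleBody = 'Please rep=\r\nly to: [redacted].=0D<br />=0A';
console.log(JSON.stringify(decodeQuotedPrintable(sampleBody)));
```

Collecting bytes first and converting once at the end keeps multi-byte UTF-8 sequences (for example =C3=A9) intact.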

I have tried several libraries as listed below, but still have been unable to resolve the issue of missing characters:

  • base64url
  • js-base64
  • urlsafe-base64

I have also tried passing 'base64url' directly to Buffer.from in the full snippet, skipping the replace() calls, still to no avail.

I have tested this code on the following platforms, and all produce the same result, so I don't believe it is system related:

  • Windows 10 20H2 - Node 16
  • macOS 12.0.1 - Node 17
  • CentOS 8 - Node 17
  • Ubuntu 20.04 - Node 17
  • Node-17-alpine Docker container

This also doesn't seem to be related to how the variable is output: I have sent it to a web browser using express, printed it to the console, and written it to a file, and all three produce the same result.
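One thing that may be worth ruling out: the raw output above shows a carriage return (=0D) before each <br />, and a viewer that honours a bare carriage return moves back to the start of the line, so the following six characters would overwrite the first six. The sketch below prints the string with control characters escaped and simulates that overwrite. Both function names are hypothetical, and the sample text is made up to resemble the message.

```javascript
// Hypothetical helper: show control characters as visible escapes (\r, \n).
function showControlChars(text) {
    return JSON.stringify(text);
}

// Hypothetical helper: simulate a display that treats "\r" as
// "return to column 0 and overwrite", the way many terminals do.
function renderLikeTerminal(text) {
    return text.split('\n').map((line) => {
        const cells = [];
        let col = 0;
        for (const ch of line) {
            if (ch === '\r') {
                col = 0; // carriage return: back to the start of the line
                continue;
            }
            cells[col++] = ch; // overwrite whatever was in this column
        }
        return cells.join('');
    }).join('\n');
}

const sample = 'This email is sent.\r<br />\nPlease reply.\r<br />\n';
console.log(showControlChars(sample));
console.log(renderLikeTerminal(sample));
```

If the escaped form still contains the full text, the characters are present in the string and only the display is hiding them.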

I have no clue what else to try at this point.

Edit (follow-up): After removing the replace() calls from the full snippet as suggested, and decoding directly as base64 and also as base64url, I still see the same issue on all of the systems listed above.

Answer from kevintechie:

Your replace()s are causing the problem. You only need to base64 decode.
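A minimal sketch of decoding without the replace() calls. decodeBodyData is an illustrative name, not part of googleapis. According to the Node.js Buffer documentation, the 'base64' decoder also accepts the URL-safe alphabet (- and _), so in Node the replace() calls should be redundant rather than harmful. If characters are still missing after removing them, the cause probably lies elsewhere.

```javascript
// Illustrative helper: decode a Gmail body.data value (base64url) to text.
function decodeBodyData(data) {
    // No replace() needed: Node's 'base64' decoder accepts both the
    // standard alphabet (+ and /) and the URL-safe one (- and _).
    return Buffer.from(data, 'base64').toString('utf-8');
}

// The same bytes written in the URL-safe and the standard alphabet.
const urlSafe = 'fn5-Pz8_';
const standard = 'fn5+Pz8/';

console.log(decodeBodyData(urlSafe) === decodeBodyData(standard)); // true
```

On Node versions that support it (14.18 / 15.14 and later), passing 'base64url' to Buffer.from is an equivalent and more explicit choice.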