In our Express server using @google-cloud/storage, downloading a file from our buckets through a readable stream sometimes throws an error, even though we subscribe to the stream's .error event (which never fires) and the call is wrapped in a try-catch. When it happens, our Express instance restarts: the pod survives, but Express itself restarts, which is very strange.
The error I get from the express logs looks like this:
TypeError: Cannot read properties of null (reading 'length')
at getStateLength (/usr/src/node_modules/stream-shift/index.js:16:28)
at shift (/usr/src/node_modules/stream-shift/index.js:6:99)
at Duplexify._forward (/usr/src/node_modules/duplexify/index.js:170:35)
at PassThrough.onreadable (/usr/src/node_modules/duplexify/index.js:136:10)
at PassThrough.emit (node:events:518:28)
at emitReadable_ (node:internal/streams/readable:832:12)
at process.processTicksAndRejections (node:internal/process/task_queues:81:21)
I get another trace that says it blew up at this point:
return state.buffer[0].length
Which seems to correspond to this part of stream-shift code: https://github.com/mafintosh/stream-shift/blob/2ea5f7dcd8ac6babb08324e6e603a3269252a2c4/index.js#L16C1-L16C34
My download code looks like this:
const { bucketName, keyFilename } = config.google.storage;
if (!bucketName) {
  throw badImplementation('config.google.storage.bucketName is undefined');
}
if (!keyFilename) {
  throw badImplementation('config.google.storage.keyFilename is undefined');
}
const storage = new Storage({
  keyFilename,
  retryOptions: { autoRetry: true, maxRetries: 1 },
});
const bucket = storage.bucket(bucketName);
const [exists] = await bucket.file(name).exists();
if (!exists) {
  const error = `CDN download, file ${name} does not exist`;
  console.log(error);
  throw notFound(error);
}
log.info(`CDN download, create read stream on ${name} begin`);
const readStream = bucket
  .file(name)
  .createReadStream()
  .on('response', (response) => {
    // Server connected and responded with the specified status and headers.
    console.log(`CDN download, stream on file ${name}, response is: ${JSON.stringify(response)}`);
  })
  .on('end', () => {
    // The file is fully downloaded.
    console.log(`CDN download, stream on file ${name}, file fully downloaded`);
  })
  .on('error', (err) => {
    // Something happened while downloading the file.
    console.log(`CDN download, stream on file ${name}, error is: ${JSON.stringify(err)}`);
  });
log.info(`CDN download, create read stream on ${name} done`);
return readStream;
I initially thought the file might not exist, but the .exists() check I added returns true, so the read stream is created.
I even get a trace from the .on('response') handler identifying the file:
{
  "headers": {
    "cache-control": "no-cache, no-store, max-age=0, must-revalidate",
    "content-disposition": "attachment",
    "content-length": "1309467",
    "content-type": "application/octet-stream",
    "date": "Wed, 24 Jan 2024 11:17:05 GMT",
    "etag": "CLSOmpWa8oMDEAE=",
    "expires": "Mon, 01 Jan 1990 00:00:00 GMT",
    "last-modified": "Tue, 23 Jan 2024 00:00:33 GMT",
    "pragma": "no-cache",
    "server": "UploadServer",
    "vary": "Origin, X-Origin",
    "x-goog-generation": "1705968033761076",
    "x-goog-hash": "crc32c=EeUAng==,md5=Duc9MjxstOaEXhEeZRphIw==",
    "x-goog-metageneration": "1",
    "x-goog-storage-class": "STANDARD",
    "x-goog-stored-content-encoding": "identity",
    "x-goog-stored-content-length": "1309467",
    "x-guploader-uploadid": "ABPtcPpJ0EZifzef-2dHFzbfURL0E_niJIylxjegZyJhjJ0kyhM8FGb7jymom35PJ4UrOcti3mp8CxNuqw"
  }
}
Could it be that, even though the client-side check says the file exists, we don't have permission to download it?
UPDATE 1: After further investigation and rolling back our Docker images, we discovered that between Jan 9 and Jan 11 our base image node:20 picked up a change that seems to be the cause: https://github.com/nodejs/docker-node/commit/ab5769dc69feb4007d9aafb03316ea0e3edb4227
This moved Node from 20.10 to 20.11, and it's the only explanation we can find for something like this happening. Is there any known issue reported?
UPDATE 2: Pinning the Docker image from node:20 to node:20.10.0 worked around the issue, so something must have been introduced in node:20.11.0 (aka latest). Could anybody from Node or Google investigate what is going on?
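The rollback described above amounts to pinning the base image to an exact tag instead of the floating node:20 alias; a minimal Dockerfile sketch:

```dockerfile
# Pin the exact Node release; the floating node:20 tag silently moved
# from 20.10 to 20.11 and pulled in the regression described above.
FROM node:20.10.0
```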
Per this GitHub comment thread, besides downgrading to Node 20.10.0, you can alternatively use a package override to pin stream-shift to version 1.0.2, which should resolve this issue. Hopefully the @google-cloud/storage package gets updated soon as well.
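Assuming npm 8.3+ as the package manager, that override can be declared in package.json with the overrides field (Yarn users would use resolutions instead):

```json
{
  "overrides": {
    "stream-shift": "1.0.2"
  }
}
```

After adding the field, delete node_modules and the lockfile and reinstall so the pinned version actually takes effect.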