tl;dr
I am using bulkWrite to batch replaceOne operations (with upsert: true) against Azure CosmosDB with the Mongo API. The batches work correctly on an empty test collection but time out on the intended collection, which already holds ~7300 documents. Why do I get a timeout with one collection but not the other, and how can I avoid it without using extremely small batch sizes, which increase execution time?
Background
I have an Azure Function that retrieves data from an API and saves it in a database collection. The database is Azure CosmosDB with the Mongo API adapter.
The function retrieves the data from the API and builds an array of operations for bulkWrite. Every operation is a replaceOne with upsert set to true and a filter that identifies the document to replace. The operations are batched into groups of 500 and passed to bulkWrite, and each batch is sent only after the previous bulkWrite has completed.
Issue
When implementing this feature I used a new, empty collection to test the functionality, and with this Test Collection the feature worked as intended. However, after switching to the collection I intend to use in production, I get a "Request timed out" error when attempting to use bulkWrite:
Error
[2020-12-09T18:29:41.761] (node:18988) UnhandledPromiseRejectionWarning: BulkWriteError: Request timed out.
[2020-12-09T18:29:41.764] at OrderedBulkOperation.handleWriteError (C:\Users\Liam\Desktop\Work\myApp\node_modules\mongodb\lib\bulk\common.js:1257:9)
[2020-12-09T18:29:41.767] at resultHandler (C:\Users\Liam\Desktop\Work\myApp\api\node_modules\mongodb\lib\bulk\common.js:521:23)
[2020-12-09T18:29:41.768] at handler (C:\Users\Liam\Desktop\Work\myApp\api\node_modules\mongodb\lib\core\sdam\topology.js:942:24)
[2020-12-09T18:29:41.769] at C:\Users\Liam\Desktop\Work\myApp\api\node_modules\mongodb\lib\cmap\connection_pool.js:356:13
[2020-12-09T18:29:41.770] at handleOperationResult (C:\Users\Liam\Desktop\Work\myApp\api\node_modules\mongodb\lib\core\sdam\server.js:558:5)
[2020-12-09T18:29:41.771] at MessageStream.messageHandler (C:\Users\Liam\Desktop\Work\myApp\api\node_modules\mongodb\lib\cmap\connection.js:275:5)
[2020-12-09T18:29:41.772] at MessageStream.emit (events.js:315:20)
[2020-12-09T18:29:41.772] at MessageStream.EventEmitter.emit (domain.js:482:12)
[2020-12-09T18:29:41.773] at processIncomingData (C:\Users\Liam\Desktop\Work\myApp\api\node_modules\mongodb\lib\cmap\message_stream.js:144:12)
[2020-12-09T18:29:41.774] at MessageStream._write (C:\Users\Liam\Desktop\Work\myApp\api\node_modules\mongodb\lib\cmap\message_stream.js:42:5)
[2020-12-09T18:29:41.775] at doWrite (_stream_writable.js:403:12)
[2020-12-09T18:29:41.775] at writeOrBuffer (_stream_writable.js:387:5)
[2020-12-09T18:29:41.814] at MessageStream.Writable.write (_stream_writable.js:318:11)
[2020-12-09T18:29:41.856] at TLSSocket.ondata (_stream_readable.js:717:22)
[2020-12-09T18:29:41.857] at TLSSocket.emit (events.js:315:20)
[2020-12-09T18:29:41.857] at TLSSocket.EventEmitter.emit (domain.js:482:12)
Question
What could be causing this timeout error when I can upload to the other collection without problems?
In this instance I am uploading about 650 documents. The existing collection already contains about 7300 documents, while the Test Collection has none.
Code
Azure Function Body (Using azure-function-express)
app.post('/api/YahooUploadUserSelectedData', passport.authenticate('jwt', { session: false }), async (req, res) => {
    toUpload = [];
    ...
    // Getting data and pushing it to toUpload
    ...
    // "Upsert" data. Create new record if data does not exist. If data already exists, replace it
    try {
        upsertData(toUpload, collection);
    } catch (e) {
        res.status(500).send(e);
    }
    res.status(200).send("Yahoo data uploaded successfully.");
});
upsertData function
upsertData = async (data, collection) => {
    const bulkReplaceOps = data.map(item => {
        // Define filters (Unique identifiers for each object record)
        const filters = {
            YahooNFLPlayer: {
                objectType: "YahooNFLPlayer",
                league_key: item.league_key,
                player_key: item.player_key,
                guid: item.guid
            },
            YahooNFLStanding: {
                objectType: "YahooNFLStanding",
                team_key: item.team_key,
                guid: item.guid
            },
            YahooNFLDraftResult: {
                objectType: "YahooNFLDraftResult",
                league_key: item.league_key,
                pick: item.pick,
                round: item.round,
                guid: item.guid
            },
            YahooNFLTransaction: {
                objectType: "YahooNFLTransaction",
                transaction_key: item.transaction_key,
                guid: item.guid
            },
            YahooNFLTeamRoster: {
                objectType: "YahooNFLTeamRoster",
                league_key: item.league_key,
                team_key: item.team_key,
                player_key: item.player_key,
                guid: item.guid
            },
            YahooNFLTeam: {
                objectType: "YahooNFLTeam",
                guid: item.guid,
                team_key: item.team_key
            }
        }
        // Map data to array of replace operations
        return {
            replaceOne: {
                filter: filters[item.objectType], // Select filter based on type of data
                replacement: item, // Data to be uploaded
                upsert: true // Create new doc if not existing, replace otherwise
            }
        }
    });
    // Batch in groups of 500 (Best practice)
    while (bulkReplaceOps.length) {
        try {
            await collection.bulkWrite(bulkReplaceOps.splice(0, 500));
            console.log("YahooUploadUserSelectedData: Successful bulk upsert");
        } catch (e) {
            throw e;
        }
    }
}
What I've tried
- Reduce batch size
- Reducing the batch size avoids the timeout. However, each individual batch takes a long time to complete, which increases the overall execution time. This is bad because it may exceed the 5-minute run time of the Azure Function and it hampers the user experience. (A sketch of this variant follows this list.)
- Reduce filter complexity
- I was concerned that the number of fields in my upsert filters (up to five for some data types) was too high, possibly adding overhead to each query. As a test I used only one field in each filter, but I still encountered the timeout.
- Test the query on the Test Collection after adding data
- I thought the discrepancy might be because the collection I want to use in production already contains data (~7300 documents), while the Test Collection started empty; the upsert query would then take longer because it must scan the existing data to find the matching documents. As a test I copied the data from my DataSources collection into the Test Collection using Studio 3T. I was still able to bulkWrite the data to the Test Collection without encountering a timeout.
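For reference, here is a minimal sketch of the reduced-batch-size variant from the first bullet; the wrapper name writeInSmallBatches and the BATCH_SIZE value are placeholders for illustration, and the loop itself is the same one used in upsertData above.
const BATCH_SIZE = 50; // Placeholder value; the point is only that it is much smaller than 500

// Same batching loop as in upsertData, just with the smaller batch size.
// bulkReplaceOps and collection are the same values used in upsertData.
const writeInSmallBatches = async (bulkReplaceOps, collection) => {
    while (bulkReplaceOps.length) {
        // Each smaller batch completes without timing out, but the extra round trips
        // increase the total execution time of the Azure Function
        await collection.bulkWrite(bulkReplaceOps.splice(0, BATCH_SIZE));
    }
};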
Any help or insight into this issue is greatly appreciated.