How to track Backblaze pre-signed file upload references consistently in PostgreSQL


I have run into a problem when storing files in Backblaze using pre-signed URLs and tracking the uploaded file references in PostgreSQL. The main issue is that, unlike S3 or MinIO, Backblaze doesn't emit an event when a pre-signed upload completes. How can I find out whether a file has been uploaded successfully and mark it as uploaded in PostgreSQL? Could someone give me a high-level idea of how to implement this? For context, I am using Backblaze's S3-compatible API.

Thanks in advance for the help.

The existing system depends on the client sending an upload-complete acknowledgement. In some cases this has failed, leaving dangling files in Backblaze and orphaned records in PostgreSQL.

There is 1 best solution below


This is a fairly common eventual-consistency problem, and it is best solved with cleanup processes that run after the fact.

There are two different problems here:

  1. Dangling files in Backblaze: This is probably most easily fixed with a reconciliation process that deletes files that have no corresponding record in the DB. Depending on the urgency, you can run it as often as you like.
  2. Orphaned records in PostgreSQL: You can take a two-phase (commit-style) approach here. You are probably stuck with the client-based callback to confirm the file, but if the client makes two calls to the server (one before the upload, which creates the row, and one after it), you can add a "confirmation" field to the row and set it in the second call once the upload has completed. Your cleanup process can then easily delete rows that were never confirmed. If the callback is unreliable, the cleanup process can always check the bucket to see whether the file actually arrived before deleting the row.
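The two cleanup rules above can be sketched as one planning function. This is only a minimal sketch under stated assumptions: the `UploadRow` fields, the `Plan` type, the `plan_reconciliation` name and the one-hour grace period are all hypothetical and not part of any Backblaze or PostgreSQL API. It takes the rows and the bucket listing as plain Python values. In practice the listing would come from the S3-compatible list-objects call and the rows from your own table.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class UploadRow:
    """One row of a hypothetical `uploads` table."""
    key: str              # object key the pre-signed URL was issued for
    confirmed: bool       # set to True by the client's "upload complete" call
    created_at: datetime  # when the row was created / the URL was issued


@dataclass
class Plan:
    confirm_rows: list    # keys of rows to mark confirmed (file exists, callback lost)
    delete_rows: list     # keys of rows to delete (file never arrived)
    delete_objects: list  # keys of bucket objects to delete (no matching row)


def plan_reconciliation(rows, bucket_objects, now, grace=timedelta(hours=1)):
    """Compare DB rows with a bucket listing and decide what to clean up.

    rows:           iterable of UploadRow
    bucket_objects: dict mapping object key -> last-modified datetime
    grace:          how long unmatched items are left alone, so that uploads
                    still in flight are not touched (assumed value)
    """
    cutoff = now - grace
    plan = Plan([], [], [])
    known_keys = set()

    for row in rows:
        known_keys.add(row.key)
        if row.confirmed:
            continue
        if row.key in bucket_objects:
            # The file is there, so the confirmation call was simply lost.
            plan.confirm_rows.append(row.key)
        elif row.created_at < cutoff:
            # Never confirmed, no file, and old enough to give up on.
            plan.delete_rows.append(row.key)

    for key, last_modified in bucket_objects.items():
        if key not in known_keys and last_modified < cutoff:
            # Dangling file: nothing in the DB refers to it.
            plan.delete_objects.append(key)

    return plan
```

Keeping the decision logic free of I/O makes it easy to test. A scheduled job would fetch the two inputs, call this function, and then apply the resulting plan to the bucket and the table.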

If you need true transactional consistency, the process becomes more expensive: you have to open a database transaction server-side and keep it open while you wait for the file upload to complete. The tricky part is reconnecting to that transaction from what will almost certainly be a second client-side call. Tying two different requests to the same connection/transaction is possible, but quite difficult. So if the cleanup processes above can't give you what you need, I would recommend moving away from pre-signed URLs and uploading the file server-side, so that the upload happens inside a database transaction.
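A rough sketch of the server-side variant, where the upload happens inside the database transaction, might look like the following. To keep it runnable with only the standard library it uses `sqlite3` as a stand-in for PostgreSQL. With a PostgreSQL driver the shape would be the same, but the placeholder syntax would differ. The `store_file` name, the `uploads` table and the `upload` callable are hypothetical. The callable stands for whatever code sends the bytes to the bucket and raises on failure.

```python
import sqlite3


def store_file(conn, key, data, upload):
    """Insert the reference row and upload the file as one unit of work.

    conn:   open DB-API connection (sqlite3 here, standing in for PostgreSQL)
    upload: hypothetical callable(key, data) that raises if the upload fails
    """
    try:
        # The INSERT implicitly opens a transaction on this connection.
        conn.execute("INSERT INTO uploads (key) VALUES (?)", (key,))
        # The network call runs while the transaction is still open.
        upload(key, data)
        # The row is committed only after the upload has succeeded.
        conn.commit()
    except Exception:
        # The insert or the upload failed: undo the row so nothing is orphaned.
        conn.rollback()
        raise
```

Note that if the commit itself fails after a successful upload, the file is still left behind, so a reconciliation job remains a useful backstop even with this design.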