We are about to start implementing a data load service that will consume CSV files from an AWS S3 bucket and load them into a database. Multiple instances of the loader will be running at the same time.
Unlike the Camel File component, the Camel AWS S3 component does not seem to have a locking mechanism to prevent a file from being consumed by multiple instances of the loader service at the same time. At least, I could not spot one.
Since locking is not available out of the box, we defined a Camel route that lists all objects in our S3 bucket and inserts the file names into a database table. Then, using SELECT file_name FROM files_table LIMIT 1 FOR UPDATE SKIP LOCKED, we found an easy way to ensure that only one service instance downloads a given file.
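To make the claiming step concrete, here is a minimal plain-JDBC sketch of how one loader instance could pick the next file. The class and method names (FileClaimer, claimNextFile) are made up for illustration, and it assumes a database that supports this LIMIT ... FOR UPDATE SKIP LOCKED syntax and a connection with auto-commit turned off.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Optional;

// Hypothetical helper, not part of Camel: claims one file name per transaction.
final class FileClaimer {

    // Same statement as in the text. SKIP LOCKED makes concurrent loaders
    // skip rows that another open transaction has already locked.
    static final String CLAIM_SQL =
            "SELECT file_name FROM files_table LIMIT 1 FOR UPDATE SKIP LOCKED";

    // Returns the claimed file name, or empty if no unlocked row is available.
    // The row lock is held only until the caller commits or rolls back.
    static Optional<String> claimNextFile(Connection connection) throws SQLException {
        if (connection.getAutoCommit()) {
            // With auto-commit on, the lock would be released right after the query.
            throw new IllegalStateException("auto-commit must be disabled to hold the row lock");
        }
        try (PreparedStatement statement = connection.prepareStatement(CLAIM_SQL);
             ResultSet rows = statement.executeQuery()) {
            return rows.next()
                    ? Optional.of(rows.getString("file_name"))
                    : Optional.empty();
        }
    }
}
```

The caller would download and load the file inside the same transaction and delete (or mark) the row before committing. Otherwise the row becomes visible to the other instances again once the lock is released.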
The issue is that, as far as I understand the Camel AWS S3 documentation, to consume a single file from an S3 bucket you need to specify the fileName query parameter. If this value is hard-coded, you can only ever consume that one file. We would instead like to specify the file name dynamically: for example, the route endpoint specifies the bucket name and the file name is passed as a header. To me this looks like a very basic requirement, so I am wondering whether I missed something.
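To illustrate what we are after, here is a rough sketch that builds the endpoint URI per message from the file name claimed in the database, instead of hard-coding it. The helper name (S3EndpointUris) and the aws-s3 scheme are assumptions on my side, and I have not verified that the S3 component accepts a URI built this way through a dynamic endpoint.

```java
// Hypothetical helper, not part of Camel: builds an S3 endpoint URI for one file.
final class S3EndpointUris {

    // Assumption: the "aws-s3" scheme; newer Camel versions name it "aws2-s3".
    static final String SCHEME = "aws-s3";

    // Combines the bucket name with the fileName query parameter from the docs.
    // File names containing characters such as '&' or '?' are not escaped here.
    static String forFile(String bucketName, String fileName) {
        if (bucketName == null || bucketName.isEmpty()) {
            throw new IllegalArgumentException("bucketName must not be empty");
        }
        if (fileName == null || fileName.isEmpty()) {
            throw new IllegalArgumentException("fileName must not be empty");
        }
        return SCHEME + "://" + bucketName + "?fileName=" + fileName;
    }
}
```

For example, S3EndpointUris.forFile("my-bucket", "orders.csv") gives aws-s3://my-bucket?fileName=orders.csv. Ideally the bucket part would stay fixed in the route and only the file name would come from a header.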
This is the first time I am using the Apache Camel AWS S3 component, but I have used the Camel File component a lot and would have expected, if not exactly the same, then at least very similar functionality.
Thank you in advance for your input.