apache-camel apache-camel-aws - Upload a single huge file by reading events from Kafka

I am trying to read millions of events from Kafka and write them to S3 using Camel's streaming upload mode. My requirement is that processing must not leave multiple files behind. Is there any way to get the results in a single file, or to merge the files without reading them into memory again? Here is my route:

    from("direct:AppendToFile")
        .routeId("AddToFile")
        .autoStartup(true)
        .startupOrder(1)
        .shutdownRunningTask(ShutdownRunningTask.CompleteAllTasks)
        .bean("TransformerProcess")
        .choice()
            // No bucket configured: fall back to writing a local file.
            .when(constant(config.getS3().getBucketName().isBlank()))
                .log("Bucket name is not present, routing to file mode")
                .to("bean:AppendToFile")
            // Bucket configured: stream the events to S3.
            .when(constant(!config.getS3().getBucketName().isBlank()))
                .log("Routing to S3 bucket: ${body}")
                .process(e -> {
                    // Commit the Kafka offset manually on the last record of a poll.
                    // Boolean.TRUE.equals avoids a NullPointerException if the header is absent.
                    Boolean toCommit = e.getIn().getHeader(KafkaConstants.LAST_RECORD_BEFORE_COMMIT, Boolean.class);
                    if (Boolean.TRUE.equals(toCommit)) {
                        KafkaManualCommit commit = e.getIn().getHeader(KafkaConstants.MANUAL_COMMIT, KafkaManualCommit.class);
                        commit.commit();
                    }
                })
                .to("aws2-s3://" + config.getS3().getBucketName()
                        + "?amazonS3Client=#EcsS3client&streamingUploadMode=true"
                        + "&batchMessageNumber=9000"
                        + "&streamingUploadTimeout=500000"
                        + "&namingStrategy=progressive"
                        + "&keyName=test_topic")
                .log("Body after s3 upload: ${body}");

I have tried both multipart upload and streaming upload. The problem with multipart upload is that it expects a File object on the local file system, which does not satisfy my use case, so I do not want to use it.
