The application reads many different files from AWS S3 and then sends them to various recipients.
Issues:
- Constantly growing number of live threads (it grows until 1030-1040 threads and then stops at that limit; almost all of them are AWS S3 threads in the "parked" state).
- Constantly growing usage of the "old" generation in the heap; after garbage collection almost none of it is freed.
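To watch the thread growth over time without a full thread dump, I use a small JDK-only sketch (the class name and method are mine, not from the application) that counts live threads and the WAITING ones, which is the state a parked thread reports:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDiag {

    // Counts live threads currently in the WAITING state
    // (the state that "parked" threads report via ThreadMXBean).
    public static long waitingThreadCount() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long waiting = 0;
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadState() == Thread.State.WAITING) {
                waiting++;
            }
        }
        return waiting;
    }

    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        System.out.println("live threads:   " + mx.getThreadCount());
        System.out.println("waiting/parked: " + waitingThreadCount());
    }
}
```

Running this periodically (or exposing it via an actuator endpoint) shows the count climbing toward the 1030-1040 plateau.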
To load files I use pollEnrich with a consumer endpoint of the AWS2-S3 component.
The application uses:
- Spring Boot 2.4.6
- Apache Camel 3.10.0
- Java 11
Route:
from("direct:loadFile")
    .routeId("LoadFileRoute")
    .pollEnrich()
        .method(amazonS3Service, "generateConsumerEndpointUrlForLoadingFile")
        .timeout(60000L)
        .cacheSize(-1)
    .threads().executorService(pollEnrichThreadPool)
    .end().id("loadFile");
Endpoint creation:
public EndpointConsumerBuilder generateConsumerEndpointUrlForLoadingFile(
        @ExchangeProperty(BUCKET) String bucket,
        @ExchangeProperty(FILEPATH) String filepath) {
    return aws2S3(bucket)
            .fileName(filepath)
            .deleteAfterRead(false)
            .includeBody(true)
            .amazonS3Client(amazonS3Client)
            .scheduledExecutorService(s3EndpointThreadPool)
            .advanced()
            .autocloseBody(false);
}
Additionally, I've created two different thread pools to test different cases:
@Bean("S3EndpointThreadPool")
public ScheduledExecutorService scheduledExecutorService(CamelContext context) throws Exception {
    return context.getExecutorServiceManager()
            .newScheduledThreadPool(this, "S3EndpointThreadPool1", 10);
}

@Bean("PollEnrichThreadPool")
public ExecutorService executorService(CamelContext context) throws Exception {
    return new ThreadPoolBuilder(context)
            .poolSize(10)
            .maxPoolSize(10)
            .maxQueueSize(Integer.MAX_VALUE)
            .build("PollEnrichThreadPool2");
}
Cases that I've tried, but none of them affected memory usage in any way:
- Endpoint without explicitly specified includeBody and autocloseBody params (default values).
- Endpoint with includeBody = true and autocloseBody = false.
- Endpoint with includeBody = false and autocloseBody = true.
- Endpoint: added a ScheduledExecutorService via .scheduledExecutorService(s3EndpointThreadPool) (10 threads).
- PollEnrich EIP: set the thread pool via .threads().executorService(pollEnrichThreadPool) (10 threads).
- PollEnrich EIP: disabled caching for URI producers/consumers (.cacheSize(-1)).
pollEnrich uses a dynamic URI for loading files, so essentially all URIs are unique, and in the main scenario each file is usually read only once.
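My current suspicion (an assumption on my part, not something I've confirmed in the Camel source) is that because every URI is unique, each pollEnrich call ends up with its own consumer and scheduler that are never shut down. The JDK-only sketch below (all names mine) shows how that pattern alone produces exactly these symptoms: non-daemon scheduler threads that stay parked until shutdown() is called, plus their retained heap:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SchedulerLeakDemo {

    // Creates n scheduled executors, starts one worker in each, and never
    // shuts them down; returns how many extra live threads that left behind.
    public static int leakedThreadGrowth(int n) throws Exception {
        List<ScheduledExecutorService> leaked = new ArrayList<>();
        int before = Thread.activeCount();
        for (int i = 0; i < n; i++) {
            ScheduledExecutorService s = Executors.newSingleThreadScheduledExecutor();
            s.schedule(() -> { }, 1, TimeUnit.MILLISECONDS); // spins up the worker thread
            leaked.add(s); // no shutdown() -> the worker thread stays parked
        }
        Thread.sleep(200); // let all workers start and go idle
        int growth = Thread.activeCount() - before;
        leaked.forEach(ScheduledExecutorService::shutdown); // cleanup for the demo only
        return growth;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("leaked threads: " + leakedThreadGrowth(50));
    }
}
```

If this hypothesis is right, the fix would be to stop or release each dynamically created consumer after the enrich completes, rather than tuning includeBody/autocloseBody or the pool sizes.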
What am I missing here? Do you have any ideas on how to solve these issues?