I am using SwitchYard, which is a wrapper over Apache Camel. My file consumer reads from a directory where a large number of files (sometimes 2,000,000) are written. The ideal consumption speed of my consumer is 1000+ files per second, but once more than 50,000 files have been written the consumer slows down and the consumption speed drops by a factor of 5.
I have disabled the sortBy option and even enabled the shuffle option, but no luck. Here are my file binding details.
<file:binding.file name="XXXXXXXXXXXX">
    <file:additionalUriParameters>
        <file:parameter name="antInclude" value="*.xml"/>
        <file:parameter name="consumer.bridgeErrorHandler" value="true"/>
        <file:parameter name="shuffle" value="true"/>
    </file:additionalUriParameters>
    <file:directory>directory path</file:directory>
    <file:autoCreate>false</file:autoCreate>
    <file:consume>
        <file:delay>100</file:delay>
        <file:maxMessagesPerPoll>20</file:maxMessagesPerPoll>
        <file:delete>true</file:delete>
        <file:moveFailed>directory path</file:moveFailed>
        <file:readLock>markerFile</file:readLock>
    </file:consume>
</file:binding.file>
How can I make my consumer maintain the same consumption speed of 1000 files/second even when there is a large number of files in the inbound directory?
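Your configuration is telling Camel to:
- poll the directory every 100 ms (file:delay=100)
- pick up at most 20 files per poll (file:maxMessagesPerPoll=20)
So, I expect that you are getting about 200 files per second (20 files x 10 polls per second)?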
Set file:maxMessagesPerPoll to 200. Of course, the assumption is that all your downstream processing can handle that extra load.
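For illustration, here is how the consume block from the question might look with only that one value changed. This is a sketch rather than a tested configuration: every other element is copied unchanged from the question, and the throughput figure in the comment is a rough ceiling that assumes each poll finishes quickly.

```xml
<file:consume>
    <!-- delay between polls in milliseconds, unchanged from the question -->
    <file:delay>100</file:delay>
    <!-- raised from 20 to 200: roughly 200 files x 10 polls/s as an upper bound -->
    <file:maxMessagesPerPoll>200</file:maxMessagesPerPoll>
    <file:delete>true</file:delete>
    <file:moveFailed>directory path</file:moveFailed>
    <file:readLock>markerFile</file:readLock>
</file:consume>
```

Measure the actual rate after the change; if the time spent per poll grows with the batch size, the real throughput will sit below that ceiling.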
As @Conffusion commented above, you are shuffling the list of files. That likely means Camel builds a list of all the files in the directory, shuffles it, and only then hands you the number you asked for. Do you really need shuffling as part of your requirement?
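If shuffling is not actually required, one thing to try is dropping that parameter and leaving the other two as they are in the question. The fragment below is only a sketch of that change; whether it helps depends on how the file listing behaves in your Camel version, so measure before and after.

```xml
<file:additionalUriParameters>
    <file:parameter name="antInclude" value="*.xml"/>
    <file:parameter name="consumer.bridgeErrorHandler" value="true"/>
    <!-- shuffle parameter removed so the file list is not randomized on every poll -->
</file:additionalUriParameters>
```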
Essentially, experiment with each of the file parameters in turn and measure what impact it has.