I am new to Spring Batch. I recently wrote a batch job that reads records from a file and inserts them into MariaDB, but inserting 10k records takes 2 min 30 sec, which I know is far too long. The table has only 3 columns and no keys.
Here is my job XML:
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/batch
http://www.springframework.org/schema/batch/spring-batch-2.2.xsd
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-3.2.xsd
">
<import resource="../../context.xml" />
<import resource="../../database.xml" />
<bean id="itemProcessor" class="com.my.sbatch.processors.CustomItemProcessor" />
<batch:job id="file_to_db">
<batch:step id="step1">
<batch:tasklet transaction-manager="transactionManager" start-limit="100">
<batch:chunk reader="cvsFileItemReader"
writer="databaseItemWriter" commit-interval="10">
</batch:chunk>
</batch:tasklet>
</batch:step>
</batch:job>
<bean id="multiResourceReader"
class=" org.springframework.batch.item.file.MultiResourceItemReader">
<property name="resources"
value="file:batch/csv/processing/*.csv" />
<property name="delegate" ref="cvsFileItemReader" />
</bean>
<bean id="mappingBean" class="com.my.sbatch.bean.Batch1Bean"
scope="prototype" />
<bean name="customFieldSetMapper" class="com.my.sbatch.core.CustomFieldSetMapper">
<property name="classObj" ref="mappingBean"/>
</bean>
<bean id="cvsFileItemReader" class="com.my.sbatch.customReader.CustomItemReader" scope="step">
<property name="resource" value="file:#{jobParameters['inputFile']}" />
<property name="lineMapper">
<bean class="com.my.sbatch.core.CustomLineMapper">
<property name="lineTokenizer">
<bean
class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value="#{jobParameters['delimiter']}" />
</bean>
</property>
<property name="fieldSetMapper" ref="customFieldSetMapper" />
</bean>
</property>
</bean>
<bean id="databaseItemWriter" class="org.springframework.batch.item.database.JdbcBatchItemWriter" scope="step">
<property name="dataSource" ref="dataSource" />
<property name="sql">
<value>
<![CDATA[
#{jobParameters['insert_JobQuery']}
]]>
</value>
</property>
<property name="ItemPreparedStatementSetter">
<bean class="com.my.sbatch.core.CustomPreparedStatement" />
</property>
Here is my context.xml:
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-3.2.xsd">
<!-- stored job-meta in memory -->
<!--
<bean id="jobRepository"
class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
<property name="transactionManager" ref="transactionManager" />
</bean>
-->
<!-- stored job-meta in database -->
<bean id="jobRepository"
class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
<property name="dataSource" ref="dataSource" />
<property name="transactionManager" ref="transactionManager" />
<property name="databaseType" value="mysql" />
</bean>
<bean id="transactionManager"
class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />
<bean id="jobLauncher"
class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository" />
</bean>
</beans>
In the com.my.sbatch.core.CustomFieldSetMapper and com.my.sbatch.core.CustomPreparedStatement classes I am using reflection to map fields from the file to the bean and from the bean to the database (PreparedStatement).
Can you please advise me on why this is taking so much time and how I can speed it up?
In this configuration your batch process is executed by a single thread, which may be why it takes so long. I recommend using multithreading by wiring a TaskExecutor bean into the step.
For example:
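A minimal sketch, reusing your existing step definition; the taskExecutor bean name and the throttle-limit/concurrencyLimit values are illustrative and should be tuned for your environment:

<batch:tasklet transaction-manager="transactionManager" start-limit="100"
task-executor="taskExecutor" throttle-limit="10">
<batch:chunk reader="cvsFileItemReader"
writer="databaseItemWriter" commit-interval="10">
</batch:chunk>
</batch:tasklet>

<!-- illustrative executor bean: SimpleAsyncTaskExecutor starts a new thread per task,
capped here by concurrencyLimit -->
<bean id="taskExecutor"
class="org.springframework.core.task.SimpleAsyncTaskExecutor">
<property name="concurrencyLimit" value="10" />
</bean>

For production use you may prefer a pooled executor such as ThreadPoolTaskExecutor, since SimpleAsyncTaskExecutor does not reuse threads.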
This is the simplest multithreading solution, but in a concurrent environment you may have problems with shared access to state (for example, the file reader is shared by all threads), so make sure your components are thread-safe.
I recommend that you read the Spring Batch documentation on scalability to see the different strategies (multi-threaded step, parallel steps, partitioning, remote chunking) and pick the one that fits your case.