I am attempting to insert ~57,000 entities in my database, but the insert method takes longer and longer as the loop progresses. I have implemented batches of 25, each time flushing, clearing, and (I'm pretty sure) closing the transaction, without success. Is there something else I need to be doing in the code below to maintain the insert rate? I feel like it should not take 4+ hours to insert 57K records.
[Migrate.java]
This is the main class that loops through 'Xaction' entities and adds 'XactionParticipant' records based on each Xaction.
// Use a Hibernate cursor to efficiently loop through all Xaction entities
String hql = "select xaction from Xaction xaction";
Query<Xaction> query = session.createQuery(hql, Xaction.class);
query.setFetchSize(100);
query.setReadOnly(true);
query.setLockMode("xaction", LockMode.NONE);
ScrollableResults results = query.scroll(ScrollMode.FORWARD_ONLY);

int count = 0;
Instant lap = Instant.now();
List<Xaction> xactionsBatch = new ArrayList<>();
while (results.next()) {
    count++;
    Xaction xaction = (Xaction) results.get(0);
    xactionsBatch.add(xaction);
    // Save new XactionParticipants in batches of 25
    if (count % 25 == 0) {
        xactionParticipantService.commitBatch(xactionsBatch);
        float rate = ChronoUnit.MILLIS.between(lap, Instant.now()) / 25f / 1000;
        System.out.printf("Batch rate: %.4fs per xaction\n", rate);
        xactionsBatch = new ArrayList<>();
        lap = Instant.now();
    }
}
xactionParticipantService.commitBatch(xactionsBatch);
results.close();
[XactionParticipantService.java]
This service provides a method with "REQUIRES_NEW" propagation in an attempt to commit and close the transaction after each batch.
@Transactional(propagation = Propagation.REQUIRES_NEW)
public void commitBatch(List<Xaction> xactionBatch) {
    for (Xaction xaction : xactionBatch) {
        try {
            XactionParticipant xp = new XactionParticipant();
            // ... create xp based off Xaction info ...

            // Use native query for efficiency
            String nativeQueryStr = "INSERT INTO XactionParticipant .... xp info/data";
            Query q = em.createNativeQuery(nativeQueryStr);
            q.executeUpdate();
        } catch (Exception e) {
            log.error("Unable to update", e);
        }
    }
    // Clear just in case??
    em.flush();
    em.clear();
}
It is not clear what the root cause of your performance problem is: Java memory consumption or DB performance. Please check some thoughts below:
- Since you are retrieving fully functional entities, those entities get stored in the persistence context (the session-level cache). To free memory up, you need to detach each entity once it has been processed (i.e. after xactionsBatch.add(xaction) or // ... create xp based off Xaction info ..., for example via em.detach(xaction) or session.evict(xaction)); otherwise, at the end of processing you consume the same amount of memory as if you had done List<Xaction> results = query.getResultList();. Here I'm not sure which is better: consuming all the required memory at the start of the transaction and releasing all other resources, or keeping the cursor and JDBC connection open for 4 hours.
- Yes, in general JDBC should be faster than the JPA API. However, that is not your case: you are inserting records one by one instead of using batch inserts. To take advantage of batches, your code should look like the sketch below:
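A minimal sketch of a JDBC batch insert, assuming the insert can be expressed as one parameterized statement and that the Hibernate Session can be unwrapped from the EntityManager. The column names and Xaction getters are illustrative placeholders, not your real schema:

// Imports assumed: org.hibernate.Session, java.sql.PreparedStatement
Session session = em.unwrap(Session.class);
session.doWork(connection -> {
    // Hypothetical columns; adapt to the real XactionParticipant table
    String sql = "INSERT INTO XactionParticipant (xaction_id, name) VALUES (?, ?)";
    try (PreparedStatement ps = connection.prepareStatement(sql)) {
        for (Xaction xaction : xactionBatch) {
            ps.setLong(1, xaction.getId());     // assumed getter
            ps.setString(2, xaction.getName()); // assumed getter
            ps.addBatch();                      // queue the row locally
        }
        ps.executeBatch(); // one round trip for the whole batch
    }
});

Depending on the driver, a connection flag such as MySQL's rewriteBatchedStatements=true can further collapse the batch into a single multi-row INSERT.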
BTW, Hibernate may do the same if hibernate.jdbc.batch_size is set to a large enough positive integer and the entities are properly designed (id generation is backed by a DB sequence and allocationSize is large enough).
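For illustration, a sketch of that Hibernate-side setup; the property names are standard Hibernate settings, while the entity mapping and sequence name are made up:

// In application.properties / persistence.xml:
//   hibernate.jdbc.batch_size=50
//   hibernate.order_inserts=true

import javax.persistence.*; // or jakarta.persistence.* on newer stacks

@Entity
public class XactionParticipant {

    // A sequence-backed id with a large allocationSize lets Hibernate
    // assign ids in memory (one DB round trip per 50 ids), which is what
    // allows it to batch the INSERT statements themselves.
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "xp_seq")
    @SequenceGenerator(name = "xp_seq", sequenceName = "xp_seq", allocationSize = 50)
    private Long id;

    // ... other fields ...
}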