Why doesn't Hibernate batch inserts of fields annotated with @ElementCollection?

1.2k Views Asked by At

I have an @Entity containing a few @OneToMany relationships, but since they consist of collections of Enums, I'm using @ElementCollection. The entity has an id that gets generated at the database level (MySQL).

Here is a small example I just made up that corresponds to the structure of my entity.

@Entity
public class Student {

  @Id @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Integer id;

  @ElementCollection(targetClass = Language.class)
  @CollectionTable(name="student_languages", joinColumns=@JoinColumn(name="student_id"))
  private Set<Language> languages;

  @ElementCollection(targetClass = Module.class)
  @CollectionTable(name="student_modules", joinColumns=@JoinColumn(name="student_id"))
  private Set<Module> modules;

  @ElementCollection(targetClass = SeatPreference.class)
  @CollectionTable(name="student_seats", joinColumns=@JoinColumn(name="student_id"))
  private Set<SeatPreference> seatPreference;

[...]
}

I know that GenerationType.IDENTITY deactivates batching, but I thought that would be the case for the main entity only, not for the single properties too. I'm havin to bulk import a few entities (~20k), each with a handful of properties, but Hibernate seems to be generating one insert for each property in the sets, making the import impossibly slow (between 10 and 20 inserts for each record).

I have now spent so long trying to make this faster, that I'm considering just generating an SQL file that I can manually import in the database.

Is there no way to instruct Hibernate to batch inserts the @ElementCollection fields? Am I doing something wrong?

2

There are 2 best solutions below

6
On

Basically, seem hibernate will not help with @ElementCollection batching but you can use the SQL bulk inserts. Seems you are on MySQL which does support the bulk inserts and its JDBC driver can automatically modify / rewrite the individual insert statements into one bulk statement if you enable the rewriteBatchedStatements property.

So in your case what you need to do is tell hibernate to enable batching and order the batch inserts and updates.

hibernate.jdbc.batch_size=100
hibernate.order_inserts=true
hibernate.order_updates=true

This will ensure that when inserting the data into DB the inserts statements generated by Hibernate will be executed in a batch and they will be ordered.

So the SQL generated by Hibernate will be something like this:

insert into student_languages (student_id, languages) values (1,1)
insert into student_languages (student_id, languages) values (1,2)
insert into student_languages (student_id, languages) values (1,3)
insert into student_languages (student_id, languages) values (1,4)

Next, you will need to tell the JDBC driver to rewrite the individual inserts into the bulk insert by setting the rewriteBatchedStatements=true

jdbc:mysql://db:3306/stack?useSSL=false&rewriteBatchedStatements=true

So this will instruct the driver to rewrite the inserts into bulk form, so the above several SQL statements will be rewritten into something like this

insert into student_languages (student_id, languages) values (1,1),(1,2),(1,3),(1,4)

Just as an info this may not work if you are using old versions of the MySQL driver and Hibernate.

0
On

I tested this both with MySQL and MariaDB and actually Hibernate does batch inserts into the collection table. But it's not visible to the naked eye, you have to use DataSource-Proxy to see it:

INFO  com.example.jpa.AddStudents - Adding students
DEBUG n.t.d.l.l.SLF4JQueryLoggingListener - 
Name:dataSource, Connection:3, Time:1, Success:True
Type:Prepared, Batch:False, QuerySize:1, BatchSize:0
Query:["insert into student (name) values (?)"]
Params:[(Smith)]
DEBUG n.t.d.l.l.SLF4JQueryLoggingListener - 
Name:dataSource, Connection:3, Time:1, Success:True
Type:Prepared, Batch:False, QuerySize:1, BatchSize:0
Query:["insert into student (name) values (?)"]
Params:[(Snow)]
DEBUG n.t.d.l.l.SLF4JQueryLoggingListener - 
Name:dataSource, Connection:3, Time:78, Success:True
Type:Prepared, Batch:True, QuerySize:1, BatchSize:6
Query:["insert into student_languages (student_id, language) values (?, ?)"]
Params:[(6,2),(6,0),(6,1),(7,0),(7,4),(7,3)]
INFO  com.example.jpa.AddStudents - Added

The SEQUENCE ID generator is considered the best for Hibernate. It doesn't create lock contention as the TABLE generator does and allows for batching. It is unfortunate that MySQL doesn't support sequences still (MariaDB does).

Am I doing something wrong?

Hibernate is optimized for small-scale changes in the database. It maintains a first-level cache and also supports a second-level cache which will only hinder performance for large-scale operations. Therefore, indeed, you might be better off using JDBC or jOOQ for this particular operation as was suggested in the comments.


I used MySQL 8.0.3, MariaDB 10.5.13 and Hibernate 5.6.3.Final.