What is the best practice for storing 'large' data, represented by a List in Java, in a database?


I'm considering three variants:

  1. Use '@OneToMany' to store the data in a separate table.
  2. Serialize the data and store it in the parent table.
  3. Store the data as files (naming convention? same as the id?).

To be more specific, these are the 'large' data entities:

class SingleSleeper{

    private Double startPositionOnLeft;
    private Double endPositionOnLeft;
    private Double startPositionOnRight;
    private Double endPositionOnRight;
....
}

class RutEntry{

    private Double width;
    private Double position;
...
}

There are about 50 instances of SingleSleeper and about 25000 instances of RutEntry in one parent instance. About 40 parent instances are generated every day. I'm using EclipseLink (JPA 2.1) with Derby.

Addition

Above all, I'm interested in the best readability on the Java side. But I'm afraid that database speed will drop significantly if I store too much data in the database. The overwhelming majority of queries will select all instances of SingleSleeper or RutEntry belonging to a particular parent entity. I don't need to support different database types, but I can move to another database if needed.


Best answer

I would not choose any of your variants.

I would add a @ManyToOne to the child entities (in a sense the inverse of your first variant):

public class SingleSleeper {
   @ManyToOne(optional = false, fetch = FetchType.LAZY)
   private ParentEntity parent;

   ...
}

public class RutEntry {
   @ManyToOne(optional = false, fetch = FetchType.LAZY)
   private ParentEntity parent;

}

This gives you a mapping without ever loading all 25000 entities of a parent object when you don't need them (and with lazy fetching you don't even need to load the parent entity when you load a child).

If you really want to, you can also create a @OneToMany in the parent object with a mappedBy link, for example because you always need all child objects in the parent entity:

class ParentEntity {
    @OneToMany(mappedBy = "parent", fetch = FetchType.LAZY)
    Collection<SingleSleeper> singleSleepers;

    @OneToMany(mappedBy = "parent", fetch = FetchType.LAZY)
    Collection<RutEntry> rutEntries;
}

I don't know how EclipseLink behaves here. With Hibernate you need at least an additional @BatchSize annotation so that it loads as many child entities as possible at once. Hibernate cannot fetch both collections together with the parent instance (e.g. by declaring both as FetchType.EAGER), because only one of them may be fetched eagerly; otherwise the result set of the corresponding SQL SELECT statement would contain 25000 * 50 = 1,250,000 rows.

The best way to load all child entities of a parent entity is to load them separately, using either JPQL (easier to read, faster to write) or the Criteria API (typesafe, but you need a metamodel):
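To make that row count concrete, here is a minimal sketch of the arithmetic in plain Java. The class and method names (FetchRowEstimate, joinedRows, separateRows) are hypothetical and not part of JPA, and the sketch assumes the single statement joins both child tables to the parent with outer joins:

```java
// Hypothetical helper, not part of JPA: estimates how many rows come back
// for ONE parent, depending on how the two child collections are fetched.
final class FetchRowEstimate {

    // One SELECT that outer-joins both child tables to the parent:
    // every SingleSleeper row is paired with every RutEntry row.
    // An empty collection still contributes one (null-padded) row.
    static long joinedRows(long sleepers, long rutEntries) {
        return Math.multiplyExact(Math.max(1, sleepers), Math.max(1, rutEntries));
    }

    // Two separate SELECTs, one per child table: the row counts simply add up.
    static long separateRows(long sleepers, long rutEntries) {
        return Math.addExact(sleepers, rutEntries);
    }

    public static void main(String[] args) {
        long sleepers = 50;        // SingleSleeper instances per parent (from the question)
        long rutEntries = 25_000;  // RutEntry instances per parent (from the question)
        System.out.println("joined:   " + joinedRows(sleepers, rutEntries));   // 1250000
        System.out.println("separate: " + separateRows(sleepers, rutEntries)); // 25050
    }
}
```

The joined variant transfers roughly 50 times as many rows as the two separate queries for the same 25050 entities.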

import java.util.List;
import javax.persistence.criteria.CriteriaBuilder;
import javax.persistence.criteria.CriteriaQuery;
import javax.persistence.criteria.Root;

ParentEntity parent = entityManager.find(ParentEntity.class, id);

// JPQL: named parameters are written as :name
List<SingleSleeper> singleSleepers = entityManager.createQuery(
   "SELECT s FROM SingleSleeper s WHERE s.parent = :parent", SingleSleeper.class
   ).setParameter("parent", parent).getResultList();

// Or Criteria API (SingleSleeper_ is the generated static metamodel class):
CriteriaBuilder criteriaBuilder = entityManager.getCriteriaBuilder();
CriteriaQuery<SingleSleeper> query = criteriaBuilder.createQuery(SingleSleeper.class);
Root<SingleSleeper> s = query.from(SingleSleeper.class);
query.select(s).where(criteriaBuilder.equal(s.get(SingleSleeper_.parent), parent));
List<SingleSleeper> sleepersViaCriteria = entityManager.createQuery(query).getResultList();

This approach has three advantages:

  1. It is still easy to read, provided you put the loading into its own method.
  2. You are free to decide when to load the 25050 children.
  3. You can load a subset of the children as well (by calling Query.setFirstResult and Query.setMaxResults on the result of createQuery).
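As a rough illustration of the third point, the sketch below loads the children in fixed-size windows. PagedLoader and loadAll are hypothetical names, and the page source is abstracted as a function of (firstResult, maxResults) so that the example runs without a database. With JPA that function would presumably wrap something like query.setFirstResult(first).setMaxResults(max).getResultList(), and the query would need a stable ORDER BY for the windows to be deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical helper: collects all rows by requesting consecutive windows.
final class PagedLoader {

    // fetchPage receives (firstResult, maxResults) and returns at most
    // maxResults rows starting at offset firstResult.
    static <T> List<T> loadAll(BiFunction<Integer, Integer, List<T>> fetchPage, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<T> all = new ArrayList<>();
        int first = 0;
        while (true) {
            List<T> page = fetchPage.apply(first, pageSize);
            all.addAll(page);
            if (page.size() < pageSize) {
                break; // a short or empty page means there are no more rows
            }
            first += pageSize;
        }
        return all;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the 25000 RutEntry rows of one parent.
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < 25_000; i++) {
            table.add(i);
        }
        List<Integer> loaded = loadAll(
            (first, max) -> table.subList(
                Math.min(first, table.size()),
                Math.min(first + max, table.size())),
            1_000);
        System.out.println(loaded.size()); // 25000
    }
}
```

Processing each page inside the loop instead of accumulating everything would keep memory usage bounded, which is the main reason to page in the first place.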