Our application is a GigaSpaces-based solution which reads from multiple flat files and stores the data in objects. The flat files contain shipment details, so we have multiple files:
- Dockyard Details
- Container details
- Shipment Details
- etc.
Now we have Dockyard as a parent object under which there can be many shipment detail objects. We currently use an ArrayList to maintain the shipment details for almost 50k Dockyard detail objects. The current data volume suggests that each Dockyard object will have to hold around 1,500 shipment detail objects, and there will be almost 50k Dockyard objects lying in the heap. Our current heap is 8 GB.
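For illustration, the structure is roughly like this (the class and field names below are simplified placeholders, not our real model):

```java
import java.util.ArrayList;
import java.util.List;

// Simplified placeholder classes; the real objects carry many more fields.
class ShipmentDetail {
    String containerId;
    double weightKg;
}

class Dockyard {
    String dockyardId;
    // ~1,500 entries per Dockyard, with ~50k Dockyard instances on the heap
    List<ShipmentDetail> shipmentDetails = new ArrayList<>();
}
```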
So I wanted to know if ArrayList is the best way to keep so many objects. I have looked at other APIs as well, like Trove and HPPC, but they mostly offer benefits for primitive collections, and ours will be a collection of objects. So, other than increasing the heap size, can someone suggest any better alternatives?
You don't need to keep all your objects on the heap. With Chronicle Map, for example, you can keep all the objects off heap, and since it uses memory-mapped files, they don't even have to be in memory. You might find you can reduce your heap size if the bulk of your data is off heap.
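As a rough sketch of the idea (the class names, sizes, and file path below are assumptions for illustration; Serializable values work, though custom marshallers are more compact):

```java
import net.openhft.chronicle.map.ChronicleMap;

import java.io.File;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Placeholder value types; Serializable so Chronicle Map can marshal them.
class ShipmentDetail implements Serializable {
    String containerId;
    double weightKg;
}

class ShipmentDetailList implements Serializable {
    final List<ShipmentDetail> details = new ArrayList<>();
}

public class OffHeapShipments {
    public static void main(String[] args) throws Exception {
        ChronicleMap<String, ShipmentDetailList> shipmentsByDockyard = ChronicleMap
                .of(String.class, ShipmentDetailList.class)
                .entries(50_000)                               // ~50k dockyards
                .averageKey("DOCKYARD-00001")                  // sample key used for sizing
                .averageValueSize(1_500 * 100)                 // guess: 1,500 details * ~100 bytes each
                .createPersistedTo(new File("shipments.dat")); // memory-mapped file, off heap

        ShipmentDetailList list = new ShipmentDetailList();
        // ... populate list.details from the flat files ...
        shipmentsByDockyard.put("DOCKYARD-00001", list);

        // get() deserializes on demand; the bulk of the data never sits on the Java heap.
        ShipmentDetailList loaded = shipmentsByDockyard.get("DOCKYARD-00001");
    }
}
```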
This is not a lot of objects. Even if each object uses 1 KB, you are only using 50 MB. If your objects are much bigger than this, it is highly likely you should look at ways to reduce the size of the individual objects.
When we use primitive-based collections, it is mostly to avoid the object header for each element. This saves 8 - 16 bytes per entry, or up to 800 KB in your case.
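To show what that difference looks like (just a sketch; `TIntArrayList` is Trove's primitive list, and the comments describe the usual layout):

```java
import java.util.ArrayList;
import java.util.List;

import gnu.trove.list.array.TIntArrayList;

public class BoxedVsPrimitive {
    public static void main(String[] args) {
        // Boxed: each element is a separate Integer object (header + value + padding),
        // plus a reference slot in the ArrayList's backing array.
        List<Integer> boxedIds = new ArrayList<>();
        boxedIds.add(42);

        // Primitive: values live directly in an int[], with no per-element header.
        TIntArrayList primitiveIds = new TIntArrayList();
        primitiveIds.add(42);
    }
}
```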
However, if your objects are 1 KB to 100 KB, as you suggest, you might be able to halve the memory they use by restructuring them or using different data types.
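For instance, something along these lines (hypothetical fields, purely to show the kind of restructuring meant):

```java
import java.util.Date;

// Before: several fields are separate heap objects, each with its own header.
class ShipmentDetailFat {
    String shipmentId;   // e.g. "SHP-000123" kept as a String
    Double weightKg;     // boxed wrapper around an 8-byte value
    Date loadedAt;       // separate Date object per shipment
}

// After: the same information held in primitives, with a much smaller footprint.
class ShipmentDetailLean {
    long shipmentId;     // numeric id parsed once while reading the flat file
    double weightKg;     // unboxed
    long loadedAtMillis; // epoch millis instead of a Date object
}
```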
BTW, 1 GB of memory is worth about an hour of your time. I would explore doubling the memory size before spending too much time on this.