How do batch processing systems deal with a large number of objects?


I have a question that I'll try to explain as clearly as I can.

Batch processing frameworks such as Spring Batch deal with a very large number of objects.

They have to process the objects part by part so that they don't run into a Java heap space error.

What do these kinds of frameworks or systems do with the garbage they produce?

Do they call System.gc() from time to time, or do they handle it in some other way?


There is 1 answer below.


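The JVM runs garbage collection before it throws an OutOfMemoryError, so you won't run out of memory merely because allocation is outpacing cleanup. A heap-space OutOfMemoryError generally means that the objects still being referenced do not fit in the heap, not that garbage was left uncollected.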

The question is too broad to cover what frameworks might or might not do, but in general it's good practice to hold references to objects for as small a scope as possible. That means adopting a "stream" approach to processing, where all operations on an object (e.g. read from the input stream, process, write to the output) happen before the next object is processed.

This is in contrast to reading in all objects, processing all of them, and then writing all the output, which keeps every object in memory at once and therefore requires far more memory.
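To make the contrast concrete, here is a minimal sketch in plain Java. It is not how Spring Batch or any particular framework is implemented. The class name `StreamingBatchSketch`, the `process` transformation, and the line-per-record input format are all assumptions made up for illustration. `processStreaming` references only one record at a time, so each record becomes garbage as soon as it is written, while `processAllAtOnce` keeps every record and every result reachable until the end.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class StreamingBatchSketch {

    // Hypothetical per-record work; a real job would do something domain-specific.
    static String process(String record) {
        return record.trim().toUpperCase(Locale.ROOT);
    }

    // Stream approach: read one record, process it, write it, then move on.
    // Only the current record is referenced, so earlier ones can be collected.
    static int processStreaming(BufferedReader in, Writer out) {
        try {
            int count = 0;
            String record;
            while ((record = in.readLine()) != null) {
                out.write(process(record));
                out.write('\n');
                count++;
            }
            out.flush();
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Load-all approach, shown only for contrast: both lists stay reachable
    // until the method returns, so memory use grows with the input size.
    static int processAllAtOnce(BufferedReader in, Writer out) {
        try {
            List<String> records = new ArrayList<>();
            String line;
            while ((line = in.readLine()) != null) {
                records.add(line);
            }
            List<String> results = new ArrayList<>(records.size());
            for (String record : records) {
                results.add(process(record));
            }
            for (String result : results) {
                out.write(result);
                out.write('\n');
            }
            out.flush();
            return results.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringWriter out = new StringWriter();
        BufferedReader in = new BufferedReader(new StringReader("a\nb\nc\n"));
        int written = processStreaming(in, out);
        System.out.println(written + " records written");
    }
}
```

Both methods produce the same output. The difference is only in how long references are held, which is what determines whether the garbage collector can reclaim records while the job is still running.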