How to render large number of rows (order of 50k) in BIRT excel reports?

1.5k Views Asked by At

I am using BIRT runtime 4.8.0 in a java project for generating Excel reports. The excel report has 1k columns and can have 10k to 50k rows (result of 1 query, maps to 1 table). I am using spudsoft ExcelEmitter for rendering the static excel reports.

Data source: Impala jdbc connection, using 1 dataset with 1 query

The issue is it takes 6 to 7GB of heap space (java) just to render 10k rows in this report, so as to load everything in memory and then write it to file.

Is there any way to reduce the memory footprints (predictable heap space usage preferably under 3GB) while rendering the excel sheets (options like pagination of query results, rendering file in parts, etc)?

3

There are 3 best solutions below

1
On BEST ANSWER

I was able to generate excel for large dataset (order of 50k rows and 1k columns) by directly using Apache POI Streaming APIs. Aspose APIs are another good tool for doing this.

Using POI streaming APIs you can render excel with order of 50k rows, 1k columns in about a minute or two under 2GB of peak RAM usage.

So if you extend the Spudsoft excel emitter to use the POI streaming APIs then it can be handled using BIRT as well.

0
On

I solved it with a new version of the spudsoft emitter. That changes the apache poi from xssf to sxssf:

ExcelEmitter.ExtractMode Experimental feature! When set to true, the emitter should run faster for XLSX files, but with a limited feature set:

  • Images will be omitted.
  • Merged cells are not allowed.
  • Structure header and footer are not supported. See ExcelEmitter.StructuredHeader.

https://www.eclipse.org/forums/index.php/m/1804253/#msg_1804253

0
On

Use below code to set the limit to 15K. This resolved my problem.

reportContext.getAppContext().put("MAX_PAGE_BREAK_INTERVAL", 15000);