I am having difficulty using the CachedRowSetImpl class in Java. I want to analyse the data of a huge Postgres table that contains ~35,000,000 rows and 3 integer columns.
I cannot load everything into my computer's physical memory, so I want to read the rows in batches of 100,000. When I execute the corresponding query (select col1,col2,col3 from theTable limit 100000) at the psql prompt or in a graphical interface such as pgAdmin, it takes around 4000 ms and a few megabytes of memory to load the 100,000 rows.
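For illustration, this is the kind of batched access I have in mind, sketched as a hypothetical plain-JDBC loop over LIMIT/OFFSET (table and column names match my real code below):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class BatchedRead {
    static void readInBatches(Connection conn) throws Exception {
        final int batchSize = 100000;
        PreparedStatement ps = conn.prepareStatement(
                "select pk_lib_scaf_a, pk_lib_scaf_b, similarity_evalue"
                + " from from_to_scaf limit ? offset ?");
        long offset = 0;
        while (true) {
            ps.setInt(1, batchSize);
            ps.setLong(2, offset);
            ResultSet rs = ps.executeQuery();
            int rows = 0;
            while (rs.next()) {
                rows++;
                // treatment of the current row
            }
            rs.close();
            if (rows < batchSize) {
                break; // last (partial) batch reached
            }
            offset += batchSize;
        }
        ps.close();
    }
}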
I tried to do the same operation with the following Java code:
import java.sql.ResultSet;
import javax.sql.rowset.CachedRowSet;
import com.sun.rowset.CachedRowSetImpl;

CachedRowSet rowset = new CachedRowSetImpl();
int pageSize = 100000;
rowset.setCommand("select pk_lib_scaf_a, pk_lib_scaf_b, similarity_evalue from from_to_scaf");
rowset.setPageSize(pageSize);
rowset.setReadOnly(true);
rowset.setFetchSize(pageSize);
rowset.setFetchDirection(ResultSet.FETCH_FORWARD);
rowset.execute(myConnection); // myConnection is an open java.sql.Connection
System.out.println("start !");
while (rowset.nextPage()) {
    while (rowset.next()) {
        // treatment of the current data page
    }
    rowset.release();
}
When I run the above code, the "start !" message is never displayed in the console and execution seems to be stuck on the rowset.execute() line. Moreover, memory consumption goes crazy and reaches the limit of my computer's physical memory (8 GB).
That is strange: it looks as if the program tries to fill the rowset with all ~35,000,000 rows, ignoring the pageSize configuration.
Has anybody experienced such a problem with Java JDBC and the Postgres driver? What am I missing?
Postgres 9.1, Java JDK 1.7
In fact, cursor support is implicit in the Postgres JDBC driver, as described in its documentation. However, the cursor is only created automatically when certain conditions are met: the connection must not be in autocommit mode, the statement must be created with ResultSet.TYPE_FORWARD_ONLY, the query must be a single statement, and a non-zero fetch size must be set.
http://jdbc.postgresql.org/documentation/head/query.html
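Based on those documented conditions, a plain-JDBC sketch that streams the table through a server-side cursor could look like the following (the connection URL and credentials are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CursorRead {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "password");
        // The connection must not be in autocommit mode, otherwise
        // the driver fetches the whole result set in one go.
        conn.setAutoCommit(false);
        // The statement must be TYPE_FORWARD_ONLY.
        Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        // A non-zero fetch size turns on cursor-based fetching:
        // the driver retrieves this many rows per round trip.
        stmt.setFetchSize(100000);
        // The query must be a single statement.
        ResultSet rs = stmt.executeQuery(
                "select pk_lib_scaf_a, pk_lib_scaf_b, similarity_evalue"
                + " from from_to_scaf");
        while (rs.next()) {
            // treatment of the current row; only ~100,000 rows are
            // held in memory at any time
        }
        rs.close();
        stmt.close();
        conn.close();
    }
}

With autocommit left on, the driver fetches the whole result set at once, which matches the behaviour described in the question.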