I had this code which works fine when the database is small with few records, it writes the json to the file properly, but when the data is huge, it just times out
<cfloop list="t" index="k">
<cfquery name="qry">
select * from #k#
</cfquery>
<cfset js= serializeJSON(qry,'struct')>
<cffile action="write" file="#k#" output="#js#">
</cfloop>
I tried using threads but they are also not working, it just creates empty tables files with no values if i use cfthread with joins
Thought of splitting the files into a combination of 1000 records for each table and and then doing like
table_1, table2, table3, of the same table which is table because it has millions of records and skip for those if they have less than 1000 records to create only 1 file.
but i am just thinking which approach is best and a starting pointing is needed
First of all, let's split this up:
Resultset from database
Serializing the resultset from database
Writing the serialized resultset from memory onto the filesystem
Doing all of this within the same request/thread
Conclusion
Your code tortures the JVM heap, because references have to be kept until the end of each iteration. The GC can only clean up after a full table has been processed. Large tables (1.000.000+ rows) will likely kill the thread or even hang the JVM.
The Fix: Resultset from database
Retrieving large resultsets at once will always hurt performance. While streaming lots of data within a local network (assuming the database is in the same network) just takes a bit more time, the memory required to store the full resultset is going to be an issue for the JVM.
Instead of doing everything at once, consider splitting it up in smaller chunks of data. Use
OFFSETandFETCHin the SQL statement to limit the number of rows per loop. Having multiple iterations will allow the Java GC to free up memory used by previous iterations, relieving the heap.The Fix: Serializing the resultset from database
Same issue. Big datasets whill hurt performance. Split the resultset by serializing row by row instead of all rows at once.
Writing the serialized resultset from memory onto the filesystem
While this one probably doesn't need a fix, you eventually have to switch to writing line after line.
Some code
For an reference implementation of
queryRowToStruct, check CFLib.