I'm using Atom/Electron to build an app that shows visualisations over video, driven by data. Each video has a corresponding CSV file with information for each frame. The videos are about 100 minutes long, so the files hold plenty of data!
The problem I'm having is that it takes a couple of seconds to load and parse the file. Most of the time this is not a problem, but I need to build playlists out of parts of videos, and loading the whole CSV file every time the video changes is not a viable option.
I've been looking at file-streaming options such as fast-csv, but I didn't manage to start reading from an arbitrary part of the file.
EDIT: From the fs documentation:
options can include start and end values to read a range of bytes from the file instead of the entire file. Both start and end are inclusive and start at 0.
In this case, the question is: how can I know which byte corresponds to the position I want in the file?
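For illustration, this is roughly how that byte-range option is used (the file name and offsets are made up); the catch is that CSV rows are variable-length text, so I don't see a direct way to turn a frame number into a byte offset:

```js
// Minimal sketch: read only a byte range of a file with Node's fs module.
const fs = require('fs');

const stream = fs.createReadStream('frames.csv', {
  encoding: 'utf8',
  start: 1024, // first byte to read (inclusive)
  end: 2047    // last byte to read (inclusive)
});

stream.on('data', (chunk) => {
  // With a CSV this chunk will almost certainly begin and end mid-row,
  // because rows are variable-length text. Without a separate index of
  // row offsets, "frame N" cannot be translated into a byte position.
  console.log(chunk);
});
```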
What do you think would be the best and most performant approach to this situation?
Concretely:
Is there a way of starting to read a stream from any part of a CSV file?
Is there another storage method that would let me solve this problem better?
UPDATE:
In the end, I solved this by storing the data in a file in binary format. Since I know how many columns the file has, I can read straight from the relevant segment of the file without any performance penalty.
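A rough sketch of that approach (the column count and the choice of 64-bit floats are just examples):

```js
// Fixed-width binary records: every frame occupies the same number of
// bytes, so the offset of any frame is simple arithmetic.
const fs = require('fs');

const NUM_COLUMNS = 12;    // example column count
const BYTES_PER_VALUE = 8; // each value stored as a 64-bit float
const ROW_SIZE = NUM_COLUMNS * BYTES_PER_VALUE;

function readFrames(path, firstFrame, frameCount) {
  const fd = fs.openSync(path, 'r');
  const buffer = Buffer.alloc(frameCount * ROW_SIZE);
  fs.readSync(fd, buffer, 0, buffer.length, firstFrame * ROW_SIZE);
  fs.closeSync(fd);

  const frames = [];
  for (let f = 0; f < frameCount; f++) {
    const row = [];
    for (let c = 0; c < NUM_COLUMNS; c++) {
      row.push(buffer.readDoubleLE((f * NUM_COLUMNS + c) * BYTES_PER_VALUE));
    }
    frames.push(row);
  }
  return frames;
}

// e.g. the 1500 frames starting at frame 90000 for one playlist entry
const segment = readFrames('frames.bin', 90000, 1500);
```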
Per my comment, Sqlite seems to be what you're looking for. It may not be your permanent solution in the long run, but it will certainly work for the time being while you decide whether to stick with it or code your own solution.
Inner Workings of Sqlite
Sqlite is optimized to the core, and it has three major features that cause it to execute faster than normal disk reads, especially of CSV files. If you are really interested in understanding the full scope of all of this, I have found no better explanation than this amazing blog post by Julia Evans.
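To make that concrete, here is a small sketch of what the table could look like, using the better-sqlite3 bindings for Node/Electron; the bindings and the column names are my choice for illustration, not something you are tied to:

```js
// Sketch: store one row per frame, keyed by (video_id, frame).
const Database = require('better-sqlite3');
const db = new Database('frames.db');

db.exec(`
  CREATE TABLE IF NOT EXISTS frames (
    video_id INTEGER NOT NULL,
    frame    INTEGER NOT NULL,
    x        REAL,             -- one column per CSV field (names made up)
    y        REAL,
    PRIMARY KEY (video_id, frame)
  );
`);
// The (video_id, frame) key is kept in a B-tree, so a range lookup is a
// seek plus a short sequential read instead of a scan of the whole file.
```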
Possible Drawbacks
Beyond the inner workings, Sqlite is designed to run client side, on the user's machine. If that is not viable for you, there are workarounds: Sqlite can sit behind a web server, for example, but it really thrives in a standalone or mixed installation. Also remember that every client's computer is different; one machine may process records faster than the next, but in general you do not need to worry, since client-side machines are usually under little load.
Summary
Sqlite will work for what you need, but it may require a little homework to set up the way you want. Sqlite4java, for example, is easy to install but confusing to learn because its documentation is so poor; Stack Overflow got me through it, though. Sqlite is also a use-it-and-forget-it type of installation, so to answer your question: it will handle 25 rows a second like cake. You do not need to worry about optimizing it, only your own code.
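And as a final sketch, again with better-sqlite3 and the made-up frames table from above, pulling only the frames one playlist segment needs would look something like this:

```js
const Database = require('better-sqlite3');
const db = new Database('frames.db');

// Prepared statement for one playlist segment of one video.
const stmt = db.prepare(
  'SELECT frame, x, y FROM frames ' +
  'WHERE video_id = ? AND frame BETWEEN ? AND ? ' +
  'ORDER BY frame'
);

// e.g. frames 90000-91500 of video 3 for one playlist entry
const rows = stmt.all(3, 90000, 91500);
```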