I have a database which contains all the data from a CSV that I downloaded from a botanic website, and I want to it always updated based on the version of this CSV. My problem is that the CSV is extremely huge, It contains at least 1 million records which takes on average an entire hour to save everything. Any ideas on how can I perform this update without the need to rewrite the entire thing? TYSM!
Some extra info:
- It's a nodeJs project, and I'm using Prisma for the ORM
- It's actually a txt file which I parse as a CSV since all the data is separated by tabs
- I'm using posgresql
- Currently I'm dropping the table and saving everything from the ground up
You would need to essentially find differences between two csv versions and find the records which have changed and just update those records.
You could use any text/csv comparator npm library to find the difference, and once you get the records which has updates you could just initiate an update query for them.