I have a very large text file (several GB in size) which I need to read into Python and then process line by line.
One approach would be simply to call data = f.readlines()
and then process the content. With that approach I know the total number of lines up front and can easily measure the progress of my processing. Given the file size, though, reading everything into memory at once is probably not ideal.
The alternative (and I think better) option would be to say:
for line in f:
    do_something(line)
Now, however, I no longer know how to measure my progress. Is there a good option that does not add significant overhead? (One reason I want to know the progress is to get a rough estimate of the remaining time, since all lines in my file are of similar size, and to check whether my script is still doing something or has gotten stuck somewhere.)
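One low-overhead option that avoids counting lines at all is to track progress in bytes: compare the file position against the total file size. Below is a minimal sketch of that idea; the file name, the `process` stand-in, and the reporting interval are placeholders, not anything from your script.

```python
import os

def process(line):
    # stand-in for the real per-line work
    pass

def process_with_progress(path, report_every=100_000):
    """Iterate over path line by line, printing a rough percentage.

    Progress is measured in bytes via the file position, so the exact
    line count is never needed. Sketch only, error handling omitted.
    """
    total = os.path.getsize(path)
    with open(path, "rb") as f:  # binary mode: positions are plain byte offsets
        for i, line in enumerate(f, 1):
            process(line)
            if i % report_every == 0:  # report occasionally to keep overhead low
                print(f"{100 * f.tell() / total:.1f}% done")

# tiny demonstration file (a real run would use the multi-GB file)
with open("demo.txt", "w") as f:
    for n in range(10):
        f.write(f"line {n}\n")

process_with_progress("demo.txt", report_every=5)
```

Since your lines are of similar size, the byte percentage is a good proxy for the line percentage, and `f.tell()` on a buffered reader costs almost nothing compared to the per-line work.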
If you are using Linux, there seems to be a way out: run wc -l on the file before processing it.
Its output gives you the number of lines as well as the name of the file.
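Calling wc -l from Python might look like the sketch below; the helper name and the demo file are my own, and this of course assumes a Unix-like system where wc is available. wc -l prints the count followed by the file name, so the first whitespace-separated field is the number we want.

```python
import subprocess

def count_lines(path):
    """Return the line count of path using the external `wc -l` tool.

    `wc -l FILE` prints "<count> <file name>", so we take the first
    field. Unix-only sketch; assumes the file exists and is readable.
    """
    result = subprocess.run(["wc", "-l", path],
                            capture_output=True, text=True, check=True)
    return int(result.stdout.split()[0])

# demonstration on a small file
with open("sample.txt", "w") as f:
    f.write("a\nb\nc\n")

print(count_lines("sample.txt"))  # prints 3
```

Knowing the total in advance, you can then print "line i of N" inside the for line in f loop. Note that wc -l itself has to scan the whole file once, which for a multi-GB file adds a noticeable (but one-time) startup cost.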