I have a 15-16GB file containing JSON objects separated by newlines (\n).
I am new to Python and am reading the file with the following code.
    with open(filename, 'rb') as file:
        for data in file:
            dosomething(data)
If my script fails partway through the file, say after 5GB, how can I resume reading from the last position read and continue from there?
I am trying to do this by using file.tell() to get the position and seek() to move the file pointer.
Since the file contains JSON objects, I get the error below after the seek operation.
ValueError: No JSON object could be decoded
I assume that after the seek the pointer does not land at the start of a JSON object.
How can I solve this? Is there any other way to resume reading from the last read position in Python?
You can use for i, line in enumerate(opened_file) to get the line number and store it in a variable. When your script fails, you can display this variable to the user. You will then need to add an optional command-line argument for it. If the argument is given, your script needs to call opened_file.readline() once for each i in range(variable) before it starts processing. That way you get back to the point where you left off.
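A minimal sketch of the idea above, assuming Python 3 and one JSON object per line. The names process_from, start_line, and handle are hypothetical and only for illustration, and itertools.islice is used here in place of the repeated readline() calls to skip the lines that were already handled.

```python
import itertools
import json
import sys


def process_from(path, start_line=0, handle=print):
    """Process the JSON lines in `path`, skipping the first `start_line` lines.

    Returns the total number of lines handled, counted from the start
    of the file. `handle` is a hypothetical stand-in for dosomething().
    """
    done = start_line
    with open(path, 'rb') as f:
        # islice reads and discards the first `start_line` lines
        # without parsing them, then yields the remaining ones.
        for raw in itertools.islice(f, start_line, None):
            try:
                handle(json.loads(raw.decode('utf-8')))
            except Exception:
                # Report where to resume, then let the error propagate.
                print('failed at line index %d; pass it as start_line '
                      'to resume' % done, file=sys.stderr)
                raise
            done += 1
    return done
```

After a failure, calling process_from(filename, start_line=N) with the reported index retries the line that failed and continues from there. The skipped part of the file is still read from disk, but it is not parsed, so skipping is usually much cheaper than processing. The resume index could come from an optional command-line argument, as suggested above.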