FileNotFoundError when unpickling multiple files within a folder


I am currently trying to unpickle multiple individually pickled files that exist within a folder.

As there are around 600 individual pickled files I'd like to be able to iterate through the folder instead of manually unpickling each file.

Code

data_dirA = 'Data/FolderA'

for img in os.listdir(data_dirA):
    with open(img,'rb') as f:
        pickle.load(f)

Error

Raises a FileNotFoundError in Jupyter Notebook:

      1 for img in os.listdir(data_dirA):
----> 2     with open(img,'rb') as f:
      3         pickle.loads(f)

FileNotFoundError: [Errno 2] No such file or directory: '1.pck'

What I tried

I have been able to avoid the above PATH error using:

with open("Data/FolderA/1.pck", "rb") as f:
    img = pck.load(f)

Unfortunately, this only works for a single file, and the path must be changed by hand for each file. (It also still throws an unpickling error, 'pickle data was truncated', but that seems outside the scope of this question.)

Note

I should also add that I am aware of how to iterate through and open multiple files. My challenge here is that as I understand unpickling must be performed on files based on their name (please correct me if I have misunderstood as I don't see any threads that directly answer this question).

EDIT

with Answer Code:

for root, _, files in os.walk('Data/Intact'):
    for content in files:
        if not content.endswith('.pkl'):
            continue

        print(f"\n{root}/{content} data:")
        # Pickle data is binary, so the file is opened with 'rb'.
        with open(f"{root}/{content}", 'rb') as f:
            while 1:
                try:
                    print(pickle.load(f))
                except EOFError:
                    break

There is 1 best solution below


I assume all your files are in FolderA and have the file extension .pck.

Use file_name.endswith(('.pck','.pkl',....)) to check that a file is a pickle file before going on to read its contents.
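The FileNotFoundError in the question arises because os.listdir returns bare file names such as '1.pck', so open() looks for them in the current working directory instead of in the folder. A minimal sketch of the extension check combined with a joined path follows; the function name load_pickles and its extensions parameter are made up for illustration, and it assumes each file holds a single pickled object.

```python
import os
import pickle


def load_pickles(folder, extensions=(".pck", ".pkl")):
    """Return {file name: unpickled object} for each pickle file in folder."""
    loaded = {}
    for name in sorted(os.listdir(folder)):
        # os.listdir returns bare names such as '1.pck', not full paths.
        if not name.endswith(extensions):
            continue
        # Prepend the folder so open() does not search the working directory.
        path = os.path.join(folder, name)
        with open(path, "rb") as f:  # pickle data is binary, so use 'rb'
            loaded[name] = pickle.load(f)
    return loaded
```

With the layout from the question, this could be called as load_pickles('Data/FolderA').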

Your pickle load worries me: a single call to pickle.load reads only one object from the file, so if several objects were pickled into the same file you get just the first one. Try this instead.

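data = []  # collects every object stored in the file
# file_obj is the pickle file, already opened in binary mode ('rb')
while True:
    try:
        # Each call to pickle.load reads the next pickled object.
        data.append(pickle.load(file_obj))
    except EOFError:
        # Raised once no objects remain in the file.
        break
print(data)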

error - pickle data was truncated

Your data stream is corrupted or incomplete, for example because the file was only partly written or copied. You need to pickle the data again to get a complete file.
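With around 600 files, it may help to find out which ones are damaged before loading everything. The sketch below assumes that truncated or empty files raise pickle.UnpicklingError or EOFError, which is the usual behaviour but not the only way unpickling can fail; the helper name is_readable_pickle is hypothetical.

```python
import pickle


def is_readable_pickle(path):
    """Return True if the first object in the file unpickles cleanly."""
    try:
        with open(path, "rb") as f:
            pickle.load(f)
    except (pickle.UnpicklingError, EOFError):
        # Truncated or empty files typically end up here.
        return False
    return True
```

Files for which this returns False are the ones that need to be regenerated.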

There are 6 protocols (0 to 5) which can be used for pickling. The Python version used for unpickling must support the protocol the file was pickled with; newer versions can read older protocols, but not the other way round.

Example: you get an error when unpickling a protocol 5 file in Python 3.5, because protocol 5 was only added in Python 3.8.
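If the files have to be read by an older interpreter, one option is to write them with an explicit, lower protocol. This is only a sketch: the helper name dump_compatible and the default of protocol 4 (available since Python 3.4) are choices made for illustration, not something taken from the question.

```python
import pickle


def dump_compatible(obj, path, protocol=4):
    """Pickle obj with an explicit protocol so older interpreters can load it."""
    # pickle.HIGHEST_PROTOCOL is the newest protocol this interpreter supports.
    if protocol > pickle.HIGHEST_PROTOCOL:
        raise ValueError(f"protocol {protocol} is not supported here")
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=protocol)
```

Pickles written with protocol 2 or higher begin with a byte that records the protocol number, so the reading side can tell which protocol a file uses.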


Sample code:

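import os
import pickle

path = 'Data/FolderA'  # folder that holds the pickle files

for root, _, files in os.walk(path):
    for content in files:
        # Skip anything that is not a pickle file.
        if not content.endswith(('.pck', '.pkl')):
            continue

        file_path = os.path.join(root, content)
        print(f"\n{file_path} data:")
        # Pickle data is binary, so open with 'rb' rather than 'r'.
        with open(file_path, 'rb') as f:
            while True:
                try:
                    print(pickle.load(f))
                except EOFError:
                    break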