NumPy memmap and num_workers: machine freezes and reboots when executing multiple PyTorch scripts


I mostly work with files larger than my workstation's RAM, so I use NumPy memmap files with PyTorch. If I set num_workers to more than 1, my workstation freezes and reboots itself.

I tried saving the memmap files to separate disks, keeping everything else exactly the same as with num_workers = 1, but the machine still freezes and reboots.

Is there any solution or workaround for this situation?

Currently the maximum number of Python scripts I can run simultaneously is two, each with num_workers=1.

import json

import numpy as np
from torch.utils.data import DataLoader, Dataset

def read_memmap(mem_file_name):
    # Shape and dtype are stored in a JSON sidecar file next to the data.
    with open(mem_file_name + '.conf', 'r') as file:
        memmap_configs = json.load(file)
        return np.memmap(mem_file_name, mode='r+',
                         shape=tuple(memmap_configs['shape']),
                         dtype=memmap_configs['dtype'])

x = read_memmap(file_path)
y = read_memmap(file_path)

dset = Dataset(x, y)  # simplified; the real code uses a Dataset subclass

def my_dataloader(dset, batch_size):
    return DataLoader(
                    dataset     = dset,
                    batch_size  = batch_size,
                    num_workers = 1,
                    pin_memory  = True,
                    )