I mostly work with files much larger than my workstation's RAM, so I use numpy memmap files with PyTorch. If I set num_workers to anything more than 1, my workstation freezes and then reboots itself.
I tried saving the memmap files to separate disks, with everything else exactly the same as in the num_workers = 1 case, but it still freezes and reboots the same way.
Is there any solution or workaround for this situation?
Currently the maximum number of Python scripts I can run simultaneously is two, each with num_workers = 1. Here is roughly the code I am using:
import json
import numpy as np
from torch.utils.data import DataLoader, Dataset
def read_memmap(mem_file_name):
    # Each memmap has a companion JSON .conf file storing its shape and dtype.
    with open(mem_file_name + '.conf', 'r') as file:
        memmap_configs = json.load(file)
    return np.memmap(mem_file_name, mode='r+',
                     shape=tuple(memmap_configs['shape']),
                     dtype=memmap_configs['dtype'])
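# For reference, the companion .conf file looks like this; the shape and
# dtype values below are hypothetical placeholders, not my real data:
# {"shape": [1000000, 128], "dtype": "float32"}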
x = read_memmap(file_path)  # file_path is a placeholder for my actual path
y = read_memmap(file_path)

class MemmapDataset(Dataset):
    # Minimal map-style Dataset wrapping the two memmaps
    # (torch.utils.data.Dataset itself cannot be instantiated directly).
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]

dset = MemmapDataset(x, y)
def my_dataloader(dset, batch_size):
    return DataLoader(
        dataset=dset,
        batch_size=batch_size,
        num_workers=1,  # anything higher than 1 freezes the machine
        pin_memory=True,
    )
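For completeness, this is roughly how I drive the loader; the batch size and the loop body are placeholders. The freeze happens as soon as I change num_workers inside my_dataloader to 2 or more:

loader = my_dataloader(dset, batch_size=64)  # batch_size is a placeholder
for x_batch, y_batch in loader:
    pass  # training step elided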