So I created a for loop to run various batch sizes, where each iteration opens and closes a Neptune run. The first iteration runs fine, but in the following iterations the accuracy is not recorded into Neptune, and Python does not throw an error. Can anyone think what the problem might be?
for i in range(len(percentage)):
    run = neptune.init(
        project="xxx",
        api_token="xxx",
    )
    epochs = 600
    batch_perc = percentage[i]
    lr = 0.001
    sb = 64  # round((43249*batch_perc)*0.00185)
    params = {
        'lr': lr,
        'bs': sb,
        'epochs': epochs,
        'batch %': batch_perc
    }
    run['parameters'] = params
    torch.manual_seed(12345)
    td = 43249 * batch_perc
    vd = 0.1*(43249 - td) + td
    train_dataset = dataset[:round(td)]
    val_dataset = dataset[round(td):round(vd)]
    test_dataset = dataset[round(vd):]
    print(f'Number of training graphs: {len(train_dataset)}')
    run['train'] = len(train_dataset)
    print(f'Number of validation graphs: {len(val_dataset)}')
    run['val'] = len(val_dataset)
    print(f'Number of test graphs: {len(test_dataset)}')
    run['test'] = len(test_dataset)
    train_loader = DataLoader(train_dataset, batch_size=sb, shuffle=True)
    val_loader = DataLoader(val_dataset, batch_size=sb, shuffle=True)
    test_loader = DataLoader(test_dataset, batch_size=1, shuffle=False)
    model = GCN(hidden_channels=64).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = torch.nn.CrossEntropyLoss()
    for epoch in range(1, epochs):
        train()
        train_acc = test(train_loader)
        run['training/batch/acc'].log(train_acc)
        val_acc = test(val_loader)
        run['training/batch/val'].log(val_acc)
Prince here,
Try calling the stop() method to close the previous run. Currently you are creating new run objects on every iteration without closing the old ones, and that can cause problems like this.
Docs: https://docs.neptune.ai/api-reference/run#.stop
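A minimal sketch of where the stop() call would go, keeping the asker's legacy neptune.init call and eliding the dataset/model setup (the try/finally wrapper is my addition, so the run is closed even if training raises):

```python
for batch_perc in percentage:
    run = neptune.init(project="xxx", api_token="xxx")
    try:
        run['parameters'] = {'lr': lr, 'bs': sb, 'epochs': epochs, 'batch %': batch_perc}
        # ... build datasets, loaders, model, optimizer as in the question ...
        for epoch in range(1, epochs):
            train()
            run['training/batch/acc'].log(test(train_loader))
            run['training/batch/val'].log(test(val_loader))
    finally:
        run.stop()  # flush pending metrics and close this run before the next iteration starts
```

Without the stop() call, each iteration leaves the previous run's background synchronization threads alive, so later metrics may silently go to the wrong place or never get flushed.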