I am trying to measure the inference speed of the ELMo model from the AllenNLP library on CPU and on a T4 GPU in Google Colab. The eval function accepts a function that builds sentence embeddings with the given model:
from allennlp.modules.elmo import Elmo

for m in elmo_models:
    print('\n\n\n', m, '\n==========================================')
    options_file = "/content/" + m + "/options.json"
    weight_file = "/content/" + m + "/model.hdf5"
    model = Elmo(options_file, weight_file, 2, dropout=0)
    speed_task_gpu.eval(lambda x: elmo_embed(x, model), m)
    speed_task_cpu.eval(lambda x: elmo_embed(x, model, 'cpu'), m)
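(`speed_task_gpu` and `speed_task_cpu` are not shown above; a minimal timing harness along these lines is assumed, with the `SpeedTask` class being hypothetical:)

```python
import time

class SpeedTask:
    """Hypothetical stand-in for the benchmark harness used above:
    times an embedding function over a fixed list of sentence batches."""

    def __init__(self, batches):
        # Each batch is a list of tokenized sentences, e.g. [["Hello", "world"]]
        self.batches = batches

    def eval(self, embed_fn, model_name):
        start = time.perf_counter()
        for batch in self.batches:
            embed_fn(batch)
        elapsed = time.perf_counter() - start
        print(f"{model_name}: {elapsed:.3f}s for {len(self.batches)} batches")
        return elapsed
```

With a harness like this, each `eval` call runs every batch through the lambda once and reports the total wall-clock time.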
And this is the function that is meant to produce the text embeddings:
import torch
from allennlp.modules.elmo import batch_to_ids

def elmo_embed(text, model, name='cuda:0'):
    character_ids = batch_to_ids(text)
    device = torch.device(name)
    model = model.to(device)
    character_ids = character_ids.to(device)
    with torch.no_grad():
        embeddings = model(character_ids)['elmo_representations'][0]
        embeddings = embeddings.sum(dim=(0, 1), keepdim=True)
        embeddings = torch.nn.functional.normalize(embeddings)
        embeddings = embeddings[0][0].cpu().numpy()
    return {'cls': embeddings}
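As a CPU-only sanity check of the same pattern (plain PyTorch, no AllenNLP; `nn.Embedding` stands in for the ELMo module), moving both the module and its input to one `torch.device` before the forward pass is normally sufficient:

```python
import torch
import torch.nn as nn

# The pattern elmo_embed intends: module and input on the same device.
device = torch.device('cpu')

model = nn.Embedding(num_embeddings=10, embedding_dim=4).to(device)
token_ids = torch.tensor([[1, 2, 3]]).to(device)

with torch.no_grad():
    out = model(token_ids)  # works: parameters and input share a device
```

This minimal case succeeds, which suggests the mismatch comes from some tensor inside the ELMo module that `model.to(device)` does not move.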
Even though I transfer both the model and character_ids to the same device, I get this error when evaluating the model's performance on CPU:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)
What is the correct way of transferring the model and character_ids to the same device?