IndexError: invalid index to scalar variable when trying to calculate AUC metrics with rdkit

I want to calculate the area under the ROC curve (AUC) with the RDKit implementation:

rdkit.ML.Scoring.Scoring.CalcAUC(scores, col)

Determines the area under the ROC curve

code:

import rdkit.ML.Scoring.Scoring

rdkit.ML.Scoring.Scoring.CalcAUC(scores, y)

and I get the following error:

IndexError: invalid index to scalar variable.

my data:

scores

array([32.336, 31.894, 31.74 , ..., -0.985, -1.629, -1.82 ])

y

array(['Inactive', 'Inactive', 'Inactive', ..., 'Inactive', 'Inactive','Inactive'], dtype=object)

I do not know what's wrong.

There is 1 solution below

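from rdkit.ML.Scoring.Scoring import CalcAUC

# CalcAUC expects rows that are already sorted by score, best first
scores = [32.336, 31.894, 31.74, 30., 20.]  # assume scores is sorted in descending order
y = ['Inactive', 'Inactive', 'Inactive', 'Active', 'Inactive']

# Map the string labels to 1 (active) / 0 (inactive)
label_map = {'Active': 1, 'Inactive': 0}
labels = [label_map[y_true] for y_true in y]

# Each row is (score, label); col=1 is the index of the label within a row
auc = CalcAUC(list(zip(scores, labels)), 1)
print('Area Under the ROC Curve:', auc)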

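As mentioned in the comments on the question, CalcAUC does not take the scores and the labels as two separate arrays. The first argument is a list of rows sorted by score, and col is the integer index of the column in each row that holds the active/inactive flag. With a 1-D array of scores and an array of labels as col, the function ends up indexing a single NumPy float, which is where "invalid index to scalar variable" comes from. The RDKit documentation for CalcAUC and the other scoring metrics is pretty minimal.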
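If your scores and labels come as two separate arrays, as in the question, and you cannot be sure they are already sorted, something like the sketch below could build the input rows. Note that `build_rows` is a hypothetical helper, not part of RDKit, and it assumes that the string 'Active' marks the actives and that a higher score means a better rank.

```python
def build_rows(scores, labels, active_label="Active"):
    """Pair each score with a 1/0 activity flag and sort best score first.

    Hypothetical helper (not an RDKit function). Assumes higher scores
    rank first and that `active_label` marks the active compounds.
    """
    rows = [
        (float(score), 1 if label == active_label else 0)
        for score, label in zip(scores, labels)
    ]
    # Sort in descending order of score so the best-ranked row comes first
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows


# Deliberately unsorted toy data, only for illustration
rows = build_rows(
    [20.0, 32.336, 30.0, 31.74],
    ["Inactive", "Inactive", "Active", "Inactive"],
)
print(rows)

try:
    from rdkit.ML.Scoring.Scoring import CalcAUC
    # col=1 points at the activity flag inside each (score, flag) row
    print("Area Under the ROC Curve:", CalcAUC(rows, 1))
except ImportError:
    # RDKit is not installed; the rows above are still valid input
    pass
```

The helper also accepts NumPy arrays, since it only iterates over the two inputs in parallel.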