The network produces a 1 x N x K tensor, where N is the number of pixel positions and K is the number of classes; each value is the score for a class at a given position.
The current code that retrieves the best-scoring class for each position works, but it is terribly slow: it takes about four times longer than the network run itself.
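    // Needs: using System.Collections.Generic; using System.Linq;
    //        using Microsoft.ML.OnnxRuntime; using Microsoft.ML.OnnxRuntime.Tensors;
    // frameWidth, frameHeight and nClasses are fields of the enclosing class.
    private int[,] GetClasses(List<DisposableNamedOnnxValue> output)
    {
        Tensor<float> outTensor = output.First().AsTensor<float>();
        int[,] classes = new int[frameWidth, frameHeight];
        for (int i = 0; i < frameWidth; ++i)
        {
            for (int j = 0; j < frameHeight; ++j)
            {
                int finalClass = 0;
                // Start below any possible score, so that all-negative scores
                // (e.g. raw logits) still select the true maximum.
                float finalClassScore = float.NegativeInfinity;
                for (int k = 0; k < nClasses; ++k)
                {
                    // Position (i, j) is stored at flat index i * frameHeight + j.
                    float score = outTensor[0, i * frameHeight + j, k];
                    if (score > finalClassScore)
                    {
                        finalClassScore = score;
                        finalClass = k;
                    }
                }
                classes[i, j] = finalClass;
            }
        }
        return classes;
    }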
Is there a better, faster way of doing this in Microsoft.ML?
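One option that stays on the C# side, as a rough sketch: the multi-dimensional `Tensor<float>` indexer is likely a large part of the cost, since it takes a `params int[]` and recomputes the offset on every call. Reading the backing buffer of a `DenseTensor<float>` as a flat span avoids that. The names `ArgMaxHelper`, `ArgMaxRows` and `FromTensor` are hypothetical, and the code assumes the 1 x N x K output is stored row-major (the K scores of one position are contiguous). I have not measured it against this model, so any speedup needs to be verified.

```csharp
using System;
using Microsoft.ML.OnnxRuntime.Tensors;

public static class ArgMaxHelper
{
    // Hypothetical helper: 'scores' is a flat buffer of length N * nClasses,
    // assumed row-major, so position p owns scores[p * nClasses .. p * nClasses + nClasses - 1].
    public static int[] ArgMaxRows(ReadOnlySpan<float> scores, int nClasses)
    {
        if (nClasses <= 0 || scores.Length % nClasses != 0)
            throw new ArgumentException("Buffer length must be a multiple of nClasses.");

        int positions = scores.Length / nClasses;
        var best = new int[positions];
        for (int p = 0; p < positions; ++p)
        {
            ReadOnlySpan<float> row = scores.Slice(p * nClasses, nClasses);
            int bestClass = 0;
            float bestScore = row[0];
            for (int k = 1; k < nClasses; ++k)
            {
                if (row[k] > bestScore)
                {
                    bestScore = row[k];
                    bestClass = k;
                }
            }
            best[p] = bestClass;
        }
        return best;
    }

    // Uses the dense backing memory directly when available;
    // otherwise converts once with ToDenseTensor().
    public static int[] FromTensor(Tensor<float> outTensor, int nClasses)
    {
        DenseTensor<float> dense = outTensor as DenseTensor<float> ?? outTensor.ToDenseTensor();
        return ArgMaxRows(dense.Buffer.Span, nClasses);
    }
}
```

Entry `p` of the result corresponds to `classes[p / frameHeight, p % frameHeight]` in the original method, because the original loop reads position `i * frameHeight + j`.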
The solution I went with was to add an argmax layer to the original Keras model. The model then outputs a single value per position, the class index, instead of K scores.
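With argmax inside the model, the C# side only has to copy class indices. A minimal sketch of that, under assumptions: the exported output is taken to be a 64-bit integer tensor with one entry per position (read, for example, via `output.First().AsTensor<long>()`), which depends on how the model was converted, and `ClassMapReader` and `ToClassMap` are hypothetical names.

```csharp
using System;
using Microsoft.ML.OnnxRuntime.Tensors;

public static class ClassMapReader
{
    // Hypothetical helper: 'ids' is assumed to hold one class index per position,
    // in the same order as before (position (i, j) at flat index i * frameHeight + j).
    public static int[,] ToClassMap(Tensor<long> ids, int frameWidth, int frameHeight)
    {
        if (ids.Length != (long)frameWidth * frameHeight)
            throw new ArgumentException("Tensor size does not match frameWidth * frameHeight.");

        DenseTensor<long> dense = ids as DenseTensor<long> ?? ids.ToDenseTensor();
        ReadOnlySpan<long> flat = dense.Buffer.Span;

        var classes = new int[frameWidth, frameHeight];
        for (int i = 0; i < frameWidth; ++i)
        {
            for (int j = 0; j < frameHeight; ++j)
            {
                // Class indices are small, so a checked narrowing to int is safe.
                classes[i, j] = checked((int)flat[i * frameHeight + j]);
            }
        }
        return classes;
    }
}
```

If the exported model emits a different element type (for instance `int` or `float`), the tensor type in the signature would need to change accordingly.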