How to use GPU in Windows ML example


I'm trying to adapt this tutorial to use my own neural net and images. I can do that on my CPU, but what I cannot do, either with the unchanged tutorial or with my adaptation of it, is use my GPU. According to System Information, I have an "NVIDIA Quadro P2200", not that I seem to need to specify this anywhere. Instead, it appears all I need to do is replace:

LearningModelDeviceKind deviceKind = LearningModelDeviceKind::Default;

with:

LearningModelDeviceKind deviceKind = LearningModelDeviceKind::DirectX;
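
For reference, here is a minimal sketch of where that device kind ends up, assuming the tutorial's overall structure (the model path is a placeholder):

#include <winrt/Windows.AI.MachineLearning.h>
using namespace winrt::Windows::AI::MachineLearning;

// The device kind is wrapped in a LearningModelDevice, and the session is
// constructed against that device.
LearningModel model = LearningModel::LoadFromFilePath(L"C:\\models\\model.onnx");
LearningModelDeviceKind deviceKind = LearningModelDeviceKind::DirectX;  // was ::Default
LearningModelDevice device(deviceKind);
LearningModelSession session(model, device);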

When I do this, I get an exception in:

auto results = session.Evaluate(binding, L"RunId");

After the second parameter is constructed, this drops into the generated C++/WinRT header code:

template <typename D>
WINRT_IMPL_AUTO(Windows::AI::MachineLearning::LearningModelEvaluationResult)
consume_Windows_AI_MachineLearning_ILearningModelSession<D>::Evaluate(
    Windows::AI::MachineLearning::LearningModelBinding const& bindings,
    param::hstring const& correlationId) const
{
    void* result{};
    check_hresult(WINRT_IMPL_SHIM(Windows::AI::MachineLearning::ILearningModelSession)->Evaluate(*(void**)(&bindings), *(void**)(&correlationId), &result));
    return Windows::AI::MachineLearning::LearningModelEvaluationResult{ result, take_ownership_from_abi };
}

A winrt::hresult_error is thrown immediately upon stepping into the check_hresult(...) line. I think this means that bindings is somehow invalid, but (a) I'm not sure about that, and (b) I have no idea what to do to make it valid. Help?
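
For anyone debugging the same thing, here is a sketch of catching the error to read its HRESULT and message rather than just breaking into the debugger (session and binding are the variables from the code above):

#include <winrt/base.h>
#include <iostream>

try
{
    auto results = session.Evaluate(binding, L"RunId");
}
catch (winrt::hresult_error const& e)
{
    // winrt::hresult_error carries the failing HRESULT and a readable message.
    std::wcout << L"HRESULT: 0x" << std::hex << static_cast<uint32_t>(e.code())
               << L"\nMessage: " << e.message().c_str() << std::endl;
}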

EDIT: I can now get the MS sample working, but not my adaptation. When I view the MS sample's .onnx file in Netron, the input and output nodes have reasonable names, and the reported tensor sizes are also reasonable. On the model I am trying to use, the input and output nodes both have ":0" as the last part of their names, and the tensor sizes have one "unknown" dimension, e.g. the input size is reported as "unk_123 x 3 x 224 x 224". Does either of these create an incompatibility? The network is supplied to me, so I'd like to understand whether either requires a change before asking for one...
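
One way to check those details programmatically is to ask the loaded model for its feature descriptors. This is only a sketch, assuming model is the LearningModel loaded earlier:

#include <winrt/Windows.AI.MachineLearning.h>
#include <winrt/Windows.Foundation.Collections.h>
#include <iostream>
using namespace winrt::Windows::AI::MachineLearning;

for (auto const& feature : model.InputFeatures())
{
    // Name() returns the exact string the binding must use, including any ":0" suffix.
    std::wcout << L"Input: " << feature.Name().c_str();
    if (auto tensor = feature.try_as<TensorFeatureDescriptor>())
    {
        std::wcout << L" shape:";
        for (int64_t dim : tensor.Shape())
        {
            std::wcout << L" " << dim;  // a free dimension such as "unk_123" shows up as -1
        }
    }
    std::wcout << std::endl;
}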


1 Answer

omatai:

It all works as intended. Having tripped up several times while adapting Windows ML code to my requirements, I offer this strong advice:

  • double-check everything: use the debugger to prove that variables contain what you think they do at every step of the setup.

For example, in response to the EDIT section: the issue was copied/pasted/edited code that changed the output shape from 1 x 1000 x 1 x 1 (as pasted) to 1 x 10 x 1 x 1 (as edited), when it needed to be 1 x 10. That was detected by following my own advice above :-)
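
To make that concrete, here is a sketch of the kind of fix involved; the output name "output:0" is illustrative, so bind whatever name the model actually reports:

using namespace winrt::Windows::AI::MachineLearning;

// Pre-bind an output tensor whose shape matches what the model declares:
// 1 x 10, not the copied/pasted 1 x 10 x 1 x 1.
auto outputTensor = TensorFloat::Create({ 1, 10 });
binding.Bind(L"output:0", outputTensor);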

I can confirm that setting deviceKind = LearningModelDeviceKind::DirectX is what invokes the GPU, but you may not get any noticeable speed improvement from doing so; a quick timing comparison (sketched below) will show whether it helps in your case.
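
Here is a rough timing sketch; the input/output binding is elided and must match your model, and note that the first DirectX run includes one-off warm-up cost:

#include <winrt/Windows.AI.MachineLearning.h>
#include <chrono>
#include <iostream>
using namespace winrt::Windows::AI::MachineLearning;

for (auto kind : { LearningModelDeviceKind::Cpu, LearningModelDeviceKind::DirectX })
{
    LearningModelSession session(model, LearningModelDevice(kind));
    LearningModelBinding binding(session);
    // ... bind inputs (and outputs) here, exactly as in the tutorial ...

    auto start = std::chrono::steady_clock::now();
    session.Evaluate(binding, L"TimingRun");
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start);
    std::wcout << L"Evaluate took " << ms.count() << L" ms" << std::endl;
}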