The concatenation of two tensors runs on the GPU, but inference does not. The C++ code:
cppflow::model* model = new cppflow::model(path_to_model);
cppflow::tensor img1; // input images, filled elsewhere
cppflow::tensor img2;
std::vector<cppflow::tensor> tsvec;
tsvec.push_back(img1);
tsvec.push_back(img2);
cppflow::tensor batch = cppflow::concat(0, tsvec);
cppflow::tensor output = (*model)(batch);
std::cout << "device batch " << batch.device() << std::endl;
std::cout << "device output " << output.device() << std::endl;
The expected output is as follows:
device batch /job:localhost/replica:0/task:0/device:GPU:0
device output /job:localhost/replica:0/task:0/device:GPU:0
But instead I get:
device batch /job:localhost/replica:0/task:0/device:GPU:0
device output /job:localhost/replica:0/task:0/device:CPU:0
How can I run both on the GPU?
The model was built with Python 3.10 and TensorFlow 2.10.
For inference I use an NVIDIA RTX 6000, cppflow, libtensorflow-gpu-windows-x86_64-2.10.0, and CUDA 11.2.