Caffe2: How to load and use MNIST tutorial model in C++


I am struggling to replicate in C++ the results of the trained MNIST Caffe2 tutorial model. What I did was slightly modify the MNIST Python tutorial (code available here), and on the Python side everything works fine.

If I run mnist.py I get two ".pb" files with the net definition and the net initialization. If I load this net on the Python side and feed it an image from the DB, I get correct predictions:

timg = np.fromfile('test_img.dat', dtype=np.uint8).reshape([28,28])
workspace.FeedBlob('data', (timg/256.).reshape([1,1,28,28]).astype(np.float32))
workspace.RunNet(net_def.name)
workspace.FetchBlob('softmax')
array([[  1.23242417e-05,   6.76146897e-07,   9.01260137e-06,
          1.60285403e-04,   9.54966026e-07,   6.82772861e-06,
          2.20508967e-09,   9.99059498e-01,   2.71651220e-06,
          7.47664250e-04]], dtype=float32)

So the net is pretty sure the test image is a '7' (and it is correct).

But I fail to obtain the same result from C++. I've taken a look at how it is done in other projects (here and here) and have come up with the following:

C++ net initialization

QByteArray img_bytes; // where the raw image bytes are kept (size 28x28)
caffe2::NetDef init_net, predict_net;
caffe2::TensorCPU input;
// predictor and its input/output vectors
std::unique_ptr<caffe2::Predictor> predictor;
caffe2::Predictor::TensorVector input_vec;
caffe2::Predictor::TensorVector output_vec;
...
QFile f("mnist_init_net.pb");

...
auto barr = f.readAll();
if (! init_net.ParseFromArray(barr.data(), barr.size())) {

...
f.setFileName("mnist_predict_net.pb");

...
barr = f.readAll();
if (! predict_net.ParseFromArray(barr.data(), barr.size())) {

...
predictor.reset(new caffe2::Predictor(init_net, predict_net));
input.Resize(std::vector<int>{{1, 1, IMG_H, IMG_W}});
input_vec.resize(1, &input);
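
As an aside, Caffe2 itself ships a protobuf-loading helper that could replace the QFile-based loading above. A minimal sketch, assuming the caffe2/utils/proto_utils.h header from the same Caffe2 checkout:

#include <caffe2/utils/proto_utils.h>

// Sketch: equivalent loading of both NetDefs via Caffe2's own helper.
// ReadProtoFromFile returns false on failure, mirroring the
// ParseFromArray checks above.
caffe2::NetDef init_net, predict_net;
if (! caffe2::ReadProtoFromFile("mnist_init_net.pb", &init_net)) {
    // handle load error
}
if (! caffe2::ReadProtoFromFile("mnist_predict_net.pb", &predict_net)) {
    // handle load error
}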

Either way, the initialization runs without a problem. Since the deploy network does not include the scaling and casting to float, I have to do it myself (the same as in the Python snippet above), which I do as follows:

float* data = input.mutable_data<float>();
for (int i = 0; i < img_bytes.size(); ++i)
    *data++ = float(img_bytes[i])/256.f;

and finally I feed the predictor:

if (! predictor->run(input_vec, &output_vec) || output_vec.size() < 1
                                             || output_vec[0]->size() != 10)
...
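
For reference, reading the prediction back out afterwards might look like this (a sketch; it assumes output_vec[0] holds the 10 softmax scores, as the size check above implies):

// Sketch: pick the arg-max class from the softmax output.
const float* probs = output_vec[0]->data<float>();
int best = 0;
for (int i = 1; i < 10; ++i)
    if (probs[i] > probs[best])
        best = i;
qDebug() << "predicted digit:" << best << "p =" << probs[best];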

The result I get on the same image is that '7' scores about 17% (not 99.9%), while the remaining categories are around 5-10% each.

Right now I'm stuck and I don't know where the problem is, so I'd appreciate any tips/hints/pointers.

Answer:

It turned out that the problem was not with my usage of Caffe2 but with my preprocessing. Since img_bytes is a QByteArray whose underlying element type is char, and since char is a signed type by default (in gcc), this conversion and scaling:

*data++ = float(img_bytes[i])/256.f;

resulted in negative values for pixels above 127 (instead of floats in the range [0, 1)). The correct version is:

*data++ = static_cast<unsigned char>(img_bytes[i])/256.f;
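
To see the pitfall in isolation, here is a minimal standalone demonstration (a sketch; assumes gcc on x86, where plain char is signed):

#include <iostream>

int main() {
    // A pixel value of 255 stored in a plain (signed) char reads back as -1,
    // so the naive conversion yields -0.00390625f instead of ~0.996f.
    char c = static_cast<char>(255);
    std::cout << float(c) / 256.f << "\n";                      // -0.00390625
    std::cout << static_cast<unsigned char>(c) / 256.f << "\n"; // 0.996094
    return 0;
}

An alternative to casting each element is to read the buffer through an unsigned pointer up front, e.g. reinterpret_cast<const uchar*>(img_bytes.constData()).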