Is my training too slow or at normal speed? GPU + Python + tensorflow-gpu

I am training a "faster_rcnn_inception_resnet_v2_atrous_coco" model for custom object detection using TensorFlow's Object Detection API.

I set up a machine on Azure with the following configuration:

Intel Xeon CPU E5-2690 v3 @ 2.60 GHz
56 GB RAM
Windows 10 64-bit
Tesla K80 GPU, 11.18 GB total memory

When I run train.py, I get the following speed per step:

INFO:tensorflow:global step 458: loss = 0.5601 (3.000 sec/step)
I1009 19:30:13.254615 5916 tf_logging.py:115] global step 458: loss = 0.5601 (3.000 sec/step)
INFO:tensorflow:global step 459: loss = 0.5724 (3.077 sec/step)
I1009 19:30:16.331734 5916 tf_logging.py:115] global step 459: loss = 0.5724 (3.077 sec/step)
INFO:tensorflow:global step 460: loss = 0.8615 (3.018 sec/step)
I1009 19:30:19.350132 5916 tf_logging.py:115] global step 460: loss = 0.8615 (3.018 sec/step)
INFO:tensorflow:global step 461: loss = 0.6021 (3.062 sec/step)
I1009 19:30:22.428256 5916 tf_logging.py:115] global step 461: loss = 0.6021 (3.062 sec/step)
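
To double-check that TensorFlow actually sees the K80, I list the visible devices (a minimal sketch using the TF 1.x API that train.py runs on):

```python
import tensorflow as tf
from tensorflow.python.client import device_lib

# A '/device:GPU:0' entry should appear here if the K80 is visible to TensorFlow.
print(device_lib.list_local_devices())

# Returns '' when TensorFlow cannot see any GPU.
print(tf.test.gpu_device_name())
```

If no GPU shows up here, the timings above would be coming from the CPU alone.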

Is this fast enough, or should it be faster given that it is using a GPU? The batch size in the config file is 1; when I change it to 2 or higher, it runs out of memory.
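
For reference, the relevant block of my pipeline config looks roughly like this (a sketch; the rest of the file is the stock faster_rcnn_inception_resnet_v2_atrous_coco sample config):

```
train_config: {
  batch_size: 1   # raising this to 2 or more runs out of memory on the 11 GB K80
}
```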

It takes 3 seconds per step on a dataset of 93 images. OK... but after training, when I load the frozen graph and run prediction over all the images, it takes about 1 second per image, even with the GPU. That seems too slow. What am I doing wrong?
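
For what it's worth, this is roughly how I run the predictions (a minimal sketch; the frozen graph path and the image list are placeholders). The graph and session are created once and reused for every image, so session setup should not be part of the per-image time:

```python
import numpy as np
import tensorflow as tf

# Load the frozen graph once, outside the per-image loop.
graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:  # placeholder path
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

# Standard tensor names exported by the Object Detection API.
image_tensor = graph.get_tensor_by_name('image_tensor:0')
outputs = [graph.get_tensor_by_name(name) for name in (
    'detection_boxes:0', 'detection_scores:0',
    'detection_classes:0', 'num_detections:0')]

# One session reused for all images; the first run tends to be much
# slower (graph initialization), so I time the steady state only.
with tf.Session(graph=graph) as sess:
    for image in images:  # images: list of HxWx3 uint8 numpy arrays (placeholder)
        boxes, scores, classes, num = sess.run(
            outputs, feed_dict={image_tensor: np.expand_dims(image, 0)})
```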
