TensorFlow Eigen underlying execution model


I have been writing a TensorFlow custom op and am therefore dealing with some matrix operations using the Eigen library. I am trying to understand how Eigen executes these operations. I have the code below:

void __attribute__((optimize("O0"))) quantizeDequantize(const GPUDevice& d, TTypes<float>::ConstMatrix inputs,
                       float delta, float offset, float minVal, float maxVal,
                       TTypes<float>::Matrix outputs, int channel)
{
   float invScale = 1.0f / ((float)delta);

   // Clamp the selected channel to [minVal, maxVal].
   const auto clampedTensor = inputs.chip<1>(channel).cwiseMax(minVal).cwiseMin(maxVal);
   // Quantize, then dequantize (delta is the scale, invScale its reciprocal).
   const auto tensor = (clampedTensor * invScale).round() + offset;
   const auto tensor_2 = (tensor - offset) * delta;
   outputs.chip<1>(channel).device(d) = clampedTensor; // line taking the most time
}

If I disable the line below, the code runs almost 7 times faster on a large model than with the line in place (I understand the output won't be correct).

outputs.chip<1>(channel).device(d) = clampedTensor; 

But with the following code, the execution time is close to what I see with all of the code in place.

void  __attribute__((optimize("O0"))) quantizeDequantize(const GPUDevice& d, TTypes<float>::ConstMatrix inputs,
                        float delta, float offset, float minVal, float maxVal, 
                        TTypes<float>::Matrix outputs, int channel)
{
   outputs.chip<1>(channel).device(d) = inputs.chip<1>(channel);
}

These two experiments lead me to infer the following:

  • The Eigen backend does not run any operations unless the intermediate results are used to generate the output. Is that correct?
  • If so, how does Eigen know the graph? Does it figure these details out at compile time, similar to how GCC optimizes code?
  • Does adding __attribute__((optimize("O0"))) make any difference to the way the Eigen backend executes the above code?

There is 1 best solution below


Eigen seems to have answered these questions here: https://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html
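
To make the idea on that page concrete, here is a toy sketch of lazy evaluation in plain C++. It is not Eigen's actual implementation, and every name in it (ClampExpr, clamp, assign, g_clamp_calls) is made up for illustration. It only shows the general pattern: building an expression stores references and parameters, and the element-wise work runs only when the expression is assigned to an output.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many element-wise clamp operations actually run.
static int g_clamp_calls = 0;

// Hypothetical lazy expression: holds a reference to the input and the
// bounds, but computes nothing until operator[] is called.
struct ClampExpr {
    const std::vector<float>& in;
    float lo, hi;

    float operator[](std::size_t i) const {
        ++g_clamp_calls;
        const float v = in[i];
        return v < lo ? lo : (v > hi ? hi : v);
    }
    std::size_t size() const { return in.size(); }
};

// Building the expression is cheap: no loop over the data happens here.
inline ClampExpr clamp(const std::vector<float>& in, float lo, float hi) {
    return ClampExpr{in, lo, hi};
}

// Assignment is the only place where the expression is evaluated. In this
// toy it plays the role of `outputs.chip<1>(channel).device(d) = expr`.
template <typename Expr>
void assign(std::vector<float>& out, const Expr& e) {
    assert(out.size() == e.size());
    for (std::size_t i = 0; i < e.size(); ++i) {
        out[i] = e[i];
    }
}
```

In this sketch, calling clamp(...) alone leaves g_clamp_calls at 0, and the counter only goes up once assign(...) runs. That matches what you measured: the `const auto` lines in your function only build expression objects, and the assignment through device(d) is what triggers the actual computation. The expression's structure is encoded in C++ types that the compiler resolves via templates, so there is no runtime graph to discover. As far as I can tell, the optimize("O0") attribute would affect how the host-side code in that function is compiled, but it should not change whether evaluation is deferred until assignment.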