Runtime ~100X higher when returning a tensor from a tf.function used for serving

44 Views Asked by smm70 At 29 July 2025 at 05:59

I have a simple tf.function that computes the sum of entries from a mask. I noticed the runtime is around 0.25 s on my laptop each time the function is called for serving (after warmup). When I return 0 instead of my_var, the runtime is about two orders of magnitude lower, ~0.003 s. Is it safe to assume this is because returning the tensor makes TF convert the graph result to a value, which in turn takes time?

More importantly, any thoughts on improving the runtime while still allowing access to the value when the function is called for serving?

Thank you.

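@tf.function(input_signature=[tf.TensorSpec(shape=None, dtype=tf.float32)])
def my_serve(self, img_file):
    # Run the model and keep the first output, which is the mask.
    mask = self.model(img_file)[0]

    # Sum channel 0 of the mask and cast the scalar result to float16.
    my_var = tf.cast(tf.reduce_sum(mask[..., 0]), tf.float16)

    return my_var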
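
One way to check the hypothesis above is to time both variants with the same harness. Note that graph execution only runs the operations needed to produce the function's outputs and side effects, so a variant that returns a constant 0 may skip the model call entirely; in that case the gap would reflect skipped compute rather than value conversion. The sketch below is a minimal timing helper, not a TensorFlow API: `time_call`, `does_work`, and `returns_constant` are hypothetical names, and the stand-in callables are there only so the example runs without TensorFlow.

```python
import statistics
import time


def time_call(fn, *args, warmup=3, repeats=20):
    """Median wall-clock seconds for fn(*args), measured after warmup calls.

    Hypothetical helper for illustration; it is not part of TensorFlow.
    """
    # Warmup calls absorb one-time costs such as tf.function tracing.
    for _ in range(warmup):
        fn(*args)

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)

    # The median is less sensitive to occasional slow outliers than the mean.
    return statistics.median(samples)


if __name__ == "__main__":
    # Stand-in callables (assumptions) so the sketch runs without TensorFlow.
    def does_work():
        return sum(i * i for i in range(200_000))

    def returns_constant():
        return 0

    print(f"does_work:        {time_call(does_work):.6f} s")
    print(f"returns_constant: {time_call(returns_constant):.6f} s")
```

With the question's code, the same helper could be pointed at `my_serve` and at a variant that still returns something computed from the mask (for example the float32 sum without the cast), which would keep the model in the graph for both measurements.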
