TensorFlow multi-threaded inference: low GPU utilization


I load a TensorFlow model with LoadSavedModel and run inference through a singleton bundle.session->Run. When I predict from a single thread (each request is posted only after the previous prediction finishes), latency is very low and GPU utilization is high (over 40%). But when I switch to multiple threads sharing the same bundle.session (different threads call the same bundle.session to get prediction results), inference time grows to 3-4x the single-threaded latency while GPU utilization drops below 10%. I do not know how to solve this problem. Any suggestions?
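A minimal sketch of the call pattern described above (the model path, input/output tensor names, shapes, and thread count are placeholders, not the real ones):

```cpp
#include "tensorflow/cc/saved_model/loader.h"
#include "tensorflow/cc/saved_model/tag_constants.h"
#include "tensorflow/core/framework/tensor.h"

#include <thread>
#include <vector>

int main() {
  // Load the SavedModel once; the bundle (and its session) is shared.
  tensorflow::SavedModelBundle bundle;
  tensorflow::SessionOptions session_options;
  tensorflow::RunOptions run_options;
  TF_CHECK_OK(tensorflow::LoadSavedModel(
      session_options, run_options, "/path/to/saved_model",
      {tensorflow::kSavedModelTagServe}, &bundle));

  auto predict = [&bundle]() {
    // Placeholder input shape; the real model's shape differs.
    tensorflow::Tensor input(tensorflow::DT_FLOAT,
                             tensorflow::TensorShape({1, 128}));
    std::vector<tensorflow::Tensor> outputs;
    // All threads call Run on the same session. Session::Run is
    // thread-safe, but concurrent calls contend for the GPU.
    TF_CHECK_OK(bundle.session->Run(
        {{"serving_default_input:0", input}},   // hypothetical input name
        {"StatefulPartitionedCall:0"},          // hypothetical output name
        {}, &outputs));
  };

  // Multiple worker threads share the single session.
  std::vector<std::thread> workers;
  for (int i = 0; i < 8; ++i) workers.emplace_back(predict);
  for (auto& t : workers) t.join();
  return 0;
}
```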

I set TF_FORCE_GPU_ALLOW_GROWTH=true, but it does not seem to help.

