I have retrained a ResNet-50 model for re-identification to run on the Edge TPU. However, there seems to be no way to feed a batch of images to the Edge TPU.
My workaround is to run multiple copies of the same model, one per image.
However, is there any way to speed up inference across multiple models? Threading is currently even slower than single-model inference.
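For what it's worth, threading only helps if each thread has its own interpreter bound to its own Edge TPU device; with a single TPU the threads just serialize on the device. Here is a minimal sketch of that dispatch pattern, with the actual Edge TPU interpreter replaced by a placeholder (`make_interpreter` is hypothetical; real code would build a `tflite_runtime` interpreter with the Edge TPU delegate per device):

```python
import queue
import threading

import numpy as np

def make_interpreter(device_index):
    """Placeholder for building one interpreter per Edge TPU device.

    In real code this would create a tflite_runtime Interpreter with an
    Edge TPU delegate for device `device_index`; here it just returns a
    stand-in inference function so the dispatch logic is runnable.
    """
    def infer(image):
        return image.mean()  # stand-in for invoke() + output fetch
    return infer

def run_pool(images, num_devices=2):
    """Round-robin images across one worker (one interpreter) per device."""
    tasks = queue.Queue()
    for i, img in enumerate(images):
        tasks.put((i, img))
    results = [None] * len(images)

    def worker(device_index):
        infer = make_interpreter(device_index)
        while True:
            try:
                i, img = tasks.get_nowait()
            except queue.Empty:
                return
            results[i] = infer(img)

    threads = [threading.Thread(target=worker, args=(d,))
               for d in range(num_devices)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

images = [np.full((224, 224, 3), v, dtype=np.float32) for v in (0.0, 1.0)]
print(run_pool(images))  # one result per input image, in input order
```

With only one Edge TPU attached, `num_devices=1` is the realistic setting, and this degenerates to a sequential loop, which matches the slowdown you're seeing.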

Yeah, the Edge TPU's architecture doesn't allow batched inference. Have you tried model pipelining? https://coral.ai/docs/edgetpu/pipeline/
Unfortunately it's only available in C++ right now, but we're looking to extend it to Python in mid Q4.
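The idea behind pipelining is to split one model across several Edge TPUs so each segment works on a different frame concurrently. A toy Python sketch of that pattern (the two stage bodies are placeholders, not the real Coral API, which handles the model segmentation for you in C++):

```python
import queue
import threading

def pipeline(items):
    """Two stages linked by a queue: stage 2 handles frame N while
    stage 1 already works on frame N+1, so throughput overlaps."""
    q = queue.Queue(maxsize=4)
    results = []

    def stage1():
        for x in items:
            q.put(x * 2)   # placeholder for the first model segment
        q.put(None)        # sentinel: no more work

    def stage2():
        while (x := q.get()) is not None:
            results.append(x + 1)  # placeholder for the second segment

    t1 = threading.Thread(target=stage1)
    t2 = threading.Thread(target=stage2)
    t1.start(); t2.start()
    t1.join(); t2.join()
    return results

print(pipeline([1, 2, 3]))  # [3, 5, 7]
```

Latency per frame stays the same, but with K stages on K TPUs the steady-state throughput improves up to K-fold.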