TensorFlow Hub module reuse


Say I want to use a specific module (text embeddings) from TF Hub to create two distinct models that I would then like to export and serve.

Option 1: Import the module for each model, put each classifier on top, and export two models; serve each in its own Docker container. These models contain both the underlying embedding module and the classifier.

Option 2: Serve the module itself, and have its output go to two different served models that themselves do not contain the embeddings. (Is this even possible?)
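To make Option 2's data flow concrete, here is a toy sketch in plain Python. `embedding_service`, `classifier_a`, and `classifier_b` are hypothetical stand-ins for three separately deployed models; a real setup would use something like TensorFlow Serving with network calls between services.

```python
# Toy sketch of Option 2's data flow. These plain functions are hypothetical
# stand-ins for three separately served models; the "embeddings" here are
# made-up numbers, not real text embeddings.

def embedding_service(text):
    # Stand-in for the shared, separately served embedding module:
    # maps text to a fixed-size vector.
    words = text.split()
    return [sum(len(w) for w in words), float(len(words))]

def classifier_a(vector):
    # Downstream model A: consumes embeddings, never sees raw text.
    return "long-words" if vector[0] > 4 * vector[1] else "short-words"

def classifier_b(vector):
    # Downstream model B: same input contract, different task.
    return "many-words" if vector[1] > 3 else "few-words"

# The client (or a routing layer) computes the embedding once and
# fans it out to both downstream models:
vector = embedding_service("hello from the development side")
result_a = classifier_a(vector)
result_b = classifier_b(vector)
```

The point of the sketch is the contract: both classifiers depend only on the embedding service's output format, and neither contains the embedding logic itself.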

My computer science background tells me that Option 2 is better, since we would be reusing the original embedding module for both models, and also decoupling the models themselves from the embedding module.

However, from a practical standpoint, when a data scientist is coding, they import the module and train with the classifier on top of it, so it becomes cumbersome to export the classifier on its own, without the underlying embeddings.

Can anyone point me in the right direction? Hopefully my question makes sense. I am not a data scientist myself; I come from more of a development background.

Thanks


Best answer:

Putting a classifier on top of an embedding module creates a fairly strong dependency: the classifier must be trained to the particular embedding space. Unless you make very special arrangements, just swapping in another embedding module won't work. So Option 1 is quite good: it yields two models that can be served and updated independently. They have some overlap, akin to two statically linked programs using the same library, but the source code is still modular: using Hub embedding modules through their common signature makes them interchangeable.
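The coupling described above can be illustrated with a toy example in plain Python. `embed_v1`, `embed_v2`, and `WEIGHTS` are all made up: the weights pretend to be a classifier head trained against `embed_v1`'s output space, and `embed_v2` stands in for a "drop-in replacement" embedding module.

```python
# Toy illustration of the classifier/embedding coupling: a linear head whose
# weights were fit against one embedding space gives meaningless scores when
# fed vectors from a different space. All numbers here are made up.

def embed_v1(text):
    # "Embedding space" v1: (total characters, word count).
    return [float(len(text)), float(len(text.split()))]

def embed_v2(text):
    # A hypothetical replacement module with reordered, rescaled dimensions.
    return [float(len(text.split())), len(text) / 100.0]

WEIGHTS = [0.1, -0.5]  # pretend these were learned on embed_v1's outputs

def score(vector):
    # Linear classifier head: a dot product with the trained weights.
    return sum(w * x for w, x in zip(WEIGHTS, vector))

s = "swapping embedding modules silently changes the input space"
score_v1 = score(embed_v1(s))  # meaningful under v1's space
score_v2 = score(embed_v2(s))  # same head, wrong space: a different number
```

Nothing errors when the module is swapped; the classifier just silently scores against the wrong space, which is why the head must be retrained (or the pair versioned together) whenever the embedding changes.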

Option 2, by comparison, gives you three moving parts with non-trivial dependencies. If your goal is simplicity, I wouldn't go there.