I am new to Rasa and have started creating a very domain-specific chatbot. As part of this, I understand it's better to use supervised embeddings in the NLU pipeline, since my use case is domain-specific.
I have an example intent in my nlu.md:

## intent:create_system_and_config
- create a [VM](system) of [12 GB](config)
A supervised featurizer would probably work fine with my domain-specific entities, but my concern is: by using only supervised learning, won't we lose the advantage of pre-trained models? For example, in a query such as "add a (some_system) of (some_config)", the verbs "add" and "create" are very closely related, and pre-trained models can pick up such verbs easily. Is it possible to combine a pre-trained model with some supervised learning on top of it in the NLU pipeline, something like transfer learning?
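To make the question concrete, this is roughly the combined pipeline I have in mind (just a sketch assuming Rasa 1.x component names, not a tested config): a pre-trained spaCy featurizer supplying dense word vectors alongside the count-vector featurizers, with the supervised EmbeddingIntentClassifier trained on top of the combined features.

```yaml
# config.yml -- sketch of a pipeline mixing pre-trained and supervised features
language: "en"

pipeline:
  - name: "SpacyNLP"                  # loads a pre-trained spaCy model (e.g. en_core_web_md)
  - name: "SpacyTokenizer"
  - name: "SpacyFeaturizer"           # dense, pre-trained word vectors
  - name: "RegexFeaturizer"
  - name: "CountVectorsFeaturizer"    # sparse features learned from my own training data
  - name: "CountVectorsFeaturizer"
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: "CRFEntityExtractor"
  - name: "EntitySynonymMapper"
  - name: "EmbeddingIntentClassifier" # supervised embeddings trained on the combined features
```

Would something along these lines give me the "transfer learning" effect I'm describing?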
If you're creating a domain-specific chatbot, it's always better to use supervised embeddings instead of pre-trained ones. In your case as well, the model needs to capture that the words "VM" and "Virtual Machine" mean the same thing. Pre-trained featurizers are not trained to capture this; they are more generic.
For more details, you can refer to the Rasa documentation.
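As a starting point, the pre-configured supervised pipeline can be selected with a one-line config (a minimal sketch assuming Rasa 1.x, where the `supervised_embeddings` preset exists; newer versions replace it with explicitly listed components):

```yaml
# config.yml -- use the pre-configured supervised embeddings pipeline
language: "en"
pipeline: "supervised_embeddings"
```

This preset expands to a WhitespaceTokenizer, RegexFeaturizer, CRFEntityExtractor, EntitySynonymMapper, two CountVectorsFeaturizers (word level and char_wb n-grams) and the EmbeddingIntentClassifier, so all features are learned from your own training data.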