In the original Google's SmartReply paper:
Our search is conducted as follows. First, the elements of R (the set of all possible responses) are organized into a trie. Then, we conduct a left-to-right beam search, but only retain hypotheses that appear in the trie. This search process has complexity O(bl) for beam size b and maximum response length l. Both b and l are typically in the range of 10-30, so this method dramatically reduces the time to find the top responses and is a critical element of making this system deployable.
I want to implement something similar. Some questions:
- How to represent the set of all possible responses (R) in a trie data structure in TF?
- How to modify the current beam search to retain only hypothesis that are available in R?