Why is CountVectorsFeaturizer used twice in the config.yml produced by rasa init?

59 Views Asked by etang At 03 June 2025 at 13:21

Below is part of config.yml generated by rasa init.

- name: WhitespaceTokenizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier

CountVectorsFeaturizer is specified twice here. Why is that?


There is 1 best solution below

Eunice On 28 November 2023 at 14:04

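The first CountVectorsFeaturizer uses the default word-level analyzer, so each token becomes one feature in the bag of words (BoW). The second one, with analyzer: char_wb, first creates character n-grams within each token's boundaries (here of length 1 to 4, per min_ngram and max_ngram) and then adds those to the BoW feature set. The two pair well together: the word features capture whole tokens, while the character n-grams still overlap for misspelled or previously unseen words. See the Rasa and scikit-learn documentation for details.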
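
To make the difference concrete, here is a minimal pure-Python sketch of the two kinds of features. It is not Rasa's or scikit-learn's actual code: the function names word_features and char_wb_features are made up for illustration, tokenization is a plain lowercase whitespace split, and edge cases (such as words shorter than the n-gram length) are handled more simply than in scikit-learn.

```python
from collections import Counter


def word_features(text):
    """Word-level bag of words: one count per lowercased whitespace token."""
    return Counter(text.lower().split())


def char_wb_features(text, min_ngram=1, max_ngram=4):
    """Character n-grams taken inside word boundaries.

    Each word is padded with one space on each side, so n-grams never
    span two words. This is a simplified stand-in for analyzer: char_wb.
    """
    counts = Counter()
    for word in text.lower().split():
        padded = f" {word} "
        for n in range(min_ngram, max_ngram + 1):
            if n > len(padded):
                break  # word too short for this n-gram length
            for start in range(len(padded) - n + 1):
                counts[padded[start:start + n]] += 1
    return counts


# A correctly spelled word and a typo share no word-level feature...
shared_words = set(word_features("hello")) & set(word_features("helo"))
print(shared_words)  # set()

# ...but they still share many character n-grams.
shared_chars = set(char_wb_features("hello")) & set(char_wb_features("helo"))
print(sorted(shared_chars))
```

With only the word-level featurizer, "helo" would look like a completely unknown token. With the character n-gram featurizer added, it still overlaps with "hello" on pieces such as "he", "hel" and "lo", which is why the generated config stacks both.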
