How can I evaluate whether my training data is suitable for an NLP model?

Data with semantic meaning is generally well suited to NLP models. However, when I look at my task data, I don't know how to evaluate whether it is suitable for an NLP model.

The following is my data format:

  1. BGP is used to exchange routing information among autonomous systems (ASes). I want to analyze AS_PATH data, which looks like "AS2449 AS3356 AS32934". Each record is a list of AS numbers.
  2. The AS_PATH follows some rules: the first AS is called the vantage point, the last AS is called the origin AS, and the path contains no loops.
  3. There are millions of records.
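To make the format concrete, here is a minimal sketch of how one record could be parsed and checked against the rules listed above. The function names `parse_as_path`, `has_loop`, and `describe_path` are hypothetical, and the sketch assumes every token has the form "AS" followed by digits, as in the example path.

```python
from typing import Dict, List, Union


def parse_as_path(raw: str) -> List[int]:
    """Split a string like 'AS2449 AS3356 AS32934' into a list of AS numbers."""
    asns = []
    for token in raw.split():
        # Assumption: every token is 'AS' followed by decimal digits.
        if not token.startswith("AS") or not token[2:].isdecimal():
            raise ValueError(f"unexpected token: {token!r}")
        asns.append(int(token[2:]))
    return asns


def has_loop(asns: List[int]) -> bool:
    """Return True if any AS number occurs more than once in the path."""
    return len(set(asns)) != len(asns)


def describe_path(raw: str) -> Dict[str, Union[int, bool]]:
    """Extract the vantage point, origin AS, length, and loop flag of one path."""
    asns = parse_as_path(raw)
    if not asns:
        raise ValueError("empty AS_PATH")
    return {
        "vantage_point": asns[0],  # first AS in the path
        "origin": asns[-1],        # last AS in the path
        "length": len(asns),
        "has_loop": has_loop(asns),
    }


if __name__ == "__main__":
    print(describe_path("AS2449 AS3356 AS32934"))
```

Note that `has_loop` treats any repeated AS number as a loop, which matches the rule as I stated it; records that fail the check would need a closer look before training.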

I wonder whether there are criteria for judging whether such data is suitable for an NLP model.
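As a possible starting point, and not an established criterion, one could treat each AS number as a token and look at basic corpus statistics such as vocabulary size, mean path length, and the share of tokens that appear only once. The sketch below assumes whitespace-separated paths like the example above; `corpus_stats` and its output keys are hypothetical names.

```python
from collections import Counter
from typing import Dict, Iterable


def corpus_stats(paths: Iterable[str]) -> Dict[str, float]:
    """Compute simple token statistics over an iterable of AS_PATH strings."""
    token_counts: Counter = Counter()
    n_paths = 0
    n_tokens = 0
    for path in paths:
        tokens = path.split()  # each AS number is treated as one token
        token_counts.update(tokens)
        n_paths += 1
        n_tokens += len(tokens)

    vocab_size = len(token_counts)
    # Tokens seen exactly once give a model very little to learn from.
    singletons = sum(1 for count in token_counts.values() if count == 1)
    return {
        "n_paths": n_paths,
        "n_tokens": n_tokens,
        "vocab_size": vocab_size,
        "mean_path_length": n_tokens / n_paths if n_paths else 0.0,
        "singleton_fraction": singletons / vocab_size if vocab_size else 0.0,
    }


if __name__ == "__main__":
    print(corpus_stats(["AS2449 AS3356 AS32934"]))
```

Since the function only iterates once and keeps a single counter, it can be fed a generator that streams the millions of records from disk instead of loading them all into memory.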
