How do I train a model to find occurrences of a US state in NLP?

How do I train a model to find occurrences of a US state when the label set is constrained to the 50 states, given that we need a large amount of data (say 1000 rows) to train each label?

There is 1 best solution below

I think it depends on the task you're trying to solve here. Do you need to tell whether a given two-letter combination is a US state abbreviation or not? Would a simple set of names work? Or are you trying to build some kind of simple NER (https://en.wikipedia.org/wiki/Named-entity_recognition) for state names? In the NER case you can also start with simple regex matching, and if you want to train a model later, you will have many more than 50 examples. Your dataset won't be just "do these two letters represent a state or not", but many sentences, some of which contain state names somewhere in them and some of which contain none at all.
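To make the "start with simple matching" idea concrete, here is a minimal sketch in Python. It assumes full state names only (no two-letter abbreviations), uses a deliberately short illustrative subset of names, and the `STATE` tag and the dictionary layout are hypothetical choices, not something a particular library requires. The same matcher can be used to pre-label sentences as candidate training data for a model later.

```python
import re

# Illustrative subset only (an assumption for brevity); extend to all 50 states.
STATE_NAMES = ["New York", "California", "Texas", "Washington", "West Virginia", "Virginia"]

# Longest alternatives first: a defensive ordering in case one name is a prefix of another.
_alternatives = "|".join(
    re.escape(name) for name in sorted(STATE_NAMES, key=len, reverse=True)
)
_pattern = re.compile(r"\b(?:" + _alternatives + r")\b")


def find_states(sentence):
    """Return a (start, end, name) tuple for each state name found in the sentence."""
    return [(m.start(), m.end(), m.group()) for m in _pattern.finditer(sentence)]


def to_training_example(sentence):
    """Pair a sentence with its matched character spans under a hypothetical STATE tag."""
    spans = [(start, end, "STATE") for start, end, _ in find_states(sentence)]
    return {"text": sentence, "entities": spans}


if __name__ == "__main__":
    for text in ["I moved from West Virginia to Texas.", "No place is mentioned here."]:
        print(to_training_example(text))
```

Note that plain matching cannot tell the state "Washington" from a person named Washington; sentences like that are where a trained model, fed with many labeled sentences rather than 50 names, might help.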