I'm trying to extract verbs from German sentences. The problem is that, for example, in this sentence:
Ich rufe noch einmal an.
I'm getting "rufe" as the verb, but it should be "anrufe". I'm using TextBlob and don't really know anything about linguistics. While using TextBlob I came across POS tags. It tagged "an" as "RP" (I don't know what that means) and "rufe" as "VB". I could just glue all "RP" and "VB" tokens together, but then again there could be more than one verb in a sentence.
What is the right way of doing this?
If I understand correctly, download_corpora is part of the textblob installation (it is typically run as python -m textblob.download_corpora). Once the corpora are downloaded, you can use textblob for text analysis.
Another interesting sub-library for German is here: https://pypi.org/project/textblob-de/
Maybe this answer will help you dig deeper into POS tagging, because your POS tagger probably uses the Penn Treebank tagset it describes: Java Stanford NLP: Part of Speech labels?
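P.S. In German, the word "an" is part of the verb here: "anrufen" is a separable verb whose prefix moves to the end of the clause. 'RP' is the tag for a particle. Hence, the POS tags 'VB' and 'RP' in your sentence belong to one and the same verb.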
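To address the worry about several verbs in one sentence, here is a minimal sketch of the "glue" idea. It is a heuristic, not a linguistically complete solution: it assumes each 'RP' particle belongs to the closest verb before it, and that a verb takes at most one particle. The function name extract_verbs is hypothetical, and the tags for "Ich", "noch" and "einmal" below are only illustrative; your tagger may label them differently.

```python
from typing import List, Tuple


def extract_verbs(tagged: List[Tuple[str, str]]) -> List[str]:
    """Return the verbs of a tagged sentence, gluing a separated
    particle (tag 'RP') onto the front of the closest preceding verb."""
    verbs = []  # [particle_prefix, verb] pairs, in sentence order
    for word, tag in tagged:
        if tag.startswith("VB"):
            # Any Penn Treebank verb tag: VB, VBD, VBZ, ...
            verbs.append(["", word])
        elif tag == "RP" and verbs and not verbs[-1][0]:
            # Assumption: the particle belongs to the last verb seen,
            # and a verb takes at most one particle.
            verbs[-1][0] = word
    return [prefix + verb for prefix, verb in verbs]


# (word, tag) pairs in the shape a POS tagger typically returns.
# Only the tags for "rufe" and "an" come from the question.
tagged = [("Ich", "PRP"), ("rufe", "VB"), ("noch", "RB"),
          ("einmal", "RB"), ("an", "RP")]
print(extract_verbs(tagged))  # ['anrufe']
```

With textblob, the (word, tag) pairs would come from the tags property of a TextBlob object. Verbs without a particle are returned unchanged, so a sentence with several verbs yields one entry per verb.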