I'm trying to write a spaCy pattern that matches a noun followed by an adjective. So far I have the following:
pattern = [{'POS':'NOUN'}, {'POS':'ADJ'}]
However, I want to exclude adjectives that are participle forms of a verb. My examples are in Spanish, so I apologize. For example, I want to find and retokenize things like 'institución educativa' but not 'institución comprometida', since 'comprometida' has VerbForm_part=True in its tag.
I tried adding the following, but it only made the pattern stop working altogether, even in cases like 'institución educativa':
pattern = [{'POS':'NOUN'}, {'OP':'!', 'TAG':'VerbForm_part'}, {'POS':'ADJ'}]
I also tried:
pattern = [{'POS':'NOUN'}, {'POS':'ADJ', 'TAG': not 'VerbForm_part'}]
In summary, I need to group together nouns followed by adjectives, but only SOME types of adjectives, excluding the others based on the 'VerbForm_part' attribute in their TAG.
Is there any way to do this in spaCy? Does it support exceptions in its patterns?
Thank you!
I found a solution, which was to define my own matcher and use it to retokenize whenever it found a match.
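A minimal sketch of that idea, not necessarily the exact code used. It is written without spaCy so it runs on its own: `Token` is a hypothetical stand-in that mirrors the `pos_` and `tag_` attributes of real spaCy tokens, `find_noun_adj` is a made-up helper name, and the `VerbForm=Part` marker and the tag strings in the example are assumptions about how the model writes its tags, so adjust them to whatever your model actually produces.

```python
from dataclasses import dataclass


@dataclass
class Token:
    # Hypothetical stand-in for a spaCy token; only the attributes used below.
    text: str
    pos_: str
    tag_: str


def find_noun_adj(tokens, exclude_marker="VerbForm=Part"):
    """Return (start, end) index pairs for NOUN + ADJ sequences,
    skipping adjectives whose tag contains exclude_marker."""
    spans = []
    for i in range(len(tokens) - 1):
        noun, adj = tokens[i], tokens[i + 1]
        if noun.pos_ != "NOUN" or adj.pos_ != "ADJ":
            continue
        if exclude_marker in adj.tag_:
            continue  # participle used as an adjective: leave it alone
        spans.append((i, i + 2))
    return spans


# Illustrative tag strings only; real models may format them differently.
tokens = [
    Token("institución", "NOUN", "NOUN__Gender=Fem|Number=Sing"),
    Token("educativa", "ADJ", "ADJ__Gender=Fem|Number=Sing"),
    Token("e", "CONJ", "CCONJ___"),
    Token("institución", "NOUN", "NOUN__Gender=Fem|Number=Sing"),
    Token("comprometida", "ADJ", "ADJ__Gender=Fem|Number=Sing|VerbForm=Part"),
]

matches = find_noun_adj(tokens)
print(matches)  # [(0, 2)]
```

Since a spaCy Doc supports len() and indexing, the same function should also accept a real doc. Each returned pair can then be merged inside a `with doc.retokenize() as retokenizer:` block by calling `retokenizer.merge(doc[start:end])`.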
If anyone can improve upon this, or tell me whether spaCy supports this natively, it would be greatly appreciated!