Identifying different tenses of a word in Amazon Comprehend Medical

I am using Amazon Comprehend Medical for entity detection of injuries.

Let's say I have a piece of text like this:

"John had surgery to repair a dislocated left knee and a full ACL tear."

Amazon Comprehend Medical (ACM) is able to recognize "dislocated" as a medical condition. However, consider the next piece of text:

"John is sidelined with a dislocated right kneecap."

In this piece of text, ACM does not recognize "dislocated" as a medical condition. Similarly, if I put in a piece of text like "Left ankle sprain", ACM recognizes "ankle sprain" as a medical condition. However, if I put in "sprained left ankle", it does not pick up the word "sprained" as a medical condition.

Is there any way I can clean my text or change the order of the words so that those entities are tagged accurately?

There is 1 best solution below

What you are looking for is called lemmatization. You can use, for example, the NLTK toolkit to reduce every word to its non-inflected base form (lemma), which will give you "dislocate" and "sprain" as base forms. This may improve the precision of the entity detection. The order of the words should not actually matter. Otherwise, train your own NER model (https://nlpforhackers.io/named-entity-extraction/).
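A minimal sketch of that preprocessing step, assuming NLTK is installed and its WordNet data has been downloaded once with nltk.download("wordnet"). The helper names nltk_verb_lemma and lemmatize_text and the protected word set are made up for illustration and are not part of NLTK or ACM. Every word is lemmatized as a verb, which is a simplification: without a part-of-speech tagger, a word like "left" would be read as a form of "leave", so laterality words are skipped by default.

```python
import re
from typing import Callable, FrozenSet


def nltk_verb_lemma(word: str) -> str:
    """Return the verb lemma of one lowercase word using NLTK's WordNet lemmatizer.

    Requires `pip install nltk` and a one-time `nltk.download("wordnet")`.
    """
    # Imported lazily so the rest of the module works without NLTK installed.
    from nltk.stem import WordNetLemmatizer

    return WordNetLemmatizer().lemmatize(word, pos="v")


def lemmatize_text(
    text: str,
    lemma: Callable[[str], str] = nltk_verb_lemma,
    protected: FrozenSet[str] = frozenset({"left", "right"}),
) -> str:
    """Replace inflected words with their lemma before sending text to an NER service.

    Punctuation and spacing are preserved. Words in `protected` and words whose
    lemma equals their lowercase form are left exactly as written, so names and
    acronyms keep their original casing.
    """

    def swap(match: "re.Match[str]") -> str:
        word = match.group(0)
        lower = word.lower()
        if lower in protected:
            return word
        base = lemma(lower)
        return word if base == lower else base

    return re.sub(r"[A-Za-z]+", swap, text)
```

Calling lemmatize_text("sprained left ankle") would then produce the text to pass to ACM in place of the raw sentence. Whether ACM tags the lemmatized text any better is something to check against your own data, since auxiliary verbs such as "is" and "had" are also reduced to their base forms.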