Grammar/spelling checking with word suggestions in Python


I am working on an NLP project that analyzes specifications written in natural language. I am using the NLTK toolkit and autocorrect for tokenizing, POS tagging, and spell checking, but I recently ran into a problem. Take the sentence "Then it terns left.", where the user actually means "Then it turns left."

The NLTK POS tagger tags "terns" as an adjective. Since the sentence itself is grammatically incorrect and the NLTK parser is still limited to correct sentences, I won't blame it. And since "tern" is a valid English word, autocorrect doesn't catch the error either. When I test the sentence with a grammar tool such as Grammarly, it reports that the word "terns" does not seem to fit this context and suggests replacing it with "turns".

How can I fix this problem? That is, how can I report the error and suggest a correction for the sentence: "Then it terns left." --> "Then it turns left."

My current thought is to check the grammar first: for example, to require that the word between "it" and "left" be a verb, and then make a suggestion based on the fact that a verb is needed. However, the NLTK parser doesn't really tell me which word causes the problem. I also tried grammar-check and language-check (which are the same thing), but they are too slow for my purpose.

Any suggestions on how to solve this problem?


There is 1 solution below


What you're describing is a difficult problem, but it could potentially be solved by checking word concordance, that is, by looking at the contexts in which the word is used elsewhere. You can then make an educated guess about whether the word makes sense where it appears in the sentence in question. Here's an example from the NLTK docs, using Moby Dick as the search space.

>>> text1.concordance("monstrous")
Displaying 11 of 11 matches:
ong the former , one was of a most monstrous size . ... This came towards us ,
ON OF THE PSALMS . " Touching that monstrous bulk of the whale or ork we have r
ll over with a heathenish array of monstrous clubs and spears . Some were thick
d as you gazed , and wondered what monstrous cannibal and savage could ever hav
that has survived the flood ; most monstrous and most mountainous ! That Himmal
they might scout at Moby Dick as a monstrous fable , or still worse and more de
th of Radney .'" CHAPTER 55 Of the monstrous Pictures of Whales . I shall ere l
ing Scenes . In connexion with the monstrous pictures of whales , I am strongly
ere to enter upon those still more monstrous stories of them which are to be fo
ght have been rummaged out of this monstrous cabinet there is no telling . But
of Whale - Bones ; for Whales of a monstrous size are oftentimes cast up dead u
>>>
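
The same idea can be turned into a rough automatic check. The following is only a minimal sketch, not a tested solution: the toy `CORPUS`, the helper names `context_count` and `suggest`, and the similarity cutoff of 0.7 are all assumptions made for illustration. It flags a word that never occurs between its two neighbours in the corpus, then looks for a similarly spelled word that does occur in that context.

```python
from collections import Counter
from difflib import get_close_matches

# Toy corpus standing in for a real, much larger one (assumption for illustration).
CORPUS = """
the robot turns left at the wall . then it turns right .
it turns left again and then it stops .
the car turns left when the light changes .
terns are seabirds . the terns nest on the island .
""".split()

# Count every (previous word, word, next word) context seen in the corpus.
trigrams = Counter(zip(CORPUS, CORPUS[1:], CORPUS[2:]))
vocabulary = sorted(set(CORPUS))


def context_count(prev_word, word, next_word):
    """Return how often `word` occurs between `prev_word` and `next_word`."""
    return trigrams[(prev_word, word, next_word)]


def suggest(tokens, min_count=1, cutoff=0.7):
    """Return (index, word, suggestion) for each word that is unseen in its
    context while a similarly spelled vocabulary word is seen there."""
    suggestions = []
    for i in range(1, len(tokens) - 1):
        prev_word, word, next_word = tokens[i - 1], tokens[i], tokens[i + 1]
        if context_count(prev_word, word, next_word) >= min_count:
            continue  # this context is attested, nothing to report
        # Similarly spelled words from the corpus vocabulary.
        candidates = get_close_matches(word, vocabulary, n=5, cutoff=cutoff)
        # Keep only the candidates that are attested in the same context.
        scored = [
            (context_count(prev_word, c, next_word), c)
            for c in candidates
            if c != word
        ]
        scored = [pair for pair in scored if pair[0] >= min_count]
        if scored:
            suggestions.append((i, word, max(scored)[1]))
    return suggestions


print(suggest("then it terns left .".split()))  # [(2, 'terns', 'turns')]
print(suggest("then it turns left .".split()))  # []
```

With a corpus this small, most unseen contexts are simply missing data, which is why the sketch only reports a word when a close spelling alternative is attested in the same slot. On real text you would need a far larger corpus and probably some smoothing or backoff to bigrams.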

Additionally, if you're not already using the Stanford POS tagger instead of the default NLTK tagger, it can generate better results at the expense of speed.