Ignore punctuation when checking whether a word is an English word


I am looking for the best way to correct potential misspellings in the words of a string without taking punctuation into account. I do not want to strip the punctuation before checking, because that would alter the final edited string. My current approach splits the string on whitespace and calls py-enchant's .check() method on each piece, but this does not ignore punctuation, so a token such as '(tesl' or 'strung.' is checked with the punctuation still attached.

misspelled_string = 'This is a (tesl strung.'

Desired output:

corrected_string = 'This is a (test string.'

There is 1 best solution below

Georgina Skibinski On

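Try splitting on anything that is not a word character, using re. Note that \w matches digits and underscores as well as letters, and that the split discards the punctuation, so the pieces alone are not enough to rebuild the original string: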

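import re

misspelled_string = 'This is a (tesl strung.'

# Split on runs of non-word characters; [^\w]+ is equivalent to \W+.
# The trailing '.' produces an empty string at the end of the list.
res = re.split(r"[^\w]+", misspelled_string)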

Output:

>>> res
['This', 'is', 'a', 'tesl', 'strung', '']
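
Since the question asks for the punctuation to survive in the final string, one possible alternative to splitting is to substitute only the runs of letters in place with re.sub and a callback, leaving every other character untouched. The following is a minimal sketch under stated assumptions, not a definitive implementation: correct_words, toy_check, toy_suggest and KNOWN_WORDS are hypothetical names introduced here, and the toy word list with difflib stands in for a real dictionary so that the example runs without extra dependencies.

```python
import difflib
import re

# Runs of letters only (word characters minus digits and underscore).
WORD_RE = re.compile(r"[^\W\d_]+")


def correct_words(text, check, suggest):
    """Replace misspelled letter runs in place; other characters are kept.

    check(word) -> bool and suggest(word) -> list of str are supplied by
    the caller (hypothetical interface chosen for this sketch).
    """
    def fix(match):
        word = match.group(0)
        if check(word):
            return word
        candidates = suggest(word)
        # Take the first suggestion, or keep the word if there is none.
        return candidates[0] if candidates else word

    return WORD_RE.sub(fix, text)


# Toy stand-ins for a real spell checker (assumption, for illustration only).
KNOWN_WORDS = ["this", "is", "a", "test", "string"]


def toy_check(word):
    return word.lower() in KNOWN_WORDS


def toy_suggest(word):
    return difflib.get_close_matches(word.lower(), KNOWN_WORDS, n=1)


misspelled_string = 'This is a (tesl strung.'
corrected_string = correct_words(misspelled_string, toy_check, toy_suggest)
print(corrected_string)  # This is a (test string.
```

With py-enchant, the same function could presumably be called as correct_words(misspelled_string, d.check, d.suggest), where d = enchant.Dict("en_US"). The first suggestion is not guaranteed to be the intended word, and this pattern splits contractions such as "don't" at the apostrophe, so both choices may need adjusting.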