I have the following dataframe:
df = pd.DataFrame({'id':[1,2,3],'text':['a foox juumped ovr the gate','teh car wsa bllue','why so srious']})
I would like to generate a new column with the spelling errors corrected, using the pyspellchecker library.
I have tried the following but it did not correct any spelling errors:
import pandas as pd
from spellchecker import SpellChecker
spell = SpellChecker()
def correct_spelling(word):
    corrected_word = spell.correction(word)
    if corrected_word is not None:
        return corrected_word
    else:
        return word
df['corrected_text'] = df['text'].apply(correct_spelling)
Below is a dataframe showing what the expected output should look like:
pd.DataFrame({'id':[1,2,3],'text':['a foox juumped ovr the gate','teh car wsa bllue','why so srious'],
'corrected_text':['a fox jumped over the gate','the car was blue','why so serious']})
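I don't know this package well, so I can't say how to improve its accuracy, but the reason nothing changes is that apply passes each whole sentence to correct_spelling, and the spell checker treats that sentence as a single word. Instead, you can split the string in each row into a list of words, correct each word, and join the words back together. This example uses a list comprehension: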
Output (as you can see, you would need to work on the accuracy):