I am trying to fuzzy match two datasets on the name column in Python.
I get a KeyError saying that name is not in the index, even though name is a column in both datasets. Can anyone suggest how to fix this?
Here is my code:
from pathlib import Path
import pandas as pd
import fuzzymatcher
al_usa = pd.read_csv('al_all_reference_usa.dta.csv', skipinitialspace=True)
al_sample = pd.read_csv('al_final.dta.csv', skipinitialspace=True)
left_on = ['name', 'state']
right_on = ['name', 'state']
matched_results = fuzzymatcher.fuzzy_left_join(al_usa,
                                               al_sample,
                                               left_on,
                                               right_on,
                                               left_id_col='id',
                                               right_id_col='id')
Here is my error output:
raise KeyError(f"{not_found} not in index")
KeyError: "['name'] not in index"
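
A possible way to narrow this down, offered as a sketch rather than a confirmed diagnosis: this kind of KeyError often means the header in one of the files is not exactly `name` (for example `Name`, `name ` with a trailing space, or a header carrying a byte-order mark). The helpers below, `normalize_columns` and `missing_keys`, are hypothetical names introduced here for illustration and are not part of pandas or fuzzymatcher. They work on plain lists of header strings, so the demo headers are made up and not taken from the actual files.

```python
def normalize_columns(columns):
    """Return cleaned header names: drop a BOM, strip whitespace, lowercase."""
    return [str(c).replace("\ufeff", "").strip().lower() for c in columns]


def missing_keys(columns, keys):
    """List the join keys that are absent from the cleaned headers."""
    cleaned = set(normalize_columns(columns))
    return [k for k in keys if k not in cleaned]


# Made-up headers standing in for what a CSV export might contain (assumption).
demo_headers = ["\ufeffName ", " state", "ID"]

print(normalize_columns(demo_headers))
print(missing_keys(demo_headers, ["name", "state", "id"]))
```

With the real data, the same check could be applied by printing `list(al_usa.columns)` and `list(al_sample.columns)`, then assigning `al_usa.columns = normalize_columns(al_usa.columns)` (and likewise for `al_sample`) before calling `fuzzy_left_join`. If `missing_keys` still reports `name`, the column is probably stored under a different name in that file, or the file uses a delimiter other than a comma so that all headers were read as one column.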