I'm trying to build a method that imports multiple types of CSV and Excel files and standardizes them. Everything was running smoothly until a certain CSV showed up and raised this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcd in position 133: invalid continuation byte
I'm building a set of try/excepts to cover variations in the files, but I couldn't figure out how to prevent this one.
    if csv_or_excel_path[-3:]=='csv':
        try: table=pd.read_csv(csv_or_excel_path)
        except:
            try: table=pd.read_csv(csv_or_excel_path,sep=';')
            except:
                try: table=pd.read_csv(csv_or_excel_path,sep='\t')
                except:
                    try: table=pd.read_csv(csv_or_excel_path,encoding='utf-8')
                    except:
                        try: table=pd.read_csv(csv_or_excel_path,encoding='utf-8',sep=';')
                        except: table=pd.read_csv(csv_or_excel_path,encoding='utf-8',sep='\t')
By the way, the separator of the file is ";".
So:
a) I understand it would be easier to track down the problem if I could identify the character at "position 133", but I'm not sure how to find that out. Any suggestions?
b) Does anyone have a suggestion on what to include in that try/except sequence to get past this problem?
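
For question (a), one possible approach is to read the file as raw bytes, attempt the decode yourself, and inspect the exception. The sketch below is only an illustration: find_bad_byte and its width parameter are made-up names, and the offset it reports is relative to the start of the file, which may not always match the position pandas prints if pandas decoded the file in chunks.

```python
from pathlib import Path


def find_bad_byte(path, encoding="utf-8", width=20):
    """Return (offset, offending bytes, surrounding bytes) for the first
    decode failure, or None if the whole file decodes cleanly.

    Hypothetical helper for illustration; not part of pandas.
    """
    data = Path(path).read_bytes()
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        # err.start and err.end delimit the bytes the codec rejected.
        start = max(0, err.start - width)
        context = data[start:err.end + width]
        return err.start, data[err.start:err.end], context
    return None
```

Printing the surrounding bytes usually makes the intended character obvious from the neighbouring text. As a hint, byte 0xcd is "Í" in both Latin-1 and cp1252, so the file may well have been saved in one of those encodings rather than UTF-8.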
For the record, this is probably better than multiple try/excepts.
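
One way to flatten a nested try/except chain is a single loop over candidate encodings. This is a minimal sketch under assumptions: guess_encoding and CANDIDATE_ENCODINGS are hypothetical names, and the list of candidates is a guess at what the files might contain, not something known about the data. Note also that every attempt in the question passes UTF-8 (explicitly or by default), so none of them can get past a byte that is invalid UTF-8.

```python
from pathlib import Path

# Assumed candidates. Order matters: latin-1 maps every byte to a
# character, so it never fails and must come last as a catch-all.
CANDIDATE_ENCODINGS = ("utf-8", "cp1252", "latin-1")


def guess_encoding(path, candidates=CANDIDATE_ENCODINGS):
    """Return the first candidate encoding that decodes the whole file.

    Hypothetical helper for illustration. A successful decode does not
    prove the encoding is the right one, only that it is possible.
    """
    data = Path(path).read_bytes()
    for encoding in candidates:
        try:
            data.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next candidate
        return encoding
    raise ValueError(f"none of {candidates} could decode {path}")
```

The result can then be passed to pandas, for example pd.read_csv(csv_or_excel_path, encoding=guess_encoding(csv_or_excel_path), sep=';'). For the separator, read_csv also accepts sep=None together with engine='python', which asks pandas to sniff the delimiter; that may be worth trying before looping over separators by hand.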