There is no proper solution available for "'utf-8' codec can't decode byte 0x99 in position 21"


I am currently stuck with a .csv file of 10 lakh (1 million) rows. I was loading the dataset into a pandas data frame called rawdata in Python. The file probably contains special (non-ASCII) character codes, which is why it raises this error: 'utf-8' codec can't decode byte 0x99 in position 21: invalid start byte. A solution for this is not properly documented anywhere. Here is everything I have tried:

import numpy as np
import pandas as pd
import scipy as sci

import sys
reload(sys)
sys.setdefaultencoding("ISO-8859-1")

import os
print os.getcwd()

os.chdir('D:\DJ\Placement reports\')

setwd()

rawdata=pd.read_csv('D:\DJ\Placement reports\Copy of Placement Reports _ Apr_Mar_May Page 2.csv', newline='', encoding='utf-8')

rawdata=pd.read_csv("D:\DJ\Placement reports\Copy of Placement Reports _ Apr_Mar_May Page 1.csv")

rawdata=pd.read_csv("D:\DJ\Pyhton analysis\wagering.csv")

a.encode('utf-8').strip()

x = pd.read_csv("D:\DJ\Placement reports\Test1.csv")

open('D:\DJ\Placement reports\Copy of Placement Reports _ Apr_Mar_May Page 2.csv', newline='', encoding='utf-8')

Data frame:

Month | Placement | Placement URL | Type | Campaign | Ad group | Clicks | Impr. | CTR | Avg. CPC | Cost
Apr-18 | Mobile App: Cric Informer(Dream11,Myteam11 tips & IPL NEWS ) (Google Play), by BRAJ & GEETA INC | https://play.google.com/store/apps/details?id=manager.attendance.fantasycrickettips | Mobile application | Display-Affinity-Keyword-Topics | Display_Keywords | 52,584 | 61,07,340 | 0.86% | ? 1.76 | ? 92,484.03
Mar-18 | Mobile App: NewsDog - Latest News, Breaking News, Local News (Google Play), by NewsDog Team | https://play.google.com/store/apps/details?id=com.newsdog | Mobile application | Display-Affinity-Keyword-Topics | Audience_Affinity | 99,361 | 58,55,703 | 1.70% | ? 0.82 | ? 81,644.29
Apr-18 | Mobile App: Cric Informer(Dream11,Myteam11 tips & IPL NEWS ) (Google Play), by BRAJ & GEETA INC | https://play.google.com/store/apps/details?id=manager.attendance.fantasycrickettips | Mobile application | Display-Custom-Intent-India | Custom-Intent | 28,106 | 43,14,179 | 0.65% | ? 2.85 | ? 79,991.28
Apr-18 | Mobile App: Cric Informer(Dream11,Myteam11 tips & IPL NEWS ) (Google Play), by BRAJ & GEETA INC | https://play.google.com/store/apps/details?id=manager.attendance.fantasycrickettips | Mobile application | Display-Affinity-Keyword-Topics | Audience_Affinity | 39,526 | 39,54,727 | 1.00% | ? 1.79 | ? 70,662.24
Apr-18 | us.com | http://us.com | Site | Display-Affinity-Keyword-Topics | Audience_Affinity | 23,792 | 60,06,433 | 0.40% | ? 2.83 | ? 67,301.35
Mar-18 | Mobile App: GiftMoney (Google Play), by KingToUpper | https://play.google.com/store/apps/details?id=com.akp151998.giftmoney | Mobile application | Display-Affinity-Keyword-Topics | Audience_Affinity | 27,012 | 3,15,541 | 8.56% | ? 2.47 | ? 66,765.34
Apr-18 | Mobile App: mCent Browser - Fast and Safe plus Free Data (Google Play), by mCent | https://play.google.com/store/apps/details?id=com.mcent.browser | Mobile application | Display-Affinity-Keyword-Topics | Display_Keywords | 31,898 | 56,07,897 | 0.57% | ? 1.77 | ? 56,368.85
Apr-18 | Mobile App: Cric Informer(Dream11,Myteam11 tips & IPL NEWS ) (Google Play), by BRAJ & GEETA INC | https://play.google.com/store/apps/details?id=manager.attendance.fantasycrickettips | Mobile application | Display-Affinity-Keyword-Topics | Display_Keywords | 52,584 | 61,07,340 | 0.86% | ? 1.76 | ? 92,484.03

I have tried all of these steps, but none of them worked. Please help by providing a solution or links. Please convert the sample above into CSV format.


There is 1 best solution below


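Try a.decode('utf-8', 'ignore'). The error is raised while decoding bytes, so the 'ignore' handler belongs on decode rather than encode. This should just drop any invalid bytes.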
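To show the difference between dropping the bad byte and decoding it with another codec, here is a minimal sketch using only the standard library. The sample bytes and the decode_with_fallback helper are hypothetical, made up for illustration. That the real file is cp1252 (Windows-1252) is only a guess, based on 0x99 being the trademark sign in that encoding.

```python
import csv
import io

# Hypothetical sample, not the asker's file: the byte at index 21 is 0x99,
# which is not a valid UTF-8 start byte.
raw = b"Month,Placement name \x99,Clicks\nApr-18,Example,52584\n"

# Reproduce the error and record where it happens.
try:
    raw.decode("utf-8")
    err_pos = None
except UnicodeDecodeError as exc:
    err_pos = exc.start  # index of the offending byte


def decode_with_fallback(data, encodings=("utf-8", "cp1252")):
    """Hypothetical helper: try each encoding in order, return (text, encoding)."""
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so it cannot fail,
    # but the result may be wrong if the guess is wrong.
    return data.decode("latin-1"), "latin-1"


text, used = decode_with_fallback(raw)
rows = list(csv.reader(io.StringIO(text)))

# The 'ignore' handler silently drops the byte instead of translating it.
dropped = raw.decode("utf-8", errors="ignore")

print(err_pos, used)
print(rows[0])
print(repr(dropped))
```

With pandas the same idea is a single argument, for example pd.read_csv(path, encoding="cp1252"), where path stands for the file location. Note that read_csv has no newline argument, so the first call in the question fails before the encoding is even tried.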