Python Pandas read_csv dtype fails to convert "string" to "float64"


I have a CSV file with the header "col1" and 5 values:

col1
398
5432
5986
8109
/N

I intended to read this as a numeric column in pandas, so I wrote:

import pandas as pd
data = pd.read_csv(r'\test1.csv', dtype = {'col1': 'float64'})

but I get the error message: ValueError: could not convert string to float: '/N'

The code above works fine if the value has no slash, and the last row then becomes NaN. Without changing my original data, is there any way to have pandas treat "/N" as missing so the code runs?

There are 3 best solutions below

0
On BEST ANSWER

data = pd.read_csv(r'\test1.csv', dtype = {'col1': 'float64'}, na_values=[r'/N'])

According to the docs, the na_values parameter accepts a list-like of additional strings to recognise as NA/NaN. With '/N' listed there, the last row is parsed as NaN and the column can be read as float64.
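
To make the effect of na_values concrete, here is a minimal sketch. It assumes the file contents shown in the question and feeds them through an in-memory io.StringIO buffer (the csv_text variable is a hypothetical stand-in for the real file, which is not read here):

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for the CSV file from the question.
csv_text = "col1\n398\n5432\n5986\n8109\n/N\n"

data = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"col1": "float64"},
    na_values=["/N"],  # treat the literal string "/N" as a missing value
)

print(data["col1"].dtype)  # the column should now be float64
print(data)
```

The original file is left untouched; only the way pandas interprets "/N" while parsing changes.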

2
On

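Try with on_bad_lines='skip' (the replacement for the older error_bad_lines=False, which was deprecated in pandas 1.3). Note that this option is documented as handling lines with too many fields, so it may not suppress a dtype conversion error on its own: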

data = pd.read_csv(r'\test1.csv', dtype = {'col1': 'float64'}, on_bad_lines='skip')
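
As a quick check of what on_bad_lines='skip' does for this data, the sketch below runs the same call against an in-memory copy of the question's values (csv_text and the raised flag are hypothetical names for illustration, and pandas 1.3 or later is assumed). As far as I can tell, the conversion error is still raised, because "/N" sits on a line with the expected number of fields:

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for the CSV file from the question.
csv_text = "col1\n398\n5432\n5986\n8109\n/N\n"

raised = False
try:
    pd.read_csv(
        io.StringIO(csv_text),
        dtype={"col1": "float64"},
        on_bad_lines="skip",  # only concerns lines with too many fields
    )
except ValueError as exc:
    # The "/N" row has the right number of fields, so it is not skipped
    # and the float conversion still fails.
    raised = True
    print("read_csv still failed:", exc)
```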
0
On

You can use the converters parameter with a function that calls pd.to_numeric with errors='coerce', so that any value that cannot be parsed becomes NaN:

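import pandas as pd

# Convert a single cell to a number; unparseable values such as '/N' become NaN
def convert_float(val):
    return pd.to_numeric(val, errors='coerce')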

df = pd.read_csv('test.csv', converters={'col1': convert_float})
print(df)

     col1
0   398.0
1  5432.0
2  5986.0
3  8109.0
4     NaN
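
A related variant, sketched here under the same assumptions about the data, reads the column as strings first and then converts the whole column in one pd.to_numeric call instead of cell by cell. The csv_text buffer is again a hypothetical stand-in for the real file:

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for the CSV file from the question.
csv_text = "col1\n398\n5432\n5986\n8109\n/N\n"

# Read the column as plain strings so no conversion happens during parsing.
df = pd.read_csv(io.StringIO(csv_text), dtype={"col1": str})

# Convert the whole column at once; unparseable entries become NaN.
df["col1"] = pd.to_numeric(df["col1"], errors="coerce")

print(df["col1"].dtype)
print(df)
```

Unlike na_values, this coerces every unparseable value to NaN, not just "/N", which can hide other data problems; whether that is desirable depends on the data.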