Replace missing values (given as strings) in pandas dataframe by np.NaN

10.2k Views Asked by Peaceful At 19 June 2025 at 13:07

I have a dataframe energy with missing values in some columns. The missing values are represented by the string ... in the dataframe. I want to replace all of these values with np.NaN.

In [3]: import pandas as pd

In [4]: import numpy as np

In [7]: energy = pd.read_excel('test.xls', skiprows = 17, skip_footer = 38, parse_cols = range(2, 6), index_col = None, names = ['Country', 'ES'
   ...: , 'ESC', '% Renewable'])

In [8]: energy[(energy['ES'] == "...") | (energy['ESC'] == "...")]
Out[8]: 
                          Country   ES  ESC  % Renewable
3                  American Samoa  ...  ...     0.641026
86                           Guam  ...  ...     0.000000
150      Northern Mariana Islands  ...  ...     0.000000
210                        Tuvalu  ...  ...     0.000000
217  United States Virgin Islands  ...  ...     0.000000

To replace these values, I tried:

In [9]: energy[(energy['ES'] == "...")]['ES'] = np.NaN
/usr/local/bin/ipython:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  #!/usr/bin/python3

I don't understand the warning, and I don't see any other way to achieve what I want. Any ideas?


There are 2 best solutions below

jezrael On 21 December 2016 at 16:21 BEST ANSWER

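I think you need the following. It selects only the rows whose value is not "..." and assigns them back; index alignment leaves NaN in the rows that were dropped: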

energy['ES'] = energy.loc[energy['ES'] != "...", 'ES']

Another solution:

energy['ES'] = energy['ES'].mask(energy['ES'] == "...")

Or:

energy['ES'] = energy['ES'].replace({'...': np.nan})
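To compare these options side by side, here is a minimal sketch on a small made-up frame (the rows and values are hypothetical, not the asker's data). It also includes the single .loc assignment that the SettingWithCopyWarning suggests. np.nan is used instead of np.NaN because that alias was removed in NumPy 2.0.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame; values are made up.
energy = pd.DataFrame({
    "Country": ["Guam", "Iceland", "Tuvalu"],
    "ES": ["...", 210, "..."],
    "ESC": ["...", 665, "..."],
})

# Assignment through one .loc call, as the warning message suggests.
via_loc = energy.copy()
via_loc.loc[via_loc["ES"] == "...", "ES"] = np.nan

# Keep the valid rows and let index alignment fill the rest with NaN.
via_align = energy.copy()
via_align["ES"] = via_align.loc[via_align["ES"] != "...", "ES"]

# mask() replaces the values where the condition is True with NaN.
via_mask = energy["ES"].mask(energy["ES"] == "...")

# replace() maps the placeholder string to NaN.
via_replace = energy["ES"].replace({"...": np.nan})
```

All four variants should mark the same rows as missing. The original attempt, energy[...]['ES'] = np.NaN, indexes twice, so the assignment may land on a temporary copy rather than on energy itself.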

But the best is ayhan's comment:

you can pass na_values='...' to pd.read_excel
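A minimal sketch of that idea follows. To keep it runnable without a spreadsheet file, it uses pd.read_csv on made-up in-memory text; read_csv accepts the same na_values parameter as pd.read_excel, so this is an assumption-laden stand-in rather than the asker's actual call.

```python
import io

import pandas as pd

# Made-up CSV text standing in for the spreadsheet in the question.
raw = io.StringIO(
    "Country,ES,ESC\n"
    "Guam,...,...\n"
    "Iceland,210,665\n"
)

# Treat the placeholder string as missing while parsing.
energy = pd.read_csv(raw, na_values="...")
```

Because the placeholder is handled at parse time, the ES and ESC columns should come out as numeric columns with NaN entries rather than as object columns containing strings.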

Ashu007 On 02 August 2019 at 05:03

If Energy is your pandas dataframe then in your case you can also try:

for col in Energy.columns:
    Energy[col] = pd.to_numeric(Energy[col], errors = 'coerce')

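The code above coerces every value that cannot be parsed as a number to NaN, in all columns of the dataframe. Note that this also turns text columns such as Country into NaN, so restrict the loop to the numeric columns if you want to keep them.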
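One way to limit the coercion to the numeric columns is sketched below. The frame and the numeric_cols list are hypothetical and only illustrate the pattern.

```python
import pandas as pd

# Hypothetical stand-in for the question's frame; values are made up.
energy = pd.DataFrame({
    "Country": ["Guam", "Iceland"],
    "ES": ["...", "210"],
    "ESC": ["...", "665"],
})

# Coerce only the columns that are supposed to be numeric;
# anything that cannot be parsed (such as "...") becomes NaN.
numeric_cols = ["ES", "ESC"]
energy[numeric_cols] = energy[numeric_cols].apply(pd.to_numeric, errors="coerce")
```

The Country column is left untouched, while ES and ESC become float columns with NaN in place of the placeholder.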
