I want to convert an R data frame to Python using rpy2. I don't know how to convert an NA that appears in a text column to a Python missing value.
This is an example which shows my problem.
import rpy2.robjects as ro
ro.r('n = c(1,2)')
ro.r("b = c(NA,'def')")
ro.r("df = data.frame(n,b)")
rdf = ro.r('df')
from rpy2.robjects.conversion import localconverter
from rpy2.robjects import pandas2ri
with localconverter(ro.default_converter + pandas2ri.converter):
    df = ro.conversion.rpy2py(rdf)
Produces:
>>> print(df)
     n              b
1  1.0  NA_character_
2  2.0            def
Similar code used to work with an older version of rpy2:
import rpy2.robjects as ro
ro.r('n = c(1,2)')
ro.r("b = c(NA,'def')")
ro.r("df = data.frame(n,b)")
rdf = ro.r('df')
from rpy2.robjects import pandas2ri
df = pandas2ri.ri2py(rdf)
Produced:
>>> print(df)
     n    b
0  1.0  NaN
1  2.0  def
How do I get back the old behaviour?
This seems to be a bug as of rpy2 version 3.5.14, which also appeared with the numpy converter. Until it is fixed, you could use the deprecated pandas2ri.activate().
Note that with this workaround the first value of b is None rather than NaN, but this makes sense since the values of column b are strings rather than numbers. If b were a column of numbers, e.g. ro.r("b = c(NA, 3)"), you would get NaN.
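If you would rather not rely on a deprecated call, another option might be to clean the frame after conversion. The following is only a sketch, not rpy2's own mechanism: clean_text_na, looks_like_r_na and FakeNA are hypothetical names introduced here, and the check assumes the leftover value prints as NA_character_, as in the output shown in the question. FakeNA is a stand-in so the example runs without R.

```python
import numpy as np
import pandas as pd

# Text shown for the unconverted value in the question's output.
NA_TEXT = "NA_character_"


def looks_like_r_na(value):
    """Assumption: the leftover R NA prints as 'NA_character_'."""
    return str(value) == NA_TEXT


def clean_text_na(df, is_missing=looks_like_r_na, replacement=np.nan):
    """Return a copy of df with leftover R NA markers in object columns replaced."""
    out = df.copy()
    for col in out.columns:
        # Only object (text) columns can hold the leftover marker.
        if out[col].dtype == object:
            values = [replacement if is_missing(v) else v for v in out[col]]
            out[col] = pd.Series(values, index=out.index, dtype=object)
    return out


class FakeNA:
    """Hypothetical stand-in for the unconverted R value (no R needed)."""

    def __repr__(self):
        return NA_TEXT


# Mimic the converted frame from the question: row labels 1 and 2.
raw = pd.DataFrame({"n": [1.0, 2.0], "b": [FakeNA(), "def"]}, index=[1, 2])
cleaned = clean_text_na(raw)
print(cleaned)
```

Matching on the printed text would also replace a genuine string that happens to read NA_character_. With a real rpy2 frame, a stricter predicate such as lambda v: v is ro.NA_Character may be preferable, assuming the cells hold that singleton. Pass replacement=None if you want None instead of NaN.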