How to convert a one-dimensional NumPy array to a pandas Series or DataFrame

I have spent quite some time on what seems to be a very easy thing. All I want is to convert a NumPy array to a Series and then combine the Series into a DataFrame. I have two NumPy arrays.

import numpy as np

rooms = 2*np.random.rand(100, 1) + 3
price = 265 + 6*rooms + abs(np.random.randn(100, 1))

I want to convert rooms and price to Series and then combine the two Series into a DataFrame so that I can make an lmplot.

Could anyone tell me how to do that? Thanks.

There are 2 best solutions below

Best answer

You can use ravel() to flatten each array to 1-D before building the DataFrame:

import pandas as pd

pd.DataFrame({
    'rooms': rooms.ravel(),
    'price': price.ravel()
})
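
As a quick check of what ravel() does here, the following minimal sketch rebuilds the two arrays and prints the shapes before and after flattening. The seeded generator (np.random.default_rng(0)) and the names flat_rooms and df are assumptions added for illustration; they are not part of the original question.

```python
import numpy as np
import pandas as pd

# Assumption: a seeded generator so the run is reproducible.
rng = np.random.default_rng(0)
rooms = 2 * rng.random((100, 1)) + 3
price = 265 + 6 * rooms + np.abs(rng.standard_normal((100, 1)))

# ravel() flattens the (100, 1) column into a 1-D array of length 100.
flat_rooms = rooms.ravel()
print(rooms.shape, flat_rooms.shape)  # (100, 1) (100,)

# Each dict key becomes a column; each 1-D array becomes that column's values.
df = pd.DataFrame({"rooms": flat_rooms, "price": price.ravel()})
print(df.shape)  # (100, 2)
```

The resulting frame has one row per sample and one column per variable, which is the long layout that plotting functions taking x= and y= column names generally expect.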

Second answer

The problem with passing the arrays directly to pd.Series is their dimensionality: rooms and price are 2-D arrays of shape (100, 1), while pd.Series requires 1-D data. There are several ways to reshape them; one is .squeeze(), which drops the axes of length 1:

import pandas as pd
import numpy as np

rooms = 2*np.random.rand(100, 1) + 3
price = 265 + 6*rooms + abs(np.random.randn(100, 1))

rooms_series = pd.Series(rooms.squeeze())
price_series = pd.Series(price.squeeze())
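
To see the dimensionality problem concretely, here is a small sketch that tries the 2-D array first and then the squeezed one. It catches a broad Exception because the exact exception type is an assumption on my part (recent pandas versions raise ValueError); the failed flag and the name="rooms" argument are added for illustration only.

```python
import numpy as np
import pandas as pd

rooms = 2 * np.random.rand(100, 1) + 3

# Passing the (100, 1) array directly is rejected by pd.Series.
try:
    pd.Series(rooms)
    failed = False
except Exception as exc:  # assumption: ValueError in recent pandas
    failed = True
    print(type(exc).__name__, exc)

# After squeeze() the data is 1-D, so the Series is built without error.
rooms_series = pd.Series(rooms.squeeze(), name="rooms")
print(rooms_series.shape)  # (100,)
```

Giving the Series a name is optional, but the name is what pandas uses as the column label if the Series is later combined with others.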

Now, to go from the Series to a DataFrame, you can do:

pd.DataFrame({'rooms': rooms_series,
              'price': price_series})

Or directly from the numpy arrays:

pd.DataFrame({'rooms': rooms.squeeze(),
              'price': price.squeeze()})
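
One difference between the two approaches is worth knowing, even though it does not affect the (100, 1) arrays in the question. squeeze() removes every axis of length 1, so an array holding a single sample collapses to a 0-d array, while ravel() and reshape(-1) always return 1-D. The sketch below uses a made-up one-element array (single) to show this.

```python
import numpy as np

# Hypothetical edge case: one sample, one column.
single = np.array([[4.2]])

squeezed = single.squeeze()     # all length-1 axes removed
raveled = single.ravel()        # always flattened to 1-D
reshaped = single.reshape(-1)   # always 1-D as well

print(squeezed.shape)  # ()
print(raveled.shape)   # (1,)
print(reshaped.shape)  # (1,)
```

If the number of rows can vary, ravel() or reshape(-1) may therefore be the more predictable choice.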