python, pandas - dataframe with time, create shifted data

I have a DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.rand(10, 3),
    columns='sensor_id|unix_timestamp|value'.split('|'))

I want to create 5 more columns in which each new column is a shifted version of the value column.

   sensor_id  unix_timestamp     value  value_shift_0  value_shift_1  value_shift_2  value_shift_3  value_shift_4
0   0.901001        0.036683  0.945908            NaN            NaN            NaN            NaN            NaN
1   0.751759        0.038600  0.117308            NaN            NaN            NaN            NaN            NaN
2   0.737604        0.484417  0.602733            NaN            NaN            NaN            NaN            NaN
3   0.259865        0.522115  0.074188            NaN            NaN            NaN            NaN            NaN
4   0.932359        0.662560  0.648445            NaN            NaN            NaN            NaN            NaN
5   0.114668        0.066766  0.285553            NaN            NaN            NaN            NaN            NaN
6   0.795851        0.565259  0.888404            NaN            NaN            NaN            NaN            NaN
7   0.082534        0.355506  0.671816            NaN            NaN            NaN            NaN            NaN
8   0.336648        0.651789  0.859373            NaN            NaN            NaN            NaN            NaN
9   0.917073        0.842281  0.458542            NaN            NaN            NaN            NaN            NaN

But I don't know how to fill these columns with the appropriately shifted values.

Best answer

Use pd.concat with a dictionary comprehension to build the shifted columns, then join them back onto df:

df.join(
    pd.concat(
        {'value_shift_{}'.format(i): df.value.shift(i) for i in range(5)},
        axis=1))
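To make the effect of each shift visible, here is a minimal sketch of the same concat-and-join pattern on a tiny frame. The frame `demo` and its values are made up for illustration, and only 3 shifts are used to keep the output short.

```python
import numpy as np
import pandas as pd

# Hypothetical, deterministic data so the shifted columns are easy to read.
demo = pd.DataFrame({
    'sensor_id': [1, 1, 1, 1],
    'unix_timestamp': [100, 101, 102, 103],
    'value': [10.0, 20.0, 30.0, 40.0],
})

# The dict keys become the column names of the concatenated frame.
shifted = pd.concat(
    {'value_shift_{}'.format(i): demo['value'].shift(i) for i in range(3)},
    axis=1)

# join aligns on the index, which both frames share.
result = demo.join(shifted)
print(result)
```

Here value_shift_0 is a copy of value, and value_shift_k has NaN in its first k rows because no earlier row exists to shift from.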

alternative with numpy

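def multi_shift(s, n):
    # a holds the row positions 0 .. len(s) - 1
    a = np.arange(len(s))
    # source row for output row r and shift k is r - k
    i = (a[:, None] - np.arange(n)).ravel()
    e = np.empty(i.shape)
    e.fill(np.nan)
    # only non-negative source rows exist; the rest stay NaN
    w = np.where(i >= 0)
    e[w] = s.values[i[w]]

    return pd.DataFrame(e.reshape(len(s), -1),
                        s.index, ['value_shift_%i' % k for k in range(n)])

n = 5
df.join(multi_shift(df.value, n))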
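As a sanity check, the sketch below compares a numpy-based shift against the pd.concat approach on random data. It is a self-contained variant, not the exact code above: the name `multi_shift_np` is hypothetical, it reads values from its argument `s`, and it derives the row count from `len(s)`.

```python
import numpy as np
import pandas as pd

def multi_shift_np(s, n):
    """Return a DataFrame whose k-th column is s shifted down by k rows."""
    values = s.to_numpy(dtype=float)
    rows = np.arange(len(s))
    # src[r, k] is the source row for output row r in column k, i.e. r - k.
    src = rows[:, None] - np.arange(n)
    out = np.full(src.shape, np.nan)
    # Negative source rows do not exist, so those cells stay NaN.
    valid = src >= 0
    out[valid] = values[src[valid]]
    cols = ['value_shift_{}'.format(k) for k in range(n)]
    return pd.DataFrame(out, index=s.index, columns=cols)

rng = np.random.default_rng(0)
s = pd.Series(rng.random(10), name='value')
n = 5

via_numpy = multi_shift_np(s, n)
via_pandas = pd.concat(
    {'value_shift_{}'.format(k): s.shift(k) for k in range(n)}, axis=1)

# equal_nan=True treats NaN cells in the same position as equal.
same = np.allclose(via_numpy.to_numpy(), via_pandas.to_numpy(), equal_nan=True)
print(same)
```

Both versions should produce the same values and column names; which one is faster depends on the frame size and is worth measuring on your own data.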
