pandas: pd.date_range output error

Question

254 Views Asked by ajay At 23 June 2025 at 21:16

I have a data set with 3 columns ID, vrddat, enddat and 21000 rows.

ID      vrddat      enddat
1       2015.01.01  2015.01.03
2       2015.03.01  2015.03.03

PS: Each ID can have multiple vrddat's and enddat's.

I need output as below:

ID      vrddat      enddat      day
1       2015.01.01  2015.01.03  2015.01.01
1       2015.01.01  2015.01.03  2015.01.02
1       2015.01.01  2015.01.03  2015.01.03
2       2015.03.01  2015.03.03  2015.03.01
2       2015.03.01  2015.03.03  2015.03.02
2       2015.03.01  2015.03.03  2015.03.03

I used the following code to try to get the above output:

for index,row in data.iterrows():
    data_2 = pd.DataFrame(pd.date_range(row['vrddat'],row['enddat'], freq ='D'))

Using the above code I get only 98 rows, but the output should contain far more rows than the input. Could anyone suggest why I'm getting this output? Is my code not iterating over every row? How do I also get the ID, vrddat and enddat columns in my output?

Please suggest.

**jezrael** · Accepted Answer

You can first convert both columns vrddat and enddat with to_datetime, then use itertuples with concat to build a new, expanded DataFrame. Finally, merge it back onto df; note that this requires ID in df to be unique.

df.vrddat = pd.to_datetime(df.vrddat)
df.enddat = pd.to_datetime(df.enddat)

# parentheses are needed so the chained call can continue on the next line
df1 = (pd.concat([pd.Series(r.ID, pd.date_range(r.vrddat, r.enddat)) for r in df.itertuples()])
         .reset_index())
df1.columns = ['day','ID']
print (df1)
         day  ID
0 2015-01-01   1
1 2015-01-02   1
2 2015-01-03   1
3 2015-03-01   2
4 2015-03-02   2
5 2015-03-03   2

print (pd.merge(df,df1, on='ID', how='left'))
   ID     vrddat     enddat        day
0   1 2015-01-01 2015-01-03 2015-01-01
1   1 2015-01-01 2015-01-03 2015-01-02
2   1 2015-01-01 2015-01-03 2015-01-03
3   2 2015-03-01 2015-03-03 2015-03-01
4   2 2015-03-01 2015-03-03 2015-03-02
5   2 2015-03-01 2015-03-03 2015-03-03

If ID is not unique, you can merge on a unique index instead:

df.vrddat = pd.to_datetime(df.vrddat)
df.enddat = pd.to_datetime(df.enddat)
df = df.reset_index()

# parentheses are needed so the chained call can continue on the next line
df1 = (pd.concat([pd.Series(r.index, pd.date_range(r.vrddat, r.enddat)) for r in df.itertuples()])
         .reset_index())
df1.columns = ['day','index']
print (df1)
         day  index
0 2015-01-01      0
1 2015-01-02      0
2 2015-01-03      0
3 2015-03-01      1
4 2015-03-02      1
5 2015-03-03      1

print (pd.merge(df,df1, on='index', how='left').drop('index', axis=1))
   ID     vrddat     enddat        day
0   1 2015-01-01 2015-01-03 2015-01-01
1   1 2015-01-01 2015-01-03 2015-01-02
2   1 2015-01-01 2015-01-03 2015-01-03
3   2 2015-03-01 2015-03-03 2015-03-01
4   2 2015-03-01 2015-03-03 2015-03-02
5   2 2015-03-01 2015-03-03 2015-03-03
