Resampling pandas DataFrame for every nth row with different parameters on each column?

Question

813 Views Asked by R.2 At 17 August 2025 at 04:04

I have minute-by-minute stock data with the column headings Date, Time, Open, High, Low, Close, Volume. I need to resample every nth row (every 3rd row in this example), but with a different aggregation for each column. The code I have so far is:

import pandas as pd
import numpy as np

spy = pd.read_csv("C:\\Users\\PC\\Desktop\\spy_test.csv")

#Sample for every n minutes
n = 3
b = n-1


spy_date = pd.DataFrame(spy['Date'])
date = spy_date.iloc[b::n, :]

spy_time = pd.DataFrame(spy['Time'])
time = spy_time.iloc[b::n, :]
time = time.reset_index(drop=True)
spy_open = pd.DataFrame(spy['Open'])
open = spy_open.iloc[::n, :]
open = open.reset_index(drop=True)

spy_high = pd.DataFrame(spy['High'])
high_s = spy_high.iloc[::n, :].max()
high = pd.DataFrame(high_s)
high = high.reset_index(drop=True)

spy_low = pd.DataFrame(spy['Low'])
low_s = spy_low.iloc[::n, :].min()
low = pd.DataFrame(low_s)
low = low.reset_index(drop=True)

spy_close = pd.DataFrame(spy['Close'])
close = spy_close.iloc[::n, :]
close = close.reset_index(drop=True)

spy_volume = pd.DataFrame(spy['Volume'])
volume_s = spy_volume.iloc[n::3, :].sum()
volume = pd.DataFrame(volume_s)
volume = volume.reset_index(drop=True)

joined = [date, time, open, high, low, close, volume]

result = pd.concat(joined, axis=1)
result.columns = ['Date', 'Time', 'Open', 'High', 'Low', 'Close', 'Volume']

print(result)

This code worked, except that the High, Low, and Volume columns each returned only a single value. I wanted the max/min/sum of each 3-minute block, but it returned the max/min/sum for the entire column.

Also, if you know of an easier way to do this that a beginner would more or less understand, I am all ears. I've only been coding for a couple of weeks, so I really have no clue what I'm doing.

Note: I considered using qcut but as far as I can tell I would need to figure out how many bins I would want ahead of time. As I'm going to be running varying sets of data through this same basic format it seemed less than ideal.
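
On the concern about choosing the number of bins in advance: one possible sketch, not taken from the answer below, is to label each row by its position integer-divided by n and group on that label, so the number of blocks follows from the length of the data. The sample values here are made up to stand in for the CSV, and treating Open as the first row and Close as the last row of each block is an assumption about the intended result.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for spy_test.csv; all values are made up.
spy = pd.DataFrame({
    "Date": ["2017-01-03"] * 7,
    "Time": [930, 931, 932, 933, 934, 935, 936],
    "Open": [10.0, 10.5, 10.2, 10.8, 11.0, 10.9, 11.2],
    "High": [10.6, 10.7, 10.9, 11.1, 11.3, 11.0, 11.4],
    "Low": [9.9, 10.1, 10.0, 10.7, 10.8, 10.6, 11.1],
    "Close": [10.5, 10.2, 10.8, 11.0, 10.9, 10.7, 11.3],
    "Volume": [100, 200, 300, 400, 500, 600, 700],
})

n = 3
# Rows 0..n-1 get label 0, rows n..2n-1 get label 1, and so on.
block = np.arange(len(spy)) // n

# Each column gets its own aggregation within every block of n rows.
result = spy.groupby(block).agg({
    "Date": "last",
    "Time": "last",
    "Open": "first",   # assumption: first row of the block
    "High": "max",
    "Low": "min",
    "Close": "last",   # assumption: last row of the block
    "Volume": "sum",
}).reset_index(drop=True)

print(result)
```

If the number of rows is not a multiple of n, the final block simply contains fewer rows; whether to keep or drop that partial block depends on the use case.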

**Ted Petrou** · Accepted Answer

Since your times are in military (HHMM) format, first convert them to timedelta with the code below. Because the times are stored as integers, you will likely also need to add leading zeros; I have done this with zfill.

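# Pad to four digits (e.g. 931 -> "0931"), parse as HHMM, and convert to a timedelta.
df['Time'] = pd.to_timedelta(pd.to_datetime(df['Time'].map(lambda x: str(x).zfill(4)), format='%H%M').dt.time.astype(str))
# Group by date and 3-minute bins, aggregating each column with its own function.
df.set_index('Time').groupby(['Date', pd.Grouper(freq='3min')]).agg({'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last', 'Volume': 'sum'})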
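
As a self-contained illustration of time-based grouping, here is a minimal sketch that swaps in a different technique from the one above: it builds a single timestamp from Date and Time and uses `resample`. The sample values are made up, and the `Date` format (`YYYY-MM-DD`) and integer HHMM `Time` column are assumptions about the CSV.

```python
import pandas as pd

# Hypothetical sample data; values and the Date format are assumptions.
df = pd.DataFrame({
    "Date": ["2017-01-03"] * 6,
    "Time": [930, 931, 932, 933, 934, 935],
    "Open": [10.0, 10.5, 10.2, 10.8, 11.0, 10.9],
    "High": [10.6, 10.7, 10.9, 11.1, 11.3, 11.0],
    "Low": [9.9, 10.1, 10.0, 10.7, 10.8, 10.6],
    "Close": [10.5, 10.2, 10.8, 11.0, 10.9, 10.7],
    "Volume": [100, 200, 300, 400, 500, 600],
})

# Combine Date with the zero-padded HHMM time into one timestamp per row.
stamp = pd.to_datetime(
    df["Date"] + " " + df["Time"].astype(str).str.zfill(4),
    format="%Y-%m-%d %H%M",
)

bars = (
    df.set_index(stamp)
      .resample("3min")
      .agg({"Open": "first", "High": "max", "Low": "min",
            "Close": "last", "Volume": "sum"})
      .dropna(subset=["Open"])  # drop empty bins, e.g. overnight gaps
)

print(bars)
```

Unlike grouping by row position, this bins by clock time, so a missing minute in the data produces a bar built from fewer rows rather than shifting every later block.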
