I am trying to read only the values of every 4th row of my excel data sheet (every value corresponding to "media"). I can do this but I want to read every 4th row, starting from row 2. Or technically it would be row 2 and row 6 and row 10 etc. But no matter what I try I can't get what I want. excel datafile
I tried variations that I found from different questions:
cols=[' estacion_id','estacion_nombre','sensor_id','sensor_alias','sensor_nombre','fecha_ observacio n','metrica','valor' ]
f=pd.read_csv('env_picacho-ma_2013.csv', usecols=['fecha_observacion', 'valor'])
date=pd.read_csv('env_picacho-ma_2013.csv', usecols=['fecha_observacion'],skiprows=lambda x: logic(x) )
temp=pd.read_csv('env_picacho-ma_2013.csv', usecols=['valor'],skiprows= lambda x:(x % 4))
This gives me every 4th row: valor 0 7.7 1 8.4 2 4.7 3 6.8 4 8.2 ... ... 1820 791.0 1821 771.0 1822 841.0 1823 806.0 1824 795.0
I also tried:
def logic(index=0):
if index % 4 == 3:
return True
return False
temp=pd.read_csv('env_picacho-ma_2013.csv', usecols=['valor'],skiprows= lambda x: logic(x))
This skips every 3rd row out of 4. So I thought I could just skip every row1 and row3 and row4 in order to keep only row2. But this doesn't work either:
def logic(index=0):
if index % 4 == 3 and index % 4== 2:
return True
return False
gives me all the values again. I can sort of understand why this wold not work because I am contradicting myslef but I don't know how to do it right:
valor
0 NaN 1 3.78417 2 0.60000 3 7.70000 4 NaN ... ... 7295 806.00000 7296 20382.00000 7297 NaN 7298 NaN 7299 795.00000
I then found this which selects every top 2 values every 5 rows but I don't know how to modify this for my case (every 2nd value every 4 rows):
df.groupby(df.index//5).head(2)
I tried quite a few variations and I guess my other option would either be a loop but I am terrible at coding loops so if anyone can offer any advice it would be much appreciated. What I want is: valor 0 3.78417 1 4.48482 2 2.38661 ...
You can use :
Output :
Intermediates (callable) :