read every 2nd value every 4 rows in pandas

Question

read every 2nd value every 4 rows in pandas

81 Views Asked by Mireia At 20 October 2023 at 17:13

I am trying to read only the values of every 4th row of my excel data sheet (every value corresponding to "media"). I can do this but I want to read every 4th row, starting from row 2. Or technically it would be row 2 and row 6 and row 10 etc. But no matter what I try I can't get what I want. excel datafile

I tried variations that I found from different questions:

cols=[' estacion_id','estacion_nombre','sensor_id','sensor_alias','sensor_nombre','fecha_ observacio n','metrica','valor' ]  
f=pd.read_csv('env_picacho-ma_2013.csv', usecols=['fecha_observacion', 'valor'])

date=pd.read_csv('env_picacho-ma_2013.csv', usecols=['fecha_observacion'],skiprows=lambda x: logic(x)  )
temp=pd.read_csv('env_picacho-ma_2013.csv', usecols=['valor'],skiprows= lambda x:(x % 4))

This gives me every 4th row: valor 0 7.7 1 8.4 2 4.7 3 6.8 4 8.2 ... ... 1820 791.0 1821 771.0 1822 841.0 1823 806.0 1824 795.0

I also tried:

def logic(index=0):

if index % 4 == 3:
    return True

return False

temp=pd.read_csv('env_picacho-ma_2013.csv', usecols=['valor'],skiprows= lambda x: logic(x))

This skips every 3rd row out of 4. So I thought I could just skip every row1 and row3 and row4 in order to keep only row2. But this doesn't work either:

def logic(index=0):

if index % 4 == 3 and index % 4== 2:
    return True

return False

gives me all the values again. I can sort of understand why this wold not work because I am contradicting myslef but I don't know how to do it right:

        valor

0 NaN 1 3.78417 2 0.60000 3 7.70000 4 NaN ... ... 7295 806.00000 7296 20382.00000 7297 NaN 7298 NaN 7299 795.00000

I then found this which selects every top 2 values every 5 rows but I don't know how to modify this for my case (every 2nd value every 4 rows):

df.groupby(df.index//5).head(2)

I tried quite a few variations and I guess my other option would either be a loop but I am terrible at coding loops so if anyone can offer any advice it would be much appreciated. What I want is: valor 0 3.78417 1 4.48482 2 2.38661 ...

Original Q&A

There are 2 best solutions below

Daniel Wyatt On 20 October 2023 at 18:11

What about using the following formula which skips every row except the first row (the header) and rows that don't conform to the pattern 2,6,10,14...

def logic(index=0):
    if index==0: # keep the header row
        return False
    if (index+2) % 4 == 0: # keep every 4th row starting from row 2
        return False
    return True

date = pd.read_csv('env_picacho-ma_2013.csv', usecols=['fecha_observacion'],skiprows=lambda x: logic(x))

**Timeless** · Accepted Answer · 2023-10-20T18:40:24.620000

I am trying to read only ... (every value corresponding to "media").

You can use :

SR, GAP = 2, 3

df = pd.read_csv("env_picacho-ma_2013.csv", # add more params if needed
                 skiprows=lambda i: (i+SR)%(GAP+1) and i!=0)

Output :

print(df)

    estacion_id estacion_no  sensor_id  ... fecha_observacion metrica valor
0            90  PICACHO_MA        225  ...        01/01/2013   media  3.78
1            90  PICACHO_MA        225  ...        02/01/2013   media  4.48
2            90   PICACHO_M        225  ...        03/01/2013   media  2.39
3            90  PICACHO_M.        225  ...        04/01/2013   media  3.18
4            90  PICACHO_M.        225  ...        05/01/2013   media  3.86
5            90  PICACHO_MA        225  ...        06/01/2013   media  2.27
6            90  PICACHO_MA        225  ...        07/01/2013   media  2.42
7            90  PICACHO_MA        225  ...        08/01/2013   media  0.89
8            90  PICACHO_M.        225  ...        09/01/2013   media  2.07
9            90  PICACHO_M.        225  ...        10/01/2013   media  2.90
10           90   PICACHO M        225  ...        11/01/2013   media  3.95

Intermediates (callable) :

index  0 # --> False (don't skip)
index  1 # -->  True
index  2 # --> False (don't skip)
index  3 # -->  True
index  4 # -->  True
index  5 # -->  True
index  6 # --> False (don't skip)
index  7 # -->  True
index  8 # -->  True
index  9 # -->  True
index 10 # --> False (don't skip)
index 11 # -->  True
index 12 # -->  True
index 13 # -->  True
index 14 # --> False (don't skip)
index 15 # -->  True
index 16 # -->  True
index 17 # -->  True
index 18 # --> False (don't skip)
...

read every 2nd value every 4 rows in pandas

There are 2 best solutions below

Related Questions in PANDAS

Related Questions in ROW

Related Questions in READ-CSV

Trending Questions

Popular # Hahtags

Popular Questions