python subtract every even column from previous odd column


Sorry if this has been asked before -- I couldn't find this specific question.

In python, I'd like to subtract every even column from the previous odd column:

so go from:

292.087 190.238 299.837 189.488 255.525 187.012
300.837 190.887 299.4   188.488 248.637 187.363
292.212 191.6   299.038 188.988 249.65  187.5
300.15  192.4   307.812 189.125 247.825 188.113

to

101.849 110.349 68.513
109.95  110.912 61.274
100.612 110.05  62.15
107.75  118.687 59.712

There will be an unknown number of columns. Should I use something in pandas or numpy?

Thanks in advance.


There are 2 best solutions below

1
On BEST ANSWER

You can accomplish this using pandas. You can select the even- and odd-indexed columns separately and then subtract them.

@hiro protagonist, I didn't know you could do that StringIO magic. That's spicy.

import pandas as pd
import io

data = io.StringIO('''ROI121  ROI122  ROI124  ROI125  ROI126  ROI127
                      292.087 190.238 299.837 189.488 255.525 187.012
                      300.837 190.887 299.4   188.488 248.637 187.363
                      292.212 191.6   299.038 188.988 249.65  187.5
                      300.15  192.4   307.812 189.125 247.825 188.113''')

df = pd.read_csv(data, sep=r'\s+')  # raw string avoids an invalid-escape warning

Note that the even/odd terms may be counterintuitive because Python is 0-indexed: the signal columns are actually even-indexed and the background columns odd-indexed. If I understand your question properly, this is the opposite of your even/odd terminology. Just pointing out the difference to avoid confusion.
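To make the indexing concrete, here is a small sketch using the column labels from the sample data. The variable names are only illustrative, and "signal" and "background" follow the terms used in this answer.

```python
# Column labels from the sample data, in file order
columns = ['ROI121', 'ROI122', 'ROI124', 'ROI125', 'ROI126', 'ROI127']

# Positions 0, 2, 4 (the 1st, 3rd and 5th columns): signal
signal_cols = columns[::2]
# Positions 1, 3, 5 (the 2nd, 4th and 6th columns): background
background_cols = columns[1::2]

print(signal_cols)      # ['ROI121', 'ROI124', 'ROI126']
print(background_cols)  # ['ROI122', 'ROI125', 'ROI127']
```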

# strip the columns into their appropriate signal or background groups
bg_df = df.iloc[:, 1::2]      # columns at positions 1, 3, 5, ...
signal_df = df.iloc[:, ::2]   # columns at positions 0, 2, 4, ...

# subtract the values of the data frames and store the results in a new data frame
result_df = pd.DataFrame(signal_df.values - bg_df.values)

result_df contains columns which are the difference between the signal and background columns. You will probably want to rename these columns, though.

>>> result_df
         0        1       2
0  101.849  110.349  68.513
1  109.950  110.912  61.274
2  100.612  110.050  62.150
3  107.750  118.687  59.712
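One possible way to get descriptive column names is sketched below. It rebuilds the frame from the sample data so that it runs on its own. The "signal-background" naming scheme and the name new_names are assumptions made for illustration, not something the question asks for.

```python
import io

import pandas as pd

data = io.StringIO('''ROI121  ROI122  ROI124  ROI125  ROI126  ROI127
292.087 190.238 299.837 189.488 255.525 187.012
300.837 190.887 299.4   188.488 248.637 187.363
292.212 191.6   299.038 188.988 249.65  187.5
300.15  192.4   307.812 189.125 247.825 188.113''')
df = pd.read_csv(data, sep=r'\s+')

signal_df = df.iloc[:, ::2]   # columns at positions 0, 2, 4, ...
bg_df = df.iloc[:, 1::2]      # columns at positions 1, 3, 5, ...

# Hypothetical naming scheme: "<signal label>-<background label>"
new_names = [f'{s}-{b}' for s, b in zip(signal_df.columns, bg_df.columns)]

# Subtract the raw arrays so pandas does not try to align on column labels
result_df = pd.DataFrame(signal_df.values - bg_df.values,
                         columns=new_names, index=df.index)
print(result_df.round(3))
```

This assumes an even number of columns. With an odd number, the two arrays have different shapes and the subtraction raises an error.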
0
On
import io

# faking the data file
data = io.StringIO('''ROI121  ROI122  ROI124  ROI125  ROI126  ROI127
292.087 190.238 299.837 189.488 255.525 187.012
300.837 190.887 299.4   188.488 248.637 187.363
292.212 191.6   299.038 188.988 249.65  187.5
300.15  192.4   307.812 189.125 247.825 188.113''')

header = next(data)  # read the first line from data
# print(header[:-1])
for line in data:
    # print(line)
    floats = [float(val) for val in line.split()]  # create a list of floats
    for prev, cur in zip(floats[::2], floats[1::2]):
        print('{:6.3f}'.format(prev-cur), end=' ')
    print()

with output:

101.849 110.349 68.513 
109.950 110.912 61.274 
100.612 110.050 62.150 
107.750 118.687 59.712 

If you know what floats[start:stop:step] means and how zip works, this should be easy to follow.
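For anyone unfamiliar with those two pieces, here is a minimal illustration using the first data row. The names firsts, seconds, pairs and diffs are made up for this example.

```python
floats = [292.087, 190.238, 299.837, 189.488, 255.525, 187.012]

firsts = floats[::2]    # start at index 0, take every 2nd item
seconds = floats[1::2]  # start at index 1, take every 2nd item

# zip pairs the items up position by position
pairs = list(zip(firsts, seconds))
print(pairs)
# [(292.087, 190.238), (299.837, 189.488), (255.525, 187.012)]

# Round to 3 decimals to hide floating-point noise in the differences
diffs = [round(a - b, 3) for a, b in pairs]
print(diffs)  # [101.849, 110.349, 68.513]
```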