Pandas: how to retrieve values from a DataFrame given a list of (row, column) pairs?

387 Views Asked by At

tldr; I want to pass a series of positions on a DataFrame and receive a series of values, If possible with a DataFrame method.

I have a Dataframe with some columns and an index

import pandas as pd

df_a = pd.DataFrame(
{'A':[0,1,3,7],
 'B':[2,3,4,5]}, index=[0,1,2,3])

I want to retrieve the values at specific (row, column) positions on the DataFrame

rows = [0, 2, 3]
cols = ['A','B','A']

df_a.loc[rows, cols] returns a 3x3 DataFrame

   |A  |B  |A
0   0   2   0
2   3   4   3
3   7   5   7

I want the series of values corresponding to the (row, col) values, a series of length 3

[0, 4, 7]

What is the best way to do this in pandas?

3

There are 3 best solutions below

2
On BEST ANSWER

Most certainly! you can use DataFrame.lookup to achieve exactly what you want: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.lookup.html

import pandas as pd

df_a = pd.DataFrame({'A':[0,1,3,7], 'B':[2,3,4,5]}, index=[0,1,2,3])

rows = [0, 2, 3]
cols = ['A','B','A']

values = df_a.lookup(rows, cols)

print(values)
array([0, 4, 7], dtype=int64)
1
On

Pandas does not support that kind of indexing, only numpy

>>> df.to_numpy()[rows, df.columns.get_indexer(cols)]
array([0, 4, 7])
0
On

There is a solution to stack df and collect values with .loc or .reindex. This works if (col, row) pair is missing.

import pandas as pd

df = pd.DataFrame({'A':[0,1,3,7], 'B':[2,3,4,5]}, index=[0,1,2,3])

rows = [0, 2, 3, 100]
cols = ['A','B','A', 'C']

idx = pd.MultiIndex.from_arrays([rows, cols])
values = df.stack().reindex(idx)

values
## 0    A    0.0
## 2    B    4.0
## 3    A    7.0
## 100  C    NaN
## dtype: float64