Given a Pandas DataFrame of numeric values, how can one produce a list of the `.loc` cell locations that can then be used to obtain the corresponding `n` largest values in the entire DataFrame?
For example:
|   | A | B | C | D | E |
|---|---|---|---|---|---|
| X | 1.3 | 3.6 | 33 | 61.38 | 0.3 |
| Y | 3.14 | 2.71 | 64 | 23.2 | 21 |
| Z | 1024 | 42 | 66 | 137 | 22.2 |
| T | 63.123 | 111 | 1.23 | 14.16 | 50.49 |
An `n` of 3 would produce the (row, col) pairs for the values `1024`, `137` and `111`.
These locations could then, as usual, be fed to `.loc` to extract those values from the DataFrame, i.e.
df.loc['Z','A']
df.loc['Z','D']
df.loc['T','B']
Note: It is easy to mistake this question for one that involves `.idxmax`. That isn't applicable, because the `n` largest values may include several from the same row and/or column.
You could try:
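A minimal sketch of one way to do this (not necessarily the code originally posted here): stack the frame into a Series indexed by (row, column) pairs, take `nlargest(n)`, and read the pairs off the index. The DataFrame is rebuilt from the example table in the question, and the name `n_largest_locs` is only illustrative.

```python
import pandas as pd

# Example data copied from the table in the question.
df = pd.DataFrame(
    {
        "A": [1.3, 3.14, 1024, 63.123],
        "B": [3.6, 2.71, 42, 111],
        "C": [33, 64, 66, 1.23],
        "D": [61.38, 23.2, 137, 14.16],
        "E": [0.3, 21, 22.2, 50.49],
    },
    index=["X", "Y", "Z", "T"],
)

n = 3

# stack() turns the frame into a Series with a (row, column) MultiIndex,
# so nlargest() ranks every cell of the frame at once.
top = df.stack().nlargest(n)

# Each index entry is a (row, col) label pair that can be passed to .loc.
n_largest_locs = top.index.tolist()

for row, col in n_largest_locs:
    print((row, col), df.loc[row, col])
```

Because the pairs come straight from the index, a row or column can appear more than once, which is the case `.idxmax` cannot handle.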
Edit: In pandas 0.24.0, `MultiIndex.labels` was deprecated in favor of `MultiIndex.codes` (see Deprecations in What’s new in 0.24.0 (January 25, 2019)). On versions where `labels` has been removed, the above code will throw `AttributeError: 'MultiIndex' object has no attribute 'labels'` and needs to be updated as follows:

Edit 2: This question has become a "moving target", as the OP keeps changing it (this is my last update/edit). In the last update, OP's dataframe looks as follows:
The desired output can be obtained using:
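A minimal sketch, assuming OP's dataframe is the example table shown in the question and that pandas is 0.24.0 or later, so that `MultiIndex.codes` is available. The variable names are illustrative. The codes are integer positions into each level of the stacked index, so indexing the levels with them recovers the row and column labels.

```python
import pandas as pd

# Assumed to match the example table in the question.
df = pd.DataFrame(
    {
        "A": [1.3, 3.14, 1024, 63.123],
        "B": [3.6, 2.71, 42, 111],
        "C": [33, 64, 66, 1.23],
        "D": [61.38, 23.2, 137, 14.16],
        "E": [0.3, 21, 22.2, 50.49],
    },
    index=["X", "Y", "Z", "T"],
)

n = 3
top = df.stack().nlargest(n)

# codes[i] holds integer positions into levels[i]; .codes replaces the
# older .labels attribute.
row_codes, col_codes = top.index.codes
row_labels = top.index.levels[0][row_codes]
col_labels = top.index.levels[1][col_codes]

# (row, col) label pairs, ordered from largest value to smallest.
locs = list(zip(row_labels, col_labels))
print(locs)

# Feed each pair back to .loc to extract the values.
values = [df.loc[r, c] for r, c in locs]
print(values)
```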