I have a dataframe that looks like:
ID | timestamp |Phase| current
========================================
001 | 2020-09-20 07:00 | A | 1.4
001 | 2020-09-20 07:00 | B | 2.0
001 | 2020-09-20 07:00 | C | 1.6
002 | 2020-09-20 09:00 | A | 1.4
002 | 2020-09-20 09:00 | B | 1.23
002 | 2020-09-20 09:00 | C | 1.46
I need to calculate the % difference in the phases of each ID/timestamp grouping, so I create a groupby:
imbalanced = df.groupby(['timestamp','ID']).apply(calcImbalance)
and here is calcImbalance:
def calcImbalance(pole):
    phA = pole.loc[pole['Phase'] == 'A']['current'].astype('float')
    phB = pole.loc[pole['Phase'] == 'B']['current'].astype('float')
    phC = pole.loc[pole['Phase'] == 'C']['current'].astype('float')
    imb = abs((phA-phB)/phB)
    print('imb:', imb)
    if imb >= 0.3:
        return pole
    imb = abs((phB-phA)/phA)
    if imb >= 0.3:
        return pole
    imb = abs((phA-phC)/phC)
    if imb >= 0.3:
        return pole
    imb = abs((phC-phA)/phA)
    if imb >= 0.3:
        return pole
But this just prints:
imb: 2661 NaN
2662 NaN
Name: Amps, dtype: float64
imb: 2661 NaN
2662 NaN
Name: Amps, dtype: float64
and then throws an exception:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
What I'm trying to do is create a dataframe containing only the rows of df that have a > 30% difference between phases. I think I have gone down a rabbit hole for something that seems like it should be trivial.
In the above example, the 'imbalanced' dataframe should contain:
ID | timestamp |Phase| current
========================================
001 | 2020-09-20 07:00 | A | 1.4
001 | 2020-09-20 07:00 | B | 2.0
Note that the apply function doesn't test the imbalance between phases B & C, only between A & B and between A & C.
IIUC, you can find the desired rows using built-in pandas functions, without a custom apply.
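A minimal sketch of this approach, assuming the column names from the question (`ID`, `timestamp`, `Phase`, `current`). The `cng` column holds each row's change relative to the first row of its group; the helper names `keys`, `step`, `growth`, and `exceeds` are only illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ID": ["001"] * 3 + ["002"] * 3,
    "timestamp": ["2020-09-20 07:00"] * 3 + ["2020-09-20 09:00"] * 3,
    "Phase": ["A", "B", "C"] * 2,
    "current": [1.4, 2.0, 1.6, 1.4, 1.23, 1.46],
})

keys = ["ID", "timestamp"]

# Row-to-row percent change within each group (NaN on each group's first row).
step = df.groupby(keys)["current"].pct_change()

# Chain the steps with a cumulative product to get each row's change
# relative to the first row of its group. The first row is treated as 0% change.
growth = step.fillna(0).add(1)
df["cng"] = growth.groupby([df[k] for k in keys]).cumprod().sub(1)

# Keep every row of a group in which any phase moved by more than 30%.
exceeds = df["cng"].abs().gt(0.3)
imbalanced = df[exceeds.groupby([df[k] for k in keys]).transform("any")]
print(imbalanced)
```

This keeps all rows of a flagged group, the same way `calcImbalance` returns the whole `pole`. If only the offending rows are wanted, `df[exceeds]` selects those instead.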
Output
How this works
To find groups with changes between phases > .30
Output
This gives the percent change in groups
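As a sketch of this step, assuming it is done with `GroupBy.pct_change` on the question's sample data: each row is compared with the previous row of the same ID/timestamp group, so the first row of every group has nothing to compare against and comes out as NaN. The name `step` is illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ID": ["001"] * 3 + ["002"] * 3,
    "timestamp": ["2020-09-20 07:00"] * 3 + ["2020-09-20 09:00"] * 3,
    "Phase": ["A", "B", "C"] * 2,
    "current": [1.4, 2.0, 1.6, 1.4, 1.23, 1.46],
})

# Percent change from the previous row, restarted for every group,
# so phase A of ID 002 is never compared with phase C of ID 001.
step = df.groupby(["ID", "timestamp"])["current"].pct_change()
print(step)
```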
Output
The accumulated changes per group
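One way to sketch the accumulation, assuming the per-row changes come from `pct_change`: add 1 to turn each change into a growth factor, take the cumulative product within the group, and subtract 1 again. The result should match dividing each current by the first current of its group, which the `direct` series below checks. The names `cng`, `grouper`, and `direct` are illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ID": ["001"] * 3 + ["002"] * 3,
    "timestamp": ["2020-09-20 07:00"] * 3 + ["2020-09-20 09:00"] * 3,
    "Phase": ["A", "B", "C"] * 2,
    "current": [1.4, 2.0, 1.6, 1.4, 1.23, 1.46],
})

keys = ["ID", "timestamp"]
grouper = [df[k] for k in keys]

# Per-row change; the first row of each group is treated as 0% change.
step = df.groupby(keys)["current"].pct_change().fillna(0)

# Accumulated change relative to the first row of each group.
cng = step.add(1).groupby(grouper).cumprod().sub(1)

# Cross-check: the same quantity computed directly from the first row.
direct = df["current"] / df.groupby(keys)["current"].transform("first") - 1
print(pd.DataFrame({"cng": cng, "direct": direct}))
```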
Output
What can this solution detect?
In dataframe
With this solution
Result. Note that cng is the cumulative product, used to compute the change relative to the first row.
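To illustrate why the cumulative product matters, here is a small sketch with hypothetical readings (the ID `003` and its values are made up for this example). Each phase is 20% above the previous one, so no single step crosses the 30% threshold, but the change from the first row to the last one does.

```python
import pandas as pd

# Hypothetical readings: each phase is 20% above the previous one.
df = pd.DataFrame({
    "ID": ["003"] * 3,
    "timestamp": ["2020-09-20 11:00"] * 3,
    "Phase": ["A", "B", "C"],
    "current": [1.0, 1.2, 1.44],
})

keys = ["ID", "timestamp"]

# Row-to-row change within the group (first row treated as 0% change).
step = df.groupby(keys)["current"].pct_change().fillna(0)

# Accumulated change relative to the first row of the group.
df["cng"] = step.add(1).groupby([df[k] for k in keys]).cumprod().sub(1)

step_flags = step.abs().gt(0.3).any()        # row-to-row check only
total_flags = df["cng"].abs().gt(0.3).any()  # check against the first row
print(step_flags, total_flags)
```

Checking only row-to-row changes would miss this group, while the accumulated change flags it.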