Using a mask to calculate a difference between values marked by the mask and follow up values

I have a NumPy array of float values and a boolean mask (True/False). I want to calculate the difference between each value marked True in the mask and the follow-up value in the array. If the difference is below my threshold of 600, the follow-up value should also be marked True in the mask. The use case is anomaly detection for two anomalies next to each other: my algorithm is based on gradients and therefore currently does not work for this specific case.

For example:

x = [1.5 16000 16100 2.5 NaN 3.1 3.4 -15000 4.1 NaN]
mask = [False True False False False False False True False False]
threshold = 600

The calculation in this case should be: abs(abs(16000) - abs(16100)) = 100, and 100 < threshold, so the follow-up value is marked True

and abs(abs(-15000) - abs(4.1)) = 14995.9, and 14995.9 > threshold, so the follow-up value stays False.

Output: mask_new = [False True True False False False False True False False]

It should also be considered that the follow-up value might sometimes be NaN; in that case the mask should stay False. There might be up to 5 Trues in a row.

I was thinking about using np.where, but I don't know how to calculate the difference to the follow up value within a np.where-function.

Can anyone help me to solve this?

There are 2 best solutions below

Soaib On

You can use np.isnan() to check whether the follow-up value is NaN, and calculate the difference only when it is not.

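import numpy as np

# x and mask are assumed to be NumPy arrays (float and bool); mask is updated in place
indices = np.where(mask)[0]  # positions currently marked True
for idx in indices:
    # skip the last element (it has no follow-up value) and NaN follow-up values
    if idx < len(x) - 1 and not np.isnan(x[idx + 1]):
        diff = abs(x[idx] - x[idx + 1])
        if diff < threshold:
            mask[idx + 1] = True
print(mask)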
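
Note that the indices above are computed once from the original mask, so each True extends by at most one position. The question mentions runs of up to 5 Trues, so here is a minimal sketch of a left-to-right variant in which a newly marked value is itself compared with its follow-up value. The function name `propagate_mask` is hypothetical, and it assumes the difference is taken as abs(x[i] - x[i + 1]), as in the loop above. It does not cap the run length at 5.

```python
import numpy as np

def propagate_mask(x, mask, threshold):
    """Return a copy of mask where True spreads to close follow-up values."""
    x = np.asarray(x, dtype=float)
    out = np.array(mask, dtype=bool)  # work on a copy, leave the input untouched
    for i in range(len(x) - 1):
        if out[i] and not out[i + 1]:
            diff = abs(x[i] - x[i + 1])
            # a NaN on either side gives a NaN difference, which is skipped
            if not np.isnan(diff) and diff < threshold:
                out[i + 1] = True  # seen again as out[i] in the next iteration
    return out

x = np.array([1.5, 16000, 16100, 2.5, np.nan, 3.1, 3.4, -15000, 4.1, np.nan])
mask = np.array([False, True, False, False, False, False, False, True, False, False])
print(propagate_mask(x, mask, 600).tolist())
# [False, True, True, False, False, False, False, True, False, False]
```

Because the loop walks forward and reads the updated copy, a chain such as 1000, 1100, 1200 following a single True ends up fully marked.
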
Mohamed Fathallah On

You can use the following function to solve your issue.

  1. The function starts by taking the difference between each pair of consecutive elements with np.diff. Because the data (x) contains NaN, some values in the result will also be NaN.
  2. Next we determine the indices that need updating. mask[:-1] selects all elements of the mask except the last one, so it lines up with the differences between consecutive elements. (np.abs(diff) < threshold) checks whether the absolute difference is below the threshold, and & keeps only the positions that are True in both the mask and the threshold condition. np.where returns a tuple of arrays, so [0] extracts the array of index values.
  3. Now that we have the indices, we update the mask: we iterate over the indices and, if the value at the next index (index+1) is not NaN, set the mask there to True.

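import numpy as np

def update_mask(x, mask, threshold):
    # differences between consecutive elements; NaN wherever either neighbour is NaN
    diff = np.diff(x)
    # indices that are True in the mask and whose follow-up value is within the threshold
    update_indices = np.where(mask[:-1] & (np.abs(diff) < threshold))[0]

    for index in update_indices:
        if not np.isnan(x[index+1]):
            mask[index+1] = True

    return mask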
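
The per-index loop is not strictly needed for a single pass, because a comparison against NaN is always False. A minimal loop-free sketch of the same idea follows; the name `update_mask_vectorized` is hypothetical, and unlike the function above it returns a new array instead of modifying the mask in place. It extends each True by one position only.

```python
import numpy as np

def update_mask_vectorized(x, mask, threshold):
    """One-step mask extension without a Python loop."""
    x = np.asarray(x, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    # silence possible "invalid value" warnings from NaN on older NumPy versions
    with np.errstate(invalid="ignore"):
        close = np.abs(np.diff(x)) < threshold  # NaN differences compare as False
    new_mask = mask.copy()
    # position i + 1 becomes True when position i is True and the step is small
    new_mask[1:] |= mask[:-1] & close
    return new_mask

x = np.array([1.5, 16000, 16100, 2.5, np.nan, 3.1, 3.4, -15000, 4.1, np.nan])
mask = np.array([False, True, False, False, False, False, False, True, False, False])
print(update_mask_vectorized(x, mask, 600).tolist())
# [False, True, True, False, False, False, False, True, False, False]
```
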