Condition based array zero padding


I have two arrays:

 a = numpy.array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
 label = numpy.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])

I want to pad array a with zeros according to the following condition:

If the label[i-1] != label[i]:
   pad several zeros (say, 3) to the 'a' array at the same 'i' location

So, my desired result would be:

a = numpy.array([ 1,  2,  3,  4,  5,  6,  7, 0, 0, 0, 8,  9, 10])
label = numpy.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])

As you can see, array a now has 3 zeros after the value 7, inserted at the position where the label value changes.

I have tried the following code:

for i in range(len(a)):
    if label[i-1] != label[i]:         
        a = numpy.pad(a, (0,3), 'constant')
    else:
       pass

But the zeros are padded at the end of array a. I suspect the problem is related to assigning the padded result back to the same array while it is changing within the for loop.

There are 4 best solutions below

0
On BEST ANSWER

Here's a NumPy-based approach:

import numpy as np

def pad_at_diff(x, y, n):
    # boolean mask: True where the label differs from the previous one
    m = np.r_[False, y[:-1] != y[1:]]
    # output array, enlarged by n zeros per label change
    # (np.zeros defaults to float, so the result is a float array)
    x_pad = np.zeros(len(x) + n * m.sum())
    # shift each value right by n for every change seen so far
    x_pad[np.arange(len(x)) + np.cumsum(m) * n] = x
    return x_pad

a = np.array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
label = np.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])
pad_at_diff(a, label, 3)
array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  0.,  0.,  0.,  8.,  9., 10.])

Or for this other example:

a = np.array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10,11,12])
label = np.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c'])
pad_at_diff(a, label, 3)
array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  0.,  0.,  0.,  8.,  9., 10.,
        0.,  0.,  0., 11., 12.])
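
The outputs above are floats because np.zeros allocates float64 by default. If the input dtype should be kept, a minimal sketch of a variant could look like the following. The name pad_at_diff_keep_dtype is hypothetical and not part of the original answer; the indexing idea is unchanged.

```python
import numpy as np

def pad_at_diff_keep_dtype(x, y, n):
    # hypothetical variant: same indexing idea, but keeps the dtype of x
    x = np.asarray(x)
    y = np.asarray(y)
    # True at every position whose label differs from the previous one
    m = np.r_[False, y[:-1] != y[1:]]
    # allocate the output with the input dtype instead of the float default
    out = np.zeros(len(x) + n * m.sum(), dtype=x.dtype)
    # shift each element right by n for every label change seen so far
    out[np.arange(len(x)) + np.cumsum(m) * n] = x
    return out

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
label = np.array(['a'] * 7 + ['b'] * 3)
padded = pad_at_diff_keep_dtype(a, label, 3)
print(padded)
```

This sketch assumes x and y are non-empty and have the same length.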
2
On
  • The changes to a depend on label, so traverse label rather than a.
  • Add an i != 0 condition to the if. Otherwise, at i = 0 the index -1 wraps around to the last element, so zeros are also inserted before the first element whenever the first and last labels differ.
import numpy as np
a = np.array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
label = np.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])

len_ = 2    # number of zeros to insert at each label change
offset = 0  # how far earlier insertions have shifted the indices
for i in range(len(label)):
    if i != 0 and label[i-1] != label[i]:
        a = np.insert(a, i + offset, np.array([0] * len_))
        offset += len_
print(a)

output:

[ 1  2  3  4  5  6  7  0  0  8  9 10]
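
To see why the i != 0 guard matters, here is a small sketch that compares the positions flagged with and without it. The variable names unguarded and guarded are only for illustration.

```python
import numpy as np

label = np.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])

# at i = 0, label[i - 1] is label[-1], i.e. the last element
print(label[0 - 1])  # b

# positions flagged without the i != 0 guard
unguarded = [i for i in range(len(label)) if label[i - 1] != label[i]]
# positions flagged with the guard
guarded = [i for i in range(len(label)) if i != 0 and label[i - 1] != label[i]]

print(unguarded)  # [0, 7]
print(guarded)    # [7]
```

Without the guard, position 0 is flagged because the first label ('a') differs from the last one ('b').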
0
On

Is this what you want?

>>> for i in range(a.size-1):
...     if label[i] != label[i+1]:
...         # np.insert returns a new array; a itself is not modified
...         np.insert(a, i+1, [0]*3)

This is what I got:

array([ 1, 2, 3, 4, 5, 6, 7, 0, 0, 0, 8, 9, 10])

I have used your if condition for reference.
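
Note that np.insert returns a new array rather than changing a in place. As a sketch of a loop-free alternative (not from the original answer), np.insert also accepts a sequence of indices that all refer to positions in the original array, so several label changes can be handled in one call. The names starts, n and result are assumptions made for this example.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
label = np.array(['a'] * 7 + ['b'] * 3 + ['c'] * 2)
n = 3  # number of zeros per label change

# indices in the original array where a new label starts
starts = np.flatnonzero(label[1:] != label[:-1]) + 1

# repeating each index n times inserts n zeros before that position
result = np.insert(a, np.repeat(starts, n), 0)

print(result)  # [ 1  2  3  4  5  6  7  0  0  0  8  9 10  0  0  0 11 12]
print(a)       # the original array is unchanged
```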

0
On

np.pad only adds values at the ends of the array. I think what you are looking for is np.insert. The problem with inserting in a loop is that each insertion shifts the indices of the elements after it. If you loop from the back, though, the positions you still have to visit are unaffected, so it works:

import numpy as np
a = np.array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
label = np.array(['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'])

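n = len(label)
prev = None
for i, ele in enumerate(label[::-1]):
    # use an index counted from the front: inserting further back
    # does not shift the positions that are still to be visited
    if prev is not None and ele != prev:
        a = np.insert(a, n - i, [0, 0, 0])
    prev = ele
print(a)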
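
The difference between np.pad and np.insert described at the start of this answer can be checked on a short array. This is only an illustrative sketch; the names padded_end and inserted are made up for the example.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# np.pad with (0, 3) adds nothing in front and three zeros at the end
padded_end = np.pad(a, (0, 3), 'constant')
print(padded_end)  # [1 2 3 4 5 0 0 0]

# np.insert places the values before the given index
inserted = np.insert(a, 2, [0, 0, 0])
print(inserted)    # [1 2 0 0 0 3 4 5]
```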