Clipping tensor data to a bounding volume


I have two questions about TensorFlow 2.0, focused on how TensorFlow handles combined conditional tests in its operations graph.

The task: cut a volume of data points into blocks, and store the indices of the samples that belong to each block (not the samples themselves).

My initial approach: loop over all elements and collect the indices of the data points that lie inside the 'bounding volume'. This was pretty slow, no matter how I reordered the comparisons on the coordinates.

    # X.shape == [elements,features]
    # xmin.shape == xmax.shape == [features]

    def getIndices(X, xmin, xmax):
        i = 0
        indices = tf.zeros([0], dtype = tf.int32)  # start with an empty index list
        for x in X:
            if (x[0] > xmin[0]):
                if (x[1] > xmin[1]):
                    if (x[2] <= xmax[2]):

                        # ...and so on...

                        indices = tf.concat([indices, [i]], axis = 0)
            i = i + 1
        return indices

I then came up with the idea of producing boolean tensors and logically 'and'-ing them to get the indices of the elements I need. This is a whole lot faster, as the next sample shows:

    # X.shape == [elements,features]
    # xmin.shape == xmax.shape == [features]

    def getIndices(X, xmin, xmax):
        # example of 3 different conditions to clip to (a part of) the bounding volume 
        # X is the data and xmin and xmax are tensors containing the bounding volume

        c0 = (X[:,0] >   xmin[0])
        c1 = (X[:,1] >   xmin[1]) # processing all elements
        c2 = (X[:,2] <=  xmax[2]) # idem

        # ... there could be many more conditions, you get the idea..

        # combine the three masks; tf.where yields shape [n, 1], so flatten it
        indices = tf.where(tf.math.logical_and(c0, tf.math.logical_and(c1, c2)))[:, 0]

        return indices

    #    ...

    indices = getIndices(X, xmin, xmax)
    trimmedX = tf.gather(X, indices)

This code produces the correct result, but I wonder if it is optimal.

The first question is about scheduling:

Will the TensorFlow graph that holds the operations cull (blocks of) conditional tests when it knows that some (blocks of) elements have already tested False? Because the logical conditions are combined with logical_and, no subsequent conditional test on those elements can ever yield True.

Indeed, in the above example c1 and c2 ask questions about elements that c0 may already have excluded from the set. Especially with a high number of elements to test, this could be a waste of time, even on parallel hardware platforms.

So, what if we cascade the tests, each based on the result of the previous one? Although it looks like it solves the problem, this approach is incorrect, because the final indices tensor refers to a subset _X, not to the total set X:

    # X.shape == [elements,features]
    # xmin.shape == xmax.shape == [features]

    def getIndices(X, xmin, xmax):
        c0 = (X[:,0] >   xmin[0])
        indices = tf.where(c0)[:, 0]   # flatten: tf.where yields shape [n, 1]
        _X = tf.gather(X, indices)

        c1 = (_X[:,1] >   xmin[1]) # processing only trimmed elements
        indices = tf.where(c1)[:, 0]
        _X = tf.gather(_X, indices)

        c2 = (_X[:,2] <=  xmax[2]) # idem
        indices = tf.where(c2)[:, 0]
        return indices

    ...
    indices = getIndices(X, xmin, xmax)
    trimmedX = tf.gather(X, indices)  # fails: indices refer to a trimmed subset, not X

I could of course 'solve' this by simply expanding X so that each element also carries its own index in the original list, and then proceed as before.
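For illustration, a minimal sketch of that idea (my own code, not from the question or any official API; the helper name `getIndicesCascaded` is made up): prepend each row's original index as an extra column, trim with `tf.boolean_mask` at every stage, and read the surviving indices back from that column at the end.

```python
import tensorflow as tf

def getIndicesCascaded(X, xmin, xmax):
    # prepend each row's original index as column 0
    # (exact for float32 as long as there are fewer than 2**24 rows)
    n = tf.shape(X)[0]
    idx = tf.cast(tf.range(n), X.dtype)[:, tf.newaxis]
    _X = tf.concat([idx, X], axis=1)           # shape [elements, 1 + features]

    # cascade the tests; feature k now lives at column k + 1
    _X = tf.boolean_mask(_X, _X[:, 1] > xmin[0])
    _X = tf.boolean_mask(_X, _X[:, 2] > xmin[1])
    _X = tf.boolean_mask(_X, _X[:, 3] <= xmax[2])

    # column 0 still refers to rows of the original X
    return tf.cast(_X[:, 0], tf.int64)

X = tf.random.uniform((10, 3))
xmin = tf.constant([0.2, 0.2, 0.2])
xmax = tf.constant([0.8, 0.8, 0.8])
indices = getIndicesCascaded(X, xmin, xmax)
trimmedX = tf.gather(X, indices)   # valid: indices refer to the original X
```

This trades one extra column of memory for correct bookkeeping; whether the cascading actually saves time over a single combined mask would still have to be measured.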

So my second question is about functionality:

Does tf have a method that lets the GPU/tensor infrastructure handle this bookkeeping, without spending memory and time on this seemingly simple problem?
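For comparison, one built-in that takes over part of this bookkeeping is `tf.boolean_mask`: it selects rows directly from a combined mask, so no explicit index tensor is needed when only the trimmed samples matter. A sketch under that assumption (it does not claim to skip the redundant per-element tests discussed above):

```python
import tensorflow as tf

X = tf.random.uniform((100, 3))
xmin = tf.constant([0.25, 0.25, 0.25])
xmax = tf.constant([0.75, 0.75, 0.75])

# one combined boolean mask over all rows
mask = (X[:, 0] > xmin[0]) & (X[:, 1] > xmin[1]) & (X[:, 2] <= xmax[2])

trimmedX = tf.boolean_mask(X, mask)   # the rows themselves, no index tensor
indices = tf.where(mask)[:, 0]        # the indices, if they are still needed
```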

1 Answer

This will return the indices of all entries larger than the minimum and less than the maximum, when both bounds have the same number of features as x:

    import tensorflow as tf

    minimum = tf.random.uniform((1, 5), 0., 0.5)
    maximum = tf.random.uniform((1, 5), 0.5, 1.)

    x = tf.random.uniform((10, 5))

    indices = tf.where(
        tf.logical_and(
            tf.greater(x, minimum),
            tf.less(x, maximum)
        )
    )

    <tf.Tensor: shape=(22, 2), dtype=int64, numpy=
    array([[0, 3],
           [0, 4],
           [1, 1],
           [1, 2],
           [1, 3],
           [1, 4],
           [3, 1],
           [3, 3],
           [3, 4],
           [4, 0],
           [4, 4],
           [5, 3],
           [6, 2],
           [6, 3],
           [7, 1],
           [7, 4],
           [8, 2],
           [8, 3],
           [8, 4],
           [9, 1],
           [9, 3],
           [9, 4]], dtype=int64)>
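Note that `tf.where` on this 2-D mask returns `[row, column]` pairs for every in-bounds element. If, as in the question, a row should pass only when every feature is inside the bounds, one way (my sketch, not part of the answer above) is to reduce the mask over the feature axis first:

```python
import tensorflow as tf

minimum = tf.random.uniform((1, 5), 0., 0.5)
maximum = tf.random.uniform((1, 5), 0.5, 1.)
x = tf.random.uniform((10, 5))

# per-element mask, shape (10, 5)
inside = tf.logical_and(tf.greater(x, minimum), tf.less(x, maximum))

# keep only the rows where every feature is inside the bounds
row_indices = tf.where(tf.reduce_all(inside, axis=1))[:, 0]
trimmed = tf.gather(x, row_indices)
```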