I am given the following sample array:

[[1, 0, 1, 0], [1, 0, 1, 0], [1, 0, 1, 0], [1, 1, 1, 1], [1, 1, 1, 1], [0, 1, 1, 0], [0, 1, 1, 0], [1, 0, 0, 1]]

I want to find all of the largest continuous groups of rows (I will call them vector spaces) that are closed under the following condition: for any two vectors x1 and x2 in the group, the sum of their bitwise_and is at least some user-defined constant (i.e. np.bitwise_and(x1, x2).sum() >= constant).

For example, for the above array, there are three possible vector spaces:

V1: [[1, 0, 1, 0], [1, 0, 1, 0], [1, 0, 1, 0], [1, 1, 1, 1], [1, 1, 1, 1]]

V2: [[1, 1, 1, 1], [1, 1, 1, 1], [0, 1, 1, 0], [0, 1, 1, 0]]

V3: [[1, 0, 0, 1]]

In the first vector space, for any two vectors the sum of their bitwise_and is >= 2 (the chosen constant). The same holds for the second and third vector spaces.
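To make the condition concrete, here is a minimal sketch of the check each group is expected to pass. The helper name is_closed is only illustrative, and it assumes that only pairs of distinct rows are compared (so a single-row group passes trivially).

```python
from itertools import combinations

import numpy as np


def is_closed(group, threshold):
    """True if every pair of distinct rows has a bitwise-AND sum >= threshold."""
    rows = [np.asarray(row) for row in group]
    return all(
        np.bitwise_and(a, b).sum() >= threshold
        for a, b in combinations(rows, 2)
    )


V1 = [[1, 0, 1, 0], [1, 0, 1, 0], [1, 0, 1, 0], [1, 1, 1, 1], [1, 1, 1, 1]]
V2 = [[1, 1, 1, 1], [1, 1, 1, 1], [0, 1, 1, 0], [0, 1, 1, 0]]
V3 = [[1, 0, 0, 1]]

for name, group in (("V1", V1), ("V2", V2), ("V3", V3)):
    print(name, is_closed(group, 2))
```

Appending the next row of the array, [0, 1, 1, 0], to V1 breaks the condition, because its bitwise_and with [1, 0, 1, 0] sums to 1.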

I tried several ways, but can't reach an efficient solution. Any tips or suggestions would be helpful.

I tried two different ways:

First Approach: Using Regular Expressions as follows:

import re

def match_strings(strings, pattern, indices):
    # Keep (index, string) pairs whose string matches the pattern from its start.
    matched_strings = [(idx, string) for idx, string in zip(indices, strings) if re.match(pattern, string)]
    return matched_strings

def identify_user_distribution(availability_matrix, indices):
    # Turn each row into a bit string, e.g. [1, 0, 1, 0] -> '1010'.
    string_rep = [''.join(str(item) for item in instance) for instance in availability_matrix]

    # Each distinct row becomes a pattern in which every 0 acts as a wildcard.
    patterns = set(string_rep)
    patterns = [instance.replace('0', "[01]+") for instance in patterns]

    # Record the first and last matching index for each pattern.
    clusters = []
    for instance_pattern in patterns:
        cluster = match_strings(string_rep, instance_pattern, indices)
        clusters.append([cluster[0][0], cluster[-1][0]])

    return clusters

With this approach the problem is partly solved, but the resulting groups are not closed. For example, with a pattern like 1***, the group contains all the vectors that satisfy the property, but it also contains some that do not: 1000, 1101, and 1111 all end up in the same group.
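A small reproduction of that problem, as a sketch: the pattern is built with the same substitution used in identify_user_distribution, and the threshold of 2 is the constant from the example above. The helper and_sum exists only for this illustration.

```python
import re
from itertools import combinations

import numpy as np

# Same substitution as in identify_user_distribution: every 0 becomes a wildcard.
pattern = "1000".replace("0", "[01]+")
candidates = ["1000", "1101", "1111"]
matched = [s for s in candidates if re.match(pattern, s)]


def and_sum(a, b):
    """Sum of the bitwise AND of two bit strings such as '1010'."""
    va = np.array([int(c) for c in a])
    vb = np.array([int(c) for c in b])
    return int(np.bitwise_and(va, vb).sum())


# All three strings match the pattern, yet not every pair reaches the threshold.
pair_sums = {(a, b): and_sum(a, b) for a, b in combinations(matched, 2)}
for pair, total in pair_sums.items():
    print(pair, total, "ok" if total >= 2 else "below threshold")
```

Both pairs involving 1000 sum to 1, so the regex match alone does not guarantee the pairwise condition.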

So I tried a NumPy approach, something like the code below:

import numpy as np

def bitwise_and_groups_with_sum(vectors, target_sum):
    groups = {}
    index = {}
    flag = False
    for idx, vector in enumerate(vectors):
        # Bit-string form of the row, used as the key when a new group is started.
        temp = ''
        for item in vector:
            temp += str(item)

        if len(groups) == 0:
            groups[temp] = [vector]
            index[temp] = [idx]
        else:
            for key, value in groups.items():
                if vector in value:
                    index[key].append(idx)
                    continue
                elif np.all([np.bitwise_and(vector, instance).sum() == target_sum for instance in value]):
                    groups[key].append(vector)
                    index[key].append(idx)
                    flag = True

            if not flag:
                groups[temp] = [vector]
                index[temp] = [idx]
                flag = False

    return index

But I am not able to formulate the right logic to preserve the closure property.
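One possible reading of the closure requirement, sketched here as an assumption and not as a confirmed solution: a group is a run of consecutive rows, every pair of distinct rows in it must meet the threshold, and a run is kept only if it cannot be extended in either direction. Since every sub-run of a valid run is also valid, a two-pointer sweep over the rows seems to be enough. The function name maximal_windows is hypothetical.

```python
import numpy as np


def maximal_windows(vectors, threshold):
    """Return inclusive (start, end) index pairs of maximal runs of consecutive
    rows in which every pair of distinct rows has an AND-sum >= threshold."""
    x = np.asarray(vectors, dtype=int)
    # For 0/1 rows, np.bitwise_and(a, b).sum() equals the dot product a @ b.
    ok = (x @ x.T) >= threshold
    windows = []
    left = 0  # smallest start such that rows left..right-1 are pairwise valid
    for right in range(len(x)):
        # Earlier rows that fail the condition against row `right`.
        bad = np.flatnonzero(~ok[right, :right])
        new_left = max(left, int(bad[-1]) + 1) if bad.size else left
        if new_left > left:
            # Row `right` cannot join the current run, so that run is maximal.
            windows.append((left, right - 1))
        left = new_left
    if len(x):
        windows.append((left, len(x) - 1))
    return windows


sample = [[1, 0, 1, 0], [1, 0, 1, 0], [1, 0, 1, 0], [1, 1, 1, 1],
          [1, 1, 1, 1], [0, 1, 1, 0], [0, 1, 1, 0], [1, 0, 0, 1]]
print(maximal_windows(sample, 2))
```

On the sample array with a threshold of 2 this gives the index ranges (0, 4), (3, 6), and (7, 7), which correspond to V1, V2, and V3. The pairwise table costs O(n^2) memory, so this is only a starting point for large inputs.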
