Finding a pattern in a binary matrix with R

142 Views Asked by At

I have a nxn symetrical binary matrix and I want to find the largest rectangle (area) with 0 at the top-left and bottom-right corners and 1 at the top-right and bottom-left corner. If I just do it with loops, checking all the rectangles from the biggest to the smallest it takes "days" for n=100. Does anyone have an idea to do it efficiently?

Thanks a lot !

2

There are 2 best solutions below

0
On

Here's an approach that you can try for now. It doesn't require symmetry, and it treats all nonzero elements like ones for efficiency.

It loops over the ones, assuming that there are fewer ones than zeros. (You would want to loop over zeros in the reverse case with fewer zeros than ones.)

This approach probably isn't optimal, since it loops over all of the ones even if the largest box is identified early. You can devise a clever stopping condition to short-circuit the loop in that case. But it is still fast for n = 100, requiring less than half of a second on my machine, even when ones and zeros occur in roughly equal proportion (the worst case):

f <- function(X) {
    if (!is.logical(X)) {
        storage.mode(X) <- "logical"
    }
    J <- which(X, arr.ind = TRUE, useNames = FALSE)
    i <- J[, 1L]
    j <- J[, 2L]
    nmax <- 0L
    res <- NULL
    for (k in seq_along(i)) {
        i0 <- i[k]
        j0 <- j[k]
        ok <- i < i0 & j > j0
        if (any(ok)) {
            i1 <- i[ok]
            j1 <- j[ok]
            ok <- !(X[i0, j1] | X[i1, j0])
            if (any(ok)) {
                i1 <- i1[ok]
                j1 <- j1[ok]
                n <- (i0 - i1 + 1L) * (j1 - j0 + 1L)
                w <- which.max(n)
                if (n[w] > nmax) {
                    nmax <- n[w]
                    res <- c(i0 = i0, j0 = j0, i1 = i1[w], j1 = j1[w])
                }
            }
        }
    }
    res
}
mkX <- function(n) {
    X <- matrix(sample(0:1, n * n, TRUE), n, n)
    X[upper.tri(X)] <- t(X)[upper.tri(X)]
    X
}

set.seed(1L)
X <- mkX(6L)
X
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    0    1    0    0    1    0
## [2,]    1    0    1    1    0    0
## [3,]    0    1    0    1    1    1
## [4,]    0    1    1    0    0    0
## [5,]    1    0    1    0    0    1
## [6,]    0    0    1    0    1    0

f(X)
## i0 j0 i1 j1 
##  5  1  1  5 
Y <- mkX(100L)
microbenchmark::microbenchmark(f(Y))
## Unit: milliseconds
##  expr     min       lq     mean   median       uq      max neval
##  f(Y) 310.139 318.3363 327.8116 321.4109 326.5088 391.9081   100
1
On

thanks for your answers. Matrices I use are adjacency matrices of random Erdos-Renyi graphs. But one can take any random symetrical binary matrix to test it. Until now, I use 4 nested loops :

switch<-function(Mat)
{
n=nrow(Mat) 
for (i in 1:(n-1)) { 
    for(j in seq(n,i+1,by=-1)) {
        for(k in 1:(n-1)) { 
            if ((k==i)||(k==j) || (Mat[i,k]==1)||(Mat[j,k]==0)) next 
            for(l in seq(n,k+1,by=-1)) { 
                if ((l==i)||(l==j)|| (Mat[i,l]==0)||(Mat[j,l]==1)) next 
                return(i,j,k,l)
            }
        }
    }
}