How to find a sequence of numbers in an array in R?

I want to find a specific sequence of numbers in an array. As an example, suppose I want to find the sequence of two numbers 7, i.e., c(7,7).

Take matrix M, where

set.seed(100)
M = matrix(sample(10,100,replace = T), nrow = 10)

If you run the code, both M[5,4] and M[5,5] are equal to 7. So matrix M does have a sequence equal to the one I'm looking for.

As a result, I'd like to know the index of the row in which the sequence is located; here, the answer would be 5. Even better, I'd like to know the column at which the sequence starts. In this case, that would be 4.

I found two answers here on StackOverflow related to this topic: Question 1 and Question 2.

Question 1 is about finding a sequence in an array. I tried coupling the solution given with an apply function as in

apply(M, 1, *solution from Question 1*)

but it didn't work out.

Question 2 seems to do what I want, but it's in PHP and I didn't fully grasp the code.

To do all this, I'm using R. Thanks for the feedback.

There are 1 best solutions below

The elements of a matrix or array are stored in "column-major" order, and diff() applied to a matrix takes differences down each column, so you need to test both the original matrix and its transpose if you want runs that occur along columns as well as along rows.
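A minimal sketch of that behaviour, using a small made-up matrix `m` (not the M from the question) so the effect is easy to check by eye:

```r
# Hypothetical 2 x 3 matrix: a vertical pair of 1's in column 1 and a
# horizontal pair of 2's in row 1.
m <- matrix(c(1, 2, 2,
              1, 5, 6), nrow = 2, byrow = TRUE)

# Elements are stored column by column.
as.vector(m)

# diff() works down each column, so only the vertical pair is detected.
which(diff(m) == 0)

# Transposing first makes diff() compare neighbours within each row of m.
which(diff(t(m)) == 0)
```

Here the first which() call finds only the vertical 1's and the second finds only the horizontal 2's, which is why both calls are needed.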

> which(diff(M)==0)
[1]  5 17 48 61 68 75
> which(diff(t(M))==0)
[1]  5  7  8 35 40 64 90
> M
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    4    7    6    5    4    4    5    5    5    10
 [2,]    3    9    8   10    9    2    7    4    6     3
 [3,]    6    3    6    4    8    3   10    6   10     4
 [4,]    1    4    8   10    9    3    7   10   10     5
 [5,]    5    8    5    7    7    6    5    7    1    10
 [6,]    5    7    2    9    5    3    4    7    6     4
 [7,]    9    3    8    2    8    2    5    9    8     6
 [8,]    4    4    9    7    9    3    5    8    3     2
 [9,]    6    4    6   10    3    6    3    9    4     1
[10,]    2    7    3    2    4    3    7    1    8     8

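The 5 in the first which() result refers to the 5's at positions [5:6,1], while the 5 in the which() done on the transposed M refers to the 4's at positions [1,5:6]. The adjacent 7's you were asking about are identified by the 40 in the second result (the 35 marks the 10's at [4,8:9]): diff(t(M)) has 9 rows, so linear index 40 is row 4 of column 5, which corresponds to row 5 of M starting at column 4.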
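Linear indices are awkward to read, so here is one possible way to get row and column positions directly and keep only the runs of 7. This is a sketch, not the only approach: M is typed in from the printed output above (so it does not depend on how sample() behaves in a given R version), and the names `same_as_next`, `hits` and `sevens` are just illustrative.

```r
# M re-entered from the printed output above.
M <- matrix(c(
   4, 7, 6,  5, 4, 4,  5,  5,  5, 10,
   3, 9, 8, 10, 9, 2,  7,  4,  6,  3,
   6, 3, 6,  4, 8, 3, 10,  6, 10,  4,
   1, 4, 8, 10, 9, 3,  7, 10, 10,  5,
   5, 8, 5,  7, 7, 6,  5,  7,  1, 10,
   5, 7, 2,  9, 5, 3,  4,  7,  6,  4,
   9, 3, 8,  2, 8, 2,  5,  9,  8,  6,
   4, 4, 9,  7, 9, 3,  5,  8,  3,  2,
   6, 4, 6, 10, 3, 6,  3,  9,  4,  1,
   2, 7, 3,  2, 4, 3,  7,  1,  8,  8),
  nrow = 10, byrow = TRUE)

# TRUE where an element equals its right-hand neighbour (10 x 9 matrix,
# oriented like M).
same_as_next <- t(diff(t(M)) == 0)

# Row/column positions of every horizontal pair, instead of linear indices.
hits <- which(same_as_next, arr.ind = TRUE)

# Keep only the pairs whose value is 7; M[hits] looks up the value at each
# starting position.
sevens <- hits[M[hits] == 7, , drop = FALSE]
sevens
```

For this M, `sevens` has a single row with row 5 and col 4, which is the row and starting column asked for in the question.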

You might want to look at these two matrices. You could add a column of FALSE at the end of the first (the row-wise result) and a row of FALSE below the second (the column-wise result) if you wanted the results to "line up" with the original:

> t(diff(t(M))==0)
       [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9]
 [1,] FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE  TRUE FALSE
 [2,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [3,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [4,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
 [5,] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
 [6,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [7,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [8,]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [9,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[10,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
> diff(M)==0
       [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
 [1,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [2,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [3,] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE
 [4,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [5,]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
 [6,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [7,] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
 [8,] FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [9,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
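The padding with FALSE could be sketched as follows. To keep the example short it uses a small made-up matrix `m` rather than M, and the names `row_hits` and `col_hits` are only illustrative; the same two lines should work on M unchanged.

```r
# Hypothetical 3 x 4 matrix with a few horizontal and vertical pairs.
m <- matrix(c(1, 1, 2, 3,
              4, 5, 5, 3,
              4, 6, 7, 7), nrow = 3, byrow = TRUE)

# Horizontal pairs: one column short, so append a column of FALSE.
row_hits <- cbind(t(diff(t(m)) == 0), FALSE)

# Vertical pairs: one row short, so append a row of FALSE.
col_hits <- rbind(diff(m) == 0, FALSE)

# Both now have the same shape as m, so a TRUE at [i, j] marks the
# position in m where a pair starts.
which(row_hits, arr.ind = TRUE)
which(col_hits, arr.ind = TRUE)
```

Because the padded matrices have the same dimensions as the original, they can also be used directly as logical indices, for example `m[row_hits]` to see which values start a horizontal pair.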