How to perform shuffle on a list of words?


I want to do a shuffle on a list of words in R, either using permute::shuffle/shuffleSet() or any other function.

Code:

require(permute)
a <- c("Ar","Ba","Bl","Bu","Ca")
x <- list(a)
x
shuffleSet(x)
shuffle(x)

But when I run this code, I get the following errors:

shuffleSet : Error in seq_len(n) : argument must be coercible to non-negative integer

shuffle : Error in factor(rep(1, n)) : 
  (list) object cannot be coerced to type 'integer'

Should permute::shuffle()/shuffleSet() be used only for integers? How do I shuffle the list above? Is there any other function that does this?


There are 3 best solutions below

3
On BEST ANSWER

Why not use sample()?

     > a <- c("Ar","Ba","Bl","Bu","Ca")
     > a
     [1] "Ar" "Ba" "Bl" "Bu" "Ca"
     > x <- list(a)
     > x
     [[1]]
     [1] "Ar" "Ba" "Bl" "Bu" "Ca"
     > x[[1]][sample(1:length(a))]
     [1] "Ba" "Bu" "Ar" "Bl" "Ca"
6
On

If you want to generate the entire shuffle set, rather than a random permutation, you can try the following:

require(permute)
a <- c("Ar","Ba","Bl","Bu","Ca")
x <- list(a)
y <- shuffleSet(as.factor(x[[1]]), quietly=TRUE)
y[] <- sapply(y, function(x) a[x])
y <- as.data.frame(y)
head(y)
#  V1 V2 V3 V4 V5
#1 Ar Ba Bl Ca Bu
#2 Ar Ba Bu Bl Ca
#3 Ar Ba Bu Ca Bl
#4 Ar Ba Ca Bl Bu
#5 Ar Ba Ca Bu Bl
#6 Ar Bl Ba Bu Ca

Hope this helps.

1
On

This is really a follow-up to the other answers - go upvote both of them, not this one.

If all you want to do is randomly permute a vector of observations, sample() from the base R installation works just fine (as shown by @t.f in their answer):

a <- c("Ar","Ba","Bl","Bu","Ca")
x <- list(a)
sample(a)
sample(x[[1]])

> sample(a)
[1] "Ca" "Bu" "Ba" "Ar" "Bl"
> sample(x[[1]])
[1] "Bl" "Ba" "Bu" "Ar" "Ca"

Note you don't need sample(1:length(a)) or to use this to index into x[[1]]; sample() does the right thing if you give it a vector as input, as shown above.
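Since the question starts from a list rather than a bare vector, the same idea can be applied to every element of the list with lapply(). The following is a minimal sketch using only base R; the seed value and the name shuffled are illustrative choices, not something taken from the answers above.

```r
set.seed(1)  # illustrative seed, only so the shuffle can be reproduced

a <- c("Ar", "Ba", "Bl", "Bu", "Ca")
x <- list(a)

# Shuffle each element of the list independently; with a single
# element this is equivalent to sample(x[[1]]).
shuffled <- lapply(x, sample)
shuffled[[1]]
```

One caveat worth keeping in mind: sample() treats a single number n as shorthand for 1:n, so sample(5) returns a permutation of 1 to 5. Character vectors such as a are not affected by this.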

shuffle() and shuffleSet() were designed as an interface to restricted permutations, but they also have to handle complete randomisation as a special case. If you dig deep enough, you'll see that shuffle() and shuffleSet() call shuffleFree(), which internally uses sample.int() (for efficiency reasons; for all intents and purposes this is a call to sample() that is just a bit more explicit about setting all arguments).

Because these two functions were designed for more general problems, all they need to know is the number of observations to permute; hence the first argument to both should be the number of observations. If you pass them a vector, then as a little bit of sugar I just compute n from the size of the object (length for a vector, nrow for a matrix etc).
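As a sketch of what that means for the example in the question, you can pass the number of words rather than the words themselves and then use the result as an index. This assumes the permute package is installed; idx is just an illustrative name.

```r
library(permute)  # assumes the permute package is installed

a <- c("Ar", "Ba", "Bl", "Bu", "Ca")

# shuffle() is given the number of observations, not the observations;
# it returns a random permutation of the indices 1..length(a).
idx <- shuffle(length(a))

# Apply the permuted indices to the words to get the shuffled vector.
a[idx]
```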

@RHertel notes that shuffleSet() generates, with this example, all the permutations of the set a. This is due to heuristics that try to provide better random permutations when the set of permutations is small. A more direct way of generating the set of all permutations is to do it via allPerms():

> head(allPerms(seq_along(a)))
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    5    4
[2,]    1    2    4    3    5
[3,]    1    2    4    5    3
[4,]    1    2    5    3    4
[5,]    1    2    5    4    3
[6,]    1    3    2    4    5

Note that allPerms(a) won't work because a is of class "character" and there isn't a nobs() method for that (but I'll go fix that in permute so it will work in future.)

shuffle() and shuffleSet() return permutations of the indices of the thing you want to permute, not the permuted elements themselves; in fact they never really know about the actual elements you want to permute. That is why @RHertel uses the sapply() call to apply the permuted indices to a. A quicker way of doing this is to store the permutations and then insert the elements of a, indexed by the permutation matrix, back into that matrix:

perms <- allPerms(seq_along(a)) # store all permutations
perms[] <- a[perms]             # replace elements of perms with shuffled elements of a

> perms <- allPerms(seq_along(a))
> perms[] <- a[perms]
> head(perms)
     [,1] [,2] [,3] [,4] [,5]
[1,] "Ar" "Ba" "Bl" "Ca" "Bu"
[2,] "Ar" "Ba" "Bu" "Bl" "Ca"
[3,] "Ar" "Ba" "Bu" "Ca" "Bl"
[4,] "Ar" "Ba" "Ca" "Bl" "Bu"
[5,] "Ar" "Ba" "Ca" "Bu" "Bl"
[6,] "Ar" "Bl" "Ba" "Bu" "Ca"

The efficiency here is that the replacement is done in a single call to the replacement function [<-(), rather than in separate calls to [(). You need the [] on perms; otherwise you would overwrite perms with a plain vector instead of replacing its elements in place.
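To see the difference the [] makes, here is a small base R sketch. It uses a hand-written index matrix (idx, a hypothetical stand-in for the output of allPerms()) so that it runs without permute.

```r
a <- c("Ar", "Ba", "Bl")

# Hypothetical index matrix: each row is one permutation of 1..3.
idx <- rbind(c(1, 3, 2),
             c(2, 1, 3))

# With []: the elements are replaced in place, so the matrix shape is kept.
in_place <- idx
in_place[] <- a[idx]

# Without []: the object is overwritten by a plain character vector.
overwritten <- a[idx]

dim(in_place)     # still a 2 x 3 matrix, now holding words
dim(overwritten)  # NULL, the row/column structure is gone
```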