In a contest, each winner and each prize is assigned a random integer in [1, 9], called a "ticket" number, and a unique "ID" number in [1111, 9999]. Each winner receives a unique prize from a limited stock of prizes, chosen from the prizes whose ticket number is within ±1 of the winner's ticket number.
Question 1: Duplicate Prizes
How can I prevent the script (below) from returning duplicate prizes? I've used the duplicated() function before, but I'm unsure how to apply it in this case.
Question 2: Cannot Match a Winner with a Prize
How would I implement this rule in my script: if no non-duplicated prize can be found within the winner's range, return the prize from the unclaimed stock that is the next closest match?
Here's what I have thus far:
# Function to generate a data frame with random parameters
generate <- function(n) {
  ID <- as.factor(sample(1111:9999, n))
  ticket <- sample(1:9, n, replace = TRUE)
  lower.bound <- ticket - 1
  upper.bound <- ticket + 1
  winners.df <- cbind.data.frame(ID, ticket, lower.bound, upper.bound)
  return(winners.df)
}
# Generate a master data frame
master <- generate(20)
# Split master data frame into "prizes" and "winners"
prizes <- master[1:16, ]
winners <- master[17:20, ]
# Eliminate upper/lower bound columns in prizes as they are not needed
prizes <- prizes[, -c(3, 4)]
# Set an empty variable to serve as a container
picks <- list(NULL)
for (x in 1:length(winners$ID)) {
  pool <- subset(prizes, ticket >= winners$lower.bound[x] & ticket <= winners$upper.bound[x])
  picks[[x]] <- pool[sample(nrow(pool), 1), ]
}
picks <- do.call(rbind.data.frame, picks)
# Generate a summary of winners and their prizes
winners.prizes <- data.frame(winnerID = winners$ID,
                             winnerTicket = winners$ticket,
                             prizeID = picks$ID,
                             prizeTicket = picks$ticket)
Original Answer
For question 1.
You need to remove each chosen prize from the prizes data.frame so that it cannot be picked again.
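As a rough illustration, the loop from the question could drop each pick from the remaining stock before moving on to the next winner. This is only a sketch on a small made-up data set: the IDs and ticket values are invented, `remaining` and `in.range` are names introduced here, and the code assumes every winner has at least one in-range prize left (it will error otherwise).

```r
set.seed(1)  # only to make this sketch reproducible

# Made-up example data (not from the question)
prizes  <- data.frame(ID = c("1001", "1002", "1003", "1004"),
                      ticket = c(3, 4, 4, 5))
winners <- data.frame(ID = c("2001", "2002"), ticket = c(4, 4))
winners$lower.bound <- winners$ticket - 1
winners$upper.bound <- winners$ticket + 1

remaining <- prizes                      # stock that shrinks as prizes are claimed
picks <- vector("list", nrow(winners))

for (x in seq_len(nrow(winners))) {
  # Row positions of unclaimed prizes within the winner's range
  in.range <- which(remaining$ticket >= winners$lower.bound[x] &
                    remaining$ticket <= winners$upper.bound[x])
  # Index via sample.int() so a single candidate is not expanded to 1:n
  chosen <- in.range[sample.int(length(in.range), 1)]
  picks[[x]] <- remaining[chosen, ]
  remaining <- remaining[-chosen, ]      # remove the prize so it cannot be drawn again
}

picks <- do.call(rbind, picks)
```

Because the chosen row is removed from `remaining` on every iteration, `picks$ID` cannot contain the same prize twice.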
New Answer
I've put a little more thought into this as I looked more into your code.
I would avoid using subset() as it can have unintended consequences: its documentation describes it as a convenience function for interactive use, and its non-standard evaluation can behave unexpectedly inside loops and functions. Also, it's not necessary to save your picks into a list if you're just going to transform it into a data.frame. You're better off starting with a data.frame and then updating it. Lastly, I think it may be better to add a new column that flags whether or not each prize was chosen, rather than removing the chosen prize from your initial set.
One final note: I would recommend not using periods in variable names. They can be misinterpreted as S3 methods, which follow the generic.class naming pattern.
I set up a function that generates the winners table along with a prizes table showing which prizes were and weren't chosen. Too many variables were being created in the global environment, so it makes more sense to keep everything contained in a function.
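A minimal sketch of such a function might look like the following. The name `assign_prizes`, the `chosen` column, and the example data are all hypothetical choices made for illustration. The sketch prefers unclaimed prizes within ±1 of the winner's ticket and, if none exist, falls back to the unclaimed prize(s) with the smallest ticket distance, which is one possible reading of the "next closest match" rule in question 2.

```r
# Hypothetical helper (name and columns are assumptions for illustration)
assign_prizes <- function(winners, prizes) {
  prizes$chosen <- FALSE                       # flag instead of deleting rows
  n <- nrow(winners)
  result <- data.frame(winnerID = winners$ID,
                       winnerTicket = winners$ticket,
                       prizeID = rep(NA_character_, n),
                       prizeTicket = rep(NA_integer_, n),
                       stringsAsFactors = FALSE)

  for (i in seq_len(n)) {
    available <- which(!prizes$chosen)
    if (length(available) == 0) break          # stock exhausted: remaining winners keep NA

    dist <- abs(prizes$ticket[available] - winners$ticket[i])
    candidates <- available[dist <= 1]         # prizes within +/-1 of the ticket
    if (length(candidates) == 0) {
      candidates <- available[dist == min(dist)]  # fallback: next closest match
    }

    # Index via sample.int() so a single candidate is not expanded to 1:n
    pick <- candidates[sample.int(length(candidates), 1)]
    prizes$chosen[pick] <- TRUE
    result$prizeID[i] <- as.character(prizes$ID[pick])
    result$prizeTicket[i] <- prizes$ticket[pick]
  }

  list(winners = result, prizes = prizes)
}

# Example with made-up data: the third winner has no prize within +/-1
set.seed(42)
prizes  <- data.frame(ID = c("1001", "1002", "1003"), ticket = c(2L, 2L, 9L))
winners <- data.frame(ID = c("2001", "2002", "2003"), ticket = c(2L, 3L, 3L))
out <- assign_prizes(winners, prizes)
out$winners
out$prizes
```

In this example the first two winners take the two prizes with ticket 2, so the third winner falls back to the only unclaimed prize (ticket 9). Keeping the `chosen` flag on the prizes table, rather than deleting rows, leaves a record of which prizes are still unclaimed.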