I have a data frame with about 1700 observations (rows) of 29 variables (columns), and I have to select a mixture of 7 of these observations for which the weighted mean of each variable is as close as possible to the weighted mean of that variable over all 1700 observations. I tried bootstrapping the data in R with boot(), aiming to get several candidate mixtures (calculating all possible mixtures takes too long), but I could not get it to draw samples of just 7 observations from the whole data set. I know that bootstrapping generally generates samples of the same size as the original data set. Here is the code I have tried:
library(boot)

functionWAM <- function(data, i, p) {
  data <- data[sample(nrow(data), p), ]
  SAM <- weighted.mean(data[i, 3], data[i, 1])
  CAM <- weighted.mean(data[i, 4], data[i, 1])
  return(cbind(SAM, CAM))
}
WAMresults <- boot(datamatrix, statistic = functionWAM, R = 1000, p = 7)
boot() resamples with replacement, so your 7 observations could include the same row twice. If you don't want that, you can write your own function to do the sampling (in fact, you've pretty much got it already). The function below returns the weighted means for the chosen seven observations, and it also stores, as an attribute of the returned vector, the row numbers that produced the result.
Here's a hypothetical example. It uses replicate() to do multiple iterations, in this case 3.
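A minimal sketch of that approach follows. The simulated datamatrix and its column names are made up for illustration; the only thing taken from the question is the layout, with the weights in column 1 and the two variables in columns 3 and 4. The function drops the index argument that boot() requires and calls sample() directly, which by default draws without replacement.

```r
# Hypothetical stand-in for the real data: column 1 holds the weights,
# columns 3 and 4 hold the variables to average (layout assumed from the
# question's code; names and values are invented).
set.seed(42)
datamatrix <- data.frame(
  weight = runif(1700, 1, 10),
  id     = seq_len(1700),
  SAM    = rnorm(1700, 50, 10),
  CAM    = rnorm(1700, 20, 5)
)

# Draw p distinct rows and return their weighted means.
# sample() without replace = TRUE never picks the same row twice.
functionWAM <- function(data, p = 7) {
  rows <- sample(nrow(data), p)
  sub  <- data[rows, ]
  out  <- c(SAM = weighted.mean(sub[, 3], sub[, 1]),
            CAM = weighted.mean(sub[, 4], sub[, 1]))
  attr(out, "rows") <- rows  # remember which observations were used
  out
}

# Three independent draws. simplify = FALSE returns a list, so each
# result keeps its "rows" attribute instead of being collapsed to a matrix.
WAMresults <- replicate(3, functionWAM(datamatrix, p = 7), simplify = FALSE)
WAMresults
```

Each element of WAMresults is a named vector of the two weighted means, and attr(WAMresults[[1]], "rows") gives the seven row numbers behind the first one. With many more iterations, you could compare each result with the weighted means of the full data set and keep the closest mixture.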