I'd like to know how to split data into more than two random samples in R. For example, to divide some data 60:40 into training and validation sets, I can write code like this:
original.data = read.csv("~.csv", na.strings="")
train.index = sample(c(1:dim(original.data)[1]), dim(original.data)[1]*0.6)
train.data = original.data[train.index,]
valid.data = original.data[-train.index,]
However, I can't figure out how to extend this to a multi-way split such as 60:20:20.
I would appreciate it if you could show me the best solution!
If you want more than two sets, then the other solutions are close but you need just a little more. There are at least two options.
First:
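One way to do this is to draw a random group label for every row. This is only a minimal sketch: the built-in `iris` data set stands in for `original.data`, and the names `group` and `test.data` are illustrative assumptions, not from the question.

```r
# Assumption: iris (150 rows) stands in for the data read from the CSV file
set.seed(2)  # fixed seed, only so the example is reproducible
original.data <- iris

# Draw a label 1, 2 or 3 for every row with probabilities 0.6, 0.2, 0.2
group <- sample(1:3, size = nrow(original.data),
                replace = TRUE, prob = c(0.6, 0.2, 0.2))

# Subset the rows by label
train.data <- original.data[group == 1, ]
valid.data <- original.data[group == 2, ]
test.data  <- original.data[group == 3, ]

# Inspect the resulting group sizes
table(group)
```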
Because it's random, you won't always get your exact proportions.
Second:
If you must have exact proportions, you can do this:
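A minimal sketch, again assuming `iris` stands in for `original.data`: shuffle the row numbers once, then cut the shuffled vector at fixed positions. The names `shuffled`, `n.train`, `n.valid`, `test.index` and `test.data` are illustrative.

```r
# Assumption: iris (150 rows) stands in for the data read from the CSV file
set.seed(2)  # fixed seed, only so the example is reproducible
original.data <- iris
n <- nrow(original.data)

# Fixed group sizes; whatever is left over goes to the third set
n.train <- round(0.6 * n)
n.valid <- round(0.2 * n)

# Random permutation of the row numbers 1..n
shuffled <- sample.int(n)

# Cut the permutation into three non-overlapping pieces
train.index <- shuffled[seq_len(n.train)]
valid.index <- shuffled[n.train + seq_len(n.valid)]
test.index  <- setdiff(seq_len(n), c(train.index, valid.index))

train.data <- original.data[train.index, ]
valid.data <- original.data[valid.index, ]
test.data  <- original.data[test.index, ]
```

With 150 rows this gives exactly 90, 30 and 30 rows. When the row count does not divide evenly, the rounding remainder ends up in the last set.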
This can easily be generalized to support an arbitrary number of partitions.
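One possible generalization, as a sketch only: `split_indices` is a hypothetical helper (not a base R function) that takes the number of rows and a vector of fractions and returns a list of row indices, one element per partition.

```r
# Hypothetical helper: split row numbers 1..n into length(fracs) random groups
# whose sizes follow the given fractions as closely as whole rows allow.
split_indices <- function(n, fracs) {
  stopifnot(abs(sum(fracs) - 1) < 1e-8)

  # Cumulative cut points rounded to whole rows; the last one is forced to n
  cuts <- round(cumsum(fracs) * n)
  cuts[length(cuts)] <- n
  sizes <- diff(c(0, cuts))

  # One label per row, repeated according to the group sizes, then shuffled
  labels <- rep(seq_along(fracs), times = sizes)[sample.int(n)]

  # A factor with explicit levels keeps empty groups in the result
  split(seq_len(n), factor(labels, levels = seq_along(fracs)))
}

# Usage, assuming iris stands in for original.data
set.seed(2)
parts <- split_indices(nrow(iris), c(0.6, 0.2, 0.2))
train.data <- iris[parts[[1]], ]
valid.data <- iris[parts[[2]], ]
test.data  <- iris[parts[[3]], ]
lengths(parts)
```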