I have a dataset and need to cut its age variable into 3 age categories, e.g. age group 1 (10-20 years old), age group 2 (21-30 years old), and age group 3 (31-40 years old).
If I type
breaks=c(10, 20, 30, 40)
when calling the cut function, the outcome is as follows:
age group 1 being 10-20
age group 2 being 20-30
age group 3 being 30-40
I do not want this! I need age group 2 to run from 21-30 years of age, but 20 appears to be part of that category now. I would appreciate some help, thank you.
I think that you are misinterpreting the results. By default the intervals are half-open: they include the upper bound, but not the lower bound. So
breaks=c(10, 20, 30, 40)
means that the number 20 falls only in the first group (10,20] and not in the second group (20,30]. Also notice that the default does not include the lowest break, so an age of exactly 10 would come back as NA. A better call is
cut(age, breaks=c(10, 20, 30, 40), include.lowest = TRUE)
which will make the lowest level be the fully closed interval [10,20].
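To check this for yourself, here is a small sketch. The `age` vector and the label strings are made up for illustration (they are not from your data), so swap in your own column.

```r
# Hypothetical sample ages, chosen to sit on and around the break points
age <- c(10, 15, 20, 21, 30, 31, 40)

# Default behaviour: right-closed intervals, lowest break excluded
default_groups <- cut(age, breaks = c(10, 20, 30, 40))
is.na(default_groups[1])   # TRUE: age 10 falls outside (10,20]

# include.lowest = TRUE closes the first interval, giving [10,20]
groups <- cut(age, breaks = c(10, 20, 30, 40), include.lowest = TRUE)
levels(groups)             # "[10,20]" "(20,30]" "(30,40]"
table(groups)              # 20 is counted in the first group only

# Optional: custom labels (hypothetical names) that read the way you described
labelled <- cut(age, breaks = c(10, 20, 30, 40), include.lowest = TRUE,
                labels = c("10-20", "21-30", "31-40"))
data.frame(age, labelled)
```

With whole-number ages the group (20,30] is exactly your 21-30 category, so the breaks themselves do not need to change, only the labels if you want them to read that way.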