Error in t.test for p-value calculation in R


I have a data frame mat containing the following data:

mat <- structure(list(ids = c("id1", "id2", "id3", "id4", "id5", "id6", 
"id7", "id8", "id9", "id10", "id11", "id12", "id13", "id14", 
"id15", "id16", "id17", "id18", "id19", "id20"), Group = c("A", 
"A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "C", "C", "C", 
"C", "C", "C", "A", "A", "B"), number = c(2L, 2L, 3L, 2L, 2L, 
44L, 172L, 34L, 78L, 27L, 31L, 55L, 23L, 34L, 14L, 18L, 25L, 
2L, 2L, 12L)), class = "data.frame", row.names = c(NA, -20L))

I'm trying to test for a difference between the groups. I tried t.test as shown below, but it fails with an error:

t.test(Group, number, data=mat)

It says: Error in t.test(Group, number, data = mat) : object 'Group' not found

I also tried:

t.test(Group ~ number, data=mat)

Error in t.test.formula(Group ~ number, data = mat) : 
  grouping factor must have exactly 2 levels

What is the issue, and how do I get a p-value from t.test?

Anderson N. Barbosa

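If your goal is to run Student's t-test, @jay.sf is correct. The test compares the means of exactly two groups, and the grouping column you're trying to use has three (A, B and C). Note also that in the formula interface the response goes on the left and the grouping variable on the right, so the formula should be number ~ Group rather than Group ~ number. The first call fails for a different reason: the two-vector form of t.test does not use the data argument, so Group and number are looked up outside mat and not found.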

So, if you want to perform the Student's t-test only for two specific groups (e.g., A and B), here's an example you can use as a base:

# Create filtered dataframes for the groups of interest
mat_group_A <- subset(mat, Group == "A")
mat_group_B <- subset(mat, Group == "B")
    
# Perform the t-test only for the groups you've specified (in pairs)
t.test(mat_group_A$number, mat_group_B$number)
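Since the question asks specifically for the p-value, here is a minimal sketch of the same A-versus-B comparison using the formula interface, with the p-value pulled out of the returned object. The names ab and res are only illustrative, and the data frame is rebuilt compactly from the values in the question so the snippet runs on its own:

```r
# Rebuild the example data from the question
mat <- data.frame(
  ids = paste0("id", 1:20),
  Group = c(rep("A", 5), rep("B", 6), rep("C", 6), "A", "A", "B"),
  number = c(2L, 2L, 3L, 2L, 2L, 44L, 172L, 34L, 78L, 27L, 31L,
             55L, 23L, 34L, 14L, 18L, 25L, 2L, 2L, 12L)
)

# Keep only the two groups being compared (illustrative name: ab)
ab <- subset(mat, Group %in% c("A", "B"))

# Response on the left, grouping variable on the right
res <- t.test(number ~ Group, data = ab)

# t.test() returns an "htest" object; the p-value is one of its elements
res$p.value
```

By default t.test runs Welch's test (unequal variances); pass var.equal = TRUE if you want the classical Student's version.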

Now, if your intention is to compare the means of all groups, it's more sensible to use ANOVA. Running a separate t-test for every pair of groups increases the chance of a false positive, whereas ANOVA tests all groups at once. Keep in mind that ANOVA only tells you whether at least one group mean differs from the others, not which ones. To identify where the differences are, you need a post-hoc test, such as Tukey's test. The script below does both:

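# Calculate ANOVA (Group is converted to a factor first, since TukeyHSD() below expects a factor)
mat$Group <- factor(mat$Group)
anova.result <- aov(number ~ Group, data = mat)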
    
# Display the ANOVA result
summary(anova.result)
    
# Compare means between the groups (Tukey Post-hoc Test)
TukeyHSD(anova.result)
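If you need the p-values as numbers rather than printed output, the sketch below shows one way to extract them. The names fit, anova_p, tukey_p and pw are only illustrative, and the data frame is rebuilt from the question with Group stored as a factor. The last call, pairwise.t.test with a Holm correction, is an alternative I'm adding here, not something the script above uses:

```r
# Rebuild the example data from the question, with Group as a factor
mat <- data.frame(
  ids = paste0("id", 1:20),
  Group = factor(c(rep("A", 5), rep("B", 6), rep("C", 6), "A", "A", "B")),
  number = c(2L, 2L, 3L, 2L, 2L, 44L, 172L, 34L, 78L, 27L, 31L,
             55L, 23L, 34L, 14L, 18L, 25L, 2L, 2L, 12L)
)

fit <- aov(number ~ Group, data = mat)

# Overall ANOVA p-value: first row of the "Pr(>F)" column in the summary table
anova_p <- summary(fit)[[1]][["Pr(>F)"]][1]

# Adjusted p-value for each pairwise comparison from Tukey's test
tukey_p <- TukeyHSD(fit)$Group[, "p adj"]

# Alternative: pairwise t-tests with a multiple-comparison correction
pw <- pairwise.t.test(mat$number, mat$Group, p.adjust.method = "holm")

anova_p
tukey_p
pw$p.value
```

TukeyHSD returns a list with one matrix per factor in the model, which is why the result is indexed by the column name Group.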