Counting factor observation via aggregate in R

My vector is

 Name
  s1
  s1
  s1
  s2
  s2
  s3

I need to count the number of occurrences of each value. The expected output is something like this:

 Names  No.
 s1      3
 s2      2
 s3      1

I am using the aggregate function for this:

aggregate(case2$Name,by=list(Names =case2$Name),table)

It gives me the correct counts, but in diagonal matrix form instead of a single count column as in my expected output.

If I try the aggregate function with count instead, like this:

aggregate(case2$Name,by=list(Names =case2$Name),count)

it gives me this error:

Error in UseMethod("group_by_") : 
no applicable method for 'group_by_' applied to an object of class "factor"

I am not sure what to do about that.

There are 3 best solutions below

2
On BEST ANSWER

Agreed that table(Name) is the most straightforward approach, but for reference, the correct syntax for using aggregate to get the same result is:

aggregate(Name, by=list(Name), length)
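
To make the difference concrete, here is a minimal sketch using data rebuilt from the question (the case2 data frame below is an assumption reconstructed from the sample values, not the asker's actual object). table returns one count per level for every group, so aggregate packs those into a matrix column, whereas length returns a single number per group. The error message in the question suggests that count was dplyr's count(), which works on data frames rather than acting as a per-group summary function, though that is a guess from the group_by_ text.

```r
# Assumed sample data mirroring the question; 'case2' is reconstructed here.
case2 <- data.frame(Name = factor(c("s1", "s1", "s1", "s2", "s2", "s3")))

# length() yields one number per group, so the result has a single count column.
counts <- aggregate(case2$Name, by = list(Names = case2$Name), FUN = length)

# aggregate() names the summary column "x" by default; rename it to match
# the header used in the expected output.
names(counts)[2] <- "No."

print(counts)
```

The result is an ordinary data frame, so the counts can be pulled out as a plain vector with counts[["No."]].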

0
On

Use a simple call to table, something like

table(Name)

For your example, you'll find something like...

> Name = as.factor( c ( 's1' , 's1' , 's1' , 's2' , 's2' , 's3' ) )
> Name
[1] s1 s1 s1 s2 s2 s3
Levels: s1 s2 s3
> table(Name)
Name
s1 s2 s3
 3  2  1


> t <- table(Name)
> str(t)
 'table' int [1:3(1d)] 3 2 1
 - attr(*, "dimnames")=List of 1
  ..$ Name: chr [1:3] "s1" "s2" "s3"
> t[1]
s1 
 3 
> t[2]
s2 
 2 
> t[3]
s3 
 1 
> t['s1']
s1 
 3 

> str(t['s1'])
 Named int 3
 - attr(*, "names")= chr "s1"

> sprintf( "abcd = %d" , t[1] )
[1] "abcd = 3"
> t[1] + 5
s1 
 8 
0
On

The solution by @jxramos works perfectly, but the table format can sometimes be slightly inconvenient. Data stored in matrices, data frames, or vectors is usually somewhat easier to handle. If you want a matrix as the output (with one column in this case, so it is essentially a vector), you could make a minor modification like this:

v1 <- c('s1', 's1', 's1', 's2', 's2', 's3')
v2 <- as.matrix(table(v1))
colnames(v2) <- "Name"

This is the output:

> v2
   Name
s1    3
s2    2
s3    1
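
If a two-column layout like the one in the question is preferred over a matrix, a similar sketch converts the table to a data frame instead. The column names "Names" and "No." are taken from the question's expected output; the variable name counts_df is made up for illustration.

```r
v1 <- c("s1", "s1", "s1", "s2", "s2", "s3")

# as.data.frame() on a table gives one row per level, with the levels in the
# first column and the counts in a second column called "Freq".
counts_df <- as.data.frame(table(v1))

# Rename both columns to match the headers in the question.
names(counts_df) <- c("Names", "No.")

print(counts_df)
```

Note that the first column comes back as a factor by default; wrap it in as.character() if plain strings are needed.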