Apply Bins to Data Frame Groups without making subset Data Frames

Question

60 Views Asked by C. Peacock At 27 July 2025 at 18:06

I have a data frame containing fish population sampling data. I would like to create bins to count how many fish fall in each length group for each species. The code below does this for two species, but repeating it for every species in the data frame doesn't seem like the most elegant approach.

I would also like to apply this code to other lakes with different species, so it would be great to find an "automated" way to apply these bins to each species group in the data frame.

The data frame looks like:

Species TL   WT
BLG     75    6
BLG    118   27
LMB    200   98
LMB    315  369
RBS    112   23
RES    165   73
SPB    376  725
YEP    155   33


library(dplyr)

ss = read.csv("SS_West Point.csv" , na.strings="." , header=TRUE)
blg = ss %>% subset(Species == "BLG")
lmb = ss %>% subset(Species == "LMB")
# Total number of fish per species, used as the percent denominator
blgn = blg %>% summarise(n = n())
lmbn = lmb %>% summarise(n = n())

###  20mm Length Groups - BLG  ###
blg20 = blg %>% group_by(gr=cut(TL , breaks = seq(0 , 1000 , by = 20))) %>%
            summarise(n = n()) %>% mutate(freq = n , percent = ((n/blgn$n)*100) ,
                                   cumfreq = cumsum(freq) , cumpercent = cumsum(percent))
###  20mm Length Groups - LMB  ###
lmb20 = lmb %>% group_by(gr=cut(TL , breaks = seq(0 , 1000 , by = 20))) %>%
            summarise(n = n()) %>% mutate(freq = n , percent = ((n/lmbn$n)*100) ,
                            cumfreq = cumsum(freq) , cumpercent = cumsum(percent))

I've successfully used do() to run linear models on this data frame but can't seem to get it to work on cut(). Here is how I used do() on lm():

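ssl = ss %>% mutate(lTL = log10(TL) , lWT = log10(WT)) %>% group_by(Species)
# do() must return a data frame; broom::augment() is assumed here as the source of .fitted
m = ssl %>% do(broom::augment(lm(lWT~lTL , data =.))) %>% mutate(wp = 10^(.fitted))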

**Jon Spring** · Accepted Answer

Does this do what you expect?

ss20 <- ss %>%
  add_count(Species) %>%
  rename(Species_count = n) %>%
  # I added Species_count to the grouping so it goes along for the ride in summarization
  group_by(Species, Species_count, gr=cut(TL , breaks = seq(0 , 1000 , by = 20))) %>%
  summarise(n = n()) %>%
  mutate(freq = n, percent = ((n/Species_count)*100), 
         cumfreq = cumsum(freq) , cumpercent = cumsum(percent)) %>%
  ungroup()


> ss20
# A tibble: 8 x 8
  Species Species_count gr            n  freq percent cumfreq cumpercent
  <chr>           <int> <fct>     <int> <int>   <dbl>   <int>      <dbl>
1 BLG                 2 (60,80]       1     1      50       1         50
2 BLG                 2 (100,120]     1     1      50       2        100
3 LMB                 2 (180,200]     1     1      50       1         50
4 LMB                 2 (300,320]     1     1      50       2        100
5 RBS                 1 (100,120]     1     1     100       1        100
6 RES                 1 (160,180]     1     1     100       1        100
7 SPB                 1 (360,380]     1     1     100       1        100
8 YEP                 1 (140,160]     1     1     100       1        100
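
Since the question also asks about reusing the bins on other lakes, here is a minimal sketch of the same pipeline wrapped in a function. The name `length_bins` and its `width` and `max_len` arguments are hypothetical, not part of the original answer, and the sketch assumes the data frame has `Species` and `TL` columns as in the sample above.

```r
library(dplyr)

# Hypothetical helper (not from the original answer): bins total length (TL)
# within each species and returns per-bin frequencies and percentages.
length_bins <- function(data, width = 20, max_len = 1000) {
  data %>%
    # Assign each fish to a length bin; intervals are right-closed by default
    mutate(gr = cut(TL, breaks = seq(0, max_len, by = width))) %>%
    # One row per Species/bin combination that actually occurs
    count(Species, gr, name = "freq") %>%
    # Percentages and running totals are computed within each species
    group_by(Species) %>%
    mutate(percent = 100 * freq / sum(freq),
           cumfreq = cumsum(freq),
           cumpercent = cumsum(percent)) %>%
    ungroup()
}

# Sample rows copied from the question
ss <- data.frame(
  Species = c("BLG", "BLG", "LMB", "LMB", "RBS", "RES", "SPB", "YEP"),
  TL = c(75, 118, 200, 315, 112, 165, 376, 155),
  WT = c(6, 27, 98, 369, 23, 73, 725, 33)
)

ss20 <- length_bins(ss)
```

Any data frame with the same two columns can be passed in, so a second lake only needs its own `read.csv()` call. Because `cut()` uses right-closed intervals by default, a 200 mm fish lands in `(180,200]`, which matches the output above. Lengths above `max_len` fall outside every bin and show up in an `NA` group.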
