How to calculate probability density for individual and for combined components from me.weighted()

Question

How to calculate probability density for individual and for combined components from me.weighted()

152 Views Asked by JPollock At 17 August 2025 at 19:09

I am using Mclust to estimate probability of component membership, but "density" is not included in the output from me.weighted(). Consequently, I am unable to plot probability density. The following code is lengthy because I want to clearly illustrate my purpose and problem, but I clearly indicate where my problem/question arises. My last chunk of code is my attempt at a solution, but it probably only highlights my ignorance of probability densities.

In this research project, my first objective is to compute an index of age-1 fish abundance for subsequent analysis. For that, I want to estimate the proportion of age-1 fish at specific lengths (i.e. an age-length key). It is reasonable to assume that the smaller mode is mostly age-1 fish and the larger mode is of age-2+ fish. My data are fish body length (fork length, cm) and abundance as a proportion of total (i.e., weighted univariate). Note, some outlying large lengths with small proportions were omitted; thus, sum(dat.df$proportions) < 1.

My specific aim here is to illustrate the probability densities superimposed on fish size composition, which reflects two age groups. Basically, in the last chunk of ggplot code, I want to swap out the estimated probability of membership to each (red) or either (green) component with probability densities becauses it would make a nice, informative figure in my manuscript.

I have read relevant articles (Murphy; Scrucca et al.; Mignan; R-Bloggers, etc), but found no answer.

So, I would greatly appreciate any help on how to compute the probability densities for each component and also the component-combined probability density.

Packages

library(ggplot2)
library(mclust)

Data

dat.df <- data.frame(flcm = 15:33, proportion = c(0.0043, 0.0114, 0.0296, 0.0519, 0.0540, 0.0403, 0.0294, 0.0152, 0.0257, 0.0793, 0.1458, 0.1505, 0.1277, 0.0909, 0.0389, 0.0308, 0.0121, 0.0101, 0.0085), z1 = c(rep(1,9), rep(0,10)), z2 = c(rep(0,9), rep(1,10)))

Plot data

ggplot()+
  geom_bar(aes(x=dat.df$flcm, y=dat.df$proportion),
           fill = "gray", position="dodge", stat="identity")+
  xlab("Fork length (cm)")+
  ylab("Probability density")+
  theme_bw()

WITHOUT WEIGHTS (i.e., ignore dat.df$proportion)

Fit mixture model without weights

mod1 <- densityMclust(dat.df[, "flcm"], modelName = "V")

Plot probability density

plot(mod1, what = "density", data = dat.df$flcm, breaks = 5)

WITH WEIGHTS (i.e., include dat.df$proportion)

Refit model with weights

mod1_w <- me.weighted(modelName = "V", 
            data = dat.df$flcm,
            z = cbind(dat.df$z1, dat.df$z2),
            weights = dat.df$proportion)

Plot data with estimated fractional membership (updated z)

ggplot()+
  geom_bar(aes(x=dat.df$flcm, y=dat.df$proportion),
           fill = "gray", position="dodge", stat="identity")+
  geom_line(aes(x = dat.df$flcm, 
                y = (mod1_w$z[,1] * dat.df$proportion)), 
            color = "red") +
  geom_line(aes(x = dat.df$flcm, 
                y = (mod1_w$z[,2] * dat.df$proportion)), 
            color = "red") +
    geom_line(aes(x = dat.df$flcm, 
                y = (mod1_w$z[,1] * dat.df$proportion) +
                  mod1_w$z[,2] * dat.df$proportion), 
              color = "green") +
  xlab("Fork length (cm)")+
  ylab("Probability density")+
  theme_bw()

Plot probability density - Here's where my problem/question arises

plot(mod1_w, what = "density", data = dat.df$flcm, breaks = 5)`

Here's my attempted solution. Basically, for each component (age1, age2), multiply probabilities and scale to proportional abundance:

#age1 probability density
age1 <- mod1_w$z[,1]* #probability of age1 membership multiplied by
  dnorm(dat.df$flcm, mod1_w$parameters$mean[1], #probability of flcm given age1
        mod1_w$parameters$variance$sigmasq[1])* 
  sum(mod1_w$z[,1]*mod1_w$weights) #and scaled to proportional abundance of age1

#age2 probability density
age2 <- mod1_w$z[,2]* #probability of age2 membership multiplied by
  dnorm(dat.df$flcm, mod1_w$parameters$mean[2],
        mod1_w$parameters$variance$sigmasq[2])* #probability of flcm given age2
  sum(mod1_w$z[,2]*mod1_w$weights) #and scaled to proportional abundance of age2

#combined ages probability density
age_all <- age1 + age2

#looks bad - the probability densities don't correspond well with proportional abundance
ggplot()+
  geom_bar(aes(x=dat.df$flcm, y=dat.df$proportion),
           fill = "gray", position="dodge", stat="identity")+
  geom_line(aes(x = dat.df$flcm, 
                y = age1), 
            color = "red") +
  geom_line(aes(x = dat.df$flcm, 
                y = age2), 
            color = "red") +
    geom_line(aes(x = dat.df$flcm, 
                y = age_all), 
              color = "green") +
  xlab("Fork length (cm)")+
  ylab("Probability density")+
  theme_bw()

Original Q&A

There are 1 best solutions below

**JPollock** · Answer 1

I am posting a solution to my own question; hopefully, this will help others. Basically, I switched to the package mxdist, which gives me the desired output as indicated in the following code.

library(mxdist)

#input data (dat.df is created by code in my original question)
dat.mx <- as.mixdata(dat.df[, 1:2])
#preliminary plot
plot(dat.mx)
#define initial parameters
dat_parms <- data.frame(pi=c(0.3, 0.7), mu=c(18, 26), sigma=c(2, 3))
#fit the model
fit1 <- mix(dat.mx, dat_parms, "gamma", constr=mixconstr(consigma="CCV"))
#plot default
plot(fit1)
#replot using ggplot for greater flexibility over appearance
z <- fitted(fit1)
dat.mx[dat.mx$flcm == "Inf", "flcm"] <- 34
ggplot()+
  geom_bar(aes(x=dat.mx$flcm, y=dat.mx$proportion),
           fill = "gray", position="dodge", stat="identity")+
  geom_line(aes(x = dat.mx$flcm, 
                y = z$joint[,1]), 
            color = "red") +
  geom_line(aes(x = dat.mx$flcm, 
                y = z$joint[,2]), 
            color = "red") +
    geom_line(aes(x = dat.mx$flcm, 
                y = z$mixed), 
              color = "green") +
  xlab("Fork length (cm)")+
  ylab("Probability density")+
  theme_bw()
#conditional probabilities are output
z$conditprob

How to calculate probability density for individual and for combined components from me.weighted()

There are 1 best solutions below

Related Questions in R

Related Questions in PROBABILITY-DENSITY

Related Questions in WEIGHTED

Related Questions in MCLUST

Trending Questions

Popular # Hahtags

Popular Questions