MASS packages' "fitdistr": Error when dealing with manipulated random data

823 Views Asked by At

Background:

Below I have generated some random beta data using R and manipulate the shape of the data a bit to arrive at what I call "Final" in my code. And I histogram "Final" in my code.

Question:

I'm wondering why when trying to fit a "beta" distribution to "Final" data using MASS packages' "fitdistr" function, I get the following error (Any suggestion how to avoid this error)?

Error in stats::optim(x = c(0.461379379270288, 0.0694261016478062, 0.76934266883081, : initial value in 'vmmin' is not finite

Here is my R code:

 require(MASS)

## Generate some data and manipulate it
set.seed(47)

Initial = rbeta(1e5, 2, 3)
d <- density(Initial)

b.5 <- dbeta(seq(0, 1, length.out = length(d$y)), 50, 50)
b.5 <- b.5 / (max(b.5) / max(d$y))    # Scale down to max of original density

 b.6 <- dbeta(seq(0, 1, length.out = length(d$y)), 60, 40)
 b.6 <- b.6 / (max(b.6) / max(d$y))

 # Collect maximum densities at each x to use as sample probability weights
 p <- pmax(d$y, b.5, b.6)


Final <- sample(d$x, 1e4, replace = TRUE, prob = p) ## THIS IS MY FINAL DATA

hist(Final, freq = F, ylim = c(0, 2))               ## HERE IS A HISTOGRAM

 m <- MASS::fitdistr(Final, "beta",          ## RUN THIS TO SEE HOW THE ERROR COMES UP
                start = list(shape1 = 1, shape2 = 1))
1

There are 1 best solutions below

2
On BEST ANSWER

Here is the code.

It is the same with your code, I just removed the negative beta values.

library(MASS)

set.seed(47)

Initial = rbeta(1e5, 2, 3)
d <- density(Initial)

b.5 <- dbeta(seq(0, 1, length.out = length(d$y)), 50, 50)


b.5 <- b.5 / (max(b.5) / max(d$y))    # Scale down to max of original 
density

b.6 <- dbeta(seq(0, 1, length.out = length(d$y)), 60, 40)
b.6 <- b.6 / (max(b.6) / max(d$y))

# Collect maximum densities at each x to use as sample probability weights
p <- pmax(d$y, b.5, b.6)


Final <- sample(d$x, 1e4, replace = TRUE, prob = p) ## THIS IS MY FINAL DATA

hist(Final, freq = F, ylim = c(0, 2))               ## HERE IS A HISTOGRAM

This is the original histogram

# replace negative beta values with smallest value > 0
Final[Final<= 0] <- min(Final[Final>0])

hist(Final, freq = F, ylim = c(0, 2))

The histogram after removing negative values

m <- MASS::fitdistr(x = Final, densfun = "beta",          
                start = list(shape1 = 1, shape2 = 1))

Here are the shape parameters:

> m
     shape1       shape2  
  1.99240852   2.90219720 
 (0.02649853) (0.04010168)

Take note that it gives some warnings.