Difference between stat_ellipse and geom_mark_ellipse in ggplot?


I am making an NMDS plot based on Bray-Curtis dissimilarity and trying to draw ellipses on the plot based on a categorical variable. My data are NOT normally distributed.

I've figured out how to draw ellipses using two different functions that I was able to find: stat_ellipse and geom_mark_ellipse (part of the ggforce package). They produce different results on my plot. However, I don't understand what the difference between the two is.

From what I find, stat_ellipse is based on a 'multivariate T distribution' (https://r-charts.com/correlation/scatter-plot-ellipses-ggplot2/) and geom_mark_ellipse is based on the 'Khachiyan algorithm' (https://search.r-project.org/CRAN/refmans/ggforce/html/geom_mark_ellipse.html).

Can anyone explain what this means in very basic terms? I'm still quite new to stats and to R. I am thinking I should use geom_mark_ellipse because my data are not normally distributed, but I really don't understand what the Khachiyan algorithm is.

Thank you so much for your time.

geom_mark_ellipse stat_ellipse

Answer by 花落思量错:

You can read the explanation of geom_mark_ellipse in its help page (?ggforce::geom_mark_ellipse).

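As for stat_ellipse, I believe it draws a confidence ellipse for each group, that is, a region expected to contain about `level` of the points under the assumed distribution. You can adjust its parameters, such as type and level: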

stat_ellipse(
  mapping = NULL,
  data = NULL,
  geom = "path",
  position = "identity",
  ...,
  type = "t",
  level = 0.95,
  segments = 51,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
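To make the meaning of `level` more concrete, here is a minimal base-R sketch of a normal-theory data ellipse. It is not ggplot2's own code: `normal_ellipse` is a hypothetical helper written for this example, and I am assuming that `type = "norm"` does something along these lines (the default `type = "t"` uses a robust covariance estimate instead).

```r
# Sketch of a normal-theory data ellipse (an illustration, not ggplot2's code).
# `normal_ellipse` is a hypothetical helper name used only here.
normal_ellipse <- function(x, y, level = 0.95, segments = 51) {
  xy <- cbind(x, y)
  n <- nrow(xy)
  center <- colMeans(xy)
  shape <- cov(xy)                       # sample covariance of the group
  # Radius from the F distribution with 2 and n - 1 degrees of freedom
  radius <- sqrt(2 * qf(level, 2, n - 1))
  angles <- (0:segments) * 2 * pi / segments
  unit_circle <- cbind(cos(angles), sin(angles))
  # Stretch and rotate the unit circle with the Cholesky factor of the
  # covariance, then move it to the group centre
  ellipse <- sweep(radius * (unit_circle %*% chol(shape)), 2, center, "+")
  colnames(ellipse) <- c("x", "y")
  as.data.frame(ellipse)
}

set.seed(42)
g <- data.frame(x = rnorm(200), y = rnorm(200))   # made-up example data
ell95 <- normal_ellipse(g$x, g$y, level = 0.95)
ell50 <- normal_ellipse(g$x, g$y, level = 0.50)

# Fraction of the points that fall inside the 95% ellipse
d2 <- mahalanobis(g, colMeans(g), cov(g))
inside95 <- mean(d2 <= 2 * qf(0.95, 2, nrow(g) - 1))
```

The ellipse is built entirely from the group mean and covariance, so its size depends on `level` and on a distributional assumption. It is not required to contain every point.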

As for ggforce::geom_mark_ellipse(), I could not find any parameter that relates to a confidence level.
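A quick way to see the difference is to draw both on the same data. The sketch below uses the built-in iris data rather than the asker's NMDS scores, and it assumes the ggplot2 and ggforce packages are installed. Setting `expand` to zero is my choice here so that the marked ellipse is not padded beyond the points.

```r
library(ggplot2)
library(ggforce)

# Dashed outline: stat_ellipse (distribution-based, controlled by `level`)
# Filled shape:   geom_mark_ellipse (encloses the points of each group)
p <- ggplot(iris, aes(Petal.Length, Petal.Width, colour = Species)) +
  geom_point() +
  stat_ellipse(type = "t", level = 0.95, linetype = "dashed") +
  geom_mark_ellipse(aes(fill = Species), alpha = 0.1, expand = unit(0, "mm"))
p
```

Changing `level` should resize only the dashed ellipses, while the filled ones follow the outermost points of each group.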

I also recently needed to produce similar figures, but with only two samples in each group.

In that case stat_ellipse is not suitable. For example:

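library(ggplot2)

# df is assumed to hold the PCA scores (PC1, PC2) and a grouping column Sample
ggplot(df, aes(PC1, PC2, color = Sample)) +   # use the bare column name, not df$Sample
  geom_point() +
  stat_ellipse(geom = "polygon",
               type = "euclid",
               level = 0.95,    # with type = "euclid", level is the circle radius
               #segments = 51,
               aes(fill = after_scale(alpha(colour, 0.3))))
#?stat_ellipse
# Too few points to calculate an ellipse
# Too few points to calculate an ellipse
# Too few points to calculate an ellipse
# Too few points to calculate an ellipse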

Only the PCA points are drawn, without any ellipses.

So I guess the result from ggforce::geom_mark_ellipse() carries little or no confidence-interval meaning? Does it just automatically enclose the points of each group in an ellipse?
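For what it is worth, the Khachiyan algorithm mentioned in the ggforce help is a method for finding the smallest ellipse that encloses a set of points. Below is a minimal base-R sketch of the textbook version of that algorithm. It is not ggforce's actual implementation, and `enclosing_ellipse` is a hypothetical helper name used only for illustration.

```r
# Sketch of the Khachiyan algorithm for the minimum-volume enclosing ellipse.
# `enclosing_ellipse` is a hypothetical helper, not a ggforce function.
enclosing_ellipse <- function(points, tol = 0.001) {
  P <- t(as.matrix(points))           # d x n, one column per point
  d <- nrow(P)
  n <- ncol(P)
  Q <- rbind(P, 1)                    # lift the points to d + 1 dimensions
  u <- rep(1 / n, n)                  # start with equal weight on every point
  err <- Inf
  while (err > tol) {
    X <- Q %*% (u * t(Q))             # Q diag(u) Q^T
    M <- colSums(Q * solve(X, Q))     # M[i] = q_i^T X^{-1} q_i
    j <- which.max(M)                 # the point that sticks out the most
    step <- (M[j] - d - 1) / ((d + 1) * (M[j] - 1))
    new_u <- (1 - step) * u
    new_u[j] <- new_u[j] + step       # shift weight toward that point
    err <- sqrt(sum((new_u - u)^2))
    u <- new_u
  }
  center <- drop(P %*% u)
  S <- P %*% (u * t(P)) - tcrossprod(center)
  # The ellipse is the set of x with (x - center)^T A (x - center) <= 1
  list(center = center, A = solve(S) / d)
}

set.seed(1)
pts <- cbind(rnorm(30), rnorm(30))    # made-up example data
fit <- enclosing_ellipse(pts)

# Values close to 1 mark the outermost points; the rest are well below 1
inside <- mahalanobis(pts, fit$center, fit$A, inverted = TRUE)
```

The result depends only on where the outermost points lie. No distribution is assumed and there is no confidence level involved, which matches the observation that geom_mark_ellipse simply wraps the points.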

Besides, I also posted my similar question here