I would like to (i) compute and (ii) plot the central credible interval and the highest posterior density intervals for a distribution in the Distributions.jl library. Ideally, one can write their own function to compute CI and HPD and then use Plots.jl to plot them. However, I'm finding the implementation quite tricky (disclaimer: I'm new to Julia). Any suggestions about libraries/gists/repo to check out that make the computing and plotting them easier?

Context

using Plots, StatsPlots, LaTeXStrings
using Distributions

dist = Beta(10, 10)
plot(dist)  # thanks to StatsPlots it nicely plots the distribution

# missing piece 1: compute CI and HPD
# missing piece 2: plot CI and HPD

Expected end result summarized in the image below or at p. 33 of BDA3.

enter image description here

Resources found so far:

3

There are 3 best solutions below

2
On BEST ANSWER

Thanks for updating the question; it brings a new perspective.

The gist is kind of correct; only it uses an earlier version of Julia. Hence linspace should be replaced by LinRange. Instead of using PyPlot use using Plots. I would change the plotting part to the following:

plot(cred_x, pdf(B, cred_x), fill=(0, 0.9, :orange))
plot!(x,pdf(B,x), title="pdf with 90% region highlighted")

At first glance, the computation of the CI seems correct. (Like the answer from Closed Limelike Curves or the answer from the question [there][1]). For the HDP, I concur with Closed Limelike Curves. Only I would add that you could build your HDP function upon the gist code. I would also have a version for posterior with a known distribution (like in your reference document page 33, figure 2.2) as you don't need to sample. And another with sampling like Closed Limelike Curves indicated.

1
On

You're looking for ArviZ.jl, together with Turing.jl's MCMCChains. MCMCChains will give you very basic plotting capabilities, e.g. a plot of the PDF estimated from each chain. ArviZ.jl (a wrapper around the Python ArviZ package) adds a lot more plots.

0
On

OP edited the question, so I'm giving a new answer.

For central credible intervals, the answer is pretty easy: Take the quantiles at each point:

lowerBound = quantile(Normal(0, 1), .025)
upperBound = quantile(Normal(0, 1), .975)

This will give you an interval where the probability of x lying below the lower bound .025, and similarly for the upper bound, adding up to .05.

HPDs are harder to calculate. In addition, they tend to be less common because they have some weird properties that aren't shared by central credible intervals. The easiest way to do it is probably using a Monte Carlo algorithm. Use randomSample = rand(Normal(0, 1), 2^12) to draw 2^12 samples from the Normal distribution. (Or however many samples you want, more gives more accurate results that are less affected by random chance.) Then, for each random point, evaluate the probability density at that random point using pdf.(randomSample). Then, pick the 95% of points with the highest probability density; include all of these points in the highest density interval, and any points between them (I'm assuming you're dealing with a single-moded distribution like the normal).

There are better ways to do this for the normal distribution, but they're harder to generalize.