I am trying to evaluate the multivariate normal distribution to get the probability that 1 < z1 < 2.178 and 2.178 < z2 < inf. The mean of the distribution is zero, all variances are 1, and the covariance is 1/sqrt(2).
In R, I get the correct results via:
library(mnormt)
sadmvn(lower=c(1, 2.178), upper=c(2.178, Inf), mean=0, varcov=matrix(c(1, 1/sqrt(2), 1/sqrt(2), 1),2, 2))
In SciPy, I am trying:
from scipy import stats
cov = np.array([[1, 1 / np.sqrt(2)], [1 / np.sqrt(2), 1]])
d = stats.multivariate_normal(mean=np.array([0, 0]), cov=cov)
d.cdf([2.178, np.inf]) - d.cdf([1, 2.178])
but this gives something around 0.14 instead of the correct result ~0.0082. How can I do this correctly?
You can use the
lower_limitparameter of themultivariate_normal.cdfmethod to compute the CDF within a hyperrectangle with cornerslower_limitandx(the first positional argument).It looks like
lower_limitandxofmultivariate_normalcorrespond withlowerandupperofsadmvn, respectively.The reason why your original approach did not work is that when
cdfis called withoutlower_limit, the lower limit of integration it is assumed to be[-np.inf, -np.inf]. Therefore, your code was calculating the difference between the CDFs of two hyperectangles, each with lower limit[-np.inf, -np.inf]. This idea works in 1D, but in 2+ dimensions, this is not the same as the CDF within a single hyperrectangle from corner[1, 2.178]to corner[2.178, np.inf].For example, in the diagram below, the area enclosed within the rectangle defined by points A and B is not the same as the difference between the areas of the blue and red rectangles.