In the context of a risk-management model that computes VaR, I am implementing a routine to create multivariate distributions with user-defined marginals.
Here is an example of the code, using a Gaussian copula and gamma marginals:
import numpy as np
import scipy.stats

# `correlation` is the user-supplied correlation matrix
# generate a multivariate normal sample with the given input correlation:
mv = scipy.stats.multivariate_normal.rvs(mean=np.zeros(10), cov=correlation, size=(50000, 50, 10))
# transform the sample to uniform marginals, preserving the correlation:
unif = scipy.stats.norm.cdf(mv)
# apply the target marginal:
mv_fin = scipy.stats.gamma.ppf(unif, a=1)
It works well and produces the expected results. The key problem is that it is very slow: these three steps take 90% of the overall model runtime. How can I speed it up? Thanks
Regarding scipy.stats.norm.cdf, the numba-stats package provides a function that computes apparently the same thing. The default parameters need to be provided explicitly to this function, as opposed to the Scipy alternative. You can call it using numba_stats.stats.norm_cdf(mv, 0, 1). This is 2.2 times faster on my machine with an i5-9600KF. The speed-up mainly comes from the use of multiple threads (compared to the Scipy implementation, which is sequential so far).
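A minimal sketch of the substitution, assuming numba-stats is installed (the import guard and the exact module path are my assumptions; newer numba-stats releases expose the same functionality differently, so the call is wrapped in a try/except):

```python
import numpy as np
from scipy.stats import norm

x = np.linspace(-3.0, 3.0, 1_000_001)
u_ref = norm.cdf(x)  # sequential Scipy baseline

# numba-stats drop-in (assumed installed via `pip install numba-stats`);
# note that loc and scale must be passed explicitly, unlike with Scipy
try:
    import numba_stats.stats
    u_fast = numba_stats.stats.norm_cdf(x, 0.0, 1.0)
    assert np.allclose(u_ref, u_fast)
except (ImportError, AttributeError):
    pass  # package absent or API changed: keep the Scipy result
```

The results should agree to within floating-point tolerance, so the swap is transparent to the rest of the pipeline.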
Regarding scipy.stats.gamma.ppf, things are more complex, since the numba-stats package does not seem to provide any alternative (i.e. the gamma sub-module does not seem to be implemented so far). The Scipy implementation is sequential. The slow part appears to come from the function gammaincinv (of the module scipy.special), which is called for every value of unif (with 1 as the first parameter). This function calls igami, which appears to come from the Cephes library. The source can be found here. This function makes use of many expensive mathematical functions like pow, exp and log (as well as basic loops to compute power/asymptotic series). The library (and the Scipy functions) seems to be optimized to provide pretty good accuracy at the expense of relatively poor performance.
Interestingly, this function call is surprisingly 2.6 times faster on Linux than on Windows on the same machine with the same exact version of Scipy (1.10.1). I think this is due to a more efficient implementation of the default math library on Linux than on Windows. You can try an optimized math library, like Intel's, to speed up the computation. Note that rebuilding Scipy (and Cephes) on the target machine might be needed to get a performance boost. A simpler approach is certainly to try the Intel Distribution for Python, assuming you use an Intel processor (or possibly an AMD one, since most Intel tools work on them, though the performance might not be as good as on Intel CPUs).
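As a side note (my observation, not part of the answer above), the reduction of gamma.ppf to gammaincinv can be checked directly, and for the special case a=1 used in the question, the gamma distribution is just the exponential distribution, whose quantile function has a cheap closed form:

```python
import numpy as np
from scipy.stats import gamma
from scipy.special import gammaincinv

u = np.random.default_rng(0).random(100_000)

# gamma.ppf(u, a) boils down to a call to scipy.special.gammaincinv(a, u)
assert np.allclose(gamma.ppf(u, a=1.0), gammaincinv(1.0, u))

# for a=1 the gamma distribution is exponential, so the quantile function
# has the closed form -log(1 - u), which avoids gammaincinv entirely
assert np.allclose(gammaincinv(1.0, u), -np.log1p(-u))
```

If the shape parameter really is always 1 in your model, the closed form sidesteps the expensive inverse-incomplete-gamma computation altogether.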
Another possible way to speed the computation up is to use multiple threads when computing the function scipy.stats.gamma.ppf. Unfortunately, Scipy does not support this yet. That being said, one can reimplement the Scipy function and call the gammaincinv function in parallel, though this is certainly significant work. A simpler (ugly) alternative is to hook the function scipy.stats.gamma._ppf so that it uses multiple threads. Indeed, this is a pure-Python function, and such functions can be reassigned. This strategy speeds up the computation by 3.75x on my machine with 6 cores. This optimization can be combined with the better math library. Here is the final script:
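A sketch of the hooking approach, assuming a ThreadPoolExecutor and a chunked call to gammaincinv (the pool size and chunk count are illustrative choices of mine, not necessarily the answer's exact script; the threads scale because gammaincinv releases the GIL):

```python
import os
import numpy as np
import scipy.stats
from scipy.special import gammaincinv
from concurrent.futures import ThreadPoolExecutor

_n_threads = os.cpu_count() or 1
_executor = ThreadPoolExecutor(max_workers=_n_threads)

def _parallel_gamma_ppf(q, a):
    # Scipy's ppf machinery passes 1-D arrays of valid quantiles here;
    # broadcast the shape parameter to match, just in case
    q = np.asarray(q, dtype=np.float64).ravel()
    a = np.broadcast_to(np.asarray(a, dtype=np.float64), q.shape).ravel()
    out = np.empty_like(q)
    # split the indices into more chunks than threads for load balancing;
    # each thread writes a disjoint slice of `out`
    idx_chunks = np.array_split(np.arange(q.size), 4 * _n_threads)
    def work(idx):
        out[idx] = gammaincinv(a[idx], q[idx])
    list(_executor.map(work, idx_chunks))
    return out

# _ppf is a pure-Python method, so it can simply be reassigned on the
# scipy.stats.gamma instance
scipy.stats.gamma._ppf = _parallel_gamma_ppf

# the rest of the pipeline is unchanged, e.g.:
# mv_fin = scipy.stats.gamma.ppf(unif, a=1)
```

After the hook, every call to scipy.stats.gamma.ppf transparently runs the gammaincinv evaluations across all cores, while bounds checking and output shaping are still handled by Scipy.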
As you can see, combining multiple threads with a fast math library (or just using Linux, for the sake of simplicity) results in a significant speed-up. The time dropped from 181.78 s on Windows for the initial implementation to 24.66 s on Linux for the optimized implementation.
This is a speed-up of 7.4x!