LinAlgError: 6-th leading minor of the array is not positive definite when running VAR model on list of dataframes

Question

LinAlgError: 6-th leading minor of the array is not positive definite when running VAR model on list of dataframes

1k Views Asked by hulio_entredas At 22 September 2022 at 20:36

I have a generated a list of dataframes called new_new_dfs that all have this general format, with some variation in the number of Coupons and the number of rows:

They are columns of differenced Single Month Mortality (SMM) for bond securities (groupings of mortgage loans) of different Coupons (i.e. interest rates) month-to-month. I next have this code:

for df in new_new_dfs:
           
        train = df[df.index <= max(df.index) - relativedelta(months = 3)]
        test = df[df.index > max(df.index) - relativedelta(months = 3)]
        train = train.dropna()
        
        if train.empty is False and len(train) > 10 and len(list(train.columns)) > 1:
                model = VAR(train)
                result = model.fit()
                result.summary()

To try to create a vector autoregression model for each of the dataframes in the list. I also skip empty dataframes and check for # of rows and columns to ensure that each dataframe is suitable for a VAR. However, about 11 dataframes in I get this error traceback:

LinAlgError                               Traceback (most recent call last)
Input In [135], in <cell line: 4>()
     13 i+=1
     14 print(i)
---> 15 result.summary()

File ~\Anaconda3\lib\site-packages\statsmodels\tsa\vector_ar\var_model.py:1835, in VARResults.summary(self)
   1828 def summary(self):
   1829     """Compute console output summary of estimates
   1830 
   1831     Returns
   1832     -------
   1833     summary : VARSummary
   1834     """
-> 1835     return VARSummary(self)

File ~\Anaconda3\lib\site-packages\statsmodels\tsa\vector_ar\output.py:71, in VARSummary.__init__(self, estimator)
     69 def __init__(self, estimator):
     70     self.model = estimator
---> 71     self.summary = self.make()

File ~\Anaconda3\lib\site-packages\statsmodels\tsa\vector_ar\output.py:83, in VARSummary.make(self, endog_names, exog_names)
     80 buf = StringIO()
     82 buf.write(self._header_table() + '\n')
---> 83 buf.write(self._stats_table() + '\n')
     84 buf.write(self._coef_table() + '\n')
     85 buf.write(self._resid_info() + '\n')

File ~\Anaconda3\lib\site-packages\statsmodels\tsa\vector_ar\output.py:130, in VARSummary._stats_table(self)
    122 part2Lstubs = ('No. of Equations:',
    123                'Nobs:',
    124                'Log likelihood:',
    125                'AIC:')
    126 part2Rstubs = ('BIC:',
    127                'HQIC:',
    128                'FPE:',
    129                'Det(Omega_mle):')
--> 130 part2Ldata = [[model.neqs], [model.nobs], [model.llf], [model.aic]]
    131 part2Rdata = [[model.bic], [model.hqic], [model.fpe], [model.detomega]]
    132 part2Lheader = None

File ~\Anaconda3\lib\site-packages\pandas\_libs\properties.pyx:37, in pandas._libs.properties.CachedProperty.__get__()

File ~\Anaconda3\lib\site-packages\statsmodels\tsa\vector_ar\var_model.py:1540, in VARResults.llf(self)
   1537 @cache_readonly
   1538 def llf(self):
   1539     "Compute VAR(p) loglikelihood"
-> 1540     return var_loglike(self.resid, self.sigma_u_mle, self.nobs)

File ~\Anaconda3\lib\site-packages\statsmodels\tsa\vector_ar\var_model.py:334, in var_loglike(resid, omega, nobs)
    306 def var_loglike(resid, omega, nobs):
    307     r"""
    308     Returns the value of the VAR(p) log-likelihood.
    309 
   (...)
    332         \left(\ln\left|\Omega\right|-K\ln\left(2\pi\right)-K\right)
    333     """
--> 334     logdet = logdet_symm(np.asarray(omega))
    335     neqs = len(omega)
    336     part1 = -(nobs * neqs / 2) * np.log(2 * np.pi)

File ~\Anaconda3\lib\site-packages\statsmodels\tools\linalg.py:28, in logdet_symm(m, check_symm)
     26     if not np.all(m == m.T):  # would be nice to short-circuit check
     27         raise ValueError("m is not symmetric.")
---> 28 c, _ = linalg.cho_factor(m, lower=True)
     29 return 2*np.sum(np.log(c.diagonal()))

File ~\Anaconda3\lib\site-packages\scipy\linalg\decomp_cholesky.py:152, in cho_factor(a, lower, overwrite_a, check_finite)
     93 def cho_factor(a, lower=False, overwrite_a=False, check_finite=True):
     94     """
     95     Compute the Cholesky decomposition of a matrix, to use in cho_solve
     96 
   (...)
    150 
    151     """
--> 152     c, lower = _cholesky(a, lower=lower, overwrite_a=overwrite_a, clean=False,
    153                          check_finite=check_finite)
    154     return c, lower

File ~\Anaconda3\lib\site-packages\scipy\linalg\decomp_cholesky.py:37, in _cholesky(a, lower, overwrite_a, clean, check_finite)
     35 c, info = potrf(a1, lower=lower, overwrite_a=overwrite_a, clean=clean)
     36 if info > 0:
---> 37     raise LinAlgError("%d-th leading minor of the array is not positive "
     38                       "definite" % info)
     39 if info < 0:
     40     raise ValueError('LAPACK reported an illegal value in {}-th argument'
     41                      'on entry to "POTRF".'.format(-info))

LinAlgError: 6-th leading minor of the array is not positive definite

And I'm not sure what it's referring to. I have tried to print each train dataframe to inspect the dataframe it doesn't like, but I can't tell what about it is problematic for the VAR model. Let me know if you have any ideas as to what the problem is here. Thank you!

Original Q&A

There are 1 best solutions below

**Ashutosh Mishra** · Answer 1 · 2023-07-14T17:23:47.623000

Using a stationary series solved my problem.

To check if your series is stationary, perform the Augmented Dickey-Fuller Test using the following code.

for name, column in df.iteritems():
    adfuller_test(column, name=column.name)
    print('\n')

If your series is not stationary, use the following code to differentiate it and perform the test again.

df_differenced = df.diff().dropna()

And then perform the test again

# ADF Test on each column of 1st Differences Dataframe
for name, column in df_differenced.iteritems():
    adfuller_test(column, name=column.name)
    print('\n')

Repeat the differencing step until your series becomes stationary, and then you can use VAR on this differenced dataframe.

LinAlgError: 6-th leading minor of the array is not positive definite when running VAR model on list of dataframes

There are 1 best solutions below

Related Questions in PYTHON

Related Questions in PANDAS

Related Questions in TIME-SERIES

Related Questions in VECTOR-AUTO-REGRESSION

Trending Questions

Popular # Hahtags

Popular Questions