I am using Sympy to run through a basic regression example in the hope that it will become an effective tool for decomposing more complex estimators. I cannot, however, figure out how to display dense matrices for each step in the process. In particular, I get hung up trying to represent the least squares estimate:
Here is the setup with just five observations.
from sympy import *
y=MatrixSymbol('y',5,1)
x=MatrixSymbol('x',5,2)
b=MatrixSymbol('b',2,1)
I can represent the basic components:
(x.T*x).as_explicit()
(x.T*y).as_explicit()
And I can even represent the inverse of the first component symbolically.
(x.T*x).I
However, when I attempt to expand the inverse of the first component, I get smacked with an IndexError.
(x.T*x).I.as_explicit()
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-182-93739c34be6e> in <module>()
----> 1 (x.T*x).I.as_explicit()
/home/choct155/analysis/Anaconda/lib/python2.7/site-packages/sympy/matrices/expressions/matexpr.pyc in as_explicit(self)
230 return ImmutableMatrix([[ self[i, j]
231 for j in range(self.cols)]
--> 232 for i in range(self.rows)])
233
234 def as_mutable(self):
/home/choct155/analysis/Anaconda/lib/python2.7/site-packages/sympy/matrices/expressions/matexpr.pyc in __getitem__(self, key)
198 i, j = sympify(i), sympify(j)
199 if self.valid_index(i, j) is not False:
--> 200 return self._entry(i, j)
201 else:
202 raise IndexError("Invalid indices (%s, %s)" % (i, j))
/home/choct155/analysis/Anaconda/lib/python2.7/site-packages/sympy/matrices/expressions/matpow.pyc in _entry(self, i, j)
27 if self.exp.is_Integer:
28 # Make an explicity MatMul out of the MatPow
---> 29 return MatMul(*[self.base for k in range(self.exp)])._entry(i, j)
30
31 from matmul import MatMul
/home/choct155/analysis/Anaconda/lib/python2.7/site-packages/sympy/matrices/expressions/matmul.pyc in _entry(self, i, j, expand)
45 return coeff * matrices[0][i, j]
46
---> 47 head, tail = matrices[0], matrices[1:]
48 assert len(tail) != 0
49
IndexError: list index out of range
The inverse exists, so am I just asking Sympy to do something it cannot? The goal here is to explicitly step through each component of an estimation procedure with both the matrix representation and data machinations within the IPython Notebook. I have been unable to locate a solution to this, so any help on this front would be greatly appreciated.
This is a bug. I opened https://github.com/sympy/sympy/issues/2749 for it. The workaround would be
(x.T*x).as_explicit().I
, but it seems that raises its own exception (https://github.com/sympy/sympy/issues/2750). So until one of those is fixed, I unfortunately don't have a good solution for you, other than to compute the inverse manually using.det()
and the usual formula for 2x2 matrices.