Analytical solution for Linear Regression using Python vs. Julia

Question

Analytical solution for Linear Regression using Python vs. Julia

4k Views Asked by Anarcho-Chossid At 29 July 2025 at 05:40

Using example from Andrew Ng's class (finding parameters for Linear Regression using normal equation):

With Python:

X = np.array([[1, 2104, 5, 1, 45], [1, 1416, 3, 2, 40], [1, 1534, 3, 2, 30], [1, 852, 2, 1, 36]])
y = np.array([[460], [232], [315], [178]])
θ = ((np.linalg.inv(X.T.dot(X))).dot(X.T)).dot(y)
print(θ)

Result:

[[  7.49398438e+02]
 [  1.65405273e-01]
 [ -4.68750000e+00]
 [ -4.79453125e+01]
 [ -5.34570312e+00]]

With Julia:

X = [1 2104 5 1 45; 1 1416 3 2 40; 1 1534 3 2 30; 1 852 2 1 36]
y = [460; 232; 315; 178]

θ = ((X' * X)^-1) * X' * y

Result:

5-element Array{Float64,1}:
 207.867    
   0.0693359
 134.906    
 -77.0156   
  -7.81836

Furthermore, when I multiple X by Julia's — but not Python's — θ, I get numbers close to y.

I can't figure out what I am doing wrong. Thanks!

Original Q&A

There are 3 best solutions below

Andreas Noack On 16 June 2015 at 04:17

Notice that X is a 4x5 matrix or in statistical terms that you have fewer observations than parameters to estimate. Therefore, the least squares problem has infinitely many solutions with the sum of the squared errors exactly equal to zero. In this case, the normal equations don't help you much because the matrix X'X is singular. Instead, you should just find a solution to X*b=y.

Most numerical linear algebra systems are based on the FORTRAN package LAPACK which uses the a pivoted QR factorization for solving the problem X*b=y. Since there are infinitely many solutions, LAPACK's picks the solution with the smallest norm. In Julia, you can get this solution, simply by writing

float(X)\y

(Unfortunately, the float part is necessary right now, but that will change.)

In exact arithmetic, you should get the same solution as the one above with either of your proposed methods, but the floating point representation of you problem introduces small rounding errors and these errors will affect the calculated solution. The effect of the rounding errors on the solution is much larger when using the normal equations compared to using the QR factorization directly on X.

This holds true also in the usual case where X has more rows than columns so often it is recommended that you avoid the normal equations when solving least squares problems. However, when X has many more rows than columns, the matrix X'X is relatively small. In this case, it will be much faster to solve the problem with the normal equations instead of using the QR factorization. In many statistical problems, the extra numerical error is extremely small compared to the statical error so the loss of precision due to the normal equations can simply be ignored.

waTeim On 15 June 2015 at 23:09

Using X^-1 vs the pseudo inverse

pinv(X) which corresponds to the pseudo inverse is more broadly applicable than inv(X), which X^-1 equates to. Neither Julia nor Python do well using inv, but in this case apparently Julia does better.

but if you change the expression to

julia> z=pinv(X'*X)*X'*y
5-element Array{Float64,1}:
 188.4     
   0.386625
 -56.1382  
 -92.9673  
  -3.73782

you can verify that X*z = y

julia> X*z
4-element Array{Float64,1}:
 460.0
 232.0
 315.0
 178.0

**xnx** · Accepted Answer

A more numerically robust approach in Python, without having to do the matrix algebra yourself is to use numpy.linalg.lstsq to do the regression:

In [29]: np.linalg.lstsq(X, y)
Out[29]: 
(array([[ 188.40031942],
        [   0.3866255 ],
        [ -56.13824955],
        [ -92.9672536 ],
        [  -3.73781915]]),
 array([], dtype=float64),
 4,
 array([  3.08487554e+03,   1.88409728e+01,   1.37100414e+00,
          1.97618336e-01]))

(Compare the solution vector with @waTeim's answer in Julia).

You can see the source of the ill-conditioning by printing the matrix inverse you're calculating:

In [30]: np.linalg.inv(X.T.dot(X))
Out[30]: 
array([[ -4.12181049e+13,   1.93633440e+11,  -8.76643127e+13,
         -3.06844458e+13,   2.28487459e+12],
       [  1.93633440e+11,  -9.09646601e+08,   4.11827338e+11,
          1.44148665e+11,  -1.07338299e+10],
       [ -8.76643127e+13,   4.11827338e+11,  -1.86447963e+14,
         -6.52609055e+13,   4.85956259e+12],
       [ -3.06844458e+13,   1.44148665e+11,  -6.52609055e+13,
         -2.28427584e+13,   1.70095424e+12],
       [  2.28487459e+12,  -1.07338299e+10,   4.85956259e+12,
          1.70095424e+12,  -1.26659193e+11]])

Eeep!

Taking the dot product of this with X.T leads to a catastrophic loss of precision.

Analytical solution for Linear Regression using Python vs. Julia

There are 3 best solutions below

Using X^-1 vs the pseudo inverse

Related Questions in PYTHON

Related Questions in MATRIX

Related Questions in JULIA

Trending Questions

Popular # Hahtags

Popular Questions