I have been given a problem in a Jupyter notebook to code using Python. This problem is about linear regression. It's as follows:
1: Linear Regression. In this notebook we will generate data from a linear function, $y = X\beta + \epsilon$, and then solve for $\hat{\beta}$ using OLS (ordinary least squares) and gradient descent.
Question 1.1: Generate data: $y = X\beta + \epsilon$. Here we assume $y \approx f(X, \beta) = X\beta + \epsilon$, where $f$ is linear in $X$ with additive noise $\epsilon$. Your function should have the following properties:
- output y as an np.array with shape (M,1)
- generate_linear_y should work for any arbitrary x, b, and eps, as long as they are the appropriate dimensions
- do not use for-loops to calculate each y[i] separately, as this will be very slow for large M and N. Instead, you should leverage numpy linear algebra.
They expect us to write code as follows:
def generate_linear_y(X, b):
    """ Write a function that generates M data points from inputs X and b

    Parameters
    ----------
    X : numpy.ndarray
        X.shape must be (M,N)
        Each row of `X` is a single data point of dimension N
        Therefore `X` represents M data points
    b : numpy.ndarray
        b.shape must be (N,1)
        Each element of `b` is a value of beta such that b=[[b1][b2]...[bN]]

    Returns
    -------
    y : numpy.ndarray
        y.shape = (M,1)
        y[i] = X[i]b
    """
Can someone please assist me, because I am thoroughly confused! I didn't even realize that what I am doing requires array coding in Python, which I always struggle with! Please help!
This looks like a direct matrix multiplication to me. In NumPy, this is implemented using the matrix multiplication operator `@` (aka `np.matmul`).
To generate random noise, you can use the functions from `numpy.random`, most likely `random_sample` or `standard_normal`. If you want to do it the most-correct way, you can create a random number generator with `default_rng`, then use, for instance, `rng.standard_normal`.
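As a rough sketch of what I mean: the function body is just the matrix product, and the noise is added afterwards. This assumes the noise lives outside the function, since the given signature only takes `X` and `b`. The shape check, the sizes, the seed, and the 0.1 noise scale are my own illustrative choices, not part of your assignment.

```python
import numpy as np


def generate_linear_y(X, b):
    """Return y = X @ b with shape (M, 1) for X of shape (M, N) and b of shape (N, 1)."""
    X = np.asarray(X)
    b = np.asarray(b)
    # Check the dimensions described in the docstring before multiplying.
    if X.ndim != 2 or b.shape != (X.shape[1], 1):
        raise ValueError(
            f"expected X of shape (M, N) and b of shape (N, 1), got {X.shape} and {b.shape}"
        )
    # One matrix product computes every y[i] = X[i] @ b at once, with no Python loop.
    return X @ b


# Illustrative sizes and seed (assumptions, not from the assignment).
M, N = 5, 3
rng = np.random.default_rng(0)
X = rng.standard_normal((M, N))
b = rng.standard_normal((N, 1))
eps = 0.1 * rng.standard_normal((M, 1))  # additive Gaussian noise; the 0.1 scale is arbitrary

y_clean = generate_linear_y(X, b)
y_noisy = y_clean + eps
print(y_clean.shape, y_noisy.shape)
```

Both arrays come out with shape `(5, 1)`, which is the `(M, 1)` shape the docstring asks for. If your notebook expects `eps` to be passed into the function instead, the only change would be returning `X @ b + eps`.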