I am trying to fit a linear regression model to the Linnerud dataset and report its performance and mean squared error. I am stuck at the point where I pass the data to the model, where I get the error "ValueError: Found input variables with inconsistent numbers of samples: [10, 1]". The Linnerud dataset has three features and three target columns, and I only want to use one feature, chin-ups. Can someone help me fix this?
Here is what I have tried so far, based on https://scikit-learn.org/stable/auto_examples/linear_model/plot_ols.html
from sklearn import datasets
from sklearn import linear_model
from sklearn.metrics import mean_squared_error, r2_score
import matplotlib.pyplot as plt
import numpy as np
linnerud = datasets.load_linnerud()
print(linnerud)
# Use only one feature
linnerud_X = linnerud.data[:, np.newaxis, 0]
print(linnerud_X)
X = np.array(linnerud_X).reshape((1,-1))
print(X)
# Split the data into training/testing sets
linnerud_X_train = linnerud_X[:-10]
linnerud_X_test = linnerud_X[-10:]
#print(linnerud_X_train)
#print(linnerud_X_test)
Y = np.array(linnerud.target).reshape((1,-1))
# Split the targets into training/testing sets
linnerud_y_train = Y
#linnerud_y_test #= Y[-10:]
print(linnerud_y_train)
#print(linnerud_y_test)
# Create linear regression object
regr = linear_model.LinearRegression()
# Train the model using the training sets
regr.fit(linnerud_X_train, linnerud_y_train)
# Make predictions using the testing set
linnerud_y_pred = regr.predict(linnerud_X_test)
I am expecting results similar to those achieved in the following example: https://scikit-learn.org/stable/auto_examples/linear_model/plot_ols.html
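The number of samples in the dependent and the independent variable is not the same: linnerud_X_train has 10 rows, while linnerud_y_train has only 1.
Also, the reshapes you did on the target are incorrect (I'm not sure what you were trying to do there): reshape((1,-1)) flattens the whole (20, 3) target array into a single row of 60 values.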
The features were split into train and test sets, but the same split was not applied to the target. This is why you got the ValueError.
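To make the mismatch concrete, here is a small sketch that reproduces only the shapes involved. It uses zero-filled stand-in arrays (an assumption for illustration, not the real data) with the same (20, 3) shape as linnerud.data and linnerud.target, and applies the same indexing and reshape as the question's code.

```python
import numpy as np

# Stand-ins with the same shape as linnerud.data and linnerud.target
# (20 samples, 3 columns each); the values themselves do not matter here.
data = np.zeros((20, 3))
target = np.zeros((20, 3))

# Same steps as in the question: one feature column, then all but the last 10 rows.
X_train = data[:, np.newaxis, 0][:-10]

# Same reshape as in the question: the whole target flattened into a single row.
y_train = target.reshape((1, -1))

print(X_train.shape)  # (10, 1): 10 samples, 1 feature
print(y_train.shape)  # (1, 60): 1 "sample" with 60 values
```

scikit-learn compares the first dimension of each array, so it sees 10 samples against 1, which matches the [10, 1] in the error message.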
But a better way to do it would be:
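The following is a minimal sketch rather than a definitive solution. It assumes you want to predict the first target column (Weight) from the first feature column (Chins); the question does not say which target to use, so that choice is an assumption. It keeps the same "last 10 rows for testing" split as the question, but applies it to both the feature and the target.

```python
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score

linnerud = datasets.load_linnerud()

# One feature: column 0 of data (Chins), kept 2-D with shape (20, 1).
X = linnerud.data[:, [0]]

# One target: column 0 of target (Weight). Assumption: pick whichever column you need.
y = linnerud.target[:, 0]

# Apply the same split to features and target so the sample counts match.
X_train, X_test = X[:-10], X[-10:]
y_train, y_test = y[:-10], y[-10:]

# Fit on the training half and predict on the held-out half.
regr = linear_model.LinearRegression()
regr.fit(X_train, y_train)
y_pred = regr.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("Coefficients:", regr.coef_)
print("Mean squared error:", mse)
print("R^2 score:", r2)
```

With only 20 samples, the scores depend heavily on which rows end up in the test set. If you prefer a shuffled split, sklearn.model_selection.train_test_split can replace the two slicing lines.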