Can we use 'train_test_split' twice in a single Colab (or Jupyter) notebook?


I have to perform classification and regression using the decision tree machine learning algorithm. I have already finished the regression part of the code. To proceed with the classification task, I need to run train_test_split on the preprocessed dataset. In the code I define the X and y variables, then create X_train, X_test, y_train and y_test. The same variable names are used in both the regression and the classification parts. If I reuse these names for classification, will they keep the earlier values from regression because those were assigned first, or will they take the newly assigned values?

I would like a clear answer on whether the train_test_split function can be used more than once in a single Colab or Jupyter notebook.


There is 1 best solution below


Yes, you can call train_test_split again after you have already called it once for the regression problem. The new X_train, X_test, y_train, y_test variables will replace the old ones.

Something like this:

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# ... do your regression model fitting here ...

# If the classification task uses a different target, redefine X and y first.
# Reassigning the same names replaces the values from the first split.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# ... do classification here with the new X_train, X_test, y_train, y_test ...

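As long as the cells are run in order, so that the second train_test_split call executes before the classification code, it should not be a problem. A notebook variable always holds whatever was assigned to it most recently.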
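To make the reassignment concrete, here is a minimal sketch on made-up data. The arrays X, y_reg and y_clf, the helper name reg_y_train, and the random_state value are all assumptions for illustration, not part of the original question. It calls train_test_split twice with the same variable names and checks that the second call's results are the ones the names refer to afterwards.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples, 2 features.
X = np.arange(20).reshape(10, 2)
y_reg = np.linspace(0.0, 1.0, 10)  # continuous target for regression
y_clf = np.array([0, 1] * 5)       # class labels for classification

# First call: split for the regression task.
X_train, X_test, y_train, y_test = train_test_split(
    X, y_reg, test_size=0.2, random_state=0
)
reg_y_train = y_train  # keep a reference to the regression targets

# Second call: the same names are rebound to the classification split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y_clf, test_size=0.2, random_state=0
)

# y_train now holds class labels, not the earlier regression targets.
print(y_train is reg_y_train)
print(X_train.shape, X_test.shape)
```

If you need both splits at the same time, for example to compare the two models later, one option is to give them distinct names such as X_train_reg and X_train_clf instead of reusing the same ones.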