How to do training and testing for decision tree classifier


I'm trying to train and test a decision tree classifier, and I'm still new to decision trees. My CSV file has 150 rows with two columns, and I want to split it into 100 rows for training and 50 for testing. I've tried using scikit-learn, but I still don't understand how to do the split.

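from sklearn.metrics import accuracy_score, classification_report
from sklearn.tree import DecisionTreeClassifier

# Fit on the training split, then evaluate on the held-out test split
classifier = DecisionTreeClassifier(random_state=17)
classifier.fit(train_x, train_Y)
pred_y = classifier.predict(test_x)
print(classification_report(test_Y, pred_y))
print(accuracy_score(test_Y, pred_y))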

Can anyone show me how to do this? I appreciate any help.

KarelZe On BEST ANSWER

You need to perform a train-test-split.

Since you have 150 samples in total and want 50 of them in the test set, you can pass the test size as an integer: test_size=50. The remaining 100 samples then form the training set.

You might want to set random_state for reproducibility. Generally, it's also good advice to leave shuffle=True (the default) activated. If your data is time-correlated, set shuffle=False instead to prevent data leakage. You can find detailed examples in this book.

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
     X, y, test_size=50, random_state=42)
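
Putting the split together with the classifier from the question, the whole flow might look roughly like the sketch below. The data here is synthetic and only stands in for your CSV (one feature column, one label column, both assumptions on my side); with a real file you would load it, for example with pandas.read_csv, and pick the feature and label columns yourself.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for the 150-row, two-column CSV
# (one feature column and one hypothetical binary label column).
rng = np.random.default_rng(0)
feature = rng.normal(size=150)
X = feature.reshape(-1, 1)      # scikit-learn expects a 2-D feature array
y = (feature > 0).astype(int)   # made-up label, for illustration only

# Hold out exactly 50 rows for testing; the other 100 are used for training
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=50, random_state=42)

# Fit on the training rows only, then predict the unseen test rows
classifier = DecisionTreeClassifier(random_state=17)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

print(classification_report(y_test, y_pred))
acc = accuracy_score(y_test, y_pred)
print("accuracy:", acc)
```

The key point is that fit only ever sees X_train and y_train, while the report and the accuracy are computed on the 50 held-out rows.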