C4.5 Decision Tree Algorithm doesn't improve the accuracy


I ran the C4.5 pruning algorithm in Weka using 10-fold cross-validation and noticed that the unpruned tree had a higher test accuracy than the pruned tree. Why didn't pruning improve the test accuracy?


Pruning reduces the size of the decision tree, which in general lowers training accuracy but improves accuracy on test (unseen) data. Pruning mitigates overfitting, where the model (i.e. the decision tree) achieves near-perfect accuracy on the training data but fails whenever it sees unseen data.

So pruning should usually improve test accuracy. From your question alone, it's difficult to say why it isn't doing so here.

However, you can check your training accuracy: see whether pruning is actually reducing it. If it is not, the pruned and unpruned trees are probably nearly identical, and the problem lies elsewhere; in that case you may want to look at the number of features or the size of your dataset.
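One way to run that check is to compare the mean training and test accuracy of a pruned and an unpruned tree under cross-validation. A minimal sketch in Python, using scikit-learn's cost-complexity pruning (`ccp_alpha`) as a stand-in for C4.5's error-based pruning in Weka's J48 (they are different pruning methods, and the dataset and alpha value here are illustrative assumptions):

```python
# Sketch: pruned vs. unpruned decision trees under 10-fold cross-validation.
# NOTE: scikit-learn implements CART with cost-complexity pruning, not C4.5,
# so this only illustrates the general pruned-vs-unpruned comparison.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)  # example dataset

for name, alpha in [("unpruned", 0.0), ("pruned", 0.01)]:
    clf = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0)
    scores = cross_validate(clf, X, y, cv=10, return_train_score=True)
    print(f"{name:9s} train={scores['train_score'].mean():.3f} "
          f"test={scores['test_score'].mean():.3f}")
```

If the pruned tree's training accuracy is not lower than the unpruned tree's, pruning is barely changing the model, which would explain why test accuracy is unaffected.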