How does pessimistic error pruning in C4.5 algorithm working?

178 Views Asked by hua At 17 August 2025 at 10:02

I'm studing C4.5 algorithm and trying to make it in java by myself, but in the part of pruning, I don't understand what it compute in book of C4.5.

In this book,it said:"When N training cases are covered by a leaf, E of them incorrectly....For a given confidence level CF, ..., this upper limit is here written UCF(E,N)(default confidence level is 25%)....a leaf covering N training cases with a predicted error rate of UCF(E,N) would give rise to predicted N*UCF(E,N) errors."

author show a example, in this subtree:

education cost == n: democrat(6)
education cost == y: democrat(9)
education cost == u: republican(1)

For the first leaf, N=6, E=0, U25%(0,6)=0.206, for the remaining leaves, U25%(0,9)=0.143 and U25%(0,1)=0.75, so the number of predicted errors for this subtree is 60.206+90.143+1*0.75=3.273.

If the subtree were replace by a democrat, the predicted errors is 16U25%(1,16)=160.157=2.512, it is less than 3.273, so it should pruned to a leaf.

But I don't know how it computed U25%(0,6)=0.206, U25%(0,9)=0.143 and so on, and inceased the number of errors observed at each leaf by 0.5, where were put 0.5 into? I have search on internet, there are many different formula so I don't know which one is correct.

Original Q&A

How does pessimistic error pruning in C4.5 algorithm working?

There are 0 best solutions below

Related Questions in JAVA

Related Questions in DECISION-TREE

Related Questions in PRUNING

Related Questions in C4.5

Trending Questions

Popular # Hahtags

Popular Questions