Getting negative information gain with Laplace smoothing


Is it possible to get negative information gain when Laplace smoothing is used?

We know:

IG = H(Y) - H(Y|X)

Here, H is the entropy function and IG is the information gain.

Also:

H(Y) = -Σ_y P(Y=y) · log2 P(Y=y)

H(Y|X) = Σ_x P(X=x) · H(Y|X=x)

H(Y|X=x) = -Σ_y P(Y=y|X=x) · log2 P(Y=y|X=x)

For example, suppose P(Y=y|X=x) = n_{y|x}/n_x, where n_x is the number of samples with X=x and n_{y|x} is the number of those samples that also have Y=y. It is possible that n_x = 0 and n_{y|x} = 0, so I apply Laplace smoothing and define P(Y=y|X=x) = (n_{y|x}+1)/(n_x+|X|). Here, |X| denotes the number of possible values that X can take (the number of splits possible if X is chosen as the attribute). Is it possible that, due to Laplace smoothing, I get negative information gain?
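To make the question concrete, here is a minimal Python sketch that evaluates the formulas above on a small made-up count table. Several things are assumptions and not part of the original setup: the function names `entropy` and `information_gain`, the toy counts, the choice to compute H(Y) and P(X=x) from unsmoothed relative frequencies, and the restriction to binary X and Y (so that |X| equals the number of classes and the smoothed probabilities still sum to 1).

```python
import math


def entropy(probs):
    """Shannon entropy in bits; terms with p == 0 contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def information_gain(counts, smooth=False):
    """IG = H(Y) - H(Y|X) for counts[x][y] = number of samples with X=x, Y=y.

    If smooth is True, P(Y=y|X=x) is estimated as
    (n_{y|x} + 1) / (n_x + |X|), as written in the question.
    H(Y) and P(X=x) use plain relative frequencies (an assumption).
    """
    n = sum(sum(row) for row in counts)
    num_x = len(counts)        # |X|: number of values X can take
    num_y = len(counts[0])     # number of classes of Y

    # H(Y) from unsmoothed class frequencies
    p_y = [sum(row[y] for row in counts) / n for y in range(num_y)]
    h_y = entropy(p_y)

    # H(Y|X) = sum_x P(X=x) * H(Y|X=x)
    h_y_given_x = 0.0
    for row in counts:
        n_x = sum(row)
        if smooth:
            cond = [(c + 1) / (n_x + num_x) for c in row]
        elif n_x == 0:
            continue  # empty branch has weight P(X=x) = 0
        else:
            cond = [c / n_x for c in row]
        h_y_given_x += (n_x / n) * entropy(cond)

    return h_y - h_y_given_x


# Toy data (hypothetical): X carries no information about Y.
counts = [[9, 1], [9, 1]]
ig_plain = information_gain(counts)
ig_smooth = information_gain(counts, smooth=True)
print("IG without smoothing:", ig_plain)
print("IG with smoothing:   ", ig_smooth)
```

Under these assumptions the unsmoothed gain is zero, while the smoothed gain comes out at about -0.18 bits: smoothing pulls each conditional distribution toward uniform, which raises H(Y|X) above the unsmoothed H(Y).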
