I am thinking about a problem that might come up while implementing decision trees. Suppose I have three attributes, X1, X2 and X3, and X3 gives the highest information gain, so I decide to use it as the root attribute and start splitting on it.
Suppose there are two questions I could ask about X3 to start splitting:
a.) If value > 0.6, the sample is class 1.
b.) The value 0.4 occurs several times, so samples with value == 0.4 belong to class 2.
So there are many possible questions to split on. As far as I understand, information gain only tells me which attribute is better to split on, by giving purer subsets.
So, while implementing decision trees, how can one write code to enumerate all possible questions and select the best one?
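
To make the question concrete, here is a minimal sketch of one possible approach, an exhaustive search: for every attribute, generate each candidate question, score it with information gain, and keep the best. It assumes numeric attributes stored as rows of a list, threshold candidates at the midpoints between consecutive distinct values, and equality candidates at each distinct value. The function names (entropy, information_gain, candidate_questions, best_question) are hypothetical and not taken from any library.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, mask):
    """Entropy reduction when labels are split by a boolean mask."""
    yes = [y for y, m in zip(labels, mask) if m]
    no = [y for y, m in zip(labels, mask) if not m]
    if not yes or not no:
        return 0.0  # the question does not separate anything
    n = len(labels)
    weighted = (len(yes) / n) * entropy(yes) + (len(no) / n) * entropy(no)
    return entropy(labels) - weighted


def candidate_questions(values):
    """Yield (operator, constant) pairs for one attribute column."""
    uniq = sorted(set(values))
    # "value > t" questions, with t halfway between neighbouring values
    for lo, hi in zip(uniq, uniq[1:]):
        yield (">", (lo + hi) / 2)
    # "value == v" questions, one per distinct value
    for v in uniq:
        yield ("==", v)


def best_question(rows, labels):
    """Return (gain, attribute_index, operator, constant) of the best split."""
    best = (0.0, None, None, None)
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        for op, t in candidate_questions(column):
            mask = [(x > t) if op == ">" else (x == t) for x in column]
            gain = information_gain(labels, mask)
            if gain > best[0]:
                best = (gain, j, op, t)
    return best


# Toy data (made up for illustration): one attribute, two classes.
rows = [[0.1], [0.2], [0.8], [0.9]]
labels = [2, 2, 1, 1]
print(best_question(rows, labels))
```

In this sketch the attribute and the question are chosen in the same loop, so information gain ranks individual questions rather than whole attributes. Whether equality questions are worth including for continuous attributes is a design choice and not something the sketch settles.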