Why does Random Forest using SAP PAL predict the same value for every input?


I am using the SAP Predictive Analysis Library (PAL) to predict a certain variable. For this, I am using the Random Decision Trees algorithm (also known as Random Forest). I have 24 features and 25k rows. I am training the model with the following parameters:

INSERT INTO #PAL_PARAMETER_TBL VALUES ('HAS_ID', 1, null, null);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('TREES_NUM', 100, NULL, NULL);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('TRY_NUM', 3, NULL, NULL);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('MAX_DEPTH ', 6, null, NULL);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('SEED', 0, NULL, NULL);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('SPLIT_THRESHOLD', NULL, 1e-5, NULL);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('CALCULATE_OOB', 1, NULL, NULL);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('NODE_SIZE', 500, NULL, NULL);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('THREAD_RATIO', NULL, 1.0, NULL);

This is the output that I get:

[Image: Predicted output]

The left column is the predicted output and the right column is the confidence. The actual values are supposed to be as follows:

[Image: Actual value]

In my training set, the values of the dependent variable range from 1.7 to 4. So my question is: why does the model predict the same value for every input? I have also noticed that with the same dataset the Decision Tree algorithm gives values close to the actual output. Since Random Forest is built from decision trees, I expected it to be at least as accurate.
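To make the symptom concrete, here is a minimal Python sketch of one way a forest can collapse to a single value. It is not PAL code and not a confirmed diagnosis. It assumes a toy one-feature dataset with targets from 1.7 to 4, one-split trees, and a hypothetical `min_leaf` setting that plays a role similar to a minimum node size. When no split can satisfy `min_leaf`, every tree becomes a single leaf that predicts the mean of its bootstrap sample, so the averaged forest output is identical for every input.

```python
import random

def avg(values):
    return sum(values) / len(values)

def fit_stump(xs, ys, min_leaf):
    """Fit a one-split regression tree and return a predict function.

    If no split leaves at least min_leaf rows on each side, the tree
    is a single leaf that predicts the mean of its training targets.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None
    for k in range(min_leaf, len(xs) - min_leaf + 1):
        if xs[order[k - 1]] == xs[order[k]]:
            continue  # cannot split between two equal feature values
        left = [ys[i] for i in order[:k]]
        right = [ys[i] for i in order[k:]]
        ml, mr = avg(left), avg(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            threshold = (xs[order[k - 1]] + xs[order[k]]) / 2
            best = (sse, threshold, ml, mr)
    if best is None:
        leaf_mean = avg(ys)
        return lambda x: leaf_mean
    _, threshold, ml, mr = best
    return lambda x: ml if x <= threshold else mr

def fit_forest(xs, ys, n_trees, min_leaf, seed=0):
    """Average n_trees stumps, each trained on a bootstrap sample."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx], min_leaf))
    return lambda x: avg([tree(x) for tree in trees])

# Toy data (an assumption for illustration): targets run from 1.7 to 4.
n = 200
xs = [i / (n - 1) for i in range(n)]
ys = [1.7 + 2.3 * x for x in xs]

flat = fit_forest(xs, ys, n_trees=25, min_leaf=150)   # no split is possible
split = fit_forest(xs, ys, n_trees=25, min_leaf=20)   # splits are possible

for x in (0.0, 0.5, 1.0):
    print(x, round(flat(x), 3), round(split(x), 3))
```

With `min_leaf=150` on 200 rows the first prediction column is the same number for every input, while `min_leaf=20` gives predictions that vary with the input. With 25k rows and NODE_SIZE 500 a split should still be possible, so this exact condition may not apply to my case; the sketch only shows what the collapse looks like.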

Please help.

Reference: SAP PAL Reference Guide
