Is there any way to show that a random forest is non-linear, using, say, 100 attributes?

Is there any way to show that a random forest is non-linear, using, say, 100 attributes? I compared the accuracy of J48 with that of random forest, and random forest performs better. Random forest is said to work better on non-linear problems, so how can I prove or show that?

There is 1 best solution below

First, a simple empirical demonstration that a random forest (RF) can capture a non-linear signal:

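library(randomForest)
#simulate 100 uniform features and a purely non-linear (quadratic) target
obs = 2500
vars = 100
X = data.frame(replicate(vars,runif(obs,-2,2)))
y = apply(X,1,function(x) sum(x^2))
#fit the forest and print the out-of-bag explained variance
rfo = randomForest(X,y)
print(rfo)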

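print(rfo) reports the out-of-bag (OOB) cross-validated explained variance, about 85% here. The target contains no linear component, so a model that explains that much variance must have fitted the non-linear signal, which shows it empirically.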
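To make the contrast explicit, one option is to fit a plain linear model to the same kind of data and compare its adjusted R^2 with the forest's OOB explained variance. The following is only a sketch under my own assumptions: the lm baseline is not part of the original example, and obs, vars, ntree and the seed are reduced or arbitrary values chosen so that it runs quickly.

```r
library(randomForest)

set.seed(1)                       # arbitrary seed, only for reproducibility
obs  <- 1000                      # assumption: smaller than above, for speed
vars <- 5                         # assumption: fewer attributes, for speed
X <- data.frame(replicate(vars, runif(obs, -2, 2)))
y <- rowSums(X^2)                 # same quadratic target as above

# linear baseline: adjusted R^2 of an ordinary least-squares fit
lin   <- lm(y ~ ., data = data.frame(y = y, X))
lm_r2 <- summary(lin)$adj.r.squared

# random forest: OOB pseudo R^2 after the last tree
rfo   <- randomForest(X, y, ntree = 200)
rf_r2 <- tail(rfo$rsq, 1)

print(c(linear = lm_r2, forest = rf_r2))
```

If the linear model explains close to nothing while the forest explains a large share of the variance, the signal the forest picked up cannot be linear.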

Second, you can inspect the curvature of any random forest model, as in the following example:

I use the forestFloor package to visualize an RF model trained on a data set sampled from a hidden non-linear function:

y = f(X) = x_1^2 + sin(pi * x_2) + 2 * x_3 * x_4 + noise, where X is sampled from a standard normal distribution. The curvature found below reproduces this hidden non-linear function, which shows that the model has learned a non-linear structure.

library(forestFloor)
library(randomForest)
#simulate data
obs=2500
vars = 6 
X = data.frame(replicate(vars,rnorm(obs)))
Y = with(X, X1^2 + sin(X2*pi) + 2 * X3 * X4 + 1 * rnorm(obs))

#grow a forest, remember to include inbag (keep.inbag = TRUE)
rfo=randomForest(X,Y,keep.inbag = TRUE,sampsize=1500,ntree=500)

#compute/extract mapping curvature
ff = forestFloor(rfo,X)

#plot partial functions of most important variables first
plot(ff) 

#Non-interacting functions are well displayed, whereas X3 and X4 are not
#by applying a different colour gradient, interactions reveal themselves
Col = fcol(ff,3,orderByImportance=FALSE)
plot(ff,col=Col,plot_GOF=TRUE) 

#in 3D the interaction between X3 and X4 reveals itself completely
show3d(ff,3:4,col=Col,plot.rgl=list(size=5),orderByImportance=FALSE) 

#although there is no interaction, X1 and X2 have a joint additive effect
#colour by FC-component FC1 and FC2 summed
Col = fcol(ff,1:2,orderByImportance=FALSE,X.matrix=FALSE,RGB=TRUE)
plot(ff,col=Col) 
show3d(ff,1:2,col=Col,plot.rgl=list(size=5),orderByImportance=FALSE) 

#...or two-way gradient is formed from FC-component X1 and X2.
Col = fcol(ff,1:2,orderByImportance=FALSE,X.matrix=TRUE,alpha=0.8) 
plot(ff,col=Col) 
show3d(ff,1:2,col=Col,plot.rgl=list(size=5),orderByImportance=FALSE)
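If forestFloor is not available, a rougher check of curvature is possible with partialPlot from the randomForest package itself. This is a different technique (partial dependence, not feature contributions) and only a sketch: the smaller obs, vars, ntree and the seed are my own assumptions, not values from the example above.

```r
library(randomForest)

set.seed(1)                       # arbitrary seed, only for reproducibility
obs <- 1000                       # assumption: smaller than above, for speed
X <- data.frame(replicate(4, rnorm(obs)))
Y <- with(X, X1^2 + sin(X2 * pi) + 2 * X3 * X4 + rnorm(obs))
rfo <- randomForest(X, Y, ntree = 200)

# partial dependence of the prediction on X1;
# plot = FALSE skips the plot and just returns the curve as list(x, y)
pd <- partialPlot(rfo, X, "X1", plot = FALSE)

# location of the lowest point of the curve along X1
print(pd$x[which.min(pd$y)])
```

Calling plot(pd$x, pd$y, type = "l") draws the curve. A U-shape with its minimum near zero would be consistent with the X1^2 term of the hidden function, whereas a linear model would give a straight line.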