House price prediction using a neural network - network not learning


I am using pybrain to predict house prices on the house price dataset. I downloaded the dataset from the link below: https://www.kaggle.com/apratim87/housingdata/data

I picked 6 columns to predict the price: 'bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot', 'floors', 'zipcode'.

I built a neural network with 6 input units, 1 hidden layer with 3 neurons, and 1 output unit.

I have normalized the data. The code is below:

import pandas as pd
from sklearn.preprocessing import Normalizer, MinMaxScaler
from pybrain.datasets import SupervisedDataSet

house_df = pd.read_csv("kc_house_data.csv")
print(house_df.head())

# drop rows with missing values or zeros, then reset the index
df = house_df.dropna(axis=0)
df = df[(df != 0).all(1)]
df.reset_index(drop=True, inplace=True)

X_org = df[['bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot', 'floors', 'zipcode']]
y_org = df[['price']]

# scale the features and the target
scaler = Normalizer().fit(X_org)
X = scaler.transform(X_org)
target_scaler = MinMaxScaler()
y = target_scaler.fit_transform(y_org)

# build a pybrain dataset; X and y are numpy arrays after scaling, so index them directly
ds = SupervisedDataSet(X.shape[1], y.shape[1])
for i in range(len(X)):
    ds.addSample(X[i], y[i])

train,rest=ds.splitWithProportion(0.60)
test,validation=rest.splitWithProportion(0.50)

print('Training Set Size='+str(len(train)))
print('Test Set Size='+str(len(test)))
print('Validation Set Size='+str(len(validation)))

#creating a neural network
from pybrain.tools.shortcuts import buildNetwork
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.structure import SigmoidLayer, SoftmaxLayer
from pybrain.utilities import percentError
import matplotlib.pyplot as plt

def buildNN(invar, hidden, out):
    net = buildNetwork(invar, hidden, out, hiddenclass=SigmoidLayer, outclass=SoftmaxLayer)
    trainer = BackpropTrainer(net, dataset=train, momentum=0.1, verbose=True, weightdecay=0.01)
    trn_err, val_err = trainer.trainUntilConvergence(dataset=train, maxEpochs=50)

    #trainer.trainOnDataset(trndata,500)
    # plot the training and validation error per epoch
    trn, = plt.plot(trn_err, 'b', label='Training Error')
    vali, = plt.plot(val_err, 'r', label='Validation Error')
    plt.legend(handles=[trn, vali])
    plt.ylabel('Error')
    plt.xlabel('Number of Epochs')
    plt.show()

    # testing it on test data ('target' is the output field of a SupervisedDataSet)
    out = net.activateOnDataset(test).argmax(axis=1)
    test_error = percentError(out, test['target'])
    # on validation data
    out = net.activateOnDataset(validation).argmax(axis=1)
    vali_error = percentError(out, validation['target'])
    return test_error, vali_error

print('Neural network with 6 input, 3 hidden units, 1 output')
nn3_testerr,nn3_valierr=buildNN(6,3,1)

I get a constant error and my network is not learning. Can you please suggest what the issue might be?

Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278


There are 2 best solutions below


Ideally, you need two dense layers and a higher number of neurons in each layer. Another important point is mean normalization of your feature matrix. Instead of sigmoid, try ReLU or ELU as the activation function.
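A minimal sketch of those suggestions, assuming the X_org feature frame from the question is in scope: sklearn's StandardScaler gives column-wise mean normalization (Normalizer only rescales each row to unit norm), and buildNetwork accepts two hidden layer sizes. The layer sizes here are illustrative, not tuned, and TanhLayer stands in for ReLU/ELU because a ReLU layer may not be available in every pybrain release; LinearLayer on the output suits a regression target.

from sklearn.preprocessing import StandardScaler
from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure import TanhLayer, LinearLayer

# column-wise mean normalization: zero mean, unit variance per feature
X = StandardScaler().fit_transform(X_org)

# wider network with two hidden layers (sizes 16 and 8 are just a starting point)
net = buildNetwork(6, 16, 8, 1, hiddenclass=TanhLayer, outclass=LinearLayer, bias=True)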


There are lots of ways to improve NN performance:

1) tweak the geometry (add layers, change layer size)

2) change the activation function

3) change step size/momentum

4) play around with data preprocessing

You should try all of these, and various combinations of them. From a quick glance, a single layer with only three neurons won't be very expressive, so start there.
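As a rough sketch of points 2) and 3), assuming the train dataset from the question: BackpropTrainer exposes learningrate and momentum directly, so you can sweep a few values and see which setting actually moves the error. The values below are starting points, not recommendations.

from pybrain.tools.shortcuts import buildNetwork
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.structure import TanhLayer, LinearLayer

# try a few step sizes and a stronger momentum term
for lr in (0.1, 0.01, 0.001):
    net = buildNetwork(6, 12, 1, hiddenclass=TanhLayer, outclass=LinearLayer)
    trainer = BackpropTrainer(net, dataset=train, learningrate=lr, momentum=0.9, verbose=True)
    trainer.trainEpochs(20)  # watch whether the reported error actually decreases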

Can you get this network to converge on a simple example like XOR?
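For reference, a minimal XOR sanity check in pybrain might look like the sketch below (layer sizes, learning rate, and epoch count are illustrative). If the error does not drop even here, the problem is in the training setup rather than the housing data.

from pybrain.tools.shortcuts import buildNetwork
from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer

# XOR truth table: 2 inputs, 1 target
xor_ds = SupervisedDataSet(2, 1)
xor_ds.addSample((0, 0), (0,))
xor_ds.addSample((0, 1), (1,))
xor_ds.addSample((1, 0), (1,))
xor_ds.addSample((1, 1), (0,))

# 2 inputs, 4 hidden units (sigmoid by default), 1 linear output
net = buildNetwork(2, 4, 1, bias=True)
trainer = BackpropTrainer(net, dataset=xor_ds, learningrate=0.1, momentum=0.9)

for epoch in range(1000):
    err = trainer.train()  # one pass over the dataset, returns the average error
print('final training error:', err)

for inp in ((0, 0), (0, 1), (1, 0), (1, 1)):
    print(inp, '->', net.activate(inp))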