Preface: I'm currently learning about ANNs because I have ~18.5k images in ~83 classes. They will be used to train an ANN to recognize approximately equal images in real time. I followed the image example in the book, but it doesn't work for me, so I'm going back to the beginning, as I've likely missed something.
I took the Encog XOR example and extended it to teach it how to add numbers less than 100. So far, the results are mixed, even for the exact inputs it was trained on.
Inputs (normalized by dividing by 100): 0+0, 1+2, 3+4, 5+6, 7+8, 1+1, 2+2, 7.5+7.5, 7+7, 50+50, 20+20
Outputs are the sums, normalized the same way.
After 100,000 training iterations, some sample output for the training inputs:
0+0=1E-18 (great!)
1+2=6.95
3+4=7.99 (so close!)
5+6=9.33
7+8=11.03
1+1=6.70
2+2=7.16
7.5+7.5=10.94
7+7=10.48
50+50=99.99 (woo!)
20+20=41.27 (close enough)
From cherry-picked unseen data:
2+4=7.75
6+8=10.65
4+6=9.02
4+8=9.91
25+75=99.99 (!!)
21+21=87.41 (?)
I've experimented with the layers, the neuron counts, and both ResilientPropagation and Backpropagation, but I'm not entirely sure whether it's getting better or worse. For the results above, the layers are 2, 6, 1.
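For reference, a minimal sketch of that setup (following the XOR example's structure, with sigmoid activations and the normalized data described above; not my code verbatim):

    import org.encog.engine.network.activation.ActivationSigmoid;
    import org.encog.ml.data.MLDataSet;
    import org.encog.ml.data.basic.BasicMLDataSet;
    import org.encog.neural.networks.BasicNetwork;
    import org.encog.neural.networks.layers.BasicLayer;
    import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;

    public class AddNumbers {

        // The input pairs listed above, divided by 100
        public static final double[][] INPUT = {
            {0.00, 0.00}, {0.01, 0.02}, {0.03, 0.04}, {0.05, 0.06}, {0.07, 0.08},
            {0.01, 0.01}, {0.02, 0.02}, {0.075, 0.075}, {0.07, 0.07}, {0.50, 0.50}, {0.20, 0.20}
        };

        // The corresponding sums, also divided by 100
        public static final double[][] IDEAL = {
            {0.00}, {0.03}, {0.07}, {0.11}, {0.15},
            {0.02}, {0.04}, {0.15}, {0.14}, {1.00}, {0.40}
        };

        public static void main(String[] args) {
            // 2 inputs -> 6 hidden neurons -> 1 output, sigmoid activations as in the XOR example
            BasicNetwork network = new BasicNetwork();
            network.addLayer(new BasicLayer(null, true, 2));
            network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 6));
            network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 1));
            network.getStructure().finalizeStructure();
            network.reset();

            MLDataSet trainingSet = new BasicMLDataSet(INPUT, IDEAL);
            ResilientPropagation train = new ResilientPropagation(network, trainingSet);
            for (int i = 0; i < 100000; i++) {
                train.iteration();
            }
            train.finishTraining();
        }
    }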
I have no frame of reference for judging this. Is this normal? Do I not have enough input data? Is my data not complete or random enough, or too heavily weighted?
You are not the first one to ask this. It seems logical to teach an ANN to add: we teach them to function as logic gates, so why not as addition or multiplication operators? I can't answer this completely, because I haven't researched it myself to see how well an ANN performs in this situation.
If you are just teaching addition or multiplication, you might get the best results with a linear output and no hidden layer. For example, to learn to add, the two input weights would need to be 1.0 and the bias weight would have to go to zero:
linear( (input1 * w1) + (input2 * w2) + bias )  becomes  linear( (input1 * 1.0) + (input2 * 1.0) + 0.0 ) = input1 + input2
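In Encog terms, that would look something like the following sketch (assuming the standard Encog 3 classes; the key points are the ActivationLinear output and the absence of a hidden layer):

    // Two inputs feed a single linear output neuron; there is no hidden layer.
    // Training only has to push the two input weights toward 1.0 and the bias weight toward 0.0.
    BasicNetwork network = new BasicNetwork();
    network.addLayer(new BasicLayer(null, true, 2));                    // input layer (with bias)
    network.addLayer(new BasicLayer(new ActivationLinear(), false, 1)); // linear output layer
    network.getStructure().finalizeStructure();
    network.reset();

Swap that in for the 2, 6, 1 construction in your training class (ActivationLinear lives in org.encog.engine.network.activation, next to ActivationSigmoid), keep the same ResilientPropagation loop, and the trained weights should head toward 1.0, 1.0, and 0.0.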
Training with a sigmoid or tanh output might be more problematic: the weights, bias, and hidden layer would basically have to undo the squashing to truly get back to plain addition like the above.
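To make that concrete, here is what the "perfect" addition weights (1.0, 1.0, bias 0.0) produce for 50 + 50, i.e. 0.50 + 0.50 after normalization, under a linear versus a sigmoid output (plain arithmetic, not Encog code):

    // Weighted sum computed by the "perfect" addition weights for 0.50 + 0.50
    double sum = (0.50 * 1.0) + (0.50 * 1.0) + 0.0;       // 1.00

    double linearOutput  = sum;                            // 1.00 -- the right answer
    double sigmoidOutput = 1.0 / (1.0 + Math.exp(-sum));   // ~0.73 -- the sigmoid squashed it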
I think part of the problem is that the neural network is recognizing patterns, not really learning math.