Teaching an ANN how to add


Preface: I'm currently learning about ANNs because I have ~18.5k images in ~83 classes, which will be used to train an ANN to recognize approximately equal images in real time. I followed the image example in the book, but it doesn't work for me, so I'm going back to the beginning because I've likely missed something.

I took the Encog XOR example and extended it to teach the network how to add numbers less than 100. So far, the results are mixed, even for the exact inputs it was trained on.

Inputs (normalized by dividing by 100): 0+0, 1+2, 3+4, 5+6, 7+8, 1+1, 2+2, 7.5+7.5, 7+7, 50+50, 20+20. Outputs are the sums, also divided by 100.
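
Laid out concretely (the array names are mine, purely for illustration), the normalized training data would look like this:

    // The eleven training pairs above, each value divided by 100.0.
    // Array names are illustrative, not from the original program.
    double[][] ADD_INPUT = {
        {0.00, 0.00}, {0.01, 0.02}, {0.03, 0.04}, {0.05, 0.06}, {0.07, 0.08},
        {0.01, 0.01}, {0.02, 0.02}, {0.075, 0.075}, {0.07, 0.07},
        {0.50, 0.50}, {0.20, 0.20}
    };

    // Ideal outputs: the sums, also divided by 100.0.
    double[][] ADD_IDEAL = {
        {0.00}, {0.03}, {0.07}, {0.11}, {0.15},
        {0.02}, {0.04}, {0.15}, {0.14},
        {1.00}, {0.40}
    };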

After 100,000 training iterations, here is some sample output for the training inputs:

0+0=1E-18 (great!)
1+2=6.95
3+4=7.99 (so close!)
5+6=9.33
7+8=11.03
1+1=6.70
2+2=7.16
7.5+7.5=10.94
7+7=10.48
50+50=99.99 (woo!)
20+20=41.27 (close enough)

From cherry-picked unseen data:

2+4=7.75
6+8=10.65
4+6=9.02
4+8=9.91
25+75=99.99 (!!)
21+21=87.41 (?)

I've experimented with the number of layers, the neuron counts, and both resilient and back propagation, but I'm not entirely sure whether it's getting better or worse. For the results above, the layers are 2, 6, 1.
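
For reference, here is a sketch of that 2-6-1 setup in Encog, modeled on the XOR example from the book; the class name, error target, and the use of sigmoid activations are my assumptions, not taken from the actual program:

    import org.encog.Encog;
    import org.encog.engine.network.activation.ActivationSigmoid;
    import org.encog.ml.data.MLData;
    import org.encog.ml.data.MLDataPair;
    import org.encog.ml.data.MLDataSet;
    import org.encog.ml.data.basic.BasicMLDataSet;
    import org.encog.neural.networks.BasicNetwork;
    import org.encog.neural.networks.layers.BasicLayer;
    import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;

    public class AddNetwork {
        public static void main(String[] args) {
            // A subset of the normalized pairs listed earlier (value / 100.0).
            double[][] input = { {0.01, 0.02}, {0.03, 0.04}, {0.07, 0.08}, {0.50, 0.50} };
            double[][] ideal = { {0.03}, {0.07}, {0.15}, {1.00} };
            MLDataSet trainingSet = new BasicMLDataSet(input, ideal);

            // 2-6-1: two inputs, six hidden neurons, one output (sigmoid assumed).
            BasicNetwork network = new BasicNetwork();
            network.addLayer(new BasicLayer(null, true, 2));
            network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 6));
            network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 1));
            network.getStructure().finalizeStructure();
            network.reset();

            // Resilient propagation, as in the Encog XOR example.
            ResilientPropagation train = new ResilientPropagation(network, trainingSet);
            int epoch = 0;
            do {
                train.iteration();
                epoch++;
            } while (train.getError() > 0.0001 && epoch < 100000);
            train.finishTraining();

            // Check the fit on the training pairs; multiply by 100 to de-normalize.
            for (MLDataPair pair : trainingSet) {
                MLData output = network.compute(pair.getInput());
                System.out.println(pair.getInput().getData(0) * 100 + " + "
                        + pair.getInput().getData(1) * 100 + " = "
                        + output.getData(0) * 100);
            }
            Encog.getInstance().shutdown();
        }
    }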

I have no frame of reference for judging this. Is this normal? Do I not have enough training data? Is my data not varied or random enough, or too heavily skewed?


3 Answers

Answer 1

You are not the first one to ask this. It seems logical to teach an ANN to add: we teach them to function as logic gates, so why not as addition or multiplication operators? I can't answer this completely, because I have not researched it myself to see how well an ANN performs in this situation.

If you are just teaching addition or multiplication, you might get the best results with a linear output and no hidden layer. For example, to learn to add, the two weights would need to be 1.0 and the bias weight would have to go to zero:

    linear((input1 * w1) + (input2 * w2) + bias)

becomes

    linear((input1 * 1.0) + (input2 * 1.0) + 0.0)
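
As a sketch of that no-hidden-layer variant in Encog (my own illustration, not code from this answer), a linear output layer would be set up like this:

    import org.encog.engine.network.activation.ActivationLinear;
    import org.encog.ml.data.MLDataSet;
    import org.encog.ml.data.basic.BasicMLDataSet;
    import org.encog.neural.networks.BasicNetwork;
    import org.encog.neural.networks.layers.BasicLayer;
    import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;

    public class LinearAdder {
        public static void main(String[] args) {
            // Two inputs feeding one linear output neuron plus a bias:
            // learning a + b only requires w1 = w2 = 1.0 and bias = 0.0.
            BasicNetwork network = new BasicNetwork();
            network.addLayer(new BasicLayer(null, true, 2));
            network.addLayer(new BasicLayer(new ActivationLinear(), false, 1));
            network.getStructure().finalizeStructure();
            network.reset();

            // A few normalized addition pairs (value / 100), as in the question.
            double[][] input = { {0.01, 0.02}, {0.03, 0.04}, {0.07, 0.08}, {0.20, 0.20} };
            double[][] ideal = { {0.03}, {0.07}, {0.15}, {0.40} };
            MLDataSet data = new BasicMLDataSet(input, ideal);

            ResilientPropagation train = new ResilientPropagation(network, data);
            int epoch = 0;
            do {
                train.iteration();
                epoch++;
            } while (train.getError() > 1e-10 && epoch < 10000);
            train.finishTraining();

            // The dumped weights should approach w1 = 1.0, w2 = 1.0, bias = 0.0.
            System.out.println(network.dumpWeights());
        }
    }

With only three weights to fit and perfectly linear targets, this should converge to essentially exact addition.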

Training with a sigmoid or tanh might be more problematic. The weights, bias, and hidden layer would basically have to undo the sigmoid to truly get back to a plain addition like the one above.

I think part of the problem is that the neural network is recognizing patterns, not really learning math.

Answer 2

An ANN can learn an arbitrary function, including all of arithmetic. For example, it has been proved that the addition of N numbers can be computed by a polynomial-size network of depth 2. One way to teach an NN arithmetic is to use a binary representation (i.e. not an input normalized by 100, but a set of input neurons each representing one binary digit, with the same representation for the output). This way you will be able to implement addition and other arithmetic. See this paper for further discussion and a description of the ANN topologies used in learning arithmetic.
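
As an illustration of that binary encoding (my own sketch, not from the paper referenced above), each number is presented to the network as one input neuron per bit:

    // Encode a non-negative integer as a fixed-width bit vector (one value per
    // input neuron) and decode a thresholded output vector back to an integer.
    // Purely illustrative; the bit width and 0.5 threshold are arbitrary choices.
    public final class BinaryCoding {

        public static double[] encode(int value, int bits) {
            double[] v = new double[bits];
            for (int i = 0; i < bits; i++) {
                v[i] = (value >> i) & 1;       // least significant bit first
            }
            return v;
        }

        public static int decode(double[] outputs) {
            int value = 0;
            for (int i = 0; i < outputs.length; i++) {
                if (outputs[i] > 0.5) {        // threshold each output neuron
                    value |= 1 << i;
                }
            }
            return value;
        }

        public static void main(String[] args) {
            double[] bits = encode(21, 8);     // 21 -> 1,0,1,0,1,0,0,0 (LSB first)
            System.out.println(decode(bits));  // prints 21
        }
    }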

PS. If you want to work with image recognition, it's not a good idea to start practicing with your original dataset. Try a well-studied dataset like MNIST, where it is known what results to expect from correctly implemented algorithms. After mastering the classical examples, you can move on to your own data.

Answer 3

I am in the middle of a demo that teaches the computer how to multiply, and I'd like to share my progress: as Jeff suggested, I used the linear approach, in particular ADALINE. At this moment my program "knows" how to multiply by 5. This is the output I am getting:


    1 x 5 ~= 5.17716232607829
    2 x 5 ~= 10.147218373698
    3 x 5 ~= 15.1172744213176
    4 x 5 ~= 20.0873304689373
    5 x 5 ~= 25.057386516557
    6 x 5 ~= 30.0274425641767
    7 x 5 ~= 34.9974986117963
    8 x 5 ~= 39.967554659416
    9 x 5 ~= 44.9376107070357
    10 x 5 ~= 49.9076667546553

Let me know if you are interested in this demo. I'd be happy to share.
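
In the meantime, for anyone who wants to reproduce something similar without the full demo, here is a minimal from-scratch ADALINE (LMS rule) sketch for learning y = 5x; it is my own illustration, not the code of the demo above:

    // A single linear neuron trained with the LMS (delta) rule to learn y = 5x
    // from the pairs (x, 5x) for x = 1..10. Learning rate and epoch count are
    // arbitrary illustrative choices.
    public final class AdalineTimesFive {
        public static void main(String[] args) {
            double w = 0.0;
            double bias = 0.0;
            double learningRate = 0.001;

            for (int epoch = 0; epoch < 20000; epoch++) {
                for (int x = 1; x <= 10; x++) {
                    double target = 5.0 * x;
                    double output = w * x + bias;   // linear activation
                    double error = target - output;
                    w += learningRate * error * x;  // LMS weight update
                    bias += learningRate * error;   // LMS bias update
                }
            }

            for (int x = 1; x <= 10; x++) {
                System.out.printf("%d x 5 ~= %.4f%n", x, w * x + bias);
            }
        }
    }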