I have made a small multiplication neural network using the Encog library, with the sigmoid activation function on a basic network. My problem is that I get large errors on untrained (unseen) data. How can I get better, less error-prone results on untrained data?
First I tried changing train.getError() > 0.00001 to train.getError() > 0.0000001, on the theory that training to a smaller error would give sharper results. It did not help.
Adding another hidden layer did not help either: network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 128));
I also tried increasing the neuron count per layer, but that did not help.
How can I get sharper results?
What does the bias do, and when should I use it?
I have seen http://www.heatonresearch.com/wiki/Activation_Function, but I am only using sigmoid. When should I use the others, or do I need to change the activation function?
Here is my code:
package org.encog.examples.neural.xor;

import org.encog.Encog;
import org.encog.engine.network.activation.ActivationSigmoid;
import org.encog.ml.data.MLData;
import org.encog.ml.data.MLDataPair;
import org.encog.ml.data.MLDataSet;
import org.encog.ml.data.basic.BasicMLDataSet;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.layers.BasicLayer;
import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;

import java.awt.*;
import java.text.DecimalFormat;
import java.text.NumberFormat;

public class MulHelloWorld {

    /**
     * The input necessary for MUL.
     */
    public static double MUL_INPUT[][] = {
        { 0.0, 0.0 }, { 1.0, 0.0 }, { 0.2, 0.4 }, { 0.3, 0.2 }, { 0.12, 0.11 }, { 0.7, 0.2 },
        { 0.32, 0.42 }, { 0.9, 0.3 }, { 0.5, 0.2 }, { 0.4, 0.6 }, { 0.9, 0.1 } };

    /**
     * The ideal data necessary for MUL.
     */
    public static double MUL_IDEAL[][] = {
        { 0.0 }, { 0.0 }, { 0.08 }, { 0.06 }, { 0.0132 }, { 0.14 },
        { 0.1344 }, { 0.27 }, { 0.1 }, { 0.24 }, { 0.09 } };

    private static BasicNetwork network;
    private static NumberFormat formatter = new DecimalFormat("###.#####");

    public static final void retrain() {
        network = new BasicNetwork();
        network.addLayer(new BasicLayer(null, true, 2));
        network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 128));
        network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 128));
        network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 128));
        network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 1));
        network.getStructure().finalizeStructure();
        network.reset();

        // create training data
        MLDataSet trainingSet = new BasicMLDataSet(MUL_INPUT, MUL_IDEAL);

        // train the neural network
        final ResilientPropagation train = new ResilientPropagation(network, trainingSet);

        int epoch = 1;
        do {
            train.iteration();
            System.out.println("Epoch #" + epoch + " Error:" + formatter.format(train.getError()));
            epoch++;
        } while (train.getError() > 0.00001);
        train.finishTraining();

        // test the neural network
        System.out.println("Neural Network Results:");
        for (MLDataPair pair : trainingSet) {
            final MLData output = network.compute(pair.getInput());
            System.out.println(pair.getInput().getData(0) + "," + pair.getInput().getData(1)
                + ", actual=" + output.getData(0) + ",ideal=" + pair.getIdeal().getData(0));
        }
    }

    /**
     * The main method.
     * @param args No arguments are used.
     */
    public static void main(final String args[]) {
        // create a neural network, without using a factory
        retrain();
        final double computedValue = compute(network, 0.01, 0.01);
        final double diff = computedValue - 0.0001;
        do {
            if (diff < 0.001 && diff > -0.001) {
                String f = formatter.format(computedValue);
                System.out.println("0.0001:" + f);
                System.out.println("0.0002:" + formatter.format(compute(network, 0.02, 0.01))); // 0.0002
                System.out.println("0.001:" + formatter.format(compute(network, 0.05, 0.02))); // 0.001
                Toolkit.getDefaultToolkit().beep();
                try { Thread.sleep(7000); } catch (Exception epx) {}
                retrain();
            } else {
                String f = formatter.format(computedValue);
                System.out.println("0.0001:" + f);
                System.out.println("0.0002:" + formatter.format(compute(network, 0.02, 0.01))); // 0.0002
                System.out.println("0.001:" + formatter.format(compute(network, 0.05, 0.02))); // 0.001
                System.exit(0);
            }
        } while (diff < 0.001 && diff > -0.001);
        Encog.getInstance().shutdown();
    }

    public static final double compute(BasicNetwork network, double x, double y) {
        final double value[] = new double[1];
        network.compute(new double[] { x, y }, value);
        return value[0];
    }
}
Here is my last try. It seems a little more effective, but still not good enough:
package org.encog.examples.neural.xor;

import org.encog.Encog;
import org.encog.engine.network.activation.ActivationSigmoid;
import org.encog.ml.data.MLData;
import org.encog.ml.data.MLDataPair;
import org.encog.ml.data.MLDataSet;
import org.encog.ml.data.basic.BasicMLDataSet;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.layers.BasicLayer;
import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;

import java.awt.*;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.util.ArrayList;

public class MulHelloWorld {

    /**
     * The input necessary for MUL.
     */
    public static double MUL_INPUT[][] = {
        { 0.0, 0.0 }, { 1.0, 0.0 }, { 0.2, 0.4 }, { 0.3, 0.2 },
        { 0.12, 0.11 }, { 0.7, 0.2 }, { 0.32, 0.42 }, { 0.9, 0.3 },
        { 0.5, 0.2 }, { 0.4, 0.6 }, { 0.9, 0.1 }, { 0.1, 0.1 },
        { 0.34, 0.42 }, { 0.3, 0.3 }
    };

    /**
     * The ideal data necessary for MUL.
     */
    public static double MUL_IDEAL[][] = {
        { 0.0 }, { 0.0 }, { 0.08 }, { 0.06 },
        { 0.0132 }, { 0.14 }, { 0.1344 }, { 0.27 },
        { 0.1 }, { 0.24 }, { 0.09 }, { 0.01 },
        { 0.1428 }, { 0.09 }
    };

    private static BasicNetwork network;
    private static NumberFormat formatter = new DecimalFormat("###.##########");
    private static final double acceptableDiff = 0.01;

    public static final void retrain() {
        network = new BasicNetwork();
        network.addLayer(new BasicLayer(null, true, 2));
        network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 32));
        network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 32));
        network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 1));
        network.getStructure().finalizeStructure();
        network.reset();

        ArrayList<Double> inputs = new ArrayList<Double>();
        ArrayList<Double> inputs2 = new ArrayList<Double>();
        ArrayList<Double> outputs = new ArrayList<Double>();
        double j = 0;
        int size = 64;
        for (int i = 0; i < size; i++) {
            final double random1 = Math.random();
            final double random2 = Math.random();
            inputs.add(random1);
            inputs2.add(random2);
            outputs.add(random1 * random2);
        }
        final Double x1[] = new Double[size];
        final Double x2[] = new Double[size];
        final Double x3[] = new Double[size];
        final Double[] inputz1 = inputs.toArray(x1);
        final Double[] inputz2 = inputs2.toArray(x2);
        final Double[] outz = outputs.toArray(x3);
        final double inputsAll[][] = new double[inputz1.length][2];
        final double outputsAll[][] = new double[inputz1.length][1];
        final int inputz1Size = inputz1.length;
        for (int x = 0; x < inputz1Size; x++) {
            inputsAll[x][0] = inputz1[x];
            inputsAll[x][1] = inputz2[x];
            outputsAll[x][0] = outz[x];
        }

        // create training data
        MLDataSet trainingSet = new BasicMLDataSet(inputsAll, outputsAll);

        // train the neural network
        final ResilientPropagation train = new ResilientPropagation(network, trainingSet);

        int epoch = 1;
        do {
            train.iteration();
            System.out.println("Epoch #" + epoch + " Error:" + formatter.format(train.getError()));
            epoch++;
        } while (train.getError() > acceptableDiff);
        train.finishTraining();

        // test the neural network
        System.out.println("Neural Network Results:");
        for (MLDataPair pair : trainingSet) {
            final MLData output = network.compute(pair.getInput());
            System.out.println(pair.getInput().getData(0) + "," + pair.getInput().getData(1)
                + ", actual=" + output.getData(0) + ",ideal=" + pair.getIdeal().getData(0));
        }
    }

    /**
     * The main method.
     * @param args No arguments are used.
     */
    public static void main(final String args[]) {
        // create a neural network, without using a factory
        retrain();
        double random3 = Math.random();
        double random4 = Math.random();
        double v2 = random3 * random4;
        double computedValue = compute(network, random3, random4);
        System.out.println(formatter.format(v2) + ":" + formatter.format(computedValue));
        final double diff = computedValue - v2;
        do {
            if (diff < acceptableDiff || diff > -acceptableDiff) {
                String f = formatter.format(computedValue);
                {
                    double random = Math.random();
                    double random1 = Math.random();
                    double v = random * random1;
                    System.out.println(formatter.format(v) + ":" + formatter.format(compute(network, random, random1)));
                }
                {
                    double random = Math.random();
                    double random1 = Math.random();
                    double v = random * random1;
                    System.out.println(formatter.format(v) + ":" + formatter.format(compute(network, random, random1)));
                }
                {
                    double random = Math.random();
                    double random1 = Math.random();
                    double v = random * random1;
                    System.out.println(formatter.format(v) + ":" + formatter.format(compute(network, random, random1)));
                }
                Toolkit.getDefaultToolkit().beep();
                try { Thread.sleep(1000); } catch (Exception epx) {}
                retrain();
            } else {
                String f = formatter.format(computedValue);
                System.out.println("0.0001:" + f);
                System.out.println("0.0002:" + formatter.format(compute(network, 0.02, 0.01))); // 0.0002
                System.out.println("0.001:" + formatter.format(compute(network, 0.05, 0.02))); // 0.001
                System.exit(0);
            }
        } while (diff < acceptableDiff || diff > -acceptableDiff);
        Encog.getInstance().shutdown();
    }

    public static final double compute(BasicNetwork network, double x, double y) {
        final double value[] = new double[1];
        network.compute(new double[] { x, y }, value);
        return value[0];
    }
}
You might find that you are actually fitting the training set too closely, so your net doesn't generalise well. A better strategy would be to hold out a third set, for validation: use it to decide how far to train (stop when the error on the validation set stops improving), and then test the net on your untrained data.
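For example, here is a minimal sketch of that idea with Encog (this assumes Encog 3.x; trainInputs, trainIdeals, valInputs and valIdeals are placeholder arrays for however you split your data, and the patience of 50 epochs is arbitrary):

    // Hold out part of the data as a validation set and stop training when
    // the validation error stops improving (early stopping).
    MLDataSet trainSet = new BasicMLDataSet(trainInputs, trainIdeals);
    MLDataSet validationSet = new BasicMLDataSet(valInputs, valIdeals);
    ResilientPropagation train = new ResilientPropagation(network, trainSet);

    double bestValError = Double.MAX_VALUE;
    int epochsWithoutImprovement = 0;
    while (epochsWithoutImprovement < 50) {
        train.iteration();
        // BasicNetwork can report its error on any data set, not just the training set
        double valError = network.calculateError(validationSet);
        if (valError < bestValError) {
            bestValError = valError;
            epochsWithoutImprovement = 0;
        } else {
            epochsWithoutImprovement++;
        }
    }
    train.finishTraining();

Training past the point where the validation error bottoms out mostly just memorises the training points, which would explain the behaviour you are seeing on unseen inputs.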
I'm not familiar with this particular package but you might also want to look at other training methods. I've found scaled conjugate gradients to often be a bit better than basic back propagation.
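If you want to try that with Encog, it ships a scaled conjugate gradient trainer that can be dropped in where you currently create the ResilientPropagation. A sketch, based on Encog 3.x (check the package path in your version; the error target and epoch cap here are only illustrative):

    import org.encog.neural.networks.training.propagation.scg.ScaledConjugateGradient;

    // Same training loop as before, only the trainer changes.
    ScaledConjugateGradient train = new ScaledConjugateGradient(network, trainingSet);
    int epoch = 1;
    do {
        train.iteration();
        System.out.println("Epoch #" + epoch + " Error:" + train.getError());
        epoch++;
    } while (train.getError() > 0.01 && epoch < 5000);
    train.finishTraining();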