My LSTM network is very slow. What should I optimize?


I have the following deeplearning4j network (and others like it):

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .updater(new Adam.Builder().learningRate(2e-2).build())
            .l2(1e-5)
            .weightInit(WeightInit.XAVIER)
            .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)
            .gradientNormalizationThreshold(1.0)
            .list()
            .layer(0, new LSTM.Builder().nIn(vectorSize).nOut(256)
                .activation(Activation.TANH).build())
            .layer(1, new RnnOutputLayer.Builder().activation(Activation.SOFTMAX)
                .lossFunction(LossFunctions.LossFunction.MCXENT).nIn(256).nOut(2).build())
            .build();

Unfortunately, training is very slow. My vector size is 400 and I have a huge number of samples. What would you suggest optimizing for faster training? Should I decrease the hidden layer size? Thanks.


Best answer:

From my own experience, I would first try Activation.SOFTSIGN as the activation function for the LSTM layer. It saturates more slowly than tanh, which makes it more robust to vanishing gradients.
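In deeplearning4j this is a one-line change: replace Activation.TANH with Activation.SOFTSIGN in layer 0 of your configuration. To illustrate the saturation difference numerically (not part of the original answer, just plain Java with a hypothetical class name, no dl4j required): softsign(x) = x / (1 + |x|) has derivative 1 / (1 + |x|)², which shrinks polynomially, while tanh'(x) = 1 − tanh(x)² shrinks exponentially for large |x|.

```java
public class SaturationDemo {
    // Derivative of softsign(x) = x / (1 + |x|): shrinks like 1/x^2.
    static double softsignGrad(double x) {
        double d = 1.0 + Math.abs(x);
        return 1.0 / (d * d);
    }

    // Derivative of tanh(x): shrinks exponentially for large |x|.
    static double tanhGrad(double x) {
        double t = Math.tanh(x);
        return 1.0 - t * t;
    }

    public static void main(String[] args) {
        // For pre-activations well away from zero, tanh's gradient
        // collapses much faster than softsign's.
        for (double x : new double[] {1, 3, 5}) {
            System.out.printf("x=%.0f  tanh'=%.6f  softsign'=%.6f%n",
                    x, tanhGrad(x), softsignGrad(x));
        }
    }
}
```

At x = 5, softsign's gradient (1/36 ≈ 0.028) is roughly 150 times larger than tanh's, which is where the robustness to vanishing gradients comes from. Whether it also speeds up wall-clock training for your data is something you would have to benchmark.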