The SELU activation function (https://github.com/bioinf-jku/SNNs/blob/master/selu.py) requires its input to be normalized to a mean of 0.0 and a variance of 1.0. Therefore, I tried to apply tf.layers.batch_normalization (axis=-1) to the raw data to meet that requirement. The raw data in each batch have the shape [batch_size, 15], where 15 is the number of features. The graph below shows the variances of 5 of these features returned by tf.layers.batch_normalization (~20 epochs). They are not all close to 1.0 as expected, and the mean values are not all close to 0.0 either (graphs not shown).
How can I get all 15 features normalized independently, so that every feature after normalization has mean = 0.0 and variance = 1.0?
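Roughly the setup described above, as a minimal sketch (the placeholder names and the dense SELU layer are mine; tf.nn.selu stands in for the linked selu.py implementation):

    import tensorflow as tf

    # Raw features of shape [batch_size, 15], as described above.
    x = tf.placeholder(tf.float32, shape=[None, 15], name="raw_features")
    is_training = tf.placeholder(tf.bool, name="is_training")

    # What I tried: batch-normalize the raw features along the feature axis
    # before feeding them into a SELU layer.
    x_bn = tf.layers.batch_normalization(x, axis=-1, training=is_training)
    hidden = tf.layers.dense(x_bn, 64, activation=tf.nn.selu)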

After reading the original papers on batch normalization (https://arxiv.org/abs/1502.03167) and SELU (https://arxiv.org/abs/1706.02515), I have a better understanding of both:
Batch normalization is an "isolation" procedure: it ensures that the input to the next layer (within any mini-batch) has a fixed distribution, which is how the so-called "internal covariate shift" problem is addressed. The affine transform ( γ*x^ + β ) then tunes the standardized x^ to another fixed distribution for better expressiveness. For plain standardization only, the center and scale parameters have to be set to False when calling tf.layers.batch_normalization.
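A toy NumPy illustration of the two steps (gamma and beta are arbitrary stand-ins for the learned parameters), which also explains why the outputs in the graph above are not at mean 0 / variance 1:

    import numpy as np

    x = np.random.randn(8, 15) * 3.0 + 5.0   # one mini-batch, shape [batch_size, 15]
    mean = x.mean(axis=0)                     # per-feature batch mean
    var = x.var(axis=0)                       # per-feature batch variance
    eps = 1e-8

    # Standardization step: each feature now has mean ~0 and variance ~1.
    x_hat = (x - mean) / np.sqrt(var + eps)

    # Affine step (center=True, scale=True): the learned gamma and beta move
    # the output away from mean 0 / variance 1 again.
    gamma, beta = 1.7, 0.3
    y = gamma * x_hat + beta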
Make sure epsilon (also an argument of tf.layers.batch_normalization) is set at least two orders of magnitude below the smallest magnitude in the input data. The default value of epsilon is 0.001. In my case, some features have values as low as 1e-6, so I had to change epsilon to 1e-8.
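A minimal sketch of the call with those settings (the placeholder tensors are assumptions):

    import tensorflow as tf

    x = tf.placeholder(tf.float32, shape=[None, 15])
    is_training = tf.placeholder(tf.bool)

    # Plain standardization only: no learned offset/scale, and a smaller
    # epsilon than the default 0.001.
    x_bn = tf.layers.batch_normalization(
        x,
        axis=-1,
        center=False,   # no beta offset
        scale=False,    # no gamma scaling
        epsilon=1e-8,
        training=is_training)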
The inputs to SELU have to be normalized before they are fed into the model; tf.layers.batch_normalization is not designed for that purpose.
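One way to do that pre-normalization, as a minimal sketch (the helper name and the eps guard are my own; the statistics are computed on the training set only and reused for the test set):

    import numpy as np

    def standardize(train_x, test_x, eps=1e-8):
        """Standardize each of the 15 features independently, using
        statistics computed on the training set only."""
        mean = train_x.mean(axis=0)
        std = train_x.std(axis=0)
        return (train_x - mean) / (std + eps), (test_x - mean) / (std + eps)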