Does the choice of activation function depend on value input range?


I am currently working with audio data and an autoencoder.

The input data lies in the range [-1, 1], and the output data has to stay in the same range [-1, 1].

So, to help the network keep values between -1 and 1 throughout, I'm using Tanh() activation functions to introduce nonlinearity. (This is to retain the "representation" of the sound throughout the whole network.)
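
For reference, the network looks roughly like this (a minimal sketch assuming PyTorch; the layer sizes are made up):

    import torch.nn as nn

    # Hypothetical autoencoder: Tanh() after every layer keeps activations in [-1, 1]
    autoencoder = nn.Sequential(
        nn.Linear(1024, 256), nn.Tanh(),   # encoder
        nn.Linear(256, 64),   nn.Tanh(),   # bottleneck
        nn.Linear(64, 256),   nn.Tanh(),   # decoder
        nn.Linear(256, 1024), nn.Tanh(),   # output stays in [-1, 1]
    )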

I was wondering: if I shifted my data to [0, 2] and then scaled it to [0, 1], could I also use ReLU functions? (Since ReLU is linear on [0, 1], it wouldn't distort values in that range, would it?)
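
Concretely, the shift and rescale I have in mind is just an affine map and its inverse (a sketch assuming NumPy; waveform is a placeholder):

    import numpy as np

    waveform = np.random.uniform(-1.0, 1.0, size=16000)  # placeholder audio in [-1, 1]

    scaled = (waveform + 1.0) / 2.0   # [-1, 1] -> [0, 2] -> [0, 1]
    restored = scaled * 2.0 - 1.0     # back to [-1, 1]

    assert np.allclose(waveform, restored)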

In general, is there any improvement or reason to shift and normalize my data this way? Also, is ReLU 'better' than tanh, or just faster to compute?

1 Answer

Answer by Orix Au Yeung

You can use any activation function you want in the hidden layers. As long as the final output layer uses tanh, the output will be in the range [-1, 1]. In that sense, yes, you can use ReLU.
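
For example (a sketch assuming PyTorch, with made-up layer sizes):

    import torch.nn as nn

    # ReLU in the hidden layers; Tanh only on the last layer,
    # which squashes the output back into [-1, 1].
    model = nn.Sequential(
        nn.Linear(1024, 256), nn.ReLU(),
        nn.Linear(256, 64),   nn.ReLU(),
        nn.Linear(64, 256),   nn.ReLU(),
        nn.Linear(256, 1024), nn.Tanh(),
    )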

As for shifting the input: in most popular ML frameworks, each layer already has learnable bias weights unless you explicitly disable them, so manually biasing the data is usually unnecessary.
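
For instance, in PyTorch nn.Linear carries a learnable bias by default:

    import torch.nn as nn

    layer = nn.Linear(64, 32)                 # bias=True by default
    print(layer.bias.shape)                   # torch.Size([32]), learned during training

    no_bias = nn.Linear(64, 32, bias=False)   # bias disabled explicitly
    print(no_bias.bias)                       # None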

As with many things in this field, there are no absolutes. I would say test them all and use whichever you find works best.