I am wondering how to make a TransformerEncoder predict masked values during training. Currently, I "mask" parts of the input vector by replacing their values with 0.0 and then pass the new vector to the model. But I don't think this is working properly: the model doesn't seem to recognize that 0.0 marks a value to predict, and instead treats it as an ordinary input value.
So far I have only tried changing the masked values to 0.0.
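
To make the setup concrete, here is a minimal sketch of the masking step in plain Python, so it does not depend on any particular framework. The helper name `mask_with_zeros`, the sample values and the chosen mask positions are illustrative assumptions, not the actual training code.

```python
def mask_with_zeros(values, masked_idx):
    """Replace the given positions with 0.0 (hypothetical helper).

    Returns a masked copy of `values` and a boolean list that records
    which positions were masked. The input list is not modified.
    """
    masked_idx = set(masked_idx)
    masked = [0.0 if i in masked_idx else v for i, v in enumerate(values)]
    is_masked = [i in masked_idx for i in range(len(values))]
    return masked, is_masked


# Illustrative input that already contains a genuine 0.0 at index 2.
x = [0.7, -1.2, 0.0, 3.4, 0.5, -0.9]

# Assumed mask positions, fixed here so the example is deterministic.
x_masked, is_masked = mask_with_zeros(x, masked_idx=[1, 4])

# Every position that is 0.0 in what the model receives.
zero_positions = [i for i, v in enumerate(x_masked) if v == 0.0]
# The positions that were actually masked.
masked_positions = [i for i, m in enumerate(is_masked) if m]

print("model input:    ", x_masked)
print("zeros in input: ", zero_positions)
print("actually masked:", masked_positions)
```

In this sketch the masked vector has zeros at three positions, but only two of them were masked. The vector alone carries no record of which zeros are placeholders; that information lives only in `is_masked`, which this setup never passes to the model.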