We use online learning to train our models on an infinite stream of data. That is, we train a model on a training set with the --save_resume option, deploy the trained model to production, and then keep training (fine-tuning) the stored model on fresh data.
Our ML model is misbehaving, so we would like to get a better understanding of the model's weights and how they evolve over time.
First we used the --audit option, but then we realized that the audit log contains only the weights of signals that were present in the data used to generate it. We wanted to see the weights of ALL signals present in the model (we did not go for the --invert_hash option for performance reasons), so we used the --readable_model option, which writes out all signal weights of the model. The only issue is that it shows hashes instead of signal names.
To my surprise, the weights for the same hashes differ between the audit log and the readable model log.
Example:
Line from my audit log:
Namespace^Feature, HashValue, WeightValue
c^cnr, 892361, 0.0112584
and the corresponding line from the readable model log:
HashValue: WeightValue SumOfGradients Regularization
892361: 0 -9027.66 6.42899e+07
So in the audit log I see weight 0.0112584 for hash value 892361, whilst in the readable model log I see weight 0 for the very same hash value. Can anybody explain why the same hash has different weights in the audit log and in the readable model log?
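To check how widespread the mismatch is, the two logs can be compared hash by hash. Below is a minimal Python sketch of such a comparison. It assumes the line layouts shown in the example above (comma-separated audit lines, and "hash: weight ..." readable model lines); the function names are hypothetical helpers of mine, not part of VowpalWabbit, and any header lines in the real files would have to be skipped before parsing.

```python
def parse_audit_line(line):
    """Parse an audit line such as 'c^cnr, 892361, 0.0112584'.

    Assumes the layout from the example above: name, hash, weight.
    """
    # rsplit keeps the feature name intact even if it contains a comma
    name, hash_value, weight = (part.strip() for part in line.rsplit(",", 2))
    return name, int(hash_value), float(weight)


def parse_readable_line(line):
    """Parse a readable model line such as '892361: 0 -9027.66 6.42899e+07'.

    Returns the hash and the list of values that follow it; the first
    value is taken to be the weight, as in the example above.
    """
    hash_value, rest = line.split(":", 1)
    return int(hash_value), [float(value) for value in rest.split()]


def find_mismatches(audit_lines, readable_lines, tol=1e-9):
    """Return audit entries whose weight differs from the readable model."""
    model = dict(
        parse_readable_line(line) for line in readable_lines if line.strip()
    )
    mismatches = []
    for line in audit_lines:
        if not line.strip():
            continue
        name, hash_value, weight = parse_audit_line(line)
        values = model.get(hash_value)  # None if the hash is not in the model
        if values is None or abs(values[0] - weight) > tol:
            mismatches.append((name, hash_value, weight, values))
    return mismatches


if __name__ == "__main__":
    # The two lines quoted in the question
    audit = ["c^cnr, 892361, 0.0112584"]
    readable = ["892361: 0 -9027.66 6.42899e+07"]
    for entry in find_mismatches(audit, readable):
        print(entry)
```

Run on the two lines quoted above, this reports hash 892361 as a mismatch, since the audit weight is 0.0112584 and the first readable model value is 0.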
I use VowpalWabbit version 8.11.0.
The command line options I use for generating the logs are:
--bit_precision 21 --l1 1.0e-12 --l2 1.0e-12 --ftrl --ftrl_alpha 0.01 --ftrl_beta 0.5 -q cf -q cp -q cw -q pf -q pw --loss_function logistic --link logistic --hash all --holdout_off --progress 1000000 --save_resume --final_regressor 1948174707144499995.vw --initial_regressor 8629000310498118597.vw --audit --audit_regressor audit.csv --readable_model 1948174707144499995.vw.txt