I have an unusual regression problem that I'm trying to fit into Vowpal Wabbit. I want to learn a set of regressors {r_m(x)} that train on the data set {(x_n, h_n[m])} for n = 1 to N, where m indexes the M output dimensions. In effect, these are M separate regression problems.
I am wondering if it is possible to merge all M problems into a single one by assigning each problem its own namespace. E.g., inside the .vw training file I'd have the following:
h_1[m=0] |firstnamespace x_1_features
h_2[m=0] |firstnamespace x_2_features
... |... ...
h_N[m=0] |firstnamespace x_N_features
----------------------------------------------------------------
h_1[m=1] |secondnamespace x_1_features
h_2[m=1] |secondnamespace x_2_features
... |... ...
h_N[m=1] |secondnamespace x_N_features
----------------------------------------------------------------
h_1[m=M-1] |lastnamespace x_1_features
h_2[m=M-1] |lastnamespace x_2_features
... |... ...
h_N[m=M-1] |lastnamespace x_N_features
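For concreteness, this is roughly how I'd generate such a file. It's only a sketch: the variable names (features, targets), the helper write_vw_file, and the "dim%d" namespace names are hypothetical placeholders for my actual data.

# Minimal sketch (hypothetical data layout, not my real variable names): write one VW
# example per (n, m) pair, giving each output dimension m its own namespace.
def write_vw_file(path, features, targets, M):
    """features[n] is a list of word tokens for x_n; targets[n][m] is h_n[m]."""
    with open(path, "w") as f:
        for m in range(M):                         # one block per output dimension
            ns = "dim%d" % m                       # namespace name for dimension m
            for x_tokens, h in zip(features, targets):
                f.write("%s |%s %s\n" % (h[m], ns, " ".join(x_tokens)))

# Toy usage: N=2 examples, M=2 output dimensions
features = [["foo", "bar"], ["baz", "qux"]]
targets = [[0.5, 1.2], [2.0, -0.3]]
write_vw_file("Train.vw", features, targets, M=2)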
Then I can just perform
vw -d Train.vw -f Train.model -c --loss_function squared --invert_hash model_readable.txt
and obtain the regressor weights for each namespace.
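To pull the per-namespace regressors out afterwards, I'd parse the readable model roughly like this. This is only a sketch and assumes the weight lines in model_readable.txt have the form namespace^feature:hash_index:weight (header lines and the Constant term are skipped); the helper name is my own.

# Sketch for grouping weights by namespace from the --invert_hash output.
# Assumes weight lines look like "namespace^feature:hash_index:weight" after the
# header preamble; lines that don't match this pattern are simply skipped.
from collections import defaultdict

def read_weights_by_namespace(path):
    weights = defaultdict(dict)                   # namespace -> {feature: weight}
    with open(path) as f:
        for line in f:
            parts = line.strip().rsplit(":", 2)   # split off hash index and weight
            if len(parts) != 3 or "^" not in parts[0]:
                continue                          # skip header / Constant lines
            name, _, w = parts
            ns, feat = name.split("^", 1)
            try:
                weights[ns][feat] = float(w)
            except ValueError:
                continue
    return weights

w = read_weights_by_namespace("model_readable.txt")
# e.g. w["firstnamespace"] would hold the weights of the m=0 regressor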
I know this strategy is similar to the reduction of a multi-label classification problem into multiple binary classification problems (this link). I am wondering whether the same can be applied to regression problems without any cross-talk between the dimensions, i.e. with Vowpal Wabbit treating each namespace independently.
In case it matters, I have M = 400, N = 4e6, and the number of feature dimensions equals the number of unique word tokens in the whole document set...
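Concretely, merging all M problems into one training file at these sizes means

N × M = 4e6 × 400 = 1.6e9 examples in Train.vw.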