Multi-dimensional regression with Vowpal Wabbit


I have an unusual regression problem that I'm trying to fit into Vowpal Wabbit. I want to learn a set of regressors {r_m(x)}, where each r_m is trained on the data set {(x_n, h_n[m])} for n = 1, ..., N, and m indexes the M output dimensions. In other words, there are M separate regression problems that share the same inputs x_n.

I am wondering whether it is possible to merge all M problems into a single one by giving each problem its own namespace. For example, the .vw training file would contain the following:

h_1[m=0] |firstnamespace x_1_features
h_2[m=0] |firstnamespace x_2_features
...      |...            ...
h_N[m=0] |firstnamespace x_N_features
----------------------------------------------------------------
h_1[m=1] |secondnamespace x_1_features
h_2[m=1] |secondnamespace x_2_features
...      |...             ...
h_N[m=1] |secondnamespace x_N_features
----------------------------------------------------------------
h_1[m=M-1] |lastnamespace x_1_features
h_2[m=M-1] |lastnamespace x_2_features
...        |...           ...
h_N[m=M-1] |lastnamespace x_N_features
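
For concreteness, here is a minimal sketch of how a file with this layout could be generated. The helper names (`to_vw_line`, `build_vw_lines`) and the namespace names `dim0`, `dim1`, ... are hypothetical and chosen only for illustration; the toy `docs` and `targets` stand in for the real documents and the h_n[m] values.

```python
def to_vw_line(target, namespace, tokens):
    """Format one example as 'label |namespace tok1 tok2 ...'."""
    # ':' and '|' are reserved in VW's text format, so replace them in tokens.
    safe = [t.replace(":", "_").replace("|", "_") for t in tokens]
    return f"{target} |{namespace} {' '.join(safe)}"


def build_vw_lines(docs, targets):
    """docs: list of N token lists; targets: list of N lists, each of length M."""
    num_dims = len(targets[0])
    lines = []
    for m in range(num_dims):
        namespace = f"dim{m}"  # hypothetical namespace name for dimension m
        for tokens, h in zip(docs, targets):
            lines.append(to_vw_line(h[m], namespace, tokens))
    return lines


# Toy data: N = 2 documents, M = 2 target dimensions.
docs = [["the", "quick", "fox"], ["lazy", "dog"]]
targets = [[0.5, -1.0], [2.0, 0.25]]
vw_lines = build_vw_lines(docs, targets)
print("\n".join(vw_lines))
```

Each document is repeated once per dimension, so the merged file has N * M lines in total.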

Then I can just run

vw -d Train.vw -f Train.model -c --loss_function squared \
    --invert_hash model_readable.txt

and obtain the regressor weights for each namespace.

I know this strategy is similar to the reduction of a multi-label classification problem into multiple binary classification problems (see this link). I am wondering whether the same can be applied to regression problems without any cross-talk between the dimensions, i.e. with Vowpal Wabbit treating each namespace independently.
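
To make "no cross-talk" concrete, here is a toy sketch in plain Python. It is not Vowpal Wabbit's actual update rule: it assumes plain SGD on squared loss with a fixed learning rate, no hash collisions, and feature keys of the hypothetical form `namespace^token`. Under those assumptions, training on the merged data gives exactly the same weights as training each problem separately, because an update only touches the features present in the current example. The second half shows that a single feature shared across all examples (like the constant/bias feature that VW adds by default unless `--noconstant` is given) is enough to couple the problems.

```python
def sgd_squared(examples, lr=0.1):
    """Plain SGD on squared loss; each example is (target, {feature: value})."""
    w = {}
    for y, feats in examples:
        pred = sum(w.get(f, 0.0) * v for f, v in feats.items())
        grad = pred - y  # derivative of 0.5 * (pred - y)^2 w.r.t. pred
        for f, v in feats.items():
            w[f] = w.get(f, 0.0) - lr * grad * v
    return w


def namespaced(examples, ns, shared_bias=False):
    """Prefix every feature with its namespace; optionally add one shared feature."""
    out = []
    for y, feats in examples:
        keyed = {f"{ns}^{f}": v for f, v in feats.items()}
        if shared_bias:
            keyed["Constant"] = 1.0  # same key in every namespace
        out.append((y, keyed))
    return out


# Two toy regression problems over the same inputs.
prob0 = [(1.0, {"a": 1.0, "b": 1.0}), (0.0, {"b": 1.0})]
prob1 = [(-2.0, {"a": 1.0, "b": 1.0}), (3.0, {"b": 1.0})]

# Disjoint namespaces: merged training equals separate training.
merged = sgd_squared(namespaced(prob0, "ns0") + namespaced(prob1, "ns1"))
separate = {}
separate.update(sgd_squared(namespaced(prob0, "ns0")))
separate.update(sgd_squared(namespaced(prob1, "ns1")))
independent = merged == separate

# With a shared feature, the second problem's weights depend on the first.
merged_bias = sgd_squared(
    namespaced(prob0, "ns0", True) + namespaced(prob1, "ns1", True)
)
alone_bias = sgd_squared(namespaced(prob1, "ns1", True))
coupled = merged_bias["ns1^a"] != alone_bias["ns1^a"]

print(independent, coupled)
```

Whether the same holds inside VW would additionally depend on its learning-rate schedule and on how many bits (`-b`) are available to keep the M * vocabulary-size hashed features from colliding.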

In case it matters: I have M = 400 and N = 4e6, and the number of feature dimensions equals the number of unique word tokens in the whole document set...
