Patsy formula when variable has a hyphen


I am trying to use the statsmodels linear regression functions with formulas. My sample data comes from a pandas DataFrame. I am having a slight problem with column names within the formula: because of downstream processes, my column names contain hyphens. For example:

+------+-------+-------+
| VOLT | B-NN  | B-IDW |
+------+-------+-------+

One reason for keeping the hyphen is that it lets Python split the string for other analysis, so I have to keep it. When I try to regress VOLT on B-NN using VOLT ~ B-NN, I run into a problem: patsy treats the hyphen as a minus operator and cannot find a variable named B.

Is there a way to tell Patsy that B-NN is a variable name and not B minus NN?

Thanks.

BJR


Best answer

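patsy provides Q() for quoting variable names that are not valid Python identifiers, e.g. Q('B-IDW'). In your case the formula becomes VOLT ~ Q('B-NN').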

http://patsy.readthedocs.io/en/latest/builtins-reference.html#patsy.builtins.Q

my_fit_function("y ~ Q('weight.in.kg')", ...)
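
Applied to the columns in the question, a minimal sketch could look like the following. The numbers are made up purely for illustration (only the column names come from the question), and it assumes pandas and statsmodels are installed:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data for illustration; only the column names are from the question.
df = pd.DataFrame({
    "VOLT": [1.0, 2.1, 2.9, 4.2, 5.0],
    "B-NN": [0.5, 1.0, 1.5, 2.0, 2.5],
    "B-IDW": [0.4, 1.1, 1.4, 2.2, 2.4],
})

# Q('B-NN') tells patsy to look up the column literally named "B-NN"
# instead of parsing the hyphen as an operator between B and NN.
model = smf.ols("VOLT ~ Q('B-NN')", data=df).fit()

# The fitted coefficient is labelled with the quoted term, Q('B-NN').
print(model.params)
```

The column names in the data frame stay unchanged, so any downstream code that splits on the hyphen keeps working; only the formula string needs the Q() wrapper.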