Enforce binary coefficients in Logistic Regression


I am facing a classification problem in which I need to classify samples based on the presence or absence of features.

The data matrix X is a binary scipy.sparse.csr_matrix.

I tried to fit a LogisticRegression with penalty='l1' and C=0.5, which gives very good metrics, and very sparse coefficients, which is desired.
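For reference, the setup looks like this (the data here is a synthetic stand-in for my real presence/absence matrix):

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Toy binary presence/absence matrix: 200 samples, 50 features
X = csr_matrix((rng.random((200, 50)) < 0.1).astype(np.float64))
y = rng.integers(0, 2, size=200)

# L1-penalised logistic regression; liblinear works directly on sparse input
clf = LogisticRegression(penalty='l1', C=0.5, solver='liblinear')
clf.fit(X, y)

# The L1 penalty drives many coefficients exactly to zero
n_nonzero = np.count_nonzero(clf.coef_)
print(n_nonzero, clf.coef_.shape)
```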

But an additional requirement of my problem is that the coefficients be binary, in {0, 1}, or ternary, in {-1, 0, 1}. That is, a feature either contributes positively to classifying a sample into the positive class, contributes negatively, or does not contribute at all, and any two contributing features must contribute with equal magnitude.

I am guessing a simple way to achieve this would be to add a regularization term:

reg(w) = λ · Σ_j |w_j| · | |w_j| − 1 |
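For instance, a term of the form Σ_j |w_j| · ||w_j| − 1| is zero exactly when every coefficient lies in {−1, 0, 1} and positive otherwise (this is just one candidate form; note it is non-convex). A quick numeric check:

```python
import numpy as np

def ternary_penalty(w):
    # Zero exactly when every coefficient is in {-1, 0, 1}; positive otherwise.
    w = np.asarray(w, dtype=float)
    return np.sum(np.abs(w) * np.abs(np.abs(w) - 1.0))

print(ternary_penalty([-1.0, 0.0, 1.0]))   # 0.0
print(ternary_penalty([0.5, -0.3, 2.0]))   # positive: coefficients off-grid
```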

the original optimisation problem being:

min_w ‖w‖_1 + C · Σ_i log(1 + exp(−y_i x_iᵀ w))

So that the coefficients would indeed be sparse, and the nonzero coefficients would have to be 1 or -1. Right now I am using solver='liblinear', which is incredibly efficient, but it is written in C++, so it is not easy to modify. Since I am using penalty='l1', the only other option is solver='saga', which is unfortunately much slower in my experience. Moreover, that code is in Cython, which is not much easier to modify.
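Absent a solver that accepts such a penalty, one crude heuristic (a sketch, not equivalent to solving the constrained problem) would be: fit the L1 model as usual, project each coefficient onto {−1, 0, 1} with its sign, and then re-tune only the decision threshold for the fixed ternary weights. On synthetic data:

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
# Synthetic presence/absence data generated from a ternary ground truth
X = csr_matrix((rng.random((300, 40)) < 0.15).astype(np.float64))
true_w = rng.choice([-1.0, 0.0, 1.0], size=40, p=[0.1, 0.8, 0.1])
y = (X @ true_w + 0.1 * rng.standard_normal(300) > 0).astype(int)

# Step 1: ordinary L1 logistic regression, used only for selection and signs
clf = LogisticRegression(penalty='l1', C=0.5, solver='liblinear').fit(X, y)

# Step 2: project coefficients onto {-1, 0, 1}
w_ternary = np.sign(clf.coef_.ravel())

# Step 3: with the ternary weights fixed, pick the threshold (i.e. the
# intercept) that maximises training accuracy
scores = np.asarray(X @ w_ternary).ravel()
candidates = np.concatenate([[scores.min() - 1.0], np.sort(scores)])
best_b = max(candidates, key=lambda b: np.mean((scores > b) == y))
acc = np.mean((scores > best_b) == y)
print(f"ternary-coefficient training accuracy: {acc:.3f}")
```

The sign projection can of course hurt the metrics relative to the unconstrained L1 fit, so it is only a baseline to compare any proper constrained method against.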

Is there another (simpler?) way to achieve this? Maybe someone is aware of a widely used method that I have missed?
