I'm using Liblinear to train models for a classification problem. I have noticed that changing the order of the samples in the training data can result in different models. To test this, I created two Liblinear problems that contain the same data in a different order.
Problem 1:
x:
[FeatureNode(idx=1, value=1.0), FeatureNode(idx=2, value=1.0), FeatureNode(idx=5, value=1.0)]
[FeatureNode(idx=1, value=1.0), FeatureNode(idx=2, value=1.0), FeatureNode(idx=3, value=1.0), FeatureNode(idx=5, value=1.0)]
y:
[1.0, 0.0]
Generated model:
[0.0, 0.0, -1.0, 0.0, 0.0]
Problem 2:
x:
[FeatureNode(idx=1, value=1.0), FeatureNode(idx=2, value=1.0), FeatureNode(idx=3, value=1.0), FeatureNode(idx=5, value=1.0)]
[FeatureNode(idx=1, value=1.0), FeatureNode(idx=2, value=1.0), FeatureNode(idx=5, value=1.0)]
y:
[0.0, 1.0]
Generated model:
[0.04166666666666674, 0.04166666666666674, 0.875, 0.0, 0.04166666666666674]
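As a sanity check, here is a minimal sketch in plain Java (no Liblinear calls) that confirms both problems hold the same labeled samples and differ only in their order. `OrderCheck` and its methods are hypothetical helper names, not part of Liblinear. Each sample is encoded by its active feature indices only, which assumes every feature value is 1.0 as in the data above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Liblinear: checks that two problems
// contain the same (label, features) pairs regardless of sample order.
class OrderCheck {

    // Problem 1: active feature indices per sample (all values assumed 1.0).
    static final int[][] X1 = { {1, 2, 5}, {1, 2, 3, 5} };
    static final double[] Y1 = { 1.0, 0.0 };

    // Problem 2: the same samples and labels, listed in the opposite order.
    static final int[][] X2 = { {1, 2, 3, 5}, {1, 2, 5} };
    static final double[] Y2 = { 0.0, 1.0 };

    // Render each sample as "label|indices" and sort the rows,
    // so the original sample order no longer matters.
    static List<String> canonical(int[][] x, double[] y) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < x.length; i++) {
            rows.add(y[i] + "|" + Arrays.toString(x[i]));
        }
        Collections.sort(rows);
        return rows;
    }

    // True if both problems contain exactly the same labeled samples.
    static boolean sameSamples(int[][] xa, double[] ya, int[][] xb, double[] yb) {
        return canonical(xa, ya).equals(canonical(xb, yb));
    }

    public static void main(String[] args) {
        System.out.println(sameSamples(X1, Y1, X2, Y2)); // prints "true"
    }
}
```

So the two inputs are the same as a set of labeled samples. The only difference is which sample, and therefore which label, comes first.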
What is the reason for this? Can this be avoided?