I am just starting to use the SimpleTagger class in mallet. My impression is that it expects binary features. The model that I want to implement has positive integer-valued features and I wonder how to implement this in mallet. Also, I heard that non-binary features need to be normalized if the model is to make sense. I would appreciate any suggestions on how to do this.
P.S. Yes, I know that there is a dedicated mallet mailing list, but I have been waiting nearly a day for my subscription to be approved so that I can post there. I'm simply in a hurry.
Well, it's 6 years later now. If you're not in a hurry anymore, you could check out the Java API to create your instances. A minimal example:
Or, if you want to keep using `SimpleTagger`, just define binary features like `HAS_1_LETTER`, `HAS_2_LETTER`, etc., though this seems tedious.
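The tedium of the binary-feature route can be scripted away. Below is a rough sketch, not part of mallet itself, that turns an integer-valued feature (here the token length, purely as an illustration) into one binary feature name per value and prints lines in the whitespace-separated "features... label" layout that `SimpleTagger` reads. The class name `BinarizeFeatures`, the helper names, the cap `MAX_BUCKET`, and the `HAS_5_PLUS_LETTER` bucket are all assumptions made for this example.

```java
public class BinarizeFeatures {
    // Hypothetical cap: every value at or above this shares one bucket,
    // so rare large values do not each get their own sparse feature.
    static final int MAX_BUCKET = 5;

    // Map a positive integer value to a binary feature name.
    static String lengthFeature(int length) {
        if (length >= MAX_BUCKET) {
            return "HAS_" + MAX_BUCKET + "_PLUS_LETTER";
        }
        return "HAS_" + length + "_LETTER";
    }

    // Build one input line: feature names first, the label last.
    // The token itself is kept as a feature alongside the length feature.
    static String toTaggerLine(String token, String label) {
        return token + " " + lengthFeature(token.length()) + " " + label;
    }

    public static void main(String[] args) {
        // Toy data; the labels are made up for the example.
        String[] tokens = {"I", "am", "starting"};
        String[] labels = {"PRON", "VERB", "VERB"};
        for (int i = 0; i < tokens.length; i++) {
            System.out.println(toTaggerLine(tokens[i], labels[i]));
        }
    }
}
```

Bucketing the values this way also sidesteps the normalization question, since every feature stays binary. The cost is that the model no longer knows that 3 is closer to 4 than to 1.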