Why is Java's Double Metaphone only giving four-letter codes?


I want to use DoubleMetaphone to get a phonetic encoding of a given string. For example:

import org.apache.commons.codec.language.DoubleMetaphone;
String s1 = "computer";
(new DoubleMetaphone()).doubleMetaphone(s1);

Result: computer -> KMPT

The issue arises when I try to encode longer strings.

import org.apache.commons.codec.language.DoubleMetaphone;
String s1 = "dustinhoffmanisanactor";
(new DoubleMetaphone()).doubleMetaphone(s1);

Result: dustinhoffmanisanactor -> TSTN

Clearly it stops as soon as the code reaches 4 characters. In this case only "dustin" is encoded: dustin -> TSTN.

I tried the Python implementation of Double Metaphone, and it encodes the whole string as expected.

>>> from metaphone import doublemetaphone
>>> doublemetaphone("dustinhoffmanisanactor")[0]
"TSTNFMNSNKTR"

BEST ANSWER (Ian)

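It turns out I needed to raise the max code length. DoubleMetaphone defaults to a maximum of 4 characters, so longer codes are cut off unless you change it.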

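import org.apache.commons.codec.language.DoubleMetaphone;

String s1 = "dustinhoffmanisanactor";
DoubleMetaphone dm = new DoubleMetaphone();
dm.setMaxCodeLen(100); // default is 4; raise it so the code is not cut short
dm.doubleMetaphone(s1);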

This gives the expected TSTNFMNSNKTR.
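
To make the effect of the cap concrete, here is a minimal standard-library sketch that models it. It does not call commons-codec: the class MaxCodeLenSketch and the helper capCode are hypothetical names used only for illustration, and the model assumes the capped code is simply a prefix of the uncapped one, which matches the outputs reported above.

```java
class MaxCodeLenSketch {

    // Hypothetical helper (not part of commons-codec): keeps at most
    // maxCodeLen characters of an already computed phonetic code.
    // Assumes maxCodeLen >= 0.
    static String capCode(String fullCode, int maxCodeLen) {
        int end = Math.min(fullCode.length(), maxCodeLen);
        return fullCode.substring(0, end);
    }

    public static void main(String[] args) {
        // Full code for "dustinhoffmanisanactor", as reported by the
        // Python implementation quoted in the question.
        String fullCode = "TSTNFMNSNKTR";

        // A cap of 4 (the DoubleMetaphone default) keeps only the first 4 characters.
        System.out.println(capCode(fullCode, 4));

        // A cap larger than the code length leaves the code untouched.
        System.out.println(capCode(fullCode, 100));
    }
}
```

Under these assumptions the first call yields TSTN and the second yields the full TSTNFMNSNKTR, which is the difference between the default setting and setMaxCodeLen(100).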