I want to code the Metaphone 3 algorithm myself. Is there a description? I know the source code is available for sale but that is not what I am looking for.
What is the Metaphone 3 Algorithm?
Asked by necromancer (19.1k views)

Since the author (Lawrence Philips) decided to commercialize the algorithm itself, it is more than likely that you will not find a description. A good place to ask would be the mailing list: https://lists.sourceforge.net/lists/listinfo/aspell-metaphone
You can also check out the source code (i.e. the code comments) to understand how the algorithm works: http://code.google.com/p/google-refine/source/browse/trunk/main/src/com/google/refine/clustering/binning/Metaphone3.java?r=2029

I thought it is wrong to have the general community be denied an algorithm (not code)
I am selling the source, so the algorithm is not hidden. I am asking $40.00 for a copy of the source code, and asking other people who are charging for their software or services that use Metaphone 3 to pay me a licensing fee, and also asking that the source code not be distributed by other people (except for an exception I made for Google Refine - I can only request that you do not redistribute the copy of Metaphone 3 found there separately from the Refine package.)

This is not a commercial post and I have no relationship with the owner but it is worth saying that an implementation of Metaphone3 is available as commercial software from its creator amporphics.com. It looks like his personal store. It is a Java app but I bought the Python version and it works fine.
The Why Metaphone3? page says:
One common solution to spelling variation is the database approach. Some very impressive work has been done accumulating personal name variations from all over the world. (Of course, we are always very pleased when the companies that retail these databases advertise that they also use some version of Metaphone to improve their flexibility :-) )
But - there are some problems with this approach:
- They only work well until they encounter a spelling variation or a new word or name that is not already in their database.
Then they don't work at all.
Metaphone 3 is an algorithmic approach that will deliver a phonetic lookup key for anything you enter into it.
- Personal names, that is, first names and family names, are not the same as company names. In fact, the name of a company or agency may contain words of any kind, not just names. Database solutions usually don't cover possible spelling variations, or for that matter misspellings, for regular 'dictionary' words. Or if they do, not very thoroughly.
Metaphone 3 was developed to account for all spelling variations commonly found in English words, first and last names found in the United States and Europe, and non-English words whose native pronunciations are familiar to Americans. It doesn't care what kind of a word you are trying to match.
For what it is worth, we licensed the code since it is affordable and easy to use. I can't speak to performance yet. There are good alternatives on PyPI, but I can't find them at the moment.

Wikipedia describes the Metaphone algorithm as follows:
Metaphone is a phonetic algorithm, an algorithm published in 1990 for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar [...]
Metaphone 3 specifically
[...] achieves an accuracy of approximately 99% for English words, non-English words familiar to Americans, and first names and family names commonly found in the United States, having been developed according to modern engineering standards against a test harness of prepared correct encodings.
The overview of the algorithm is:
The Metaphone algorithm operates by first removing non-English letters and characters from the word being processed. Next, all vowels are also discarded unless the word begins with an initial vowel, in which case all vowels except the initial one are discarded. Finally, all consonants and groups of consonants are mapped to their Metaphone code. The rules for grouping consonants and groups thereof and then mapping them to Metaphone codes are fairly complicated; for a full list of these conversions, check out the comments in the source code section.
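As a rough illustration of the three steps in the overview quoted above, here is a minimal Python sketch. The rule tables (`DIGRAPHS`, `SINGLES`) and the function name `toy_metaphone` are hypothetical and cover only a handful of cases; this is neither the original Metaphone rule set nor anything taken from Metaphone 3.

```python
import re

# Hypothetical, heavily reduced rule tables for illustration only.
# The real Metaphone rules are context-sensitive and far more numerous.
DIGRAPHS = {"PH": "F", "SH": "X", "TH": "0", "CK": "K"}
SINGLES = {
    "B": "B", "C": "K", "D": "T", "F": "F", "G": "K", "J": "J",
    "K": "K", "L": "L", "M": "M", "N": "N", "P": "P", "Q": "K",
    "R": "R", "S": "S", "T": "T", "V": "F", "X": "KS", "Z": "S",
}
VOWELS = set("AEIOU")


def toy_metaphone(word: str) -> str:
    """Return a simplified phonetic key (toy rules, not real Metaphone)."""
    # Step 1: keep only the letters A-Z.
    letters = re.sub(r"[^A-Z]", "", word.upper())
    key = []
    i = 0
    while i < len(letters):
        pair = letters[i:i + 2]
        ch = letters[i]
        # Step 3a: map known two-letter consonant groups first.
        if pair in DIGRAPHS:
            key.append(DIGRAPHS[pair])
            i += 2
            continue
        if ch in VOWELS:
            # Step 2: a vowel survives only as the first letter of the word.
            if i == 0:
                key.append(ch)
        elif ch in SINGLES:
            # Step 3b: map single consonants, collapsing doubled letters.
            if i == 0 or letters[i - 1] != ch:
                key.append(SINGLES[ch])
        # Letters without a rule here (H, W, Y) are simply dropped.
        i += 1
    return "".join(key)
```

With these toy rules, "Philips" and "Phillips" both encode to `FLPS`, and "Jackson" and "Jaxon" both encode to `JKSN`, which is the kind of matching a phonetic key is meant to provide.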
Now, onto your real question:
If you are interested in the specifics of the Metaphone 3 algorithm, I think you are out of luck, short of buying the source code, understanding it, and re-creating it on your own. The whole point of not making the algorithm public (the source you can buy is one instance of it) is that you cannot recreate it without paying the author for their development effort; providing the "precise algorithm" you are looking for is equivalent to providing the actual code itself. Consider the quotes above: the development of the algorithm involved a "test harness of [...] encodings". Unless you happen to have such a test harness or are able to create one, you will not be able to replicate the algorithm.
On the other hand, implementations of the first two iterations (Metaphone and Double Metaphone) are freely available (the Wikipedia article quoted above contains a score of links to implementations in various languages for both). That gives you a good starting point for understanding what the algorithm is about, and you can then improve on it as you see fit (e.g. by creating and using an appropriate test harness).
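A test harness of the kind mentioned above could look roughly like the following Python sketch. Everything here is an assumption for illustration: `run_harness`, `placeholder_encode`, and the three entries in `PREPARED` are made up and are not taken from the Metaphone 3 test data. In practice you would plug in a real encoder and a much larger list of hand-checked encodings.

```python
from typing import Callable, Dict, List, Tuple

Failure = Tuple[str, str, str]  # (word, expected key, actual key)


def run_harness(
    encode: Callable[[str], str], expected: Dict[str, str]
) -> Tuple[float, List[Failure]]:
    """Compare an encoder against prepared encodings; return accuracy and misses."""
    failures: List[Failure] = []
    for word, want in expected.items():
        got = encode(word)
        if got != want:
            failures.append((word, want, got))
    total = len(expected)
    # An empty table counts as fully accurate to avoid dividing by zero.
    accuracy = (total - len(failures)) / total if total else 1.0
    return accuracy, failures


def placeholder_encode(word: str) -> str:
    """Stand-in encoder (hypothetical): keep the first letter, drop later vowels."""
    word = word.upper()
    return word[:1] + "".join(c for c in word[1:] if c not in "AEIOU")


# Hypothetical prepared encodings; a real harness would hold thousands.
PREPARED = {"Smith": "SM0", "Bob": "BB", "Anna": "ANN"}

accuracy, failures = run_harness(placeholder_encode, PREPARED)
print(f"accuracy: {accuracy:.0%}")
for word, want, got in failures:
    print(f"{word}: expected {want}, got {got}")
```

Each failure points at a rule that is missing or wrong, so the loop of "add a rule, rerun the harness" is how an encoder of this kind would be refined over time.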

Actually, Metaphone3 is an algorithm with many very specific rules that came out of analyzing test cases. So it is not only a pure algorithm; it comes with extra domain knowledge. Obtaining this knowledge and these specific rules took the author a great deal of effort. That's why the algorithm is not open-source.
There is, however, an open-source alternative: Double Metaphone. See here: https://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/language/DoubleMetaphone.html
The link posted by @Bo now points to the entire source tree of the (now defunct) project.
Here is a new direct link to the source code for Metaphone 3: https://searchcode.com/codesearch/view/2366000/
by Lawrence Philips