I am attempting to make the logic for the opponent in a scrabble game.
I have done a lot of thinking and come to the conclusion that I need to use anagrams
and check those anagrams against a list of words in a dictionary file to see if the generated word is in fact a word contained in the dictionary file.
The issue I'm having is optimization
. Because this anagram uses recursion
and runs up to 8 factorial, there tends to be many 'junk' words generated that do not exist in any dictionary, such as repetitions of a single letter.
There has to be some sort of check to see if the permutations are valid and not just a repetition of 1 character. So far, I'm at a loss for how to do this both quickly and accurately.
In English, words seem to be formed by both vowels and consonants. I was thinking of checking if a word contains at least 1 vowel and at least 1 consonant, however there are some exceptions where words can contain only vowels or only consonants. So this method seems to go out the window.
Now I may be missing something crucial, but short of brute-forcing my way through all permutations I have no real idea of how to check in a way that is sufficiently fast enough for gameplay.
My question is:
Can anyone suggest a method that will work 100% of the time to optimize the number of permutations generated?
I don't need useless ones generated and these turn out to be the bulk of what is generated.
I believe this to a good approach, however at the same time I believe that I must be missing something that is much faster and more appropriate for what I want to achieve.
If anyone could suggest a way to check if words are actually viable or not, OR if you could suggest a better way of approaching the situation, it'd be greatly appreciated.
Thanks.
(disclaimer: pseudocode may not be valid java, even though it looks like it)
It sounds like you have a jumbled collection of letters, and want to find all of the English words that can be spelled using them.
Two strings are anagrams of one another if they compare equal when you sort both of them. Permuting the order of letters in your candidate word to see if any of them are legal English words is expensive. Instead, sort the letters and compare it against your word list:
This is more efficient if the number of words in your word list is less than the factorial of the size of your candidate word. For instance, the number of legal words in Words With Friends is about 170,000, so you would prefer the above method for checking words of length 9 or more.
If you plan to check a lot of candidate words, then you can save time by saving the sorted forms of all of your valid words. Create a dictionary where the key is a sorted string, and the value is a list of English words that are an anagram of that string. It should look like this:
You can construct this dictionary once at the beginning of your program, like so:
then finding valid anagrams for a string is just a matter of accessing the dictionary.