For People With A Similar Question (written after finding a solution):
This problem, as you might notice according to the answers below, has a lot of different solutions. I only chose Evan's because it was the easiest one for me implement into my own code. However, from what I tried, every other answer also worked. @SalvadorDali linked this Kaggle page which was definitely interesting and I reccomend reading if you are interested. Prolog was also brought up as a possible solution, I'm unfamiliar with it, but if you already know it -- it's probably worth considering. Also, if you just want to get code to use there are working Javascript and Python examples below. However, each one had a different approach to the solution and I'm not sure which is most effecient (feel free to test it yourself).
For further approaches/reading:
http://en.wikipedia.org/wiki/Breadth-first_search
Prolog and ancestor relationship
https://www.kaggle.com/c/word2vec-nlp-tutorial/details/part-2-word-vectors
Sorry for the confusing title, I can't figure out a way to properly word my question -- any better ideas are welcome.
Because I'm having such a difficult time describing my question, I'll try to explain my goal and code as much as needed:
Note: my code here is Go, but I'd be happy with answers in other languages as well, if you have any questions I'll try to answer as quick as possible
Basically, I have an array of "Word" objects that look like this:
type Word struct{
text string
synonyms []string
}
This is an example of 4 words within the array:
[]Word{
{text: "cat" synonyms: ["feline", "kitten", "mouser"]}
{text: "kitten" synonyms: ["kitty", "kit"]}
{text: "kit" synonyms: ["pack", "bag", "gear"]}
{text: "computer" synonyms: ["electronics", "PC", "abacus"]}
}
My challenge is writing a method to test for a relationship between 2 words. Of course, testing between 2 words like "cat" and "kitten" would be easy with the example above. I could just check "Cat"s list of synonyms and test to see if it contains "kitten." With code like this:
areWordsRelated(word1 Word, word2 Word) bool{
for _, elem := range word1.synonyms{
if elem == word2.text{
return true
}
}
return false
}
However, I can't figure out how to test for a more distant relationship.
For example:
areWordsRelated("cat","pack") //should return true
//because "cat" is related to "kitten" which is related to "pack"
areWordsRelated("cat", "computer") //should return false
I tried to do it recursively, but all my attempts don't seem to work. Any example code (My code is in Go, but Python, Java, or Javascript are also fine), pseudocode or just explanations would be really great.
If you give me some feedback on this I can edit it because it doesn't do exactly what you asked but it is the jist. I'll edit with a technical explanation of what has to be changed to meet your exact example.
EDIT: This doesn't do quite what you asked because it doesn't make the connection between "pack" and "cat" as pack is not represented by an actual word object and I defined the method to receive word2 as an object (just working off your example). I could instead make that a string so it can check for "pack" in the synonyms array of "kit" before returning but the idea is the same none the less... Here's a high level explanation of the algorithm.
Iterate the synonyms, if it isn't a match, find that
Word
object back in the original collection and call myself with it as the first argument. This will recursively exhaust every path until it finds a match, or there are none left in which case you're outside the loop returning false. The code above runs in go playground and correctly returnstrue\nfalse
. Notice that the recursive call is made within an if to protect from returning false prematurely (also a performance enhancement because we return as soon as true is found rather than continue to recurse the paths).https://play.golang.org/p/gCeY0SthU1