How to do phonetic search in a DSE Graph Gremlin query?

I've gone through the phrase, tokenFuzzy, and fuzzy search options described in the DSE Graph documentation. But these are not sufficient for a case such as Anish and Aneesh, which are spelled differently but pronounced more or less the same.

Is there a way to use the underlying Solr phonetic search capabilities in Gremlin queries?

Another similar pair of words is Muller and Mueller.

There is 1 best solution below

Because DSE Graph is integrated with DSE Search, there is no special Gremlin syntax for this. Instead, you change the Solr core for the vertex label, adding the desired filter to the analyzer chain. Something like the following could be used:

ALTER SEARCH INDEX SCHEMA ON graph_name.vertex_label_p
  ADD types.fieldType [ @name='phonetic' , @class='org.apache.solr.schema.TextField' ]
  WITH
    $${
      "analyzer": [
        {
          "filter": [
            { "class": "solr.LowerCaseFilterFactory" },
            {
              "class": "solr.BeiderMorseFilterFactory",
              "nameType": "GENERIC",
              "ruleType": "APPROX",
              "concat": "true",
              "languageSet": "auto"
            }
          ],
          "tokenizer": { "class": "solr.StandardTokenizerFactory" }
        }
      ]
    }$$;
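
Note that the statement above only defines a field type; a field has to reference that type before it affects indexing or queries. A minimal sketch, assuming a hypothetical text property called name on the vertex label (the field name is an assumption, so substitute your own):

```cql
-- Assumption: 'name' is a hypothetical indexed text property on this vertex label.
-- Point the existing field at the 'phonetic' field type defined above.
ALTER SEARCH INDEX SCHEMA ON graph_name.vertex_label_p
  SET field[@name='name']@type='phonetic';
```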

...and then to make those changes active...

RELOAD SEARCH INDEX ON graph_name.vertex_label_p;
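
Reloading activates the pending schema, but rows indexed earlier were analyzed with the old chain. If the vertex label already holds data, a reindex is probably needed as well. A sketch, using the same placeholder table name as above:

```cql
-- Reindex so that existing rows pass through the new analyzer chain.
REBUILD SEARCH INDEX ON graph_name.vertex_label_p;
```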

Other users have used Beider-Morse Phonetic Matching (BMPM) with DSE Search, but I'm not aware of anyone using it with Graph to date.

With any Solr extension added to the analyzer, you may see some increase in hardware utilization, CPU in particular.