fuzzy search in django postgresql without using Elasticsearch

376 Views Asked by At

I try to incorporate fuzzy serach function in a django project without using Elasticsearch.

1- I am using postgres, so I first tried levenshtein, but it did not work for my purpose.

class Levenshtein(Func):
    template = "%(function)s(%(expressions)s, '%(search_term)s')"
    function = "levenshtein"

    def __init__(self, expression, search_term, **extras):
        super(Levenshtein, self).__init__(
            expression,
            search_term=search_term,
            **extras
        )
items = Product.objects.annotate(lev_dist=Levenshtein(F('sort_name'), searchterm)).filter(
                            lev_dist__lte=2
                        )

Search "glyoxl" did not pick up "4-Methylphenylglyoxal hydrate", because levenshtein considered "Methylphenylglyoxal" as a word and compared with my searchterm "glyoxl".

2. trigram_similar gave weird results and was slow

items = Product.objects.filter(sort_name__trigram_similar=searchterm)
  • "phnylglyoxal" did not pick up "4-Methylphenylglyoxal hydrate", but picked up some other similar terms: "4-Hydroxyphenylglyoxal hydrate", "2,4,6-Trimethylphenylglyoxal hydrate"

  • "glyoxl" did not pick up any of the above terms

3. python package, fuzzywuzzy seems can solve my problem, but I was not able to incorporate it into query function.

ratio= fuzz.partial_ratio('glyoxl', '4-Methylphenylglyoxal hydrate') 

# ratio = 83

I tried to use fuzz.partial_ratio function in annotate, but it did not work.

items = Product.objects.annotate(ratio=fuzz.partial_ratio(searchterm, 'full_name')).filter(
                            ratio__gte=75
                        )

Here is the error message:

QuerySet.annotate() received non-expression(s): 12.

According to this stackoverflow post (1), annotate does not take regular python functions. The post also mentioned that from Django 2.1, one can subclass Func to generate a custom function. But it seems that Func can only take database functions such as levenshtein.

Any way to solve these problems? thanks!

0

There are 0 best solutions below