How do I implement a custom comparator in the Python Dedupe library?


I'm using the Dedupe library, which has been great so far, to link records from multiple providers. One of the fields I'm comparing is a phone number, and I'd like to use Google's phone number library to normalize it. That library can also compare two numbers and return a match type from 0 (not at all a match) to 4 (every component matches exactly).

This seems like a natural fit for Dedupe's Custom variable type, but I'm unsure what the custom comparator implementation should look like. The example in the docs only returns a simple 0 or 1 for match/non-match.

I basically want to ensure that, behind the scenes, my custom comparator will indicate to Dedupe that a 4 means the phone numbers are very close and a 0 means they're very far apart.

Will that work, or do I have to return the value some other way? For example, do I have to indicate an exact match with 0?
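To make the question concrete, here is a minimal sketch of what such a comparator might look like. It assumes the comparator should behave like a distance (0 for an exact match, larger for worse matches), mirroring the 0/1 example in the docs, so the 0 to 4 match level is inverted and scaled to the range 0.0 to 1.0. The names `make_phone_comparator` and `toy_match_level` are hypothetical; `toy_match_level` is only a stand-in so the sketch runs without extra dependencies, and a real version would presumably call the phone number library's own matching function instead. Returning 1.0 for empty values is also just a choice made for this sketch. The dict-style field definition at the end follows the form shown in the Dedupe docs at the time of asking and may differ in other versions.

```python
from typing import Callable, Optional

MAX_MATCH_LEVEL = 4  # the question's scale: 0 = no match, 4 = exact match


def toy_match_level(a: str, b: str) -> int:
    """Hypothetical stand-in for a real phone-number matcher (returns 0, 2 or 4)."""
    digits_a = "".join(ch for ch in a if ch.isdigit())
    digits_b = "".join(ch for ch in b if ch.isdigit())
    if not digits_a or not digits_b:
        return 0
    if digits_a == digits_b:
        return 4
    # One number is the other plus a prefix (e.g. a country code).
    if digits_a.endswith(digits_b) or digits_b.endswith(digits_a):
        return 2
    return 0


def make_phone_comparator(
    match_level: Callable[[str, str], int],
) -> Callable[[Optional[str], Optional[str]], float]:
    """Wrap a 0-4 match-level function as a distance-style comparator."""

    def compare(phone_1: Optional[str], phone_2: Optional[str]) -> float:
        # Sketch choice: treat empty values as maximally distant.
        if not phone_1 or not phone_2:
            return 1.0
        level = max(0, min(MAX_MATCH_LEVEL, match_level(phone_1, phone_2)))
        # Invert and scale: 4 -> 0.0 (identical), 0 -> 1.0 (far apart).
        return 1.0 - level / MAX_MATCH_LEVEL

    return compare


phone_comparator = make_phone_comparator(toy_match_level)

# Dict-style field definition (assumed form; check the installed Dedupe version).
fields = [
    {"field": "phone", "type": "Custom", "comparator": phone_comparator},
]
```

Whichever direction is chosen, the important part is probably consistency: the returned number should always increase (or always decrease) as the match gets worse, so that the model can learn a single weight for the field.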

