Teradata SQL to Extract Records Based on Approximate String Matching

Asked by Samir Parmar on 06 July 2016 at 23:18

We are on TD 14, and I come from a Netezza / Postgres (Redshift) background. I have been asked to extract login data from audit logs to find records/transactions where the same IP is submitting similar-looking usernames with small changes, e.g. Samir --> Samr --> Amir, in order to capture phishing activity. In Postgres we have fuzzy string functions such as the '%' operator, e.g. ColA % ColB (where % is a similarity test), plus Soundex, Metaphone, Levenshtein, etc. In Teradata, however, I have only been able to find Soundex. Is there any built-in function or method in Teradata 14 that can do this kind of approximate string matching?

Rob Paller On 07 July 2016 at 01:41

Teradata 14.x supports the Damerau-Levenshtein Distance algorithm via the EDITDISTANCE() function and n-gram pattern matching via the NGRAM() function.

You can find information about the EDITDISTANCE function here and the NGRAM() function here.
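
As a rough sketch of how EDITDISTANCE() could be applied to the question, the query below pairs up distinct usernames seen from the same IP and keeps the pairs that are only a few edits apart. The table name `audit_log`, the columns `ip_address` and `username`, and the threshold of 2 are all assumptions for illustration, not anything from the original post; adjust them to your schema and tolerance.

```sql
-- Hypothetical schema: audit_log(ip_address, username, ...)
-- Find pairs of different usernames submitted from the same IP
-- whose edit distance is small (threshold of 2 is an assumption).
SELECT a.ip_address,
       a.username AS username_a,
       b.username AS username_b,
       EDITDISTANCE(LOWER(a.username), LOWER(b.username)) AS edit_dist
FROM (SELECT DISTINCT ip_address, username FROM audit_log) a
JOIN (SELECT DISTINCT ip_address, username FROM audit_log) b
  ON  a.ip_address = b.ip_address
  AND a.username < b.username      -- each pair once, no self-pairs
WHERE EDITDISTANCE(LOWER(a.username), LOWER(b.username)) <= 2
ORDER BY a.ip_address, edit_dist;
```

With the usernames from the question, Samir/Samr and Samir/Amir are one edit apart and Samr/Amir is two, so all three pairs would fall under this threshold. The self-join is quadratic per IP, so on a large audit table it is probably worth restricting the date range first.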
