Prevent insertion of duplicate records in a table that has no foreign key and contains Arabic values

I am a beginner in ASP.NET Core.

I have a table that does not have a key column under certain conditions, and three of its columns are first name, last name, and father's name. Is there an algorithm or query that can prevent inserting duplicate records into the database? The column values are in Arabic.

It is very difficult to tell whether my combined key equals the values already stored in the database columns, because the comparison often fails over a single differing character.

Do you have a solution for comparing Arabic names, so that the comparison algorithm recognizes the same name even when it is written differently?

For example, الاء is the same as الا, but == does not recognize this.

Thanks for your help

Answer by Md Farid Uddin Kiron:

I have a table that does not have a key column under certain conditions, and three of its columns are first name, last name, and father's name. Is there an algorithm or query that can prevent inserting duplicate records into the database?

Well, there is no ready-made solution available. You should split the duplicate check into smaller steps in order to get the expected result.

First of all, you could use NormalizationForm with string.Normalize to bring the string into a known Unicode normalization form. NormalizationForm.FormD indicates that a Unicode string is normalized using full canonical decomposition, which separates base letters from their combining marks.

In the next stage, you should use Regex to remove the Unicode diacritics (nonspacing marks, \p{Mn}) so that they do not affect the comparison.

Finally, you should check the equality of the given names by using StringComparison.OrdinalIgnoreCase.
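To make the first two steps concrete, here is a minimal sketch of what decomposition and mark removal do to a single Arabic letter. The class and method names (NormalizationDemo, CodePoints, StripMarks) are hypothetical and exist only for illustration; the sketch relies solely on string.Normalize and Regex from the .NET base class library.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper for inspecting the effect of normalization.
public static class NormalizationDemo
{
    // Nonspacing marks (Unicode category Mn), which include the Arabic diacritics.
    private static readonly Regex NonSpacingMarks = new Regex(@"\p{Mn}");

    // Lists the UTF-16 code units of a string as "U+XXXX" tokens.
    public static string CodePoints(string text) =>
        string.Join(" ", text.Select(c => $"U+{(int)c:X4}"));

    // Decomposes the text (FormD) and then removes every nonspacing mark.
    public static string StripMarks(string text) =>
        NonSpacingMarks.Replace(text.Normalize(NormalizationForm.FormD), string.Empty);
}
```

For example, CodePoints("\u0623".Normalize(NormalizationForm.FormD)) returns "U+0627 U+0654": the letter alef with hamza above is split into a plain alef plus a combining hamza. StripMarks("\u0623") then leaves only the plain alef, U+0627.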

Let's look at how we could implement that in practice:

Static class for normalizing Arabic names and removing diacritics:

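using System;
using System.Text;
using System.Text.RegularExpressions;

public static class ArabicNameComparer
{
    // Matches Unicode nonspacing marks (category Mn), which include the Arabic diacritics.
    // Created once and reused instead of being rebuilt on every call.
    private static readonly Regex NonSpacingMarks = new Regex(@"\p{Mn}", RegexOptions.Compiled);

    public static bool AreEqual(string name1, string name2)
    {
        // Normalize both names so that spelling variants compare as equal.
        name1 = NormalizeArabicName(name1);
        name2 = NormalizeArabicName(name2);

        // Arabic has no letter case; IgnoreCase only affects any Latin characters.
        return string.Equals(name1, name2, StringComparison.OrdinalIgnoreCase);
    }

    private static string NormalizeArabicName(string name)
    {
        // Step 1: strip diacritics.
        name = RemoveDiacritics(name);

        // Step 2: unify alef and yeh variants.
        name = NormalizeHamza(name);

        // Step 3: recompose into the canonical composed form.
        name = name.Normalize(NormalizationForm.FormC);

        return name;
    }

    private static string RemoveDiacritics(string text)
    {
        // FormD separates base letters from combining marks so the marks can be removed.
        string normalized = text.Normalize(NormalizationForm.FormD);
        return NonSpacingMarks.Replace(normalized, string.Empty);
    }

    private static string NormalizeHamza(string text)
    {
        // Map alef with hamza or madda to bare alef, and alef maksura to yeh.
        // RemoveDiacritics already strips the decomposed hamza and madda marks,
        // so the alef replacements act as a safeguard.
        return text.Replace('أ', 'ا').Replace('إ', 'ا').Replace('آ', 'ا').Replace('ى', 'ي');
    }
}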
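Since the question is about three name columns rather than a single name, the same normalization could be applied to each part and combined into one comparison key. The following is only a sketch under that assumption: PersonKey, PersonRegistry, and the "|" separator are hypothetical names and choices, and the in-memory HashSet stands in for whatever storage the application really uses.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper: builds one normalized key from the three name columns.
public static class PersonKey
{
    private static readonly Regex NonSpacingMarks = new Regex(@"\p{Mn}");

    // Trims, decomposes, strips diacritics, maps alef maksura to yeh, then recomposes.
    public static string NormalizePart(string part)
    {
        if (string.IsNullOrWhiteSpace(part)) return string.Empty;
        string decomposed = part.Trim().Normalize(NormalizationForm.FormD);
        string stripped = NonSpacingMarks.Replace(decomposed, string.Empty);
        return stripped.Replace('\u0649', '\u064A').Normalize(NormalizationForm.FormC);
    }

    // Joins the normalized parts; assumes names never contain the '|' character.
    public static string Build(string first, string last, string father) =>
        string.Join("|", NormalizePart(first), NormalizePart(last), NormalizePart(father));
}

// Hypothetical in-memory stand-in for the table.
public sealed class PersonRegistry
{
    private readonly HashSet<string> _keys = new HashSet<string>(StringComparer.Ordinal);

    // Returns false when a person with an equivalent key was already added.
    public bool TryAdd(string first, string last, string father) =>
        _keys.Add(PersonKey.Build(first, last, father));
}
```

With a database, one possible design is to store the result of PersonKey.Build in an extra column and put a unique index on it, so that the database itself rejects the duplicate insert. Whether that fits depends on your schema.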

Test case:

To check whether two names are duplicates, I am considering two Arabic strings, "محمد" and "مَحْمُد". Let's check whether they are treated as duplicates:

string name1 = "محمد";
string name2 = "مَحْمُد";
bool equal = ArabicNameComparer.AreEqual(name1, name2);

Output:

[Image: screenshot of the result, where equal is true]

Note: As you can see, we can check whether the two given Arabic strings are duplicates. You should implement your own requirement in the same way. Keep in mind that this is just one way to do it, not an exact solution for your scenario, but you could proceed with this kind of approach. In addition, for any kind of custom implementation, please study this official document.