MySQL collation query results

109 Views Asked by At

does anyone has explaination why :

SELECT * FROM MY_TABLE WHERE 1 = 1 AND libelle COLLATE latin1_general_ci LIKE '%dég%'

returns 1 record (only the record with é) while

SELECT * FROM MY_TABLE WHERE 1 = 1 AND libelle COLLATE latin1_swedish_ci LIKE '%dég%'

returns 4 record (of course including the one above) ?

According to MySQL doc latin1_general_ci is "Multilingual (Western European) case insensitive" so should not it manage accents like latin1_swedish_ci ?

Thanks Nicolas

1

There are 1 best solutions below

0
On

I suspect you're misunderstanding what collation is.

Collation is a set of rules used in natural language (Swedish, English, Russian, Japanese...) to determine the dictionary order of words. In relational databases, this is used to sort data (e.g. ORDER BY clauses) and to compare data (e.g. WHERE clauses or unique indexes). A couple of examples:

  • If you need to order by country name in English you get this:

    1. Canada
    2. China
    3. Colombia


    However, in traditional Spanish ch used to be an independent letter so correct order was this:

    1. Canada
    2. Colombia
    3. China
  • In Swedish, å is a separate letter so you can have a login name like ångström even if you already have angström. In other languages they'd be duplicates and would not be allowed.

Collation is not something you use to display emojis and other Unicode characters. That's just encoding (ISO-8859-1, UTF-8, UTF-16... whatever).