MySQL query to get count of repeating characters from a string

My target data/table:

mysql> select firstname from empl;
+------------+
| firstname  |
+------------+
| Abhishek   |
| Arnab      |
| Aamaaan    |
| Arbaaz     |
| Mohon      |
| Parikshit  |
| Tom        |
| Koustuv    |
| Amit       |
| Bibhishana |
| abCCdEeff  |
+------------+
11 rows in set (0.00 sec)

Desired output:

Return one row for each case-sensitive letter that occurs more than once in a first name. Each row has three columns: x is the firstname in which the repeated letter is found; y is the repeated letter, listed once per name; z is the number of times that letter occurs in the name.

+---------------+---------+---------+
| firstname (x) | str (y) | cnt (z) |
+---------------+---------+---------+
| Abhishek      | h       | 2       |
| Aamaaan       | a       | 4       |
| Arbaaz        | a       | 2       |
| Mohon         | o       | 2       |
| Parikshit     | i       | 2       |
| Koustuv       | u       | 2       |
| Bibhishana    | h       | 2       |
| Bibhishana    | i       | 2       |
| Bibhishana    | a       | 2       |
| abCCdEeff     | C       | 2       |
| abCCdEeff     | f       | 2       |
+---------------+---------+---------+

My best attempt so far:

WITH CTE AS
(
    SELECT firstname, CONVERT(LEFT(firstname,1),CHAR) AS Letter, RIGHT(firstname, LENGTH(firstname)-1) AS Remainder
    FROM empl
    WHERE LENGTH(firstname)>1
    UNION ALL
    SELECT firstname, CONVERT(LEFT(Remainder,1),CHAR) AS Letter,
        RIGHT(Remainder, LENGTH(Remainder)-1) AS Remainder
    FROM CTE
    WHERE LENGTH(Remainder)>0
)
SELECT firstname, Letter, ASCII(Letter) AS CharCode, COUNT(Letter) AS CountOfLetter
FROM CTE
GROUP BY firstname, Letter, ASCII(Letter)
HAVING COUNT(Letter)>2

There is 1 solution below

Use recursive CTEs to get all the letters A-Z and a-z and join to the table:

with recursive
  -- uppercase letters 'A' through 'Z'
  u_letters as (
    select 'A' letter
    union all
    select char(ascii(letter) + 1) from u_letters
    where letter < 'Z'
  ),
  -- lowercase letters 'a' through 'z'
  l_letters as (
    select 'a' letter
    union all
    select char(ascii(letter) + 1) from l_letters
    where letter < 'z'
  ),
  -- all 52 candidate letters
  letters as (
    select * from u_letters
    union all
    select * from l_letters
  ),
  -- one row per (name, letter) pair where the name contains the letter;
  -- binary makes the match case-sensitive
  results as (
    select e.firstname, l.letter,
      length(e.firstname) - length(replace(e.firstname, l.letter, '')) cnt
    from empl e inner join letters l
    on binary e.firstname like concat('%', l.letter, '%')
  )
-- keep only the letters that occur more than once
select * from results
where cnt > 1;
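
The count in the results CTE comes from a length difference: removing every occurrence of a letter shortens the string by exactly the number of occurrences. As a minimal illustration on one name from the table (MySQL's REPLACE is case-sensitive, so the leading 'A' is not removed; the column aliases here are only for illustration):

```sql
SELECT
  LENGTH('Aamaaan')                                       AS total_len,     -- 7
  LENGTH(REPLACE('Aamaaan', 'a', ''))                     AS len_without_a, -- 3 ('Amn')
  LENGTH('Aamaaan') - LENGTH(REPLACE('Aamaaan', 'a', '')) AS cnt_a;         -- 4
```

Because LENGTH counts bytes, the difference equals the occurrence count only when the removed letter is a single-byte character, which holds for A-Z and a-z.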

See the demo.
Results:

| firstname   | letter | cnt |
| ----------- | ------ | --- |
| Abhishek    | h      | 2   |
| Aamaaan     | a      | 4   |
| Arbaaz      | a      | 2   |
| Mohon       | o      | 2   |
| Parikshit   | i      | 2   |
| Koustuv     | u      | 2   |
| Bibhishana  | a      | 2   |
| Bibhishana  | h      | 2   |
| Bibhishana  | i      | 2   |
| abCCdEeff   | C      | 2   |
| abCCdEeff   | f      | 2   |
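
The per-character approach attempted in the question can probably be made to work as well. The sketch below is the question's own query with two changes, assuming MySQL 8.0 or later: the RECURSIVE keyword is added (MySQL requires it for a CTE that references itself), and the filter is `> 1` rather than `> 2`, so letters that occur exactly twice are kept. It also assumes the names contain only single-byte characters, since LENGTH counts bytes while LEFT and RIGHT count characters.

```sql
WITH RECURSIVE CTE AS
(
    -- anchor: split each name into its first letter and the remainder
    SELECT firstname, CONVERT(LEFT(firstname,1),CHAR) AS Letter, RIGHT(firstname, LENGTH(firstname)-1) AS Remainder
    FROM empl
    WHERE LENGTH(firstname)>1
    UNION ALL
    -- recursive step: peel one more letter off the remainder
    SELECT firstname, CONVERT(LEFT(Remainder,1),CHAR) AS Letter,
        RIGHT(Remainder, LENGTH(Remainder)-1) AS Remainder
    FROM CTE
    WHERE LENGTH(Remainder)>0
)
SELECT firstname, Letter, COUNT(Letter) AS CountOfLetter
FROM CTE
-- grouping by ASCII(Letter) as well keeps 'C' and 'c' in separate groups
-- even when the column uses a case-insensitive collation
GROUP BY firstname, Letter, ASCII(Letter)
HAVING COUNT(Letter)>1;
```

This version should return the same letters and counts as the results above, without needing a separate list of letters to join against.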