SELECT only a specific set of characters (PostgreSQL)

753 Views Asked by At

I have a table with a specific set of characters containing the value, it's decimal and hexadecimal equivalents.

/-------------------------\
|value|decimal|hexadecimal|
|-------------------------|
|    a|     97|       0061|
|    b|     98|       0062|
|    c|     99|       0063|
|    d|    100|       0064|
|-------------------------|

How can I select a column ignoring any character that is not in this table?

Edit 1: Let's say this table is called "tb_abcd" and a wanna select the column "user_name" from the table "tb_users".

Table content:

/---------\
|user_name|
|---------|
| Samuel  |
| Paul    |
| Chris   |
|---------|

Expected output (based on the character set from "tb_abcd"):

/---------\
|user_name|
|---------|
|    A    |
|    A    |
|    C    |
|---------|

I know this is a silly example, but the original table has a set of 251 characters.

2

There are 2 best solutions below

1
On BEST ANSWER

So long as none of your 251 characters interfere with regex, this silly solution should work.

with keeps as (
  select string_agg(value, '') as letters
    from tb_abcd 
)
select *, 
       upper(regexp_replace(u.user_name, '[^'||k.letters||']', '', 'gi'))
  from tb_users u
 cross join keeps k;

 user_name | letters | upper 
-----------+---------+-------
 Samuel    | abcd    | A
 Paul      | abcd    | A
 Chris     | abcd    | C
(3 rows)
0
On

If I understand correctly you would use regexp_replace():

select regexp_replace(t.user_name, r.regex, '', 'g')
from t cross join lateral
     (select '[^' || string_agg(value, '') || ']' as regex
      from content
     ) r

Note: You need to take care if any of the characters are valid in a regular expression. That is not the case for your sample data.

Here is a db<>fiddle. This returns different results from what you specify -- I can't figure out where those come from. This does do what you describe.