Snowflake-How to filter out numbers whose digits are all the same?

127 Views Asked by At

I am trying to filter out numbers whose all digits are the same- like '000000000','111111111111',etc.. I am hoping it could be done using REGEXP_LIKE in snowflake. But it is not working.

These are the things I have tried to test:

SELECT
  CASE
    WHEN REGEXP_LIKE(column, '^[0-9]\\1+$') THEN '0'
    ELSE '1'
  END AS D;

tried adding more backslashes for escape sequence problem. Still didn't give me the desired output.

6

There are 6 best solutions below

0
ADITYA PAWAR On

You can go with below solution, which would be more reliable and customizable

with data as (
select '1111111' num
union 
select '0000000' num
union
select '1234' num
)
select *
,case when array_size(ARRAY_DISTINCT(regexp_extract_all(num,'.'))) = 1 THEN '0' ELSE '1' END AS D
from data
 ;
0
Dave Welden On

Alternate solution not using ARRAY. This also only flags entries with repeating digits and ignores repeating non-digit characters. If you know the column will only have digits, the first condition of the WHEN can be omitted (see the commented line below).

with data as (
    select '1111111' num
    union 
    select '0000000' num
    union
    select '1234' num
    union
    select 'aaaa' num
)
select num
,case
    when (
        left(trim(num), 1) regexp '[0-9]'  -- Only needed if non-digits may be present
        and trim(num) = repeat(left(trim(num), 1),len(trim(num)))
    )  then '0'
    else '1'
 end AS d
from data;
NUM D
1111111 0
0000000 0
1234 1
aaaa 1
0
Simeon Pilgrim On

Using a Scalar JavaScript Function, we can access real regular expressions

CREATE OR REPLACE FUNCTION uniform_digit_check(STR VARCHAR)
  RETURNS VARCHAR
  LANGUAGE JAVASCRIPT
  AS $$
  return STR.match(/^(\d)\1+$/)?'0':'1';
  $$
  ;

we can then use like so:

select
    $1 as val
    ,uniform_digit_check(val)
from values
    ('111111'),
    ('111100'),
    ('0000000000'),
    ('abc');

enter image description here

0
DuesserBaest On

Not familiar with SQL and snowflake, but I think simply:

^(1+|2+|3+|4+|5+|6+|7+|8+|9+|0+)$

should do the trick aswell, explicitly stating all 10 options of digits that could be repeated.

See: regex101.com/r/Uh9Cn7/latest


Borrowing OPs SQL code, the solution could look like this:

SELECT
  CASE
    WHEN REGEXP_LIKE(column, '^(1+|2+|3+|4+|5+|6+|7+|8+|9+|0+)$') THEN '0'
    ELSE '1'
  END AS D;
0
Alexander Mashin On

Try this:

SELECT column
FROM some_table
WHERE REPEAT (LEFT (column, 1), LENGTH (column)) = column -- same character repeated.
  AND LENGTH (column) > 1 -- more than one digit.
  AND columnt > 9 -- add this, only if column is int, and there sought to be more than one digits.
  AND REGEXP_LIKE(column, '^[0-9]+$') -- add this, only if column is not int.
0
Kathmandude On

Tweaking @DuesserBaest's answer just a little

select col, regexp_count(col,'^(0+|1+|2+|3+|4+|5+|6+|7+|8+|9+)$') as is_monodigit
from t