How to split pascal case address in SnowFlake?

189 Views Asked by At

In order to create PlaceKey for addresses to link some of my tables, I need to split an address column in SnowFlake.

I am not familiar with JavaScript, but I tried Javascript UDF in SnowFlake. Then I don't know how to deal with the addresses like '123_45ThSt'.

The output of my function is like '123_45 Th St'. I am stuck here.

The expected output is '123 45Th St'. Hope someone could help me out. Much appreciated!

Below is another example and my SnowFlake SQL code:

 Original address column: 12345NE17ThSt

 The expected column:     12345 NE 17Th St

 My function's output:    12345 NE17 ST

My function:

CREATE OR REPLACE FUNCTION Split_On_Upper_Case(s string)
 RETURNS string
 LANGUAGE JAVASCRIPT 
 AS '
 function Split_On_Upper_Case(str){
 str=str.split(/(?=[A-Z])/).join(" ")
 return str
 }

 // Now call the function
 return Split_On_Upper_Case(S);
 '
 ;
1

There are 1 best solutions below

4
On BEST ANSWER

Assuming the format of street address, which includes number + word (ends with lower case or number) + word (start with upper case), I have below solution:

CREATE OR REPLACE FUNCTION Split_On_Upper_Case(s string)
RETURNS string
LANGUAGE JAVASCRIPT 
AS $$
  regexp = /([0-9]+)(NE|SE|NW|SW)?(.*[0-9a-z]{1})([A-Z][a-zA-Z0-9]+)/g;
  splits = regexp.exec(S.replace(/_/g, " "));
  if (splits && splits.length == 5) {
     return 
        splits[1].trim() + " " + 
        (splits[2] ? splits[2].trim() + " ": "" ) + 
        splits[3].trim() + " " + 
        splits[4].trim();
  }
  
  return "not found" // or whatever you want to do
$$;

Then try to run the function:

select Split_On_Upper_Case('12345NE17ThSt');
-- 12345 NE 17Th St

select Split_On_Upper_Case('123_45ThSt');
-- 123 45Th St

select Split_On_Upper_Case('35TestSt');
-- 35 Test St

It returns expected output, but if you have more sample inputs, they can help to validate.