How to parse_url without scheme in sql within snowflake?

2.1k Views Asked by At

I am trying to pull hostname and path from urls without scheme but parse_url doesn't works. What all other options do I have?

2

There are 2 best solutions below

0
On

PARSE_URL does not work without scheme -

SNOWFLAKE1#COMPUTE_WH@TEST_DB.PUBLIC>SELECT PARSE_URL('www.snowflake.com');
100139 (22000): Error parsing URL: scheme not specified
SNOWFLAKE1#COMPUTE_WH@TEST_DB.PUBLIC>SELECT PARSE_URL('http://www.snowflake.com');
+---------------------------------------+
| PARSE_URL('HTTP://WWW.SNOWFLAKE.COM') |
|---------------------------------------|
| {                                     |
|   "fragment": null,                   |
|   "host": "www.snowflake.com",        |
|   "parameters": null,                 |
|   "path": null,                       |
|   "port": null,                       |
|   "query": null,                      |
|   "scheme": "http"                    |
| }                                     |
+---------------------------------------+
1 Row(s) produced. Time Elapsed: 0.229s

Wondering if you are looking for something like below -

Split a column into multiple columns by delimiter and recombine from last column to first using SQL

SNOWFLAKE1#COMPUTE_WH@TEST_DB.PUBLIC>select value from table(split_to_table('com.dqm.abcv.xyz.google', '.'));
+--------+
| VALUE  |
|--------|
| com    |
| dqm    |
| abcv   |
| xyz    |
| google |
+--------+
5 Row(s) produced. Time Elapsed: 0.566s
0
On

The snowflake parse_url function has the option to set the level of "permissive".

By defult, the function will error out when it encounters a url without a scheme. This sounds like the issue you are having.

If you set permissive to 1, then the function will write the error to the column and continue on. I have used this to get around some dirty data in the url field in the past.

example: Failing query:

select
parse_url(evnt_landing_url):parameters:msclkid::string as msclkid,
EVENT_TIME 
FROM table;

result:

SQL Error [100139] [22000]: Error parsing URL: scheme not specified

Passing query:

select
parse_url(evnt_landing_url,1):parameters:msclkid::string as msclkid,
EVENT_TIME 
FROM table;

Further reading: PARSE_URL(, []) From: https://docs.snowflake.com/en/sql-reference/functions/parse_url