BigQuery - Extract substring from column


I have a column, let's say called "assets", with values that look like this:

//bigquery.googleapis.com/projects/ABC/datasets/123

//bigquery.googleapis.com/projects/BlaBla-something/datasets/12345

//bigquery.googleapis.com/projects/ProjectName/datasets/6789

I want to extract into another column only part of the text from the first column: the part after "//bigquery.googleapis.com/projects/" and without the trailing part such as "/datasets/123". In the result I want only the text in between, in a new column named "asset_code", like this:

**Result**

| assets | asset_code |
|---|---|
| //bigquery.googleapis.com/projects/ABC/datasets/123 | ABC |
| //bigquery.googleapis.com/projects/BlaBla-something/datasets/12345 | BlaBla-something |
| //bigquery.googleapis.com/projects/ProjectName/datasets/6789 | ProjectName |

Would you be able to advise?

I tried SUBSTR, REGEXP_EXTRACT, and SPLIT but got stuck.


There are 2 best solutions below


In all three examples you gave, the code is always the third path segment once the leading "//" is removed, so you can just use the SPLIT function; you don't need REGEXP_EXTRACT to get the values. Below is the SQL code:

select SPLIT(REPLACE('//bigquery.googleapis.com/projects/ProjectName/datasets/6789', '//', ''),'/')[OFFSET(2)] as asset_code

This is the same SQL applied to the column:

select assets, SPLIT(REPLACE(assets, '//', ''),'/')[OFFSET(2)] as asset_code from your_table
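
To try the SPLIT approach without touching a real table, here is a minimal sketch that inlines the three sample rows from the question. The CTE name "sample" is only a placeholder, and SAFE_OFFSET is swapped in for OFFSET as a defensive assumption: it returns NULL rather than raising an error when a value has fewer than three segments.

```sql
-- "sample" is a hypothetical inline table holding the rows from the question.
WITH sample AS (
  SELECT assets
  FROM UNNEST([
    '//bigquery.googleapis.com/projects/ABC/datasets/123',
    '//bigquery.googleapis.com/projects/BlaBla-something/datasets/12345',
    '//bigquery.googleapis.com/projects/ProjectName/datasets/6789'
  ]) AS assets
)
SELECT
  assets,
  -- Remove '//' (only the prefix in this data), split on '/',
  -- and take the third element (zero-based index 2).
  -- SAFE_OFFSET yields NULL instead of an error on shorter paths.
  SPLIT(REPLACE(assets, '//', ''), '/')[SAFE_OFFSET(2)] AS asset_code
FROM sample;
```

Note that this relies on the project name always sitting in the same position; if the path layout can vary, a pattern-based extraction is probably the safer choice.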

Consider also the approach below:

select assets,
  regexp_extract(assets, r'//bigquery\.googleapis\.com/projects/(.*?)/') as asset_code
from your_table

Applied to the sample data in your question, this query returns ABC, BlaBla-something, and ProjectName as asset_code.
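
As a variation on the regular-expression approach, the sketch below anchors only on "/projects/" and captures everything up to the next slash, so it does not depend on the host name. The CTE name "sample" is a placeholder that inlines the rows from the question; whether the looser pattern is appropriate depends on what other values the real column can contain.

```sql
-- "sample" is a hypothetical inline table holding the rows from the question.
WITH sample AS (
  SELECT assets
  FROM UNNEST([
    '//bigquery.googleapis.com/projects/ABC/datasets/123',
    '//bigquery.googleapis.com/projects/BlaBla-something/datasets/12345',
    '//bigquery.googleapis.com/projects/ProjectName/datasets/6789'
  ]) AS assets
)
SELECT
  assets,
  -- With a single capturing group, REGEXP_EXTRACT returns the group's text:
  -- one or more non-slash characters right after '/projects/'.
  -- Rows that do not match the pattern produce NULL.
  REGEXP_EXTRACT(assets, r'/projects/([^/]+)') AS asset_code
FROM sample;
```

Using [^/]+ instead of a lazy .*? also means the match does not require a trailing slash, so a value that ends right after the project name would still be captured.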