So I am exploring using BQ as the primary storage for my Data Lakehouse architecture.
However, one of the main features of the Data Lakehouse is that its **first layer** (raw, bronze, whatever it's called) is schema-on-read.
Are there any approaches where I could use BQ for my RAW layer with a schema-on-read approach?
Has anyone seen or done this? Is this a completely stupid question? :)
For example, I am loading data from an RDBMS (MSSQL, Oracle) via a BQ connector, and even if a column changes its data type, or a column is added or removed, everything still works and the data is ingested into BQ just fine. That means at this RAW stage I don't have to worry about managing schema evolution.
Thank you, DV
I am trying to build a data ingestion pattern with a schema-on-read approach where the sink is BQ. So far, I have found no options on paper.
BigQuery lets you define an external table over files stored in Cloud Storage, so you can use that option.
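A rough sketch of what that could look like, in case it helps. It only builds the `CREATE EXTERNAL TABLE` statement; the table name `my_project.raw.orders`, the bucket path, and the helper `external_table_ddl` are all made up for illustration, and I am assuming Parquet files. Because no column list is declared, BigQuery derives the schema from the files instead of a schema you maintain by hand.

```python
def external_table_ddl(table_id: str, uris: list[str], file_format: str = "PARQUET") -> str:
    """Build a CREATE EXTERNAL TABLE statement without a declared column list."""
    # Quote each Cloud Storage URI for the `uris` option array.
    uri_list = ", ".join(f"'{u}'" for u in uris)
    return (
        f"CREATE OR REPLACE EXTERNAL TABLE `{table_id}`\n"
        f"OPTIONS (\n"
        f"  format = '{file_format}',\n"
        f"  uris = [{uri_list}]\n"
        f")"
    )


# Hypothetical names: replace with your own project, dataset, and bucket.
ddl = external_table_ddl(
    "my_project.raw.orders",
    ["gs://my-raw-bucket/orders/*.parquet"],
)
print(ddl)

# To run it, assuming the google-cloud-bigquery package and credentials are set up:
#   from google.cloud import bigquery
#   bigquery.Client().query(ddl).result()
```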
However, before you choose BigQuery as your primary storage and query engine, please understand these two things.
We learned this the hard way: our data tripled when we moved to BigQuery, and so did the cost.