Databricks delta live table throws error saying, "can't use generated columns"

104 Views Asked by Dhaval Patodiya At 04 March 2024 at 02:47

I am using delta live tables for learning.

So, I created tables that look like this:

/* Silver Layer */

CREATE LIVE TABLE SSliverLive
(
  RowID         INT             COMMENT 'Primary key',  
  OrderDate     DATE,
  ShipDate      DATE,
  OrderYear     INT             GENERATED ALWAYS AS (YEAR (OrderDate)),
  ShipYear      INT             GENERATED ALWAYS AS (YEAR (ShipDate)),
)
USING DELTA
AS
SELECT *
FROM live.SBronzeLive;

/* Gold Layer */

CREATE LIVE TABLE SGoldLive
AS
SELECT rowid
       , OrderYear
       , COUNT(RowId) AS tt
FROM live.SSliverLive
GROUP BY rowid, orderyear;

But, the DLT pipelines throw the below error:

A generated column cannot use a non-existent column or another generated column.

I followed the SQLBits tutorial online and they can use the generated columns.

Removing the columns works fine but the Databricks website says I can use it.

Original Q&A

There are 1 best solutions below

JayashankarGS On 05 March 2024 at 12:33

The documentation states that generated column values are automatically generated based on a user-specified function over other columns in the Delta table.

In your gold table, you mention only OrderYear, which is evaluated as YEAR(OrderDate) AS OrderYear, but you don't have OrderDate defined in the gold table, so it raises an error.

You can use OrderYear only when you include OrderDate in the SELECT statement, as shown below:

CREATE LIVE TABLE SGoldLive
 AS 
SELECT
OrderYear,
OrderDate
FROM live.SSliverLive

If you include OrderDate in the SELECT statement, you will need to add it to the GROUP BY clause, which may not produce the desired results.

Therefore, you should perform a GROUP BY on OrderYear without selecting it.

Here's an example using MONTH:

Silver table:

CREATE LIVE TABLE SSliverLive
(
  RowID         LONG             COMMENT 'Primary key',  
  OrderDate     DATE,
  orderMonth    INT              GENERATED ALWAYS AS (month(OrderDate))
)
USING DELTA
AS
SELECT
o_orderkey as RowID,
o_orderdate as OrderDate
FROM LIVE.SBronzeLive

Gold table:

CREATE LIVE TABLE SGoldLive
 AS 
SELECT
COUNT(RowId) AS tt
FROM live.SSliverLive
group by orderMonth

This produces output with a single column.

Alternatively, you can use the original column:

CREATE LIVE TABLE SGoldLive
 AS 
SELECT 
month(OrderDate) as ordMonth,
COUNT(RowId) AS tt
FROM live.SSliverLive
group by month(OrderDate)

Outputs:

enter image description here

Databricks delta live table throws error saying, "can't use generated columns"

There are 1 best solutions below

Related Questions in DATABRICKS

Related Questions in AZURE-DATABRICKS

Related Questions in DATABRICKS-SQL

Related Questions in DELTA-LIVE-TABLES

Related Questions in DATABRICKS-UNITY-CATALOG

Trending Questions

Popular # Hahtags

Popular Questions