I have working PySpark code that uses the STACK function to turn columns into rows, creating a display_name column as it does so.
I am trying to convert this to a Snowflake query; in theory, my thought was that a LATERAL JOIN should be able to do the job.
The below is very similar to what I have in PySpark, but for some reason I can't get it to work. I keep getting the below error:
Invalid expression [T.ACTIVE_SHARE_PCT] in VALUES clause
SELECT
t.as_of_date,
t.product_code,
v.*
FROM (
SELECT
to_date(dd.date) AS as_of_date,
pd.product_code AS product_code,
pv.*
FROM
integration.product_valuation_accounting_fact_v01 pv
JOIN integration.date_dim_v01 dd
ON pv.valuation_date_key = dd.date_key
JOIN integration.product_dim_v01 pd
ON pv.product_key = pd.product_key
AND dd.date BETWEEN pd.start_date AND pd.end_date
LEFT JOIN integration.currency_dim_v01 cd
ON pv.base_currency_key = cd.currency_key
WHERE
dd.date = '2023-06-30'
) AS t
CROSS JOIN LATERAL (
VALUES
('Active Share End Weight', t.active_share_pct, null),
('Wtd Avg Market Cap Base', t.weighted_avg_mkt_cap_base, null),
('Price to Book Ratio – Latest Reported', t.price_book_ratio_latest_rptd, null)
) AS v (display_name, portfolio, benchmark)
I can't really use an UNPIVOT clause because I need to include any NULL values; plus, I would be unpivoting two columns, the portfolio and the benchmark.
So if I make some fake tables (as CTEs) that have the columns you use in your joins, and run this:
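(the tables below are made up; the keys and values exist just to satisfy the joins, and the integration. prefixes are dropped so the CTEs resolve)

WITH product_valuation_accounting_fact_v01 AS (
    SELECT * FROM VALUES
        (1, 10, 100)
        AS v (valuation_date_key, product_key, base_currency_key)
), date_dim_v01 AS (
    SELECT * FROM VALUES
        (1, '2023-06-30'::date)
        AS v (date_key, date)
), product_dim_v01 AS (
    SELECT * FROM VALUES
        (10, 'prod_a', '2023-01-01'::date, '2023-12-31'::date)
        AS v (product_key, product_code, start_date, end_date)
), currency_dim_v01 AS (
    SELECT * FROM VALUES
        (100, 'USD')
        AS v (currency_key, currency_code)
)
SELECT
    t.as_of_date,
    t.product_code,
    v.*
FROM (
    SELECT
        to_date(dd.date) AS as_of_date,
        pd.product_code AS product_code,
        pv.*
    FROM product_valuation_accounting_fact_v01 pv
    JOIN date_dim_v01 dd
        ON pv.valuation_date_key = dd.date_key
    JOIN product_dim_v01 pd
        ON pv.product_key = pd.product_key
        AND dd.date BETWEEN pd.start_date AND pd.end_date
    LEFT JOIN currency_dim_v01 cd
        ON pv.base_currency_key = cd.currency_key
    WHERE dd.date = '2023-06-30'
) AS t
CROSS JOIN LATERAL (
    VALUES
        ('Active Share End Weight', t.active_share_pct, null),
        ('Wtd Avg Market Cap Base', t.weighted_avg_mkt_cap_base, null),
        ('Price to Book Ratio – Latest Reported', t.price_book_ratio_latest_rptd, null)
) AS v (display_name, portfolio, benchmark)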
I get an invalid identifier 'T.ACTIVE_SHARE_PCT' error, which is different to your error. Given t only has 2 known columns plus "all of pv's columns", I have to assume these are on pv, so we add those:
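WITH product_valuation_accounting_fact_v01 AS (
    SELECT * FROM VALUES
        (1, 10, 100, 0.5, 2000, 1.8)  -- the three measure values are invented
        AS v (valuation_date_key, product_key, base_currency_key,
              active_share_pct, weighted_avg_mkt_cap_base, price_book_ratio_latest_rptd)
)
-- (the other CTEs and the query itself are unchanged)

yippie, it seems we have reproduced your problem:

Invalid expression [T.ACTIVE_SHARE_PCT] in VALUES clause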
Right, now that I read the code more: you are wanting to make 3 rows for each input row, and take one of three different values, but those values are coming from the input row, thus the LATERAL. This is not really going to fly, because the VALUES clause only accepts constants. Instead, you should cross join to 3 constant rows, and then use a CASE to access the values you wish, like:
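(a sketch run against the fake CTEs above; with your real tables, keep the integration. prefixes)

SELECT
    t.as_of_date,
    t.product_code,
    v.display_name,
    CASE v.idx
        WHEN 1 THEN t.active_share_pct
        WHEN 2 THEN t.weighted_avg_mkt_cap_base
        WHEN 3 THEN t.price_book_ratio_latest_rptd
    END AS portfolio,
    null AS benchmark
FROM (
    SELECT
        to_date(dd.date) AS as_of_date,
        pd.product_code AS product_code,
        pv.*
    FROM product_valuation_accounting_fact_v01 pv
    JOIN date_dim_v01 dd
        ON pv.valuation_date_key = dd.date_key
    JOIN product_dim_v01 pd
        ON pv.product_key = pd.product_key
        AND dd.date BETWEEN pd.start_date AND pd.end_date
    LEFT JOIN currency_dim_v01 cd
        ON pv.base_currency_key = cd.currency_key
    WHERE dd.date = '2023-06-30'
) AS t
CROSS JOIN (
    VALUES
        (1, 'Active Share End Weight'),
        (2, 'Wtd Avg Market Cap Base'),
        (3, 'Price to Book Ratio – Latest Reported')
) AS v (idx, display_name)

giving:

AS_OF_DATE | PRODUCT_CODE | DISPLAY_NAME                          | PORTFOLIO | BENCHMARK
2023-06-30 | prod_a       | Active Share End Weight               | 0.5       | NULL
2023-06-30 | prod_a       | Wtd Avg Market Cap Base               | 2000      | NULL
2023-06-30 | prod_a       | Price to Book Ratio – Latest Reported | 1.8       | NULL

The CASE picks the measure matching each constant row's idx, which is exactly the STACK behaviour, and it keeps NULL measure values as rows rather than dropping them the way UNPIVOT would. If benchmark later comes from real columns, add a second CASE on v.idx for it.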