A date error keeps on occurring when I run my code. I've parsed the date but my suspicion is I need to parse it elsewhere in the code. I've tried doing this but I'm obviously missing something.
When t1 is run separately I don't get any errors so I suspect the issue is in t2 or t3.
The data being queried is the public GA4 dataset so you can try this yourself.
Many thanks in advance.
WITH t1 AS (
SELECT
PARSE_DATE('%Y%m%d', event_date) AS event_date,
user_pseudo_id,
COUNT(DISTINCT user_pseudo_id) AS users,
MAX(CASE WHEN params.key = "ga_session_id" THEN params.value.int_value ELSE 0 END) AS sessionId,
CASE WHEN event_name = "first_visit" THEN 1 ELSE 0 END AS newUsers,
COUNT(ecommerce.transaction_id) AS conversions,
SUM(ecommerce.purchase_revenue) AS totalRevenue
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*` AS ga,
UNNEST (event_params) AS params
WHERE
_table_suffix BETWEEN '20210101' AND '20210131'
GROUP BY event_date, user_pseudo_id, event_name, ecommerce.transaction_id
),
t2 AS (
SELECT
user_pseudo_id,
COUNT(ecommerce.transaction_id) AS conv,
AVG(COUNT(ecommerce.transaction_id))
OVER (
PARTITION BY user_pseudo_id
ORDER BY user_pseudo_id
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS avg_conversion,
CAST(MAX(event_date) AS DATE) AS most_recent_purchase,
CAST(MIN(event_date) AS DATE) AS first_purchase
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*` AS ga,
UNNEST (event_params) AS params
WHERE
_table_suffix BETWEEN '20210101' AND '20210131'
GROUP BY user_pseudo_id, event_date
),
t3 AS (
SELECT
user_pseudo_id,
DATE_DIFF(
t2.most_recent_purchase, t2.first_purchase, DAY
) / 30.0 AS day_in_between_purchases,
t2.avg_conversion AS conversions
FROM t2
GROUP BY user_pseudo_id, most_recent_purchase, first_purchase, avg_conversion
)
SELECT
t1.event_date,
t4.newUsers,
t4.conversions,
t4.totalRevenue
FROM (
SELECT
t1.user_pseudo_id,
t1.event_date,
SUM(newUsers) AS newUsers,
SUM(conversions) AS conversions,
CONCAT('$', IFNULL(SUM(totalRevenue), 0)) AS totalRevenue
FROM
t1
GROUP BY t1.user_pseudo_id, t1.event_date
) AS t4
LEFT JOIN t3 ON t3.user_pseudo_id = t4.user_pseudo_id
LEFT JOIN t2 ON t2.user_pseudo_id = t4.user_pseudo_id
WHERE t2.conv > 0;
As a quick observation … while you parse the
event_dateint1, you don't do that int2.So … if
event_dateain’t no date not, I would not take it granted, your SQL dialect (MS SQL Server?) can justcastit.Besides that … there should be some helpful error messages (sometimes them even come with positional information).