I'm struggling to see how I would represent the following type of postgres SQL query in a cube.js schema:
SELECT
CASE
WHEN COUNT(tpp.net_total_amount) > 0 THEN
SUM(tpp.net_total_amount) / COUNT(tpp.net_total_amount)
ELSE
NULL
END AS average_spend_per_customer
FROM
(
SELECT
SUM(ts.total_amount) AS net_total_amount
FROM
postgres.transactions AS ts
WHERE
ts.transaction_date >= '2020-11-01' AND
ts.transaction_date < '2020-12-01'
GROUP BY
ts.customer_id,
ts.event_id
) AS tpp
;
I had the feeling that pre-aggregations might be what I'm after, but that doesn't seem to be the case after looking into them. I can get a list of total amount spent per customer per event with the following schema:
cube(`TransactionTotalAmountByCustomerAndEvent`, {
sql: `SELECT * FROM postgres.transactions`,
joins: {
},
measures: {
sum: {
sql: `SUM(total_amount)`,
type: `number`
}
},
dimensions: {
eventId: {
sql: `event_id`,
type: `string`
},
customerId: {
sql: `customer_id`,
type: `string`
},
transactionDate: {
sql: `transaction_date`,
type: `time`
}
},
preAggregations: {
customerAndEvent: {
type: `rollup`,
measureReferences: [sum],
dimensionReferences: [customerId, eventId]
}
}
});
But that is really just giving me the output of the inner SELECT statement grouped by customer and event. How do I query the cube to get the average customer spend per event figure I'm after?
You might find it easier to model the dataset as two different cubes,
CustomersandTransactions. You'll then need to set up a join between the cubes and then create a special dimension with thesubQueryproperty set totrue. I've included an example below to help you understand:You can find more information on the Subquery page on the documentation