I'm struggling to see how I would represent the following type of postgres SQL query in a cube.js schema:
SELECT
CASE
WHEN COUNT(tpp.net_total_amount) > 0 THEN
SUM(tpp.net_total_amount) / COUNT(tpp.net_total_amount)
ELSE
NULL
END AS average_spend_per_customer
FROM
(
SELECT
SUM(ts.total_amount) AS net_total_amount
FROM
postgres.transactions AS ts
WHERE
ts.transaction_date >= '2020-11-01' AND
ts.transaction_date < '2020-12-01'
GROUP BY
ts.customer_id,
ts.event_id
) AS tpp
;
I had the feeling that pre-aggregations might be what I'm after, but that doesn't seem to be the case after looking into them. I can get a list of total amount spent per customer per event with the following schema:
cube(`TransactionTotalAmountByCustomerAndEvent`, {
sql: `SELECT * FROM postgres.transactions`,
joins: {
},
measures: {
sum: {
sql: `SUM(total_amount)`,
type: `number`
}
},
dimensions: {
eventId: {
sql: `event_id`,
type: `string`
},
customerId: {
sql: `customer_id`,
type: `string`
},
transactionDate: {
sql: `transaction_date`,
type: `time`
}
},
preAggregations: {
customerAndEvent: {
type: `rollup`,
measureReferences: [sum],
dimensionReferences: [customerId, eventId]
}
}
});
But that is really just giving me the output of the inner SELECT statement grouped by customer and event. How do I query the cube to get the average customer spend per event figure I'm after?
You might find it easier to model the dataset as two different cubes,
Customers
andTransactions
. You'll then need to set up a join between the cubes and then create a special dimension with thesubQuery
property set totrue
. I've included an example below to help you understand:You can find more information on the Subquery page on the documentation