Let's say I have 3 tables: Invoices, Charges, and Payments. An invoice can have multiple charges, and a charge can have multiple payments.
Doing a simple join, data would look like this:
invoiceid | chargeid | charge | payment
----------------------------------
1 | 1 | 50 | 50
2 | 2 | 100 | 25
2 | 2 | 100 | 75
2 | 3 | 30 | 10
2 | 3 | 30 | 5
If I do a join with sums,
select i.invoiceid, sum(charge) as charges, sum(payment) as payments
from invoices i
inner join charges c on i.invoiceid = c.invoiceid
inner join payments p on p.chargeid = c.chargeid
group by i.invoiceid
The sum of payments would be correct, but the sum of charges would include duplicates, because each charge row is repeated once per payment:
invoiceid | charges | payments
--------------------------------------
1 | 50 | 50
2 | 260 | 115
I want a query to get a list of invoices with the sum of payments and sum of charges per invoice, like this:
invoiceid | charges | payments
--------------------------------------
1 | 50 | 50
2 | 130 | 115
Is there any way to do this by modifying the query above WITHOUT using subqueries, since subqueries can be quite slow when dealing with a large amount of data? I feel like there must be a way to only include unique charges in the sum.
One way is to aggregate each table on the grouping value before joining, so that the join never multiplies rows.
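A minimal sketch of that pre-aggregation approach, assuming the column names used in the question's join (charges.invoiceid, charges.chargeid, charges.charge, payments.chargeid, payments.payment). The alias `paid` and the use of a left join with coalesce are my own choices, not something from the question. Payments are summed per charge first, so each charge row appears exactly once when the charges are summed per invoice:

```sql
-- Sketch only: table and column names are taken from the question's query;
-- the alias "paid" is made up for illustration.
select i.invoiceid,
       sum(c.charge)            as charges,
       sum(coalesce(p.paid, 0)) as payments
from invoices i
inner join charges c on c.invoiceid = i.invoiceid
left join (
    -- collapse payments to one row per charge before joining
    select chargeid, sum(payment) as paid
    from payments
    group by chargeid
) p on p.chargeid = c.chargeid
group by i.invoiceid;
```

Against the sample rows in the question this gives 50 / 50 for invoice 1 and 130 / 115 for invoice 2. Note that the left join keeps charges that have no payments yet (counted as 0), which the original inner join would have dropped. The derived table is still technically a subquery, but it is not correlated.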
Another way would be to compute the sums inline, per invoice, using correlated subqueries.
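A rough sketch of the correlated version, again assuming the column names from the question's join. Each sum is computed separately for the current invoice row, so charges and payments never meet in the same join and nothing is double counted:

```sql
-- Sketch only: table and column names are taken from the question's query.
select i.invoiceid,
       -- total of all charges on this invoice
       (select sum(c.charge)
          from charges c
         where c.invoiceid = i.invoiceid) as charges,
       -- total of all payments against this invoice's charges
       (select sum(p.payment)
          from payments p
          inner join charges c on c.chargeid = p.chargeid
         where c.invoiceid = i.invoiceid) as payments
from invoices i;
```

With the sample data this also returns 50 / 50 and 130 / 115. Invoices with no charges or no payments come back with NULL in that column; wrap the subquery in coalesce(..., 0) if you would rather see 0.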