I am trying to create a query that will pivot over some rows but will SUM
some columns and then group them together. I've used the PIVOT
function before but I am running into issues when my result set contains similar values.
This is SQL Server 2012.
Sample code:
CREATE TABLE #Foo
(
Store varchar(50),
Employee varchar(50),
Sold money,
Waste money,
Tmsp datetime
)
INSERT INTO #Foo
VALUES
('Harrisburg', 'John', 20.00, 10.00, GETDATE()),
('Harrisburg', 'John', 20.00, 10.00, GETDATE()),
('Harrisburg', 'Jim', 20.00, 10.00, GETDATE()),
('Seattle', 'Jim', 20.00, 10.00, GETDATE()),
('Seattle', 'Alex', 20.00, 10.00, GETDATE())
SELECT
Store,
SUM(Sold) TotalSold,
SUM([John]) WastedByJohn,
SUM([Jim]) WastedByJim,
SUM([Alex]) WastedByAlex
FROM
#Foo
PIVOT
(SUM(Waste)
FOR Employee IN ([John], [Jim], [Alex])
) PVT
GROUP BY
Store
DROP TABLE #Foo
This yields the following results:
Store | TotalSold | WastedByJohn | WastedByJim | WastedByAlex
Harrisburg | 20.00 | 20.00 | 10.00 | NULL
Seattle | 20.00 | NULL | 10.00 | 10.00
Shouldn't the TotalSold for Harrisburg be 60.00 and the TotalSold for Seattle be 40.00 based on the data in the table?
It gets harder for me to understand, because if I change the data so that the values aren't the same, I get the correct results.
INSERT INTO #Foo
VALUES
('Harrisburg', 'John', 25.00, 10.00, GETDATE()),
('Harrisburg', 'John', 30.00, 10.00, GETDATE()),
('Harrisburg', 'Jim', 40.00, 10.00, GETDATE()),
('Seattle', 'Jim', 50.00, 10.00, GETDATE()),
('Seattle', 'Alex', 60.00, 10.00, GETDATE())
This set of data yields the expected result:
Store | TotalSold | WastedByJohn | WastedByJim | WastedByAlex
Harrisburg | 95.00 | 20.00 | 10.00 | NULL
Seattle | 110.00 | NULL | 10.00 | 10.00
I looked around for a bit and couldn't find an answer as to why the PIVOT would be different based on distinct values when it comes to aggregation. I feel like there's something fundamental that I'm missing here, unless I just happened to come across some issue with SQL Server which is unlikely.
Any help would be greatly appreciated.
Thanks!
The following query should give you what you want:
To understand why you get unexpected results, try your query without the
GROUP BY
clause:Output:
Now, try the same again with second version of sample data:
Output:
By comparing the 2 different result sets you can clearly see that
PIVOT
takes place for every combination of columns not participating in it, i.e. for every combination ofStore
,Sold
.In first case there is only
Harrisburg,20,00
andSeattle,20,00
. That's why you only get two rows in this case. In second case you have a total of 3 + 2 = 5 combinations.You can now see why
GROUP BY
works only in the second case.