Reshape long to wide table in SQL

55 Views Asked by At

I have a dataset with variables ID, Vaccine, and Date. Some Vaccines have a single date, and some vaccines have two or three dates for a single ID. When I create a table in SQL, multiple rows are created for the IDs that have two or more dates for a vaccine. See an example below:

ID Vaccine Date
A Flu 1/15/2005
A Flu 3/25/2005
B Flu 1/15/2005
B TB 1/15/2005
B TB 2/15/2005
B TB 3/15/2005
B Flu 4/18/2005
C Flu 1/15/2005
C TB 5/21/2005

How can I create a table as below in SQL?

ID Vaccine Date1 Date2 Date3
A Flu 1/15/2005 3/25/2005
B Flu 1/15/2005 4/18/2005
B TB 1/15/2005 2/15/2005 3/15/2005
C Flu 1/15/2005
C TB 5/21/2005

Thanks

2

There are 2 best solutions below

0
On BEST ANSWER

You can try something like this:

SELECT 
    ID,
    Vaccine,
    MAX(CASE WHEN rn = 1 THEN Date END) AS Date1,
    MAX(CASE WHEN rn = 2 THEN Date END) AS Date2,
    MAX(CASE WHEN rn = 3 THEN Date END) AS Date3
FROM (
    SELECT 
        ID,
        Vaccine,
        Date,
        ROW_NUMBER() OVER (PARTITION BY ID, Vaccine ORDER BY Date) AS rn
    FROM 
        YourTableName
) AS subquery
GROUP BY 
    ID, Vaccine;

In this query:

  1. The inner subquery assigns a row number (rn) to each row within the same ID and Vaccine partition, ordered by the Date.
  2. The outer query uses CASE expressions to pivot the dates based on their row number (rn). The MAX() function is used to aggregate the dates for each ID and Vaccine combination.

Make sure to replace YourTableName with the actual name of your table. This query will transform the data into the format you desire, with Date1, Date2, and Date3 columns indicating the different dates for each ID and Vaccine combination.

0
On

To reshape a long tb to a wide tb you can use a combination of conditional aggregation.

Example

SELECT
    ID,
    Vaccine,
    MAX(CASE WHEN rn = 1 THEN Date END) AS Date1,
    MAX(CASE WHEN rn = 2 THEN Date END) AS Date2,
    MAX(CASE WHEN rn = 3 THEN Date END) AS Date3
FROM (
    SELECT
        ID,
        Vaccine,
        Date,
        ROW_NUMBER() OVER (PARTITION BY ID, Vaccine ORDER BY Date) AS rn
    FROM TBName
) AS Ranked
GROUP BY ID, Vaccine, rn
HAVING MAX(CASE WHEN rn = 1 THEN Date END) IS NOT NULL;