Transpose column to row of column headers in PySpark

I have the following dataframe in PySpark:

+------------+---------+---------+
| id         | day     | quantity|
+------------+---------+---------+
|id001       | Mon     | 9       |
|id001       | Tue     | 8       |
|id001       | Wed     | 7       |
|id002       | Mon     | 10      |
|id002       | Tue     | 10      |
|id002       | Wed     | 11      |
|id003       | Mon     | 1       |
|id003       | Tue     | 2       |
|id003       | Wed     | 3       |
+------------+---------+---------+

I would like to reshape it so that each id becomes a column header, like so:

+-------+-------+-------+
| id001 | id002 | id003 |
+-------+-------+-------+
| 9     | 10    | 1     |
| 8     | 10    | 2     |
| 7     | 11    | 3     |
+-------+-------+-------+

Callum Brown On

I found the answer; super simple:

df = df.groupBy('day').pivot('id').sum('quantity').drop('day')
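
To make it clearer what the groupBy / pivot / sum chain does, here is a rough plain-Python sketch of the same reshaping on the question's data. It does not use Spark at all; the helper name pivot_sum and the list-of-tuples input are made up for illustration, and sorting the ids into the header is just a choice for a stable column order.

```python
from collections import defaultdict

# Same rows as the question's dataframe: (id, day, quantity)
rows = [
    ("id001", "Mon", 9), ("id001", "Tue", 8), ("id001", "Wed", 7),
    ("id002", "Mon", 10), ("id002", "Tue", 10), ("id002", "Wed", 11),
    ("id003", "Mon", 1), ("id003", "Tue", 2), ("id003", "Wed", 3),
]

def pivot_sum(rows):
    """Hypothetical helper: group by day, spread ids into columns, sum quantity."""
    cells = defaultdict(int)   # (day, id) -> summed quantity
    days, ids = [], []         # remembered in first-seen order
    for id_, day, qty in rows:
        cells[(day, id_)] += qty
        if day not in days:
            days.append(day)
        if id_ not in ids:
            ids.append(id_)
    header = sorted(ids)       # one output column per distinct id
    # One output row per day; a missing (day, id) pair becomes None
    table = [[cells.get((day, i)) for i in header] for day in days]
    return header, table

header, table = pivot_sum(rows)
print(header)
for line in table:
    print(line)
```

Each output row corresponds to one day, which is why the values line up as 9/10/1, 8/10/2, 7/11/3. One thing to keep in mind with the Spark version: the row order after groupBy is not guaranteed, so if the Mon/Tue/Wed order matters, it is probably safer to sort on an explicit ordering column before dropping 'day'.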