Transpose column to row of column headers in PySpark

138 Views Asked by Callum Brown At 06 March 2023 at 14:53

I have the following DataFrame in PySpark:

+------------+---------+---------+
| id         | day     | quantity|
+------------+---------+---------+
|id001       | Mon     | 9       |
|id001       | Tue     | 8       |
|id001       | Wed     | 7       |
|id002       | Mon     | 10      |
|id002       | Tue     | 10      |
|id002       | Wed     | 11      |
|id003       | Mon     | 1       |
|id003       | Tue     | 2       |
|id003       | Wed     | 3       |
+------------+---------+---------+

I would like to reshape it so that each id becomes a column header, like this:

+-------+--------+--------+
| id001 | id002  | id003  |
+-------+--------+--------+
|9      | 10     | 1      |
|8      | 10     | 2      |
|7      | 11     | 3      |
+-------+--------+--------+

There is 1 best solution below

Callum Brown On 06 March 2023 at 15:07

I found the answer; super simple:

df = df.groupBy('day').pivot('id').sum('quantity').drop('day')
