I have a PySpark DataFrame with multiple columns, and a list containing the values of one of those columns. I want to sort the rows so that this column follows the order of the given list.
| col_A | col_B | col_c |
|---|---|---|
| a1 | b1 | c1 |
| a2 | b2 | c2 |
| a3 | b3 | c3 |
`col_A_itm_order = ['a2', 'a3', 'a1']`
Expected output
| col_A | col_B | col_c |
|---|---|---|
| a2 | b2 | c2 |
| a3 | b3 | c3 |
| a1 | b1 | c1 |
I found a similar question for Pandas dataframe, but not for PySpark.
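To make the goal concrete, here is a rough sketch of the kind of thing I have in mind: turn each list item into its position and sort on that position. The pure-Python part just illustrates the intended ordering on the example rows. `sort_by_list` is a hypothetical helper name of mine, not an existing PySpark function, and it assumes the list is non-empty and that a chained `when` expression is an acceptable sort key; I have not confirmed this is the recommended way.

```python
col_A_itm_order = ['a2', 'a3', 'a1']

# Position of each value in the desired order, e.g. 'a2' -> 0.
rank = {value: pos for pos, value in enumerate(col_A_itm_order)}

# Plain-Python illustration of the ordering I want, using the example rows.
rows = [('a1', 'b1', 'c1'), ('a2', 'b2', 'c2'), ('a3', 'b3', 'c3')]
expected = sorted(rows, key=lambda row: rank[row[0]])


def sort_by_list(df, column, order):
    """Hypothetical helper: sort `df` so `column` follows `order`.

    Assumes `order` is non-empty. Values not in `order` are placed last.
    """
    # Imported here so the illustration above runs without PySpark installed.
    from pyspark.sql import functions as F

    # Build CASE WHEN column = order[0] THEN 0 WHEN column = order[1] THEN 1 ...
    key = F.when(F.col(column) == order[0], 0)
    for pos, value in enumerate(order[1:], start=1):
        key = key.when(F.col(column) == value, pos)

    # Anything not listed gets a rank past the end of the list.
    return df.orderBy(key.otherwise(len(order)))
```

With a DataFrame `df` holding the table above, the call would be `sort_by_list(df, 'col_A', col_A_itm_order)`.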