I am new to PySpark and have been trying out a few things.
I have a DataFrame as follows:
+----------+-----------+
|   Column1|    Column2|
+----------+-----------+
|    VALUE1|      30000|
|    VALUE2|      25000|
|    VALUE3|      20000|
|    VALUE4|      19500|
|    VALUE5|      18100|
+----------+-----------+
I want to add a new column, Column3, whose value is computed with the following formula:
CurrentRow[Column3] =
    IF (CurrentRow[Column2] > PreviousRow[Column3])
    THEN PreviousRow[Column3]
    ELSE CurrentRow[Column2] * 0.9
Expected output (the first row has no previous row, so it takes Column2 * 0.9):
+----------+------------------+------------------+
|   Column1|           Column2|           Column3|
+----------+------------------+------------------+
|    VALUE1|             30000|             27000|
|    VALUE2|             25000|             22500|
|    VALUE3|             20000|             18000|
|    VALUE4|             19500|             18000|
|    VALUE5|             18100|             18000|
+----------+------------------+------------------+
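Tracing the formula through the rows, as a quick check that it produces the Column3 values above (each row uses the Column3 just computed for the row before it):
VALUE1: no previous row, so 30000 * 0.9 = 27000
VALUE2: 25000 > 27000 is false, so 25000 * 0.9 = 22500
VALUE3: 20000 > 22500 is false, so 20000 * 0.9 = 18000
VALUE4: 19500 > 18000 is true, so Column3 = 18000 (the previous row's Column3)
VALUE5: 18100 > 18000 is true, so Column3 = 18000 (the previous row's Column3)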
I tried searching for a way to use the lag function on the same column that is being created with withColumn, but could not get it to work.
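One likely reason lag does not help here: Column3 in each row depends on Column3 of the row before it, and lag can only read a column that already exists, not the one being defined in the same withColumn call. The following is a minimal sketch of one possible workaround, not a definitive solution. It assumes the data is small enough to collect to the driver and that row order is given by Column1 (the question does not state the ordering key). The names `column3_values`, `add_column3`, `factor`, and `order_col` are made up for illustration.

```python
def column3_values(column2_values, factor=0.9):
    """Apply the recurrence from the question to values already in row order."""
    result = []
    previous = None  # Column3 of the previous row; None for the first row
    for value in column2_values:
        if previous is not None and value > previous:
            current = previous        # carry the previous Column3 forward
        else:
            current = value * factor  # first row, or Column2 <= previous Column3
        result.append(current)
        previous = current
    return result


def add_column3(spark, df, order_col="Column1"):
    """Return df with Column3 appended.

    Assumption: the DataFrame is small, because every row is collected
    to the driver and the recurrence is evaluated in plain Python.
    """
    rows = df.orderBy(order_col).collect()
    column3 = column3_values([row["Column2"] for row in rows])
    # Row objects are tuples, so each one can be extended with the new value.
    data = [tuple(row) + (float(c3),) for row, c3 in zip(rows, column3)]
    return spark.createDataFrame(data, df.columns + ["Column3"])


# Usage, assuming an active SparkSession named `spark`:
# df = spark.createDataFrame(
#     [("VALUE1", 30000), ("VALUE2", 25000), ("VALUE3", 20000),
#      ("VALUE4", 19500), ("VALUE5", 18100)],
#     ["Column1", "Column2"],
# )
# add_column3(spark, df).show()
```

Collecting to the driver gives up Spark's parallelism, so this only suits small tables. For larger data the same loop could in principle run per group inside a grouped pandas function, but that is a separate design and is not shown here.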