I'm trying to add a "CASE WHEN ... ELSE ..." calculated column to an existing DataFrame using the Scala API. Starting DataFrame:
color
Red
Green
Blue
Desired DataFrame (SQL syntax: CASE WHEN color = 'Green' THEN 1 ELSE 0 END AS bool):
color bool
Red 0
Green 1
Blue 0
How should I implement this logic?
In the upcoming Spark 1.4.0 release (which should be out in the next couple of days), you can use the when/otherwise syntax:
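A minimal sketch of the when/otherwise approach, assuming the column names `color` and `bool` from the question. The object name `WhenOtherwiseExample` and the helper `addBool` are hypothetical, introduced only for illustration:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.when

// Hypothetical wrapper object; not part of Spark.
object WhenOtherwiseExample {
  // Adds a "bool" column: 1 where color is "Green", 0 otherwise.
  // Roughly: CASE WHEN color = 'Green' THEN 1 ELSE 0 END AS bool
  def addBool(df: DataFrame): DataFrame =
    df.withColumn("bool", when(df("color") === "Green", 1).otherwise(0))
}
```

Calling `WhenOtherwiseExample.addBool(df)` on the starting DataFrame should return a new DataFrame with the extra column; the original `df` is not modified.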
If you are using Spark 1.3.0, you can choose to use a UDF instead:
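A rough sketch of the UDF approach, again assuming the `color` and `bool` column names from the question. The names `UdfExample`, `isGreen`, and `addBool` are made up for this example:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.udf

// Hypothetical wrapper object; not part of Spark.
object UdfExample {
  // UDF that maps a color string to 1 for "Green" and 0 for anything else.
  val isGreen = udf((color: String) => if (color == "Green") 1 else 0)

  // Adds a "bool" column by applying the UDF to the "color" column.
  def addBool(df: DataFrame): DataFrame =
    df.withColumn("bool", isGreen(df("color")))
}
```

The UDF is opaque to the query optimizer, so the built-in when/otherwise form is probably the better choice once it is available to you.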