Newbie here on spark... how can I use a column in spark dataset ask key to get some values and add the values as new column to the dataset?
In python, we have something like:
df.loc[:,'values'] = df.loc[:,'key'].apply(lambda x: D.get(x))
where D is a function in python defined earlier.
how can I do this in spark using Java? thank you.
Edit: for example: I have a following dataset df:
A
1
3
6
0
8
I want to create a weekday column based on the following dictionary:
D[1] = "Monday"
D[2] = "Tuesday"
D[3] = "Wednesday"
D[4] = "Thursday"
D[5] = "Friday"
D[6] = "Saturday"
D[7] = "Sunday"
and add the column back to my dataset df:
A days
1 Monday
3 Wednesday
6 Saturday
0 Sunday
8 NULL
This is just an example, column A could be anything other than integers of course.
df.withColumn
to return a new df with the new columnvalues
and the previous values of df.udf
function (user defined functions) to apply the dictionary mapping.here's a reproducible example: