Somehow in Spark 2.0, I can use DataFrame.map(r => r.getAs[String]("field")) without problems.
But Dataset.map(r => r.getAs[String]("field")) gives an error saying that r doesn't have the "getAs" method.
What's the difference between r in a Dataset and r in a DataFrame, and why does r.getAs only work with a DataFrame?
After doing some research on Stack Overflow, I found a helpful answer here:
"Encoder error while trying to map dataframe row to updated row"
Hope it's helpful
Dataset has a type parameter: class Dataset[T]. T is the type of each record in the Dataset. That T might be anything (well, anything for which you can provide an implicit Encoder[T], but that's beside the point).
A map operation on a Dataset applies the provided function to each record, so the r in the map operations you showed will have the type T.
Lastly, DataFrame is actually just an alias for Dataset[Row], which means each record has the type Row. And Row has a method named getAs that takes a type parameter and a String argument, hence you can call getAs[String]("field") on any Row. For any T that doesn't have this method, this will fail to compile.
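
To make the difference concrete, here is a minimal sketch, assuming Spark 2.x running in local mode. The Person case class, its age column, and the sample data are hypothetical and only there for illustration; the column name "field" is taken from the question.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Row, SparkSession}

// Hypothetical record type, used only for illustration.
case class Person(field: String, age: Int)

object GetAsExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("getAs-example")
      .getOrCreate()
    // Brings the implicit Encoders (for String, case classes, ...) into scope.
    import spark.implicits._

    val ds: Dataset[Person] = Seq(Person("a", 1), Person("b", 2)).toDS()
    val df: DataFrame = ds.toDF() // DataFrame is an alias for Dataset[Row]

    // Here r is a Row, so getAs is available.
    val fromDf: Dataset[String] = df.map((r: Row) => r.getAs[String]("field"))

    // Here r is a Person, so the field is accessed directly.
    val fromDs: Dataset[String] = ds.map(r => r.field)

    // This would not compile, because Person has no getAs method:
    // ds.map(r => r.getAs[String]("field"))

    fromDf.show()
    fromDs.show()
    spark.stop()
  }
}
```

If you have a typed Dataset but still want Row-style access, calling toDF() on it first gives you a Dataset[Row], at the cost of losing the compile-time field checks.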