While mapping a Dataset, I keep running into the problem that columns are renamed from _1, _2, etc. to value, value.
What is causing the rename?
That's because map on a Dataset causes the rows to be deserialized and the results to be serialized again in Spark. To serialize the results, Spark must know the Encoder. That's why there is an object ExpressionEncoder with an apply method. Its JavaDoc says:
Please look at the last point. Your query just maps to primitives, so Catalyst uses the name "value".
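To make the renaming concrete, here is a minimal sketch, assuming a local SparkSession and a made-up case class Named (it is not from the question). It maps a Dataset of tuples to Int and reports the column names at each step, then applies a select-and-rename. It uses the $"value" column syntax from spark.implicits._ rather than the 'value symbol form.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical case class, introduced only for this illustration.
case class Named(MyPropertyName: Int)

object ValueColumnDemo {
  // Returns the column names before map, after map, and after the rename.
  def demoColumns(spark: SparkSession): (Seq[String], Seq[String], Seq[String]) = {
    import spark.implicits._

    // A Dataset of tuples: the tuple encoder names the columns _1 and _2.
    val pairs = Seq((1, "a"), (2, "b")).toDS()

    // Mapping to a primitive (Int) switches to the primitive encoder,
    // whose single column defaults to the name "value".
    val ints = pairs.map(_._1)

    // Renaming the column and converting to a case class gives a meaningful name.
    val named = ints.select($"value".as("MyPropertyName")).as[Named]

    (pairs.columns.toSeq, ints.columns.toSeq, named.columns.toSeq)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master is acceptable for this demonstration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("value-column-demo")
      .getOrCreate()

    val (before, afterMap, renamed) = demoColumns(spark)
    println(before.mkString(", "))   // _1, _2
    println(afterMap.mkString(", ")) // value
    println(renamed.mkString(", "))  // MyPropertyName

    spark.stop()
  }
}
```

The case class is declared at the top level so that Spark can derive an encoder for it through spark.implicits._.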
If you add .select('value.as("MyPropertyName")).as[CaseClass], the field names will be correct.
Types that will have the column name "value":