How to handle VectorAssembler errors in PySpark?


Can anyone please help with this?

The same code runs fine in my Google Colab but throws an error in Databricks.


It seems that the VectorAssembler is not accepting the data type 'double' in most columns. However, when I check the data types in the DataFrame before applying the pipeline, none of the columns is 'double'. Does that mean the data type changed when StringIndexer or OneHotEncoder (OHE) was applied, since these are the only stages in the pipeline before the VectorAssembler?
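One way to check this is to run the pipeline stages one at a time and look at which columns each stage adds. StringIndexer does write its index column as 'double', so double columns can appear mid-pipeline even when the input has none. The Spark ML documentation also describes VectorAssembler as accepting numeric, boolean, and vector columns, so it may be worth reading the full error message to see which column and type it names. The sketch below is only an illustration: `trace_dtypes` is a hypothetical helper, and `df` and `stages` are assumed to be the DataFrame and the stage list passed to `Pipeline(stages=...)`.

```python
def trace_dtypes(df, stages):
    """Apply pipeline stages one by one and yield the columns each one adds.

    Assumes `df` is a pyspark.sql.DataFrame and `stages` is the list given to
    Pipeline(stages=...). Yields (stage class name, [(column, type), ...]).
    """
    for stage in stages:
        before = set(df.dtypes)  # df.dtypes is a list of (name, type) pairs
        # Estimators (e.g. StringIndexer) must be fitted first;
        # plain transformers (e.g. VectorAssembler) are applied directly.
        model = stage.fit(df) if hasattr(stage, "fit") else stage
        df = model.transform(df)
        added = [col for col in df.dtypes if col not in before]
        yield type(stage).__name__, added


# Example usage (assumes `df` and `stages` already exist in the notebook):
# for name, added in trace_dtypes(df, stages):
#     print(name, added)
```

Because the helper is a generator, the output for the earlier stages is printed before any exception raised by a later stage, which shows the schema the failing stage actually received.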
