Databricks pyspark - specify schema nullability of dataframe returned by spark.sql()


Is there a way to specify the schema of a PySpark DataFrame returned by df = spark.sql(...)? Specifically, I am looking for a way to mark certain columns as nullable = false.

This answer shows that you can change the schema by creating a new DataFrame with spark.createDataFrame(df.rdd, df.schema), but as a comment there notes, this is very costly.


There is 1 answer below.


The field attribute is named nullable, not nullability:

df.schema["column name"].nullable = False

Note, however, that df.schema returns a copy of the schema, so mutating it in place does not change the DataFrame itself; you still need to rebuild the DataFrame with the modified schema.