How to load a PySpark DataFrame into Cosmos DB when a single column holds different data types

91 Views Asked by goodWill At 16 June 2025 at 20:06

I am trying to load a PySpark DataFrame into a Cosmos DB container. One of my columns (rating) holds both string and integer values.

ID	rating
id1	5
id2	bad

I want to load the data into Cosmos with each value keeping its own data type. In PySpark I tried casting based on the value, similar to the line below. I have also tried variations of it, such as checking the values with rlike("^[0-9]+$").

df = df.withColumn("rating", when(col("rating").cast("int").isNotNull(), col("rating").cast("int")).otherwise(col("rating")))

But once I load the data into Cosmos, every value arrives as a string with quotes around it, for example "5" and "bad". What I want instead is 5 and "bad".
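For context, a Spark column carries a single data type, so as far as I understand a when/otherwise expression with an int branch and a string branch resolves back to string, which would explain the quoted "5". The shape I am after can be sketched in plain Python, outside Spark and without the Cosmos connector. This is only an illustration of the target documents; the names to_document and INT_PATTERN are hypothetical and not part of any library.

```python
import json
import re

# Hypothetical helper names; same digit check as rlike("^[0-9]+$").
INT_PATTERN = re.compile(r"[0-9]+")


def to_document(row_id, rating):
    """Build one JSON-ready dict, turning all-digit ratings into ints."""
    value = int(rating) if INT_PATTERN.fullmatch(rating) else rating
    return {"ID": row_id, "rating": value}


# The two sample rows from the table above, both held as strings.
rows = [("id1", "5"), ("id2", "bad")]
documents = [to_document(row_id, rating) for row_id, rating in rows]

for doc in documents:
    # Integer ratings serialize without quotes, strings with quotes.
    print(json.dumps(doc))
```

Here the first document serializes with rating as a bare number and the second with rating as a quoted string, which is the mix I would like to end up with in the container.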

I am not sure whether this is related to the Cosmos config used when writing the data, so here are my settings:

"spark.cosmos.write.strategy": "ItemOverwrite",
"spark.cosmos.serialization.inclusionMode" : "NonNull",
"spark.cosmos.write.bulk.enabled": "true",
"mode" : "Append",
"Upsert" : "true"

