Apache Spark: asc not working as expected


I have following code:

df.orderBy(expr("COUNTRY_NAME").desc, expr("count").asc).show()

I expect count column to be arranged in ascending order for a given COUNTRY_NAME. But I see something like this:

(screenshot of the sorted output omitted)

The last value, 12, is not in the expected position.

Why is it so?

1 Answer

BEST ANSWER

If you run df.printSchema(), you'll see that your "count" column has the string datatype, so the sort is lexicographic (character by character) rather than numeric.
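The same lexicographic-vs-numeric behavior can be seen in plain Python, independent of Spark (the sample values here are illustrative, not taken from the question's data):

```python
# Strings sort character by character, so "12" sorts before "4"
# because '1' < '4':
counts_as_strings = ["4", "12", "9"]
print(sorted(counts_as_strings))   # ['12', '4', '9']  -- lexicographic

# Casting to int restores the numeric order:
counts_as_ints = [int(c) for c in counts_as_strings]
print(sorted(counts_as_ints))      # [4, 9, 12]  -- numeric
```

This is exactly what happens inside Spark when a string-typed column is passed to orderBy.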

In PySpark, you can cast the column and then sort on both keys in a single orderBy call (note that chaining a second orderBy would discard the first sort rather than act as a secondary key):

df = df.withColumn('count', df['count'].cast('int'))
df.orderBy(['COUNTRY_NAME', 'count'], ascending=[False, True]).show()

If possible, you should define and apply your schema when the data is read in, so the columns get the correct types from the start.