I have the following code:
df.orderBy(expr("COUNTRY_NAME").desc, expr("count").asc).show()
I expect the count column to be sorted in ascending order within each COUNTRY_NAME, but I see something like this:
The last value, 12, is not what I expected.
Why is that?
If you run df.printSchema(), you'll see that your "count" column has the string datatype. Strings are compared character by character, so the column is sorted lexicographically rather than numerically, which gives the undesired order.
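To see why a string column sorts this way, here is a small plain-Python sketch. The values are made up for illustration and are not taken from the question's data.

```python
# Made-up sample values, held as strings the way a string-typed column would hold them.
counts_as_strings = ["12", "2", "101", "7"]

# Strings compare character by character, so "12" sorts before "2" because "1" < "2".
lexicographic = sorted(counts_as_strings)

# Converting each value to int before comparing gives the numeric order.
numeric = sorted(counts_as_strings, key=int)

print(lexicographic)
print(numeric)
```

The same thing happens inside Spark when the column type is string: the sort is correct for strings, just not for the numbers they represent.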
In pyspark, you can use the following to accomplish what you are looking for:
If possible, define your schema and apply it when the data is read in, so that "count" is numeric from the start.