Databricks not displaying correct output

51 Views Asked by MUHAMMAD UMER At 14 March 2024 at 06:08

In Azure Databricks, I am applying a filter to show only the rows where the Region column has the value 'weu'.

display(df.where(col("Region") == 'weu'))

But the output DataFrame I am getting has Region values of eus and sea. Can anyone explain why this is happening?


There is 1 best solution below

DileeprajnarayanThumula On 15 March 2024 at 09:50

I have used some sample data like below:

Region  Value
weu      1
eus      2
sea      3

The likely reason you are seeing Region values such as eus and sea is that the column contains values with leading or trailing white spaces.

I have tried the below approach:

Filter the DataFrame based on the Region column, applying the trim() function to remove any leading or trailing white spaces before filtering.

from pyspark.sql.functions import col, trim
filtered_df = dilip_df.where(trim(col("Region")) ==  "weu")
display(filtered_df)


Also you can check:

from pyspark.sql.functions import col, lower
dilip_df.select(lower(col("Region"))).distinct().show()

From the dilip_df DataFrame, select the Region column. Convert all the values in the column to lowercase using the lower() function, and then return only the distinct values in the column using the distinct() function.
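To make the effect of padding and case on an equality filter concrete, here is a minimal plain-Python sketch of the same normalization idea. It is only an analogue, not Spark code: the raw_regions values and the normalize helper are hypothetical, made up for illustration, and strip(" ") is used because Spark's trim() removes space characters by default.

```python
# Hypothetical raw values; the padding and mixed case are assumptions for illustration.
raw_regions = ["weu", " weu", "WEU ", "eus", "sea "]


def normalize(value: str) -> str:
    """Strip leading/trailing spaces and lowercase, similar to trim() followed by lower()."""
    return value.strip(" ").lower()


# An exact comparison only matches the single clean value.
exact_matches = [v for v in raw_regions if v == "weu"]

# Comparing after normalization matches every padded or differently cased variant.
normalized_matches = [v for v in raw_regions if normalize(v) == "weu"]

# repr() and len() make hidden padding visible when inspecting distinct values.
for v in sorted(set(raw_regions)):
    print(repr(v), len(v))
```

If the distinct values in the real Region column show unexpected lengths, padding of this kind is a plausible cause. Characters other than plain spaces (tabs, for example) may need separate handling.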
