How to use the dateutil parser in a PySpark DataFrame


I need to iterate over every column of a PySpark DataFrame and check each row. If a column holds dates in any format, I need to identify it as a date column and determine its format. I'm trying to do this with a UDF that wraps the dateutil parser, but execution never seems to enter the function.

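from dateutil import parser
from pyspark.sql.functions import col, udf
from pyspark.sql.types import DateType

def parse_date_string(date_str):
    # Runs on the executors, and only once an action (show, collect, ...) is triggered.
    if date_str is None:
        return None
    try:
        return parser.parse(date_str).date()
    except (ValueError, OverflowError, TypeError):
        return None

# Build the UDF once instead of on every loop iteration.
date_parse_udf = udf(parse_date_string, DateType())

def functiontofinddatatype(df):
    # df is passed in: assigning to df inside the function makes it a local name.
    for column in df.columns:
        # One output column per input column, so earlier results are not overwritten.
        df = df.withColumn(column + "_parsed", date_parse_udf(col(column).cast("string")))
    return df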

Please correct the code if it is wrong.

The parse_date_string function does not appear to work: as far as I can tell, the UDF never calls it.
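
Since the goal is to identify the format as well as the date, note that dateutil's parser returns a datetime and does not report which format it matched. One possible approach, shown here only as a minimal sketch, is to try a fixed list of candidate formats with the standard library. The names `CANDIDATE_FORMATS`, `detect_date_format`, and `infer_column_format` are hypothetical, and the format list is an assumption that would need to match the real data.

```python
from datetime import datetime

# Hypothetical candidate list; extend it with the formats expected in the data.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%m-%d-%Y", "%Y%m%d"]

def detect_date_format(value):
    """Return the first candidate format that parses value, or None."""
    if value is None:
        return None
    for fmt in CANDIDATE_FORMATS:
        try:
            datetime.strptime(value.strip(), fmt)
            return fmt
        except ValueError:
            continue  # this format does not match, try the next one
    return None

def infer_column_format(values):
    """Return the single format shared by all non-null values, or None."""
    formats = {detect_date_format(v) for v in values if v is not None}
    if len(formats) == 1:
        return formats.pop()  # may itself be None if nothing parsed
    return None
```

In PySpark, `detect_date_format` could be wrapped as `udf(detect_date_format, StringType())` and applied per column. As with any UDF, it would only run once an action such as `show()` or `collect()` is triggered. Ambiguous values such as 01/02/2024 resolve to whichever matching format comes first in the list, so the order of the candidates matters.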
