pyspark: change string to timestamp

285 Views Asked by At

I've a column in String format , some rows are also null. I add random timestamp to make it in the following form to convert it into timestamp.

date
null
22-04-2020
date
01-01-1990 23:59:59.000
22-04-2020 23:59:59.000

df = df.withColumn('date', F.concat (df.date, F.lit(" 23:59:59.000")))
df = df.withColumn('date', F.when(F.col('date').isNull(), '01-01-1990 23:59:59.000').otherwise(F.col('date')))

df.withColumn("date",      F.to_timestamp(F.col("date"),"MM-dd-yyyy HH mm ss SSS")).show(2)

but after this the column date becomes null.

can anyone help me solve this. either convert the string to timestamp direct

1

There are 1 best solutions below

0
On

Your timestamp format should start with dd-MM, not MM-dd, and you're also missing some colons and dots in the time part. Try the code below:

df.withColumn("date", F.to_timestamp(F.col("date"),"dd-MM-yyyy HH:mm:ss.SSS")).show()
+-------------------+
|               date|
+-------------------+
|1990-01-01 23:59:59|
|2020-04-22 23:59:59|
+-------------------+