Why does a timestamp differ after importing a CSV with Spark?


I have a CSV file, payments.csv. For some particular rows, the timestamp changes by itself when the file is loaded with Spark. The screenshots show the first 3 lines, for easier understanding.

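import org.apache.spark.sql.functions.{col,when,to_date,row_number,date_add,expr}
import org.apache.spark.sql.expressions.{Window}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()
// Must come after `spark` is defined; provides the $"..." column syntax
import spark.implicits._

// Import the CSV: first line is the header, column types are inferred
val df = spark.read.option("header","true").option("inferSchema","true").csv("payment.csv")

// Keep the filtered DataFrame; show() returns Unit, so its result is not assigned
val df2 = df.filter($"payment_id" === 21112)
df2.show()

// First matching row, sixth column (index 5)
val time_value = df2.collect()(0)(5)
println(time_value)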

I am clueless about the cause as of now.
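One check that might narrow this down is to read the same file twice, once as plain strings and once with inferSchema, and compare the value for the same row. This is only a sketch: `TimestampCheck` and `rawVsInferred` are hypothetical names, and it assumes the `payment_id` column and column index 5 from the code above. It does not explain the difference; it only shows whether the change happens during type inference.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object TimestampCheck {
  // Hypothetical helper: returns the value at `colIndex` for one payment_id,
  // first as the untouched CSV text, then as parsed with inferSchema.
  // Throws NoSuchElementException if no row matches `paymentId`.
  def rawVsInferred(spark: SparkSession, path: String,
                    paymentId: Long, colIndex: Int): (String, String) = {
    // Without inferSchema every column is read as a string
    val raw = spark.read.option("header", "true").csv(path)
    // Same file, but Spark guesses the column types (including timestamps)
    val inferred = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(path)

    val rawValue = raw
      .filter(col("payment_id") === paymentId.toString)
      .head().get(colIndex)
    val inferredValue = inferred
      .filter(col("payment_id") === paymentId)
      .head().get(colIndex)

    (String.valueOf(rawValue), String.valueOf(inferredValue))
  }
}

// Usage, e.g. in spark-shell where `spark` already exists:
// val (raw, inferred) = TimestampCheck.rawVsInferred(spark, "payment.csv", 21112L, 5)
// println(s"raw=$raw inferred=$inferred")
// println(spark.conf.get("spark.sql.session.timeZone"))
```

If the two values differ, the change comes from how the column is parsed or displayed rather than from the file itself. The session time zone printed in the last line may be worth comparing against the time zone the data was written in.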

Screenshots:

[three screenshots; images not included]
