Spark: How to get value for each day in interval?


I have a table of values, each with the date from which that value is valid:

param validfrom value
param1 01-01-2022 1
param2 03-01-2022 2
param1 05-01-2022 11
param1 07-01-2022 1

I need the value of each parameter on each day up to (and including) a specified date.

For example, for 06-01-2022 I have to get:

param validfrom value
param1 01-01-2022 1
param1 02-01-2022 1
param1 03-01-2022 1
param1 04-01-2022 1
param1 05-01-2022 11
param1 06-01-2022 11
param2 03-01-2022 2
param2 04-01-2022 2
param2 05-01-2022 2
param2 06-01-2022 2

In other words, I need to generate rows for the missing dates and fill each one with the previous value.
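To make the requirement concrete, here is a minimal sketch of the expansion logic in plain Python (not Spark): each validfrom starts an interval that runs until the day before that param's next validfrom, or until the report date, and the value is carried forward across the whole interval. The data and the 06-01-2022 report date are taken from the example above (dates are DD-MM-YYYY).

```python
from datetime import date, timedelta

def expand_daily(rows, report_date):
    """Expand (param, validfrom, value) rows into one row per day,
    carrying each value forward until the next validfrom or report_date."""
    by_param = {}
    for param, valid_from, value in rows:
        by_param.setdefault(param, []).append((valid_from, value))

    out = []
    for param, entries in by_param.items():
        entries.sort()
        for i, (start, value) in enumerate(entries):
            if start > report_date:
                continue  # this value only becomes valid after the report date
            # interval ends the day before the next validfrom, or at report_date
            if i + 1 < len(entries):
                end = min(entries[i + 1][0] - timedelta(days=1), report_date)
            else:
                end = report_date
            day = start
            while day <= end:
                out.append((param, day, value))
                day += timedelta(days=1)
    return out

rows = [
    ("param1", date(2022, 1, 1), 1),
    ("param2", date(2022, 1, 3), 2),
    ("param1", date(2022, 1, 5), 11),
    ("param1", date(2022, 1, 7), 1),
]
daily = expand_daily(rows, date(2022, 1, 6))
# yields the 10 rows from the expected-output table above
```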

I use a window function to get the last value, like this:

// partition by the composite key columns (e.g. param)
val windowPartitionByCompositeKey = Window.partitionBy(CompositeKey.map(col): _*)

spark.table("table")
  .where($"$ValidFrom" <= repDt)
  .distinct()
  // latest validfrom per key among rows up to the report date
  .withColumn("max_validfrom", max($"$ValidFrom").over(windowPartitionByCompositeKey))
  .where($"$ValidFrom" === $"max_validfrom")
  .drop("max_validfrom")
  .show()

and get the following result:

param validfrom value
param2 03-01-2022 2
param1 05-01-2022 11
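This is expected: the window filter keeps, per param, only the row whose validfrom equals the per-partition maximum. A rough plain-Python equivalent of that logic (same example data, not Spark) shows why only one row per param survives:

```python
from datetime import date

rows = [
    ("param1", date(2022, 1, 1), 1),
    ("param2", date(2022, 1, 3), 2),
    ("param1", date(2022, 1, 5), 11),
    ("param1", date(2022, 1, 7), 1),
]
rep_dt = date(2022, 1, 6)

# keep rows up to the report date, like .where($"$ValidFrom" <= repDt)
eligible = [r for r in rows if r[1] <= rep_dt]

# latest validfrom per param, like max(...).over(windowPartitionByCompositeKey)
latest = {}
for param, valid_from, value in eligible:
    if param not in latest or valid_from > latest[param][1]:
        latest[param] = (param, valid_from, value)

result = sorted(latest.values())
# one row per param (the value as of the report date), no per-day rows
```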

But I need the value of each param on each day up to the specified date.

How can I achieve this?
