How to limit decimal values to 2 digits before applying agg function?

31.2k Views Asked by At

I am following this solution from one of the stack overflow post, my only requirement here is how can I limit the values that I want to sum to 2 digit after the decimal before applying the df.agg(sum()) function?

For examples: I have values like below and the sum function sums it,

2.346
1.549

However I want the values to be rounded to 2 digit after the decimal like

2.35
1.55

before summing it. How can I do it? I was not able to find any sub function like sum().round of function sum.

Note: I am using Spark 1.5.1 version.

2

There are 2 best solutions below

3
Psidom On BEST ANSWER

You can use bround:

val df = Seq(2.346, 1.549).toDF("A")
df.select(bround(df("A"), 2)).show
+------------+
|bround(A, 2)|
+------------+
|        2.35|
|        1.55|
+------------+


df.agg(sum(bround(df("A"), 2)).as("appSum")).show
+------------------+
|            appSum|
+------------------+
|3.9000000000000004|
+------------------+
                                          ^
df.agg(sum(df("A")).as("exactSum")).show
+--------+
|exactSum|
+--------+
|   3.895|
+--------+
0
Explorer On

The above solution does work for spark 2.0 version however for folks like me who are still using 1.5.*+ versions below is something that will work.(I used round function as suggested by @Psidom):

val df = Seq(2.346, 1.549).toDF("A")
df.select(bround(df("A"), 2)).show
+------------+
|bround(A, 2)|
+------------+
|        2.35|
|        1.55|
+------------+

val total=df.agg(sum(round(df.col(colName),2)).cast("double")).first.getDouble(0)
total: Double = 3.90