java.lang.Long cannot be cast to java.lang.Double ERROR when using MAX()

Since the Cloud Dataprep update yesterday (19/11/2018), I get an error every time I use the MAX() function, either on its own or in a pivot.

Some notes:

  • I used the MAX() function on another dataset and it worked, so MAX() itself is fine.
  • I didn't have this issue before yesterday's Dataprep update; the flow was working.
  • I edited the recipe many times to isolate the problem, and it does seem to be the MAX() function.
  • The columns I'm using MAX() on are of type INT. I tried converting INT → FLOAT → INT to make sure they were INT before applying MAX(), but I keep getting the same issue (see the sketch after this list).
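
For anyone wondering what the exception itself means: it is a plain Java boxing issue, not a data problem. A boxed Long can never be cast directly to Double; it has to be converted through Number. Here is a minimal, hypothetical sketch (maxAssumingDouble is modeled loosely on the MaxCombineFn frame in the log below, not Trifacta's actual code):

    public class CastDemo {
        // Hypothetical combine step that assumes its inputs are Doubles,
        // the way the failing MaxCombineFn frame appears to.
        static double maxAssumingDouble(Object a, Object b) {
            double x = (Double) a; // throws ClassCastException when a holds a Long
            double y = (Double) b;
            return Math.max(x, y);
        }

        // Safe alternative: convert through Number instead of casting.
        static double maxAsNumber(Object a, Object b) {
            return Math.max(((Number) a).doubleValue(), ((Number) b).doubleValue());
        }

        public static void main(String[] args) {
            Object boxedLong = 42L;           // a value from an integer column
            maxAsNumber(boxedLong, 7L);       // fine: 42.0
            maxAssumingDouble(boxedLong, 7L); // java.lang.ClassCastException:
                                              // java.lang.Long cannot be cast to java.lang.Double
        }
    }

This also explains why converting the column type in the recipe doesn't help: the cast happens inside the generated worker code, regardless of what the data looks like.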

Here is the log:

java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Double
    at com.trifacta.google.dataflow.functions.MaxCombineFn.binaryOperation(MaxCombineFn.java:18)
    at com.trifacta.google.dataflow.functions.BinaryOperationCombineFn.addInput(BinaryOperationCombineFn.java:60)
    at org.apache.beam.sdk.transforms.CombineFns$ComposedCombineFn.addInput(CombineFns.java:295)
    at org.apache.beam.sdk.transforms.CombineFns$ComposedCombineFn.addInput(CombineFns.java:212)
    at org.apache.beam.runners.core.GlobalCombineFnRunners$CombineFnRunner.addInput(GlobalCombineFnRunners.java:109)
    at com.google.cloud.dataflow.worker.PartialGroupByKeyParDoFns$ValueCombiner.add(PartialGroupByKeyParDoFns.java:163)
    at com.google.cloud.dataflow.worker.PartialGroupByKeyParDoFns$ValueCombiner.add(PartialGroupByKeyParDoFns.java:141)
    at com.google.cloud.dataflow.worker.util.common.worker.GroupingTables$CombiningGroupingTable$1.add(GroupingTables.java:385)
    at com.google.cloud.dataflow.worker.util.common.worker.GroupingTables$GroupingTableBase.put(GroupingTables.java:230)
    at com.google.cloud.dataflow.worker.util.common.worker.GroupingTables$GroupingTableBase.put(GroupingTables.java:210)
    at com.google.cloud.dataflow.worker.util.common.worker.SimplePartialGroupByKeyParDoFn.processElement(SimplePartialGroupByKeyParDoFn.java:35)
    at com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
    at com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
    at com.google.cloud.dataflow.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:271)
    at org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:309)
    at org.apache.beam.runners.core.SimpleDoFnRunner.access$700(SimpleDoFnRunner.java:77)
    at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:621)
    at org.apache.beam.sdk.transforms.DoFnOutputReceivers$WindowedContextOutputReceiver.output(DoFnOutputReceivers.java:71)
    at org.apache.beam.sdk.transforms.MapElements$1.processElement(MapElements.java:128)
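
For reference, and only as a sketch against the open-source Beam SDK (this is not the Dataprep-generated code): Beam's built-in Max combiners are typed, so the combiner has to match the element type. The hard cast in the MaxCombineFn frame above suggests the generated pipeline applied a Double-typed max to Long values, which can only fail at runtime:

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.transforms.Create;
    import org.apache.beam.sdk.transforms.Max;
    import org.apache.beam.sdk.values.PCollection;

    public class MaxTypeDemo {
        public static void main(String[] args) {
            Pipeline p = Pipeline.create(PipelineOptionsFactory.create());

            // A Long-typed collection needs the Long-typed max combiner.
            PCollection<Long> longs = p.apply(Create.of(1L, 5L, 3L));
            longs.apply("maxLongs", Max.longsGlobally()); // OK: max over Longs

            // The Double-typed combiner would not even compile here;
            // generated (untyped) code only fails at runtime, as above.
            // longs.apply(Max.doublesGlobally());

            // Requires a runner (e.g. DirectRunner) on the classpath.
            p.run().waitUntilFinish();
        }
    }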

There is 1 answer below.

I'm with Google Cloud Platform Support.

This is an internal issue that appeared after the update on the 19th (as you said). We are aware of it and are working alongside the Trifacta team (Dataprep is a third-party product developed and managed by them).

There is a Public Issue regarding this; feel free to add any information you think is needed.

EDIT: The issue is fixed now. Could you try again and tell me if it worked?