View uses get_json_object fails in Athena, but works in Databricks

25 Views Asked by At

In our org we have the need to expose our data both in Databricks and in Athena. We have some views that use "get_json_object(my_col)['path']" function and runs as expected from Databricks. Once trying to query same view from Athena we get below exception:

VIEW_IS_STALE: line 2:6: View 'my_view' is stale or in invalid state: 
column [my_col] of type json projected from query view 
at position 256 cannot be coerced to column [response_syn_exp_pos_1_feature] 
of type varchar stored in view definition

In Athena the parallel function is "json_extract_scalar(my_col,'path')", but it has different name and signature.

I do wonder whether there is any way around this so we can query the same view from databricks and athena?

1

There are 1 best solutions below

0
Ananth Tirumanur On

In Databricks, you might be using get_json_object(my_col)['path'], which works as expected. However, in Athena, the equivalent function is json_extract_scalar(my_col,'path'). This discrepancy leads to the error when trying to query the same view from Athena, as the extracted JSON data cannot be directly coerced into the VARCHAR type expected by the view definition.

I cant think of a way to use the same view in Athena and databricks. you can redefine the view in Athena with json_extract_scalar instead

CREATE VIEW my_athena_view AS SELECT json_extract_scalar(my_col, '$.path') AS response_syn_exp_pos_1_feature FROM my_table