Snowflake interprets boolean values in parquet as NULL?

360 Views Asked by At

Parquet Entry Example (All entries have is_active_entity as true)

{
 "is_active_entity": true,
  "is_removed": false
}

Query that demonstrates all values are taken as NULL

select  $1:IS_ACTIVE_ENTITY::boolean, count(*) from @practitioner_delta_stage/part-00000-49224c02-150b-493b-8036-54ab30a8ff40-c000.snappy.parquet group by $1:IS_ACTIVE_ENTITY::boolean ;

Output has only one group for NULL

$1:IS_ACTIVE_ENTITY::BOOLEAN    COUNT(*)
    NULL                                            4930277

I don't know where I am going wrong, Spark writes the correct schema in parquet as evident from the example but snowflake takes it as NULL.

How do I fix this?

1

There are 1 best solutions below

0
On BEST ANSWER

The columns in your file are quoted. As a consequence "is_active_entity" is not the same like "IS_ACTIVE_ENTITY"

Please try this query:

select $1:is_active_entity::boolean, count(*) from @practitioner_delta_stage/part-00000-49224c02-150b-493b-8036-54ab30a8ff40-c000.snappy.parquet group by $1:IS_ACTIVE_ENTITY::boolean ;

More infos: https://docs.snowflake.com/en/sql-reference/identifiers-syntax.html#:~:text=The%20identifier%20is%20case%2Dsensitive.