Parquet Entry Example (All entries have is_active_entity
as true)
{
"is_active_entity": true,
"is_removed": false
}
Query that demonstrates all values are taken as NULL
select $1:IS_ACTIVE_ENTITY::boolean, count(*) from @practitioner_delta_stage/part-00000-49224c02-150b-493b-8036-54ab30a8ff40-c000.snappy.parquet group by $1:IS_ACTIVE_ENTITY::boolean ;
Output has only one group for NULL
$1:IS_ACTIVE_ENTITY::BOOLEAN COUNT(*)
NULL 4930277
I don't know where I am going wrong, Spark writes the correct schema in parquet as evident from the example but snowflake takes it as NULL.
How do I fix this?
The columns in your file are quoted. As a consequence
"is_active_entity"
is not the same like"IS_ACTIVE_ENTITY"
Please try this query:
select $1:is_active_entity::boolean, count(*) from @practitioner_delta_stage/part-00000-49224c02-150b-493b-8036-54ab30a8ff40-c000.snappy.parquet group by $1:IS_ACTIVE_ENTITY::boolean ;
More infos: https://docs.snowflake.com/en/sql-reference/identifiers-syntax.html#:~:text=The%20identifier%20is%20case%2Dsensitive.