dbt query to Snowflake resulting in an "invalid identifier" error for a column that exists

7k Views Asked by At

I've been pulling my hair out for several hours trying to understand what's going on, to no avail so far.

I've got this query on dbt:

{{
  config(
    materialized='incremental',
    unique_key='event_ID'
  )
}}

SELECT
    {{ dbt_utils.star(from=ref('staging_pg_ahoy_events'), relation_alias='events', prefix='event_') }},
    {{ dbt_utils.star(from=ref('staging_pg_ahoy_visits'), relation_alias='visits', prefix='visit_') }}
FROM
    {{ ref('staging_pg_ahoy_events') }} AS events
LEFT JOIN {{ ref('staging_pg_ahoy_visits') }} AS visits ON events.visit_id = visits.id

{% if is_incremental() %}
    WHERE "events"."event_ID" >= (SELECT max("events"."event_ID") FROM {{ this }})
{% endif %}

Along with this config:

version: 2

models:
  - name: facts_ahoy_events
    columns:
      - name: event_ID
        quote: true
        tests:
          - unique
          - not_null

dbt run -m facts_ahoy_events --full-refresh runs successfully, however when I try an incremental backup by dropping the --full-refresh flag, the following error ensues:

10:35:51  1 of 1 START incremental model DBT_PCOISNE.facts_ahoy_events.................... [RUN]
10:35:52  1 of 1 ERROR creating incremental model DBT_PCOISNE.facts_ahoy_events........... [ERROR in 0.88s]
10:35:52  
10:35:52  Finished running 1 incremental model in 3.01s.
10:35:52  
10:35:52  Completed with 1 error and 0 warnings:
10:35:52  
10:35:52  Database Error in model facts_ahoy_events (models/marts/facts/facts_ahoy_events.sql)
10:35:52    000904 (42000): SQL compilation error: error line 41 at position 10
10:35:52    invalid identifier '"events"."event_ID"'

I've gotten used to the case-sensitive column names on Snowflake, but I can't for the life of me figure out what's going on, since the following query run directly on Snowflake, completes:

select "event_ID" from DBT_PCOISNE.FACTS_AHOY_EVENTS limit 10;

Whereas this one expectedly fails:

select event_ID from DBT_PCOISNE.FACTS_AHOY_EVENTS limit 10;

I think I've tried every combination of upper, lower, and mixed casing, each with and without quoting, but all my attempts have failed.

Any help or insight would be greatly appreciated! Thank you

2

There are 2 best solutions below

3
Sergiu On

Most probably your column event_ID was created using "" around it which means an identifier was used. Now, using it also requires "" as all column names are capitalized inside Snowflake unless using identifiers. Solution is to either use "" around column name or rename it to lower case using an ALTER.

For DBT you can read more here

0
Vũ Bùi Hồng Đức On

Firstly, you should use Snowflake UI to check whether your column name is upper or lower case, because in the Snowflake database, id and ID are different.

Secondly, you should surround your column in .yml testing like this:

"<column_name>"

For example (.yml file in folder contains models):

version: 2

models:
  - name: CREDITS_DIM
    columns: 
      - name: '"ID"'
        tests:
          - unique
          - not_null

Thanks for guy upper. My comment is for full step-by-step solution.

Solution quoting in dbt_project.yml didn't work to me. I'll explore this and update later.

quoting:
    identifier: true