Is it possible to pass dynamic values into a dbt source freshness test?

I'm trying to set the warning and error thresholds of freshness checks, specified in dbt's sources.yml, dynamically, based on the median and standard deviation of the "synced_at" column of the underlying source.

To accomplish this, I thought I might try calling a macro in the freshness block of the sources.yml file, like so:

# sources.yml
...
    tables:
      - name: appointment_type
        freshness:
          error_after:
            count: test_macro()
            period: hour
...

Where:

{%- macro test_macro(this) -%}

{# /*
The idea is {{ this.table }} would parameterize a query, 
going over the same column name for all sources, _fivetran_synced, 
and spit out the calculated values I want. This makes me feel like 
it needs to be a prehook, that somehow stores the value in a var, 
and that is accessed in the source.yml, instead of calling it directly. 

In this case a trivial integer is attempted to be returned, just as an example.
*/ #}
{{ return(24) }}

{%- endmacro -%}

However, this results in a type error. Presumably the macro is not called at all. Wrapping the call in Jinja quotes also returns an error.

Is there currently any way to pass dynamic values to freshness checks?

Best answer:

It isn't possible today to call macros from .yml files, for precisely this reason: dbt needs to be able to statically parse those files and validate internal objects (including resource properties like source freshness) before it runs any queries against the database.

I think you could maybe hack this by overriding the collect_freshness macro so that, instead of simply returning max(synced_at), it returns a timestamp offset from current_timestamp by a Z-score, normalized across all Fivetran max(synced_at) timestamps. It feels tricky but possible.

At the same time, I'd gently push back on your larger goal here. We think of source freshness as something that should be prescriptive. You get to tell Fivetran how often you want it to sync data, and add freshness blocks to test those expectations. You can run ad hoc queries like the one you envision above to determine if those expectations are reasonable. Obviously, some tables are updated infrequently or unpredictably, but I find it's more useful to override or remove these tables' freshness expectations than to add significant complexity on their account.
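
As a rough sketch of that prescriptive approach: thresholds are written as literal values, and a table that updates unpredictably opts out of the check. The source name, the second table, and the specific counts here are hypothetical, chosen only for illustration, and the exact placement of these properties may vary between dbt versions.

```yaml
# sources.yml
# Illustrative only: "fivetran_app", "holiday_calendar", and the counts
# below are assumptions, not values taken from the question.
version: 2

sources:
  - name: fivetran_app                  # hypothetical source name
    loaded_at_field: _fivetran_synced   # column compared against the current time
    freshness:                          # default expectation for every table below
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: appointment_type          # inherits the source-level thresholds
      - name: holiday_calendar          # hypothetical table with unpredictable updates
        freshness: null                 # opt this table out of freshness checks
```

The counts would come from whatever sync schedule is configured in Fivetran, with an ad hoc query over _fivetran_synced used only as a sanity check on whether those numbers are realistic.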