Generic tests that can be applied to models, sources, and snapshots to validate data quality and integrity.
compare_model
(required): The model to compare againstgroup_by_columns
(optional): Columns to group by for granular comparison
# In schema.yml
models:
- name: orders
tests:
- dbt_utils.equal_rowcount:
compare_model: ref('expected_orders')
# With grouping
- dbt_utils.equal_rowcount:
compare_model: ref('orders_backup')
group_by_columns: ['status']
compare_model
(required): The model to compare againstgroup_by_columns
(optional): Columns to group by
models:
- name: failed_orders
tests:
- dbt_utils.fewer_rows_than:
compare_model: ref('all_orders')
compare_model
(required): The model to compare againstcompare_columns
(optional): List of specific columns to compareexclude_columns
(optional): List of columns to exclude from comparisonprecision
(optional): Decimal places for numeric comparison
models:
- name: orders_current
tests:
# Compare entire tables
- dbt_utils.equality:
compare_model: ref('orders_expected')
# Compare specific columns only
- dbt_utils.equality:
compare_model: ref('orders_backup')
compare_columns: ['order_id', 'customer_id', 'total']
# Exclude timestamp columns from comparison
- dbt_utils.equality:
compare_model: ref('orders_snapshot')
exclude_columns: ['created_at', 'updated_at']
precision: 2
expression
(required): SQL expression to evaluatewhere
(optional): Filter condition
models:
- name: financial_data
tests:
# Test that revenue + tax = total
- dbt_utils.expression_is_true:
expression: "revenue + tax = total"
# Test percentage is between 0 and 100
- dbt_utils.expression_is_true:
expression: "discount_pct >= 0 and discount_pct <= 100"
# Test with WHERE condition
- dbt_utils.expression_is_true:
expression: "end_date > start_date"
config:
where: "status = 'active'"
datepart
(required): Date part (day, hour, etc.)field
(required): Timestamp column nameinterval
(required): Number of datepartsgroup_by_columns
(optional): Columns to group by
models:
- name: user_events
tests:
# Test data is less than 1 day old
- dbt_utils.recency:
datepart: day
field: created_at
interval: 1
# Test each customer has recent activity
- dbt_utils.recency:
datepart: hour
field: last_activity
interval: 24
group_by_columns: ['customer_id']
group_by_columns
(optional): Columns to group by
models:
- name: customer_profiles
columns:
- name: email
tests:
- dbt_utils.at_least_one
# Ensure each customer segment has at least one email
- name: email
tests:
- dbt_utils.at_least_one:
group_by_columns: ['customer_segment']
models:
- name: orders
columns:
- name: status
tests:
- dbt_utils.not_constant
# Test by customer that order amounts vary
- name: order_amount
tests:
- dbt_utils.not_constant:
group_by_columns: ['customer_id']
trim_whitespace
(optional, default=true): Whether to trim whitespacefield
(required): Column in the comparison modelto
(required): The comparison modelat_least
(required): Minimum proportionat_most
(optional, default=1.0): Maximum proportiongroup_by_columns
(optional): Columns to group byvalues
(required): List of unacceptable valuesto
(required): Referenced modelfield
(required): Referenced columnfrom_condition
(optional): Filter condition for sourceto_condition
(optional): Filter condition for targetlower_bound_column
(required): Lower bound columnupper_bound_column
(required): Upper bound columnpartition_by
(optional): Partitioning columnsgaps
(optional): Gap allowance ('allowed', 'not_allowed', 'required')zero_length_range_allowed
(optional): Allow zero-length rangesinterval
(optional, default=1): Gap between sequential valuesdatepart
(optional): Date part for datetime sequencesgroup_by_columns
(optional): Columns to group bycombination_of_columns
(required): List of columnsquote_columns
(optional, default=false): Quote column namesmin_value
(optional): Minimum valuemax_value
(optional): Maximum valueinclusive
(optional, default=true): Include boundary valuesThese macros run queries and return results as objects, typically abstracting over dbt statement blocks.
table
(required): Relation to querycolumn
(required): Column namewhere
(optional): Filter conditionorder_by
(optional, default='count(*) desc'): Sort ordermax_records
(optional): Limit resultsdefault
(optional, default=[]): Default if relation doesn't exist
{# Get all unique payment methods #}
{% set payment_methods = dbt_utils.get_column_values(
table=ref('payments'),
column='payment_method'
) %}
{# Use in pivot or conditional logic #}
select
customer_id,
{% for method in payment_methods %}
sum(case when payment_method = '{{ method }}' then amount else 0 end) as {{ method }}_total,
{% endfor %}
sum(amount) as total_amount
from {{ ref('payments') }}
group by customer_id
{# With filtering and ordering #}
{% set recent_customers = dbt_utils.get_column_values(
table=ref('customers'),
column='customer_id',
where="created_at >= '2023-01-01'",
order_by='created_at desc',
max_records=100
) %}
from
(required): Source relationexcept
(optional, default=[]): Columns to exclude
{# Get all columns except system fields #}
{% set columns = dbt_utils.get_filtered_columns_in_relation(
from=ref('raw_data'),
except=['_loaded_at', '_row_hash', 'dbt_updated_at']
) %}
select
{% for column in columns %}
max({{ column }}) as max_{{ column }},
{% endfor %}
count(*) as total_rows
from {{ ref('raw_data') }}
{# Create a clean version without metadata columns #}
select
{% for column in columns %}
{{ column }}{% if not loop.last %},{% endif %}
{% endfor %}
from {{ ref('source_table') }}
schema_pattern
(required): Schema patterntable_pattern
(required): Table patternexclude
(optional): Exclusion patterndatabase
(optional): Target databaseschema
(required): Schema nameprefix
(required): Table prefixexclude
(optional): Exclusion patterndatabase
(optional): Target databasesql_statement
(required): SQL query to executesql_statement
(required): SQL query to executedefault
(optional): Default value if query failsThese macros generate SQL code or complete queries, implementing common patterns that are difficult to write in pure SQL.
datepart
(required): Date interval (day, month, etc.)start_date
(required): Start dateend_date
(required): End date (exclusive)
{# Create daily date spine for 2023 #}
{{ dbt_utils.date_spine(
datepart="day",
start_date="cast('2023-01-01' as date)",
end_date="cast('2024-01-01' as date)"
) }}
{# Monthly spine for reporting #}
{{ dbt_utils.date_spine(
datepart="month",
start_date="cast('2020-01-01' as date)",
end_date="cast('2025-01-01' as date)"
) }}
{# Use in model for gap analysis #}
with date_spine as (
{{ dbt_utils.date_spine(
datepart="day",
start_date="'2023-01-01'",
end_date="current_date"
) }}
),
sales_data as (
select date_trunc('day', order_date) as order_day, sum(amount) as daily_sales
from {{ ref('orders') }}
group by 1
)
select
ds.date_day,
coalesce(sd.daily_sales, 0) as sales
from date_spine ds
left join sales_data sd on ds.date_day = sd.order_day
relation
(required): Source relationpartition_by
(required): Columns to partition byorder_by
(required): Priority order for deduplicationlat1
(required): First latitudelon1
(required): First longitudelat2
(required): Second latitudelon2
(required): Second longitudeunit
(optional, default='mi'): Distance unit ('mi' or 'km')n
(required): Number of fields to group by
select
customer_id,
order_date,
status,
count(*) as order_count,
sum(amount) as total_amount
from {{ ref('orders') }}
{{ dbt_utils.group_by(n=3) }}
{# Dynamic grouping based on columns #}
{% set group_cols = ['region', 'product_category', 'month'] %}
select
{% for col in group_cols %}
{{ col }},
{% endfor %}
sum(revenue) as total_revenue,
count(*) as transaction_count
from {{ ref('sales') }}
{{ dbt_utils.group_by(n=group_cols|length) }}
from
(required): Source relationexcept
(optional, default=[]): Columns to excluderelation_alias
(optional): Table alias for columnsprefix
(optional): Column prefixsuffix
(optional): Column suffixquote_identifiers
(optional, default=true): Quote column namesrelations
(required): Array of relations to unionexclude
(optional): Columns to excludeinclude
(optional): Columns to include (mutually exclusive with exclude)column_override
(optional): Explicit column type overridessource_column_name
(optional, default='_dbt_source_relation'): Source tracking column namewhere
(optional): Filter condition
{# Basic union of similar tables #}
{{ dbt_utils.union_relations(
relations=[
ref('orders_2022'),
ref('orders_2023'),
ref('orders_2024')
]
) }}
{# Union with column exclusions #}
{{ dbt_utils.union_relations(
relations=[
source('raw', 'customers_us'),
source('raw', 'customers_eu'),
source('raw', 'customers_asia')
],
exclude=['_loaded_at', '_metadata', 'region_specific_field']
) }}
{# Advanced: Type overrides and filtering #}
{{ dbt_utils.union_relations(
relations=[
ref('legacy_sales'),
ref('modern_sales')
],
column_override={
'amount': 'decimal(10,2)',
'customer_id': 'varchar(50)'
},
source_column_name='data_source',
where="status = 'active'"
) }}
{# Dynamic union with pattern matching #}
{% set event_relations = dbt_utils.get_relations_by_pattern('events%', 'page_views_%') %}
{{ dbt_utils.union_relations(
relations=event_relations,
include=['user_id', 'page_url', 'timestamp', 'session_id']
) }}
upper_bound
(required): Maximum number in seriesfield_list
(required): List of fields to hash
{# Simple surrogate key from multiple fields #}
select
{{ dbt_utils.generate_surrogate_key(['customer_id', 'order_date']) }} as order_key,
customer_id,
order_date,
amount
from {{ ref('raw_orders') }}
{# Complex example with many fields #}
select
{{ dbt_utils.generate_surrogate_key([
'customer_id',
'product_id',
'cast(order_date as string)',
'channel'
]) }} as composite_key,
*
from {{ ref('order_details') }}
{# Use as primary key in model #}
{{ config(materialized='table') }}
select
{{ dbt_utils.generate_surrogate_key(['id', 'version']) }} as pk,
id,
version,
data
from {{ source('raw', 'versioned_records') }}
field_list
(required): List of fields to sumnumerator
(required): Numerator expressiondenominator
(required): Denominator expressionfield_list
(required): List of fields to subtractcolumn
(required): Column to pivotvalues
(required): List of values to turn into columnsalias
(optional, default=true): Create column aliasesagg
(optional, default='sum'): Aggregation functioncmp
(optional, default='='): Comparison operatorprefix
(optional): Column prefixsuffix
(optional): Column suffixthen_value
(optional, default=1): Value when condition is trueelse_value
(optional, default=0): Value when condition is falsequote_identifiers
(optional, default=true): Quote column aliasesdistinct
(optional, default=false): Use DISTINCT in aggregation
{# Basic pivot - count by status #}
select
customer_id,
{{ dbt_utils.pivot(
'order_status',
dbt_utils.get_column_values(ref('orders'), 'order_status')
) }}
from {{ ref('orders') }}
group by customer_id
{# Revenue pivot with custom aggregation #}
select
product_category,
{{ dbt_utils.pivot(
'payment_method',
['credit_card', 'paypal', 'cash'],
agg='sum',
then_value='amount',
else_value=0,
prefix='revenue_',
suffix='_total'
) }}
from {{ ref('sales') }}
group by product_category
{# Advanced: Distinct count with custom comparison #}
select
date_trunc('month', order_date) as month,
{{ dbt_utils.pivot(
'customer_tier',
['bronze', 'silver', 'gold', 'platinum'],
agg='count',
then_value='distinct customer_id',
distinct=true
) }}
from {{ ref('customer_orders') }}
group by 1
relation
(required): Source relationcast_to
(optional, default='varchar'): Data type for valuesexclude
(optional): Columns to exclude from unpivot but keepremove
(optional): Columns to remove completelyfield_name
(optional): Column name for field namesvalue_name
(optional): Column name for valuesquote_identifiers
(optional, default=false): Quote identifiersexpr
(required): Expression to create histogram formin_value
(required): Lower boundmax_value
(required): Upper boundnum_buckets
(required): Number of bucketsMacros for extracting and manipulating URL components.
field
(required): Column containing URLurl_parameter
(required): Parameter name to extract
select
page_url,
{{ dbt_utils.get_url_parameter(field='page_url', url_parameter='utm_source') }} as utm_source,
{{ dbt_utils.get_url_parameter(field='page_url', url_parameter='utm_medium') }} as utm_medium,
{{ dbt_utils.get_url_parameter(field='page_url', url_parameter='utm_campaign') }} as utm_campaign
from {{ ref('web_analytics') }}
{# Extract multiple parameters for marketing analysis #}
select
session_id,
{{ dbt_utils.get_url_parameter(field='referrer_url', url_parameter='q') }} as search_query,
{{ dbt_utils.get_url_parameter(field='current_url', url_parameter='product_id') }} as viewed_product,
timestamp
from {{ ref('page_views') }}
where referrer_url is not null
field
(required): Column containing URL
select
referrer_url,
{{ dbt_utils.get_url_host(field='referrer_url') }} as referrer_domain,
count(*) as visits
from {{ ref('web_sessions') }}
where referrer_url is not null
group by 1, 2
order by visits desc
{# Analyze traffic sources by domain #}
select
{{ dbt_utils.get_url_host(field='page_url') }} as site_domain,
{{ dbt_utils.get_url_host(field='referrer_url') }} as referrer_domain,
count(*) as sessions,
sum(revenue) as total_revenue
from {{ ref('session_data') }}
group by 1, 2
field
(required): Column containing URL
select
page_url,
{{ dbt_utils.get_url_path(field='page_url') }} as page_path,
count(*) as page_views,
count(distinct session_id) as unique_sessions
from {{ ref('page_analytics') }}
group by 1, 2
order by page_views desc
{# Content performance analysis #}
select
case
when {{ dbt_utils.get_url_path(field='page_url') }} like '/product/%' then 'Product'
when {{ dbt_utils.get_url_path(field='page_url') }} like '/blog/%' then 'Blog'
when {{ dbt_utils.get_url_path(field='page_url') }} = '/' then 'Homepage'
else 'Other'
end as page_category,
avg(time_on_page) as avg_time_on_page,
count(*) as total_views
from {{ ref('user_behavior') }}
group by 1
Utility macros for Jinja templating and logging.
format
(optional): Date format string
{# Basic timestamp in logs #}
{{ log(dbt_utils.pretty_time() ~ " - Starting data processing", info=true) }}
{# Custom format for model documentation #}
/*
Model refreshed: {{ dbt_utils.pretty_time(format='%Y-%m-%d %H:%M:%S') }}
Generated by: {{ this.name }}
*/
select * from {{ ref('source_data') }}
{# In macro for conditional processing #}
{% if target.name == 'prod' %}
{{ log("Production run started at " ~ dbt_utils.pretty_time(), info=true) }}
{% endif %}
message
(required): Message to formatmessage
(required): Message to logstring
(required): String to slugify
{# Dynamic column creation from data values #}
{% set payment_methods = dbt_utils.get_column_values(ref('payments'), 'payment_method') %}
select
customer_id,
{% for method in payment_methods %}
sum(case when payment_method = '{{ method }}' then amount end) as {{ dbt_utils.slugify(method) }}_total,
{% endfor %}
sum(amount) as total
from {{ ref('payments') }}
group by customer_id
{# Clean column names for complex strings #}
{% set categories = ['High-Value Customer', 'New User!', 'VIP (Premium)', '30-Day Trial'] %}
select
{% for cat in categories %}
sum(case when category = '{{ cat }}' then 1 else 0 end) as {{ dbt_utils.slugify(cat) }}_count,
{% endfor %}
count(*) as total_users
from {{ ref('customer_segments') }}
Category | Count | Primary Use Cases | Key Examples |
---|---|---|---|
Generic Tests | 15 | Data quality validation, integrity checks, relationship testing | equality, recency, mutually_exclusive_ranges |
Introspective | 6 | Database introspection, metadata extraction, dynamic queries | get_column_values, get_relations_by_pattern |
SQL Generators | 15 | Complex SQL patterns, data manipulation, cross-database compatibility | pivot, union_relations, generate_surrogate_key |
Web | 3 | URL parsing and manipulation | get_url_parameter, get_url_host |
Jinja Helpers | 4 | Template utilities, logging, string manipulation | slugify, pretty_time, log_info |
dbt-labs/dbt_utils
to your packages.yml
and run dbt deps
.dbt_utils.
prefix, e.g., {{ dbt_utils.generate_surrogate_key(['col1', 'col2']) }}
For comprehensive information about dbt macros and utilities, refer to the official documentation:
These resources provide the authoritative source for dbt macro capabilities and are regularly updated.