dbt_utils Package - Comprehensive Macro Summary

About dbt_utils: This package contains reusable macros that can be used across dbt projects. It provides a comprehensive set of utilities for testing, SQL generation, introspection, and data manipulation.

Table of Contents

Generic Tests

Generic tests that can be applied to models, sources, and snapshots to validate data quality and integrity.

equal_rowcount

Asserts that two relations have the same number of rows.
Parameters:
  • compare_model (required): The model to compare against
  • group_by_columns (optional): Columns to group by for granular comparison
# In schema.yml
models:
  - name: orders
    tests:
      - dbt_utils.equal_rowcount:
          compare_model: ref('expected_orders')

# With grouping
      - dbt_utils.equal_rowcount:
          compare_model: ref('orders_backup')
          group_by_columns: ['status']

Expected Results:

✅ Pass: Both tables have exactly 1,250 rows
❌ Fail: "got 1,250 rows, expected 1,180 rows" (shows difference)

fewer_rows_than

Asserts that the model has fewer rows than the comparison model.
Parameters:
  • compare_model (required): The model to compare against
  • group_by_columns (optional): Columns to group by
models:
  - name: failed_orders
    tests:
      - dbt_utils.fewer_rows_than:
          compare_model: ref('all_orders')

Expected Results:

✅ Pass: failed_orders has 45 rows, all_orders has 1,250 rows
❌ Fail: "failed_orders has more rows (1,300) than all_orders (1,250)"

equality

Asserts the equality of two relations, with options to compare specific columns or exclude columns, and set precision for numeric comparisons.
Parameters:
  • compare_model (required): The model to compare against
  • compare_columns (optional): List of specific columns to compare
  • exclude_columns (optional): List of columns to exclude from comparison
  • precision (optional): Decimal places for numeric comparison
models:
  - name: orders_current
    tests:
      # Compare entire tables
      - dbt_utils.equality:
          compare_model: ref('orders_expected')

      # Compare specific columns only
      - dbt_utils.equality:
          compare_model: ref('orders_backup')
          compare_columns: ['order_id', 'customer_id', 'total']

      # Exclude timestamp columns from comparison
      - dbt_utils.equality:
          compare_model: ref('orders_snapshot')
          exclude_columns: ['created_at', 'updated_at']
          precision: 2

Expected Results:

✅ Pass: All rows match between tables
❌ Fail: Returns differing rows with 'which_diff' column showing 'a_minus_b' or 'b_minus_a'

expression_is_true

Asserts that a valid SQL expression is true for all records. Useful for cross-column integrity checks.
Parameters:
  • expression (required): SQL expression to evaluate
  • where (optional): Filter condition
models:
  - name: financial_data
    tests:
      # Test that revenue + tax = total
      - dbt_utils.expression_is_true:
          expression: "revenue + tax = total"

      # Test percentage is between 0 and 100
      - dbt_utils.expression_is_true:
          expression: "discount_pct >= 0 and discount_pct <= 100"

      # Test with WHERE condition
      - dbt_utils.expression_is_true:
          expression: "end_date > start_date"
          config:
            where: "status = 'active'"

Expected Results:

✅ Pass: All rows satisfy the expression
❌ Fail: Returns rows where expression is false, e.g.:
  | revenue | tax | total | expression_result |
  | 100.00 | 8.5 | 110.0 | false |

recency

Asserts that a timestamp column contains data at least as recent as the defined interval.
Parameters:
  • datepart (required): Date part (day, hour, etc.)
  • field (required): Timestamp column name
  • interval (required): Number of dateparts
  • group_by_columns (optional): Columns to group by
models:
  - name: user_events
    tests:
      # Test data is less than 1 day old
      - dbt_utils.recency:
          datepart: day
          field: created_at
          interval: 1

      # Test each customer has recent activity
      - dbt_utils.recency:
          datepart: hour
          field: last_activity
          interval: 24
          group_by_columns: ['customer_id']

Expected Results:

✅ Pass: Most recent record is within specified timeframe
❌ Fail: "Most recent record is 2 days old, expected within 1 day"
With grouping: Each group has recent data or shows which groups are stale

at_least_one

Asserts that a column has at least one non-null value.
Parameters:
  • group_by_columns (optional): Columns to group by
models:
  - name: customer_profiles
    columns:
      - name: email
        tests:
          - dbt_utils.at_least_one

      # Ensure each customer segment has at least one email
      - name: email
        tests:
          - dbt_utils.at_least_one:
              group_by_columns: ['customer_segment']

Expected Results:

✅ Pass: At least one non-null value found
❌ Fail: "All values in column 'email' are null"
With grouping: Shows which groups have all null values

not_constant

Asserts that a column does not have the same value in all rows.
models:
  - name: orders
    columns:
      - name: status
        tests:
          - dbt_utils.not_constant

      # Test by customer that order amounts vary
      - name: order_amount
        tests:
          - dbt_utils.not_constant:
              group_by_columns: ['customer_id']

Expected Results:

✅ Pass: Column has varying values (e.g., 'pending', 'completed', 'cancelled')
❌ Fail: "Column 'status' has the same value 'pending' in all 1,250 rows"
With grouping: Identifies groups where all values are identical

not_empty_string

Asserts that a column does not have any empty string values.
Parameters:
  • trim_whitespace (optional, default=true): Whether to trim whitespace
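Example usage in schema.yml (model and column names are illustrative):

```yaml
models:
  - name: customers
    columns:
      - name: customer_name
        tests:
          - dbt_utils.not_empty_string

      # Count whitespace-only strings as non-empty
      - name: legacy_code
        tests:
          - dbt_utils.not_empty_string:
              trim_whitespace: false
```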

cardinality_equality

Asserts that values in a column have exactly the same cardinality as values from a different column in a different model.
Parameters:
  • field (required): Column in the comparison model
  • to (required): The comparison model
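A schema.yml sketch asserting that orders reference the same set of customers as the customers model (model and column names are illustrative):

```yaml
models:
  - name: orders
    columns:
      - name: customer_id
        tests:
          - dbt_utils.cardinality_equality:
              field: customer_id
              to: ref('customers')
```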

not_null_proportion

Asserts that the proportion of non-null values in a column is within a specified range.
Parameters:
  • at_least (required): Minimum proportion
  • at_most (optional, default=1.0): Maximum proportion
  • group_by_columns (optional): Columns to group by
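Example usage (model, column, and the 0.95 threshold are illustrative):

```yaml
models:
  - name: users
    columns:
      - name: email
        tests:
          - dbt_utils.not_null_proportion:
              at_least: 0.95
```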

not_accepted_values

Asserts that there are no rows matching the specified values.
Parameters:
  • values (required): List of unacceptable values
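Example usage (model, column, and values are illustrative):

```yaml
models:
  - name: orders
    columns:
      - name: status
        tests:
          - dbt_utils.not_accepted_values:
              values: ['deleted', 'test_order']
```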

relationships_where

Asserts referential integrity between two relations with additional filtering conditions.
Parameters:
  • to (required): Referenced model
  • field (required): Referenced column
  • from_condition (optional): Filter condition for source
  • to_condition (optional): Filter condition for target
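A sketch that enforces the relationship only for recent source records against non-deleted targets (model names and conditions are illustrative):

```yaml
models:
  - name: orders
    columns:
      - name: customer_id
        tests:
          - dbt_utils.relationships_where:
              to: ref('customers')
              field: customer_id
              from_condition: "created_at > '2020-01-01'"
              to_condition: "is_deleted = false"
```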

mutually_exclusive_ranges

Asserts that ranges between lower and upper bound columns do not overlap.
Parameters:
  • lower_bound_column (required): Lower bound column
  • upper_bound_column (required): Upper bound column
  • partition_by (optional): Partitioning columns
  • gaps (optional): Gap allowance ('allowed', 'not_allowed', 'required')
  • zero_length_range_allowed (optional): Allow zero-length ranges
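A sketch testing that each customer's subscription periods never overlap (model and column names are illustrative):

```yaml
models:
  - name: subscription_periods
    tests:
      - dbt_utils.mutually_exclusive_ranges:
          lower_bound_column: valid_from
          upper_bound_column: valid_to
          partition_by: customer_id
          gaps: allowed
```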

sequential_values

Confirms that a column contains sequential values (numeric or datetime).
Parameters:
  • interval (optional, default=1): Gap between sequential values
  • datepart (optional): Date part for datetime sequences
  • group_by_columns (optional): Columns to group by
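Example usage for numeric and datetime sequences (model and column names are illustrative):

```yaml
models:
  - name: invoices
    columns:
      - name: invoice_number
        tests:
          - dbt_utils.sequential_values:
              interval: 1

      - name: report_date
        tests:
          - dbt_utils.sequential_values:
              interval: 1
              datepart: day
```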

unique_combination_of_columns

Asserts that the combination of specified columns is unique.
Parameters:
  • combination_of_columns (required): List of columns
  • quote_columns (optional, default=false): Quote column names
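Example usage (model and column names are illustrative):

```yaml
models:
  - name: daily_product_revenue
    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - date_day
            - product_id
```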

accepted_range

Asserts that a column's values fall within an expected range.
Parameters:
  • min_value (optional): Minimum value
  • max_value (optional): Maximum value
  • inclusive (optional, default=true): Include boundary values
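Example usage (model, column, and bounds are illustrative):

```yaml
models:
  - name: products
    columns:
      - name: discount_pct
        tests:
          - dbt_utils.accepted_range:
              min_value: 0
              max_value: 100
              inclusive: true
```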

Introspective Macros

These macros run queries and return results as objects, typically abstracting over dbt statement blocks.

get_column_values

Returns unique values for a column in a relation as an array.
Parameters:
  • table (required): Relation to query
  • column (required): Column name
  • where (optional): Filter condition
  • order_by (optional, default='count(*) desc'): Sort order
  • max_records (optional): Limit results
  • default (optional, default=[]): Default if relation doesn't exist
{# Get all unique payment methods #}
{% set payment_methods = dbt_utils.get_column_values(
    table=ref('payments'),
    column='payment_method'
) %}

{# Use in pivot or conditional logic #}
select
  customer_id,
  {% for method in payment_methods %}
  sum(case when payment_method = '{{ method }}' then amount else 0 end) as {{ method }}_total,
  {% endfor %}
  sum(amount) as total_amount
from {{ ref('payments') }}
group by customer_id

{# With filtering and ordering #}
{% set recent_customers = dbt_utils.get_column_values(
    table=ref('customers'),
    column='customer_id',
    where="created_at >= '2023-01-01'",
    order_by='created_at desc',
    max_records=100
) %}

Expected Results:

Returns: ['credit_card', 'paypal', 'bank_transfer', 'cash']

Generated SQL creates columns:
| customer_id | credit_card_total | paypal_total | bank_transfer_total | cash_total | total_amount |
| 1001 | 150.00 | 75.00 | 0.00 | 25.00 | 250.00 |

get_filtered_columns_in_relation

Returns an iterable list of columns for a relation, with optional exclusions.
Parameters:
  • from (required): Source relation
  • except (optional, default=[]): Columns to exclude
{# Get all columns except system fields #}
{% set columns = dbt_utils.get_filtered_columns_in_relation(
    from=ref('raw_data'),
    except=['_loaded_at', '_row_hash', 'dbt_updated_at']
) %}

select
  {% for column in columns %}
  max({{ column }}) as max_{{ column }},
  {% endfor %}
  count(*) as total_rows
from {{ ref('raw_data') }}

{# Create a clean version without metadata columns #}
select
  {% for column in columns %}
  {{ column }}{% if not loop.last %},{% endif %}
  {% endfor %}
from {{ ref('source_table') }}

Expected Results:

From table with columns: [id, name, email, _loaded_at, amount, _row_hash]

Returns: ['id', 'name', 'email', 'amount']

Generated SQL includes only business columns, excludes system metadata

get_relations_by_pattern

Returns a list of relations that match schema or table name patterns.
Parameters:
  • schema_pattern (required): Schema pattern
  • table_pattern (required): Table pattern
  • exclude (optional): Exclusion pattern
  • database (optional): Target database
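A sketch that unions all tables matching a pattern (the schema and table patterns are illustrative):

```sql
{% set event_relations = dbt_utils.get_relations_by_pattern(
    schema_pattern='events_%',
    table_pattern='clicks_%'
) %}

{{ dbt_utils.union_relations(relations=event_relations) }}
```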

get_relations_by_prefix

Returns a list of relations matching a prefix (soon to be deprecated in favor of get_relations_by_pattern).
Parameters:
  • schema (required): Schema name
  • prefix (required): Table prefix
  • exclude (optional): Exclusion pattern
  • database (optional): Target database

get_query_results_as_dict

Returns the results of a SQL query as a dictionary, making them easier to manipulate without using the Agate library.
Parameters:
  • sql_statement (required): SQL query to execute
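A sketch that logs order counts per status at compile time (table and column names are illustrative; key casing in the returned dictionary can vary by warehouse):

```sql
{% set results = dbt_utils.get_query_results_as_dict(
    "select status, count(*) as n from " ~ ref('orders') ~ " group by 1"
) %}

{% if execute %}
  {% for status in results['status'] %}
    {{ log("found status: " ~ status, info=true) }}
  {% endfor %}
{% endif %}
```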

get_single_value

Returns a single value from a SQL query.
Parameters:
  • sql_statement (required): SQL query to execute
  • default (optional): Default value if query fails
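A sketch that fetches a watermark value for use in a model (table and column names are illustrative):

```sql
{% set max_date = dbt_utils.get_single_value(
    "select max(order_date) from " ~ ref('orders'),
    default="2020-01-01"
) %}

select * from {{ ref('orders') }}
where order_date > '{{ max_date }}'
```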

SQL Generators

These macros generate SQL code or complete queries, implementing common patterns that are difficult to write in pure SQL.

date_spine

Generates SQL to build a date spine between start and end dates.
Parameters:
  • datepart (required): Date interval (day, month, etc.)
  • start_date (required): Start date
  • end_date (required): End date (exclusive)
{# Create daily date spine for 2023 #}
{{ dbt_utils.date_spine(
    datepart="day",
    start_date="cast('2023-01-01' as date)",
    end_date="cast('2024-01-01' as date)"
) }}

{# Monthly spine for reporting #}
{{ dbt_utils.date_spine(
    datepart="month",
    start_date="cast('2020-01-01' as date)",
    end_date="cast('2025-01-01' as date)"
) }}

{# Use in model for gap analysis #}
with date_spine as (
  {{ dbt_utils.date_spine(
      datepart="day",
      start_date="'2023-01-01'",
      end_date="current_date"
  ) }}
),
sales_data as (
  select date_trunc('day', order_date) as order_day, sum(amount) as daily_sales
  from {{ ref('orders') }}
  group by 1
)
select
  ds.date_day,
  coalesce(sd.daily_sales, 0) as sales
from date_spine ds
left join sales_data sd on ds.date_day = sd.order_day

Expected Results:

Generates complete date series:
| date_day |
| 2023-01-01 |
| 2023-01-02 |
| 2023-01-03 |
| ... |
| 2023-12-31 |

With sales join shows zero sales for missing days, enables gap analysis

deduplicate

Generates SQL to remove duplicate rows from a model, source, or CTE.
Parameters:
  • relation (required): Source relation
  • partition_by (required): Columns to partition by
  • order_by (required): Priority order for deduplication
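A sketch keeping only the most recent row per user (relation and column names are illustrative):

```sql
{{ dbt_utils.deduplicate(
    relation=ref('stg_users'),
    partition_by='user_id',
    order_by='updated_at desc'
) }}
```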

haversine_distance

Calculates the haversine distance between two sets of coordinates.
Parameters:
  • lat1 (required): First latitude
  • lon1 (required): First longitude
  • lat2 (required): Second latitude
  • lon2 (required): Second longitude
  • unit (optional, default='mi'): Distance unit ('mi' or 'km')
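A sketch computing store-to-customer distance in kilometers (column names are illustrative):

```sql
select
  customer_id,
  {{ dbt_utils.haversine_distance(
      lat1='store_lat', lon1='store_lon',
      lat2='customer_lat', lon2='customer_lon',
      unit='km'
  ) }} as distance_km
from {{ ref('customer_locations') }}
```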

group_by

Builds a GROUP BY statement for fields 1...N.
Parameters:
  • n (required): Number of fields to group by
select
  customer_id,
  order_date,
  status,
  count(*) as order_count,
  sum(amount) as total_amount
from {{ ref('orders') }}
{{ dbt_utils.group_by(n=3) }}

{# Dynamic grouping based on columns #}
{% set group_cols = ['region', 'product_category', 'month'] %}
select
  {% for col in group_cols %}
  {{ col }},
  {% endfor %}
  sum(revenue) as total_revenue,
  count(*) as transaction_count
from {{ ref('sales') }}
{{ dbt_utils.group_by(n=group_cols|length) }}

Expected Results:

Generates: "GROUP BY 1,2,3" (references select columns by position)

Equivalent to writing:
GROUP BY customer_id, order_date, status

Useful for maintaining consistency and reducing repetition in complex models

star

Generates a comma-separated list of all fields from a relation, with optional exclusions and transformations.
Parameters:
  • from (required): Source relation
  • except (optional, default=[]): Columns to exclude
  • relation_alias (optional): Table alias for columns
  • prefix (optional): Column prefix
  • suffix (optional): Column suffix
  • quote_identifiers (optional, default=true): Quote column names
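Example usage (relation and column names are illustrative):

```sql
{# Select everything except audit columns #}
select
  {{ dbt_utils.star(from=ref('raw_customers'), except=['_loaded_at', '_row_hash']) }}
from {{ ref('raw_customers') }}

{# Prefix columns to avoid name collisions in joins #}
select
  {{ dbt_utils.star(from=ref('orders'), relation_alias='o', prefix='order_') }},
  {{ dbt_utils.star(from=ref('customers'), relation_alias='c', prefix='customer_') }}
from {{ ref('orders') }} o
join {{ ref('customers') }} c on o.customer_id = c.customer_id
```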

union_relations

Combines multiple relations via UNION ALL, handling column differences and adding source tracking.
Parameters:
  • relations (required): Array of relations to union
  • exclude (optional): Columns to exclude
  • include (optional): Columns to include (mutually exclusive with exclude)
  • column_override (optional): Explicit column type overrides
  • source_column_name (optional, default='_dbt_source_relation'): Source tracking column name
  • where (optional): Filter condition
{# Basic union of similar tables #}
{{ dbt_utils.union_relations(
    relations=[
      ref('orders_2022'),
      ref('orders_2023'),
      ref('orders_2024')
    ]
) }}

{# Union with column exclusions #}
{{ dbt_utils.union_relations(
    relations=[
      source('raw', 'customers_us'),
      source('raw', 'customers_eu'),
      source('raw', 'customers_asia')
    ],
    exclude=['_loaded_at', '_metadata', 'region_specific_field']
) }}

{# Advanced: Type overrides and filtering #}
{{ dbt_utils.union_relations(
    relations=[
      ref('legacy_sales'),
      ref('modern_sales')
    ],
    column_override={
      'amount': 'decimal(10,2)',
      'customer_id': 'varchar(50)'
    },
    source_column_name='data_source',
    where="status = 'active'"
) }}

{# Dynamic union with pattern matching #}
{% set event_relations = dbt_utils.get_relations_by_pattern('events%', 'page_views_%') %}
{{ dbt_utils.union_relations(
    relations=event_relations,
    include=['user_id', 'page_url', 'timestamp', 'session_id']
) }}

Expected Results:

Tables with different schemas:
orders_2022: [id, customer_id, amount, status]
orders_2023: [id, customer_id, amount, status, discount]
orders_2024: [id, customer_id, amount, status, discount, channel]

Union result:
| id | customer_id | amount | status | discount | channel | _dbt_source_relation |
| 1 | 101 | 100.0 | paid | null | null | orders_2022 |
| 2 | 102 | 150.0 | paid | 10.0 | null | orders_2023 |
| 3 | 103 | 200.0 | paid | 15.0 | web | orders_2024 |
Automatically handles missing columns with NULL values and tracks source

generate_series

Generates a 1-indexed series of numbers up to the specified bound.
Parameters:
  • upper_bound (required): Maximum number in series
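A sketch building a 1-12 month scaffold (the macro's output column is named generated_number):

```sql
with month_numbers as (
  {{ dbt_utils.generate_series(upper_bound=12) }}
)
select generated_number as month_number
from month_numbers
```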

generate_surrogate_key

Generates a hashed surrogate key using specified fields (cross-database compatible).
Parameters:
  • field_list (required): List of fields to hash
{# Simple surrogate key from multiple fields #}
select
  {{ dbt_utils.generate_surrogate_key(['customer_id', 'order_date']) }} as order_key,
  customer_id,
  order_date,
  amount
from {{ ref('raw_orders') }}

{# Complex example with many fields #}
select
  {{ dbt_utils.generate_surrogate_key([
    'customer_id',
    'product_id',
    'cast(order_date as string)',
    'channel'
  ]) }} as composite_key,
  *
from {{ ref('order_details') }}

{# Use as primary key in model #}
{{ config(materialized='table') }}
select
  {{ dbt_utils.generate_surrogate_key(['id', 'version']) }} as pk,
  id,
  version,
  data
from {{ source('raw', 'versioned_records') }}

Expected Results:

Generates: md5(concat(coalesce(cast(customer_id as string), '_dbt_utils_surrogate_key_null_'), '-', coalesce(cast(order_date as string), '_dbt_utils_surrogate_key_null_')))

Example output:
| order_key | customer_id | order_date |
| a1b2c3d4e5f6789012345678901234ab | 1001 | 2023-01-15 |
| f9e8d7c6b5a4321098765432109876fe | 1002 | 2023-01-15 |
Creates consistent, deterministic hash keys for composite primary keys

safe_add

Implements safe addition of nullable fields across databases.
Parameters:
  • field_list (required): List of fields to sum
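A sketch summing nullable charge components, with nulls treated as zero (column names are illustrative):

```sql
select
  order_id,
  {{ dbt_utils.safe_add(['subtotal', 'tax', 'shipping']) }} as order_total
from {{ ref('orders') }}
```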

safe_divide

Performs division returning null if denominator is 0.
Parameters:
  • numerator (required): Numerator expression
  • denominator (required): Denominator expression
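A sketch that avoids division-by-zero errors (column names are illustrative):

```sql
select
  campaign_id,
  {{ dbt_utils.safe_divide('total_spend', 'total_clicks') }} as cost_per_click
from {{ ref('campaign_stats') }}
```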

safe_subtract

Implements safe subtraction of nullable fields across databases.
Parameters:
  • field_list (required): List of fields to subtract

pivot

Pivots values from rows to columns with extensive customization options.
Parameters:
  • column (required): Column to pivot
  • values (required): List of values to turn into columns
  • alias (optional, default=true): Create column aliases
  • agg (optional, default='sum'): Aggregation function
  • cmp (optional, default='='): Comparison operator
  • prefix (optional): Column prefix
  • suffix (optional): Column suffix
  • then_value (optional, default=1): Value when condition is true
  • else_value (optional, default=0): Value when condition is false
  • quote_identifiers (optional, default=true): Quote column aliases
  • distinct (optional, default=false): Use DISTINCT in aggregation
{# Basic pivot - count by status #}
select
  customer_id,
  {{ dbt_utils.pivot(
    'order_status',
    dbt_utils.get_column_values(ref('orders'), 'order_status')
  ) }}
from {{ ref('orders') }}
group by customer_id

{# Revenue pivot with custom aggregation #}
select
  product_category,
  {{ dbt_utils.pivot(
    'payment_method',
    ['credit_card', 'paypal', 'cash'],
    agg='sum',
    then_value='amount',
    else_value=0,
    prefix='revenue_',
    suffix='_total'
  ) }}
from {{ ref('sales') }}
group by product_category

{# Advanced: Distinct count with custom comparison #}
select
  date_trunc('month', order_date) as month,
  {{ dbt_utils.pivot(
    'customer_tier',
    ['bronze', 'silver', 'gold', 'platinum'],
    agg='count',
    then_value='distinct customer_id',
    distinct=true
  ) }}
from {{ ref('customer_orders') }}
group by 1

Expected Results:

Input data (orders):
| customer_id | order_status |
| 101 | completed |
| 101 | pending |
| 101 | completed |
| 102 | cancelled |
Pivot output:
| customer_id | completed | pending | cancelled |
| 101 | 2 | 1 | 0 |
| 102 | 0 | 0 | 1 |
Revenue pivot generates: revenue_credit_card_total, revenue_paypal_total, etc.

unpivot

Un-pivots a table from wide to long format (similar to pandas melt).
Parameters:
  • relation (required): Source relation
  • cast_to (optional, default='varchar'): Data type for values
  • exclude (optional): Columns to exclude from unpivot but keep
  • remove (optional): Columns to remove completely
  • field_name (optional): Column name for field names
  • value_name (optional): Column name for values
  • quote_identifiers (optional, default=false): Quote identifiers
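A sketch melting monthly value columns into rows (relation and column names are illustrative):

```sql
{{ dbt_utils.unpivot(
    relation=ref('monthly_sales'),
    cast_to='numeric',
    exclude=['product_id'],
    field_name='month',
    value_name='sales_amount'
) }}
```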

width_bucket

Creates equi-width histograms (modeled after Snowflake's width_bucket function).
Parameters:
  • expr (required): Expression to create histogram for
  • min_value (required): Lower bound
  • max_value (required): Upper bound
  • num_buckets (required): Number of buckets
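A sketch bucketing order amounts into a 10-bin histogram over the range 0 to 1000 (table and column names are illustrative):

```sql
select
  {{ dbt_utils.width_bucket('amount', 0, 1000, 10) }} as amount_bucket,
  count(*) as orders
from {{ ref('orders') }}
group by 1
```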

Web Macros

Macros for extracting and manipulating URL components.

get_url_parameter

Extracts a URL parameter from a column containing a URL.
Parameters:
  • field (required): Column containing URL
  • url_parameter (required): Parameter name to extract
select
  page_url,
  {{ dbt_utils.get_url_parameter(field='page_url', url_parameter='utm_source') }} as utm_source,
  {{ dbt_utils.get_url_parameter(field='page_url', url_parameter='utm_medium') }} as utm_medium,
  {{ dbt_utils.get_url_parameter(field='page_url', url_parameter='utm_campaign') }} as utm_campaign
from {{ ref('web_analytics') }}

{# Extract multiple parameters for marketing analysis #}
select
  session_id,
  {{ dbt_utils.get_url_parameter(field='referrer_url', url_parameter='q') }} as search_query,
  {{ dbt_utils.get_url_parameter(field='current_url', url_parameter='product_id') }} as viewed_product,
  timestamp
from {{ ref('page_views') }}
where referrer_url is not null

Expected Results:

From URL: "https://example.com/page?utm_source=google&utm_medium=cpc&utm_campaign=spring_sale"

| page_url | utm_source | utm_medium | utm_campaign |
| https://example.com/page?utm_source=g... | google | cpc | spring_sale |
| https://example.com/other | null | null | null |
Extracts URL parameters for marketing attribution and campaign tracking

get_url_host

Extracts the hostname from a URL column.
Parameters:
  • field (required): Column containing URL
select
  referrer_url,
  {{ dbt_utils.get_url_host(field='referrer_url') }} as referrer_domain,
  count(*) as visits
from {{ ref('web_sessions') }}
where referrer_url is not null
group by 1, 2
order by visits desc

{# Analyze traffic sources by domain #}
select
  {{ dbt_utils.get_url_host(field='page_url') }} as site_domain,
  {{ dbt_utils.get_url_host(field='referrer_url') }} as referrer_domain,
  count(*) as sessions,
  sum(revenue) as total_revenue
from {{ ref('session_data') }}
group by 1, 2

Expected Results:

| referrer_url | referrer_domain | visits |
| https://google.com/search?q=... | google.com | 1,250 |
| https://facebook.com/post/123 | facebook.com | 850 |
| https://twitter.com/user/tweet | twitter.com | 420 |
Clean domain extraction for referrer analysis and traffic source reporting

get_url_path

Extracts the path from a URL column.
Parameters:
  • field (required): Column containing URL
select
  page_url,
  {{ dbt_utils.get_url_path(field='page_url') }} as page_path,
  count(*) as page_views,
  count(distinct session_id) as unique_sessions
from {{ ref('page_analytics') }}
group by 1, 2
order by page_views desc

{# Content performance analysis #}
select
  case
    when {{ dbt_utils.get_url_path(field='page_url') }} like '/product/%' then 'Product'
    when {{ dbt_utils.get_url_path(field='page_url') }} like '/blog/%' then 'Blog'
    when {{ dbt_utils.get_url_path(field='page_url') }} = '/' then 'Homepage'
    else 'Other'
  end as page_category,
  avg(time_on_page) as avg_time_on_page,
  count(*) as total_views
from {{ ref('user_behavior') }}
group by 1

Expected Results:

| page_url | page_path | page_views |
| https://shop.com/products/shoes | /products/shoes | 5,420 |
| https://shop.com/blog/style-tips | /blog/style-tips | 2,180 |
| https://shop.com/ | / | 8,950 |
Enables path-based analysis for content performance and user journey mapping

Jinja Helpers

Utility macros for Jinja templating and logging.

pretty_time

Returns a formatted string of the current timestamp.
Parameters:
  • format (optional): Date format string
{# Basic timestamp in logs #}
{{ log(dbt_utils.pretty_time() ~ " - Starting data processing", info=true) }}

{# Custom format for model documentation #}
/*
Model refreshed: {{ dbt_utils.pretty_time(format='%Y-%m-%d %H:%M:%S') }}
Generated by: {{ this.name }}
*/
select * from {{ ref('source_data') }}

{# In macro for conditional processing #}
{% if target.name == 'prod' %}
  {{ log("Production run started at " ~ dbt_utils.pretty_time(), info=true) }}
{% endif %}

Expected Results:

Default format: "14:50:34"
Custom format: "2024-01-15 14:50:34"

In logs appears as:
14:50:34 - Starting data processing
15:23:18 | Production run started at 15:23:18

pretty_log_format

Formats input for pretty command line logging.
Parameters:
  • message (required): Message to format

log_info

Logs a formatted message with timestamp to the command line.
Parameters:
  • message (required): Message to log
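A minimal sketch (the message text is illustrative):

```sql
{{ dbt_utils.log_info("Starting incremental load for " ~ this) }}
```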

slugify

Transforms Jinja strings into database-friendly "slugs" by removing special characters and spaces.
Parameters:
  • string (required): String to slugify
{# Dynamic column creation from data values #}
{% set payment_methods = dbt_utils.get_column_values(ref('payments'), 'payment_method') %}

select
  customer_id,
  {% for method in payment_methods %}
  sum(case when payment_method = '{{ method }}' then amount end) as {{ dbt_utils.slugify(method) }}_total,
  {% endfor %}
  sum(amount) as total
from {{ ref('payments') }}
group by customer_id

{# Clean column names for complex strings #}
{% set categories = ['High-Value Customer', 'New User!', 'VIP (Premium)', '30-Day Trial'] %}

select
  {% for cat in categories %}
  sum(case when category = '{{ cat }}' then 1 else 0 end) as {{ dbt_utils.slugify(cat) }}_count,
  {% endfor %}
  count(*) as total_users
from {{ ref('customer_segments') }}

Expected Results:

Input → Output transformations:
'ca$h money' → 'cah_money'
'Venmo App' → 'venmo_app'
'High-Value Customer' → 'high_value_customer'
'VIP (Premium)' → 'vip_premium'
'30-Day Trial' → '_30_day_trial'

Generated columns are database-safe identifiers for dynamic SQL generation

Cross-database Macros (Deprecated)

Note: Cross-database macros have been removed from dbt_utils version 1.0 as they are now implemented in dbt Core. See dbt Core documentation for current implementations.

Summary Table

| Category | Count | Primary Use Cases | Key Examples |
| Generic Tests | 16 | Data quality validation, integrity checks, relationship testing | equality, recency, mutually_exclusive_ranges |
| Introspective | 6 | Database introspection, metadata extraction, dynamic queries | get_column_values, get_relations_by_pattern |
| SQL Generators | 14 | Complex SQL patterns, data manipulation, cross-database compatibility | pivot, union_relations, generate_surrogate_key |
| Web | 3 | URL parsing and manipulation | get_url_parameter, get_url_host |
| Jinja Helpers | 4 | Template utilities, logging, string manipulation | slugify, pretty_time, log_info |

Key Features of dbt_utils:

Installation: Add dbt-labs/dbt_utils to your packages.yml and run dbt deps.
Usage: Reference macros with dbt_utils. prefix, e.g., {{ dbt_utils.generate_surrogate_key(['col1', 'col2']) }}
Cross-database support: Each macro uses adapter dispatch, allowing database-specific implementations.

📚 Official dbt Documentation

For comprehensive information about dbt macros and utilities, refer to the official dbt documentation (docs.getdbt.com) and the dbt_utils repository on GitHub (dbt-labs/dbt-utils); both are the authoritative sources and are regularly updated.