dbt Macros Comprehensive Guide
What are dbt Macros?
Macros in dbt are reusable pieces of code that function like "functions" in other programming languages. They use Jinja templating to generate dynamic SQL and help you avoid code repetition across your dbt project.
Key Characteristics:
- Written in SQL files using Jinja syntax
- Stored in the
macros/
directory
- Can accept parameters and return SQL code
- Available globally across your dbt project
- Can be shared across projects via packages
Think of macros as SQL templates that can be customized with parameters and reused wherever needed in your models, tests, and other macros.
When to Use Macros
Common Use Cases:
Use Case |
Description |
Example |
Code Reuse |
Eliminate repetitive SQL patterns |
Date formatting, string cleaning |
Dynamic SQL |
Generate SQL based on conditions |
Conditional column selection |
Cross-database Logic |
Abstract database-specific syntax |
Date functions across warehouses |
Complex Transformations |
Encapsulate business logic |
Currency conversion, tax calculations |
Testing Utilities |
Create reusable test logic |
Custom data quality checks |
💡 Tip: Before creating a macro, check if a similar function exists in dbt-utils or other community packages to avoid reinventing the wheel.
Creating Your First Macro
Basic Macro Structure:
{% macro macro_name(parameter1, parameter2=default_value) %}
{{ parameter1 }}
{% endmacro %}
Simple Example:
{% macro cents_to_dollars(column_name, scale=2) %}
({{ column_name }} / 100)::numeric(16, {{ scale }})
{% endmacro %}
Using the Macro in a Model:
select
order_id,
{{ cents_to_dollars('price_in_cents') }} as price_in_dollars,
{{ cents_to_dollars('tax_in_cents', 4) }} as tax_in_dollars
from {{ source('raw', 'orders') }}
Macro Syntax and Parameters
Parameter Types:
Type |
Syntax |
Description |
Required |
parameter_name |
Must be provided when calling the macro |
Optional |
parameter_name=default |
Has a default value if not provided |
Keyword Args |
**kwargs |
Accepts additional named parameters |
Advanced Parameter Example:
{% macro generate_alias_name(custom_alias_name=none, node=none) %}
{%- if custom_alias_name is none -%}
{{ node.name }}
{%- else -%}
{{ custom_alias_name | trim }}
{%- endif -%}
{% endmacro %}
⚠️ Important: When passing string literals to macros, always use quotes: {{ my_macro('string_value') }}
Types of Macros
1. Utility Macros
Simple transformations and calculations:
{% macro safe_divide(numerator, denominator) %}
case
when {{ denominator }} = 0 then null
else {{ numerator }} / {{ denominator }}
end
{% endmacro %}
2. SQL Generation Macros
Generate complex SQL structures:
{% macro pivot(column, values, agg='sum', quote_identifiers=true) %}
{%- for value in values %}
{{ agg }}(case when {{ column }} = '{{ value }}' then 1 else 0 end)
as {{ value | replace(' ', '_') }}
{%- if not loop.last %},{% endif -%}
{%- endfor %}
{% endmacro %}
3. Test Macros
Create custom data quality tests:
{% test not_null_proportion(model, column_name, at_least=0.95) %}
select *
from (
select
avg(case when {{ column_name }} is not null then 1.0 else 0.0 end) as not_null_proportion
from {{ model }}
) validation
where not_null_proportion < {{ at_least }}
{% endtest %}
Jinja Functions in Macros
Essential dbt Jinja Functions:
Function |
Purpose |
Example |
ref() |
Reference other models |
{{ ref('my_model') }} |
source() |
Reference source tables |
{{ source('raw', 'users') }} |
var() |
Access variables |
{{ var('start_date') }} |
env_var() |
Access environment variables |
{{ env_var('DBT_ENV') }} |
run_query() |
Execute queries at compile time |
{{ run_query('select 1') }} |
Macro with Context Variables:
{% macro audit_columns() %}
'{{ target.name }}' as target_name,
'{{ model.name }}' as model_name,
current_timestamp() as dbt_updated_at,
'{{ invocation_id }}' as dbt_invocation_id
{% endmacro %}
Best Practices
🎯 Design Principles:
- Favor readability over extreme abstraction - Don't make macros too complex
- Single responsibility - Each macro should do one thing well
- Consistent naming - Use clear, descriptive names
- Document your macros - Include descriptions and examples
📝 Code Organization:
- Store macros in logical subdirectories:
macros/utils/
, macros/tests/
- Use meaningful file names that describe the macro's purpose
- Group related macros in the same file
- Keep macro files focused and modular
Macro Documentation Example:
version: 2
macros:
- name: cents_to_dollars
description: "Converts a column from cents to dollars with specified precision"
arguments:
- name: column_name
type: column
description: "The column containing cent values to convert"
- name: scale
type: int
description: "Number of decimal places (default: 2)"
⚠️ Performance Considerations:
- Avoid using
run_query()
in macros that are called frequently
- Be careful with loops that generate large amounts of SQL
- Test compiled SQL output to ensure efficiency
Advanced Features
Macro Overriding
Override built-in or package macros by creating a macro with the same name:
{% macro generate_schema_name(custom_schema_name, node) %}
{%- set default_schema = target.schema -%}
{%- if custom_schema_name is none -%}
{{ default_schema }}
{%- else -%}
{{ default_schema }}_{{ custom_schema_name | trim }}
{%- endif -%}
{% endmacro %}
Conditional Logic in Macros
{% macro date_trunc_by_period(date_column, period) %}
{%- if period == 'day' -%}
date_trunc('day', {{ date_column }})
{%- elif period == 'week' -%}
date_trunc('week', {{ date_column }})
{%- elif period == 'month' -%}
date_trunc('month', {{ date_column }})
{%- else -%}
{{ 'Unsupported period: ' ~ period }}
{%- endif -%}
{% endmacro %}
Cross-Database Compatibility
{% macro current_timestamp_tz() %}
{%- if target.type == 'snowflake' -%}
current_timestamp()
{%- elif target.type == 'bigquery' -%}
current_timestamp()
{%- elif target.type == 'postgres' -%}
now()
{%- else -%}
current_timestamp()
{%- endif -%}
{% endmacro %}
Common Macro Patterns
1. Column List Generation
{% macro get_column_values(table, column) %}
{%- call statement('get_values', fetch_result=true) -%}
select distinct {{ column }}
from {{ table }}
where {{ column }} is not null
{%- endcall -%}
{%- set value_list = load_result('get_values') -%}
{%- if execute -%}
{%- set values = value_list['data'] | map(attribute=0) | list -%}
{{ return(values) }}
{%- endif -%}
{% endmacro %}
2. Surrogate Key Generation
{% macro surrogate_key(field_list) %}
md5(concat(
{%- for field in field_list %}
coalesce(cast({{ field }} as string), '')
{%- if not loop.last %}, '|', {% endif -%}
{%- endfor %}
))
{% endmacro %}
3. Dynamic WHERE Clauses
{% macro date_filter(date_column, start_date=none, end_date=none) %}
{%- if start_date and end_date -%}
{{ date_column }} between '{{ start_date }}' and '{{ end_date }}'
{%- elif start_date -%}
{{ date_column }} >= '{{ start_date }}'
{%- elif end_date -%}
{{ date_column }} <= '{{ end_date }}'
{%- else -%}
1=1
{%- endif -%}
{% endmacro %}
Debugging Macros
Debugging Techniques:
1. Check Compiled SQL
Always review the compiled SQL in the target/compiled/
directory to ensure your macro generates the expected output.
2. Use the log() Function
{% macro debug_macro(value) %}
{{ log('Debug value: ' ~ value, info=true) }}
{{ return(value) }}
{% endmacro %}
3. Test with Simple Inputs
Start with simple, known inputs to verify macro behavior before using complex parameters.
💡 Pro Tip: Use dbt compile
to generate SQL without running it, perfect for debugging macro output.
Practical Examples
💾 Code Examples: All example macros are available in the
code examples directory.
Each file contains ready-to-use macro code with usage instructions.
Available Code Examples:
Example 1: Revenue Recognition Macro
{% macro calculate_monthly_revenue(amount_column, start_date_column, end_date_column) %}
case
when date_trunc('month', {{ start_date_column }}) =
date_trunc('month', {{ end_date_column }})
then {{ amount_column }}
else {{ amount_column }} *
extract(day from last_day({{ start_date_column }})) /
(extract(day from {{ end_date_column }}) -
extract(day from {{ start_date_column }}) + 1)
end
{% endmacro %}
Example 2: Data Quality Validation
{% macro validate_email(email_column) %}
case
when {{ email_column }} is null then 'missing'
when length({{ email_column }}) = 0 then 'empty'
when {{ email_column }} not like '%@%' then 'no_at_symbol'
when {{ email_column }} not like '%.%' then 'no_domain'
else 'valid'
end
{% endmacro %}
Example 3: Dynamic Table Creation
{% macro create_staging_table(source_table, columns_to_exclude=[]) %}
{%- set source_columns = adapter.get_columns_in_relation(source_table) -%}
select
{%- for column in source_columns %}
{%- if column.name not in columns_to_exclude %}
trim({{ column.name }}) as {{ column.name }}
{%- if not loop.last %},{% endif -%}
{%- endif %}
{%- endfor %}
from {{ source_table }}
{% endmacro %}