dbt Macros Comprehensive Guide

What are dbt Macros?

Macros in dbt are reusable pieces of code that function like "functions" in other programming languages. They use Jinja templating to generate dynamic SQL and help you avoid code repetition across your dbt project.

Key Characteristics:

Written in SQL files using Jinja syntax
Stored in the macros/ directory
Can accept parameters and return SQL code
Available globally across your dbt project
Can be shared across projects via packages

Think of macros as SQL templates that can be customized with parameters and reused wherever needed in your models, tests, and other macros.

When to Use Macros

Common Use Cases:

Use Case	Description	Example
Code Reuse	Eliminate repetitive SQL patterns	Date formatting, string cleaning
Dynamic SQL	Generate SQL based on conditions	Conditional column selection
Cross-database Logic	Abstract database-specific syntax	Date functions across warehouses
Complex Transformations	Encapsulate business logic	Currency conversion, tax calculations
Testing Utilities	Create reusable test logic	Custom data quality checks

💡 Tip: Before creating a macro, check if a similar function exists in dbt-utils or other community packages to avoid reinventing the wheel.

Creating Your First Macro

Basic Macro Structure:

{% macro macro_name(parameter1, parameter2=default_value) %}
    -- SQL code goes here
    {{ parameter1 }}
{% endmacro %}
        

Simple Example:

-- macros/cents_to_dollars.sql
{% macro cents_to_dollars(column_name, scale=2) %}
    ({{ column_name }} / 100)::numeric(16, {{ scale }})
{% endmacro %}
        

Using the Macro in a Model:

-- models/orders.sql
select
    order_id,
    {{ cents_to_dollars('price_in_cents') }} as price_in_dollars,
    {{ cents_to_dollars('tax_in_cents', 4) }} as tax_in_dollars
from {{ source('raw', 'orders') }}
        

Macro Syntax and Parameters

Parameter Types:

Type	Syntax	Description
Required	`parameter_name`	Must be provided when calling the macro
Optional	`parameter_name=default`	Has a default value if not provided
Keyword Args	`**kwargs`	Accepts additional named parameters

Advanced Parameter Example:

{% macro generate_alias_name(custom_alias_name=none, node=none) %}
    {%- if custom_alias_name is none -%}
        {{ node.name }}
    {%- else -%}
        {{ custom_alias_name | trim }}
    {%- endif -%}
{% endmacro %}
        

⚠️ Important: When passing string literals to macros, always use quotes: {{ my_macro('string_value') }}

Types of Macros

1. Utility Macros

Simple transformations and calculations:

{% macro safe_divide(numerator, denominator) %}
    case
        when {{ denominator }} = 0 then null
        else {{ numerator }} / {{ denominator }}
    end
{% endmacro %}
        

2. SQL Generation Macros

Generate complex SQL structures:

{% macro pivot(column, values, agg='sum', quote_identifiers=true) %}
    {%- for value in values %}
        {{ agg }}(case when {{ column }} = '{{ value }}' then 1 else 0 end) 
        as {{ value | replace(' ', '_') }}
        {%- if not loop.last %},{% endif -%}
    {%- endfor %}
{% endmacro %}
        

3. Test Macros

Create custom data quality tests:

{% test not_null_proportion(model, column_name, at_least=0.95) %}
    select *
    from (
        select
            avg(case when {{ column_name }} is not null then 1.0 else 0.0 end) as not_null_proportion
        from {{ model }}
    ) validation
    where not_null_proportion < {{ at_least }}
{% endtest %}
        

Jinja Functions in Macros

Essential dbt Jinja Functions:

Function	Purpose	Example
`ref()`	Reference other models	`{{ ref('my_model') }}`
`source()`	Reference source tables	`{{ source('raw', 'users') }}`
`var()`	Access variables	`{{ var('start_date') }}`
`env_var()`	Access environment variables	`{{ env_var('DBT_ENV') }}`
`run_query()`	Execute queries at compile time	`{{ run_query('select 1') }}`

Macro with Context Variables:

{% macro audit_columns() %}
    '{{ target.name }}' as target_name,
    '{{ model.name }}' as model_name,
    current_timestamp() as dbt_updated_at,
    '{{ invocation_id }}' as dbt_invocation_id
{% endmacro %}
        

Best Practices

🎯 Design Principles:

Favor readability over extreme abstraction - Don't make macros too complex
Single responsibility - Each macro should do one thing well
Consistent naming - Use clear, descriptive names
Document your macros - Include descriptions and examples

📝 Code Organization:

Store macros in logical subdirectories: macros/utils/, macros/tests/
Use meaningful file names that describe the macro's purpose
Group related macros in the same file
Keep macro files focused and modular

Macro Documentation Example:

-- macros/utils/schema.yml
version: 2
macros:
  - name: cents_to_dollars
    description: "Converts a column from cents to dollars with specified precision"
    arguments:
      - name: column_name
        type: column
        description: "The column containing cent values to convert"
      - name: scale
        type: int
        description: "Number of decimal places (default: 2)"
        

⚠️ Performance Considerations:

Avoid using run_query() in macros that are called frequently
Be careful with loops that generate large amounts of SQL
Test compiled SQL output to ensure efficiency

Advanced Features

Macro Overriding

Override built-in or package macros by creating a macro with the same name:

-- Override the generate_schema_name macro
{% macro generate_schema_name(custom_schema_name, node) %}
    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}
        {{ default_schema }}
    {%- else -%}
        {{ default_schema }}_{{ custom_schema_name | trim }}
    {%- endif -%}
{% endmacro %}
        

Conditional Logic in Macros

{% macro date_trunc_by_period(date_column, period) %}
    {%- if period == 'day' -%}
        date_trunc('day', {{ date_column }})
    {%- elif period == 'week' -%}
        date_trunc('week', {{ date_column }})
    {%- elif period == 'month' -%}
        date_trunc('month', {{ date_column }})
    {%- else -%}
        {{ 'Unsupported period: ' ~ period }}
    {%- endif -%}
{% endmacro %}
        

Cross-Database Compatibility

{% macro current_timestamp_tz() %}
    {%- if target.type == 'snowflake' -%}
        current_timestamp()
    {%- elif target.type == 'bigquery' -%}
        current_timestamp()
    {%- elif target.type == 'postgres' -%}
        now()
    {%- else -%}
        current_timestamp()
    {%- endif -%}
{% endmacro %}
        

Common Macro Patterns

1. Column List Generation

{% macro get_column_values(table, column) %}
    {%- call statement('get_values', fetch_result=true) -%}
        select distinct {{ column }}
        from {{ table }}
        where {{ column }} is not null
    {%- endcall -%}
    
    {%- set value_list = load_result('get_values') -%}
    {%- if execute -%}
        {%- set values = value_list['data'] | map(attribute=0) | list -%}
        {{ return(values) }}
    {%- endif -%}
{% endmacro %}
        

2. Surrogate Key Generation

{% macro surrogate_key(field_list) %}
    md5(concat(
        {%- for field in field_list %}
            coalesce(cast({{ field }} as string), '')
            {%- if not loop.last %}, '|', {% endif -%}
        {%- endfor %}
    ))
{% endmacro %}
        

3. Dynamic WHERE Clauses

{% macro date_filter(date_column, start_date=none, end_date=none) %}
    {%- if start_date and end_date -%}
        {{ date_column }} between '{{ start_date }}' and '{{ end_date }}'
    {%- elif start_date -%}
        {{ date_column }} >= '{{ start_date }}'
    {%- elif end_date -%}
        {{ date_column }} <= '{{ end_date }}'
    {%- else -%}
        1=1
    {%- endif -%}
{% endmacro %}
        

Debugging Macros

Debugging Techniques:

1. Check Compiled SQL

Always review the compiled SQL in the target/compiled/ directory to ensure your macro generates the expected output.

2. Use the log() Function

{% macro debug_macro(value) %}
    {{ log('Debug value: ' ~ value, info=true) }}
    {{ return(value) }}
{% endmacro %}
            

3. Test with Simple Inputs

Start with simple, known inputs to verify macro behavior before using complex parameters.

💡 Pro Tip: Use dbt compile to generate SQL without running it, perfect for debugging macro output.

Practical Examples

💾 Code Examples: All example macros are available in the code examples directory. Each file contains ready-to-use macro code with usage instructions.

Available Code Examples:

Example	File	Description
Cents to Dollars	01_cents_to_dollars.sql	Basic utility macro for currency conversion
Safe Division	02_safe_divide.sql	Division macro that handles zero denominators
Audit Columns	03_audit_columns.sql	Metadata tracking for models
Pivot Macro	04_pivot_macro.sql	Dynamic pivot table generation
Surrogate Key	05_surrogate_key.sql	Generate hash-based surrogate keys
Cross-Database Timestamp	06_cross_database_timestamp.sql	Database-agnostic timestamp functions
Custom Test	07_custom_test.sql	Data quality test macro
Date Filter	08_date_filter.sql	Dynamic date filtering logic
Email Validation	09_email_validation.sql	Email format validation macro
Dynamic Schema	10_dynamic_schema_name.sql	Custom schema naming logic

Example 1: Revenue Recognition Macro

{% macro calculate_monthly_revenue(amount_column, start_date_column, end_date_column) %}
    case
        when date_trunc('month', {{ start_date_column }}) = 
             date_trunc('month', {{ end_date_column }})
        then {{ amount_column }}
        else {{ amount_column }} * 
             extract(day from last_day({{ start_date_column }})) /
             (extract(day from {{ end_date_column }}) - 
              extract(day from {{ start_date_column }}) + 1)
    end
{% endmacro %}
        

Example 2: Data Quality Validation

{% macro validate_email(email_column) %}
    case
        when {{ email_column }} is null then 'missing'
        when length({{ email_column }}) = 0 then 'empty'
        when {{ email_column }} not like '%@%' then 'no_at_symbol'
        when {{ email_column }} not like '%.%' then 'no_domain'
        else 'valid'
    end
{% endmacro %}
        

Example 3: Dynamic Table Creation

{% macro create_staging_table(source_table, columns_to_exclude=[]) %}
    {%- set source_columns = adapter.get_columns_in_relation(source_table) -%}
    
    select
    {%- for column in source_columns %}
        {%- if column.name not in columns_to_exclude %}
        trim({{ column.name }}) as {{ column.name }}
        {%- if not loop.last %},{% endif -%}
        {%- endif %}
    {%- endfor %}
    from {{ source_table }}
{% endmacro %}
        

dbt Macros Comprehensive Guide

Table of Contents

What are dbt Macros?

Key Characteristics:

When to Use Macros

Common Use Cases:

Creating Your First Macro

Basic Macro Structure:

Simple Example:

Using the Macro in a Model:

Macro Syntax and Parameters

Parameter Types:

Advanced Parameter Example:

Types of Macros

1. Utility Macros

2. SQL Generation Macros

3. Test Macros

Jinja Functions in Macros

Essential dbt Jinja Functions:

Macro with Context Variables:

Best Practices

🎯 Design Principles:

📝 Code Organization:

Macro Documentation Example:

Advanced Features

Macro Overriding

Conditional Logic in Macros

Cross-Database Compatibility

Common Macro Patterns

1. Column List Generation

2. Surrogate Key Generation

3. Dynamic WHERE Clauses

Debugging Macros

Debugging Techniques:

1. Check Compiled SQL

2. Use the log() Function

3. Test with Simple Inputs

Practical Examples

Available Code Examples:

Example 1: Revenue Recognition Macro

Example 2: Data Quality Validation

Example 3: Dynamic Table Creation

Official Documentation Links