dbt Macros Comprehensive Guide

Table of Contents

What are dbt Macros?

Macros in dbt are reusable pieces of code that function like "functions" in other programming languages. They use Jinja templating to generate dynamic SQL and help you avoid code repetition across your dbt project.

Key Characteristics:

Think of macros as SQL templates that can be customized with parameters and reused wherever needed in your models, tests, and other macros.

When to Use Macros

Common Use Cases:

Use Case Description Example
Code Reuse Eliminate repetitive SQL patterns Date formatting, string cleaning
Dynamic SQL Generate SQL based on conditions Conditional column selection
Cross-database Logic Abstract database-specific syntax Date functions across warehouses
Complex Transformations Encapsulate business logic Currency conversion, tax calculations
Testing Utilities Create reusable test logic Custom data quality checks
💡 Tip: Before creating a macro, check if a similar function exists in dbt-utils or other community packages to avoid reinventing the wheel.

Creating Your First Macro

Basic Macro Structure:

{% macro macro_name(parameter1, parameter2=default_value) %} -- SQL code goes here {{ parameter1 }} {% endmacro %}

Simple Example:

-- macros/cents_to_dollars.sql {% macro cents_to_dollars(column_name, scale=2) %} ({{ column_name }} / 100)::numeric(16, {{ scale }}) {% endmacro %}

Using the Macro in a Model:

-- models/orders.sql select order_id, {{ cents_to_dollars('price_in_cents') }} as price_in_dollars, {{ cents_to_dollars('tax_in_cents', 4) }} as tax_in_dollars from {{ source('raw', 'orders') }}

Macro Syntax and Parameters

Parameter Types:

Type Syntax Description
Required parameter_name Must be provided when calling the macro
Optional parameter_name=default Has a default value if not provided
Keyword Args **kwargs Accepts additional named parameters

Advanced Parameter Example:

{% macro generate_alias_name(custom_alias_name=none, node=none) %} {%- if custom_alias_name is none -%} {{ node.name }} {%- else -%} {{ custom_alias_name | trim }} {%- endif -%} {% endmacro %}
⚠️ Important: When passing string literals to macros, always use quotes: {{ my_macro('string_value') }}

Types of Macros

1. Utility Macros

Simple transformations and calculations:

{% macro safe_divide(numerator, denominator) %} case when {{ denominator }} = 0 then null else {{ numerator }} / {{ denominator }} end {% endmacro %}

2. SQL Generation Macros

Generate complex SQL structures:

{% macro pivot(column, values, agg='sum', quote_identifiers=true) %} {%- for value in values %} {{ agg }}(case when {{ column }} = '{{ value }}' then 1 else 0 end) as {{ value | replace(' ', '_') }} {%- if not loop.last %},{% endif -%} {%- endfor %} {% endmacro %}

3. Test Macros

Create custom data quality tests:

{% test not_null_proportion(model, column_name, at_least=0.95) %} select * from ( select avg(case when {{ column_name }} is not null then 1.0 else 0.0 end) as not_null_proportion from {{ model }} ) validation where not_null_proportion < {{ at_least }} {% endtest %}

Jinja Functions in Macros

Essential dbt Jinja Functions:

Function Purpose Example
ref() Reference other models {{ ref('my_model') }}
source() Reference source tables {{ source('raw', 'users') }}
var() Access variables {{ var('start_date') }}
env_var() Access environment variables {{ env_var('DBT_ENV') }}
run_query() Execute queries at compile time {{ run_query('select 1') }}

Macro with Context Variables:

{% macro audit_columns() %} '{{ target.name }}' as target_name, '{{ model.name }}' as model_name, current_timestamp() as dbt_updated_at, '{{ invocation_id }}' as dbt_invocation_id {% endmacro %}

Best Practices

🎯 Design Principles:

📝 Code Organization:

Macro Documentation Example:

-- macros/utils/schema.yml version: 2 macros: - name: cents_to_dollars description: "Converts a column from cents to dollars with specified precision" arguments: - name: column_name type: column description: "The column containing cent values to convert" - name: scale type: int description: "Number of decimal places (default: 2)"
⚠️ Performance Considerations:

Advanced Features

Macro Overriding

Override built-in or package macros by creating a macro with the same name:

-- Override the generate_schema_name macro {% macro generate_schema_name(custom_schema_name, node) %} {%- set default_schema = target.schema -%} {%- if custom_schema_name is none -%} {{ default_schema }} {%- else -%} {{ default_schema }}_{{ custom_schema_name | trim }} {%- endif -%} {% endmacro %}

Conditional Logic in Macros

{% macro date_trunc_by_period(date_column, period) %} {%- if period == 'day' -%} date_trunc('day', {{ date_column }}) {%- elif period == 'week' -%} date_trunc('week', {{ date_column }}) {%- elif period == 'month' -%} date_trunc('month', {{ date_column }}) {%- else -%} {{ 'Unsupported period: ' ~ period }} {%- endif -%} {% endmacro %}

Cross-Database Compatibility

{% macro current_timestamp_tz() %} {%- if target.type == 'snowflake' -%} current_timestamp() {%- elif target.type == 'bigquery' -%} current_timestamp() {%- elif target.type == 'postgres' -%} now() {%- else -%} current_timestamp() {%- endif -%} {% endmacro %}

Common Macro Patterns

1. Column List Generation

{% macro get_column_values(table, column) %} {%- call statement('get_values', fetch_result=true) -%} select distinct {{ column }} from {{ table }} where {{ column }} is not null {%- endcall -%} {%- set value_list = load_result('get_values') -%} {%- if execute -%} {%- set values = value_list['data'] | map(attribute=0) | list -%} {{ return(values) }} {%- endif -%} {% endmacro %}

2. Surrogate Key Generation

{% macro surrogate_key(field_list) %} md5(concat( {%- for field in field_list %} coalesce(cast({{ field }} as string), '') {%- if not loop.last %}, '|', {% endif -%} {%- endfor %} )) {% endmacro %}

3. Dynamic WHERE Clauses

{% macro date_filter(date_column, start_date=none, end_date=none) %} {%- if start_date and end_date -%} {{ date_column }} between '{{ start_date }}' and '{{ end_date }}' {%- elif start_date -%} {{ date_column }} >= '{{ start_date }}' {%- elif end_date -%} {{ date_column }} <= '{{ end_date }}' {%- else -%} 1=1 {%- endif -%} {% endmacro %}

Debugging Macros

Debugging Techniques:

1. Check Compiled SQL

Always review the compiled SQL in the target/compiled/ directory to ensure your macro generates the expected output.

2. Use the log() Function

{% macro debug_macro(value) %} {{ log('Debug value: ' ~ value, info=true) }} {{ return(value) }} {% endmacro %}

3. Test with Simple Inputs

Start with simple, known inputs to verify macro behavior before using complex parameters.

💡 Pro Tip: Use dbt compile to generate SQL without running it, perfect for debugging macro output.

Practical Examples

💾 Code Examples: All example macros are available in the code examples directory. Each file contains ready-to-use macro code with usage instructions.

Available Code Examples:

Example File Description
Cents to Dollars 01_cents_to_dollars.sql Basic utility macro for currency conversion
Safe Division 02_safe_divide.sql Division macro that handles zero denominators
Audit Columns 03_audit_columns.sql Metadata tracking for models
Pivot Macro 04_pivot_macro.sql Dynamic pivot table generation
Surrogate Key 05_surrogate_key.sql Generate hash-based surrogate keys
Cross-Database Timestamp 06_cross_database_timestamp.sql Database-agnostic timestamp functions
Custom Test 07_custom_test.sql Data quality test macro
Date Filter 08_date_filter.sql Dynamic date filtering logic
Email Validation 09_email_validation.sql Email format validation macro
Dynamic Schema 10_dynamic_schema_name.sql Custom schema naming logic

Example 1: Revenue Recognition Macro

{% macro calculate_monthly_revenue(amount_column, start_date_column, end_date_column) %} case when date_trunc('month', {{ start_date_column }}) = date_trunc('month', {{ end_date_column }}) then {{ amount_column }} else {{ amount_column }} * extract(day from last_day({{ start_date_column }})) / (extract(day from {{ end_date_column }}) - extract(day from {{ start_date_column }}) + 1) end {% endmacro %}

Example 2: Data Quality Validation

{% macro validate_email(email_column) %} case when {{ email_column }} is null then 'missing' when length({{ email_column }}) = 0 then 'empty' when {{ email_column }} not like '%@%' then 'no_at_symbol' when {{ email_column }} not like '%.%' then 'no_domain' else 'valid' end {% endmacro %}

Example 3: Dynamic Table Creation

{% macro create_staging_table(source_table, columns_to_exclude=[]) %} {%- set source_columns = adapter.get_columns_in_relation(source_table) -%} select {%- for column in source_columns %} {%- if column.name not in columns_to_exclude %} trim({{ column.name }}) as {{ column.name }} {%- if not loop.last %},{% endif -%} {%- endif %} {%- endfor %} from {{ source_table }} {% endmacro %}