dbt-expectations Package - Comprehensive Test Summary

About dbt-expectations: This package provides Great Expectations-style data quality tests for dbt projects. It includes over 50 tests covering table shape, column values, aggregates, string matching, multi-column relationships, and distributions. Inspired by the Python Great Expectations library, it lets you declare these checks in YAML alongside your dbt models.

📦 Installation & Setup


# packages.yml
packages:
  - package: metaplane/dbt_expectations
    version: 0.10.9

# dbt_project.yml
vars:
  'dbt_date:time_zone': 'America/Los_Angeles'

# Install
dbt deps

Supported Adapters: Postgres, Snowflake, BigQuery, DuckDB, Spark (experimental), Trino

Table of Contents

Table Shape Tests (14 tests)
Column Values & Basic Tests (11 tests)
Aggregate Functions (16 tests)
String Matching (10 tests)
Multi-column Tests (6 tests)
Distributional Tests (3 tests)
Summary Table & Best Practices

Table Shape Tests

Tests that validate the structure, schema, and overall shape of tables including row counts, column existence, and metadata validation.

expect_column_to_exist

Verifies that a specified column exists in the table, with optional position validation.

Parameters:

column_name (required): Name of the column to check
column_index (optional): Expected 1-based position of the column
transform (optional, default='upper'): Case transformation for comparison


# Basic column existence check
models:
  - name: customers
    tests:
      - dbt_expectations.expect_column_to_exist:
          column_name: customer_id

      # Validate column exists at specific position
      - dbt_expectations.expect_column_to_exist:
          column_name: email
          column_index: 3

Expected Results:

✅ Pass: Column 'customer_id' exists in table
❌ Fail: "Column 'phone_number' does not exist"
❌ Fail: "Column 'email' exists but is at position 5, expected position 3"

expect_table_row_count_to_be_between

Validates that the total number of rows in a table falls within specified bounds.

Parameters:

min_value (optional): Minimum expected row count
max_value (optional): Maximum expected row count
strictly (optional, default=false): Use strict inequalities


models:
  - name: daily_sales
    tests:
      # Expect between 100 and 10,000 rows (inclusive bounds)
      - dbt_expectations.expect_table_row_count_to_be_between:
          min_value: 100
          max_value: 10000

      # Strict bounds (exclusive)
      - dbt_expectations.expect_table_row_count_to_be_between:
          min_value: 0
          max_value: 1000000
          strictly: true

Expected Results:

✅ Pass: Table has 5,420 rows (within 100-10,000 range)
❌ Fail: "Table has 45 rows, expected between 100 and 10,000"
❌ Fail: "Table has exactly 1,000,000 rows, expected strictly less than 1,000,000"

expect_table_columns_to_match_set

Validates that table columns exactly match a specified set of column names.

Parameters:

column_list (required): List of expected column names
transform (optional, default='upper'): Case transformation


models:
  - name: user_profile
    tests:
      - dbt_expectations.expect_table_columns_to_match_set:
          column_list: ['user_id', 'email', 'first_name', 'last_name', 'created_at']

      # Case-sensitive matching
      - dbt_expectations.expect_table_columns_to_match_set:
          column_list: ['UserId', 'Email', 'CreatedAt']
          transform: 'none'

expect_table_row_count_to_equal_other_table

Compares row counts between two tables to ensure they match.

Parameters:

compare_model (required): Model to compare against, for example ref("other_model")
factor (optional): Multiplier applied to the comparison model's row count before the two counts are compared
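
A minimal sketch of how this test might be declared, assuming two hypothetical models, orders and stg_orders, that are expected to hold the same number of rows. The model names are illustrative and do not come from the package.

```yaml
models:
  - name: orders                      # hypothetical model under test
    tests:
      # Row count of orders should equal the row count of stg_orders
      - dbt_expectations.expect_table_row_count_to_equal_other_table:
          compare_model: ref("stg_orders")   # hypothetical comparison model
```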

expect_row_values_to_have_recent_data

Ensures that data contains recent records within a specified time window.

Parameters:

column_name (required): Timestamp column to check
datepart (required): Time unit (day, hour, etc.)
interval (required): Number of dateparts for recency check


models:
  - name: user_events
    tests:
      # Check for data within last 24 hours
      - dbt_expectations.expect_row_values_to_have_recent_data:
          column_name: event_timestamp
          datepart: hour
          interval: 24

      # Daily freshness check
      - dbt_expectations.expect_row_values_to_have_recent_data:
          column_name: order_date
          datepart: day
          interval: 1

Column Values & Basic Tests

Fundamental tests for column data validation including null checks, uniqueness, data types, and basic value constraints.

expect_column_values_to_be_between

Validates that all values in a numeric column fall within specified minimum and maximum bounds.

Parameters:

min_value (optional): Minimum allowed value
max_value (optional): Maximum allowed value
strictly (optional, default=false): Use strict inequalities
row_condition (optional): Filter condition


models:
  - name: products
    columns:
      - name: price
        tests:
          # Price must be positive
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              strictly: true

      - name: discount_percent
        tests:
          # Discount between 0-100%
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 100

      - name: rating
        tests:
          # Only check active products
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 1.0
              max_value: 5.0
              row_condition: "status = 'active'"

Expected Results:

✅ Pass: All prices are positive (0.01 to 999.99)
❌ Fail: Returns rows with invalid values:
  | product_id | price | discount_percent |
  | 12345 | -10.0 | 150 |
  | 67890 | 25.0 | -5 |

expect_column_values_to_be_in_set

Validates that all column values belong to a predefined set of acceptable values.

Parameters:

value_set (required): List of acceptable values
quote_values (optional, default=true): Quote string values
row_condition (optional): Filter condition


models:
  - name: orders
    columns:
      - name: status
        tests:
          - dbt_expectations.expect_column_values_to_be_in_set:
              value_set: ['pending', 'processing', 'shipped', 'delivered', 'cancelled']

      - name: priority
        tests:
          - dbt_expectations.expect_column_values_to_be_in_set:
              value_set: [1, 2, 3]
              quote_values: false

      - name: payment_method
        tests:
          # Only check completed orders
          - dbt_expectations.expect_column_values_to_be_in_set:
              value_set: ['credit_card', 'paypal', 'bank_transfer']
              row_condition: "status = 'completed'"

expect_column_values_to_be_unique

Ensures all values in a column are unique (no duplicates).

Parameters:

row_condition (optional): Filter condition for subset testing
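
As an illustration of subset testing with row_condition, the sketch below checks uniqueness only among rows that have not been soft-deleted. The model, column, and filter (customers, email, deleted_at) are hypothetical.

```yaml
models:
  - name: customers                   # hypothetical model
    columns:
      - name: email                   # hypothetical column
        tests:
          # Emails must be unique among customers that are not soft-deleted
          - dbt_expectations.expect_column_values_to_be_unique:
              row_condition: "deleted_at is null"   # hypothetical filter
```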

expect_column_values_to_not_be_null

Validates that a column contains no null values.

Parameters:

row_condition (optional): Filter condition
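
A short sketch of a conditional null check, assuming a hypothetical orders model where shipped_at only has to be populated once an order has shipped.

```yaml
models:
  - name: orders                      # hypothetical model
    columns:
      - name: shipped_at              # hypothetical column
        tests:
          # Shipped orders must carry a ship timestamp
          - dbt_expectations.expect_column_values_to_not_be_null:
              row_condition: "status = 'shipped'"   # hypothetical filter
```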

expect_column_values_to_be_of_type

Validates that a column is of a specific data type.

Parameters:

column_type (required): Expected data type of the column (for example, date)
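
A minimal sketch of a type check. It assumes the argument is named column_type, as in the package README, and uses a hypothetical order_date column. Type names differ between warehouses, so the value may need adjusting for a given adapter.

```yaml
models:
  - name: orders                      # hypothetical model
    columns:
      - name: order_date              # hypothetical column
        tests:
          # The column should be stored as a date
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: date
```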

expect_column_values_to_be_increasing

Validates that column values are in strictly increasing order.

Parameters:

sort_column (optional): Column to sort by before checking order
strictly (optional, default=true): Enforce strict increasing
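
The sketch below shows one way the two parameters might be combined: values are ordered by a hypothetical snapshot_date column, and strictly is turned off so that equal consecutive values are accepted. All model and column names are illustrative.

```yaml
models:
  - name: account_balances            # hypothetical model
    columns:
      - name: cumulative_deposits     # hypothetical running total
        tests:
          # A running total should never decrease from one snapshot to the next
          - dbt_expectations.expect_column_values_to_be_increasing:
              sort_column: snapshot_date   # hypothetical ordering column
              strictly: false              # allow equal consecutive values
```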

expect_column_values_to_have_consistent_casing

Validates that string values in a column follow consistent casing patterns.

Parameters:

display_inconsistent_columns (optional, default=false): Show problematic values
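
A brief sketch, assuming a hypothetical country_code column in which values such as 'us' and 'US' should not both appear. Setting display_inconsistent_columns to true asks the test to show the offending values.

```yaml
models:
  - name: customers                   # hypothetical model
    columns:
      - name: country_code            # hypothetical column
        tests:
          # Flag values that differ only by letter case
          - dbt_expectations.expect_column_values_to_have_consistent_casing:
              display_inconsistent_columns: true
```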

Aggregate Functions

Statistical and aggregate function tests for analyzing column distributions, counts, sums, and other mathematical properties.

expect_column_mean_to_be_between

Validates that the mean (average) of a numeric column falls within specified bounds.

Parameters:

min_value (optional): Minimum expected mean
max_value (optional): Maximum expected mean
group_by (optional): Group by columns for per-group validation
row_condition (optional): Filter condition
strictly (optional, default=false): Use strict inequalities


models:
  - name: product_ratings
    tests:
      # Overall average rating should be reasonable
      - dbt_expectations.expect_column_mean_to_be_between:
          column_name: rating
          min_value: 2.0
          max_value: 4.5

      # Per-category average ratings
      - dbt_expectations.expect_column_mean_to_be_between:
          column_name: rating
          min_value: 3.0
          max_value: 5.0
          group_by: [category]
          row_condition: "status = 'published'"

  - name: order_amounts
    tests:
      # Average order value by customer tier
      - dbt_expectations.expect_column_mean_to_be_between:
          column_name: order_total
          min_value: 50.0
          group_by: [customer_tier]

expect_column_distinct_count_to_equal

Validates that the number of distinct values in a column equals an expected count.

Parameters:

value (required): Expected distinct count
group_by (optional): Group by columns
row_condition (optional): Filter condition


models:
  - name: employee_data
    tests:
      # Expect exactly 5 departments
      - dbt_expectations.expect_column_distinct_count_to_equal:
          column_name: department
          value: 5

      # Each department should have exactly 3 job levels
      - dbt_expectations.expect_column_distinct_count_to_equal:
          column_name: job_level
          value: 3
          group_by: [department]

  - name: product_catalog
    tests:
      # Each category should have 10-20 distinct brands
      - dbt_expectations.expect_column_distinct_count_to_be_between:
          column_name: brand
          min_value: 10
          max_value: 20
          group_by: [category]

expect_column_sum_to_be_between

Validates that the sum of all values in a numeric column falls within specified bounds.

Parameters:

min_value (optional): Minimum expected sum
max_value (optional): Maximum expected sum
group_by (optional): Group by columns
row_condition (optional): Filter condition
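
A sketch of a grouped sum check, written in the same model-level style as the other aggregate examples. The model, columns, and bounds are hypothetical.

```yaml
models:
  - name: daily_sales                 # hypothetical model
    tests:
      # Total revenue per store should be non-negative and below a sanity cap
      - dbt_expectations.expect_column_sum_to_be_between:
          column_name: daily_revenue  # hypothetical column
          min_value: 0
          max_value: 1000000
          group_by: [store_id]        # hypothetical grouping column
```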

expect_column_max_to_be_between

Validates that the maximum value in a column falls within specified bounds.

expect_column_min_to_be_between

Validates that the minimum value in a column falls within specified bounds.

expect_column_median_to_be_between

Validates that the median value of a numeric column falls within specified bounds.

expect_column_stdev_to_be_between

Validates that the standard deviation of a numeric column falls within specified bounds.
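
No parameters are listed for the max, min, median, and stdev tests. The sketch below assumes they accept min_value and max_value in the same way as the other *_to_be_between aggregate tests in this section. The model, column, and bounds are hypothetical.

```yaml
models:
  - name: order_amounts               # hypothetical model
    tests:
      # Largest single order should stay under a sanity cap
      - dbt_expectations.expect_column_max_to_be_between:
          column_name: order_total
          max_value: 50000

      # Smallest order should not be negative
      - dbt_expectations.expect_column_min_to_be_between:
          column_name: order_total
          min_value: 0

      # Typical order value
      - dbt_expectations.expect_column_median_to_be_between:
          column_name: order_total
          min_value: 20
          max_value: 200

      # Spread of order values
      - dbt_expectations.expect_column_stdev_to_be_between:
          column_name: order_total
          min_value: 1
          max_value: 500
```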

expect_column_quantile_values_to_be_between

Validates that specific quantile values fall within expected ranges.

Parameters:

quantile (required): Quantile to check (0.0 to 1.0)
min_value (optional): Minimum expected quantile value
max_value (optional): Maximum expected quantile value
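
As a sketch, the following checks that the 95th percentile of a hypothetical response_time_ms column stays within a chosen range. The model name and bounds are illustrative.

```yaml
models:
  - name: api_requests                # hypothetical model
    tests:
      # 95th percentile latency should fall between 50 ms and 800 ms
      - dbt_expectations.expect_column_quantile_values_to_be_between:
          column_name: response_time_ms   # hypothetical column
          quantile: 0.95
          min_value: 50
          max_value: 800
```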

expect_column_most_common_value_to_be_in_set

Validates that the most frequently occurring value in a column belongs to an expected set.

Parameters:

value_set (required): Set of acceptable most common values
top_n (optional, default=1): Number of top values to check
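
A small sketch with a hypothetical payment_method column. top_n is passed explicitly here rather than relying on a default.

```yaml
models:
  - name: orders                      # hypothetical model
    tests:
      # The single most frequent payment method should be one of these
      - dbt_expectations.expect_column_most_common_value_to_be_in_set:
          column_name: payment_method     # hypothetical column
          value_set: ['credit_card', 'paypal']
          top_n: 1
```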

String Matching

Pattern matching and string validation tests using regular expressions, SQL LIKE patterns, and length constraints.

expect_column_values_to_match_regex

Validates that all values in a column match a specified regular expression pattern.

Parameters:

regex (required): Regular expression pattern
is_raw (optional, default=false): Treat regex as raw string
row_condition (optional): Filter condition


models:
  - name: customers
    columns:
      - name: email
        tests:
          # Email format validation
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'

      - name: phone_number
        tests:
          # US phone number format: (XXX) XXX-XXXX
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: '^\([0-9]{3}\) [0-9]{3}-[0-9]{4}$'

      - name: postal_code
        tests:
          # US ZIP code (5 digits or ZIP+4)
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: '^[0-9]{5}(-[0-9]{4})?$'
              row_condition: "country = 'US'"

      - name: product_sku
        tests:
          # SKU format: 3 letters + 4 digits + 2 letters
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: '^[A-Z]{3}[0-9]{4}[A-Z]{2}$'

Expected Results:

✅ Pass: All emails match valid format
❌ Fail: Returns rows with invalid patterns:
  | customer_id | email | phone_number | postal_code |
  | 1001 | invalid-email | 555-123-4567 | 12345 |
  | 1002 | user@domain.com | (555) 123-4567 | ABC123 |
  | 1003 | test@site.co | 5551234567 | 12345-6789 |

expect_column_values_to_match_like_pattern

Validates that column values match a SQL LIKE pattern.

Parameters:

like_pattern (required): SQL LIKE pattern with % and _ wildcards
row_condition (optional): Filter condition


models:
  - name: products
    columns:
      - name: product_code
        tests:
          # Product codes start with 'PROD-'
          - dbt_expectations.expect_column_values_to_match_like_pattern:
              like_pattern: 'PROD-%'

      - name: batch_number
        tests:
          # Batch format: B + 4 characters + L + 2 characters
          # (the _ wildcard matches any single character, not only digits)
          - dbt_expectations.expect_column_values_to_match_like_pattern:
              like_pattern: 'B____L__'

      - name: serial_number
        tests:
          # Check only active products
          - dbt_expectations.expect_column_values_to_match_like_pattern:
              like_pattern: 'SN-%-%'
              row_condition: "status = 'active'"

Expected Results:

Pattern examples:
• 'PROD-%' matches: 'PROD-123', 'PROD-ABC456'
• 'B____L__' matches: 'B1234L01', 'B9876L99'
• 'SN-%-%' matches: 'SN-123-ABC', 'SN-XYZ-999'

❌ Fail: 'ITEM-123' does not match 'PROD-%' pattern

expect_column_value_lengths_to_be_between

Validates that the length of string values falls within specified bounds.

Parameters:

min_value (optional): Minimum string length
max_value (optional): Maximum string length
strictly (optional, default=false): Use strict inequalities


models:
  - name: user_profiles
    columns:
      - name: username
        tests:
          # Username length 3-20 characters
          - dbt_expectations.expect_column_value_lengths_to_be_between:
              min_value: 3
              max_value: 20

      - name: password_hash
        tests:
          # Password hash must be exactly 64 characters
          - dbt_expectations.expect_column_value_lengths_to_equal:
              value: 64

      - name: description
        tests:
          # Description at least 10 chars, max 500
          - dbt_expectations.expect_column_value_lengths_to_be_between:
              min_value: 10
              max_value: 500

expect_column_values_to_match_regex_list

Validates that column values match at least one pattern from a list of regular expressions.

Parameters:

regex_list (required): List of regex patterns
match_on (optional, default='any'): 'any' or 'all'
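
A sketch of matching against several accepted formats, assuming a hypothetical phone_number column that may hold either a plain digit string or the (XXX) XXX-XXXX form. With match_on set to any, a value passes if it matches at least one pattern.

```yaml
models:
  - name: customers                   # hypothetical model
    columns:
      - name: phone_number            # hypothetical column
        tests:
          # Accept either 10 plain digits or the (XXX) XXX-XXXX format
          - dbt_expectations.expect_column_values_to_match_regex_list:
              regex_list:
                - '^[0-9]{10}$'
                - '^\([0-9]{3}\) [0-9]{3}-[0-9]{4}$'
              match_on: any
```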

expect_column_values_to_not_match_regex

Validates that no values in a column match a specified regular expression pattern.
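
No parameters are listed for this test. The sketch below assumes it takes a regex argument like expect_column_values_to_match_regex, and uses a hypothetical display_name column that should not contain angle brackets.

```yaml
models:
  - name: user_profiles               # hypothetical model
    columns:
      - name: display_name            # hypothetical column
        tests:
          # Display names must not contain < or >
          - dbt_expectations.expect_column_values_to_not_match_regex:
              regex: '[<>]'
```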

Multi-column Tests

Tests that validate relationships and constraints across multiple columns within the same row or across different rows.

expect_column_pair_values_A_to_be_greater_than_B

Validates that values in column A are greater than corresponding values in column B.

Parameters:

column_A (required): First column for comparison
column_B (required): Second column for comparison
or_equal (optional, default=false): Allow equal values
row_condition (optional): Filter condition


models:
  - name: events
    tests:
      # End time must be after start time
      - dbt_expectations.expect_column_pair_values_A_to_be_greater_than_B:
          column_A: end_timestamp
          column_B: start_timestamp

      # Sale price >= cost (allow equal for break-even)
      - dbt_expectations.expect_column_pair_values_A_to_be_greater_than_B:
          column_A: sale_price
          column_B: cost_price
          or_equal: true

  - name: financial_data
    tests:
      # Max credit limit > current balance for active accounts
      - dbt_expectations.expect_column_pair_values_A_to_be_greater_than_B:
          column_A: credit_limit
          column_B: current_balance
          row_condition: "account_status = 'active'"

Expected Results:

✅ Pass: All end times are after start times
❌ Fail: Returns rows where A ≤ B:
  | event_id | start_timestamp | end_timestamp |
  | 1001 | 2024-01-15 10:00:00 | 2024-01-15 09:30:00 |
  | 1002 | 2024-01-15 14:00:00 | 2024-01-15 14:00:00 |

expect_column_pair_values_to_be_equal

Validates that values in two columns are equal for each row.

Parameters:

column_A (required): First column
column_B (required): Second column
row_condition (optional): Filter condition


models:
  - name: order_calculations
    tests:
      # Calculated total should equal manual total
      - dbt_expectations.expect_column_pair_values_to_be_equal:
          column_A: calculated_total
          column_B: manual_total

      # Backup email should match primary for verified users
      - dbt_expectations.expect_column_pair_values_to_be_equal:
          column_A: primary_email
          column_B: backup_email
          row_condition: "email_verified = true"

expect_compound_columns_to_be_unique

Validates that the combination of multiple columns creates unique compound keys.

Parameters:

column_list (required): List of columns to check for uniqueness
ignore_row_if (optional): How to treat rows with missing values: 'all_values_are_missing', 'any_value_is_missing', or 'never'
row_condition (optional): Filter condition


models:
  - name: user_permissions
    tests:
      # Each user-role combination should be unique
      - dbt_expectations.expect_compound_columns_to_be_unique:
          column_list: ['user_id', 'role_id']

  - name: product_pricing
    tests:
      # Product-region-date combination should be unique
      - dbt_expectations.expect_compound_columns_to_be_unique:
          column_list: ['product_id', 'region', 'effective_date']

  - name: event_tracking
    tests:
      # User-session-event_type should be unique for completed events
      - dbt_expectations.expect_compound_columns_to_be_unique:
          column_list: ['user_id', 'session_id', 'event_type']
          row_condition: "event_status = 'completed'"

Expected Results:

✅ Pass: All compound keys are unique
❌ Fail: Returns duplicate compound key combinations:
  | user_id | role_id | duplicate_count |
  | 1001 | 5 | 2 |
  | 1002 | 3 | 3 |

expect_multicolumn_sum_to_equal

Validates that the sum of values across multiple columns equals an expected total for each row.

Parameters:

column_list (required): List of columns to sum
sum_total (required): Expected sum value
row_condition (optional): Filter condition


models:
  - name: budget_allocation
    tests:
      # Budget percentages should sum to 100%
      - dbt_expectations.expect_multicolumn_sum_to_equal:
          column_list: ['marketing_pct', 'sales_pct', 'operations_pct', 'other_pct']
          sum_total: 100

  - name: financial_breakdown
    tests:
      # Revenue streams should equal total revenue
      - dbt_expectations.expect_multicolumn_sum_to_equal:
          column_list: ['product_revenue', 'service_revenue', 'other_revenue']
          sum_total: total_revenue

  - name: survey_responses
    tests:
      # Rating scores should sum to 20 for complete responses
      - dbt_expectations.expect_multicolumn_sum_to_equal:
          column_list: ['q1_score', 'q2_score', 'q3_score', 'q4_score']
          sum_total: 20
          row_condition: "response_status = 'complete'"

expect_select_column_values_to_be_unique_within_record

Validates that specified columns have unique values within each row (no duplicate values across columns in the same row).

Parameters:

column_list (required): List of columns to check for within-row uniqueness
ignore_row_if (optional): How to treat rows with missing values: 'all_values_are_missing', 'any_value_is_missing', or 'never'
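
A sketch of a within-row uniqueness check, assuming a hypothetical contacts model with three phone columns that should not repeat the same number in one row. The ignore_row_if value shown is one of the options named in the package README and skips rows where any of the listed columns is missing.

```yaml
models:
  - name: contacts                    # hypothetical model
    tests:
      # The three phone numbers on a row must all differ
      - dbt_expectations.expect_select_column_values_to_be_unique_within_record:
          column_list: ['home_phone', 'work_phone', 'mobile_phone']   # hypothetical columns
          ignore_row_if: "any_value_is_missing"
```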

Distributional Tests

Advanced statistical tests for detecting outliers, validating data distributions, and ensuring data completeness over time.

expect_column_values_to_be_within_n_stdevs

Validates that column values fall within N standard deviations of the mean, useful for outlier detection.

Parameters:

sigma_threshold (optional, default=3): Number of standard deviations
group_by (optional): Group by columns for per-group analysis


models:
  - name: daily_sales
    tests:
      # Detect sales outliers (beyond 3 standard deviations)
      - dbt_expectations.expect_column_values_to_be_within_n_stdevs:
          column_name: daily_revenue
          sigma_threshold: 3

      # Per-store outlier detection
      - dbt_expectations.expect_column_values_to_be_within_n_stdevs:
          column_name: daily_revenue
          sigma_threshold: 2.5
          group_by: ['store_id']

      # For day-over-day changes, use
      # expect_column_values_to_be_within_n_moving_stdevs, described next

  - name: user_behavior
    tests:
      # Detect unusual session durations
      - dbt_expectations.expect_column_values_to_be_within_n_stdevs:
          column_name: session_duration_minutes
          sigma_threshold: 3

Expected Results:

Statistical Analysis:
• Mean daily revenue: $15,000
• Standard deviation: $3,000
• 3σ bounds: $6,000 - $24,000

✅ Pass: All values within normal range
❌ Fail: Outliers detected:
  | date | daily_revenue | z_score | within_bounds |
  | 2024-01-15 | $35,000 | 6.67 | false |
  | 2024-01-20 | $2,000 | -4.33 | false |

expect_column_values_to_be_within_n_moving_stdevs

Validates that column values fall within N standard deviations using a moving window calculation, useful for time series anomaly detection.

Parameters:

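date_column_name (required): Date or timestamp column that orders the series
period (optional, default='day'): Date part each value is aggregated to
sigma_threshold (optional, default=3): Number of standard deviations
take_diffs (optional, default=true): Test period-over-period differences rather than raw values
take_logs (optional, default=true): Apply a log transform to the values first
lookback_periods (optional, default=1): Number of periods back used when taking differences
trend_periods (optional, default=7): Number of periods in the moving window
test_periods (optional, default=14): Number of most recent periods that are tested


models:
  - name: website_traffic
    tests:
      # Detect traffic anomalies against a 30-day moving window
      - dbt_expectations.expect_column_values_to_be_within_n_moving_stdevs:
          column_name: daily_visitors
          date_column_name: visit_date   # hypothetical date column
          sigma_threshold: 2
          trend_periods: 30

      # Detect unusual day-over-day changes against a 14-day moving window
      - dbt_expectations.expect_column_values_to_be_within_n_moving_stdevs:
          column_name: daily_visitors
          date_column_name: visit_date   # hypothetical date column
          sigma_threshold: 3
          take_diffs: true
          lookback_periods: 1
          trend_periods: 14

  - name: stock_prices
    tests:
      # Compare each price with the value 60 periods earlier,
      # judged against a 20-period moving window
      - dbt_expectations.expect_column_values_to_be_within_n_moving_stdevs:
          column_name: closing_price
          date_column_name: trade_date   # hypothetical date column
          sigma_threshold: 2.5
          lookback_periods: 60
          trend_periods: 20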

expect_row_values_to_have_data_for_every_n_datepart

Validates that data exists for every N date periods (e.g., every day, every hour) within a specified range, useful for detecting data gaps.

Parameters:

date_col (required): Date/timestamp column
date_part (optional, default='day'): Date part (day, hour, week, etc.)
interval (optional): Number of date parts between expected records
row_condition (optional): Filter condition
test_start_date (optional): Start of the date range to check
test_end_date (optional): End of the date range to check


models:
  - name: daily_metrics
    tests:
      # Ensure data exists for every day
      - dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart:
          date_col: metric_date
          date_part: day

      # Check for data every week
      - dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart:
          date_col: metric_date
          date_part: week

  - name: hourly_sensors
    tests:
      # Sensor data should be recorded every hour
      - dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart:
          date_col: reading_timestamp
          date_part: hour

      # Check for data every 4 hours
      - dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart:
          date_col: reading_timestamp
          date_part: hour
          interval: 4

Summary Table & Best Practices

| Category | Test Count | Primary Use Cases | Key Examples |
| Table Shape | 14 | Schema validation, row count checks, column existence, data freshness | expect_column_to_exist, expect_table_row_count_to_be_between |
| Column Values | 11 | Basic data validation, null checks, uniqueness, data types, ranges | expect_column_values_to_be_between, expect_column_values_to_be_in_set |
| Aggregate | 16 | Statistical analysis, distribution validation, summary statistics | expect_column_mean_to_be_between, expect_column_distinct_count_to_equal |
| String Matching | 10 | Pattern validation, format checking, length constraints | expect_column_values_to_match_regex, expect_column_value_lengths_to_be_between |
| Multi-column | 6 | Cross-column relationships, compound keys, data integrity | expect_column_pair_values_A_to_be_greater_than_B, expect_compound_columns_to_be_unique |
| Distributional | 3 | Outlier detection, time series analysis, data completeness | expect_column_values_to_be_within_n_stdevs, expect_row_values_to_have_data_for_every_n_datepart |

🏆 Best Practices for dbt-expectations:

Start Simple: Begin with basic tests (not_null, unique, in_set) before moving to advanced statistical tests
Use Row Conditions: Apply row_condition parameter to test subsets of data for more targeted validation
Group-by Testing: Leverage group_by parameters for per-category or per-segment validation
Statistical Monitoring: Use distributional tests for monitoring data quality over time and detecting anomalies
Combine with dbt Core: Use dbt-expectations alongside standard dbt tests for comprehensive coverage
Test Incrementally: Apply tests to incremental models for ongoing data quality assurance
Document Expectations: Add descriptions to tests explaining business rules and expected behaviors
Performance Considerations: Be mindful of test performance on large datasets, especially statistical tests

🔧 Advanced Configuration Examples:


# Complex test with multiple parameters
- dbt_expectations.expect_column_values_to_be_between:
    name: "validate_revenue_by_region"
    column_name: monthly_revenue
    min_value: 10000
    max_value: 1000000
    row_condition: "region in ('US', 'EU') and status = 'active'"
    config:
      # severity must be error for error_if to apply;
      # with severity: warn, dbt evaluates only warn_if
      severity: error
      error_if: ">5"
      warn_if: ">0"

# Statistical test with grouping
- dbt_expectations.expect_column_mean_to_be_between:
    name: "monitor_average_session_duration"
    column_name: session_duration_minutes
    min_value: 2.0
    max_value: 30.0
    group_by: ['device_type', 'user_segment']
    config:
      store_failures: true


Key Advantages:

• Great Expectations Compatibility: Familiar test names for teams using the Python Great Expectations library
• Comprehensive Coverage: 60 tests across the six categories in the summary table
• Statistical Power: Distributional tests for anomaly detection and monitoring
• Cross-database Support: Runs on the adapters listed under Installation & Setup
• Runs in the Warehouse: Tests compile to SQL and execute in the database, so data is validated in place
• Flexible Configuration: Most tests accept parameters such as row_condition, group_by, and strictly

📚 Official dbt Documentation

For comprehensive information about dbt testing and data quality, refer to the official documentation:

Data tests - Official guide to dbt testing framework and best practices
dbt-expectations package - Official repository for the dbt-expectations package

These resources provide the authoritative source for dbt testing capabilities and are regularly updated.
