Measures

Measures are aggregations that can produce your final metric directly, serve as the building blocks for a metric, or both.

Measures have a few inputs in the data source configuration: name, description, agg (aggregation), expr, and create_metric.

Name (Required)

The name of the measure can either reference a database column directly or be any name you choose. If the name of the measure differs from the column name in the table, you must add an expr that specifies the column. The name of the measure is what you reference when creating a metric. Note that measure names must be unique across all data sources in a project.
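
As a minimal sketch (the order_total column is hypothetical), a measure whose name matches its column needs no expr, while a differently named measure points at the column through expr:

measures:
  - name: order_total     # name matches the column, so no expr is needed
    description: Total order value in USD
    agg: sum
  - name: revenue_usd     # name acts as an alias; expr identifies the real column
    description: Total order value in USD
    expr: order_total
    agg: sum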

Description (Required)

The description describes the measure that is being calculated. Verbose, human-readable descriptions are strongly recommended in this field.

Aggregation (Required)

The aggregation determines how the field will be aggregated. For example, a sum aggregation type over a granularity of day would sum the values across a given day.

The aggregations supported by MetricFlow are as follows:

Aggregation Type   Description
sum                Sum across the values
min                Minimum across values
max                Maximum across values
average            Average across values
sum_boolean        A sum for a boolean type
count_distinct     Distinct count of values
Tip:

All of the aggregation types above, except for count_distinct, are additive measures. MetricFlow can aggregate additive measures sequentially and reuse already-aggregated datasets; for example, transactions by day can be rolled up to transactions by week. Non-additive measures like count_distinct don't have this property and must be calculated from the source every time. This can be slower and more costly, but the resulting metrics are often worth the wait.
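
As a sketch of the distinction (column names are hypothetical), a sum measure is additive and its daily values can be rolled up into weekly values, while a distinct count must always be recomputed from the source rows:

measures:
  - name: transaction_amount_usd   # additive: daily sums can be re-aggregated into weekly sums
    description: Total USD value of transactions
    agg: sum
  - name: distinct_customers       # non-additive: recomputed from the source for each granularity
    description: Distinct count of customers with at least one transaction
    expr: customer_id
    agg: count_distinct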

Expr (Optional)

Use the expr parameter if the name parameter you have specified does not match the name of a column in your database table (in this case, name effectively serves as an alias for your measure). You may also set the expr parameter to any valid SQL expression that manipulates one or more underlying columns into the desired output.

Note: Always consult your database's SQL documentation when using SQL functions in the expr parameter, as outputs may differ depending on your specific database provider.
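
For example, a sketch (the is_valid column is hypothetical) where expr holds a SQL expression rather than a bare column name:

measures:
  - name: valid_transaction_amount_usd
    description: Total USD value of valid transactions only
    expr: CASE WHEN is_valid = true THEN transaction_amount_usd ELSE 0 END
    agg: sum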

Caution:

If you are using Snowflake and use any week-level function inside the expr parameter, the function will now use the ISO-standard week start of Monday.

Examples:

  1. You are running Snowflake warehouse instances with account- or session-level overrides for the WEEK_START parameter that set it to a value other than 0 or 1: you will now see Monday as the week start in all cases.

  2. You are using the DAYOFWEEK function inside the expr parameter with the legacy Snowflake default of WEEK_START = 0: DAYOFWEEK will now return the ISO-standard values of 1 (Monday) through 7 (Sunday) instead of Snowflake’s legacy default values of 0 (Monday) through 6 (Sunday).
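
To illustrate (this weekend measure and the ts column are hypothetical, not part of the example file below), a measure whose expr uses DAYOFWEEK would be evaluated with the ISO values above, where Saturday is 6 and Sunday is 7:

measures:
  - name: weekend_transaction_amount_usd
    description: Total USD value of transactions that occurred on a weekend
    # ISO semantics: DAYOFWEEK returns 6 for Saturday and 7 for Sunday
    expr: CASE WHEN DAYOFWEEK(ts) IN (6, 7) THEN transaction_amount_usd ELSE 0 END
    agg: sum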

Create Metric (Optional)

You can optionally create a metric directly from a measure by setting create_metric: True, as a shortcut for defining a separate metric.

  • Note that the description and owners of a metric created from a measure with create_metric: True will default to the description and owners of the data source.
  • You may further specify the display name of the metric via create_metric_display_name: "Your Metric Display Name"

Note: If you prefer to keep all of your metric definitions in a single file or a single mechanism, we do not recommend using the create_metric: True shortcut, as it allows metric definitions to exist in multiple places across your Transform model.
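
For example, the following measure from the example file below also creates a metric with a custom display name:

measures:
  - name: transactions
    description: The average value of transactions
    expr: transaction_amount_usd
    agg: average
    create_metric: True
    create_metric_display_name: Avg Value of Transactions (Daily)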

Example file using various aggregation types

data_source:
  name: transactions
  description: A record for every transaction that takes place. Carts are considered multiple transactions for each SKU.
  owners: support@transform.co
  sql_table: schema.transactions

  # --- IDENTIFIERS ---
  identifiers:
    - name: transaction_id
      type: primary
    - name: customer_id
      type: foreign
    - name: store_id
      type: foreign
    - name: product_id
      type: foreign

  # --- MEASURES ---
  measures:
    - name: transaction_amount_usd
      description: Total USD value of transactions
      agg: sum
    - name: transaction_amount_usd_avg
      description: Average USD value of transactions
      expr: transaction_amount_usd
      agg: average
    - name: transaction_amount_usd_max
      description: Maximum USD value of transactions
      expr: transaction_amount_usd
      agg: max
    - name: transaction_amount_usd_min
      description: Minimum USD value of transactions
      expr: transaction_amount_usd
      agg: min
    - name: quick_buy_transactions
      description: The total transactions bought as quick buy
      expr: quick_buy_flag
      agg: sum_boolean
    - name: distinct_transactions_count
      description: Distinct count of transactions
      expr: transaction_id
      agg: count_distinct
    - name: transactions
      description: The average value of transactions
      expr: transaction_amount_usd
      agg: average
      create_metric: True
      create_metric_display_name: Avg Value of Transactions (Daily)
    - name: transactions_amount_usd_valid # Notice here how we use expr to compute the aggregation based on a condition
      description: The total USD value of valid transactions only
      expr: CASE WHEN is_valid = true THEN transaction_amount_usd ELSE 0 END
      agg: sum

  # --- DIMENSIONS ---
  dimensions:
    - name: ds
      type: time
      expr: date_trunc('day', ts) # expr refers to underlying column ts
      type_params:
        is_primary: true
        time_granularity: day
    - name: is_bulk_transaction
      type: categorical
      expr: case when quantity > 10 then true else false end