MetricFlow Overview
MetricFlow is an open source metric layer. At its core, it is a Python library that includes a metric spec, a parser for models defined in YAML files, and APIs to create and execute metric logic. It helps organize and standardize metric definitions so that analysts don't have to rewrite the same SQL queries every time they need to pull a metric.
Key Components
MetricFlow encourages DRY (“Don't Repeat Yourself”) expression of metric logic in YAML and SQL: analysts define data sources, and then metrics on top of those. By abstracting these key objects, MetricFlow can generate the SQL to build metrics, so you don't have to repeatedly express the same joins, aggregations, filters, and expressions against your data warehouse to construct datasets for consumption.
In essence, MetricFlow acts as a proxy, translating metric requests in the form of “metrics by dimensions” into SQL queries that traverse the data warehouse and the underlying semantic structure to resolve every possible combination of metric and dimension. When you know specific combinations of metrics and dimensions ahead of time, materializations can be used to define the specific denormalized datasets that should be resolved against the SQL engine.
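To make the “metrics by dimensions” idea concrete, here is a minimal Python sketch of the translation step. It is not MetricFlow's API or its actual SQL generation: the DataSource and Metric classes, the build_sql function, and the table and column names are all hypothetical, and the sketch handles only a single data source with no joins. It only illustrates how a metric defined once can be rendered as SQL for different dimension combinations.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class DataSource:
    """Hypothetical stand-in for a data source definition (not MetricFlow's class)."""
    name: str
    sql_table: str
    measures: Dict[str, str]      # measure name -> SQL aggregation expression
    dimensions: Sequence[str]     # columns that metrics may be grouped by


@dataclass
class Metric:
    """Hypothetical metric that simply exposes one measure under a metric name."""
    name: str
    measure: str


def build_sql(metric: Metric, dimensions: List[str], sources: List[DataSource]) -> str:
    """Render a 'metric by dimensions' request as a single-table SQL query."""
    for source in sources:
        has_measure = metric.measure in source.measures
        has_dimensions = all(d in source.dimensions for d in dimensions)
        if has_measure and has_dimensions:
            aggregation = source.measures[metric.measure]
            select_list = list(dimensions) + [f"{aggregation} AS {metric.name}"]
            sql = f"SELECT {', '.join(select_list)} FROM {source.sql_table}"
            if dimensions:
                sql += f" GROUP BY {', '.join(dimensions)}"
            return sql
    raise ValueError(f"No data source can resolve {metric.name} by {dimensions}")


# Illustrative definitions; the table and columns are made up for this sketch.
transactions = DataSource(
    name="transactions",
    sql_table="analytics.transactions",
    measures={"revenue_usd": "SUM(amount_usd)"},
    dimensions=("ds", "country"),
)
revenue = Metric(name="revenue", measure="revenue_usd")

print(build_sql(revenue, ["country"], [transactions]))
```

With these definitions, the request “revenue by country” renders as SELECT country, SUM(amount_usd) AS revenue FROM analytics.transactions GROUP BY country. The aggregation is written once on the data source, and every request that uses the metric reuses it.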
A CLI and Python APIs are exposed to support local developer workflows and to let users build and execute metric queries. In many cases, a developer might choose to schedule these workflows or serve metrics through interfaces of their own design.
Learn more about installation and key MetricFlow features at the MetricFlow open source project on GitHub (https://github.com/transform-data/metricflow).
Who Should Use MetricFlow
MetricFlow is suited for technical individuals who want to streamline and organize their process for defining and serving metrics to end users, and who have familiarity with and insight into the data pipeline and data structures upon which metrics are calculated. This usually falls within the responsibility of data analyst, data engineer, or data scientist roles, depending on the structure of the organization.
This individual is also a self-starter and something of a trailblazer. Follow the instructions in the README to pip install MetricFlow, get started configuring metrics in YAML files, and query these metrics locally on your machine.