Transform is a metrics store that allows organizations to put their vision for data democratization into practice. Organizations can define a source of truth for their KPIs, which can be used in a number of different ways across the data stack and in downstream applications.
This source of truth is defined with MetricFlow, Transform's underlying open-source metric layer, which lets you define metrics in code as YAML files. Analysts define data sources and the metrics built on top of them, which allows Transform to create denormalized datasets against the organization’s SQL engine. Metrics are exposed in the Transform UI, where the organization can view them and gain insight into metric behavior, and via APIs to downstream tools for further consumption.
Benefits & Value proposition
By centralizing knowledge around metrics, organizations can streamline their decision-making.
Without Transform, analysts spend hours writing repetitive SQL statements to calculate the same KPIs, rolled up to different aggregations or sliced by different dimensions. If the organization uses multiple BI tools, analysts are probably duplicating this work in each tool, recalculating the same metrics over and over.
With Transform, metrics are computed on the fly at the appropriate level of aggregation, determined by how the end user filters and groups the query. Regardless of the downstream BI tool, end users rely on the same code-based definition, which removes duplicated effort and saves the organization time.
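To make the idea of one definition serving many rollups concrete, here is a minimal sketch in Python using an in-memory SQLite database. It is not MetricFlow or Transform code: the `REVENUE` dictionary, the `query_metric` helper, and the `orders` table are hypothetical names chosen for illustration only.

```python
import sqlite3

# Hypothetical metric definition: the single place that states how
# "revenue" is calculated. This is NOT MetricFlow's actual format.
REVENUE = {"name": "revenue", "expr": "SUM(amount)", "table": "orders"}


def query_metric(conn, metric, group_by=()):
    """Compute the metric at whatever grain the caller asks for."""
    dims = ", ".join(group_by)
    select = f"{dims}, " if dims else ""
    sql = (
        f"SELECT {select}{metric['expr']} AS {metric['name']} "
        f"FROM {metric['table']}"
    )
    if dims:
        sql += f" GROUP BY {dims} ORDER BY {dims}"
    return conn.execute(sql).fetchall()


# Toy data standing in for a warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, channel TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("EU", "web", 10.0), ("EU", "store", 5.0), ("US", "web", 20.0)],
)

# The same definition answers three different questions.
total = query_metric(conn, REVENUE)
by_region = query_metric(conn, REVENUE, group_by=("region",))
by_channel = query_metric(conn, REVENUE, group_by=("channel",))
```

Because every rollup is generated from the same expression, the per-region and per-channel figures always add up to the same total, which is the consistency that hand-written, per-dashboard SQL tends to lose.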
Without Transform, organizations might struggle to build consensus around a metric’s definition, especially if this metric is defined in numerous different places, across technical and non-technical teams.
With Transform, an organization can build a knowledge base around metrics and their definitions, across technical and non-technical audiences. Data engineers and data analysts will likely be responsible for the definition of data sources and metrics, and business users can consume the “human-readable” version of these definitions in the Transform UI. These various groups can interact and ask questions of each other, add annotations, and provide context as to why a metric would behave a certain way.
Transform brings everyone to the same page (literally) around how metrics are defined.
Where does Transform sit in the data stack?
Companies design their data infrastructure based upon how they expect to drive decision-making through analytical systems.
Typically, these companies manage vast amounts of data flowing from various applications or third-party source systems. The data is captured and stored in a data warehouse or lake, depending on the type of analytical or operational workflows it is intended for. Some transformation may occur before the data flows to the warehouse (think ETL), but more often than not companies opt for an ELT approach, where datasets are loaded into the warehouse and then prepared for consumption by analytical applications downstream. Even more common is an approach in which denormalization of these datasets is performed at the application level itself, thus sequestering the metric logic within those bespoke tools (see diagram below).
Transform sits between an organization’s data warehouse and its downstream tools. Essentially, it acts as a proxy to the warehouse, translating metric requests into SQL queries that run against the warehouse.
In this way, transformation occurs at the 'metric level': metrics are consistently defined in code, accessible to all downstream tools, and centrally governed.
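The proxy role described above can be sketched as a small translation step from a metric request to SQL text. This is an illustrative assumption about the general shape of such a layer, not Transform's implementation or API: `METRICS`, `MetricRequest`, `to_sql`, and the `analytics.orders` table are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical central registry of metric definitions (illustration only).
METRICS = {
    "revenue": {"table": "analytics.orders", "expr": "SUM(amount)"},
}


@dataclass
class MetricRequest:
    """What a downstream tool asks for: a metric name, not a SQL query."""
    metric: str
    dimensions: List[str] = field(default_factory=list)
    where: Optional[str] = None


def to_sql(request: MetricRequest) -> str:
    """Translate a metric request into a SQL string for the warehouse.

    The filter is interpolated as-is to keep the sketch short; a real
    proxy would need to validate or parameterize it.
    """
    definition = METRICS[request.metric]
    columns = list(request.dimensions)
    columns.append(f"{definition['expr']} AS {request.metric}")
    lines = [f"SELECT {', '.join(columns)}", f"FROM {definition['table']}"]
    if request.where:
        lines.append(f"WHERE {request.where}")
    if request.dimensions:
        lines.append(f"GROUP BY {', '.join(request.dimensions)}")
    return "\n".join(lines)


sql = to_sql(
    MetricRequest("revenue", ["region"], "order_date >= '2022-01-01'")
)
print(sql)
```

The downstream tool never sees the aggregation expression or the table name. Changing how `revenue` is calculated means editing one registry entry, and every consumer picks up the new logic on its next request.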
Who should use Transform?
From an organizational standpoint, it can be very difficult to standardize the definition and use of metrics. Organizations that want to create a central repository for key business metrics, avoid time spent writing and rewriting duplicate SQL queries, and ensure consistent use across different teams would benefit from Transform.
From a more tactical standpoint, a few different personas or types of individuals would interact with the tool:
- A data engineer, data analyst or data scientist would be well-suited to set up MetricFlow, which abstracts the metric definitions for end users. These users are typically familiar with the underlying warehouse and data structures, write SQL, and are generally charged with transforming raw data assets into usable, meaningful insights for the end user. They want to reduce the time spent writing duplicative SQL and want to use a semantic model like MetricFlow to organize key metrics for their stakeholders.
- A business analyst, or any end user charged with interpreting what the data means in the form of reports or dashboards, would also interact with Transform. These users would search through the catalog to understand which metrics have been defined in MetricFlow, and would contribute to the contextual knowledge base around metrics in the form of questions and annotations.
Ideally, these two distinct groups of users will collaborate within Transform to keep metrics up to date, accurate and useful.