2023-05-22 data

The Presentation Layer: Extracting insights from the data (Part 10)

(This post is part of a series on working with data from start to finish.)

After we have stored, modeled and governed our data, we must finally make sense of it. This occurs in what is traditionally called the “data presentation layer”. Here, we present data to our users, and they in turn modify its representation in order to understand it.

The presentation layer serves three essential functions, each of which is enabled by the business intelligence infrastructure:

  1. Reporting: We provide a bird’s-eye view of our operations by charting key performance metrics
  2. Analysis: We decompose those metrics across various dimensions to identify areas of over- and underperformance
  3. Inspection: We inspect individual data points to understand what general characteristics drive the behavior we observe in aggregate

Baseline reporting comes in the form of simple, zero- or one-dimensional metrics, such as “total year-to-date sales” or “total year-to-date sales by territory”. It is not designed to provide deep insight into the drivers of particular metrics, but rather a general intuition around levels (“are total sales in the millions or billions?”), ranges (“most sales are in the tens of thousands range”), and relative shares (“Northwest sales are only about 10% of our total sales”).

// zero-dimensional
total_sales

// one-dimensional
month (PK) | total_sales
2023-01-01 | 1,000
2023-02-01 | 2,000
2023-03-01 | 3,000
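
To make this concrete, both views above can be produced with simple aggregate queries. Below is a minimal sketch in Postgres-style SQL, assuming a hypothetical fact table fct_sales(sale_date, amount) rather than any schema from this series:

-- zero-dimensional: one headline number
SELECT SUM(amount) AS total_sales
FROM fct_sales;

-- one-dimensional: the same measure pivoted by month
SELECT DATE_TRUNC('month', sale_date) AS month,
       SUM(amount)                    AS total_sales
FROM fct_sales
GROUP BY 1
ORDER BY 1;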

Analysis, on the other hand, pivots operational metrics along two or more dimensions, such as “regional sales per month” or “regional sales per month per salesperson”. If, for example, sales grew substantially in recent months, we can evaluate whether this growth came predominantly from a single region or whether it was distributed evenly across multiple regions. Further, when adding a third dimension, we can determine who the top-selling salespeople were within the top-selling regions.

// two-dimensional
month, region | total_sales
2023-01-01, EAST | 1,000
2023-02-01, WEST | 2,000
2023-03-01, NORTHWEST | 3,000

// three-dimensional
month, region, salesperson | total_sales
2023-01-01, EAST, A | 1,000
2023-02-01, WEST, B | 2,000
2023-03-01, NORTHWEST, C | 3,000
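
Continuing the same hypothetical fct_sales sketch (now assumed to also carry region and salesperson columns), each additional dimension of analysis is simply another grouping column:

-- three-dimensional: month, region and salesperson;
-- drop a column to recover the two-dimensional view
SELECT DATE_TRUNC('month', sale_date) AS month,
       region,
       salesperson,
       SUM(amount) AS total_sales
FROM fct_sales
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3;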

Unlike the reporting of key metrics across a single dimension, the analysis of key metrics across two or more dimensions can rapidly complicate one’s ability to extract insights from the data.

First, data insights by their very nature are high-level, universal and parsimonious. Multi-dimensional analysis, on the other hand, is verbose and nuanced. It is one thing to say “sales have grown in recent months.” It is quite another to say “sales have grown most in recent months from clients A and C in our Northeast region, and from clients B and F in our Midwest region.” In the former, there is a clear pattern; in the latter, there is not.

Second, there are typically many ways of pivoting a metric, and no single view will accommodate all possible ways. For example, consider a metric of total_sales which can be analyzed across the six dimensions of region, client, salesperson, date_year, date_quarter and date_month. One-dimensional pivots of the data - “sales by region”, “sales by client”, “sales by salesperson” and so on - yield six views.

Now consider a business user who wants to perform a more granular, two-dimensional analysis. “Total sales by client per year” is one way of viewing the data, but so is “total sales by client per salesperson”, “total sales by salesperson per client”, “total sales by client per quarter”, and so on. In total, there are 30 ways of permuting six dimensions two at a time (6 × 5 = 30 ordered pairs). Not all of these can be shown at once.

For these reasons, analytical dashboards are typically separated from reporting dashboards. They contain the same data, but their interpretation is more nuanced, their operation more hands-on, and their audience more research-oriented.

Finally, we have inspection. After users drill down into the drivers of particular trends - “most sales this quarter came from our Northeast region” - they will want to see individual data points. They do this to (a) validate results and (b) build intuition around the data.

It is not at all uncommon for metrics reported in aggregate to be based on incorrect calculations[0]. Calculations can use the wrong data definitions, wrong data sources or wrong data transformations. Data correctness is hard to come by. 

Business users, who observe day-to-day operations and develop an intuition for how the data ought to look, are often the first to say “that number seems wrong.” They will want to inspect the actual data points composing any particular calculation.

Upon seeing individual data points, these users will routinely say “these data points should not be in our data set” or “these data points are misclassified” or “we are missing some data here.” In other words, they will validate high-level reporting metrics using the low-level data points.

Sometimes, they will say: “That’s interesting, I didn’t expect to see that.” Or: “Most of these data points came from our one big client. I didn’t expect them to have such a large effect.” Here, business users will use high-level trends to shape their low-level intuition about the data.

Most teams want reporting before they want analysis or inspection. While data analysts are often eager to shoehorn as many metrics and dimensions onto a dashboard as will fit, this is not in fact what business users want. They want simple, high-level takeaways about their operations. Only after grasping this macroscopic view will they delve into the details via analysis and inspection.

The presentation layer is conventionally understood to be the end state of a data infrastructure because it is what is tangibly delivered from a data team to clients. In reality however, clients do not ultimately want dashboards. They want outcomes. How data analysis can be used to discover, deliver and measure such outcomes will be the subject of the next chapter.

For this chapter however, we’ll continue to focus on the infrastructural aspect of the presentation layer. In particular, how to choose a business intelligence tool and how to manage it to facilitate reporting, analysis and inspection.

Choosing a business intelligence vendor #

There is a vast array of business intelligence providers - Tableau, PowerBI, Qlik, Looker, Sisense, Metabase, Preset (among others) - and all of them are sufficient to perform basic data analysis and visualization. As in the data warehouse and data transit layers, there exists feature parity in the business intelligence space: when one vendor adds a feature, the others follow.

The choice of business intelligence vendor largely depends on who is making the decision, what tools they have historical experience with, what the organization is willing to pay, and finally what features they need.

For example, if an organization needs several users to work on the same dashboard at the same time, they may opt not to use Tableau. If they need users to analyze CSV files from their local desktops, they may rule out Looker. And if the CTO has considerable experience with PowerBI, then the organization will likely use PowerBI. As all providers offer similar basic capabilities and user interfaces, the essential workflows can be quickly learned in each.

Nevertheless, organizations sometimes have more bespoke requirements, and business intelligence vendors vary in their provision or usability of corresponding features. It is therefore useful to compare vendors along a checklist of potential requirements so that an organization does not find itself one day needing a feature that the vendor cannot provide. Such features include:

  - Development
  - Design
  - Performance
  - Automation
  - Governance

Structuring a business intelligence deployment #

After deciding on your business intelligence vendor, you will build reporting, analytical and inspection dashboards.

Reporting dashboards present a suite of metrics to business stakeholders under a particular theme, such as sales performance or marketing performance. They are passively consumed, informational, and understood in isolation. For example, a marketing team will use a reporting dashboard to display open rates, click rates and conversion rates of their various marketing campaigns over time.

To the extent that reporting dashboards reveal unexpected increases or decreases in the data (i.e. variability), users will typically demand analysis. They will want metrics analyzed along various dimensions to see what factors in particular drove high-level trends. This process occurs within analytical dashboards.

Finally, users will want to see individual data points. These inspection views should exist separately from reporting and analytical views. Underlying data is displayed in tabular format with as many columns as possible to add context, as well as filters to isolate relevant sets of data.

Reporting dashboards #

Reporting dashboards succeed in their objective if they quickly and concisely answer the questions of their users. If a marketer wants to know whether subscriptions are up year-over-year, then they only want a single number: “Subscriptions are up 42%”.

If they further inquire into whether this was a transient or persistent trend, we would provide data from multiple periods: “Subscriptions are up 42% this year, compared to only 18% last year and 16% the year prior.” These statistics could be displayed using either a table or a visualization.
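
As an illustrative sketch, a year-over-year comparison like this usually reduces to a window function over yearly counts; the fct_subscriptions table and signup_date column below are assumptions, not a schema from this series:

WITH yearly AS (
  SELECT DATE_TRUNC('year', signup_date) AS year,
         COUNT(*)                        AS subscriptions
  FROM fct_subscriptions
  GROUP BY 1
)
SELECT year,
       subscriptions,
       -- percent change versus the prior year
       ROUND(
         100.0 * (subscriptions - LAG(subscriptions) OVER (ORDER BY year))
               / NULLIF(LAG(subscriptions) OVER (ORDER BY year), 0),
         1
       ) AS yoy_growth_pct
FROM yearly
ORDER BY year;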

The reporting dashboard is designed around the goals of a business team, such as increasing sales or growing marketing engagement. These goals are, for the most part, well-defined and static. The attendant questions, such as “how are sales recently” or “how is marketing engagement recently”, are too. Reporting dashboards should provide these answers front and center.

Anything which obscures these answers - such as missing titles, confusing data labeling, haphazard visual organization or slow dashboard performance - rapidly diminishes the utility of these dashboards. 

Good reporting dashboards make takeaways from the data obvious, with little to no work required by the user to understand what they need. If you cannot copy-and-paste a chart from a reporting dashboard into a PowerPoint deck, then it does not belong in the dashboard.

General principles for building reporting dashboards relate to[1]:

Finally, after the above requirements are satisfied, other features can be added to reporting dashboards, such as print-to-PDF, daily scheduled emails, and threshold-based alerting.

Analytical dashboards #

Analytical dashboards succeed in their objective if they produce novel insights from the data. Unlike reporting dashboards, there is no view of the data users routinely want to see, as such views would necessarily not be novel. Instead, business questions are not well-defined and investigations are more exploratory in nature. Analytical dashboards are therefore designed for exploratory data analysis (EDA).

Insights are novel to the extent they surface unexpected variability within the data. For example, email conversion rates may be higher among drip campaign emails compared to promotional emails, and it may be further the case that conversion rates are highest among drip campaigns placing the call-to-action at the top of the email rather than the bottom.

If email conversion rates were flat between drip campaigns and promotional emails, as well as between emails with calls-to-action at the top versus the bottom, then there would be no data insights of interest. As though placing weights on a balance, variability tells us which side of the data we should focus our efforts on.

It is not obvious in advance where to look for patterns. If you pivot the data by has_image or has_emoji instead of call_to_action_location, you might find that conversion rates are flat across those dimensions. It is only with respect to call_to_action_location that conversion rates differ.

As a result, analytical dashboards should make it easy to pivot by various dimensions. This can be achieved in the Editor view of the business intelligence tool, or by exposing parameters which allow the user to choose what dimensions to pivot the measure by. Additionally, analytical dashboards should generously include filters (often as many as the number of dimensions available) and enable data inspections of individual data points.
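
As a sketch of what exposing parameters can look like, many BI tools let a dashboard query interpolate a user-selected dimension through a template variable. The Jinja-style placeholder and the email_campaigns table below are illustrative assumptions, and the exact parameter syntax varies by vendor:

-- the user picks which dimension to pivot by; the measure stays fixed
SELECT {{ pivot_dimension }}  AS dimension_value,
       AVG(conversion_rate)   AS avg_conversion_rate
FROM email_campaigns
GROUP BY 1
ORDER BY 2 DESC;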

You’ll notice in the examples above that dimensions are always discrete: has_image, has_emoji, call_to_action_location. In fact, in data modeling more broadly, dimensions are discrete and facts (conversion_rate) are continuous. Traditionally, we perform analysis by pivoting a continuous measure, such as the conversion rate, by discrete dimensions.

If we have two continuous variables, such as number_of_images versus conversion_rate, then we typically discretize one of them by way of bins. The continuous variable number_of_images becomes the discrete variable number_of_images_bin with levels of [0, 1), [1, 3) and [3, 100). Then, as usual, we pivot the continuous conversion_rate by these discrete categories to determine whether it materially differs between them.

In actuality, conversion_rate is also discretized: after all, we must apply some aggregate function to collapse the individual data points, such as AVG(), to arrive at an average conversion rate in the number_of_images_bin = [1, 3). This is indeed the conventional analytical methodology in dashboard-driven exploratory data analysis: group-by a dimension, aggregate a measure.
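
Put together, this conventional methodology (bin a continuous variable, then group by it and aggregate the measure) might look like the following, written against an assumed email_campaigns table:

-- bin the continuous dimension, then group-by and aggregate the measure
SELECT CASE
         WHEN number_of_images < 1 THEN '[0, 1)'
         WHEN number_of_images < 3 THEN '[1, 3)'
         ELSE '[3, 100)'
       END                   AS number_of_images_bin,
       AVG(conversion_rate)  AS avg_conversion_rate
FROM email_campaigns
GROUP BY 1;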

While useful, this is only one particular style of analysis - one that I like to call “dispersionless”. When we partition the number_of_images variable into bins, we lose information on how many emails have number_of_images = 1 versus how many have number_of_images = 2. Conversion rates may very well differ between these two values, but we would never be able to see this difference using the bin of number_of_images_bin = [1, 3).

Additionally, the AVG(conversion_rate) also collapses the underlying distribution of conversion rates. If we sent just two email campaigns in the number_of_images_bin = [3, 100), one with a conversion rate of 90% and the other with a conversion rate of 10%, then our average conversion rate is 50%. However, we do not actually expect an average conversion rate of 50%, but rather a rate of “too little data to know”. Dispersion therefore qualifies our point estimates with corresponding uncertainty.
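
One way to keep dispersion in view, under the same assumed table, is to report the sample size and spread alongside the mean rather than the mean alone:

-- the mean, qualified by how many data points produced it and how spread out they are
SELECT CASE
         WHEN number_of_images < 1 THEN '[0, 1)'
         WHEN number_of_images < 3 THEN '[1, 3)'
         ELSE '[3, 100)'
       END                      AS number_of_images_bin,
       COUNT(*)                 AS n_campaigns,
       AVG(conversion_rate)     AS avg_conversion_rate,
       STDDEV(conversion_rate)  AS stddev_conversion_rate
FROM email_campaigns
GROUP BY 1;

A bin containing only two campaigns with a wide spread then reads as “too little data to know” rather than as a confident 50%.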

Charting continuous variables against continuous variables (or alternatively, facts against facts) typically sits within the realm of scatter plots and regression models. While business intelligence tools can produce these, they typically fall short of the outputs generated by full-fledged programming languages such as Python and R. Business intelligence tools are not statistical tools, and as such they do not produce statistical summaries or statistical graphs (e.g. CDFs and residual plots), nor do they work well with large volumes of unaggregated data.

Analytical dashboards are therefore useful for the conventional, “guess-and-check” analysis of pivoting measures by various dimensions in the hope that some of these dimensions exhibit unexpected variability. Oftentimes however, they can leave the analyst overwhelmed by the number of dimensions to choose from and underwhelmed by the amount of variability observed. Data-driven patterns can be hard to come by.

There is indeed a fast-track to identifying patterns which exhibit the most variability - that is, the most promising data insights - though this cannot be achieved using business intelligence tools. For that, we will need to jump into our Jupyter notebooks and perform some last-mile analytics. That is the subject of our next chapter.

What about metric stores? #

Business intelligence tools today couple two separate but related analytical functions: the definition of metrics, as well as the visualization of those metrics. You might, for example, create an average revenue per active user metric in Tableau by filtering for all active users, then taking the average across users. You could further group by year if you wanted to compare how this metric changed over time, and visualize these changes as a bar chart.

However, once defined within Tableau, this average revenue per active user metric is not easily replicated to other applications such as Jupyter notebooks, Excel workbooks, webapps, other business intelligence tools, or enterprise data systems like Hubspot and Salesforce. 

If a Jupyter user wanted to use the average revenue per active user metric, they would have to recreate the underlying SQL, then reconcile its results to those in Tableau. They could not simply reference the figures produced by Tableau directly. Metrics, when created within the presentation layer, essentially become “locked in” to the tool.
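
Recreating the underlying SQL typically means writing a second, independent definition of the metric. In a sketch like the one below (table, column and filter names assumed for illustration), nothing guarantees that the notebook’s notion of “active” matches Tableau’s:

-- average revenue per active user, re-derived outside of Tableau
SELECT SUM(revenue) * 1.0 / NULLIF(COUNT(DISTINCT user_id), 0) AS avg_revenue_per_active_user
FROM fct_revenue
WHERE is_active;  -- does this filter match Tableau's? that is the reconciliation problem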

Most commonly, when replicating a metric to another system, users imitate the SQL but do not perform the reconciliation. Without the reconciliation however, inconsistencies invariably arise and consume hours of analyst time to resolve. The desire to sidestep this issue entirely - to reference metrics instead of duplicating them - led to the emergence of “metric stores”.

Metric stores decouple the definition of metrics from their visualization. Metrics are defined exclusively in the metric store and referenced by the business intelligence tool. Under this model, business intelligence becomes just another consumer of metrics, like Jupyter or Excel or an internal webapp.

Metric stores are sometimes called “headless BI” because they encode the logic of metrics without their attendant, end-user presentation (“the head”). They provide CLIs and APIs, but not GUIs. Because metric stores extricate the code logic from the visual display, metrics can be more easily stored in a version control system, documented, code reviewed, tested and deployed. 

The semantics of modern-day metric stores originated in 2012 with Looker’s LookML, which pioneered the definition of metrics in simple, human-readable configuration files:

view: encounter {
  sql_table_name: lookerdata.healthcare_demo_live.encounter ;;
  dimension: status {
    label: "Status"
    type: string
    sql: ${TABLE}.status ;;
  }

  dimension: code_name {
    type: string
    sql: case when ${code} = 'IMP' then 'Inpatient'
    when ${code} = 'AMB' then 'Ambulatory'
    when ${code} = 'EMER' then 'Emergency Department' end;;
  }

  dimension_group: period__end {
    label: "Discharge"
    type: time
    timeframes: [
      date, week, month, year, day_of_week, time, time_of_day, hour_of_day, raw
    ]
    sql:  ${period}.end ;;
  }

  measure: count_patients {
    label: "Number of Patients"
    type: count_distinct
    sql: ${patient_id} ;;
    drill_fields: [patient_id, patient.gender, patient.age, patient.los]
  }

  measure: med_los {
    group_label: "LOS Statistics"
    label: "Median Length of Stay"
    type: median
    sql: ${length_of_stay} ;;
    value_format_name: decimal_2
  }

  measure: repeat_patients {
    label: "Percent of Repeat Patients"
    type: number
    value_format_name: percent_2
    sql: 1.0*(${count}-${count_patients})/nullif(${count},0) ;;
  }
}

Looker’s insight was to equip otherwise nondescript SQL tables (or dbt data models) with the analytics-friendly handlebars of dimensions and measures. These handlebars could be variously combined to form “metrics” - a particular, summary view of the data - such as:

  - title: 'Error Type 2: Data Entry Error'
    name: 'Error Type 2: Data Entry Error'
    model: healthcare
    explore: observation_vitals
    type: looker_bar
    fields: [observation_vitals.type, observation_vitals.count_anomalies]
    pivots: [observation_vitals.type]
    filters:
      observation_vitals.issued_hour: 48 hours
      observation_vitals.absolute_standard_deviation: "<15"
    limit: 500

Above, Error Type 2: Data Entry Error represents a final, polished metric ready to be consumed by business stakeholders. In English, it translates to: “the number of Type 2 data anomalies per observational vital type in the past 48 hours, excluding outliers”. If another application wanted to reference this metric, it could simply query this Look (a reference to a Looker visualization) via API.

Although Looker invented the decoupled business intelligence layer, it had two drawbacks that prevented mass adoption. The first was price: a starter package runs to several tens of thousands of dollars, putting it out of reach for most individuals and early-stage startups. The second was that LookML was never a universal protocol: it only worked with Looker. It was not designed to work with Jupyter, internal webapps, or other BI clients like Tableau (despite what two-year-old press releases may claim).

Over the past few years, open-source alternatives have emerged to make metric stores more broadly accessible. The most prominent of these are Cube.js, Transform’s MetricFlow (recently acquired by dbt Labs) and Google’s Malloy (still experimental). Each prescribes semantics similar to Looker’s, as in the Malloy example below:

source: airports is table('malloy-data.faa.airports') {
  measure: airport_count is count()
  measure: avg_elevation is elevation.avg()
  query: top_5_states is {
    group_by: state
    aggregate: airport_count
    limit: 5
  }

  query: by_facility_type is {
    group_by: fac_type
    aggregate: airport_count
  }
}

To request the number of airports by facility type, a client would run:

query: airports -> by_facility_type

Is a metric store necessary? #

One might wonder why dbt models, which also structure data using dimensions and measures, do not suffice for defining metrics. If metrics should not be defined within the presentation layer, perhaps they can live within the warehouse layer instead?

This is in fact how many data teams operate today. Whether it is to create reusable data marts or to craft bespoke team metrics, as much business logic as possible is encoded within dbt. The goal is to make the warehouse layer as intelligent as possible, and the presentation layer as naive as possible.

The disadvantage of this approach is that metrics are situated alongside data marts, living within the same codebase and the same database. Metrics, however, are not data marts. 

Data marts are modular, reusable, materialized data sets (or “OLAP cubes”) which can be used to craft many metrics. Metrics, on the other hand, are particular slices and aggregations of this underlying OLAP cube. A single dashboard can produce various metrics all sourcing from the same OLAP cube.

If we define metrics using dbt, we would need a great many dbt models: one for num_anomalies_by_observational_vital_type_past_48h_excl_outliers, as well as one for num_anomalies_by_practitioner_name_past_48h_excl_outliers, and so on. Clearly, this approach is untenable at scale, as it would crowd out the reusable, modular OLAP cubes with narrowly defined, bespoke metrics.
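
To make the distinction concrete, a single reusable mart can serve both of those metrics as thin queries on top of it, rather than as two separate dbt models. The mart name and columns below are assumed for illustration:

-- assumed mart: mart_observation_vitals(vital_type, practitioner_name, issued_at,
--                                        is_anomaly, is_outlier)

-- metric: anomalies by observational vital type, past 48 hours, excluding outliers
SELECT vital_type,
       COUNT(*) AS num_anomalies
FROM mart_observation_vitals
WHERE is_anomaly
  AND NOT is_outlier
  AND issued_at >= NOW() - INTERVAL '48 hours'
GROUP BY 1;

-- metric: the same measure sliced by practitioner instead
SELECT practitioner_name,
       COUNT(*) AS num_anomalies
FROM mart_observation_vitals
WHERE is_anomaly
  AND NOT is_outlier
  AND issued_at >= NOW() - INTERVAL '48 hours'
GROUP BY 1;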

Metric stores are downstream of OLAP cubes but upstream of BI clients. They allow business logic to be pulled out of the presentation layer, yet not be haphazardly stuffed into the warehouse layer. They facilitate the broad reusability of metrics while at the same time preserving their separation from dbt’s reusable, modular data sets.


[0] This is known as Twyman’s law.

[1] Metabase also provides useful documentation on how to design business intelligence dashboards.
