Executives receive diverging answers to simple questions ("What was revenue last month?") because Finance and Data Science define "revenue" differently in their respective tools. Tableau dashboards, Power BI reports, dbt models, and ad-hoc SQL views each compute the same metric with subtly different logic. Each BI tool accumulates its own calculated fields, so when a definition changes it must be updated in every tool individually. NL-to-SQL systems produce unreliable answers for the same reason: they have no single source of truth for metric semantics.
A semantic layer sits between the physical data store and all consuming tools (BI dashboards, experimentation platforms, ML feature pipelines, NL-to-SQL interfaces). Metric definitions, including grain, filters, time logic, and business rules, are declared once in the semantic layer and served to all downstream consumers via a unified API (SQL dialect, GraphQL, or REST). dbt Semantic Layer, Cube, Looker/LookML, and AtScale all implement variants of this pattern. Consumers request "Revenue" by name rather than writing the SQL logic themselves; the semantic layer compiles the request into a query against the physical warehouse. Caching and push-down optimization reduce per-query recomputation.
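The "declare once, compile for every consumer" idea can be sketched in a few lines of Python. This is a minimal illustration and not the API of dbt Semantic Layer, Cube, LookML, or AtScale. The Metric class, the METRICS registry, the compile_metric function, the table analytics.fct_orders, and its columns and filters are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Metric:
    """A metric declared once: aggregation, source, time column, business filters."""
    name: str
    table: str                      # hypothetical warehouse table or model
    expression: str                 # aggregation, e.g. "SUM(amount)"
    time_column: str
    filters: tuple[str, ...] = ()   # business rules applied to every query


# Hypothetical registry: the single place where "revenue" is defined.
METRICS = {
    "revenue": Metric(
        name="revenue",
        table="analytics.fct_orders",
        expression="SUM(amount)",
        time_column="ordered_at",
        filters=("status = 'completed'", "is_test = FALSE"),
    )
}

ALLOWED_GRAINS = {"day", "week", "month", "quarter", "year"}


def compile_metric(name: str, grain: str, start: date, end: date) -> str:
    """Compile a metric request into SQL using the shared definition.

    Dates are rendered with isoformat() here for readability; a real
    implementation would pass them as bound parameters instead.
    """
    if grain not in ALLOWED_GRAINS:
        raise ValueError(f"unsupported grain: {grain}")
    metric = METRICS[name]
    conditions = [
        f"{metric.time_column} >= DATE '{start.isoformat()}'",
        f"{metric.time_column} < DATE '{end.isoformat()}'",
        *metric.filters,
    ]
    where = "\n  AND ".join(conditions)
    return (
        f"SELECT DATE_TRUNC('{grain}', {metric.time_column}) AS period,\n"
        f"       {metric.expression} AS {metric.name}\n"
        f"FROM {metric.table}\n"
        f"WHERE {where}\n"
        f"GROUP BY 1\n"
        f"ORDER BY 1"
    )


if __name__ == "__main__":
    # Every consumer asking for monthly revenue receives the same SQL.
    print(compile_metric("revenue", "month", date(2024, 1, 1), date(2024, 2, 1)))
```

Because the filters live in the registry rather than in each dashboard, changing what counts as revenue is a one-line edit that every consumer picks up on its next request.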
Semantic layer engine (dbt Semantic Layer / Cube / Looker LookML / AtScale / Metriql) + BI connector (Tableau / Power BI / Metabase / Superset) + metric glossary management + caching layer + API gateway for downstream consumers.
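As a rough illustration of the caching layer in this stack, the sketch below keeps query results in memory for a fixed time-to-live, keyed by a hash of the compiled SQL. It is an assumption-laden toy, not how any of the listed engines implement caching. QueryCache, run_query, ttl_seconds, and clock are hypothetical names, and real engines typically add invalidation on data refresh and pre-aggregation.

```python
import hashlib
import time
from typing import Any, Callable


class QueryCache:
    """In-memory TTL cache keyed by a hash of the compiled SQL text."""

    def __init__(
        self,
        run_query: Callable[[str], Any],        # hypothetical warehouse executor
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._run_query = run_query
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def execute(self, sql: str) -> Any:
        """Return a cached result if it is still fresh, else hit the warehouse."""
        key = hashlib.sha256(sql.encode("utf-8")).hexdigest()
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        result = self._run_query(sql)
        self._entries[key] = (now, result)
        return result
```

Since all consumers go through the same compiled query for a given metric, identical requests from different tools map to the same cache key, which is what lets one warehouse execution serve many dashboards.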
Modular, version-controlled SQL transformations executed inside the warehouse, bringing software engineering practices to analytics code.
The semantic layer sits on top of the transformation layer and references its models.
Unified data lake + warehouse architecture on open-format object storage, eliminating copy pipelines and providing ACID semantics at petabyte scale.
Physical data must be organized in a queryable foundation before semantic models can reference it.
Conversational analytics letting users ask data questions in natural language and receive governed answers, proactive insights, and charts.
An AI system converting business questions in natural language into executable SQL, enabling non-technical users to query data warehouses directly.
Controlled-experiment infrastructure with statistical rigor, enabling continuous testing and replacing opinion-driven decisions with evidence.