A Note on This Article

This article was originally written in 2006–2007, at a time when the Data Warehouse landscape was dominated by CIF, BUS Architecture, and centralized EDW approaches. The challenges we identified then — spiraling TCO, fragmented datamarts, slow time-to-insight, and the impossibility of funding a full EDW upfront — were real and largely unsolved.

The ITA&S Hybrid DW Architecture was our answer at the time. What we did not know yet was that these same frustrations, and the architectural principles we developed to address them, would eventually lead us to design and implement what is today called a Lakehouse — before the concept had a name. The real-time Lakehouse we now offer through our GUM-RTDP platform is the direct evolution of the thinking that started here.

We are keeping this article as a testament to that journey — and as proof that our current platforms are not built on trend-following, but on decades of hands-on experience solving real problems.


Organizations continue to spend millions annually on Business Intelligence — and still lack an integrated Enterprise Data Warehouse (EDW) or even conformed dimensions across their datamarts. The result is predictable:

  • No reliable way to compare metrics and reports across business units and functions
  • Endless debates about conflicting numbers
  • Excessive time spent manipulating data
  • Decisions made on imprecise and outdated information

Building a BI Enabling Architecture (BIEA) requires people with the right architectural experience; it is not primarily a question of tools. A suite of tools is still necessary, but without the proper architecture underneath, those tools deliver little value.

The Problem with Existing Approaches

A BIEA contains multiple components and performs multiple functions — but central to it is the Data Warehouse or Enterprise Data Warehouse. The most popular approach at the time, the Corporate Information Factory (CIF, Inmon), was also the most expensive. Multiple data stores, combined with the continuous movement of data between them through ETL (Extract, Transform and Load) processes, drove costs up relentlessly.

This is not a criticism of CIF alone. The BUS Architecture (Kimball) and the centralized DW approach each had their place — but also their limitations. For anyone who has implemented all of these, those limitations are well known.

The CIF in particular was frustrating for several reasons:

  • Finding a sponsor was difficult — allocated budgets typically covered only a fraction of the full EDW
  • The time required to put the EDW in place with proper data quality controls was significant
  • Users quickly understood that an EDW without datamarts was not usable — they had to wait for the EDW to be completed before datamarts could be built, adding further cost and delay
  • Any change required a cascade of updates — EDW, then datamart, then MOLAP cube, then reports

The easy way out was to build datamarts directly. But that path led inevitably to data silos. Building a constellation of star and snowflake schemas with conformed dimensions was possible — but introduced its own complexity, particularly around maintaining multi-dimensional models and preserving history. Most Data and BI Architects believed they were fully covered with Slowly Changing Dimension Type 2 — they were not.
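
To make that limitation concrete, here is a minimal sketch, in illustrative Python with made-up data, of why SCD Type 2 alone is not enough: it preserves history "as-was", but producing an "as-is" view after a reorganization still requires recasting the facts against the current hierarchy.

```python
# Minimal sketch (illustrative names and data): SCD Type 2 gives "as-was"
# reporting, but an "as-is" view after a reorganization requires recasting.

from collections import defaultdict

# SCD Type 2 customer dimension: each natural key may have several versions.
dim_customer = [
    {"sk": 1, "customer_id": "C1", "region": "East",  "current": False},
    {"sk": 2, "customer_id": "C1", "region": "North", "current": True},   # reorg moved C1
]

# Facts stay frozen against the surrogate key that was current at load time.
fact_sales = [
    {"customer_sk": 1, "amount": 100.0},   # loaded before the reorg
    {"customer_sk": 2, "amount": 250.0},   # loaded after the reorg
]

def totals_by_region(as_is: bool) -> dict:
    """Aggregate sales by region, either as recorded (as-was) or recast (as-is)."""
    by_sk = {row["sk"]: row for row in dim_customer}
    current = {row["customer_id"]: row for row in dim_customer if row["current"]}
    out = defaultdict(float)
    for f in fact_sales:
        dim_row = by_sk[f["customer_sk"]]
        region = current[dim_row["customer_id"]]["region"] if as_is else dim_row["region"]
        out[region] += f["amount"]
    return dict(out)

print(totals_by_region(as_is=False))  # {'East': 100.0, 'North': 250.0} -- as-was
print(totals_by_region(as_is=True))   # {'North': 350.0}                -- as-is (recast)
```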

The centralized DW approach was tempting but equally limited. In practice, BI resources ended up adding aggregates, views, and MOLAP cubes to make a normalized EDW usable — adding cost and fragility. Proprietary hardware and databases were expensive, and MOLAP cube refreshes were slow and unreliable, making near real-time implementations impractical.

The ITA&S Hybrid DW Architecture

After experiencing the limitations of all these approaches firsthand, we developed the ITA&S Hybrid DW Architecture — a constellation of snowflakes with a series of deliberate enhancements:

  • Conformed Dimensions with lowest level principle
  • Anchor Points in a Dimension
  • Automated Dimensional Recasting
  • Dimension-related EDW Extensions
  • Denormalized Snowflakes
  • Additional EDW Extensions
  • Fact Delta Records (for retroactive processing)

This approach reduces the number of distinct data stores and — critically — the number of ETL processes required, which is the single most effective way to reduce BI TCO.
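
As an illustration of the Fact Delta Records item above, here is a minimal sketch of one plausible reading of the technique: retroactive corrections are appended as signed delta rows rather than rewriting fact rows in place, so aggregates stay correct and the audit trail is preserved. The schema and figures are illustrative only, not the production design.

```python
# Hedged sketch of fact delta records for retroactive processing:
# corrections arrive as appended delta rows, never as in-place updates.

fact_revenue = [
    {"txn_id": "T1", "date_key": 20060301, "amount": 120.0, "is_delta": False},
    {"txn_id": "T2", "date_key": 20060301, "amount":  80.0, "is_delta": False},
]

def apply_retroactive_correction(fact_table, txn_id, corrected_amount):
    """Append a delta record carrying only the difference for a past transaction."""
    original = sum(r["amount"] for r in fact_table if r["txn_id"] == txn_id)
    delta = corrected_amount - original
    if delta:
        fact_table.append({
            "txn_id": txn_id,
            "date_key": next(r["date_key"] for r in fact_table if r["txn_id"] == txn_id),
            "amount": delta,
            "is_delta": True,
        })

apply_retroactive_correction(fact_revenue, "T1", 100.0)   # T1 re-rated downward
print(sum(r["amount"] for r in fact_revenue))              # 180.0 -- corrected total
```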

The most recent deployment of this architecture, in production since 2006, validated the approach definitively. Delivered at approximately one third of the typical cost for a project of this scale, it featured:

  • Two main fact tables with billions of rows
  • Only one additional dimension required for the second fact table — a direct result of conformed dimensions
  • Rich dimensions with multiple hierarchies
  • 34 dimension roles
  • A rating engine providing anticipated margins on each transaction
  • A leading-edge BI environment operating in near real-time

Additional Strategies for Reducing BI TCO

The ITA&S Hybrid DW Architecture is the primary lever for reducing BI TCO, but not the only one. The following strategies can further reduce costs and complexity.

Leverage Your Integration Layer to Simplify ETL

Traditional ETL requires developers to understand the internal database schemas of every source system — tightly coupling the data warehouse to structures you do not control. A far better approach is to consume business events produced by your applications rather than reading their underlying tables directly.

We first understood the power of this principle in a Telecommunications project, where a rating engine was producing files of mediated and rated voice calls every 15 to 20 minutes. Rather than navigating hundreds of billing system tables, we asked the Billing team to add a few columns to the file the rating engine was already generating. They did it in days. The ETL team never had to touch the billing system internals. We even added the file as a dimension — with a time hierarchy — enabling us to simulate the state of the billing system at any point in time, and to use it as part of a control and audit process.
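
The sketch below, with illustrative names and data, shows the "file as a dimension" idea in miniature: each incoming rated-call file becomes a row in a small dimension, every fact carries that file's key, and filtering on file load time reconstructs what the billing feed looked like at any point in time.

```python
# Hedged sketch, illustrative schema: a source-file dimension with a time
# attribute lets us replay the billing feed as of any moment.

from datetime import datetime

dim_source_file = [
    {"file_key": 1, "file_name": "rated_calls_0915.dat", "loaded_at": datetime(2006, 3, 1, 9, 15)},
    {"file_key": 2, "file_name": "rated_calls_0935.dat", "loaded_at": datetime(2006, 3, 1, 9, 35)},
]

fact_calls = [
    {"call_id": "A", "file_key": 1, "duration_sec": 60,  "rated_amount": 0.90},
    {"call_id": "B", "file_key": 2, "duration_sec": 180, "rated_amount": 2.70},
]

def billed_as_of(ts: datetime) -> float:
    """Total rated amount as the billing feed would have shown it at time ts."""
    visible = {f["file_key"] for f in dim_source_file if f["loaded_at"] <= ts}
    return sum(c["rated_amount"] for c in fact_calls if c["file_key"] in visible)

print(billed_as_of(datetime(2006, 3, 1, 9, 20)))  # 0.9 -- only the first file loaded
print(billed_as_of(datetime(2006, 3, 1, 10, 0)))  # 3.6 -- both files visible
```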

That experience was a turning point. Consuming a business event — call.rated — rather than performing ETL against a source system was simply more intelligent. It was the seed of what eventually became our Event-Driven Architecture platform, where business events flow through Kafka and are consumed by the Lakehouse in real time, eliminating entire categories of ETL complexity. What started as a pragmatic workaround in 2006 is now a core architectural principle.
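
As a minimal sketch of the event-driven pattern, the following consumes a call.rated business event from Kafka and shapes it into a fact row, with no knowledge of the billing system's internal schema. The topic name, broker address, and field names are assumptions made for illustration; it uses the kafka-python client, but any Kafka consumer library would do.

```python
# Hedged sketch: consume call.rated events instead of running ETL against
# the billing system's tables. Topic, broker, and fields are hypothetical.

import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "billing.call.rated",                       # hypothetical topic name
    bootstrap_servers="kafka:9092",             # hypothetical broker address
    group_id="lakehouse-ingest",
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # The event already carries the business facts the warehouse needs.
    fact_row = {
        "call_id": event["call_id"],
        "rated_amount": event["rated_amount"],
        "rated_at": event["rated_at"],
    }
    # Write fact_row to the data platform's landing table here (sink omitted).
    print(fact_row)
```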

If your organization has invested in SOA, ESB, or modern event streaming, reusing that integration layer to feed your data platform is one of the highest-leverage TCO reductions available to you.

Understand Vendor Pricing Models and Negotiate Strategically

BI vendor negotiations reward preparation. Enterprise edition licenses often include additional tools at a marginal cost increase — tools that would otherwise require separate procurement. If you are in a best-of-breed model, factor total cost carefully before dismissing the enterprise edition.

Pay close attention to annual maintenance costs — attractive initial pricing sometimes masks significant increases after year two or three. And look beyond your own project: if another initiative in your organization is negotiating with the same vendor for a different product, there may be leverage to reduce maintenance costs across both deals simultaneously. We have built sophisticated what-if analysis models on vendor proposals to expose the true total cost — a level of rigor that often surprises both procurement teams and vendors alike.
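
A simple version of such a what-if model is sketched below: project the multi-year total cost of two proposals under different maintenance escalation assumptions. All figures are illustrative, not drawn from any real vendor quote.

```python
# Hedged sketch of a what-if model on vendor proposals: the cheaper license
# is not always the cheaper deal once maintenance escalation is projected.

def total_cost(license_fee: float, year1_maintenance_rate: float,
               annual_escalation: float, years: int) -> float:
    """License fee plus maintenance that escalates each year."""
    total = license_fee
    maintenance = license_fee * year1_maintenance_rate
    for _ in range(years):
        total += maintenance
        maintenance *= 1 + annual_escalation
    return total

# Proposal B costs more up front but less over five years.
proposal_a = total_cost(license_fee=400_000, year1_maintenance_rate=0.20,
                        annual_escalation=0.08, years=5)
proposal_b = total_cost(license_fee=420_000, year1_maintenance_rate=0.18,
                        annual_escalation=0.03, years=5)
print(f"Proposal A over 5 years: {proposal_a:,.0f}")
print(f"Proposal B over 5 years: {proposal_b:,.0f}")
```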

Consider Open Source BI Where Appropriate

Open source BI has matured significantly. Multiple organizations have replaced proprietary tools with open source alternatives without sacrificing capability. The right approach is not to abandon existing investments that deliver value — but to include open source options in any replacement evaluation for tools that are underperforming.

A hybrid model combining proprietary and open source tools is entirely viable, provided you establish clear BI governance defining which tools are appropriate for which tasks. Modern open source options worth evaluating include Apache Superset, dbt, and others that have emerged well beyond the Pentaho and Jaspersoft era.

Exploit Application-Generated Transaction Files

Where SOA or event streaming is not yet available, application-generated files remain a practical and underutilized ETL simplification strategy. Many systems produce structured output files as a natural part of their processing. Working with application teams to add a small number of required fields to existing files — rather than reverse-engineering source database schemas — can dramatically reduce ETL complexity, coupling, and maintenance burden.
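
A minimal sketch of that pattern follows, with hypothetical file and column names: validate the agreed file contract and stage only the agreed columns, so the ETL never depends on the source system's internal schema.

```python
# Hedged sketch: consume an application-generated transaction file against an
# agreed column contract rather than reverse-engineering the source database.

import csv
import io

# The contract: columns the application team agreed to include in the file.
EXPECTED_COLUMNS = {"txn_id", "txn_timestamp", "account_id", "amount", "product_code"}

def stage_rows(fileobj) -> list[dict]:
    """Validate the agreed contract, then return rows ready for the staging area."""
    reader = csv.DictReader(fileobj, delimiter="|")
    missing = EXPECTED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"File no longer matches the agreed contract; missing: {missing}")
    return [{col: row[col] for col in EXPECTED_COLUMNS} for row in reader]

# Stand-in for a file the application already produces, with the agreed extra fields.
sample = io.StringIO(
    "txn_id|txn_timestamp|account_id|amount|product_code\n"
    "T100|2006-03-01T09:15:00|A42|120.00|P7\n"
)
print(stage_rows(sample))
```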


The Bridge to Today

The challenges we identified and solved here — spiraling TCO, fragmented datamarts, slow time-to-insight, and the impossibility of funding a full EDW upfront — did not disappear. They evolved. And so did our thinking. The architectural principles developed through this work eventually led us to design and implement what is today called a Lakehouse, before the concept had a name. The real-time Lakehouse at the heart of our GUM-RTDP platform is the direct evolution of the journey that started here.

For a deeper dive into the BIEA and DW Hybrid Modeling concepts, see our course: DW Hybrid: Key to Significantly Lowering Your Enterprise DW TCO. To see how these architectural principles come to life in our current platforms, visit the EDA Platform and the Lakehouse Platform.