A Note on This Article

This article was originally written in 2006–2007, at a time when the Data Warehouse landscape was dominated by CIF, BUS Architecture, and centralized EDW approaches. The challenges we identified then — spiraling TCO, fragmented datamarts, slow time-to-insight, and the impossibility of funding a full EDW upfront — were real and largely unsolved.

The ITA&S Hybrid DW Architecture was our answer at the time. What we did not know yet was that these same frustrations, and the architectural principles we developed to address them, would eventually lead us to design and implement what is today called a Lakehouse — before the concept had a name. The real-time Lakehouse we now offer through our GUM-RTDP platform is the direct evolution of the thinking that started here.

We are keeping this article as a testament to that journey — and as proof that our current platforms are not built on trend-following, but on decades of hands-on experience solving real problems.


Organizations continue to spend millions annually on Business Intelligence — and still lack an integrated Enterprise Data Warehouse (EDW) or even conformed dimensions across their datamarts. The result is predictable:

  • No reliable way to compare metrics and reports across business units and functions
  • Endless debates about conflicting numbers
  • Excessive time spent manipulating data
  • Decisions made on imprecise and outdated information

Building a BI Enabling Architecture (BIEA) requires people with the right architectural experience; it is not primarily a question of tools. A suite of tools is still necessary, but without the proper architecture underneath, those tools deliver little value.

The Problem with Existing Approaches

A BIEA contains multiple components and performs multiple functions — but central to it is the Data Warehouse or Enterprise Data Warehouse. The most popular approach at the time, the Corporate Information Factory (CIF, Inmon), was also the most expensive. Multiple data stores, combined with the continuous movement of data between them through ETL (Extract, Transform and Load) processes, drove costs up relentlessly.

This is not a criticism of CIF alone. The BUS Architecture (Kimball) and the centralized DW approach each had their place — but also their limitations. For anyone who has implemented all of these, those limitations are well known.

The CIF in particular was frustrating for several reasons:

  • Finding a sponsor was difficult — allocated budgets typically covered only a fraction of the full EDW
  • The time required to put the EDW in place with proper data quality controls was significant
  • Users quickly understood that an EDW without datamarts was not usable — they had to wait for the EDW to be completed before datamarts could be built, adding further cost and delay
  • Any change required a cascade of updates — EDW, then datamart, then MOLAP cube, then reports

The easy way out was to build datamarts directly. But that path led inevitably to data silos. Building a constellation of star and snowflake schemas with conformed dimensions was possible — but introduced its own complexity, particularly around maintaining multi-dimensional models and preserving history. Most Data and BI Architects believed they were fully covered with Slowly Changing Dimension Type 2 — they were not.
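
To make that limitation concrete, here is a minimal sketch, in illustrative Python with made-up data, of why SCD Type 2 alone is not enough: it preserves history "as-was", but producing an "as-is" view after a reorganization still requires recasting the facts against the current hierarchy.

```python
# Minimal sketch (illustrative names and data): SCD Type 2 gives "as-was"
# reporting, but an "as-is" view after a reorganization requires recasting.

from collections import defaultdict

# SCD Type 2 customer dimension: each natural key may have several versions.
dim_customer = [
    {"sk": 1, "customer_id": "C1", "region": "East",  "current": False},
    {"sk": 2, "customer_id": "C1", "region": "North", "current": True},   # reorg moved C1
]

# Facts stay frozen against the surrogate key that was current at load time.
fact_sales = [
    {"customer_sk": 1, "amount": 100.0},   # loaded before the reorg
    {"customer_sk": 2, "amount": 250.0},   # loaded after the reorg
]

def totals_by_region(as_is: bool) -> dict:
    """Aggregate sales by region, either as recorded (as-was) or recast (as-is)."""
    by_sk = {row["sk"]: row for row in dim_customer}
    current = {row["customer_id"]: row for row in dim_customer if row["current"]}
    out = defaultdict(float)
    for f in fact_sales:
        dim_row = by_sk[f["customer_sk"]]
        region = current[dim_row["customer_id"]]["region"] if as_is else dim_row["region"]
        out[region] += f["amount"]
    return dict(out)

print(totals_by_region(as_is=False))  # {'East': 100.0, 'North': 250.0} -- as-was
print(totals_by_region(as_is=True))   # {'North': 350.0}                -- as-is (recast)
```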

The centralized DW approach was tempting but equally limited. In practice, BI resources ended up adding aggregates, views, and MOLAP cubes to make a normalized EDW usable — adding cost and fragility. Proprietary hardware and databases were expensive, and MOLAP cube refreshes were slow and unreliable, making near real-time implementations impractical.

The ITA&S Hybrid DW Architecture

After experiencing the limitations of all these approaches firsthand, we developed the ITA&S Hybrid DW Architecture — a constellation of snowflakes with a series of deliberate enhancements:

  • Conformed Dimensions with lowest level principle
  • Anchor Points in a Dimension
  • Automated Dimensional Recasting
  • Dimension-related EDW Extensions
  • Denormalized Snowflakes
  • Additional EDW Extensions
  • Fact Delta Records (for retroactive processing)

This approach reduces the number of distinct data stores and — critically — the number of ETL processes required, which is the single most effective way to reduce BI TCO.
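
As an illustration of the Fact Delta Records item above, here is a minimal sketch of one plausible reading of the technique: retroactive corrections are appended as signed delta rows rather than rewriting fact rows in place, so aggregates stay correct and the audit trail is preserved. The schema and figures are illustrative only, not the production design.

```python
# Hedged sketch of fact delta records for retroactive processing:
# corrections arrive as appended delta rows, never as in-place updates.

fact_revenue = [
    {"txn_id": "T1", "date_key": 20060301, "amount": 120.0, "is_delta": False},
    {"txn_id": "T2", "date_key": 20060301, "amount":  80.0, "is_delta": False},
]

def apply_retroactive_correction(fact_table, txn_id, corrected_amount):
    """Append a delta record carrying only the difference for a past transaction."""
    original = sum(r["amount"] for r in fact_table if r["txn_id"] == txn_id)
    delta = corrected_amount - original
    if delta:
        fact_table.append({
            "txn_id": txn_id,
            "date_key": next(r["date_key"] for r in fact_table if r["txn_id"] == txn_id),
            "amount": delta,
            "is_delta": True,
        })

apply_retroactive_correction(fact_revenue, "T1", 100.0)   # T1 re-rated downward
print(sum(r["amount"] for r in fact_revenue))              # 180.0 -- corrected total
```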

The most recent deployment of this architecture, in production since 2006, validated the approach definitively. Delivered at approximately one third of the typical cost for a project of this scale, it featured:

  • Two main fact tables with billions of rows
  • Only one additional dimension required for the second fact table — a direct result of conformed dimensions
  • Rich dimensions with multiple hierarchies
  • 34 dimension roles
  • A rating engine providing anticipated margins on each transaction
  • A leading-edge BI environment operating in near real-time

Additional Strategies for Reducing BI TCO

The ITA&S Hybrid DW Architecture is the primary lever for reducing BI TCO, but not the only one. The following strategies can further reduce costs and complexity.

Leverage Your Integration Layer to Simplify ETL

Traditional ETL requires developers to understand the internal database schemas of every source system — tightly coupling the data warehouse to structures you do not control. A far better approach is to consume business events produced by your applications rather than reading their underlying tables directly.

We first understood the power of this principle in a Telecommunications project, where a rating engine was producing files of mediated and rated voice calls every 15 to 20 minutes. Rather than navigating hundreds of billing system tables, we asked the Billing team to add a few columns to the file the rating engine was already generating. They did it in days. The ETL team never had to touch the billing system internals. We even added the file as a dimension — with a time hierarchy — enabling us to simulate the state of the billing system at any point in time, and to use it as part of a control and audit process.
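
The sketch below, with illustrative names and data, shows the "file as a dimension" idea in miniature: each incoming rated-call file becomes a row in a small dimension, every fact carries that file's key, and filtering on file load time reconstructs what the billing feed looked like at any point in time.

```python
# Hedged sketch, illustrative schema: a source-file dimension with a time
# attribute lets us replay the billing feed as of any moment.

from datetime import datetime

dim_source_file = [
    {"file_key": 1, "file_name": "rated_calls_0915.dat", "loaded_at": datetime(2006, 3, 1, 9, 15)},
    {"file_key": 2, "file_name": "rated_calls_0935.dat", "loaded_at": datetime(2006, 3, 1, 9, 35)},
]

fact_calls = [
    {"call_id": "A", "file_key": 1, "duration_sec": 60,  "rated_amount": 0.90},
    {"call_id": "B", "file_key": 2, "duration_sec": 180, "rated_amount": 2.70},
]

def billed_as_of(ts: datetime) -> float:
    """Total rated amount as the billing feed would have shown it at time ts."""
    visible = {f["file_key"] for f in dim_source_file if f["loaded_at"] <= ts}
    return sum(c["rated_amount"] for c in fact_calls if c["file_key"] in visible)

print(billed_as_of(datetime(2006, 3, 1, 9, 20)))  # 0.9 -- only the first file loaded
print(billed_as_of(datetime(2006, 3, 1, 10, 0)))  # 3.6 -- both files visible
```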

That experience was a turning point. Consuming a business event — call.rated — rather than performing ETL against a source system was simply more intelligent. It was the seed of what eventually became our Event-Driven Architecture platform, where business events flow through Kafka and are consumed by the Lakehouse in real time, eliminating entire categories of ETL complexity. What started as a pragmatic workaround in 2006 is now a core architectural principle.
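
As a minimal sketch of the event-driven pattern, the following consumes a call.rated business event from Kafka and shapes it into a fact row, with no knowledge of the billing system's internal schema. The topic name, broker address, and field names are assumptions made for illustration; it uses the kafka-python client, but any Kafka consumer library would do.

```python
# Hedged sketch: consume call.rated events instead of running ETL against
# the billing system's tables. Topic, broker, and fields are hypothetical.

import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "billing.call.rated",                       # hypothetical topic name
    bootstrap_servers="kafka:9092",             # hypothetical broker address
    group_id="lakehouse-ingest",
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # The event already carries the business facts the warehouse needs.
    fact_row = {
        "call_id": event["call_id"],
        "rated_amount": event["rated_amount"],
        "rated_at": event["rated_at"],
    }
    # Write fact_row to the data platform's landing table here (sink omitted).
    print(fact_row)
```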

If your organization has invested in SOA, ESB, or modern event streaming, reusing that integration layer to feed your data platform is one of the highest-leverage TCO reductions available to you.

Understand Vendor Pricing Models and Negotiate Strategically

BI vendor negotiations reward preparation. Enterprise edition licenses often include additional tools at a marginal cost increase — tools that would otherwise require separate procurement. If you are in a best-of-breed model, factor total cost carefully before dismissing the enterprise edition.

Pay close attention to annual maintenance costs — attractive initial pricing sometimes masks significant increases after year two or three. And look beyond your own project: if another initiative in your organization is negotiating with the same vendor for a different product, there may be leverage to reduce maintenance costs across both deals simultaneously. We have built sophisticated what-if analysis models on vendor proposals to expose the true total cost — a level of rigor that often surprises both procurement teams and vendors alike.
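
A simple version of such a what-if model is sketched below: project the multi-year total cost of two proposals under different maintenance escalation assumptions. All figures are illustrative, not drawn from any real vendor quote.

```python
# Hedged sketch of a what-if model on vendor proposals: the cheaper license
# is not always the cheaper deal once maintenance escalation is projected.

def total_cost(license_fee: float, year1_maintenance_rate: float,
               annual_escalation: float, years: int) -> float:
    """License fee plus maintenance that escalates each year."""
    total = license_fee
    maintenance = license_fee * year1_maintenance_rate
    for _ in range(years):
        total += maintenance
        maintenance *= 1 + annual_escalation
    return total

# Proposal B costs more up front but less over five years.
proposal_a = total_cost(license_fee=400_000, year1_maintenance_rate=0.20,
                        annual_escalation=0.08, years=5)
proposal_b = total_cost(license_fee=420_000, year1_maintenance_rate=0.18,
                        annual_escalation=0.03, years=5)
print(f"Proposal A over 5 years: {proposal_a:,.0f}")
print(f"Proposal B over 5 years: {proposal_b:,.0f}")
```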

Consider Open Source BI Where Appropriate

Open source BI has matured significantly. Multiple organizations have replaced proprietary tools with open source alternatives without sacrificing capability. The right approach is not to abandon existing investments that deliver value — but to include open source options in any replacement evaluation for tools that are underperforming.

A hybrid model combining proprietary and open source tools is entirely viable, provided you establish clear BI governance defining which tools are appropriate for which tasks. Modern open source options worth evaluating include Apache Superset, dbt, and others that have emerged well beyond the Pentaho and Jaspersoft era.

Exploit Application-Generated Transaction Files

Where SOA or event streaming is not yet available, application-generated files remain a practical and underutilized ETL simplification strategy. Many systems produce structured output files as a natural part of their processing. Working with application teams to add a small number of required fields to existing files — rather than reverse-engineering source database schemas — can dramatically reduce ETL complexity, coupling, and maintenance burden.
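
A minimal sketch of that pattern follows, with hypothetical file and column names: validate the agreed file contract and stage only the agreed columns, so the ETL never depends on the source system's internal schema.

```python
# Hedged sketch: consume an application-generated transaction file against an
# agreed column contract rather than reverse-engineering the source database.

import csv
import io

# The contract: columns the application team agreed to include in the file.
EXPECTED_COLUMNS = {"txn_id", "txn_timestamp", "account_id", "amount", "product_code"}

def stage_rows(fileobj) -> list[dict]:
    """Validate the agreed contract, then return rows ready for the staging area."""
    reader = csv.DictReader(fileobj, delimiter="|")
    missing = EXPECTED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"File no longer matches the agreed contract; missing: {missing}")
    return [{col: row[col] for col in EXPECTED_COLUMNS} for row in reader]

# Stand-in for a file the application already produces, with the agreed extra fields.
sample = io.StringIO(
    "txn_id|txn_timestamp|account_id|amount|product_code\n"
    "T100|2006-03-01T09:15:00|A42|120.00|P7\n"
)
print(stage_rows(sample))
```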


The Bridge to Today

The challenges we identified and solved here — spiraling TCO, fragmented datamarts, slow time-to-insight, and the impossibility of funding a full EDW upfront — did not disappear. They evolved. And so did our thinking. The architectural principles developed through this work eventually led us to design and implement what is today called a Lakehouse, before the concept had a name. The real-time Lakehouse at the heart of our GUM-RTDP platform is the direct evolution of the journey that started here.

For a deeper dive into the BIEA and DW Hybrid Modeling concepts, see our course: DW Hybrid: Key to Significantly Lowering Your Enterprise DW TCO. To see how these architectural principles come to life in our current platforms, visit the EDA Platform and the Lakehouse Platform.