Battle of The Giants - Comparing Kimball and Inmon
Data Warehousing
Battle of the Giants:
Comparing the Basics of the
Kimball and Inmon Models
Mary Breslin
Many organizations today need to create data warehouses:
massive data stores of time-series data used for decision support.
These organizations face a range of choices, both in terms of software
tools and development approaches. Making good choices requires
an understanding of the two main data warehousing models:
Inmon's and Kimball's.
Bill Inmon advocates a top-down development approach that adapts
DW MODELS
[Figure labels: Operational, Departmental, Individual]
(data item set) for each department. Again, the sum of the
departmental DISs comprises the corporate DIS. The midlevel data model includes four constructs:
[Figure: the departmental ERDs, ERD(1) through ERD(n), make up the corporate ERD; the corresponding DISs, through DIS(n) for user view (n), make up the corporate DIS.]
[Figure: development steps DSS2 through DSS10, as labeled in the diagram]
DSS2: breadbox analysis
DSS3: technical assessment
DSS4: technical environment preparation
DSS5: subject area
DSS6: data warehouse database design
DSS7: source system analysis
DSS8: specs
DSS9: programming
DSS10: population
Kimball's Model
Kimball's model differs in several important respects from
a traditional relational database approach. One significant
difference is that data warehouses built with the Kimball
model use a data modeling method unique to the data
warehouse. This is discussed in the next section:
"Dimensional Data Modeling."
Another significant difference is that the overall architecture features multiple databases that are expected to be
highly interoperable. The data bus is the main design
feature that makes this possible (further discussion of the
data bus is included in the section "The Data Bus and
Conformed Dimensions").
Date Dimension: Date Key (PK), Date Attributes TBD
Store Dimension: Store Key (PK), Store Attributes TBD
Product Dimension: Product Key (PK), Product Attributes TBD
Promotion Dimension: Promotion Key (PK), Promotion Attributes TBD
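The four dimension tables listed above can be sketched as a small star schema. The following is a minimal illustration using Python's built-in sqlite3 module, not a definitive design: the sales_fact table, its sales_amount measure, and the single description column standing in for each "Attributes TBD" entry are assumptions added for the example and do not come from the figure.

```python
import sqlite3

# Dimension names taken from the figure; everything else is illustrative.
DIMENSIONS = ["date", "store", "product", "promotion"]


def create_star_schema(conn: sqlite3.Connection) -> None:
    """Create one table per dimension plus a hypothetical fact table."""
    for name in DIMENSIONS:
        conn.execute(
            f"CREATE TABLE {name}_dimension ("
            f"{name}_key INTEGER PRIMARY KEY, "
            "description TEXT)"  # placeholder for the attributes still TBD
        )
    # Assumed fact table: one foreign key per dimension plus one measure.
    conn.execute(
        """
        CREATE TABLE sales_fact (
            date_key      INTEGER REFERENCES date_dimension(date_key),
            store_key     INTEGER REFERENCES store_dimension(store_key),
            product_key   INTEGER REFERENCES product_dimension(product_key),
            promotion_key INTEGER REFERENCES promotion_dimension(promotion_key),
            sales_amount  REAL
        )
        """
    )


def sales_by_store(conn: sqlite3.Connection) -> list:
    """Join the fact table to one dimension and total the measure."""
    return conn.execute(
        """
        SELECT s.description, SUM(f.sales_amount)
        FROM sales_fact AS f
        JOIN store_dimension AS s ON s.store_key = f.store_key
        GROUP BY s.description
        ORDER BY s.description
        """
    ).fetchall()
```

A query in this style needs only one join per dimension it uses, which is the kind of direct end-user querying the dimensional approach aims for.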
Protect information
Date Dimension
Date Key (PK)
Date
Day of Week
Day Number in Epoch
Month Number in Epoch
Day Number in Calendar Month
Day Number in Calendar Year
Day Number in Fiscal Month
Day Number in Fiscal Year
Last Day in Week Indicator
(and many more)
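Most of the date attributes listed above can be derived from the calendar date alone. The sketch below shows one way to generate such rows in Python. Several choices are assumptions made for illustration: the epoch start date (EPOCH), the use of the day number as the surrogate key, and the convention that a week ends on Sunday. The fiscal attributes are omitted because they depend on an organization's fiscal calendar, which the list does not define.

```python
from datetime import date, timedelta

EPOCH = date(2000, 1, 1)  # assumed epoch start; the attribute list does not define one
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def date_dimension_row(d: date) -> dict:
    """Derive the calendar-based attributes for a single date."""
    day_in_epoch = (d - EPOCH).days + 1
    return {
        "date_key": day_in_epoch,  # assumption: surrogate key equals the day number
        "date": d.isoformat(),
        "day_of_week": DAY_NAMES[d.weekday()],
        "day_number_in_epoch": day_in_epoch,
        "month_number_in_epoch":
            (d.year - EPOCH.year) * 12 + (d.month - EPOCH.month) + 1,
        "day_number_in_calendar_month": d.day,
        "day_number_in_calendar_year": d.timetuple().tm_yday,
        # assumption: the week ends on Sunday (ISO weekday 7)
        "last_day_in_week_indicator": d.isoweekday() == 7,
    }


def build_date_dimension(start: date, end: date) -> list:
    """Build one row per day from start to end, inclusive."""
    days = (end - start).days + 1
    return [date_dimension_row(start + timedelta(days=i)) for i in range(days)]
```

Because every row is computed once and stored, end users can filter on attributes such as day of week without writing date arithmetic in their queries.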
the value of a data item, such as when two different operational systems measure a product or process differently
and therefore assign a different value to the same data
item. In general, the purpose of the transform processes is
to ensure data integrity within the data warehouse. There
are several methods used to transform data, including
field mapping and algorithmic comparisons.
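As a rough illustration of the transform step, the sketch below applies field mapping to records from two operational systems and then converts a measure to a common unit so that the same data item carries the same value. The system names (orders_sys, shipping_sys), the field names, and the choice of weight as the measure are all hypothetical, and the unit conversion stands in for whatever algorithmic reconciliation a real project would need.

```python
# Hypothetical field maps: source field name -> warehouse field name.
FIELD_MAPS = {
    "orders_sys": {"prod_id": "product_id", "wt_lb": "weight"},
    "shipping_sys": {"sku": "product_id", "weight_kg": "weight"},
}

# Hypothetical conversion factors that turn each system's weight into kilograms.
WEIGHT_TO_KG = {"orders_sys": 0.45359237, "shipping_sys": 1.0}


def transform(record: dict, source: str) -> dict:
    """Map source fields to warehouse fields and normalize the weight."""
    mapping = FIELD_MAPS[source]
    # Field mapping: keep only the known fields and rename them.
    out = {mapping[name]: value for name, value in record.items() if name in mapping}
    # Reconciliation: express the weight in one unit so values are comparable.
    out["weight_kg"] = round(out.pop("weight") * WEIGHT_TO_KG[source], 3)
    return out
```

With these assumed mappings, a 10-pound record from the first system and a 4.536-kilogram record from the second produce identical warehouse rows for the same product.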
Load: The final step in ETL is loading the data into
either the atomic data warehouse (in Inmon's model) or
into data marts (in Kimball's model). The load process in
either case involves placing the data physically. The main
concern in this process is appending the newly extracted
and transformed data onto the data already in the data
warehouse. Various ETL routines run at this point help
ensure data integrity and guard against data redundancy.
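One simple way to append new rows while guarding against redundancy is to let the database reject duplicates. The sketch below uses Python's built-in sqlite3 module and SQLite's INSERT OR IGNORE. The sales_fact table, its columns, and the choice of a composite primary key as the duplicate test are assumptions for illustration; production ETL tools typically do considerably more.

```python
import sqlite3


def load(conn: sqlite3.Connection, rows: list) -> int:
    """Append rows to a hypothetical fact table, skipping duplicates.

    Each row is (date_key, store_key, product_key, sales_amount).
    Returns the number of rows actually added.
    """
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS sales_fact (
            date_key     INTEGER NOT NULL,
            store_key    INTEGER NOT NULL,
            product_key  INTEGER NOT NULL,
            sales_amount REAL NOT NULL,
            PRIMARY KEY (date_key, store_key, product_key)
        )
        """
    )
    before = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
    # Rows whose key already exists are silently skipped, not duplicated.
    conn.executemany("INSERT OR IGNORE INTO sales_fact VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
    return after - before
```

Running the same extract twice then leaves the table unchanged the second time, so a repeated load does not inflate query results.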
ETL is essential to the viability of the data warehouse in
that it attempts to ensure data integrity within the data
warehouse. Obviously, if two user queries that are essentially the same return two different results, the credibility
of the data warehouse is damaged in the eyes of the users.
Because operational systems are seldom (if ever) designed
to produce results compatible with one another, making
the output of these systems consistent is generally a
Herculean effort. Not surprisingly, ETL is frequently considered the most labor-intensive data warehouse activity,
surpassing even decision support analysis activities!

Table 1. Comparison of Essential Features of Inmon's and Kimball's Models

| Feature | Inmon | Kimball |
|---|---|---|
| Methodology and architecture | | |
| Overall approach | Top-down | Bottom-up |
| Architectural structure | Enterprisewide (atomic) data warehouse "feeds" departmental databases | Data marts model a single business process; enterprise consistency achieved through data bus and conformed dimensions |
| Complexity of the method | Quite complex | Fairly simple |
| Comparison with established development methodologies | Derived from the spiral methodology | Four-step process; a departure from RDBMS methods |
| Discussion of physical design | Fairly thorough | Fairly light |
| Data modeling | | |
| Data orientation | Subject- or data-driven | Process oriented |
| Tools | Traditional (ERDs, DISs) | Dimensional modeling; a departure from relational modeling |
| End-user accessibility | Low | High |
| Philosophy | | |
| Primary audience | IT professionals | End users |
| Place in the organization | Integral part of the Corporate Information Factory (CIF) | Transformer and retainer of operational data |
| Objective | Deliver a sound technical solution based on proven database methods and technologies | Deliver a solution that makes it easy for end users to directly query the data and still get reasonable response times |
Differences
The differences between Inmon's and Kimball's approaches
are many and deep. It is interesting to note that the two features that create similarities between the two models, time-stamped data and ETL, are required to make decision
support systems viable. In other words, the two models are
similar only in the areas in which, arguably, they have to be
similar. In all other areas, their differences are profound.
Differences in Development
Methodologies and Architectures
In order to have an atomic data warehouse, as in Inmon's
model, some degree of top-down development must be
present. The atomic data warehouse must serve the entire
enterprise, and all departmental databases obtain their
data through the atomic data warehouse. Top-down development efforts have a certain unavoidable degree of complexity, and Inmon's methodology is no exception,
although his clear presentation helps it seem less complex.
Overall, Inmon's methodology and architectural orientation
is a technical one. His primary interest is ensuring that the
technical solution works. Oversimplified, the objective of
this technical solution is to optimize I/Os. Inmon's audience
clearly consists of IT professionals. Few business readers
have the background to understand Inmon's development
approach, both because of its emphasis on technical aspects and
because they are unlikely to be familiar with the spiral development approach
on which it is based. His emphasis on the technical aspects
of the development implies that the IT department
members of the data warehousing team will feel the greatest
degree of ownership of the data warehouse as they, not the
end users, will understand the development methodology.
Philosophical Differences
By now it is clear that Inmon views IT as the primary
developer and provider of the data warehouse. Inmon
believes that the performance of the completed data ware-
[Table: columns are Characteristic, Favors Inmon, and Favors Kimball. Only fragments of the cells survive: Strategic; Enterprisewide integration; Structure of data; Scalability; Persistency of data; Time to delivery; Cost to deploy.]
To at least partially summarize the data in Table 1, an organization is more likely to succeed using Inmon's approach if it
has a large team of data warehouse specialists, plans a large
project with enterprisewide access needs, stores data that is
not primarily business metrics, and can wait to see results
over a longer time frame, from four to nine months (Inmon,
2000). These characteristics and data requirements fit well
with Inmon's recommendation to first build a considerable
infrastructure on a solid enterprisewide data model.
On the other hand, an organization with different characteristics may be better off with a Kimball-based approach.
According to one expert, "A typical requirement is to
develop an operational data mart for a specific business
area in 90 days, and develop subsequent data marts in 60
to 90 days each" (Mimno, 2002). Kimball's approach is
generally recognized as faster than Inmon's, at least for the
delivery of the first data mart (versus the first departmental
database using Inmon's approach). Kimball's approach is
also indicated if the organization is better able to field
smaller teams of generalists for data warehouse project
development, and expects to store mostly business metrics.
An organization with these characteristics and requirements is more likely to succeed with a data mart architecture developed using the dimensional modeling approach.
It is important to realize that choosing an approach to data
warehousing is not as simple as the two preceding paragraphs imply. However, as long as the reader understands
that these guidelines represent a gross oversimplification of
the process, they may be useful as a starting point for discussing the data warehousing needs and characteristics
unique to a given organization.
Finally, research shows that having the right set of soft
skills is at least as important as
technical skills and knowledge.
Interestingly, the keys to success are not technical in nature.
Projects don't succeed because they use an innovative design
or radical new technology. They succeed because of the
"soft stuff": leadership, communication, planning, and
interpersonal relationships (Eckerson, 2003).
When building a data warehouse, whether using Inmon's or
Kimball's approach, it is critical that the data warehouse
team employ soft skills liberally and effectively. This involves
ensuring that the organization has a well-articulated vision
of the data warehouse's role and usage, and allocates sufficient resources to create and maintain the data warehouse
(Eckerson, 2003). These are not typically responsibilities
that an IT project development team must shoulder, yet
they are critical to the success of a data warehouse project.
Summary
Data warehouses require storage and access of massive
amounts of time-stamped data for decision support. Since
the building of data warehouses was first attempted in the
early 1990s, two models have emerged as dominant:
Inmon's and Kimball's.
Inmon's approach stresses top-down development using
proven database development methodologies and tools,
such as ERDs, DISs, and a modification of the spiral
development approach. Inmon's tools and methods are
adaptations of traditional tools and methods for operational database development. Inmon sees the data warehouse as a part of a much larger information environment,
which he calls the Corporate Information Factory (CIF).
To ensure that the data warehouse fits well in this larger
environment, he advocates the construction of both an
REFERENCES
Albert, G. "The Importance of Data Warehousing" (May
03, 2000), BusinessLine, Internet edition, division of The
Hindu Business Line. Retrieved August 12, 2003, from
http://www.blonnet.com/businessline/2000/05/03/stories/150339m6.htm.
Eckerson, W. "Smart Companies in the 21st Century:
The Secrets of Creating Business Intelligence Solutions"
(April 2003), TDWI Web site. Retrieved August 11,
2003, from http://www.dw-institute.com/research/.
Inmon, W.H. Building the Data Warehouse (Third
Edition), New York: John Wiley & Sons (2002).
Inmon, W.H. and C. Imhoff. "Corporate Information
Factory Components" (2002), Inmon Associates Inc. Web
site. Retrieved September 9, 2003, from http://www.billinmon.com/library/articles/artcifco.asp.
Inmon, W.H. "Accelerating the Development of the
Enterprise Data Warehouse" (2000), Inmon Associates
Inc. Web site. Retrieved September 10, 2003, from
http://www.billinmon.com/library/presents/present.asp.