Warehousing Des-WPS Office
4. 1975 – Sperry Univac introduces MAPPER (Maintain, Prepare, and Produce Executive Reports), a database management and reporting system that includes the world's first 4GL. It is the first platform designed for building Information Centers (a
14. 2012 – Bill Inmon develops and makes public technology known as "textual disambiguation". Textual disambiguation applies context to raw text and reformats the raw text and context into a standard database format.
What Is a Data Warehouse?
• A data warehouse is a database designed to enable business intelligence activities: it exists to help users understand and enhance their organization's performance.
• It is designed for query and analysis rather than for transaction processing, and usually contains historical data derived from transaction data, but can include data from other sources.
• Data warehouses separate analysis workload from transaction workload and enable an organization to consolidate data from several sources.
This helps in:
1. Maintaining historical records.
2. Analyzing the data to gain a better understanding of the business and to improve the business.
• In addition to a relational database, a data warehouse environment can include an extraction, transportation, transformation, and loading (ETL) solution, statistical analysis, reporting, data mining capabilities, client analysis tools, and other applications that manage the process of gathering data, transforming it into useful, actionable information, and delivering it to business users.
• To achieve the goal of enhanced business intelligence, the data warehouse works with data collected from multiple sources. The source data may come from internally developed systems, purchased applications, third-party data syndicators and other sources. It may involve transactions, production, marketing, human resources and more. In today's world of big data, the data may be many billions of individual clicks on web sites or the massive data streams from sensors built into complex machinery.
• Data warehouses are distinct from online transaction processing (OLTP) systems. With a data warehouse you separate analysis workload from transaction workload. Thus data warehouses are very much read-oriented systems. They have a far higher amount of data reading versus writing and updating. This enables far better analytical performance and avoids impacting your transaction systems. A data warehouse system can be optimized to consolidate data from many sources to achieve a key goal: it becomes your organization's "single source of truth". There is great value in having a consistent source of data that all users can look to; it prevents many disputes and enhances decision-making efficiency.
• A data warehouse usually stores many months or years of data to support historical analysis. The data in a data warehouse is typically loaded through an extraction, transformation, and loading (ETL) process from multiple data sources.
• Modern data warehouses are moving toward an extract, load, transformation (ELT) architecture in which all or most data transformation is performed on the database that hosts the data warehouse. It is important to note that defining the ETL process is a very large part of the design effort of a data warehouse. Similarly, the speed and reliability of ETL operations are the foundation of the data warehouse once it is up and running.
• Users of the data warehouse perform data analyses that are often time-related. Examples include consolidation of last year's sales figures, inventory analysis, and profit by product and by customer. But time-focused or not, users want to "slice and dice" their data however they see fit, and a well-designed data warehouse will be flexible enough to meet those demands. Users will sometimes need highly aggregated data, and other times they will need to drill down to details. More sophisticated analyses include trend analyses and data mining, which use existing data to forecast trends or predict futures. The data warehouse acts as the underlying engine used by middleware business intelligence environments that serve reports, dashboards and other interfaces to end users.
• Although the discussion above has focused on the term "data warehouse", there are two other important terms that need to be mentioned. These are the data mart and the operational data store (ODS).
• A data mart serves the same role as a data warehouse, but it is intentionally limited in scope. It may serve one particular department or line of business. The advantage of a data mart versus a data warehouse is that it can be created much faster due to its limited coverage. However, data marts also create problems with inconsistency. It takes tight discipline to keep data and calculation definitions consistent across data marts. This problem has been widely recognized, so data marts exist in two styles. Independent data marts are those which are fed directly from source data. They can turn into islands of inconsistent information. Dependent data marts are fed from an existing data warehouse. Dependent data marts can avoid
the problems of inconsistency, but they require that an enterprise-level data warehouse already exist.
• Operational data stores exist to support daily operations. The ODS data is cleaned and validated, but it is not historically deep: it may be just the data for the current day. Rather than support the historically rich queries that a data warehouse can handle, the ODS gives data warehouses a place to get access to the most current data, which has not yet been loaded into the data warehouse.
• The ODS may also be used as a source to load the data warehouse. As data warehousing loading techniques have become more advanced, data warehouses may have less need for ODS as a source for loading data. Instead, constant trickle-feed systems can load the data warehouse in near real time.
A common way of introducing data warehousing is to refer to the characteristics of a data warehouse as set forth by William Inmon:
1. Subject Oriented
2. Integrated
3. Nonvolatile
4. Time Variant

Characteristics of a data warehouse:
1. Subject-Oriented
A data warehouse can be used to analyze a particular subject area. For example, “sales” can be a particular subject.
2. Integrated
A data warehouse integrates data from multiple data sources. For example, source A and source B may have different ways of identifying a product, but in a data warehouse, there will be only a single way of identifying a product.
3. Time-Variant
Historical data is kept in a data warehouse. For example, one can retrieve data from 3 months, 6 months, 12 months, or even older data from a data warehouse. This contrasts with a transaction system, where often only the most recent data is kept.
4. Non-volatile
Once data is in the data warehouse, it will not change. So, historical data in a data warehouse should never be altered.

Data Warehouse Functions:
1. Staging - Staging is used to store raw data for use by developers.
2. Integration - The integration layer is used to integrate data and to have a level of abstraction from users.
3. Access - The access layer is for getting data out for users.
A data mart stores a subset of the data from a warehouse, focused on a specific aspect of a company such as sales or a marketing process. This definition of the data warehouse focuses on data storage. The main source of the data is cleaned, transformed, cataloged, and made available for use by managers and other business professionals for data mining, online analytical processing, market research and decision support.

TOPIC 2: Business Process Modeling
Content Discussion:
• Business process modeling (BPM) - In business process management and systems engineering, BPM is the activity of representing the processes of an enterprise so that the current process may be analyzed, improved, and automated.
• BPM is typically performed by business analysts, who provide expertise in the modeling discipline; by subject matter experts, who have specialized knowledge of the processes being modeled; or more commonly by a team comprising both. Alternatively, the process model can be derived directly from event logs using process mining tools.
• Business process - A business process is a collection of related, structured activities or tasks that produce a specific service or product (serve a particular goal) for a particular customer or customers.
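As a rough illustration of how a process model might be derived from an event log, the sketch below counts which activity directly follows which within each case. This "directly-follows" count is only one simple building block of process mining, not a full mining tool, and the event log, case ids, and activity names are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (case id, activity), assumed to be in time order.
event_log = [
    ("order-1", "receive order"),
    ("order-1", "check stock"),
    ("order-1", "ship goods"),
    ("order-2", "receive order"),
    ("order-2", "check stock"),
    ("order-2", "reorder stock"),
    ("order-2", "ship goods"),
]

def directly_follows(log):
    """Count how often activity b directly follows activity a in a case."""
    # Group the activities of each case into one ordered trace.
    traces = defaultdict(list)
    for case_id, activity in log:
        traces[case_id].append(activity)
    # Count consecutive activity pairs over all traces.
    counts = Counter()
    for activities in traces.values():
        counts.update(zip(activities, activities[1:]))
    return counts

dfg = directly_follows(event_log)
for (first, second), n in sorted(dfg.items()):
    print(f"{first} -> {second}: {n}")
```

Each counted pair can be read as an arrow in a draft process diagram, which an analyst would still need to review with subject matter experts.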
There are three main types of business processes:
1. Management processes, that govern the operation of a system. Typical management processes include corporate governance and strategic management.
2. Operational processes, that constitute the core business and create the primary value stream. Typical operational processes are purchasing, manufacturing, marketing, and sales.
3. Supporting processes, that support the core processes. Examples include accounting, recruitment, and technical support.
Some business process modeling techniques are:
1. Business process modeling notation (BPMN)
2. UML diagrams
3. Flowchart technique
4. Data flow diagrams
5. Role activity diagrams
6. Role interaction diagrams
7. Gantt charts

2. UML Diagrams
• UML is a modeling language mainly used for the specification, visualization, development and documentation of software systems. But business professionals have adapted it as a powerful business process modeling technique.
3. Flowchart Technique
• Flowcharts are probably the most popular diagram type in the world. Because the flowchart has few standard symbols, it can be easily understood by many. Also, because most drawing software supports the creation of flowcharts, it is used by a much wider audience.
4. Data Flow Diagrams – Yourdon’s Technique
• Data flow diagrams (DFD) show the flow of data or information from one place to another. DFDs describe the processes, showing how these processes link together through data stores and how the processes relate to the users and the outside world. They are used to record the processes analyzed as a part of the design documentation.
• A DFD can be seen as a method of organizing data from its raw state. DFDs are the backbone of structured analysis that was developed in the early sixties by Yourdon.
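The structure of a DFD described above can be sketched as a small graph: nodes are external entities, processes, or data stores, and each flow carries a named piece of data between two nodes. This is a minimal sketch, not a diagramming tool; the order-handling example and every name in it are hypothetical. The check applies the common DFD convention that every flow should touch a process at one end or the other.

```python
from dataclasses import dataclass

# Node kinds used in a data flow diagram.
EXTERNAL, PROCESS, STORE = "external", "process", "store"

@dataclass(frozen=True)
class Flow:
    source: str  # node the data leaves
    target: str  # node the data arrives at
    data: str    # label describing the data that moves

# Hypothetical order-handling system; all names are illustrative only.
nodes = {
    "Customer": EXTERNAL,
    "Take order": PROCESS,
    "Orders": STORE,
    "Ship order": PROCESS,
}

flows = [
    Flow("Customer", "Take order", "order details"),
    Flow("Take order", "Orders", "new order"),
    Flow("Orders", "Ship order", "pending order"),
    Flow("Ship order", "Customer", "shipment notice"),
]

def invalid_flows(nodes, flows):
    """Return flows that have no process at either end."""
    return [f for f in flows
            if PROCESS not in (nodes[f.source], nodes[f.target])]

def inputs_of(name, flows):
    """Data labels flowing into the named node."""
    return [f.data for f in flows if f.target == name]

def outputs_of(name, flows):
    """Data labels flowing out of the named node."""
    return [f.data for f in flows if f.source == name]

print(invalid_flows(nodes, flows))     # []
print(inputs_of("Ship order", flows))  # ['pending order']
```

Listing the inputs and outputs of each process in this way mirrors what the diagram records: how processes link together through data stores and how they relate to the outside world.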