6 1 DWM 2019 S

The document discusses various aspects of data warehousing, including the differences between fact and dimension tables, OLTP vs OLAP, and the design of data marts using star schema. It also covers the advantages and disadvantages of top-down and bottom-up approaches to data mart creation, as well as the concept of hypercubes for representing multidimensional data. Additionally, it outlines OLAP operations such as roll-up, drill-down, slice, dice, and pivot in the context of a data warehouse with dimensions and measures.

(Attempt any five of the following)

a) Analyze how fact tables are deep and dimension tables are wide.
Ø Typically, a fact table contains fewer attributes than a dimension table, usually
about 10 or fewer. Consider an example with 3 products, 5 customers, 30 days,
and 10 sales representatives, each represented as rows in its dimension table.
Even in this small example, the fact table can hold up to 3 × 5 × 30 × 10 = 4500
rows, very large in comparison with the dimension table rows. The fact table is
therefore narrow, with a small number of columns, but very deep, with a large
number of rows.
Ø A dimension table has many columns or attributes. It is not uncommon for some
dimension tables to have more than 50 attributes. Therefore, we say that the
dimension table is wide. If you lay it out as a table with columns and rows, the
table is spread out horizontally.
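
The 4500 figure in the example above can be checked with a short calculation. This is only a sketch: the dictionary keys are illustrative names, and the product is an upper bound that assumes every combination of dimension keys produces a fact row.

```python
from math import prod

# Dimension table row counts taken from the example in the text.
dimension_rows = {"product": 3, "customer": 5, "day": 30, "sales_rep": 10}

# Upper bound on fact rows: one row per combination of dimension keys.
max_fact_rows = prod(dimension_rows.values())

# Total rows across all four dimension tables, for comparison.
total_dimension_rows = sum(dimension_rows.values())

print(max_fact_rows)         # 4500
print(total_dimension_rows)  # 48
```

The fact table depth grows with the product of the dimension sizes, while the dimension tables together hold only their sum.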
b)​ ​Differentiate between OLTP and OLAP

c) Build an information package for Railway Reservation System. Assume your own
functionality.
d) When is a full data refresh preferable to an incremental load? Justify with a proper
diagram.

Ø The cost of refresh remains constant irrespective of the number of changes in the
source systems. If the number of changes increases, the time and effort for doing a
full refresh remain the same.
Ø On the other hand, the cost of update varies with the number of records to be
updated.
Ø If the number of records to be updated falls between 15% and 25% of the total
number of records, the cost of loading per record tends to be the same whether
you opt for a full refresh of the entire data warehouse or apply the updates.
Ø This range is just a general guide. If more than 25% of the source records change
daily, then seriously consider full refreshes.
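
The trade-off described above can be sketched as a simple cost comparison. The per-record costs below are assumptions chosen only so that the break-even point falls at 20% of the records, inside the 15% to 25% range quoted above; real costs depend on the source systems and load tools.

```python
def cheaper_strategy(total_records, changed_records,
                     load_cost=1, update_cost=5):
    """Compare a full refresh with an incremental load.

    load_cost and update_cost are hypothetical per-record costs.
    A full refresh reloads every record, so its cost does not depend
    on how many records changed. An incremental load pays the (higher)
    update cost only for the changed records.
    """
    full_refresh_cost = total_records * load_cost
    incremental_cost = changed_records * update_cost
    if full_refresh_cost < incremental_cost:
        return "full refresh"
    return "incremental load"


print(cheaper_strategy(1000, 100))  # 10% changed
print(cheaper_strategy(1000, 300))  # 30% changed
```

With these assumed costs, 10% of records changing favors the incremental load, and 30% favors the full refresh.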
e)​ ​List any five major transformation types with an example each.
1. Format Revisions
2. Decoding of Fields
3. Calculated and Derived Values
4. Splitting of Single Fields
5. Merging of Information
6. Character Set Conversion
7. Conversion of Units of Measurement
8. Date/Time Conversion
9. Summarization
10. Key Restructuring
11. Deduplication
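
Since the question asks for an example of each type, here is a minimal sketch of five of the transformations listed above applied to one source record. The record, its field names, and the code table are all hypothetical and only illustrate the idea.

```python
from datetime import datetime

# Hypothetical source record from an operational system.
record = {
    "cust_name": "Asha Rao",
    "gender": "F",
    "order_date": "25/12/2019",
    "quantity": 4,
    "unit_price": 250,
    "weight_lb": 10.0,
}

# Hypothetical code table used for decoding.
GENDER_CODES = {"M": "Male", "F": "Female"}


def transform(rec):
    # Splitting of single fields: full name into first and last name.
    first, last = rec["cust_name"].split(" ", 1)
    return {
        "first_name": first,
        "last_name": last,
        # Decoding of fields: cryptic code to a readable value.
        "gender": GENDER_CODES[rec["gender"]],
        # Date/time conversion: DD/MM/YYYY to ISO format.
        "order_date": datetime.strptime(
            rec["order_date"], "%d/%m/%Y").date().isoformat(),
        # Calculated and derived values: sale amount.
        "sale_amount": rec["quantity"] * rec["unit_price"],
        # Conversion of units of measurement: pounds to kilograms.
        "weight_kg": round(rec["weight_lb"] * 0.45359237, 2),
    }


out = transform(record)
print(out)
```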

f)​ ​Explain degenerate dimensions with an example​.


Ø When you pick up attributes for the dimension tables and the fact tables from operational
systems, you will be left with some data elements in the operational systems that are neither
facts nor strictly dimension attributes.
Ø Examples of such attributes are reference numbers like order numbers, invoice numbers,
order line numbers, and so on.
Ø These attributes are useful in some types of analyses. For example, you may be looking for
average number of products per order.
Ø Then you will have to relate the products to the order number to calculate the average.
Ø Attributes such as order_number and order_line in the example are called degenerate
dimensions. They have no dimension table of their own and are kept as attributes of
the fact table.
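
The "average number of products per order" analysis mentioned above can be sketched as follows. The fact rows are made-up sample data; only the attribute names order_number and order_line come from the text, while product_key and quantity are assumed column names.

```python
from collections import defaultdict

# Hypothetical fact table rows. order_number and order_line are
# degenerate dimensions stored directly in the fact table.
fact_rows = [
    {"order_number": "A100", "order_line": 1, "product_key": 11, "quantity": 2},
    {"order_number": "A100", "order_line": 2, "product_key": 12, "quantity": 1},
    {"order_number": "A100", "order_line": 3, "product_key": 13, "quantity": 5},
    {"order_number": "A101", "order_line": 1, "product_key": 11, "quantity": 4},
]

# Relate products to the order number they belong to.
products_per_order = defaultdict(set)
for row in fact_rows:
    products_per_order[row["order_number"]].add(row["product_key"])

# Average number of distinct products per order.
average_products = (
    sum(len(products) for products in products_per_order.values())
    / len(products_per_order)
)
print(average_products)  # 2.0
```

Without order_number in the fact table, the rows could not be grouped back into orders, which is why the attribute is kept even though it has no dimension table.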

Q2 a) Design the data mart using star schema for wholesale furniture company. The data
mart has to allow analyzing the company’s situation at least with respect to the Furniture,
Customer and Time. Moreover, the company needs to analyse: the furniture with respect
to its type, category and material; and the customers with respect to their spatial location,
by considering at least cities, regions and states. The company is interested in learning the
quantity, income and discount of its sales.
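
One possible star schema for this question is sketched below using SQLite from Python. It is an assumption-laden outline, not a model answer: the table names, key columns, and the month, quarter and year attributes of the time dimension are all invented for illustration. Only the furniture attributes (type, category, material), the customer location levels (city, region, state) and the measures (quantity, income, discount) come from the question.

```python
import sqlite3

# Hypothetical star schema: three dimension tables around one fact table.
DDL = """
CREATE TABLE dim_furniture (
    furniture_key  INTEGER PRIMARY KEY,
    furniture_name TEXT,
    type           TEXT,
    category       TEXT,
    material       TEXT
);
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY,
    customer_name TEXT,
    city          TEXT,
    region        TEXT,
    state         TEXT
);
CREATE TABLE dim_time (
    time_key  INTEGER PRIMARY KEY,
    full_date TEXT,
    month     INTEGER,
    quarter   INTEGER,
    year      INTEGER
);
CREATE TABLE fact_sales (
    furniture_key INTEGER REFERENCES dim_furniture(furniture_key),
    customer_key  INTEGER REFERENCES dim_customer(customer_key),
    time_key      INTEGER REFERENCES dim_time(time_key),
    quantity      INTEGER,
    income        REAL,
    discount      REAL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

# List the tables that were created.
tables = sorted(
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
)
print(tables)
```

The fact table carries only foreign keys and the three measures, while each dimension table carries the descriptive attributes used for grouping.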
Q2 b) Explain the merits and demerits of data mart using the top down and bottom up
approach
Here are the two different basic approaches: (1) overall data warehouse feeding dependent data
marts, and (2) several departmental or local data marts combining into a data warehouse.
Top-Down Approach:
This is the big-picture approach in which you build the overall, big, enterprise-wide data
warehouse. The data warehouse is large and integrated. It would take longer to build and has a
high risk of failure. If you do not have experienced professionals on your team, this approach
could be hazardous. Also, it will be difficult to sell this approach to senior management and
sponsors. They are not likely to see results soon enough.
The advantages of this approach are:
· A truly corporate effort, an enterprise view of data
· Inherently architected, not a union of disparate data marts
· Single, central storage of data about the content
· Centralized rules and control
· May see quick results if implemented with iterations
The disadvantages are:
· Takes longer to build even with an iterative method
· High exposure to risk of failure
· Needs high level of cross-functional skills
· High outlay without proof of concept.
Bottom-Up Approach:
In this approach, data marts are created first to provide analytical and reporting capabilities
for specific business subjects based on the dimensional data model. You build your departmental
data marts one by one, setting a priority scheme to determine which data marts you must build
first. Data marts contain data at the lowest level of granularity and also as summaries,
depending on the needs for analysis. The key consideration is the conforming of the dimensions
among the separate data marts. The most severe drawback of this approach is data fragmentation.
The advantages of this approach are:
· Faster and easier implementation of manageable pieces
· Favorable return on investment and proof of concept
· Less risk of failure
· Inherently incremental; can schedule important data marts first
· Allows project team to learn and grow
The disadvantages are:
· Each data mart has its own narrow view of data
· Permeates redundant data in every data mart
· Perpetuates inconsistent and irreconcilable data
· Proliferates unmanageable interfaces

Q3 a) Explain hypercubes. Give an example to demonstrate how they are used to
display six-dimensional data on the screen.
• The metaphor of a physical cube to represent data breaks down when you try to represent
four or more dimensions.
• A hypercube is a general metaphor for representing multidimensional data.
Figure: six-dimensional data displayed on the screen.
Notice how product and metrics are combined and represented as columns, store and time are
combined as rows, and demographics and promotion as pages.
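
The page, row and column layout described above can be sketched in a few lines. The six dimension names come from the text; the member values and the cell_address helper are hypothetical and chosen only to keep the example small.

```python
from itertools import product

# Six dimensions with hypothetical members (two each, for brevity).
dims = {
    "product": ["Hats", "Coats"],
    "metrics": ["Sales", "Cost"],
    "store": ["New York", "Boston"],
    "time": ["Jan", "Feb"],
    "demographics": ["Urban", "Rural"],
    "promotion": ["Coupon", "No promo"],
}

# Pair the dimensions up so six dimensions fit three display axes.
columns = list(product(dims["product"], dims["metrics"]))
rows = list(product(dims["store"], dims["time"]))
pages = list(product(dims["demographics"], dims["promotion"]))


def cell_address(page, row, column):
    """Map a screen position back to a six-dimensional coordinate."""
    return {
        "demographics": page[0], "promotion": page[1],
        "store": row[0], "time": row[1],
        "product": column[0], "metrics": column[1],
    }


print(len(pages), len(rows), len(columns))  # 4 4 4
print(cell_address(pages[0], rows[1], columns[2]))
```

Each page is a flat grid, and each cell on it still identifies exactly one value for all six dimensions.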
Q3 b) Suppose that a data warehouse consists of the three dimensions time, doctor and
patient, and the two measures count and charge, where charge is the fee that a doctor
charges a patient for a visit.
Perform the following OLAP operations for the above problem statement: 1) Roll up
2) Drill down 3) Slice 4) Dice 5) Pivot
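
The five operations can be illustrated on a tiny version of this cube. Everything below is an assumption for demonstration: the visit records are invented, the time hierarchy is taken to be day, month, year, and the aggregate helper is a hypothetical stand-in for what an OLAP server would do.

```python
from collections import defaultdict

# Hypothetical visit records: one row per visit.
FIELDS = ("day", "month", "year", "doctor", "patient", "charge")
visits = [
    ("2019-01-05", "2019-01", 2019, "Dr. A", "P1", 100),
    ("2019-01-20", "2019-01", 2019, "Dr. A", "P2", 150),
    ("2019-02-03", "2019-02", 2019, "Dr. B", "P1", 200),
    ("2020-01-10", "2020-01", 2020, "Dr. B", "P3", 250),
]
rows = [dict(zip(FIELDS, v)) for v in visits]


def aggregate(rows, group_by):
    """Return {group key tuple: (count, total charge)}."""
    cube = defaultdict(lambda: [0, 0])
    for r in rows:
        key = tuple(r[d] for d in group_by)
        cube[key][0] += 1
        cube[key][1] += r["charge"]
    return {k: tuple(v) for k, v in cube.items()}


# 1) Roll up: climb the time hierarchy from month to year.
roll_up = aggregate(rows, ["year", "doctor"])

# 2) Drill down: descend the time hierarchy from year to month.
drill_down = aggregate(rows, ["month", "doctor"])

# 3) Slice: fix one dimension (year = 2019), keep the other two.
slice_2019 = aggregate(
    [r for r in rows if r["year"] == 2019], ["doctor", "patient"])

# 4) Dice: select on two dimensions (year = 2019 and doctor = Dr. A).
dice = aggregate(
    [r for r in rows if r["year"] == 2019 and r["doctor"] == "Dr. A"],
    ["patient"])

# 5) Pivot: swap the axes of the (year, doctor) view.
pivot = {(doctor, year): v for (year, doctor), v in roll_up.items()}

print(roll_up)
print(pivot)
```

Both measures are carried through every operation: each result value is a (count, charge) pair for the selected cells.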
