DWH Assignment
Department: BCS-VII
Q: What is OLAP?
On-Line Analytical Processing (OLAP) is a category of software technology that enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user. OLAP functionality is characterized by dynamic, multi-dimensional analysis of consolidated enterprise data supporting end user analytical and navigational activities including:
calculations and modeling applied across dimensions, through hierarchies and/or across members
trend analysis over sequential time periods
slicing subsets for on-screen viewing
drill-down to deeper levels of consolidation
reach-through to underlying detail data
rotation to new dimensional comparisons in the viewing area
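Several of the activities listed above (slicing, drill-down and rotation) can be illustrated with a minimal Python sketch. The fact rows, the dimension names and the helper functions `slice_cube`, `roll_up` and `pivot` are all hypothetical and chosen only for illustration; a real OLAP product would expose these operations through its own interface.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales)
FACTS = [
    (2023, "East", "Widget", 100),
    (2023, "East", "Gadget", 150),
    (2023, "West", "Widget", 80),
    (2024, "East", "Widget", 120),
    (2024, "West", "Gadget", 200),
]
DIMS = ("year", "region", "product")  # assumed dimension order of each row


def slice_cube(facts, dim, value):
    """Slicing: keep only the rows where one dimension has a fixed value."""
    i = DIMS.index(dim)
    return [row for row in facts if row[i] == value]


def roll_up(facts, keep):
    """Consolidate sales over the dimensions in `keep`.

    Keeping fewer dimensions gives a higher-level summary; adding a
    dimension back is the corresponding drill-down.
    """
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def pivot(facts, row_dim, col_dim):
    """Rotation: lay out two chosen dimensions as rows and columns."""
    r, c = DIMS.index(row_dim), DIMS.index(col_dim)
    table = defaultdict(lambda: defaultdict(int))
    for row in facts:
        table[row[r]][row[c]] += row[-1]
    return {k: dict(v) for k, v in table.items()}


by_year = roll_up(FACTS, ["year"])                      # summary level
by_year_region = roll_up(FACTS, ["year", "region"])     # drill-down
west_only = slice_cube(FACTS, "region", "West")         # slice
region_by_year = pivot(FACTS, "region", "year")         # rotation
```

Here `by_year` is the consolidated view, `by_year_region` drills one level deeper, and `region_by_year` shows the same totals rotated so that regions become rows.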
OLAP is implemented in a multi-user client/server mode and offers consistently rapid response to queries, regardless of database size and complexity. It helps the user synthesize enterprise information through comparative, personalized viewing, as well as through analysis of historical and projected data in various "what-if" data model scenarios. It can give users a major competitive advantage by letting them view and analyze data not only from any perspective but, more importantly, in real time.
Types of OLAP
There are three main types of OLAP: MOLAP, ROLAP and HOLAP.
MOLAP - Multidimensional OLAP stores the OLAP summary information in efficiently structured data files. MOLAP cubes normally provide the best level of performance. Whether the calculated items are stored within these structures depends on the OLAP product being used.
ROLAP - Relational OLAP stores all the information within a relational database. This will normally include the calculations as well.
HOLAP - Hybrid OLAP stores the summary information in the MOLAP cube format with the more detailed transactional information being available from the relational database.
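The difference between the three storage approaches can be sketched in Python. This is only an illustration under stated assumptions: the MOLAP side is modelled as a small dense array indexed by member position, the ROLAP side as an in-memory SQLite table, and the HOLAP function answers summaries from the array and detail from the table. The table name `sales_fact`, the sample rows and all function names are hypothetical.

```python
import sqlite3

YEARS = [2023, 2024]
REGIONS = ["East", "West"]
ROWS = [
    (2023, "East", 250),
    (2023, "West", 80),
    (2024, "East", 120),
    (2024, "West", 200),
]

# MOLAP-style: a dense array; a cell is found by member position alone.
cube = [[0] * len(REGIONS) for _ in YEARS]
for year, region, sales in ROWS:
    cube[YEARS.index(year)][REGIONS.index(region)] += sales


def molap_lookup(year, region):
    return cube[YEARS.index(year)][REGIONS.index(region)]


# ROLAP-style: the same facts as rows in a relational table, queried in SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (year INTEGER, region TEXT, sales INTEGER)")
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", ROWS)


def rolap_lookup(year, region):
    cur = conn.execute(
        "SELECT SUM(sales) FROM sales_fact WHERE year = ? AND region = ?",
        (year, region),
    )
    return cur.fetchone()[0]


# HOLAP-style: summaries come from the array, detail from the relational store.
def holap_query(year, region=None):
    if region is None:
        return sum(cube[YEARS.index(year)])
    return rolap_lookup(year, region)
```

Both lookups return the same value for a given cell; what differs is where the data lives and how it is addressed.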
The quickest response rates are normally achieved using MOLAP. However, some solution functionality may be better accommodated using a combination of approaches.
OLAP Applications
The market has confirmed the strengths of OLAP as the foundation for financial decision support applications. Many of the largest and most demanding corporations in the world have used it to eliminate the delays and tedium associated with these critical enterprise financial applications. OLAP empowers managers to make faster, more informed decisions. They can examine and discover critical business issues while they still have time to act upon them, and discover relationships that were previously hidden. OLAP's flexibility caters for most modeling applications, including:
Reporting, including multiple reporting formats, income statements, balance sheets, management accounts, head office accounts, cash flow statements, statutory and regulatory reporting
Financial Consolidations
Foreign Currency Translations, including European Monetary Units
Budgeting
Interactive Forecasting, Re-forecasting and Automatic Forecasting Techniques
Fixed Assets Register and Depreciation Calculations
Cost-Volume-Profit Analysis
Allocations
Tax Planning
Accounts Receivable and Accounts Payable Age Analysis
Expense Tracking and Reallocation
Market Share Analysis
Trend Analysis over time periods
Q: What are the advantages and disadvantages of MOLAP and ROLAP?
Advantages of MOLAP
Fast query performance due to optimized storage, multidimensional indexing and caching.
Smaller on-disk size of data compared to data stored in a relational database, due to compression techniques.
Automated computation of higher-level aggregates of the data.
It is very compact for low-dimension data sets.
Array models provide natural indexing.
Effective data extraction is achieved through the pre-structuring of aggregated data.
Disadvantages of MOLAP
Within some MOLAP solutions the processing step (data load) can be quite lengthy, especially on large data volumes. This is usually remedied by doing only incremental processing, i.e., processing only the data that have changed (usually new data) instead of reprocessing the entire data set.
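Incremental processing can be sketched as follows. This is a simplified illustration, not how any particular MOLAP product works: it assumes fact rows are append-only and carry a monotonically increasing id, so a "high-water mark" is enough to find the new rows. The class name `IncrementalCube` and the row layout `(row_id, region, sales)` are hypothetical.

```python
from collections import defaultdict


class IncrementalCube:
    """Keeps running aggregates and processes only rows not seen before."""

    def __init__(self):
        self.totals = defaultdict(int)  # region -> total sales
        self.last_id = 0                # high-water mark of processed row ids

    def process(self, fact_rows):
        """Fold in rows with id above the high-water mark; return how many."""
        new_rows = [row for row in fact_rows if row[0] > self.last_id]
        for row_id, region, sales in new_rows:
            self.totals[region] += sales
            self.last_id = max(self.last_id, row_id)
        return len(new_rows)


facts = [(1, "East", 100), (2, "West", 80)]
cube = IncrementalCube()
first_load = cube.process(facts)       # initial load touches every row

facts.append((3, "East", 50))
second_load = cube.process(facts)      # later load touches only the new row
```

Updates or deletions of existing rows would need a different change-detection scheme, which this sketch does not cover.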
MOLAP tools traditionally have difficulty querying models with dimensions with very high cardinality (i.e., millions of members). Some MOLAP products have difficulty updating and querying models with more than ten dimensions. This limit differs depending on the complexity and cardinality of the dimensions in question. It also depends on the number of facts or measures stored. Other MOLAP products can handle hundreds of dimensions.
Advantages of ROLAP
ROLAP is considered to be more scalable in handling large data volumes, especially models with dimensions with very high cardinality (i.e., millions of members).
With a variety of data loading tools available, and the ability to fine tune the ETL code to the particular data model, load times are generally much shorter than with the automated MOLAP loads.
The data are stored in a standard relational database and can be accessed by any SQL reporting tool (the tool does not have to be an OLAP tool).
ROLAP tools are better at handling non-aggregatable facts (e.g., textual descriptions). MOLAP tools tend to suffer from slow performance when querying these elements.
By decoupling the data storage from the multi-dimensional model, it is possible to successfully model data that would not otherwise fit into a strict dimensional model.
The ROLAP approach can leverage database authorization controls such as row-level security, whereby the query results are filtered according to preset criteria applied, for example, to a given user or group of users (in effect, an additional SQL WHERE clause).
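The idea of filtering results per user through a WHERE clause can be sketched in Python with SQLite. Note the assumption: real databases usually enforce row-level security inside the engine, whereas this sketch emulates it in application code. The table `sales_fact`, the `USER_REGIONS` policy and the function `sales_by_product` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (region TEXT, product TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [
        ("East", "Widget", 100),
        ("East", "Gadget", 150),
        ("West", "Widget", 80),
        ("West", "Gadget", 200),
    ],
)

# Hypothetical policy: which regions each user is allowed to see.
USER_REGIONS = {"alice": ["East"], "bob": ["East", "West"]}


def sales_by_product(user):
    """Run the same report for any user, restricted to that user's rows."""
    regions = USER_REGIONS.get(user, [])
    if not regions:
        return {}  # unknown users see nothing
    placeholders = ", ".join("?" for _ in regions)
    sql = (
        "SELECT product, SUM(sales) FROM sales_fact "
        f"WHERE region IN ({placeholders}) "
        "GROUP BY product ORDER BY product"
    )
    return dict(conn.execute(sql, regions).fetchall())
```

The report query is identical for every user; only the bound region list in the WHERE clause changes.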
Disadvantages of ROLAP
There is a consensus in the industry that ROLAP tools have slower performance than MOLAP tools. However, see the discussion below about ROLAP performance.
The loading of aggregate tables must be managed by custom ETL code. The ROLAP tools do not help with this task. This means additional development time and more code to support.
When the step of creating aggregate tables is skipped, query performance suffers because the larger detailed tables must be queried. This can be partially remedied by adding more aggregate tables; however, it is still not practical to create aggregate tables for all combinations of dimensions/attributes.
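The trade-off above can be sketched as a simple "aggregate navigator": given the dimensions a query needs, pick the smallest table that covers them and fall back to the detail table otherwise. The table names, row counts and the function `pick_table` are hypothetical, and the sketch ignores hierarchies within a dimension.

```python
# Hypothetical catalogue: table name -> (dimensions covered, approximate rows)
TABLES = {
    "sales_fact": (frozenset({"date", "store", "product"}), 10_000_000),
    "agg_store_product": (frozenset({"store", "product"}), 40_000),
    "agg_store": (frozenset({"store"}), 200),
}


def pick_table(query_dims, tables=TABLES):
    """Return the smallest table whose dimensions cover the query."""
    needed = frozenset(query_dims)
    candidates = [
        (rows, name)
        for name, (dims, rows) in tables.items()
        if needed <= dims
    ]
    if not candidates:
        raise ValueError(f"no table covers dimensions {sorted(needed)}")
    return min(candidates)[1]


def possible_aggregates(n_dims):
    """Number of distinct dimension subsets that could each get a table."""
    return 2 ** n_dims
```

With only two aggregate tables defined, a query on `date` and `product` still has to scan `sales_fact`. Covering every combination would need `possible_aggregates(n)` tables, which doubles with each added dimension.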
ROLAP relies on the general-purpose database for querying and caching, and therefore several special techniques employed by MOLAP tools are not available (such as special hierarchical indexing). However, modern ROLAP tools take advantage of the latest improvements in the SQL language, such as the CUBE and ROLLUP operators, DB2 Cube Views, and other SQL OLAP extensions. These SQL improvements can narrow the performance advantage of MOLAP tools.
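What the CUBE and ROLLUP operators compute can be emulated in a few lines of Python. In standard SQL, ROLLUP over a list of columns produces the hierarchical prefixes of that list plus the grand total, while CUBE produces every subset. The sketch below generates those grouping sets and aggregates over each one; the sample rows and the function names are hypothetical, and a database would do this in a single pass rather than one loop per grouping set.

```python
from itertools import combinations


def rollup_sets(dims):
    """Grouping sets of ROLLUP: each prefix of dims, down to the grand total."""
    return [tuple(dims[:i]) for i in range(len(dims), -1, -1)]


def cube_sets(dims):
    """Grouping sets of CUBE: every subset of dims."""
    return [
        combo
        for size in range(len(dims), -1, -1)
        for combo in combinations(dims, size)
    ]


def aggregate(rows, grouping_sets, measure="sales"):
    """Sum the measure once per grouping set."""
    results = {}
    for grouping in grouping_sets:
        totals = {}
        for row in rows:
            key = tuple(row[d] for d in grouping)
            totals[key] = totals.get(key, 0) + row[measure]
        results[grouping] = totals
    return results


ROWS = [
    {"year": 2023, "region": "East", "sales": 250},
    {"year": 2023, "region": "West", "sales": 80},
    {"year": 2024, "region": "East", "sales": 120},
    {"year": 2024, "region": "West", "sales": 200},
]

cube_result = aggregate(ROWS, cube_sets(["year", "region"]))
```

The empty grouping set `()` holds the grand total, and `("region",)` holds the per-region totals that ROLLUP over `year, region` would not produce.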
Since ROLAP tools rely on SQL for all of the computations, they are not suitable when the model is heavy on calculations which don't translate well into SQL. Examples of such models include budgeting, allocations, financial reporting and other scenarios.