Week 02 Part 01

The document discusses data management, focusing on the benefits of databases, including data integrity and ease of access. It outlines the structure and functions of databases, online transaction processing (OLTP), and the concept of data warehouses and data marts. Additionally, it covers database normalization and the various languages used in database management systems (DBMS).

Uploaded by

Riya singh
Business Intelligence & Analytics

Data Management
Saji K Mathew, PhD
Professor, Department of Management Studies
INDIAN INSTITUTE OF TECHNOLOGY MADRAS
Data management
Two approaches
1. File system
2. Database system
Relational
Non-relational
◻ Object-oriented/object-relational
◻ XML
◻ Spatial
◻ Multimedia
Benefits of databases
Ensures data integrity
Entity integrity
Referential integrity
Resolves redundancy, inconsistency
Multiple file formats, duplication of information in different
files (data integrity problem)
E.g. the Government of India's Aadhaar (UID) project
Ease of access
Data is independent of the programs that use the data
One enterprise, one language
Common database integrates
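The entity and referential integrity benefits listed above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not part of the original slides: the customer and orders tables and the helper try_insert are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: every order must point at an existing customer.
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE orders (
           order_id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"""
)
conn.execute("INSERT INTO customer VALUES (1, 'Asha')")
conn.commit()


def try_insert(sql):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Entity integrity: a second row with the same primary key is rejected.
duplicate_key_ok = try_insert("INSERT INTO customer VALUES (1, 'Ravi')")
# Referential integrity: an order for a non-existent customer is rejected.
orphan_order_ok = try_insert("INSERT INTO orders VALUES (10, 99)")
# A row that satisfies both rules is accepted.
valid_order_ok = try_insert("INSERT INTO orders VALUES (11, 1)")
```

The rules live in the database rather than in each application program, which is one way the "data is independent of the programs" benefit shows up in practice.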
Databases
Logical level: Database management system (DBMS)
DBMS is a set of software tools that lets users create, view, and
work with the data in a database.
Data modeling (ER diagrams)
Creation and manipulation (SQL)
Maintenance tools
Physical level (Storage)
Storage Area Network (SAN)
Network Attached Storage (NAS)
Content Addressable Storage (CAS)
Online Transaction Processing (OLTP)
ACID properties: Atomicity, Consistency, Isolation, Durability
Atomicity
Manages failures that may leave database in an inconsistent state with
partial updates carried out—all or none
E.g. transfer of funds from one account to another should either
complete or not happen at all
Consistency
Ensures enforcement of rules
Isolation
Supports safe concurrent usage; uncontrolled concurrent access can lead to
inconsistencies
E.g. two people reading a balance and updating it at the same time
Durability
Preserves committed transactions against failures
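The funds-transfer example for atomicity can be sketched as follows, again using Python's sqlite3 module. The account table, the non-negative balance rule, and the transfer and balances helpers are assumptions made for illustration. The credit is deliberately applied before the debit so that a failed debit leaves a partial update for the rollback to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rule (consistency): a balance may never go negative.
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst as one transaction: all or none."""
    try:
        with conn:  # commit if both updates succeed, roll back otherwise
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


first_ok = transfer(1, 2, 30)     # succeeds: both updates are committed
second_ok = transfer(1, 2, 500)   # debit violates the rule: credit is undone too
```

After the failed second transfer, account 2 does not keep the 500 it was briefly credited with; the database returns to the state left by the last committed transaction.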
Data warehouse
For organizational learning to take place, data from
many sources must be gathered together and
organized in a consistent and useful way – hence,
Data Warehousing (DW)
DW allows an organization (enterprise) to remember
what it has noticed about its data
Data Mining techniques make use of the data in a
Data Warehouse
Definitions of a data warehouse
“A subject-oriented, integrated, time-variant and non-volatile
collection of data in support of management's decision making
process”
An enterprise has one data warehouse, and data marts source
their information from the data warehouse.
- W.H. Inmon
“A copy of transaction data, specifically structured for
query and analysis”
Data warehouse is the conglomerate of all data marts
within the enterprise.
- Ralph Kimball
Data Warehouse—Subject-Oriented

Organized around major subjects, such as customer, product,
sales.
Focusing on the modeling and analysis of data for decision
makers, not on daily operations or transaction processing.
Provide a simple and concise view around particular subject
issues by excluding data that are not useful in the decision
support process.

Data Warehouse—Integrated
Constructed by integrating multiple, heterogeneous data
sources
relational databases, flat files, on-line transaction records
Data cleaning and data integration techniques are applied.
Ensure consistency in naming conventions, encoding
structures, attribute measures, etc. among different data
sources
E.g., Hotel price: currency, tax, breakfast covered, etc.
When data is moved to the warehouse, it is converted.
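The hotel-price example can be sketched in Python as a conversion step applied when records move into the warehouse. Everything here is illustrative: the SourceRecord fields, the exchange rate in RATES_TO_INR, and the chosen warehouse convention (INR, tax included) are assumptions, not values from the slides.

```python
from dataclasses import dataclass

# Hypothetical exchange rates to INR; illustrative values only.
RATES_TO_INR = {"INR": 1.0, "USD": 80.0}


@dataclass
class SourceRecord:
    """A hotel price as one source system happens to store it."""
    hotel: str
    price: float
    currency: str
    tax_included: bool
    tax_rate: float  # fraction, e.g. 0.125 for 12.5%


def to_warehouse_row(rec):
    """Convert a source record to the assumed warehouse convention:
    cleaned hotel name, price in INR, tax included."""
    price_inr = rec.price * RATES_TO_INR[rec.currency]
    if not rec.tax_included:
        price_inr *= 1 + rec.tax_rate
    return {
        "hotel": rec.hotel.strip().title(),
        "price_inr_incl_tax": round(price_inr, 2),
    }


# Two sources describing the same price with different conventions.
from_booking_site = SourceRecord(" grand plaza ", 100.0, "USD", False, 0.125)
from_front_desk = SourceRecord("Grand Plaza", 9000.0, "INR", True, 0.125)

row_a = to_warehouse_row(from_booking_site)
row_b = to_warehouse_row(from_front_desk)
```

Once both rows follow one naming, currency, and tax convention, they can be compared or aggregated directly.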

Data Warehouse—Time Variant
The time horizon for the data warehouse is significantly longer
than that of operational systems.
Operational database: current value data.
Data warehouse data: provide information from a historical
perspective (e.g., past 5-10 years)
Every key structure in the data warehouse
Contains an element of time, explicitly or implicitly
But the key of operational data may or may not contain “time
element”.
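The contrast between current-value operational data and time-keyed warehouse data can be sketched with two plain Python dictionaries. The names operational_price, warehouse_price, update_price and price_history are hypothetical, and real warehouses would use tables rather than dictionaries.

```python
from datetime import date

operational_price = {}  # product_id -> current price only
warehouse_price = {}    # (product_id, snapshot_date) -> price; the key carries time


def update_price(product_id, price, as_of):
    """Record a price change in both stores."""
    operational_price[product_id] = price         # overwrites the old value
    warehouse_price[(product_id, as_of)] = price  # adds a dated snapshot


def price_history(product_id):
    """Return the (date, price) history held by the warehouse, oldest first."""
    return sorted(
        (snapshot_date, price)
        for (pid, snapshot_date), price in warehouse_price.items()
        if pid == product_id
    )


update_price("P1", 100, date(2023, 1, 1))
update_price("P1", 120, date(2024, 1, 1))
```

The operational store can answer only "what is the price now", while the warehouse key with its time element keeps every earlier value available for historical analysis.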

Data Warehouse—Non-Volatile

A physically separate store of data transformed from the
operational environment.
Operational update of data does not occur in the data
warehouse environment.
Does not require transaction processing, recovery, and
concurrency control mechanisms
Requires only two operations in data accessing:
initial loading of data and access of data.

Data mart
A Data Mart is a smaller, more focused Data Warehouse –
a mini-warehouse.

A Data Mart typically reflects the business rules of a
specific business unit within an enterprise.
Normalization
Database normalization is the process of decomposing
relations with anomalies to produce smaller, well
structured relations
1st normal form: Multivalued attributes (repeating
groups) removed/re-organized
2nd normal form: Partial dependencies addressed
3rd normal form: Transitive dependencies addressed
Boyce-Codd NF (BCNF), 4th NF and higher normal forms also exist
Trade off: Efficient storage space vs efficient data
processing
De-normalization
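A small Python sketch of removing a transitive dependency (the step from 2nd to 3rd normal form) and of de-normalizing again. The flat_orders relation is invented for illustration: customer_city depends on customer_id, which in turn depends on the key order_id.

```python
# Hypothetical flat relation: the city is repeated for every order of a customer.
flat_orders = [
    {"order_id": 1, "customer_id": "C1", "customer_city": "Chennai", "amount": 250},
    {"order_id": 2, "customer_id": "C1", "customer_city": "Chennai", "amount": 400},
    {"order_id": 3, "customer_id": "C2", "customer_city": "Mumbai", "amount": 150},
]


def decompose(rows):
    """Split the flat relation into customers and orders (3NF for this example)."""
    customers = {}
    orders = []
    for row in rows:
        # Each customer's city is now stored exactly once.
        customers[row["customer_id"]] = {"customer_city": row["customer_city"]}
        orders.append(
            {
                "order_id": row["order_id"],
                "customer_id": row["customer_id"],
                "amount": row["amount"],
            }
        )
    return customers, orders


def denormalize(customers, orders):
    """Join the two relations back into one wide relation for faster reads."""
    return [{**order, **customers[order["customer_id"]]} for order in orders]


customers, orders = decompose(flat_orders)
rebuilt = denormalize(customers, orders)
```

The decomposed form stores less and avoids update anomalies (a city change touches one row), while the de-normalized form avoids the join at query time, which is the trade-off noted above.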
DBMS languages
Data Definition Language (DDL)
Builds the data dictionary
Creates the database
Describes logical views for each user
Specifies record or field security constraints
Data Manipulation Language (DML)
Changes the content in the database
Performs insertions, updates, and deletions
Data Query Language (DQL)
Enables users to retrieve, sort, and display specific data from
the database
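The three language categories can be seen side by side in a short sketch that sends SQL statements through Python's sqlite3 module. The student table and its rows are hypothetical; SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the database.
conn.execute(
    "CREATE TABLE student ("
    "roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL, marks INTEGER)"
)

# DML: change the content of the database.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Meera", 82), (2, "Arjun", 67), (3, "Kiran", 91)],
)
conn.execute("UPDATE student SET marks = 70 WHERE roll_no = 2")
conn.execute("DELETE FROM student WHERE roll_no = 1")
conn.commit()

# DQL: retrieve, filter, and sort specific data.
top_students = conn.execute(
    "SELECT name, marks FROM student WHERE marks >= 70 ORDER BY marks DESC"
).fetchall()
```

Here CREATE TABLE is the DDL statement, INSERT, UPDATE and DELETE are DML, and SELECT is the query (DQL) statement.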
