3) Wase 2021 Dds Ho Modified
COURSE HANDOUT
(Flipped with 16 Sessions)
Course Description
This course deals with the fundamental issues in large distributed database systems, which are
motivated by computer networking and by the distribution of processors and control. The theory,
design, specification, implementation, and performance of large systems will be discussed.
Concurrency, consistency, integrity, reliability, privacy, and security in distributed database systems
will also be covered.
Course Objectives
No Objective
CO1 This field covers all aspects of data computing and information access across multiple processing elements connected by any form of communication network, either local area or wide area.
CO2 There has been a steady growth in the development of contemporary applications that demonstrate their efficacy by connecting millions of users/applications/machines across the globe without relying on a traditional client-server approach.
CO3 The general computing trend is to leverage shared resources and massive amounts of data over the Internet.
CO4 This course aims to provide an understanding of theory and systems aspects of distributed data across the web.
CO5 This course deals with the latest trends and tools being used for the understanding of huge volumes of data.
Text Book(s)
No Author(s), Title, Edition, Publishing House
T1 M. Tamer Özsu, Patrick Valduriez, Principles of Distributed Database Systems, Third Edition
T2 Thomas Erl, Wajid Khattak, Paul Buhler, Big Data Fundamentals: Concepts, Drivers & Techniques
R1 Ulf Troppens, Wolfgang Müller-Friedt, Rainer Wolafka (IBM Storage Software Development, Germany), Storage Networks Explained, Wiley
Content Structure
Learning Outcomes:
No Learning Outcomes
LO2 Understanding of Distributed Storage systems and the technologies used to implement
Pre - CS (R1 - Ch.1, R1 - Ch.2)
RL 1.3 JBOD; Storage virtualisation using RAID; RAID 0: block-by-block striping; RAID 1: block-by-block mirroring
RL 1.4 RAID 01: striping and mirroring combined; RAID 10: striping and mirroring combined
RL 1.5 RAID 4 and RAID 5; RAID 6: double parity; RAID 2 & RAID 3; Comparison of the RAID levels; Basic forms of storage
RL 1.6 Parity Check using XOR Logic
During - CS CS - 2 Discuss all the above RL topics in brief and solve problems on RAID 4
Post - CS HW Understanding RAID Levels 4 & 5 (R1: Pages 535 & 536)
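Illustration for RL 1.6 (not taken from R1): the sketch below shows how XOR parity, as used by RAID 4 and RAID 5, allows one lost block to be rebuilt from the surviving blocks. The function names `xor_parity` and `rebuild_missing` and the two-byte sample blocks are assumptions made for this example; real arrays operate on much larger blocks.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the parity."""
    # XOR-ing the parity with every surviving block cancels them out,
    # leaving exactly the block that was lost.
    return xor_parity(list(surviving_blocks) + [parity])


# Three hypothetical data blocks striped across three disks.
data = [b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"]
parity = xor_parity(data)  # stored on the parity disk

# Simulate the loss of the second disk and rebuild its block.
rebuilt = rebuild_missing([data[0], data[2]], parity)
print(rebuilt == data[1])
```

The same routine computes the parity and performs the rebuild, because XOR is its own inverse. It can recover only one missing block per stripe, which is why RAID 6 adds a second, independent parity.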
Contact Session - 3
M2 : Distributed DBMS Architecture
Pre - CS (T1 - Ch 1)
RL 2.1 Distributed Database System; Distributed DBMS; ANSI/SPARC Architecture
RL 2.2 Architectural Models for DDBS
During - CS CS - 3 Complications Introduced by Distribution; Design Issues
Post - CS HW Applications and real-time examples on various architectures (Online)
Contact Session - 4
M2 : Distributed DBMS Architecture
Pre - CS (T1 - Ch 1)
RL 2.3 Client/Server Systems
RL 2.4 Peer-to-Peer Systems
RL 2.5 Multidatabase System Architecture
During - CS CS - 4 Discussion of a few architecture examples; detailed discussion of multidatabase architecture
Post - CS HW Applications and real-time examples on various architectures (Online)
Contact Session - 5
M3 : Distributed Database Design & Integration
Pre - CS (T1 - Ch 3)
RL 3.1 Framework of Distribution
RL 3.2 Top-Down Design Process
RL 3.3 Distribution Design Issues
RL 3.4 Horizontal Fragmentation
RL 3.5 Vertical and Hybrid Fragmentation
During - CS CS - 5 Discuss all the above RL topics in brief
Post - CS HW Problem solving on different fragmentation models
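Illustration for RL 3.4 (a minimal sketch, not an example from T1): primary horizontal fragmentation splits a relation into fragments using selection predicates, and a correct fragmentation is complete, disjoint, and reconstructible by union. The relation `PROJ`, its attributes, the budget threshold, and the helper `horizontal_fragments` are hypothetical names and values chosen for this example.

```python
def horizontal_fragments(rows, predicates):
    """Split rows into named fragments, one per selection predicate."""
    return {name: [row for row in rows if keep(row)]
            for name, keep in predicates.items()}


# Hypothetical relation and predicates, for illustration only.
PROJ = [
    {"pno": "P1", "budget": 150000},
    {"pno": "P2", "budget": 135000},
    {"pno": "P3", "budget": 250000},
    {"pno": "P4", "budget": 310000},
]
predicates = {
    "PROJ1": lambda row: row["budget"] <= 200000,
    "PROJ2": lambda row: row["budget"] > 200000,
}
fragments = horizontal_fragments(PROJ, predicates)

# For each tuple, count how many predicates select it.
matches = [sum(keep(row) for keep in predicates.values()) for row in PROJ]
complete = all(m >= 1 for m in matches)   # every tuple lands in some fragment
disjoint = all(m <= 1 for m in matches)   # no tuple lands in two fragments

# Reconstruction: the union of the fragments gives back the relation.
rebuilt = sorted((row for frag in fragments.values() for row in frag),
                 key=lambda row: row["pno"])
reconstructs = rebuilt == sorted(PROJ, key=lambda row: row["pno"])

print(complete, disjoint, reconstructs)
```

Changing the second predicate to `budget >= 200000` keeps the fragmentation complete but makes it fail the disjointness check for any tuple with a budget of exactly 200000.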
Contact Session - 6
M3 : Distributed Database Design & Integration
Pre - CS (T1 - Ch 4)
RL 3.6 Bottom-Up Design Methodology
During - CS CS - 6 Discuss all the above RL topics in brief; Allocation; Data Directory
Post - CS HW Problem solving on GCS generation; Examples 4.1 & 4.2, Page 136
Contact Session - 7
M4 : Data and Access Control
Pre - CS (T1 - Ch 5)
RL 4.1 D
RL 4.2 D
RL 4.3 M
RL 4.4 D
During - CS CS - 7 Discuss all the above RL topics in brief; View Management; Exercises on view creation and querying
Post - CS HW Materialised view creation
Contact Session - 8
Mid Term Review - Review of Modules 1-4 and doubts clarification session
Contact Session - 9
M5 : Data Replication
Pre - CS (T1 - Ch 13)
RL 5.1 Consistency of Replicated Databases; Mutual Consistency; Mutual Consistency versus Transaction Consistency
During - CS CS - 9 Discuss all the above RL topics in brief
Post - CS HW -
Contact Session - 10
M5 : Data Replication
Pre - CS (T1 - Ch 13)
RL 5.2 Update Management Strategies - Eager Update Propagation
RL 5.3 Update Management Strategies - Lazy Update Propagation
RL 5.4 Centralized Techniques; Distributed Techniques
RL 5.5 Replication Protocols; Eager Centralized Protocols; Eager Distributed Protocols
RL 5.6 Lazy Centralized Protocols; Lazy Distributed Protocols
During - CS CS - 10 Discuss all the above RL topics in brief
Post - CS HW -
Contact Session - 11
M6 : Parallel Database Systems
Pre - CS (T1 - Ch 14)
RL 6.1 Parallel Database Systems; Objectives; Functional Architecture
RL 6.2 Parallel DBMS Architectures; Shared-Memory
RL 6.3 Shared-Disk
RL 6.4 Shared-Nothing
RL 6.5 Hybrid Architectures
RL 6.6 Parallel Data Placement
During - CS CS - 11 Discuss all the above RL topics in brief
Post - CS HW -
Contact Session - 12
M7 : Web Data Management
Pre - CS (T1 - Ch 17)
RL 7.1 Web Graph Management
RL 7.2 Compressing Web Graphs; Storing Web Graphs as S-Nodes
RL 7.3 Supernode graph; Intranode graph; Positive superedge graph; Negative superedge graph
During - CS CS - 12 Discuss all the above RL topics in brief
Post - CS HW Problems on S Nodes
Contact Session - 13
M7 : Web Data Management
Pre - CS (T1 - Ch 17)
RL 7.4 Web Search
RL 7.5 Web Crawling
RL 7.6 Indexing; Structure Index; Text Index
RL 7.7 Ranking and Link Analysis; Evaluation of Keyword Search
During - CS CS - 13 Discuss all the above RL topics in brief
Post - CS HW -
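Illustration for RL 7.6 and RL 7.7 (a minimal sketch, not the treatment given in T1): a text index can be pictured as an inverted index that maps each term to the set of pages containing it, and a keyword search as the intersection of those sets. The page identifiers, page texts, and function names below are hypothetical, and ranking is deliberately left out.

```python
from collections import defaultdict


def build_text_index(pages):
    """Map each lower-cased term to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in text.lower().split():
            index[term].add(page_id)
    return index


def keyword_search(index, query):
    """Return the pages that contain every term of the query (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical crawled pages, for illustration only.
pages = {
    "page_a": "Distributed database systems",
    "page_b": "Parallel database systems",
    "page_c": "Web data management",
}
index = build_text_index(pages)
print(sorted(keyword_search(index, "database systems")))
```

A real search engine would add tokenisation, stop-word handling, and a ranking step such as link analysis on top of this lookup.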
Contact Session - 14
M8 : Hadoop & Big Data
Pre - CS (T2 - Ch 5 & 6)
RL 8.1 Hadoop Architecture; Hadoop Distributed File System; How Does Hadoop Work; Advantages of Hadoop
RL 8.2 HDFS; Features of HDFS; HDFS Architecture; Goals of HDFS
RL 8.3 HDFS Operations
During - CS CS - 14 Discuss all the above RL topics in brief
Post - CS HW Case Study (T2 Pages 20 & 117)
Contact Session - 15
M8 : Hadoop & Big Data
Pre - CS (T2 - Ch 5 & 6)
RL 8.4 Big Data; Benefits of Big Data; Big Data Technologies
During - CS CS - 15 Discuss all the above RL topics in brief
Post - CS HW Case Study T2 Page 143
Contact Session - 16
Comprehensive Examination Review - Review of Modules 5-8 and doubts clarification session
Evaluation Scheme: