Chapters 1, 2, and 3 Summary
Why Databases
Data is inescapable, prevalent, and persistent in nature; it exists from birth to death.
Individuals continuously generate and consume a lot of data throughout their lives.
It starts with birth certificates and extends to death certificates, highlighting the lifelong
data generation process.
Importance of Databases
Databases are the optimal solution for storing and managing data effectively.
Databases make data persistent, shareable, and secure, addressing the challenges posed
by the sheer volume of generated data.
Without databases, businesses are not able to store and retrieve huge collections of data efficiently.
Databases are the solution to efficiently process, store, and retrieve vast amounts of
data for timely decision-making.
Metadata- the data characteristics and the set of relationships that link the data found
within the database.
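As a small illustration of metadata, the sketch below asks a database to describe one of its own tables. It assumes Python's built-in sqlite3 module and a hypothetical customer table; the table and column names are made up for this example.

```python
import sqlite3

# In-memory database, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, cus_name TEXT NOT NULL)"
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = conn.execute("PRAGMA table_info(customer)").fetchall()

# The column names and types are data about the data, i.e. metadata.
column_names = [col[1] for col in columns]
column_types = [col[2] for col in columns]
print(column_names)
print(column_types)
```

The rows returned here describe the table itself rather than any customer, which is the distinction the definition draws.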
Advantages of DBMS
Types of databases
Workgroup database- a multiuser database that supports a relatively small number of users
Centralized database – a database that supports data located at a single site
Distributed database – a database that supports data distributed across several different sites
Cloud database- Database that is created and maintained using cloud data services
General purpose database- contains a wide variety of data used in different disciplines
Analytical database- stores historical data and business metrics used exclusively for tactical or
strategic decision making.
Online analytical processing (OLAP): a set of tools that work together to provide an advanced data
analysis environment for retrieving, processing, and modelling data from the data warehouse
Structured data: raw data that has been formatted to facilitate storage, use, and generation of information.
Database design- activities that focus on the design of the database structure that
will be used to store and manage end-user data.
Structural independence- exists when you can change the file structure without affecting the
applications’ ability to access data
Physical data format- How computers must work with the data
Data redundancy: occurs when the same data is stored unnecessarily at different places
Effects of data Redundancy
Data anomalies- develop when not all of the required changes in the redundant data are made
successfully
Update anomalies
Insertion anomalies
Deletion anomalies
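The anomalies listed above can be sketched with a tiny flat file held in memory. This is only an illustration under assumed data: the customer and agent names, the phone numbers, and the helper phones_for are all hypothetical.

```python
# Hypothetical flat file: the agent's phone is stored redundantly in every
# customer row that the agent serves.
customers = [
    {"customer": "Ramas", "agent": "Alex", "agent_phone": "555-0101"},
    {"customer": "Dunne", "agent": "Alex", "agent_phone": "555-0101"},
    {"customer": "Smith", "agent": "Leah", "agent_phone": "555-0202"},
]


def phones_for(agent, rows):
    """Return the set of distinct phone numbers recorded for an agent."""
    return {row["agent_phone"] for row in rows if row["agent"] == agent}


# Update anomaly: the phone number is changed in only one redundant copy,
# so the file now holds two conflicting numbers for the same agent.
customers[0]["agent_phone"] = "555-0999"
alex_phones = phones_for("Alex", customers)

# Deletion anomaly: deleting Leah's only customer also deletes the only
# record of Leah's phone number.
remaining = [row for row in customers if row["customer"] != "Smith"]
leah_phones = phones_for("Leah", remaining)

print(alex_phones)
print(leah_phones)
```

An insertion anomaly would show up in the same structure: a new agent with no customers yet could not be recorded without inventing a placeholder customer row.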
Database system environment- an organization of components that define and regulate the
collection, storage, management, and use of data within a database environment.
Functions of DBMS
Disadvantages of DBMS
o Increased cost
o Management complexities
o Maintaining currency
o Vendor dependence
o Needs frequent upgrades
Chapter 2
Data model: a simple representation, usually graphical, of more complex real-world data
structures
Importance of Data Models
It facilitates interaction among the designer, the application programmer, and the end user
Entity- a person, place, thing or event about which data will be collected and stored
Relationships – describe associations among entities. Designers usually use shorthand notations
to represent one-to-many, many-to-many, and one-to-one relationships [1:M or 1..*, M:N or *..*, and
1:1 or 1..1 respectively]
Constraints- restrictions placed on data. They ensure data integrity and are expressed
in the form of rules.
Business rule: a brief, precise, and unambiguous description of a policy, procedure, or principle
within a specific organization
The main sources of business rules are company managers, policy makers, department managers,
and written documentation such as a company’s procedures, manuals, and standards.
The quest for a better data management model led to the development of several models
Hierarchical model – developed in the 1960s to manage large amounts of complex data
Network model: created to represent complex data relationships more effectively than the hierarchical
model, to improve database performance, and to impose a database standard
Schema- is the conceptual organization of the entire database as viewed by the database
administrator.
Subschema – defines the portion of the database “seen” by the application programs
that actually produce the desired information from the data within the database.
Schema data definition language- enables the database administrator to define the schema
components.
A data manipulation language (DML)- defines the environment in which data can be managed and is
used to work with the data within the database.
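The split between defining a schema and working with data can be illustrated with SQL, used here only as a familiar stand-in rather than the network model's own languages. The sketch assumes Python's built-in sqlite3 module and a hypothetical product table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: describe the structure (the schema component).
conn.execute(
    "CREATE TABLE product (prod_code TEXT PRIMARY KEY, prod_price REAL NOT NULL)"
)

# Data manipulation: add, change, and read the data inside that structure.
conn.execute("INSERT INTO product VALUES (?, ?)", ("P100", 9.5))
conn.execute(
    "UPDATE product SET prod_price = ? WHERE prod_code = ?", (12.0, "P100")
)
rows = conn.execute("SELECT prod_code, prod_price FROM product").fetchall()
print(rows)
```

The first statement changes what the database can hold; the remaining three only change or read what it currently holds.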
Object-oriented model (components) – also called the semantic data model because it indicates
meaning
Inheritance- the ability of an object within the class hierarchy to inherit the attributes and
methods of the classes above it.
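A minimal sketch of inheritance in Python, with hypothetical Person and Employee classes chosen for illustration:

```python
class Person:
    """Parent class: defines an attribute and a method."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is a person"


class Employee(Person):
    """Child class: inherits name and describe() and adds salary."""

    def __init__(self, name, salary):
        super().__init__(name)  # reuse the parent's attribute setup
        self.salary = salary


emp = Employee("Ana", 50000)
description = emp.describe()  # method defined only on Person
print(description)
```

Employee never defines describe() itself; the call succeeds because the method is inherited from the class above it in the hierarchy.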
UML class diagrams- used to represent data and their relationships within the larger UML
object-oriented system's modeling language
Big Data: a movement to find new and better ways to manage large amounts of web- and
sensor-generated data and derive business insight from it while simultaneously providing high
performance and scalability at a reasonable cost.
Velocity- it is the speed with which data grows and the need to process this data quickly in order
to generate information and insight.
Variety- refers to the fact that the data being collected comes in multiple different data formats.
Big data technologies – Hadoop- a Java-based, open-source, high-speed, fault-tolerant
distributed storage and computational framework
Chapter 3
Tables and their characteristics
A table- is a two dimensional structure composed of rows and columns
- Also called a relation because the relational model's creator, E. F. Codd, used the two terms as
synonyms.
Primary key (PK) - is an attribute or combination of attributes that uniquely identifies any given row.
Dependencies:
Determination is the state in which knowing the value of one attribute makes it possible to determine the
value of another.
Full functional dependence – a functional dependency in which the entire collection of attributes in
the determinant is necessary for the relationship.
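One way to make these two definitions concrete is to test them against sample rows. The sketch below uses hypothetical enrollment data and hypothetical helper names (determines, fully_determines). A check like this can only show that a dependency holds in the rows given, not that it holds as a business rule.

```python
from itertools import combinations


def determines(rows, determinant, dependent):
    """True if equal determinant values always come with equal dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


def fully_determines(rows, determinant, dependent):
    """True if the dependency holds and no proper subset of the determinant suffices."""
    if not determines(rows, determinant, dependent):
        return False
    for size in range(1, len(determinant)):
        for subset in combinations(determinant, size):
            if determines(rows, subset, dependent):
                return False
    return True


# Hypothetical enrollment rows.
enroll = [
    {"stu_num": 1, "class_code": "DB", "stu_name": "Ana", "grade": "A"},
    {"stu_num": 1, "class_code": "OS", "stu_name": "Ana", "grade": "B"},
    {"stu_num": 2, "class_code": "DB", "stu_name": "Ben", "grade": "C"},
    {"stu_num": 2, "class_code": "OS", "stu_name": "Ben", "grade": "B"},
]

name_from_num = determines(enroll, ["stu_num"], ["stu_name"])
full_grade = fully_determines(enroll, ["stu_num", "class_code"], ["grade"])
full_name = fully_determines(enroll, ["stu_num", "class_code"], ["stu_name"])
print(name_from_num, full_grade, full_name)
```

In this sample, grade needs both stu_num and class_code, so the dependence is full. stu_name is already determined by stu_num alone, so its dependence on the pair is not full.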
Types of Keys
Super key – a key that can uniquely identify any row in a table
Primary key- a candidate key selected to uniquely identify all other attribute values in a given row; it
cannot contain null entries
Foreign key- an attribute or combination of attributes in one table whose values must either match the
primary key in another table or be null
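The behavior of primary and foreign keys can be sketched with Python's built-in sqlite3 module. The agent and customer tables are hypothetical, and SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued on the connection. SQLite also accepts a null foreign key value, which the second customer row demonstrates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.execute(
    "CREATE TABLE agent (agent_code INTEGER PRIMARY KEY, agent_name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE customer (
           cus_code   INTEGER PRIMARY KEY,
           cus_name   TEXT NOT NULL,
           agent_code INTEGER REFERENCES agent (agent_code)
       )"""
)

conn.execute("INSERT INTO agent VALUES (501, 'Alex')")
conn.execute("INSERT INTO customer VALUES (1, 'Ramas', 501)")   # matches a PK
conn.execute("INSERT INTO customer VALUES (2, 'Dunne', NULL)")  # null is allowed

# A foreign key value with no matching primary key is rejected.
try:
    conn.execute("INSERT INTO customer VALUES (3, 'Smith', 999)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

# A duplicate primary key value is rejected as well.
try:
    conn.execute("INSERT INTO customer VALUES (1, 'Duplicate', 501)")
    pk_rejected = False
except sqlite3.IntegrityError:
    pk_rejected = True

customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(fk_rejected, pk_rejected, customer_count)
```

Only the two valid rows remain in customer; both rejected inserts raise sqlite3.IntegrityError before any row is stored.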
Entity integrity- is the condition in which each row in a table has its own unique identity
Relational algebra- defines the theoretical way of manipulating table content using relational operators.
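Three common relational operators (select, project, and natural join) can be sketched over tables represented as lists of dictionaries. This is an illustrative toy under assumed data, not how a DBMS implements them; the function names and sample rows are hypothetical.

```python
def select(rows, predicate):
    """SELECT (restrict): keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, attributes):
    """PROJECT: keep only the named attributes and drop duplicate rows."""
    result = []
    for row in rows:
        reduced = {a: row[a] for a in attributes}
        if reduced not in result:
            result.append(reduced)
    return result


def natural_join(left, right):
    """NATURAL JOIN: combine rows that agree on every shared attribute."""
    joined = []
    for l_row in left:
        for r_row in right:
            common = set(l_row) & set(r_row)
            if all(l_row[a] == r_row[a] for a in common):
                joined.append({**l_row, **r_row})
    return joined


customers = [
    {"cus_name": "Ramas", "agent_code": 501},
    {"cus_name": "Dunne", "agent_code": 502},
    {"cus_name": "Smith", "agent_code": 501},
]
agents = [
    {"agent_code": 501, "agent_name": "Alex"},
    {"agent_code": 502, "agent_name": "Leah"},
]

# Names of the customers served by agent Alex.
alex_customers = project(
    select(natural_join(customers, agents), lambda r: r["agent_name"] == "Alex"),
    ["cus_name"],
)
print(alex_customers)
```

Each operator takes tables as input and returns a new table, which is why the three calls can be nested into a single expression.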