Relational Model

The relational model represents data and relationships through tables, where each table consists of unique columns and atomic values. It emphasizes the importance of database schema, normalization, and keys (superkeys, candidate keys, and primary keys) to maintain data integrity. Query languages, particularly relational algebra, provide a formal foundation for database operations, allowing users to retrieve and manipulate data efficiently.

Relational Model

• The relational model uses a collection of tables to represent both data and the relationships among those data.
• Each table has multiple columns, and each column has a unique name. Each attribute of a relation has a name.
• The set of allowed values for each attribute is called the domain of the attribute.
• Attribute values are required to be atomic, that is, indivisible.
  - E.g. the value of an attribute can be an account number, but cannot be a set of account numbers.
• A domain is said to be atomic if all its members are atomic.
• The special value null is a member of every domain. The null value causes complications in the definition of many operations.
Relational Model…contd.
Database schema: the database schema is the logical design of the database.

• A database consists of multiple relations.
• Information about an enterprise is broken up into parts, with each relation storing one part of the information.
  E.g. account: information about accounts
       depositor: which customer owns which account
       customer: information about customers

[Figures: the customer relation; the depositor relation]
Relational Model…contd.
Because tables are relations, we use the mathematical terms relation and tuple in place of table and row.

Comparison of database terms with programming language terms:

• The concept of a relation corresponds to the programming-language notion of a variable.
• The concept of a relation schema corresponds to the programming-language notion of a type definition.
• The concept of a relation instance corresponds to the programming-language notion of the value of a variable.
Relational Model…contd.
Relation Schema
• Formally, given domains D1, D2, …, Dn, a relation r is a subset of D1 × D2 × … × Dn. Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai ∈ Di.
• A relation is a subset of the Cartesian product of a list of domains.
• The schema of a relation consists of
  - attribute definitions
    - name
    - type/domain
  - integrity constraints
Relation Instance
• The current values (relation instance) of a relation are specified by a table.
• An element t of r is a tuple, represented by a row in a table.

The customer relation (columns are attributes, rows are tuples):

customer_name   customer_street   customer_city
Jones           Main              Harrison
Smith           North             Rye
Curry           North             Rye
Lindsay         Park              Pittsfield
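The statement that a relation is a subset of D1 × D2 × … × Dn can be checked directly on a small case. The sketch below is only an illustration: the three domains are deliberately tiny, made-up sets (real domains would be far larger), and the helper name is_relation_over is hypothetical.

```python
from itertools import product

# Hypothetical, deliberately tiny domains for the three customer attributes.
D1 = {"Jones", "Smith"}      # customer_name
D2 = {"Main", "North"}       # customer_street
D3 = {"Harrison", "Rye"}     # customer_city

# D1 x D2 x D3: every 3-tuple that can be formed from the domains.
cartesian = set(product(D1, D2, D3))

# A relation instance is any subset of that product.
customer = {
    ("Jones", "Main", "Harrison"),
    ("Smith", "North", "Rye"),
}

def is_relation_over(r, domains):
    """True if every tuple of r draws its i-th value from the i-th domain."""
    return all(
        len(t) == len(domains) and all(v in d for v, d in zip(t, domains))
        for t in r
    )

print(len(cartesian))                            # number of possible tuples
print(customer <= cartesian)                     # subset test
print(is_relation_over(customer, [D1, D2, D3]))  # same test, domain by domain
```

The per-domain check avoids materializing the full product, which matters once the domains are not toy-sized.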
Relational Model…contd.
Attribute Types
• Each attribute of a relation has a name.
• The set of allowed values for each attribute is called the domain of the attribute.
• Attribute values are normally required to be atomic, that is, indivisible.
  - E.g. the value of an attribute can be an account number, but cannot be a set of account numbers.
• A domain is said to be atomic if all its members are atomic.
• The special value null is a member of every domain.
• The null value causes complications in the definition of many operations.


Relational Model…contd.

Why Split Information Across Relations

Storing all information as a single relation such as bank(account_number, balance, customer_name, ..) results in

- repetition of information
  e.g., if two customers own an account (What gets repeated?)
- the need for null values
  e.g., to represent a customer without an account

Normalization theory deals with how to design relational schemas.
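A small sketch may make the two problems concrete. All values below are made up for illustration, and the three-column bank relation is a simplified stand-in for the wider schema mentioned above; None plays the role of null.

```python
# One wide relation: (account_number, balance, customer_name).
# Sample values are invented for illustration.
bank = [
    ("A-101", 500, "Johnson"),
    ("A-101", 500, "Smith"),   # second owner of the same account
    (None, None, "Curry"),     # a customer without an account needs nulls
]

# The balance of A-101 is stored once per owner.
copies_of_balance = [balance for acct, balance, _ in bank if acct == "A-101"]
rows_with_null = [row for row in bank if None in row]

# Split design: each fact is stored once and no nulls are needed.
account = {("A-101", 500)}
depositor = {("Johnson", "A-101"), ("Smith", "A-101")}
customer = {("Johnson",), ("Smith",), ("Curry",)}

print(copies_of_balance)    # the repeated balance
print(len(rows_with_null))  # rows forced to carry nulls
```

In the split design the balance of A-101 appears in exactly one tuple, so it can be updated in one place.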


Relational Model…contd.
Keys
• Let K ⊆ R. K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation r(R). By “possible r” we mean a relation r that could exist in the enterprise we are modeling.
• Example: {customer_name, customer_street} and {customer_name} are both superkeys of Customer, if no two customers can possibly have the same name.
• In real life, an attribute such as customer_id would be used instead of customer_name to uniquely identify customers, but we omit it to keep our examples small, and instead assume customer names are unique.
• A superkey is a set of one or more attributes that, taken collectively, uniquely identify a tuple in the relation.
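Because a superkey must hold for every possible relation r(R), a single instance can never prove that a set of attributes is a superkey, but it can refute one. The sketch below, with a hypothetical helper name refutes_superkey, runs that refutation test on the four customer tuples shown earlier.

```python
def refutes_superkey(rows, key):
    """Return True if two rows of this instance agree on every attribute in key.

    rows is a list of distinct dicts (one per tuple); key is a list of names.
    """
    seen = set()
    for row in rows:
        values = tuple(row[a] for a in key)
        if values in seen:
            return True
        seen.add(values)
    return False

customer = [
    {"customer_name": "Jones", "customer_street": "Main", "customer_city": "Harrison"},
    {"customer_name": "Smith", "customer_street": "North", "customer_city": "Rye"},
    {"customer_name": "Curry", "customer_street": "North", "customer_city": "Rye"},
    {"customer_name": "Lindsay", "customer_street": "Park", "customer_city": "Pittsfield"},
]

print(refutes_superkey(customer, ["customer_name"]))    # no two rows share a name
print(refutes_superkey(customer, ["customer_street"]))  # North appears twice
```

A result of False only means that this instance is consistent with the key; whether the key holds in general is a modeling decision.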
Relational Model…contd.

[Figure: Banking Enterprise]
Relational Model…contd.
• K is a candidate key if K is a minimal superkey, that is, no proper subset of K is a superkey.
  Example: {customer_name} is a candidate key for Customer, since it is a superkey and no proper subset of it is a superkey.
• Primary key: a candidate key chosen as the principal means of identifying tuples within a relation.
  - Should choose an attribute whose value never, or very rarely, changes.
  - E.g. an email address is unique, but may change.
• Foreign key: a relation schema may have an attribute that corresponds to the primary key of another relation. The attribute is called a foreign key.
  - E.g. the customer_name and account_number attributes of depositor are foreign keys to customer and account respectively.
  - Only values occurring in the primary key attribute of the referenced relation may occur in the foreign key attribute of the referencing relation.
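Two of the ideas above lend themselves to a short check: picking the minimal superkeys, and testing the foreign-key rule on an instance. The following is a sketch under stated assumptions: the function names are hypothetical, the list of superkeys is supplied by hand, and the depositor rows (including the deliberately bad A-999) are made up.

```python
def candidate_keys(superkeys):
    """Keep only the minimal superkeys: those with no proper subset in the list."""
    keys = [frozenset(k) for k in superkeys]
    return [k for k in keys if not any(other < k for other in keys)]

def dangling_references(referencing, attribute, referenced_keys):
    """Rows whose foreign-key value does not occur in the referenced primary key."""
    return [row for row in referencing if row[attribute] not in referenced_keys]

superkeys = [{"customer_name"}, {"customer_name", "customer_street"}]

customer_names = {"Jones", "Smith"}     # primary-key values of customer
account_numbers = {"A-101", "A-215"}    # primary-key values of account
depositor = [
    {"customer_name": "Jones", "account_number": "A-101"},
    {"customer_name": "Smith", "account_number": "A-215"},
    {"customer_name": "Smith", "account_number": "A-999"},  # violates the rule
]

print(candidate_keys(superkeys))
print(dangling_references(depositor, "customer_name", customer_names))
print(dangling_references(depositor, "account_number", account_numbers))
```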
Relational Model…contd.

Query Languages
• A query language is a language in which a user requests information from the database.
• In a procedural language, the user instructs the system to perform a sequence of operations on the database to compute the desired result.
• The relational algebra is a pure procedural language.
• Relational algebra gives a formal foundation for relational model operations.
• It is used as a basis for implementing and optimizing queries in the query processing and optimization model.
Relational Model…contd.

• Some of the relational algebra concepts are incorporated into SQL, the standard query language for RDBMSs.
• The basic set of operations for the relational model is the relational algebra.
• A sequence of relational algebra operations forms a relational algebra expression.
• The result of a retrieval is a new relation, which may have been formed from one or more relations.
• The fundamental operations in the relational algebra are Select, Project, Union, Set difference, Cartesian product and Rename.
Relational Model…contd.
Procedural language

Six basic operators

select: σ
project: Π
union: ∪
set difference: –
Cartesian product: ×
rename: ρ

The operators take one or two relations as inputs and produce a new relation as a result.
Select Operation

• Relation r:

A   B   C   D
α   α   1   7
α   β   5   7
β   β   12  3
β   β   23  10

• σ_{A=B ∧ D>5}(r):

A   B   C   D
α   α   1   7
β   β   23  10
Project Operation

• Relation r:

A   B   C
α   10  1
α   20  1
β   30  1
β   40  2

• Π_{A,C}(r), before and after duplicate rows are removed:

A   C         A   C
α   1         α   1
α   1    =    β   1
β   1         β   2
β   2
Union Operation

• Relations r, s:

r:  A   B        s:  A   B
    α   1            α   2
    α   2            β   3
    β   1

• r ∪ s:

A   B
α   1
α   2
β   1
β   3
Set Difference Operation

• Relations r, s:

r:  A   B        s:  A   B
    α   1            α   2
    α   2            β   3
    β   1

• r – s:

A   B
α   1
β   1
Cartesian-Product Operation

• Relations r, s:

r:  A   B        s:  C   D   E
    α   1            α   10  a
    β   2            β   10  a
                     β   20  b
                     γ   10  b

• r × s:

A   B   C   D   E
α   1   α   10  a
α   1   β   10  a
α   1   β   20  b
α   1   γ   10  b
β   2   α   10  a
β   2   β   10  a
β   2   β   20  b
β   2   γ   10  b
Rename Operation

• Allows us to name, and therefore to refer to, the results of relational-algebra expressions.
• Allows us to refer to a relation by more than one name.
• Example: ρ_x(E) returns the expression E under the name x.
• If a relational-algebra expression E has arity n, then ρ_{x(A1, A2, …, An)}(E) returns the result of expression E under the name x, and with the attributes renamed to A1, A2, …, An.
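The six basic operators can be sketched in a few lines of Python. This is one possible encoding, not a reference implementation: a relation is modeled as a set of tuples, each tuple a frozenset of (attribute, value) pairs, and the helper names rel, select, project, product and rename are assumptions of this sketch. The sample values follow the operator examples above, with the strings "alpha", "beta" and "gamma" standing in for the symbolic values in the tables.

```python
def rel(attrs, rows):
    """Build a relation: a set of tuples, each a frozenset of (attribute, value) pairs."""
    return {frozenset(zip(attrs, row)) for row in rows}

def select(pred, r):
    """sigma: keep the tuples for which pred(tuple-as-dict) is true."""
    return {t for t in r if pred(dict(t))}

def project(attrs, r):
    """Pi: keep only the listed attributes; the set removes duplicates."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in r}

def product(r, s):
    """Cartesian product; assumes r and s share no attribute names."""
    return {tr | ts for tr in r for ts in s}

def rename(mapping, r):
    """rho: rename attributes according to mapping (old name -> new name)."""
    return {frozenset((mapping.get(a, a), v) for a, v in t) for t in r}

# Union and set difference are Python's built-in set operators | and -.

r = rel("ABCD", [("alpha", "alpha", 1, 7), ("alpha", "beta", 5, 7),
                 ("beta", "beta", 12, 3), ("beta", "beta", 23, 10)])
selected = select(lambda t: t["A"] == t["B"] and t["D"] > 5, r)

p = rel("ABC", [("alpha", 10, 1), ("alpha", 20, 1),
                ("beta", 30, 1), ("beta", 40, 2)])
projected = project({"A", "C"}, p)

u = rel("AB", [("alpha", 1), ("alpha", 2), ("beta", 1)])
v = rel("AB", [("alpha", 2), ("beta", 3)])
union, difference = u | v, u - v

pairs = product(rel("AB", [("alpha", 1), ("beta", 2)]),
                rel("CDE", [("alpha", 10, "a"), ("beta", 10, "a"),
                            ("beta", 20, "b"), ("gamma", 10, "b")]))
renamed = rename({"A": "X"}, u)

print(len(selected), len(projected), len(union), len(difference), len(pairs))
```

Modeling tuples as sets of (attribute, value) pairs makes attribute order irrelevant, which matches the set-based definition of a relation; the price is that product needs disjoint attribute names, which is exactly what rename is for.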
Relational Algebra Queries:

• Find all loans of over Rs 1200
  σ_{amount > 1200}(loan)
• Find the loan number for each loan of an amount greater than Rs 1200
  Π_{loan_number}(σ_{amount > 1200}(loan))
• Find the names of all customers who have a loan, an account, or both, from the bank
  Π_{customer_name}(borrower) ∪ Π_{customer_name}(depositor)
Project Operation

• Find the names of all customers who have a loan at the Perryridge branch.
  Π_{customer_name}(σ_{branch_name = “Perryridge”}(σ_{borrower.loan_number = loan.loan_number}(borrower × loan)))
• Find the names of all customers who have a loan at the Perryridge branch but do not have an account at any branch of the bank.
  Π_{customer_name}(σ_{branch_name = “Perryridge”}(σ_{borrower.loan_number = loan.loan_number}(borrower × loan))) – Π_{customer_name}(depositor)
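The two Perryridge queries can be traced step by step on a toy instance. Everything in the sketch below is an assumption made for illustration: the sample rows are invented, attributes are qualified as relation.attribute so that borrower × loan has no name clash, and the final projection returns a bare set of names rather than a one-column relation.

```python
def rel(name, attrs, rows):
    """Relation whose attributes are qualified as name.attribute."""
    return {frozenset((f"{name}.{a}", v) for a, v in zip(attrs, row))
            for row in rows}

def select(pred, r):
    """sigma: keep the tuples for which pred(tuple-as-dict) is true."""
    return {t for t in r if pred(dict(t))}

def names(attr, r):
    """Project onto one attribute and return the bare set of values."""
    return {dict(t)[attr] for t in r}

# Made-up sample data.
loan = rel("loan", ["loan_number", "branch_name", "amount"],
           [("L-15", "Perryridge", 1500), ("L-16", "Perryridge", 1300),
            ("L-17", "Downtown", 1000)])
borrower = rel("borrower", ["customer_name", "loan_number"],
               [("Hayes", "L-15"), ("Adams", "L-16"), ("Jones", "L-17")])
depositor = rel("depositor", ["customer_name", "account_number"],
                [("Hayes", "A-102"), ("Jones", "A-101")])

pairs = {b | l for b in borrower for l in loan}   # borrower x loan
matched = select(
    lambda t: t["borrower.loan_number"] == t["loan.loan_number"], pairs)
perryridge = select(lambda t: t["loan.branch_name"] == "Perryridge", matched)

with_loan = names("borrower.customer_name", perryridge)
loan_but_no_account = with_loan - names("depositor.customer_name", depositor)

print(sorted(with_loan))
print(sorted(loan_but_no_account))
```

The product has 3 × 3 = 9 tuples before the join condition cuts it down to 3, which is why a real system would not evaluate the expression literally.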
Project Operation

• Find the names of all customers who have a loan at the Perryridge branch. Two equivalent expressions:
  Query 1: Π_{customer_name}(σ_{branch_name = “Perryridge”}(σ_{borrower.loan_number = loan.loan_number}(borrower × loan)))
  Query 2: Π_{customer_name}(σ_{loan.loan_number = borrower.loan_number}((σ_{branch_name = “Perryridge”}(loan)) × borrower))
Set-Intersection Operation

• Additional Operations
  – Set intersection (∩)
  – Natural join (⋈)
  – Aggregation
  – Outer join
  – Division
Set-Intersection Operation

• Relations r, s:

r:  A   B        s:  A   B
    α   1            α   2
    α   2            β   3
    β   1

• r ∩ s:

A   B
α   2
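Set intersection is listed as an additional operation because it adds no expressive power: it can be written with set difference alone, as r ∩ s = r – (r – s). A minimal check on the example relations, using "alpha" and "beta" as stand-ins for the symbolic values:

```python
r = {("alpha", 1), ("alpha", 2), ("beta", 1)}
s = {("alpha", 2), ("beta", 3)}

def intersect(r, s):
    """Set intersection written with set difference only: r - (r - s)."""
    return r - (r - s)

result = intersect(r, s)
print(result)
print(result == r & s)  # agrees with Python's built-in intersection
```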
Natural Join Operation

• Relations r, s:

r:  A   B   C   D        s:  B   D   E
    α   1   α   a            1   a   α
    β   2   γ   a            3   a   β
    γ   4   β   b            1   a   γ
    α   1   γ   a            2   b   δ
    δ   2   β   b            3   b   ε

• r ⋈ s:

A   B   C   D   E
α   1   α   a   α
α   1   α   a   γ
α   1   γ   a   α
α   1   γ   a   γ
δ   2   β   b   δ
Natural-Join Operation
• Notation: r ⋈ s
• Let r and s be relations on schemas R and S respectively. Then, r ⋈ s is a relation on schema R ∪ S obtained as follows:
  – Consider each pair of tuples tr from r and ts from s.
  – If tr and ts have the same value on each of the attributes in R ∩ S, add a tuple t to the result, where
    • t has the same value as tr on r
    • t has the same value as ts on s
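The pairwise procedure above translates almost line for line into code. The sketch below reuses the set-of-frozensets encoding as an assumption (helper names rel, attributes and natural_join are hypothetical) and runs on the r(A, B, C, D) and s(B, D, E) example, with spelled-out Greek names standing in for the symbolic values.

```python
def rel(attrs, rows):
    """Build a relation: a set of tuples, each a frozenset of (attribute, value) pairs."""
    return {frozenset(zip(attrs, row)) for row in rows}

def attributes(r):
    """The schema of r, recovered from its tuples."""
    return {a for t in r for a, _ in t}

def natural_join(r, s):
    common = attributes(r) & attributes(s)       # R ∩ S
    result = set()
    for tr in r:                                 # consider each pair (tr, ts)
        for ts in s:
            dr, ds = dict(tr), dict(ts)
            if all(dr[a] == ds[a] for a in common):
                result.add(tr | ts)              # tuple on schema R ∪ S
    return result

r = rel("ABCD", [("alpha", 1, "alpha", "a"), ("beta", 2, "gamma", "a"),
                 ("gamma", 4, "beta", "b"), ("alpha", 1, "gamma", "a"),
                 ("delta", 2, "beta", "b")])
s = rel("BDE", [(1, "a", "alpha"), (3, "a", "beta"), (1, "a", "gamma"),
                (2, "b", "delta"), (3, "b", "epsilon")])

joined = natural_join(r, s)
print(len(joined))
print(sorted(attributes(joined)))
```

Since matching tuples agree on the common attributes, the union tr | ts keeps a single copy of B and D, which is the projection step in the algebraic definition.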

• Example:
  R = (A, B, C, D)
  S = (E, B, D)
  – Result schema = (A, B, C, D, E)
  – r ⋈ s is defined as:
    Π_{r.A, r.B, r.C, r.D, s.E}(σ_{r.B = s.B ∧ r.D = s.D}(r × s))
Natural Join
• Find the name of all customers who have a loan at the bank and the loan amount
  Π_{customer_name, loan_number, amount}(borrower ⋈ loan)

Project Operation

• Find all customers who have an account from at least the “Downtown” and the “Uptown” branches.
  Π_{customer_name}(σ_{branch_name = “Downtown”}(depositor ⋈ account)) ∩ Π_{customer_name}(σ_{branch_name = “Uptown”}(depositor ⋈ account))
Project Operation
• Find the largest account balance
  – Strategy:
    • Find those balances that are not the largest
      – Rename the account relation as d so that we can compare each account balance with all others
    • Use set difference to find those account balances that were not found in the earlier step.
  Π_{balance}(account) – Π_{account.balance}(σ_{account.balance < d.balance}(account × ρ_d(account)))
Aggregate Functions and Operations
• Aggregate functions summarize data from tables.
• An aggregation function takes a collection of values and returns a single value as a result.
  avg: average value
  min: minimum value
  max: maximum value
  sum: sum of values
  count: number of values
• Aggregate operation in relational algebra:
  _{G1, G2, …, Gn} g _{F1(A1), F2(A2), …, Fn(An)}(E)
  – E is any relational-algebra expression
  – G1, G2, …, Gn is a list of attributes on which to group
  – Each Fi is an aggregate function
  – Each Ai is an attribute name
Aggregate Operation

• Relation r:

A   B   C
α   α   7
α   β   7
β   β   3
β   β   10

• g_{sum(c)}(r):

sum(c)
27
Aggregate Operation
• Relation account grouped by branch_name:

branch_name   account_number   balance
Perryridge    A-102            400
Perryridge    A-201            900
Brighton      A-217            750
Brighton      A-215            750
Redwood       A-222            700

• _{branch_name} g _{sum(balance)}(account):

branch_name   sum(balance)
Perryridge    1300
Brighton      1500
Redwood       700
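The grouped sum above can be reproduced with a dictionary keyed by the grouping attribute. This is a minimal sketch with a hypothetical helper grouped_sum; rows are kept in a list rather than a set so that the two equal Brighton balances are both counted.

```python
from collections import defaultdict

def grouped_sum(rows, group_by, target):
    """Sum the target attribute separately for each value of group_by."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_by]] += row[target]
    return dict(totals)

account = [
    {"branch_name": "Perryridge", "account_number": "A-102", "balance": 400},
    {"branch_name": "Perryridge", "account_number": "A-201", "balance": 900},
    {"branch_name": "Brighton", "account_number": "A-217", "balance": 750},
    {"branch_name": "Brighton", "account_number": "A-215", "balance": 750},
    {"branch_name": "Redwood", "account_number": "A-222", "balance": 700},
]

by_branch = grouped_sum(account, "branch_name", "balance")
# With no grouping attributes, the aggregate collapses to a single value.
overall = sum(row["balance"] for row in account)

print(by_branch)
print(overall)
```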
Outer Join
• An extension of the join operation that avoids loss of information.
• Computes the join and then adds to the result the tuples from one relation that do not match any tuple in the other relation.

Example:

Relation loan:

loan_number   branch_name   amount
L-170         Downtown      3000
L-230         Redwood       4000
L-260         Perryridge    1700

Relation borrower:

customer_name   loan_number
Jones           L-170
Smith           L-230
Hayes           L-155

• Join: loan ⋈ borrower

loan_number   branch_name   amount   customer_name
L-170         Downtown      3000     Jones
L-230         Redwood       4000     Smith

• Left outer join: loan ⟕ borrower

loan_number   branch_name   amount   customer_name
L-170         Downtown      3000     Jones
L-230         Redwood       4000     Smith
L-260         Perryridge    1700     null
Outer Join
• Right outer join: loan ⟖ borrower

loan_number   branch_name   amount   customer_name
L-170         Downtown      3000     Jones
L-230         Redwood       4000     Smith
L-155         null          null     Hayes

• Full outer join: loan ⟗ borrower

loan_number   branch_name   amount   customer_name
L-170         Downtown      3000     Jones
L-230         Redwood       4000     Smith
L-260         Perryridge    1700     null
L-155         null          null     Hayes
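The three outer joins differ only in which unmatched tuples are padded with nulls and appended to the ordinary join. The sketch below hard-codes the loan and borrower schemas from the example for brevity, uses Python's None for null, and uses function names that are assumptions of this sketch.

```python
loan = [("L-170", "Downtown", 3000), ("L-230", "Redwood", 4000),
        ("L-260", "Perryridge", 1700)]
borrower = [("Jones", "L-170"), ("Smith", "L-230"), ("Hayes", "L-155")]

def join(loan, borrower):
    """Natural join on loan_number: (loan_number, branch_name, amount, customer_name)."""
    return [(ln, branch, amount, name)
            for ln, branch, amount in loan
            for name, bln in borrower if ln == bln]

def left_outer(loan, borrower):
    borrowed = {bln for _, bln in borrower}
    unmatched = [(ln, branch, amount, None)
                 for ln, branch, amount in loan if ln not in borrowed]
    return join(loan, borrower) + unmatched

def right_outer(loan, borrower):
    known = {ln for ln, _, _ in loan}
    unmatched = [(bln, None, None, name)
                 for name, bln in borrower if bln not in known]
    return join(loan, borrower) + unmatched

def full_outer(loan, borrower):
    # Left outer join plus the padded borrower rows that found no loan.
    matched = len(join(loan, borrower))
    return left_outer(loan, borrower) + right_outer(loan, borrower)[matched:]

for row in full_outer(loan, borrower):
    print(row)
```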
Null Values
• It is possible for tuples to have a null value, denoted by null, for some of their attributes.
• null signifies an unknown value or that a value does not exist.
• The result of any arithmetic expression involving null is null.
• Aggregate functions simply ignore null values (as in SQL).
• For duplicate elimination and grouping, null is treated like any other value, and two nulls are assumed to be the same (as in SQL).
• Comparisons with null values return the special truth value unknown.
  – If false were used instead of unknown, then not (A < 5) would not be equivalent to A >= 5.
• Three-valued logic using the truth value unknown:
  – OR: (unknown or true) = true,
        (unknown or false) = unknown,
        (unknown or unknown) = unknown
  – AND: (true and unknown) = unknown,
         (false and unknown) = false,
         (unknown and unknown) = unknown
  – NOT: (not unknown) = unknown
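The truth tables above are small enough to encode directly. In the sketch below, Python's None stands for unknown (and for a null operand); the function names and3, or3, not3 and compare are assumptions of this sketch, not part of any library.

```python
import operator

UNKNOWN = None  # stands for the truth value unknown

def and3(a, b):
    if a is False or b is False:
        return False
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return True

def or3(a, b):
    if a is True or b is True:
        return True
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return False

def not3(a):
    return UNKNOWN if a is UNKNOWN else not a

def compare(op, x, y):
    """A comparison involving null yields unknown."""
    return UNKNOWN if x is None or y is None else op(x, y)

A = None  # a null attribute value
print(not3(compare(operator.lt, A, 5)))  # not (A < 5)
print(compare(operator.ge, A, 5))        # A >= 5
```

Both expressions evaluate to unknown when A is null, so the equivalence between not (A < 5) and A >= 5 is preserved; returning false from the comparison would make the first expression true and the second false.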
Division Operation

• Notation: r ÷ s
• Suited to queries that include the phrase “for all”.
• Let r and s be relations on schemas R and S respectively, where
  – R = (A1, …, Am, B1, …, Bn)
  – S = (B1, …, Bn)
  The result of r ÷ s is a relation on schema R – S = (A1, …, Am):
  r ÷ s = { t | t ∈ Π_{R–S}(r) ∧ ∀ u ∈ s (tu ∈ r) }
  where tu means the concatenation of tuples t and u to produce a single tuple.
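The set-builder definition can be read as a filter: keep each candidate t from Π_{R–S}(r) for which every u in s is paired with t somewhere in r. The sketch below simplifies to the binary case r(A, B) ÷ s(B), with r as a set of pairs and s as a set of values; the sample data are illustrative, with spelled-out Greek names as placeholder values.

```python
def divide(r, s):
    """r ÷ s for a binary r(A, B) and a unary s(B), following the definition."""
    candidates = {a for a, _ in r}                 # Π_{R-S}(r)
    return {a for a in candidates
            if all((a, b) in r for b in s)}        # for all u in s: tu in r

r = {("alpha", 1), ("alpha", 2), ("alpha", 3), ("beta", 1), ("gamma", 1),
     ("delta", 1), ("delta", 3), ("delta", 4), ("epsilon", 6), ("epsilon", 1),
     ("beta", 2)}
s = {1, 2}

print(sorted(divide(r, s)))
```

Only the A values paired with both 1 and 2 survive. Note that dividing by an empty s returns every candidate, since the "for all" condition is then vacuously true.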
Division Operation

• Relations r, s:

r:  A   B        s:  B
    α   1            1
    α   2            2
    α   3
    β   1
    γ   1
    δ   1
    δ   3
    δ   4
    ε   6
    ε   1
    β   2

• r ÷ s:

A
α
β
Division Operation

• Relations r, s:

r:  A   B   C   D   E        s:  D   E
    α   a   α   a   1            a   1
    α   a   γ   a   1            b   1
    α   a   γ   b   1
    β   a   γ   a   1
    β   a   γ   b   3
    γ   a   γ   a   1
    γ   a   γ   b   1
    γ   a   β   b   1

• r ÷ s:

A   B   C
α   a   γ
γ   a   γ
Division Operation
• Property
  – Let q = r ÷ s
  – Then q is the largest relation satisfying q × s ⊆ r
• Definition in terms of the basic algebra operations
  Let r(R) and s(S) be relations, and let S ⊆ R
  r ÷ s = Π_{R–S}(r) – Π_{R–S}((Π_{R–S}(r) × s) – Π_{R–S,S}(r))
  To see why:
  – Π_{R–S,S}(r) simply reorders attributes of r
  – Π_{R–S}((Π_{R–S}(r) × s) – Π_{R–S,S}(r)) gives those tuples t in Π_{R–S}(r) such that for some tuple u ∈ s, tu ∉ r.
Natural Join and Division

• Find all customers who have an account at all branches located in Brooklyn city.
  Π_{customer_name, branch_name}(depositor ⋈ account) ÷ Π_{branch_name}(σ_{branch_city = “Brooklyn”}(branch))
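The Brooklyn query can be evaluated with the earlier definition of division in terms of the basic operators: take every candidate customer, pair each with every Brooklyn branch, subtract the pairs that actually occur, and discard the customers that still have a missing pair. In the sketch below the (customer_name, branch_name) pairs stand for the already computed Π_{customer_name, branch_name}(depositor ⋈ account); those pairs, the set of Brooklyn branches, and the name divide_basic are all made up for illustration.

```python
def divide_basic(r, s):
    """r ÷ s for binary r and unary s, using only projection, product and difference."""
    heads = {a for a, _ in r}                         # Π_{R-S}(r)
    all_pairs = {(a, b) for a in heads for b in s}    # Π_{R-S}(r) × s
    missing = all_pairs - r                           # required pairs absent from r
    return heads - {a for a, _ in missing}            # drop customers with a gap

# Hypothetical result of Π_{customer_name, branch_name}(depositor ⋈ account).
holds_account_at = {
    ("Johnson", "Brighton"), ("Johnson", "Downtown"),
    ("Smith", "Brighton"), ("Hayes", "Downtown"),
}
# Hypothetical result of Π_{branch_name}(σ_{branch_city = "Brooklyn"}(branch)).
brooklyn_branches = {"Brighton", "Downtown"}

print(divide_basic(holds_account_at, brooklyn_branches))
```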
Natural Join
• Find all customers who have an account from at least the “Downtown” and the “Uptown” branches.
  Query 1: Π_{customer_name}(σ_{branch_name = “Downtown”}(depositor ⋈ account)) ∩ Π_{customer_name}(σ_{branch_name = “Uptown”}(depositor ⋈ account))
  Query 2: Π_{customer_name, branch_name}(depositor ⋈ account) ÷ ρ_{temp(branch_name)}({(“Downtown”), (“Uptown”)})
• Find all customers who have an account at all branches located in Brooklyn city
  Π_{customer_name, branch_name}(depositor ⋈ account) ÷ Π_{branch_name}(σ_{branch_city = “Brooklyn”}(branch))
