W1-3 RelationalModel
Relational Model Concepts
●
The relational data model was first introduced by Ted Codd
of IBM Research in 1970
●
The model uses the concept of a mathematical relation as its
basic building block.
●
The relational model represents the database as a collection
of relations.
●
The relation resembles a table of values or, to some extent, a
flat file of records.
●
There are important differences between relations and files
●
A relation is thought of as a table of values
●
In the relational model terminology, a row is called a tuple, a
column header is called an attribute, and the table is called a
relation. The data type describing the types of values that can
appear in each column is represented by a domain of possible
values.
[Figure: The UNIVERSITY database]
Notation and terminology
●
Domain D
– A domain D is a set of atomic values.
– By atomic we mean that each value in the domain is indivisible
●
It is also useful to specify a name for the domain, to help in
interpreting its values
●
Examples of logical definitions of domains:
– Social_security_numbers. The set of valid nine-digit Social Security
numbers.
– Names : The set of character strings that represent names of persons.
– Academic_department_names . The set of academic department names
in a university, such as Computer Science, Economics, and Physics.
– ….
●
A data type or format is also specified for each domain.
– EX: the data type for Employee_ages is an integer number between 15
and 80.
●
A domain is thus given a name, data type, and format.
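The idea of a domain as a named set of atomic values can be sketched in Python as a name plus a membership test. This is only an illustration: the `Domain` class and its field names are hypothetical, and the digits-only string format for Social Security numbers is an assumption rather than something the slides specify.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Domain:
    """A named set of atomic values, described by a membership test."""
    name: str
    contains: Callable[[Any], bool]


# Employee_ages: an integer between 15 and 80 (the range given above).
employee_ages = Domain(
    "Employee_ages",
    lambda v: isinstance(v, int) and not isinstance(v, bool) and 15 <= v <= 80,
)

# Social_security_numbers: nine digits. Storing them as a digits-only
# string is an assumption made for this sketch.
social_security_numbers = Domain(
    "Social_security_numbers",
    lambda v: isinstance(v, str) and len(v) == 9 and v.isascii() and v.isdigit(),
)

print(employee_ages.contains(42))                      # True
print(employee_ages.contains(12))                      # False
print(social_security_numbers.contains("305612435"))   # True
```

The membership test plays the role of the data type and format; the name only helps in interpreting the values.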
Characteristics of Relations
●
A relation schema R,
– Denoted by R(A1, A2, ...,An) Example STUDENT(Name, Ssn, ….)
– Made up of a relation name R and a list of attributes, A1, A2, ..., An
– Each attribute Ai
●
is the name of a role played by some domain D in the relation schema R.
●
D is called the domain of Ai and is denoted by dom(Ai ).
– R is called the name of this relation.
●
The degree (or arity) of a relation is the number of attributes n of its relation schema.
– A relation of degree seven, which stores information about university students, would contain
seven attributes describing each student, as follows: STUDENT(Name, Ssn, Home_phone,
Address, Office_phone, Age, Gpa)
● A relation (or relation state) r of the relation schema R(A1, A2, ...,An), also denoted by
r(R),
– Set of n-tuples (“rows”) r = {t1, t2, ..., tm}
● Each n-tuple t is an ordered list of n values t = <v1, v2, ..., vn>, where each value vi, 1 ≤ i ≤ n, is an element of
dom(Ai) or is a special NULL value.
●
The i th value in tuple t, which corresponds to the attribute Ai , is referred to as t[Ai ] or t.Ai (or t[i]).
●
NULL values represent attributes whose values are unknown or do not exist for some individual STUDENT
tuple.
●
Ordering of tuples in a relation
– Relation defined as a set of tuples
– Elements have no order among them
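One way to picture a relation state is as a Python set of tuples, which has no inherent order. The sketch below is an illustration only; it shortens STUDENT to three attributes for brevity, and the variable names are hypothetical.

```python
# Schema STUDENT(Name, Ssn, Age): an ordered list of attribute names,
# shortened here from the seven-attribute example for brevity.
STUDENT_SCHEMA = ("Name", "Ssn", "Age")

# A relation state r(STUDENT) is a set of n-tuples.
state_1 = {
    ("Benjamin Bayer", "305-61-2435", 19),
    ("Barbara Benson", "533-69-1238", 19),
}
# The same tuples written in a different order.
state_2 = {
    ("Barbara Benson", "533-69-1238", 19),
    ("Benjamin Bayer", "305-61-2435", 19),
}

degree = len(STUDENT_SCHEMA)   # the number of attributes n
print(degree)                  # 3
print(state_1 == state_2)      # True: tuples in a relation have no order

# Adding a tuple that is already present leaves the set unchanged.
state_1.add(("Benjamin Bayer", "305-61-2435", 19))
print(len(state_1))            # 2
```

The values inside each tuple remain ordered, because their positions correspond to the attributes of the schema.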
[Figure: attribute annotations: same concept; same domain, different role]
●
Values and NULLs in tuples
– Each value in a tuple is atomic: it is not divisible into components
– Flat relational model
●
First normal form assumption → composite and multivalued attributes are
not allowed
– Multivalued attributes
●
Must be represented by separate relations
– Composite attributes
●
Represented only by simple component attributes in basic relational
model
●
NULL values
– Represent the values of attributes that may be unknown or may
not apply to a tuple
– Meanings for NULL values
●
Value unknown
●
Value exists but is not available
●
Attribute does not apply to this tuple (also known as value undefined). For example, in Figure 3.1, some STUDENT tuples have
NULL for their office phones because they do not have an office
●
Interpretation (meaning) of a relation
– Assertion
●
Each tuple in the relation is a fact or a particular instance of the assertion
– Predicate
●
Values in each tuple interpreted as values that satisfy predicate
●
For example, the schema of the STUDENT relation of Figure
3.1 asserts that
– a student entity has a Name , Ssn , Home_phone , Address ,
Office_phone , Age, and Gpa . (Gpa: grade point average)
●
For example, the first tuple in Figure 3.1 asserts the fact that
– there is a STUDENT whose Name is Benjamin Bayer, Ssn is 305-61-2435,
Age is 19, and so on.
●
Notice that some relations may represent facts about entities,
whereas other relations may represent facts about relationships.
– For example, a relation schema MAJORS (Student_ssn ,
Department_code ) asserts that students major in academic
disciplines.
Relational Model Notation
●
Relation schema R of degree n
– Denoted by R(A1, A2, ..., An)
●
Uppercase letters Q, R, S
– Denote relation names
●
Lowercase letters q, r, s
– Denote relation states
●
Letters t, u, v
– Denote tuples
●
Name of a relation schema: STUDENT
– Indicates the current set of tuples in that relation
●
Notation: STUDENT(Name, Ssn, ...)
– Refers only to relation schema
●
Attribute A can be qualified with the relation name R to which it belongs
– Using the dot notation R.A
●
n-tuple t in a relation r(R)
– Denoted by t = <v1, v2, ..., vn>
– vi is the value corresponding to attribute Ai
●
Component values of tuples:
– t[Ai] and t.Ai refer to the value vi in t for attribute Ai
– t[Au, Aw, ..., Az] and t.(Au, Aw, ..., Az) refer to the subtuple
of values <vu, vw, ..., vz> from t corresponding to the
attributes specified in the list
●
Consider the tuple t = <‘Barbara Benson’, ‘533-69-1238’,
‘(817)839-8461’, ‘7384 Fontana Lane’, NULL,
19, 3.25> from the STUDENT relation in Figure 3.1
– t[Name] = <‘Barbara Benson’>
– t[Ssn, Gpa, Age] = <‘533-69-1238’, 3.25, 19>
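The subtuple notation can be mimicked with a small helper that looks up attribute positions in the schema. This is a minimal sketch; the `project` function is a hypothetical name, and Python's `None` stands in for NULL.

```python
STUDENT_SCHEMA = ("Name", "Ssn", "Home_phone", "Address",
                  "Office_phone", "Age", "Gpa")

# The Barbara Benson tuple from the example; None stands in for NULL.
t = ("Barbara Benson", "533-69-1238", "(817)839-8461",
     "7384 Fontana Lane", None, 19, 3.25)


def project(tup, schema, *attrs):
    """Return the subtuple t[Au, ..., Az] in the order the attributes are listed."""
    return tuple(tup[schema.index(a)] for a in attrs)


print(project(t, STUDENT_SCHEMA, "Name"))               # ('Barbara Benson',)
print(project(t, STUDENT_SCHEMA, "Ssn", "Gpa", "Age"))  # ('533-69-1238', 3.25, 19)
```

The order of the result follows the order of the attribute list, not the order of the schema, which matches t[Ssn, Gpa, Age] in the example.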
Relational Model Constraints and Relational
Database Schemas
●
The state of the whole database will correspond
to the states of all its relations at a particular
point in time. There are generally many
restrictions or constraints on the actual values in
a database state.
●
Constraints
– Restrictions on the actual values in a database state
– Derived from the rules in the miniworld that the
database represents
●
Inherent model-based constraints or implicit
constraints
– Inherent in the data model
●
Schema-based constraints or explicit
constraints
– Can be directly expressed in schemas of the data
model
– The schema-based constraints include
●
domain constraints
●
key constraints, constraints on NULLs
●
entity integrity constraints
●
referential integrity constraints.
●
Application-based or semantic constraints or
business rules
– Cannot be directly expressed in schemas
– Expressed and enforced by application programs
1 Domain Constraints
●
Domain constraints specify that within each
tuple, the value of each attribute A must be an
atomic value from the domain dom(A).
●
The data types associated with domains typically include:
– Numeric data types for integers and real numbers
– Characters
– Booleans
– Fixed-length strings
– Variable-length strings
– Date, time, timestamp
– Money
– Other special data types
2 Key Constraints and Constraints on NULL Values
●
A relation is defined as a set of tuples. By definition, all elements of a set are
distinct; hence, all tuples in a relation must also be distinct.
●
No two tuples can have the same combination of values for all their attributes.
●
Superkey SK is a subset of attributes
– No two distinct tuples in any state r of R can have the same value for SK
●
Key satisfies two properties:
– Two distinct tuples in any state of relation cannot have identical values for (all)
attributes in key
– Minimal superkey
●
Cannot remove any attributes and still have uniqueness constraint in above condition hold
●
A key is also a superkey, but not vice versa.
Consider the STUDENT relation of Figure 3.1.
The attribute set { Ssn } is a key of STUDENT because no two student tuples
can have the same value for Ssn .
Any set of attributes that includes Ssn —for example, { Ssn , Name , Age }—is a
superkey. However, the superkey { Ssn , Name , Age } is not a key of STUDENT
because removing Name or Age or both from the set still leaves us with a
superkey.
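The superkey and key definitions can be sketched as checks over a single relation state. This is an illustration under stated assumptions: `is_superkey` and `is_key` are hypothetical helper names, rows are modeled as dictionaries, and the sample state is shortened to three attributes. A key is a property of every valid state, so a check on one state can refute a key but cannot prove one.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two distinct rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_key(rows, attrs):
    """True if attrs is a superkey and no proper subset of it is one."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(len(attrs))
        for subset in combinations(attrs, size)
    )


students = [
    {"Ssn": "305-61-2435", "Name": "Benjamin Bayer", "Age": 19},
    {"Ssn": "533-69-1238", "Name": "Barbara Benson", "Age": 19},
]

print(is_superkey(students, ("Ssn", "Name", "Age")))  # True
print(is_key(students, ("Ssn", "Name", "Age")))       # False: Ssn alone suffices
print(is_key(students, ("Ssn",)))                     # True
print(is_superkey(students, ("Age",)))                # False: both ages are 19
```

In this tiny sample Name also happens to be unique, which shows why keys are declared from the meaning of the attributes rather than inferred from one state.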
●
Candidate key
– A relation schema may have more than one key; each of them is called a candidate key
●
Primary key of the relation
– Designated from among the candidate keys
– It is usually better to choose a primary key with a single attribute or a
small number of attributes
– Primary key attributes are underlined in the schema
●
Another constraint on attributes specifies whether NULL values
are or are not permitted. For example, if every STUDENT tuple
must have a valid, non- NULL value for the Name attribute, then
Name of STUDENT is constrained to be NOT NULL .
DBMS Keys
●
Key: attribute or set of attributes that help to uniquely identify a row in
a table. There are different types of keys depending on their
functionality.
●
Candidate key: the minimal set of attributes which can uniquely
identify a tuple.
– The value of a candidate key is unique for every tuple; only the candidate key chosen as the primary key is required to be non-NULL
– The Primary key should be selected from the candidate keys. Every table must
have at least a single candidate key
– It may have multiple attributes
– There can be more than one Candidate Key in a table
– Each Candidate Key can work as Primary Key
●
Super key: a set of one or more attributes that uniquely identifies rows
in a table. Every super key contains at least one candidate key.
– Adding zero or more attributes to a candidate key generates a super key
– A candidate key is a super key, but the converse is not true
– The added attributes need not be unique by themselves; uniqueness comes from the candidate key that the super key contains.
DBMS Keys
●
Primary Key: Primary key is a candidate key that is most
appropriate to become the main key for any table.
– It is a key that can uniquely identify each record in a table.
– It does not accept duplicate or NULL values.
– Only one candidate key can be the primary key.
●
Unique Key: The unique key is a set of one or more columns or
fields of a table that can uniquely identify a record in the table.
– Other than the primary key there can also be other unique fields in a table
– The unique key cannot have duplicate values but, unlike the primary key, may accept
NULL; how many NULLs are allowed depends on the DBMS (SQL Server allows one, while PostgreSQL, MySQL, and SQLite allow several).
●
Alternate Key or Secondary Key: The candidate keys that are
not selected as the primary key are known as alternate keys or
secondary keys.
DBMS Keys
●
Compound Key: A compound key has two or more fields that together uniquely
identify a specific record.
– Each column may not be unique by itself within the table. However,
when combined with the other column or columns, the combination becomes
unique.
●
Composite Key: A key that has multiple attributes to uniquely identify rows in a
table is called a composite key.
– The difference between the compound and the composite key is that every attribute of a compound
key is a foreign key in its own right, whereas the attributes of a composite key may or may not be foreign keys. A
compound key is thus a composite key for which each attribute that makes up the key is a foreign
key.
●
Foreign Key: A foreign key is a field (or set of fields) in one table that refers to the primary key of another
table. It can contain NULLs and duplicate values.
– The relation being referenced is called the referenced relation, and the corresponding
attribute is called the referenced attribute. The relation that refers to the referenced relation
is called the referencing relation, and the corresponding attribute is called the referencing attribute.
– The referenced attribute of the referenced relation should be its primary key.
– The foreign key is useful in linking two tables together
– Foreign keys help us to maintain data integrity. Every relationship in the model needs to be
supported by a foreign key
– Unlike the primary key of a relation, a foreign key can
be NULL and can contain duplicate values; that is, it need not satisfy the uniqueness constraint.
DBMS Keys
●
Surrogate Key: An artificial key that aims to uniquely identify each
record is called a surrogate key. This kind of key is created
when no natural primary key is available.
– It does not lend any meaning to the data in the table. A surrogate key is
usually an integer.
– Surrogate keys are used when:
●
No attribute qualifies as a primary key.
●
The primary key is too big or complicated.
●
Non-key Attributes: Non-key attributes are the attributes or fields
of a table, other than candidate key attributes/fields in a table.
●
Non-Prime Attributes: Non-prime attributes are attributes that are not
members of any candidate key (a prime attribute is a member of some candidate key).
References
– https://www.geeksforgeeks.org/types-of-keys-in-relational-model-candidate-super-primary-alternate-and-foreign/
– https://www.guru99.com/dbms-keys.html
– https://www.csestack.org/different-types-database-keys-example/
– https://www.studytonight.com/dbms/database-key.php
DBMS Keys (figure labels: primary key, super key, candidate key, unique key, alternate (secondary) key, surrogate key)
STUDENT
student_id | name | phone   | age | gender | email
2345A      | Juli | 0000099 | 22  | F      | juli@mail
STUDENT_COURSE (the surrogate key label appears next to column N)
student_id | course_id | course_name     | marks | N
2345A      | DSA       | Data Structures | 4.75  | 1
2345A      | DBS       | Databases       | 9.8   | 2
Other Types of Constraints
●
State constraints
– Define the constraints that a valid state of the database must
satisfy
●
Transition constraints
– Defined to deal with state changes in the database
●
Semantic integrity constraints
– May have to be specified and enforced on a relational database
– Use triggers and assertions
– More common to check for these types of constraints within the
application programs
●
Functional dependency constraint
– Establishes a functional relationship between two sets of
attributes X and Y
– Value of X determines a unique value of Y
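A functional dependency X → Y can be checked against one relation state by confirming that equal X values never come with different Y values. The sketch below is an illustration only: `fd_holds` is a hypothetical helper, and the rows reuse the STUDENT_COURSE sample values. As with keys, a single state can refute a dependency but cannot establish it for all states.

```python
def fd_holds(rows, x, y):
    """Check X -> Y on one state: rows that agree on X must agree on Y."""
    seen = {}
    for row in rows:
        x_val = tuple(row[a] for a in x)
        y_val = tuple(row[a] for a in y)
        # Remember the first Y value seen for this X value and compare later ones.
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


student_course = [
    {"student_id": "2345A", "course_id": "DSA",
     "course_name": "Data Structures", "marks": 4.75},
    {"student_id": "2345A", "course_id": "DBS",
     "course_name": "Databases", "marks": 9.8},
]

print(fd_holds(student_course, ("course_id",), ("course_name",)))  # True
print(fd_holds(student_course, ("student_id",), ("marks",)))       # False
```

The second check fails because the same student_id appears with two different marks values.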
Update Operations, Transactions, and Dealing
with Constraint Violations
●
Operations of the relational model can be
categorized into retrievals and updates
●
Basic operations that change the states of
relations in the database:
– INSERT. Insert is used to insert one or more new tuples
in a relation
– DELETE. Delete is used to delete tuples
– UPDATE (or Modify). Update (or Modify) is used to
change the values of some attributes in existing tuples.
●
Whenever these operations are applied, the
integrity constraints specified on the relational
database schema should not be violated.
The INSERT Operation
●
Provides a list of attribute values for a new tuple t that is
to be inserted into a relation R
●
Can violate any of the four types of constraints
1 – Domain constraints can be violated if an attribute value is given
that does not appear in the corresponding domain or is not of
the appropriate data type.
2 – Key constraints can be violated if a key value in the new tuple t
already exists in another tuple in the relation r(R).
3 – Entity integrity can be violated if any part of the primary key of
the new tuple t is NULL .
4 – Referential integrity can be violated if the value of any foreign
key in t refers to a tuple that does not exist in the referenced
relation.
●
If an insertion violates one or more constraints
– Default option is to reject the insertion
examples
●
Operation:
– Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, NULL , ‘1960-04-05’, ‘6357 Windy Lane, Katy, TX’, F,
28000, NULL , 4> into EMPLOYEE .
– Result: This insertion violates the entity integrity constraint ( NULL for the primary
key Ssn ), so it is rejected.
●
Operation:
– Insert <‘Alicia’, ‘J’, ‘Zelaya’, ‘999887777’, ‘1960-04-05’, ‘6357 Windy Lane, Katy, TX’,
F, 28000, ‘987654321’, 4> into EMPLOYEE .
– Result: This insertion violates the key constraint because another tuple with the
same Ssn value already exists in the EMPLOYEE relation, and so it is rejected.
●
Operation:
– Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, ‘677678989’, ‘1960-04-05’, ‘6357 Windswept, Katy,
TX’, F, 28000, ‘987654321’, 7> into EMPLOYEE .
– Result: This insertion violates the referential integrity constraint specified on Dno
in EMPLOYEE because no corresponding referenced tuple exists in DEPARTMENT
with Dnumber = 7.
●
Operation:
– Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, ‘677678989’, ‘1960-04-05’, ‘6357 Windy Lane, Katy,
TX’, F, 28000, NULL , 4> into EMPLOYEE .
– Result: This insertion satisfies all constraints, so it is acceptable.
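The four insertions above can be replayed against an in-memory SQLite database using Python's standard `sqlite3` module. This is a sketch under stated assumptions: EMPLOYEE is cut down to four columns (Lname, Ssn, Salary, Dno), the set of department numbers and the existing employee's salary are illustrative values, and `try_insert` is a hypothetical helper. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`, and it needs an explicit NOT NULL on a non-integer primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE DEPARTMENT (Dnumber INTEGER PRIMARY KEY);
    CREATE TABLE EMPLOYEE (
        Lname  TEXT NOT NULL,
        Ssn    TEXT PRIMARY KEY NOT NULL,  -- NOT NULL spelled out for entity integrity
        Salary INTEGER,
        Dno    INTEGER REFERENCES DEPARTMENT(Dnumber)
    );
    -- Illustrative starting state (assumed values).
    INSERT INTO DEPARTMENT VALUES (1), (4), (5);
    INSERT INTO EMPLOYEE VALUES ('Zelaya', '999887777', 25000, 4);
""")


def try_insert(row):
    """Attempt one INSERT and report whether the DBMS accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = [
    try_insert(("Kolonsky", None, 28000, 4)),         # NULL primary key
    try_insert(("Zelaya", "999887777", 28000, 4)),    # Ssn already exists
    try_insert(("Kolonsky", "677678989", 28000, 7)),  # no DEPARTMENT with Dnumber = 7
    try_insert(("Kolonsky", "677678989", 28000, 4)),  # satisfies all constraints
]
for outcome in results:
    print(outcome)
```

The first three attempts are rejected and only the last one is stored, which mirrors the default option of rejecting an insertion that violates a constraint.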
The DELETE Operation
●
Can violate only referential integrity (constraint 4)
– If tuple being deleted is referenced by foreign keys
from other tuples
– Restrict
●
Reject the deletion
– Cascade
●
Propagate the deletion by deleting tuples that reference
the tuple that is being deleted
– Set null or set default
●
Modify the referencing attribute values that cause the
violation
examples
●
Operation:
– Delete the WORKS_ON tuple with Essn = ‘999887777’ and Pno
= 10.
– Result: This deletion is acceptable and deletes exactly one
tuple.
●
Operation:
– Delete the EMPLOYEE tuple with Ssn = ‘999887777’.
– Result: This deletion is not acceptable, because there are tuples
in WORKS_ON that refer to this tuple. Hence, if the tuple in
EMPLOYEE is deleted, referential integrity violations will result.
●
Operation:
– Delete the EMPLOYEE tuple with Ssn = ‘333445555’.
– Result: This deletion will result in even worse referential
integrity violations, because the tuple involved is referenced by
tuples from the EMPLOYEE , DEPARTMENT , WORKS_ON , and
DEPENDENT relations.
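The restrict, cascade, and set null options correspond to the `ON DELETE` actions of a foreign key declaration. The sketch below tries each action in a fresh in-memory SQLite database. It is an illustration under stated assumptions: `delete_employee` is a hypothetical helper, the tables are reduced to the columns needed, Pno = 30 is an assumed sample value, and WORKS_ON is declared without a primary key so that SET NULL is permitted on Essn.

```python
import sqlite3


def delete_employee(on_delete):
    """Delete a referenced EMPLOYEE row under the given ON DELETE action."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enable foreign key enforcement
    conn.executescript(f"""
        CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY NOT NULL);
        CREATE TABLE WORKS_ON (
            Essn TEXT REFERENCES EMPLOYEE(Ssn) ON DELETE {on_delete},
            Pno  INTEGER
        );
        INSERT INTO EMPLOYEE VALUES ('999887777');
        INSERT INTO WORKS_ON VALUES ('999887777', 10), ('999887777', 30);
    """)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM EMPLOYEE WHERE Ssn = '999887777'")
        outcome = "deleted"
    except sqlite3.IntegrityError:
        outcome = "rejected"
    remaining = conn.execute(
        "SELECT Essn, Pno FROM WORKS_ON ORDER BY Pno"
    ).fetchall()
    conn.close()
    return outcome, remaining


print(delete_employee("RESTRICT"))  # ('rejected', [('999887777', 10), ('999887777', 30)])
print(delete_employee("CASCADE"))   # ('deleted', [])
print(delete_employee("SET NULL"))  # ('deleted', [(None, 10), (None, 30)])
```

If Essn were part of the primary key of WORKS_ON, setting it to NULL would itself violate entity integrity, so only restrict or cascade would be usable there.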
The UPDATE Operation
●
Necessary to specify a condition on attributes of
relation
– Select the tuple (or tuples) to be modified
●
If the attribute is part of neither a primary key nor a
foreign key
– Usually causes no problems
●
Updating a primary/foreign key
– Similar issues as with INSERT/DELETE
examples
●
Operation:
– Update the salary of the EMPLOYEE tuple with Ssn = ‘999887777’ to
28000.
– Result: Acceptable.
●
Operation:
– Update the Dno of the EMPLOYEE tuple with Ssn = ‘999887777’ to 1.
– Result: Acceptable.
●
Operation:
– Update the Dno of the EMPLOYEE tuple with Ssn = ‘999887777’ to 7.
– Result: Unacceptable, because it violates referential integrity.
●
Operation:
– Update the Ssn of the EMPLOYEE tuple with Ssn = ‘999887777’ to
‘987654321’.
– Result: Unacceptable, because it violates the primary key constraint by
repeating a value that already exists as a primary key in another tuple; it
also violates referential integrity constraints because there are other
relations that refer to the existing value of Ssn.
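The four updates above can be tried out in the same way with Python's `sqlite3` module. This is a sketch under stated assumptions: the tables are reduced to the columns the examples touch, the starting salaries and department numbers are illustrative, and `try_update` is a hypothetical helper that accepts only fixed, trusted column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable foreign key enforcement
conn.executescript("""
    CREATE TABLE DEPARTMENT (Dnumber INTEGER PRIMARY KEY);
    CREATE TABLE EMPLOYEE (
        Ssn    TEXT PRIMARY KEY NOT NULL,
        Salary INTEGER,
        Dno    INTEGER REFERENCES DEPARTMENT(Dnumber)
    );
    CREATE TABLE WORKS_ON (
        Essn TEXT REFERENCES EMPLOYEE(Ssn),
        Pno  INTEGER
    );
    -- Illustrative starting state (assumed values).
    INSERT INTO DEPARTMENT VALUES (1), (4), (5);
    INSERT INTO EMPLOYEE VALUES ('999887777', 25000, 4), ('987654321', 43000, 4);
    INSERT INTO WORKS_ON VALUES ('999887777', 10);
""")


def try_update(column, value):
    """Update one column of the tuple with Ssn = '999887777'."""
    # The column name is interpolated, so pass only fixed, trusted names.
    sql = f"UPDATE EMPLOYEE SET {column} = ? WHERE Ssn = '999887777'"
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, (value,))
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    try_update("Salary", 28000),     # neither a primary key nor a foreign key
    try_update("Dno", 1),            # foreign key to an existing department
    try_update("Dno", 7),            # no DEPARTMENT with Dnumber = 7
    try_update("Ssn", "987654321"),  # repeats an existing primary key value
]
print(results)  # ['accepted', 'accepted', 'rejected', 'rejected']
```

The last update fails on two counts in this sketch: the new Ssn duplicates an existing primary key, and a WORKS_ON row still refers to the old value.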