Ioan Despi
1. Advanced Database Applications
Disadvantages of Relational DBMS:
2. Object-Oriented Concepts
Loosely speaking, an object corresponds to an entity in the ER model.
The object-oriented paradigm is based on encapsulating the data and
code related to an object into a single unit.
Abstraction: the process of identifying the essential aspects of an
entity and ignoring the unimportant properties.
1. Encapsulation: an object contains both the data structure and
the set of operations that can be used to
manipulate it.
2. Information hiding: we separate the external aspects of an
object from its internal details, which are
hidden from the outside world.
The internal details of an object can be changed without affecting the applications that use it.
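A minimal Python sketch of information hiding, assuming two hypothetical classes (BranchV1, BranchV2) and a hypothetical caller (label). The two classes store the address differently, but the caller sees only the address() operation, so it works unchanged with either one.

```python
class BranchV1:
    """Hypothetical branch that stores street and city separately."""

    def __init__(self, street, city):
        self._street = street      # leading underscore: internal detail
        self._city = city

    def address(self):
        return f"{self._street}, {self._city}"


class BranchV2:
    """Same external interface, different internal representation."""

    def __init__(self, street, city):
        self._parts = (street, city)   # internal detail changed to a tuple

    def address(self):
        return ", ".join(self._parts)


def label(branch):
    # The application uses only the external interface.
    return "Branch at " + branch.address()


old_label = label(BranchV1("22 Deer Rd", "London"))
new_label = label(BranchV2("22 Deer Rd", "London"))
```

Python only hides attributes by convention (the leading underscore), so this illustrates the idea rather than an enforced mechanism.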
The current state of an object is described by one or more attributes,
or instance variables. The value of each variable is itself an object.
1. A simple attribute: can be a primitive type (integer, string, real, …)
2. A complex attribute: can contain collections and/or references
3. A reference attribute: represents a relationship between objects;
it contains a value or collection of values, which are themselves
objects (like a foreign key or a pointer)
Complex object: an object that contains one or more complex
attributes
Notation: “dot” notation: branch.street, branch.manager, branch.city
Object = a uniquely identifiable entity that contains both the
attributes that describe the state of the object and the actions that are
associated with it, that is, its behaviour. (Simula)
The behaviour of an object is given by:
a set of messages to which the object responds
each message may have 0, 1 or more parameters
a set of methods, each of which is a body of code to
implement a message
a method returns a value as the response to the message
[Figure: an object drawn as its Attributes surrounded by Method 1, Method 2, Method 3 and Method 4.]
Methods are programs written in a general-purpose language,
respecting the following restrictions:
1. Only variables in the object itself may be referenced directly
2. Data in other objects are referenced only by sending messages
They can be used to change the object’s state by modifying its
attribute values, or to query the values of selected attributes.
A method consists of a name and a body that performs the behavior
associated with the method name:
staff_object.update_salary(1000)
update_salary(staff_object, 1000)
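The two invocation styles above can be sketched in Python. The Staff class, its starting salary, and the reading of 1000 as an increment are assumptions made for illustration; the source gives only the method name and argument.

```python
class Staff:
    """Hypothetical staff object; the salary is changed only via a method."""

    def __init__(self, name, salary):
        self.name = name
        self._salary = salary

    def update_salary(self, increment):
        # The method body implements the message: it changes the object's
        # state and returns a value as the response.
        self._salary += increment
        return self._salary


staff_object = Staff("Susan Deer", 24000)

# Message-passing style: staff_object.update_salary(1000)
first_response = staff_object.update_salary(1000)

# Function-call style: update_salary(staff_object, 1000)
second_response = Staff.update_salary(staff_object, 1000)
```

Both calls run the same method body; only the surface syntax differs.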
Object classes
Examples:
Manager is AKO (a kind of) Staff.
Susan Deer IS-A Manager.
Inheritance:
1. Single inheritance: the subclass inherits from no more than one
superclass
2. Multiple inheritance: the subclass inherits from more than one
superclass ===> conflicts!
[Figure: single inheritance: Manager and Sales_Staff each inherit from Staff, which inherits from Person. Multiple inheritance: Sales_Manager inherits from both Manager and Sales_Staff.]
3. Repeated inheritance: a special case of multiple inheritance in which
the superclasses inherit from a common superclass
The inheritance mechanism must ensure that the subclass does
not inherit properties twice.
[Figure: repeated inheritance: Sales_Manager inherits from Manager and Sales_Staff, which both inherit from Staff.]
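As an illustration of how one language avoids inheriting twice, the sketch below rebuilds the Staff / Manager / Sales_Staff / Sales_Manager diamond in Python. The class bodies and the calls list are assumptions added for the demonstration; Python's method resolution order places the common superclass once, so its initializer runs once.

```python
calls = []  # records which initializers ran, in order


class Staff:
    def __init__(self):
        calls.append("Staff")
        super().__init__()


class Manager(Staff):
    def __init__(self):
        calls.append("Manager")
        super().__init__()


class Sales_Staff(Staff):
    def __init__(self):
        calls.append("Sales_Staff")
        super().__init__()


class Sales_Manager(Manager, Sales_Staff):
    def __init__(self):
        calls.append("Sales_Manager")
        super().__init__()


Sales_Manager()
mro_names = [cls.__name__ for cls in Sales_Manager.__mro__]
```

Other languages resolve the same conflict differently (for example, by explicit renaming or by virtual base classes), so this is one possible mechanism, not the only one.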
Other concepts:
overriding (+ overloading)
polymorphism & dynamic binding
complex objects
persistence
3. OODBMS
[Figure: history of data models. Recoverable labels: Hierarchical Data Model, IMS; Chen, 1976, ER Data Model; Third generation DBMS; Special requirements; Semantic data models; versioning; generalization; schema evolution; aggregation.]
Strategies for Developing an OODBMS:
OODBMS Perspectives:
7. Integrity: the assurance that the data conforms to specified
correctness and consistency rules
8. Distribution: the ability to physically distribute a logically
interrelated collection of shared data over a network
Issues:
B. Serialization: copy the closure of a data structure to disk.
A write operation on a data value involves the traversal of the
graph of objects reachable from the value and, then, the writing
of a flattened version of the structure to disk.
Reading back this flattened structure produces a new copy of the
original. The process is known as serialization, pickling, or
marshaling.
• Does not preserve object identity: if two data structures that
share a common substructure are separately serialized, then on
retrieval the substructure will no longer be shared in the new
copies.
• It is not incremental, and so saving small changes to a large
data structure is not efficient.
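The loss of sharing can be observed with Python's standard pickle module. The dictionaries a and b are made-up sample data; the sketch serializes them separately and then together to contrast the two cases.

```python
import pickle

# Two structures that share one common substructure.
shared = ["common", "substructure"]
a = {"name": "a", "part": shared}
b = {"name": "b", "part": shared}
shared_before = a["part"] is b["part"]

# Serialized separately: each copy gets its own private substructure.
a2 = pickle.loads(pickle.dumps(a))
b2 = pickle.loads(pickle.dumps(b))
shared_after_separate = a2["part"] is b2["part"]

# Serialized in one operation: the closure is traversed once,
# so sharing inside that closure survives.
a3, b3 = pickle.loads(pickle.dumps((a, b)))
shared_after_joint = a3["part"] is b3["part"]
```

The joint case works only because both structures belong to the same serialized closure, which is also why the approach is not incremental.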
C. Explicit paging: paging objects between the application heap
and the persistent store.
Requires the conversion of object pointers from a disk-based
scheme to a memory-based scheme.
There are two common methods for creating/updating
persistent objects:
a. Reachability-based: an object will persist if it is reachable
from a persistent root object
at any time after creation, an object can become
persistent by adding it to the reachability tree.
Garbage collection: deletes objects when they are no
longer accessible from any other object
Smalltalk, Java
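A minimal sketch of reachability-based persistence, assuming a hypothetical Node class whose refs list holds its outgoing references. An object counts as persistent exactly when a traversal from the persistent root reaches it.

```python
class Node:
    """Hypothetical object with a list of references to other objects."""

    def __init__(self, name):
        self.name = name
        self.refs = []


def persistent_names(root):
    """Return the names of all objects reachable from the persistent root."""
    seen = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:       # already visited (handles cycles)
            continue
        seen[id(node)] = node
        stack.extend(node.refs)
    return {node.name for node in seen.values()}


root, branch, staff, memo = Node("root"), Node("branch"), Node("staff"), Node("memo")
root.refs.append(branch)
branch.refs.append(staff)
staff.refs.append(branch)          # a cycle between branch and staff

before = persistent_names(root)    # memo is not yet reachable
branch.refs.append(memo)           # memo becomes persistent by being linked in
after = persistent_names(root)
```

Removing the last reference to an object has the opposite effect: it drops out of the reachable set and becomes a candidate for garbage collection.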
b. Allocation-based: an object is explicitly declared as being
persistent within the application program
i) By class: a class is statically declared to be
persistent --> all instances of the class are
made persistent when they are created
a class may be a subclass of a system-supplied
persistent class
Ontos, Objectivity/DB
ii) By explicit call: an object may be specified as
persistent when it is created or, in some
cases, dynamically at runtime (added to a
persistent collection)
ObjectStore
Alternatively, to provide persistence in a programming language:
orthogonal persistence, based on the following principles:
1. Persistence independence: the persistence of a data object is
independent of how the program manipulates the data object
and conversely, a fragment of the program is expressed
independently of the persistence of data it manipulates.
2. Data type orthogonality: all data objects should be allowed the
full range of persistence irrespective of their type: Ps-algol,
Napier88, Galileo, GemStone
Persistence is only a quality attributable to a subset of the
language data types: Pascal/R, Amber, E, Avalon/C++
3. Transitive persistence: the choice of how to identify and provide
persistent objects at the language level is independent of the
choice of data types in the language. Most used technique:
reachability-based.
Orthogonal persistence:
Advantages:
1. There is no need to define long-term data in a separate schema language
2. No special application code is required to access or update persistent data
3. There is no limit to the complexity of the data structures that can be made persistent
4. Improved programmer productivity from simpler semantics
5. Improved maintenance
6. Consistent protection mechanisms over the whole environment
7. Support for incremental evolution
8. Automatic referential integrity
Issues:
A. No swizzling: the OID is used every time the object is accessed
the system maintains a lookup table, so that the object’s
virtual memory pointer can be located and then used to access
the object.
Could be inefficient if the same objects are accessed repeatedly
Could be acceptable if applications access an object once
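A small sketch of the no-swizzling scheme, assuming a hypothetical ObjectTable class that plays the role of the lookup table. The counter shows that every access pays for a lookup, even when the same object is accessed repeatedly.

```python
class ObjectTable:
    """Hypothetical lookup table from OID to the in-memory object."""

    def __init__(self):
        self._by_oid = {}
        self.lookups = 0

    def register(self, oid, obj):
        self._by_oid[oid] = obj

    def deref(self, oid):
        # No swizzling: the OID is resolved on every single access.
        self.lookups += 1
        return self._by_oid[oid]


table = ObjectTable()
table.register(1001, {"bno": "B22", "city": "London"})

# Three accesses to the same object cost three table lookups.
cities = [table.deref(1001)["city"] for _ in range(3)]
```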
Virtual memory is considered to be a directed graph, with objects as
nodes and references as directed edges:
1. Edge marking marks every object pointer with a tag bit.
If the bit is set, then the reference is to a virtual
memory pointer
Otherwise, it is still pointing to an OID and needs to
be swizzled when the object it refers to is faulted
into the application’s memory space.
2. Node marking requires that all object references are
immediately converted to virtual pointers when the
object is faulted into memory.
1 is a software-based technique;
2 can be implemented using software or hardware-based
techniques.
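Edge marking can be sketched as follows. The Ref and Store classes, the dictionary standing in for the disk, and the fault counter are all assumptions for illustration; the swizzled flag plays the role of the tag bit.

```python
class Store:
    """Hypothetical persistent store; 'disk' maps OID -> stored fields."""

    def __init__(self, disk):
        self.disk = disk
        self.resident = {}   # objects already faulted into memory
        self.faults = 0

    def fault_in(self, oid):
        if oid not in self.resident:
            self.faults += 1
            self.resident[oid] = dict(self.disk[oid])
        return self.resident[oid]


class Ref:
    """An edge in the object graph, marked with a tag bit."""

    def __init__(self, oid):
        self.swizzled = False   # tag bit: False means target is still an OID
        self.target = oid

    def get(self, store):
        if not self.swizzled:
            # Swizzle on first use: replace the OID by a direct reference.
            self.target = store.fault_in(self.target)
            self.swizzled = True
        return self.target


store = Store({7: {"name": "Susan Deer"}})
manager = Ref(7)
first = manager.get(store)    # faults the object in and swizzles the edge
second = manager.get(store)   # follows the direct reference, no lookup
```

Node marking would instead convert every reference held by an object at the moment that object is faulted in, rather than one edge at a time.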
C. Hardware-based schemes: use virtual memory access protection
violations to detect accesses of non-resident objects (Lamb91)
Use the standard virtual memory hardware to trigger the
transfer of persistent data from disk to main memory.
Once a page has been faulted in, objects are accessed on that
page via normal virtual memory pointers.
The hardware approach avoids the overhead of residency
checks incurred by software approaches, but
limits the amount of data that can be accessed during a
transaction to the size of virtual memory and complicates
other issues, such as recovery and fine-grained locking.
ObjectStore, Texas
Issues:
3. Transactions
in classical DBMSs: short duration transactions
in CAD, CASE,…: long duration transactions (hours, days)
a need for new protocols:
nested transactions, sagas, multi-level transactions.
4. Versions: Ontos, Versant, ObjectStore, Objectivity/DB, Itasca
object version = an identifiable state of an object
version history = the evolution of an object
version management = object references always point to
the correct version of an object
Types of versions:
1. Transient version: unstable, can be updated and deleted.
It can be created anew by checking out a released version
from a public database, or
by deriving it from a working or transient version in a private
database, in which case the base transient version is promoted to a
working version. Always stored in the creator’s private
workspace.
2. Working version: stable and cannot be updated but it can be
deleted by its creator. It is stored in the creator’s private
workspace.
3. Released version: stable, cannot be updated or deleted.
It is stored in a public database by checking in a working
version from a private database.
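The three version types behave like states of a small state machine. The sketch below is one possible reading of the rules above; the Version class, its method names, and the choice of exceptions are assumptions, not part of any of the products named.

```python
from enum import Enum


class State(Enum):
    TRANSIENT = "transient"
    WORKING = "working"
    RELEASED = "released"


class Version:
    """Hypothetical object version following the rules listed above."""

    def __init__(self, data):
        self.data = data
        self.state = State.TRANSIENT   # new versions start out unstable

    def update(self, data):
        if self.state is not State.TRANSIENT:
            raise PermissionError(f"{self.state.value} versions cannot be updated")
        self.data = data

    def promote(self):
        if self.state is not State.TRANSIENT:
            raise ValueError("only a transient version can be promoted")
        self.state = State.WORKING

    def check_in(self):
        if self.state is not State.WORKING:
            raise ValueError("only a working version can be checked in")
        self.state = State.RELEASED

    def can_delete(self):
        # Transient and working versions can be deleted; released ones cannot.
        return self.state is not State.RELEASED


v = Version("draft design")
v.update("revised design")     # allowed: still transient
v.promote()                    # now a working version
try:
    v.update("late change")
    update_rejected = False
except PermissionError:
    update_rejected = True
deletable_while_working = v.can_delete()
v.check_in()                   # now a released version
```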
Issues:
Disadvantages of OODBMSs:
lack of universal data model
lack of experience
lack of standards
query optimization compromises encapsulation
locking at object level may impact performance
complexity
lack of support for views
lack of support for security
Object Database Standard ODMG 2.0 1997
[Figure: labels: Transaction, storage management, queries, versioning, security, Object, OMA services.]
1. The Object Model -- OM
is a design-portable abstract model for communicating with
OMG-compliant object-oriented systems
a requester sends a request for object services to the ORB
which keeps track of all the objects in the system and
the types of services they can provide
the ORB then forwards the message to a provider
who acts on the message and passes a response back
to the requester via the ORB
2. The Object Request Broker -- ORB
handles distribution of messages between application objects
is a distributed ‘software bus’ that enables objects (requesters)
to make and receive requests and responses from a provider
on receipt of a response from the provider, the ORB translates
the response into a form the original requester can understand
--> provides a mechanism by which objects make and receive
requests and responses transparently
--> interoperability between applications in a heterogeneous
distributed environment
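The requester / ORB / provider round trip can be pictured with a toy in-process broker. ToyORB, its methods, and the staff.count service are hypothetical names; this is not the CORBA API and it ignores distribution and data translation entirely.

```python
class ToyORB:
    """Hypothetical in-process broker; illustrates the message flow only."""

    def __init__(self):
        self._providers = {}   # service name -> provider callable

    def register(self, service, provider):
        # The broker keeps track of which services are available.
        self._providers[service] = provider

    def services(self):
        return sorted(self._providers)

    def request(self, service, *args):
        provider = self._providers[service]   # locate a provider
        response = provider(*args)            # forward the message
        return response                       # pass the response back


orb = ToyORB()
staff_per_branch = {"B22": 3}                 # made-up sample data
orb.register("staff.count", lambda bno: staff_per_branch.get(bno, 0))

# The requester talks only to the broker, never to the provider directly.
reply = orb.request("staff.count", "B22")
```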
3. The Object Services --OS
provide the main functions for realizing basic object functionality
collection: a uniform way to create and manipulate most
common collections generically:
sets, queues, stacks, lists, binary trees
concurrency control: a lock manager that enables multiple
clients to coordinate their access to shared
resources
event management: allows components to dynamically
register or unregister their interest in specific
events
externalization: provides protocols and conventions for
externalizing and internalizing objects.
externalization: records the state of an object as a stream of
data (in memory, on disk, across network)
internalization: creates a new object from it in a different process
licensing: operations for metering the use of components to
ensure fair compensation for their use, and protect
intellectual property
lifecycle: operations for creating, copying, moving, and
deleting groups of related objects
naming: facilities to bind a name to an object relative to a
naming context
persistence: interfaces to mechanisms for storing and
managing objects persistently
property: operations to associate named values (properties)
with any (external) component
query: declarative query statements with predicates, the
ability to invoke operations and other object services
relationship: a way to create dynamic associations between
components that know nothing of each other
security: services such as identification and authentication,
authorization and access control, auditing, security of
communication, non-repudiation, administration
time: maintains a single notion of time across different
machines
trader: a matchmaking service for objects. It allows objects
to dynamically advertise their services, and other
objects to register for a service.
transactions: a two-phase commit coordination among
recoverable components using flat or nested
transactions
4. The Common Facilities --CF
comprise a set of tasks that many applications must perform
but are traditionally duplicated within each one.
they are made available through OMA-compliant class
interfaces
in the latest version: CF are split into
horizontal common facilities (printing, electronic
mail, and so on) and
vertical domain facilities (finance, healthcare,
manufacturing, e-commerce, transportation,
telecommunications)
The Common Object Request Broker Architecture -- CORBA
From the IDL definitions, CORBA objects can be mapped into particular
programming languages, such as C, C++, Smalltalk and Java. This produces interface
stubs within the application programming language (client) that are used to invoke
the requests. The same stubs are used on the object implementation side (server) to
create skeletons, which are completed to provide the requested behavior.
The ODMG Object Model
The major components of the ODMG for an OODBMS are:
1. Object model--OM
2. Object definition language --ODL
3. Object query language -- OQL
4. C++ language bindings
5. Smalltalk language bindings
6. Java language bindings
1. The Object Model -- OM
ODMG object model is a superset of the OMG object model
enables both designs and implementations to be ported
between compliant systems
Basic modeling primitives: the object and the literal.
Objects and literals can be categorized into types: all objects of a given
type exhibit common behavior and state. A type is itself an object.
Behavior is defined by a set of operations that can be performed on
or by the object.
State is defined by the values an object carries for a set of properties
A property may be either an attribute or a relationship between the
object and one or more other objects.
[Figure: type hierarchy of the ODMG object model; the only surviving label is Literal_type.]
A database stores objects, enabling them to be shared by multiple
users and applications.
A database is based on a schema that is defined in ODL. The
database contains instances of the types defined by its schema.
Object types are: atomic, collection or structured types.
Types shown in italics are abstract types. Types shown in normal
typeface are directly instantiable. They are the only base types.
Types with < > indicate type generators.
Objects are created using the new() method of the corresponding
factory interface provided by the language binding interface.
All objects have an ODL interface which is implicitly inherited by
the definition of all user-defined objects:
interface Object {
    enum Lock_Type {read, write, upgrade};
    exception LockNotGranted {};
    // lock() raises LockNotGranted on failure; try_lock() reports it as a boolean
    void lock(in Lock_Type mode) raises (LockNotGranted);
    boolean try_lock(in Lock_Type mode);
    boolean same_as(in Object anObject);   // identity comparison
    Object copy();
    void delete();
};
Each object has a unique identity, OID, which does not change and
is not reused when the object is deleted.
In addition, each object has one or more meaningful user names
Objects can be transient or persistent.
Literals : atomic, collections, structured, null
The values of a literal’s properties may not change.
Literals do not have their own OID and cannot stand
alone as objects: they are embedded in objects
Structured literals contain a fixed number of named heterogeneous
elements of the form: < name , value >, where value may be any
literal type.
struct Address {
string street;
string area;
string city;
string post_code; };
attribute Address branch_address;
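A rough Python analogue of the Address structured literal, assuming a frozen dataclass is an acceptable stand-in: it has a fixed set of named fields, its values cannot be changed, and two literals with equal values compare equal because they carry no identity of their own. The sample address values are made up.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Address:
    street: str
    area: str
    city: str
    post_code: str


branch_address = Address("22 Deer Rd", "Sidcup", "London", "SW1 4EH")

# The values of a literal's properties may not change.
try:
    branch_address.city = "Glasgow"
    mutation_rejected = False
except FrozenInstanceError:
    mutation_rejected = True

# Literals are compared by value, not by identity.
same_value = branch_address == Address("22 Deer Rd", "Sidcup", "London", "SW1 4EH")
```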
Collections: contain an arbitrary number of unnamed
homogeneous elements, each of which can be an
instance of an atomic type, a collection or a literal type
There are ordered and unordered collections. Ordered collections
must be traversed first to last or vice versa; unordered collections
have no fixed order of iteration.
Set: unordered collections that do not allow duplicates
Bag: unordered collections that do allow duplicates
List: ordered collections that allow duplicates
Array: one-dimensional array of dynamically varying length
Dictionary: unordered sequence of key-value pairs with no
duplicate keys
Each subtype has operations to create an instance of the type and insert an element
into the collection. Sets and Bags have the usual set operations: union, intersection
and difference.
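The collection kinds have close counterparts among Python's built-in types, which makes the differences easy to try out. The mapping below (set, Counter, list, dict) is an analogy chosen for illustration, and the staff numbers and names are made-up sample data.

```python
from collections import Counter

staff_numbers = ["SG37", "SG14", "SG37"]

as_set = set(staff_numbers)        # Set: unordered, duplicates dropped
as_bag = Counter(staff_numbers)    # Bag: unordered, duplicates counted
as_list = list(staff_numbers)      # List: ordered, duplicates kept
as_dict = {"SG37": "Ann Beech", "SG14": "David Ford"}   # Dictionary: unique keys

# The usual set operations.
other = {"SG14", "SL21"}
union = as_set | other
intersection = as_set & other
difference = as_set - other
```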
interface Collection : Object {
    exception InvalidCollectionType {};
    exception ElementNotFound {any element;};
    unsigned long cardinality();
    boolean is_empty();
    boolean is_ordered();
    boolean allows_duplicates();
    boolean contains_element(in any element);
    void insert_element(in any element);
    void remove_element(in any element)
        raises (ElementNotFound);
    Iterator create_iterator(in boolean stable);
    BidirectionalIterator create_bidirectional_iterator(in boolean stable)
        raises (InvalidCollectionType);
};
Properties: in ODMG object model: attributes and relationships
attribute BranchWorksAt;
void form_WorksAt(in Branch aBranch);
void drop_WorksAt(in Branch aBranch);
2. The Object Definition Language -- ODL
is a specification language for defining
object types for ODMG-compliant systems.
facilitates portability of schemas between compliant systems
defines the attributes and relationships of types
specifies the signature of the operations (but does not
address their implementation)
the syntax of ODL extends the IDL (Interface Definition
Language) of the CORBA
will be the basis for integrating schemas from multiple
sources and applications
3. The Object Query Language --OQL
provides declarative access to the object database using an
SQL-like syntax.
does not provide explicit update operators, but leaves this to
the operations defined on object types.
can be used as a standalone or as an embedded language in
another language (now: C++, Smalltalk, Java).
can invoke operations programmed in these languages
An OQL query is a function that delivers an object
whose type may be inferred from
the operator contributing to the
query expression.
Query definition expression:
DEFINE Q AS e /* defines a query with name Q, given a query
expression e */
1. Elementary expressions:
• an atomic literal: 10, 17.5, ‘c’, “qwerty”, false, nil
• a named object:
• an iterator variable from the FROM clause of the
SELECT-FROM-WHERE:
e as x or e x or x in e
where e is of type collection(T), then x is of type T
• a query definition expression (Q above)
2. Construction expression:
•If T is a type name with properties p1, p2, …,pn and e1, e2, …, en are
expressions then T(p1 : e1, p2 : e2, …,pn : en) is an expression of type T.
Example: Branch(bno : ”B22”, manager : ”Susan Brand”)
3. Atomic Type Expressions
•Expressions can be formed using the standard unary and binary
operations on expressions.
•If S is a string, expressions can be formed using:
the string concatenation operation ( || or + )
a string offset S[i], meaning the (i+1)th character of the string
S[low : up], meaning the substring of S from the (low+1)th to the
(up+1)th character
c in S (where c is a char), returning a boolean expression
S like pattern, where pattern contains the characters ? or _ ,
meaning any char, or the wildcard characters * or %,
meaning any substring. Returns a boolean expression
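The string expressions map closely onto Python, with one trap: the upper bound of the OQL substring is included, while a Python slice excludes it. The helper oql_like is a hypothetical approximation of like built on the standard fnmatch module; it would mishandle patterns containing [ or ], which fnmatch treats specially.

```python
from fnmatch import fnmatchcase


def oql_like(s, pattern):
    """Approximate OQL `like`: ? or _ match one char, * or % any substring."""
    shell_pattern = pattern.replace("_", "?").replace("%", "*")
    return fnmatchcase(s, shell_pattern)


S = "qwerty"
third_char = S[2]          # OQL S[2]: the (2+1)th character
substring = S[1:3 + 1]     # OQL S[1:3]: characters 2 to 4, upper bound included
contains = "w" in S        # OQL: 'w' in S
any_substring = oql_like(S, "qw%y")
any_char = oql_like(S, "_werty")
no_match = oql_like(S, "_qwerty")
```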
4. Object Expressions
5. Collections expressions
8. Structure Expression
• If e is an expression and p is a property name, then e.p and
e->p are expressions, which extract the property p of an
object e.
9. Conversion Expressions
If e is an expression, then element(e) is an expression that
checks e is a singleton, raising an exception if it is not.
If e is a list expression, then listtoset(e) is an expression that
converts the list into a set.
If e is a collection-valued expression, then flatten(e) is an
expression that converts a collection of collections into a
collection, that is, it flattens the structure.
If e is an expression and c is a type name, then c(e) is an
expression that asserts e is an object of type c, raising an
exception if it is not.
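The conversion expressions can be mimicked by a few small Python functions. These are sketches of the described behaviour, not OQL itself: the exception type is an assumption, and flatten here always returns a list, whereas the result type in OQL depends on the collection types involved.

```python
def element(collection):
    """Return the single member of a singleton, or raise if it is not one."""
    items = list(collection)
    if len(items) != 1:
        raise ValueError("element() expects a collection with exactly one member")
    return items[0]


def listtoset(items):
    """Convert a list into a set, dropping duplicates and order."""
    return set(items)


def flatten(collections):
    """Turn a collection of collections into a single flat collection."""
    return [item for inner in collections for item in inner]


only = element(["B22"])
as_set = listtoset(["London", "Glasgow", "London"])
flat = flatten([["SG37", "SG14"], ["SL21"]])

try:
    element(["B22", "B33"])
    singleton_error = False
except ValueError:
    singleton_error = True
```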
10. Object Expressions
If e is an expression and f is an operation, then e.f and e->f
are expressions that apply an operation to an object. The
operation can optionally take a number of expressions as
parameters.
A query consists of a (possibly empty) set of query definition
expressions followed by an expression.
The result of a query is an object with or without identity.
Examples:
C. get the set of all staff who live in London (without identity):
define Londoners as
select x
from x in staff
where x.address.city = “London”
select x.name.lname from x in Londoners
returns a literal of type set<string>
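For comparison, the same two-step query can be written as Python set comprehensions. The Name, Address and Staff classes and the three staff records are made-up stand-ins for the extent staff; only the shape of the query follows the OQL above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    fname: str
    lname: str


@dataclass(frozen=True)
class Address:
    city: str


@dataclass(frozen=True)
class Staff:
    name: Name
    address: Address


# Hypothetical extent of all staff objects.
staff = [
    Staff(Name("Susan", "Brand"), Address("London")),
    Staff(Name("John", "White"), Address("Glasgow")),
    Staff(Name("Ann", "Beech"), Address("London")),
]

# define Londoners as select x from x in staff where x.address.city = "London"
londoners = {x for x in staff if x.address.city == "London"}

# select x.name.lname from x in Londoners
last_names = {x.name.lname for x in londoners}
```

The result is a plain set of strings, matching the set<string> literal returned by the query.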