MCA-05: Introduction to Database Management Systems

Contents
Block 1: DBMS Concepts
  Unit 1: Fundamentals of DBMS
  Unit 2: Database Models
Block 2: File Organization
  Unit 3: File Management
Block 3: RDBMS and DDBMS
  Unit 4: Relational Model and Normalization
  Unit 5: SQL
Block 4: Trends in DBMS
  Unit 6: Application of DBMS
  Unit 7: Client/Server Database

Course Design and Preparation Committee

Chairperson: Prof. M.S. Palanichamy, Vice-Chancellor, Tamil Nadu Open University
Course Design: Dr. P. Thiyagarajan, Reader & Head, School of Continuing Education, Tamil Nadu Open University; Er. N. Sivashanmugam, Lecturer, School of Computer Sciences, Tamil Nadu Open University
Course Writer: Mr. H. Khanna Nehemiah, Lecturer, Dept. of Computer Science and Engineering, Anna University, Chennai-25
Co-ordination and Composition: Er. N. Sivashanmugam, Lecturer, School of Computer Sciences, Tamil Nadu Open University

December 2007 (First Edition)
© Tamil Nadu Open University, 2007. All rights reserved. No part of this work may be reproduced in any form, by mimeograph or any other means, without permission in writing from the Tamil Nadu Open University. Further information on the Tamil Nadu Open University programmes may be obtained from the University office at Directorate of Technical Education Campus, Guindy, Chennai - 600025.
Printed at: Associated Graphics Pvt. Ltd., Chennai

Course Introduction

This book is a complete guide to the concepts of Introduction to Database Management Systems. It is helpful for your career development in the computing field, where a sound knowledge of Database Management Systems is essential in this competitive world. The book contains a large number of examples, practical questions, objectives, summaries of important topics, reference books and learning activities, which are useful both for university examinations and for the job-oriented examinations of reputed companies. It is suitable for BCA, MCA and PGDCA students of universities, computer institutions and so on.

The book covers four blocks, and many topics are discussed with detailed explanations in each block: Block 1 deals with DBMS concepts, Block 2 with file organization, Block 3 with RDBMS and DDBMS (the relational model, normalization and SQL), and Block 4 with trends in DBMS. Most of the concepts in the text are illustrated by examples that are important topics in their own right and may be treated as such. We feel that, at the stage of a student's development for which the text is designed, it is more important to cover several examples in great detail than to cover a broad range of topics cursorily.

All the programs and algorithms in this text have been tested and debugged. Of course, any errors that remain are the sole responsibility of the authors. We have tried our best to avoid mistakes and errors; however, their presence cannot be ruled out. Your valuable suggestions and corrections are welcome to improve our quality. This book is dedicated to all of our students and colleagues.

Block 1: DBMS Concepts

In this block, we will learn about the concepts of Database Management Systems and database models. With this you will get a clear idea about the architecture, merits and demerits of a DBMS. The block is divided into two units, as follows.

Unit 1: Deals with the fundamentals of Database Management Systems and their related concepts.
Unit 2: Deals with the E-R model, the relational model, the network model and the hierarchical model.

Unit 1: Fundamentals of DBMS

Structure
Overview
Learning Objectives
1.0 Introduction
1.1 Basics of a Database
1.2 File Systems vs. Database Systems
1.3 Advantages of the Database Approach
1.4 Three-Level Architecture of DBMS
1.5 Categories of Data Models
Let Us Sum Up
Answers to Learning Activities
References

Overview

This unit introduces the database architecture, the elements of a DBMS, and their merits and demerits.

Learning Objectives

At the end of this unit you will be able to:
* understand the architecture of a DBMS
* define its elements
* analyze its merits and demerits

1.0 Introduction

Definitions:
Data: known facts that can be recorded and that have implicit meaning.
Database: a model of the real world in a computer system; a collection of related data. Example: the names, telephone numbers and addresses of all the people you know.

1.1 Basics of a Database
* It is a persistent (stored) collection of related data.
* The data is input (stored) only once.
* The data is organized (in some fashion).
* The data is accessible and can be queried (effectively and efficiently).

A database (DB) is a shared collection of logically related persistent data that forms part of the information system of an organisation. A database therefore contains data that is:
* persistent
* logically related
* shared

A Database Management System (DBMS) is software that provides a set of primitives for defining, accessing and manipulating data. A database system refers to the database plus the operations defined on it.

Database Management System: a computerized record-keeping system.
Mini-world: some part of the real world about which data is stored in a database; for example, student grades and transcripts at a university.
Database Management System (DBMS): a software package/system that facilitates the creation and maintenance of a computerized database.
Database System: the DBMS software together with the data itself. Sometimes the applications are also included.

Typical DBMS functionality:
* Define a database: in terms of data types, structures and constraints.
* Construct or load the database on a secondary storage medium.
* Manipulate the database: querying, generating reports, insertions, deletions and modifications to its content.
* Support concurrent processing and sharing by a set of users and programs, while keeping all data valid and consistent.

Other features:
* Protection or security measures to prevent unauthorized access.
* "Active" processing to take internal actions on data.
* Presentation and visualization of data.

1.2 File Systems vs. Database Systems

DBMS are expensive to create in terms of software, hardware and time invested. So why use them? Why couldn't we just keep all our data in files, and use word processors to edit the files appropriately to insert, delete or update data? What is bad about flat files?
* Uncontrolled redundancy
* Inconsistent data
* Inflexibility
* Limited data sharing
* Poor enforcement of standards
* Low programmer productivity
* Excessive program maintenance
* Excessive data maintenance

Drawbacks of file systems (a declarative alternative is sketched after this list):
* Data redundancy and inconsistency: because multiple file formats are available, storing data in files may duplicate information across different files.
* Difficulty in accessing data: in order to retrieve, access and use stored data, a new program has to be written to carry out each new task.
* Data isolation: data is scattered across multiple files in different formats, which makes it hard to combine.
* Integrity problems: integrity constraints become part of program code, which has to be written every time; it is hard to add new constraints or to change existing ones.
* Atomicity of updates: failures may leave the data in an inconsistent state with partial updates carried out. For example, a transfer of funds from one account to another should either complete or not happen at all.
* Concurrent access by multiple users: concurrent access to files is needed for better performance, but uncontrolled concurrent access can lead to inconsistencies. Several security-related problems also arise in file systems.
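To make the contrast concrete, the sketch below shows how a DBMS lets an integrity rule be declared once in the schema instead of being re-coded in every application program. It is only an illustration: the ACCOUNT table and its columns are hypothetical, and the syntax follows standard SQL.

CREATE TABLE ACCOUNT (
    ACC_NO   CHAR(10) PRIMARY KEY,         -- each account is identified exactly once
    HOLDER   VARCHAR(40) NOT NULL,         -- an account must name its holder
    BALANCE  DECIMAL(12,2) NOT NULL
             CHECK (BALANCE >= 0)          -- the DBMS rejects any update violating this rule
);

-- Every program that inserts or updates ACCOUNT rows is now subject to the same rule;
-- with flat files, each program would have to repeat the balance check itself.
INSERT INTO ACCOUNT (ACC_NO, HOLDER, BALANCE) VALUES ('A001', 'Kumar', 500.00);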
Main characteristics of the database approach:

Self-describing nature of a database system: a DBMS catalog stores the description of the database. This description is called meta-data. It allows the DBMS software to work with different databases.

Insulation between programs and data: called program-data independence. It allows changing data storage structures and operations without having to change the DBMS access programs.

Data abstraction: a data model is used to hide storage details and present the users with a conceptual view of the database.

Support of multiple views of the data: each user may see a different view of the database, which describes only the data of interest to that user.

Sharing of data and multi-user transaction processing: a set of concurrent users is allowed to retrieve and update the database. Concurrency control within the DBMS guarantees that each transaction is either correctly executed or completely aborted. OLTP (Online Transaction Processing) is a major part of database applications.

Database users:

Database administrators: responsible for authorizing access to the database, coordinating and monitoring its use, acquiring software and hardware resources, controlling its use and monitoring the efficiency of operations.

Database designers: responsible for defining the content, structure, constraints, and functions or transactions against the database. They must communicate with the end users and understand their needs.

End users: they use the data for queries and reports, and some of them actually update the database content.

1.3 Advantages of Using the Database Approach
* Providing backup and recovery services.
* Providing multiple interfaces to different classes of users.
* Representing complex relationships among data.
* Enforcing integrity constraints on the database.
* Drawing inferences and actions using rules.

1.4 Three-Level Architecture of DBMS

Three-schema architecture: defines DBMS schemas at three levels.
* Internal schema, at the internal level, to describe physical storage structures and access paths. It typically uses a physical data model.
* Conceptual schema, at the conceptual level, to describe the structure and constraints for the whole database for a community of users. It uses a conceptual or an implementation data model.
* External schemas, at the external level, to describe the various user views. They usually use the same data model as the conceptual level.

[Figure: the three-schema architecture - end users see external views, which are connected to the conceptual schema by external/conceptual mappings; the conceptual schema is connected to the internal schema by a conceptual/internal mapping.]

Mappings among the schema levels are needed to transform requests and data. Programs refer to an external schema, and are mapped by the DBMS to the internal schema for execution.
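As a small illustration of the three levels, the sketch below declares a conceptual-level table and an external-level view that exposes only part of it to one class of users; the physical (internal) level is handled by the DBMS itself. The STUDENT table and its columns are hypothetical, and the syntax is standard SQL.

-- Conceptual schema: the community-wide description of the data
CREATE TABLE STUDENT (
    ROLL_NO  CHAR(8) PRIMARY KEY,
    NAME     VARCHAR(40) NOT NULL,
    DEPT     VARCHAR(20),
    GPA      DECIMAL(3,2)
);

-- External schema: one user view; this class of users sees grades but not departments
CREATE VIEW STUDENT_GRADES AS
    SELECT ROLL_NO, NAME, GPA
    FROM STUDENT;

-- Application programs written against the view are unaffected if the base table
-- later gains new columns: this is the logical data independence described below.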
Data Independence:

Logical data independence: the capacity to change the conceptual schema without having to change the external schemas and their application programs.

Physical data independence: the capacity to change the internal schema without having to change the conceptual schema.

When a schema at a lower level is changed, only the mappings between this schema and the higher-level schemas need to be changed in a DBMS that fully supports data independence. The higher-level schemas themselves are unchanged. Hence, the application programs need not be changed, since they refer to the external schemas.

Overall System Structure:

The DBMS architecture consists of disk storage, the storage manager, the query processor and the database users, connected through various tools and applications such as query and application tools, application programs and application interfaces.

[Figure: overall DBMS system structure - users and application programs interact through the query processor (DDL interpreter, DML compiler, query evaluation engine), which works with the storage manager over the data held in disk storage.]

Database Users: users are differentiated by the way they expect to interact with the system.
* Application programmers - interact with the system through DML calls.
* Sophisticated users - form requests in a database query language.
* Specialized users - write specialized database applications that do not fit into the traditional data-processing framework.
* Naive users - invoke one of the application programs that have been written previously; for example, people accessing a database over the web, bank tellers, clerical staff, etc.

Database Administrator: coordinates all the activities of the database system; the database administrator has a good understanding of the enterprise's information resources and needs. Responsibilities of the database administrator include:
* schema definition
* storage structure and access method definition
* schema and physical organization modification
* granting users authority to access the database
* specifying integrity constraints
* acting as liaison with users
* monitoring performance and responding to changes in requirements

Transaction Management within the Storage Manager:
* A transaction is a collection of operations that performs a single logical function in a database application.
* The transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.
* The concurrency-control manager controls the interaction among concurrent transactions, to ensure the consistency of the database. This is performed by a component inside the storage manager.

Storage Manager:

The storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. It executes based on requests received from the query execution engine. It consists of the buffer manager, the file manager, the authorization and integrity manager and the transaction manager. The storage manager is responsible for the following tasks:
* interaction among the various managers
* efficient storing, retrieving and updating of data

Disk Storage:

Disk storage consists of the data in the form of logical tables, indices, the data dictionary and statistical data. The data dictionary stores data about the data, i.e., its structure. Indices are used for fast searching in a database. Statistical data is the log of details about the various transactions which occur on the database.
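Because the data dictionary is itself stored as tables, it can be queried like any other data. The sketch below assumes an SQL DBMS that exposes the INFORMATION_SCHEMA catalog views (PostgreSQL and MySQL do, for example); the table name 'employee' is hypothetical.

-- List the columns and data types that the dictionary records for one table
SELECT column_name, data_type, is_nullable
FROM   information_schema.columns
WHERE  table_name = 'employee';

-- List all tables known to the catalog
SELECT table_schema, table_name
FROM   information_schema.tables;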
Query Processor:

The user submits a query, which passes to the optimizer, where the query is optimized; the physical execution plan then goes to the execution engine. The query execution engine passes the request to the index/file/record manager, which in turn passes the request to the buffer manager, asking it to allocate memory to hold the required pages. The buffer manager in turn works with the storage manager, which takes care of physical storage. The resulting data comes back out of physical storage in the reverse order. The catalog is the data dictionary, which contains statistics and the schema. Every query execution that takes place in the execution engine is logged, so that it can be recovered when required.

When not to use a DBMS?

A DBMS involves costs such as a high initial cost, (possibly) the cost of extra hardware, the cost of entering data, the cost of training people to use the DBMS, and the cost of maintaining the DBMS, so it is not always worthwhile. A DBMS becomes unnecessary:
* if access to the data by multiple users is not required (and the data set is small);
* if the database and application are simple, well defined, and not expected to change.

1.5 Categories of Data Models

Data model: a set of concepts to describe the structure of a database, and certain constraints that the database should obey.

Data model operations: operations for specifying database retrievals and updates by referring to the concepts of the data model. Operations on the data model may include basic operations and user-defined operations.

Data models fall into three categories: conceptual, physical and representational.

Conceptual (high-level, semantic) data models: provide concepts that are close to the way many users perceive data. (Also called entity-based or object-based data models.)

Physical (low-level, internal) data models: provide concepts that describe details of how data is stored in the computer.

Implementation (representational) data models: provide concepts that fall between the above two, balancing user views with some computer storage details.

Let Us Sum Up

This unit introduced the basic concepts of Database Management Systems (DBMS). Learning activities are also included. If you want more details about this unit, you can refer to the prescribed books given below.

Learning Activities

a) Fill in the blanks:
1. The data dictionary stores the ________ about data.
b) State whether true or false:
1. A transaction is a collection of operations that performs a single logical function in a database application.

Answers to Learning Activities

a) Fill in the blanks:
1. data
b) State whether true or false:
1. True

References

1. Database System Concepts, Silberschatz, Korth and Sudarshan, McGraw-Hill.
2. An Introduction to Database Systems, Bipin C. Desai, Galgotia Publications.
3. Modern Database Management, Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden, Sixth Edition, Pearson Education Asia, First Indian Reprint 2002.
4. Fundamentals of Database Systems, Ramez Elmasri, Shamkant B. Navathe, Third Edition, Pearson Education Asia, Fourth Indian Reprint 2001.

Unit 2: Database Models

Structure
Overview
Learning Objectives
2.0 Hierarchical Model
2.1 Network Model
2.2 Relational Model
2.3 E-R Model
Let Us Sum Up
Answers to Learning Activities
References

Overview

This unit gives you the complete details of database models.

Learning Objectives

At the end of this unit you will be able to:
* understand the concepts of the E-R model
* define the hierarchical model
* gain knowledge of the network model and the relational model

2.0 Hierarchical Data Model

Organizes the database as a tree structure.
* The organization of the records is as a collection of trees, rather than arbitrary graphs.
* The schema is represented by a hierarchical diagram.
* One record type, called the root, does not participate as a child record type.
* Every record type except the root participates as a child record type in exactly one parent-child relationship type.
* A leaf is a record type that does not participate as a parent in any relationship type.
* A record can act as a parent for any number of records.

2.1 Network Data Model

Organizes the database as a graph.
* Data are represented by collections of records.
* Relationships among data are represented by links.
* The organization is that of an arbitrary graph, represented by a network diagram.

Constraints in the Network Model:
* Insertion constraints: specify what should happen when a record is inserted.
* Retention constraints: specify whether a record must exist on its own or always be related to an owner as a member of some set instance.
* Set ordering constraints: specify how a record is ordered inside the database.
* Set selection constraints: specify how a record can be selected from the database.

2.2 Relational Data Model

Organizes the database as a set of relations (tables) and relationships among the relations (tables).

[Figure: a relation shown as a table, with columns, rows and values.]

Data Dictionary: the data dictionary stores information about the data in the database that is essential to its management as a business resource. A data dictionary, or data catalog, is a database (in its own right) that provides a list of the definitions of all objects in the main database. For instance, it should include information on all entities in the database, along with their attributes and indexes. This "data about data" is sometimes referred to as metadata. The data dictionary should be accessible to the user of the database, so that she can obtain this metadata. Some examples of the contents of the data dictionary are:
* What data is available?
* Where is the data stored?
* Who owns the data?
* How is the data used?
* Who can access the data?
* What relationships exist between data items?

Database Design Phases:
* Data analysis: entities, attributes, relationships, integrity rules.
* Logical design: tables, columns, primary keys, foreign keys.
* Physical design: DDL for tablespaces, tables, indexes.

In other words, design proceeds through conceptual design (the entity-relationship model is used at this stage), schema refinement (normalization), and physical database design and tuning.

2.3 Entity Relationship Model

An entity is an object in the world that can be distinguished from other objects. An entity has well-defined properties (attributes).

What should an entity be, and what should it not be?

Should be:
* an object that will have many instances in the database
* an object that will be composed of multiple attributes
* an object that we are trying to model

Should not be:
* a user of the database system
* an output of the database system (e.g. a report)

Attribute: an attribute is a property or characteristic of an entity type.

Classification of attributes:
* simple versus composite attributes
* single-valued versus multivalued attributes
* stored versus derived attributes
* identifier attributes

Simple versus Composite Attributes: an attribute that cannot be further subdivided is termed a simple (or atomic) attribute.

[Figure: the EMPLOYEE entity with attributes Employee_ID, Years_Employed and Date_Employed.]

The attribute Employee_ID in the EMPLOYEE entity cannot be further subdivided. An attribute that can be further subdivided is a composite attribute.
As an example, consider the attribute Address, which can be subdivided into components such as street, city and state.

[Figure: the composite attribute Address broken into its component attributes.]

Single-Valued versus Multivalued Attributes: if an attribute of an entity has only one value associated with it, it is termed a single-valued attribute. For example, the attribute Employee_ID in the EMPLOYEE entity is single-valued. If more than one value can be associated with an entity, it is a multivalued attribute. For example, the attribute Skill in the EMPLOYEE entity is multivalued.

Stored versus Derived Attributes: if the value of an attribute is stored (for example, Date_Employed in the EMPLOYEE entity), it is a stored attribute. If the value of an attribute is derived from the value of another, it is a derived attribute. For example, Years_Employed is a derived attribute, since its value is derived from the value of Date_Employed.

Identifier Attributes: an identifier (key) is an attribute (or combination of attributes) that uniquely identifies individual instances of an entity type. The attribute Employee_ID in the EMPLOYEE entity uniquely identifies an EMPLOYEE entity.

E-R Model Constructs:
* Entity instance: a person, place, object, event or concept (often corresponds to a row in a table).
* Entity type: a collection of entities (often corresponds to a table).
* Attribute: a property or characteristic of an entity type (often corresponds to a field in a table).
* Relationship instance: a link between entities (corresponds to primary key-foreign key equivalences in related tables).
* Relationship type: a category of relationships; a link between entity types.

A Sample Entity Relationship Diagram:

Business logic: a Supplier supplies Items. A Supplier sends Shipments. A Shipment includes Items. Items are used in Products. Customers submit Orders. An Order requests Products.

[Figure: an E-R diagram for the above business logic, relating SUPPLIER, SHIPMENT, ITEM, PRODUCT, CUSTOMER and ORDER.]

Basic E-R Notation:

[Figure: basic E-R notation symbols - strong entity, associative entity, relationship, identifying relationship, attribute, multivalued attribute and derived attribute.]

An associative entity is a special entity that is also a relationship.

Relationship Types vs. Relationship Instances: the relationship type is modeled as the diamond and is connected by lines to the participating entity types. Relationships can have attributes; these describe features pertaining to the association between the entities in the relationship. Two entities can have more than one type of relationship between them (multiple relationships). An associative entity is a combination of a relationship and an entity.

Degree of Relationships: the degree of a relationship is the number of entity types that participate in it. A relationship can be:
* a unary relationship: one entity related to another entity of the same type
* a binary relationship: entities of two different types related to each other
* a ternary relationship: entities of three different types related to each other

A unary relationship associates an entity type with itself. A binary relationship associates two entity types. A ternary relationship associates three entity types.

Cardinality of Relationships: the cardinality ratio specifies the number of instances an entity instance can participate with. The cardinality ratio can be:
* One-to-One: each entity in the relationship has exactly one related entity.
* One-to-Many: an entity on one side of the relationship can have many related entities, but an entity on the other side will have a maximum of one related entity.
* Many-to-Many: entities on both sides of the relationship can have many related entities on the other side.
The following examples illustrate unary relationships with cardinality ratios One-to-One and One-to-Many.

[Figure: unary relationships - PERSON Is_married_to PERSON (one-to-one); EMPLOYEE Manages EMPLOYEE (one-to-many).]

The following examples illustrate binary relationships with cardinality ratios One-to-One, One-to-Many and Many-to-Many.

[Figure: binary relationships - EMPLOYEE Is_assigned PARKING PLACE (one-to-one); PRODUCT LINE Contains PRODUCT (one-to-many); STUDENT Registers_for COURSE (many-to-many).]

The following example illustrates a ternary relationship.

[Figure: a ternary relationship Supplies involving PART, WAREHOUSE and VENDOR, with the attribute Shipping_mode modeled on the relationship.]

Note: a relationship can have attributes of its own.

Strong vs. Weak Entities, and Identifying Relationships:

Strong entities:
* exist independently of other types of entities
* have their own unique identifier

Weak entities:
* are dependent on a strong entity and cannot exist on their own
* do not have a unique identifier
* have only a partial key

Identifying relationships:
* link strong entities to weak entities

Consider the following example.

[Figure: the strong entity EMPLOYEE (identifier Employee_ID) linked by an identifying relationship to the weak entity DEPENDENT (partial key Dependent_Name, attribute Date_of_Birth).]

RELATIONAL DATA MODEL:

The relational data model organizes the database as a set of relations and relationships among the relations. A relation is nothing but a table of values with rows and columns. In relational database terminology a table is termed a relation, a row is termed a tuple, and a column is termed an attribute.

Consider the following scenario. Narmatha Private Limited is organized into departments. Each department has employees working in it. A department controls a number of projects. An employee can work on any number of projects on a day; however, he or she is not permitted to work more than once on the same project on the same day. The following relational database schema is used in the company:

EMPLOYEE (PANNO, ENO, NAME, DOB, SEX, DOJ, DESIGNATION, BASIC, DNO)
DEPARTMENT (DNO, DNAME)
PROJECT (PCODE, PNAME, DNO)
WORKS_FOR (ENO, PCODE, DATE_WORKED, INTIME, OUTTIME)

Candidate Key / Primary Key: any relation must have an atomic attribute, or a combination of two or more attributes, that uniquely identifies a tuple in the relation. This is termed a candidate key.

Properties of candidate keys - a candidate key must be:
* unique
* not null
* minimal (if some attribute is removed from the candidate key, the uniqueness property no longer holds)

A relation can have any number of candidate keys but only one primary key. If there is only one candidate key, it becomes the primary key by default. If there are two or more candidate keys, one of them is assigned as the primary key.
* In the EMPLOYEE relation there are two candidate keys, PANNO and ENO. Among these, ENO has been assigned as the primary key.
* In the DEPARTMENT relation there is only one candidate key, DNO, so by default DNO is the primary key.
* In the PROJECT relation there is only one candidate key, PCODE, so by default PCODE is the primary key.
* In the WORKS_FOR relation there is only one candidate key, (ENO, PCODE, DATE_WORKED). Note that this candidate key is composite, i.e. it is formed by three attributes. Since it is the only candidate key, (ENO, PCODE, DATE_WORKED) is by default the primary key.
Foreign Key: an attribute B in relation R is said to be a foreign key if it references an attribute A in relation R, or in another relation S, where A is a primary key. This constraint is termed the referential integrity constraint.
* In the EMPLOYEE relation, DNO is a foreign key referencing DNO of the DEPARTMENT relation.
* In the PROJECT relation, DNO is a foreign key referencing DNO of the DEPARTMENT relation.
* In the WORKS_FOR relation, ENO is a foreign key referencing ENO of the EMPLOYEE relation.
* In the WORKS_FOR relation, PCODE is a foreign key referencing PCODE of the PROJECT relation.

Note: the value of a foreign key must be drawn from the values of the primary key it references.

Can a foreign key be assigned a NULL value? A foreign key can be assigned a NULL value if it is not part of a primary key.

Super Key: a super key of an entity set is a set of one or more attributes whose values uniquely determine each entity. However, a super key may contain extraneous attributes, i.e. there may exist some attribute belonging to the super key which can be removed while the uniqueness property still holds. For example, in the EMPLOYEE relation, (ENO, NAME) will uniquely identify a tuple. But the attribute NAME is extraneous, meaning that even after removal of NAME, ENO alone will uniquely identify a tuple. A candidate key is a minimal super key.

Data Integrity:
* Entity integrity
* Referential integrity
* Domain integrity

Entity Integrity: the entity integrity rule is designed to assure that every relation has a primary key, and that the data values for that primary key are all valid. Entity integrity guarantees that every primary key attribute is non-null: no attribute participating in the primary key of a relation is allowed to contain nulls. The primary key performs the unique identification function in the relational model.

Referential Integrity: an attribute B in relation R is a foreign key if it references an attribute A in relation R or another relation S, where A is a primary key. This constraint is termed the referential integrity constraint.

Domain Integrity: all the values that appear in a column of a relation (table) must be taken from the same domain. As we have seen before, a domain is a set of values that may be assigned to an attribute. A domain definition usually consists of the following components (a sketch of the company schema with all three kinds of integrity declared follows this list):
* domain name
* meaning
* data type
* size or length
* allowable values or allowable range (if applicable)
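The sketch below declares the company schema from the scenario above in SQL, showing entity integrity (PRIMARY KEY), referential integrity (FOREIGN KEY) and domain integrity (data types, NOT NULL and CHECK). The attribute names follow the text; the data types, sizes and CHECK rules are assumptions made only for illustration.

CREATE TABLE DEPARTMENT (
    DNO    CHAR(3) PRIMARY KEY,            -- entity integrity: DNO unique and not null
    DNAME  VARCHAR(30) NOT NULL
);

CREATE TABLE EMPLOYEE (
    PANNO        CHAR(10) NOT NULL UNIQUE, -- alternate candidate key
    ENO          CHAR(6)  PRIMARY KEY,     -- chosen primary key
    NAME         VARCHAR(40) NOT NULL,
    DOB          DATE,
    SEX          CHAR(1) CHECK (SEX IN ('M', 'F')),  -- domain integrity
    DOJ          DATE,
    DESIGNATION  VARCHAR(20),
    BASIC        DECIMAL(10,2) CHECK (BASIC >= 0),
    DNO          CHAR(3) REFERENCES DEPARTMENT(DNO)  -- referential integrity
);

CREATE TABLE PROJECT (
    PCODE  CHAR(5) PRIMARY KEY,
    PNAME  VARCHAR(30) NOT NULL,
    DNO    CHAR(3) REFERENCES DEPARTMENT(DNO)
);

CREATE TABLE WORKS_FOR (
    ENO          CHAR(6) REFERENCES EMPLOYEE(ENO),
    PCODE        CHAR(5) REFERENCES PROJECT(PCODE),
    DATE_WORKED  DATE,
    INTIME       TIME,
    OUTTIME      TIME,
    PRIMARY KEY (ENO, PCODE, DATE_WORKED)  -- composite candidate key as primary key
);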
Transforming the ER Model to the Relational Data Model:

Relations (tables) correspond to entity types and to many-to-many relationship types. Rows correspond to entity instances and to many-to-many relationship instances. Columns correspond to attributes. Note: the term "relation" in a relational database is not the same as the term "relationship" in the ER model.

Mapping Regular Entities to Relations:
1. Simple attributes: E-R attributes map directly onto the relation.
2. Composite attributes: use only their simple, component attributes.
3. Multivalued attributes: each becomes a separate relation with a foreign key taken from the superior entity.

[Figure: the CUSTOMER entity type with simple attributes Customer_ID, Customer_Name and Customer_Address, mapped to the relation CUSTOMER (Customer_ID, Customer_Name, Customer_Address). When Customer_Address is a composite attribute, the relation instead becomes CUSTOMER (Customer_ID, Customer_Name, Street, City, State).]

Mapping a Multivalued Attribute: a multivalued attribute becomes a separate relation with a foreign key.

[Figure: the EMPLOYEE entity with multivalued attribute Skill mapped to EMPLOYEE (Employee_ID, Employee_Name, Employee_Address) and EMPLOYEE_SKILL (Employee_ID, Skill).]

Transformation Rule 1: for each entity in the ER model, create a separate relation. Include all simple attributes; if an attribute is composite, represent it by its simple atomic components.

Transformation Rule 2: for each multivalued attribute associated with an entity, create a separate relation. Include as attributes: (i) the primary key of the entity, as a foreign key; and (ii) the multivalued attribute itself. The primary key of this relation is the combination of the foreign key and the multivalued attribute.

Mapping Weak Entities:

Transformation Rule 3: a weak entity becomes a separate relation with a foreign key taken from the superior (strong) entity. Its primary key is composed of:
* the partial identifier of the weak entity, and
* the primary key of the identifying relation (the strong entity).

[Figure: the weak entity DEPENDENT (First_Name, Middle_Initial and Last_Name forming Dependent_Name, plus Date_of_Birth), identified through the strong entity EMPLOYEE (Employee_ID). The resulting relations are EMPLOYEE (Employee_ID, Employee_Name) and DEPENDENT (First_Name, Middle_Initial, Last_Name, Employee_ID, Date_of_Birth).]

Note: the primary key of the DEPENDENT relation is composite (a combination of the primary key of the strong entity and the partial key of the weak entity).

Mapping Binary Relationships:

One-to-Many Relationship - Transformation Rule 4: the primary key on the one side becomes a foreign key on the many side.

Many-to-Many Relationship - Transformation Rule 5: create a new relation, preferably with the name of the relationship. Include as foreign keys the primary keys of the two entities, and also include the attributes modeled on the relationship. The primary key will generally be the combination of the foreign keys; however, depending upon the business rules, attributes modeled on the relationship may form part of the primary key.

One-to-One Relationship - Transformation Rule 6: the primary key on the mandatory side becomes a foreign key on the optional side.

Example of Mapping a 1:N Relationship:

[Figure: CUSTOMER (Customer_ID, Customer_Name, Customer_Address) Submits ORDER (Order_ID, Order_Date); the resulting ORDER relation is ORDER (Order_ID, Order_Date, Customer_ID).]

In the above example, Submits is a one-to-many relationship that associates the CUSTOMER entity and the ORDER entity. So we have included in the ORDER relation (the many side), as a foreign key, the primary key of the CUSTOMER relation (the one side).

Example of Mapping an M:N Relationship:

[Figure: RAW MATERIALS (Material_ID, Standard_Cost, Unit_of_Measure) Supplies VENDOR (Vendor_ID, Vendor_Name, Vendor_Address), with Unit_Price modeled on the relationship. The three resulting relations are RAW_MATERIALS (Material_ID, Standard_Cost, Unit_of_Measure), QUOTE (Material_ID, Vendor_ID, Unit_Price) and VENDOR (Vendor_ID, Vendor_Name, Vendor_Address).]

In the above example, Supplies is a many-to-many relationship associating the entities RAW_MATERIALS and VENDOR. So we have created a new relation QUOTE with the attributes MATERIAL_ID, a foreign key referencing the primary key of the RAW_MATERIALS relation; VENDOR_ID, a foreign key referencing the primary key of the VENDOR relation; and UNIT_PRICE, the attribute modeled on the relationship. The primary key of the relation QUOTE is the combination of MATERIAL_ID and VENDOR_ID.
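In SQL, Transformation Rule 5 becomes a table whose primary key is built from the two foreign keys. This is only a sketch of the QUOTE mapping above; the data types are assumptions.

CREATE TABLE QUOTE (
    MATERIAL_ID  CHAR(6) REFERENCES RAW_MATERIALS(MATERIAL_ID),
    VENDOR_ID    CHAR(6) REFERENCES VENDOR(VENDOR_ID),
    UNIT_PRICE   DECIMAL(10,2),               -- attribute modeled on the relationship
    PRIMARY KEY (MATERIAL_ID, VENDOR_ID)      -- combination of the two foreign keys
);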
Example of Mapping a Binary 1:1 Relationship:

[Figure: mapping a binary 1:1 relationship involving NURSE - the resulting relations are NURSE (Nurse_ID, Name) and a second relation containing Location, Nurse_in_Charge and Date_Assigned, where Nurse_in_Charge is a foreign key referencing NURSE.]

In the above example it can be noted that the primary key on the mandatory side becomes a foreign key on the optional side.

Mapping Unary Relationships:

One-to-Many Relationship - Transformation Rule 7: use a recursive foreign key in the same relation.

Many-to-Many Relationship - Transformation Rule 8: create two relations - one for the entity type, and one for an associative relation in which the primary key has two attributes, both taken from the primary key of the entity.

Example of Mapping a Unary 1:N Relationship:

[Figure: the EMPLOYEE entity with a unary Manages relationship, mapped to an EMPLOYEE relation with a recursive foreign key: EMPLOYEE (Employee_ID, Name, Birthdate, Manager_ID).]

Example of Mapping a Unary M:N Relationship (a bill-of-materials relationship):

[Figure: the ITEM entity with a unary Contains relationship carrying the attribute Quantity, mapped to the relations ITEM (Item_No, Name, Unit_Cost) and COMPONENT (Item_No, Component_No, Quantity).]

Mapping Ternary (and n-ary) Relationships:

Transformation Rule 9: create one relation for each entity and one for the associative entity. The associative entity has foreign keys to each entity in the relationship.

Example of a Ternary Relationship with an Associative Entity:

[Figure: the associative entity PATIENT TREATMENT linking PATIENT (Patient_ID, Patient_Name), PHYSICIAN (Physician_ID, Physician_Name) and TREATMENT (Code, Description), with its own attributes Date, Time and Results. The resulting relations are PATIENT (Patient_ID, Patient_Name), PHYSICIAN (Physician_ID, Physician_Name), TREATMENT (Code, Description) and PATIENT_TREATMENT (Patient_ID, Physician_ID, Code, Date, Time, Results).]
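A sketch of Transformation Rule 9 for the example above is given below in SQL. The column data types, the TREAT_DATE/TREAT_TIME column names, and the choice of primary key are assumptions for illustration; the business rules of a real hospital would decide the exact key.

CREATE TABLE PATIENT_TREATMENT (
    PATIENT_ID    CHAR(6) REFERENCES PATIENT(PATIENT_ID),      -- foreign key to each
    PHYSICIAN_ID  CHAR(6) REFERENCES PHYSICIAN(PHYSICIAN_ID),  -- participating entity
    CODE          CHAR(4) REFERENCES TREATMENT(CODE),
    TREAT_DATE    DATE,
    TREAT_TIME    TIME,
    RESULTS       VARCHAR(200),
    PRIMARY KEY (PATIENT_ID, PHYSICIAN_ID, CODE, TREAT_DATE, TREAT_TIME)
);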
Let Us Sum Up

This unit introduced the E-R model, the relational model, the network model and the hierarchical model. Learning activities are also included. If you want more details about this unit, you can refer to the prescribed books given below.

Learning Activities

a) Fill in the blanks:
1. An ________ is an object in the world that can be distinguished from other objects.
b) State whether true or false:
1. The primary key will generally be a combination of the foreign keys.

Answers to Learning Activities

a) Fill in the blanks:
1. entity
b) State whether true or false:
1. True

References

1. Database System Concepts, Silberschatz, Korth and Sudarshan, McGraw-Hill.
2. An Introduction to Database Systems, Bipin C. Desai, Galgotia Publications.
3. Modern Database Management, Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden, Sixth Edition, Pearson Education Asia, First Indian Reprint 2002.
4. Fundamentals of Database Systems, Ramez Elmasri, Shamkant B. Navathe, Third Edition, Pearson Education Asia, Fourth Indian Reprint 2001.

Block 2: File Organization

In this block, we will learn about the concept of file organization. With this you will get a clear idea about the methods and management of file organization. The block is divided into one unit, as follows.

Unit 3: Deals with file organization and its related concepts.

Unit 3: File Management

Structure
Overview
Learning Objectives
3.0 Introduction
3.1 Methods of File Organization
3.1.1 Sequential File Organization
3.1.2 Direct File Organization
3.1.3 Index Sequential File Organization
3.1.4 Multi-Key File Organization
3.2 Management Considerations
Let Us Sum Up
Answers to Learning Activities
References

Overview

This unit describes the methods of file organization and the related management considerations.

Learning Objectives

At the end of this unit you will be able to:
* understand the methods of file organization
* define the management considerations

3.0 Introduction

File: a file is a collection of related records. Each record in a file is included because it pertains to the same entity.

Types of Files: there are various types of files in which records are collected and maintained. They are categorized as follows:
* Master file
* Transaction file
* Table file
* Report file
* Back-up file
* Archival file
* Dump file
* Library file

Master File: master files are the most important type of file, and most file design activity concentrates here. In a business application they are considered very significant because they contain the essential records for the maintenance of the organization's business. A master file can be further categorized. It may be called a reference master file, in which the records are static (unlikely to change frequently); for example, a product file containing descriptions and codes, or a customer file containing name, address and account number. Alternatively, it may be described as a dynamic master file. In such a file we keep records which are frequently changed (updated) as a result of transactions or other events. These two types of master file may be kept as separate files or may be combined; for example, a sales ledger file containing reference data, such as name, address and account number, together with the current transactions and balance outstanding for each customer.

Transaction File: a transaction file is a temporary file used for two purposes. First of all, it is used to accumulate data about events as they occur. Secondly, it helps in updating master files to reflect the results of current transactions. The term transaction refers to any business event that affects the organization and about which data is captured. Examples of common transactions in an organization are making purchases, hiring workers and recording sales.

Table File: a special type of master file is included in many systems to meet specific requirements where data must be referenced repeatedly. Table files are permanent files containing reference data used in processing transactions, updating master files or producing output. As the name implies, these files store reference data in tabular form. Table files conserve memory space and make program maintenance easier by storing in a file data that would otherwise have to be included in programs or master file records.

Report File: report files are the collected contents of individual output reports or documents produced by the system. They are created where many reports are produced by the system and a printer may not be available for all of them. This situation frequently arises when the computer carries out the three functions - input, processing and output - simultaneously rather than executing each function in sequence. In this case, the computer writes the report contents to a file on a magnetic tape or disk, where it remains until it can be printed. That file, which contains the unprinted output data, is called the report file. The process of creating it is known as spooling, which means that output that cannot be printed when it is produced is spooled into a report file; then, depending on the availability of the printer, the system is instructed to read the report file and print the output on the printer.

Backup File: a backup file is a copy of a master, transaction or table file that is made to ensure a copy is available if anything happens to the original.

Archival File: these files are copies made for long-term storage of data that may be required at a much later date. Usually archival files are stored far away from the computer center, so they cannot be easily retrieved for use.
Dump File: this is a copy of computer-held data at a particular point in time. It may be a copy of a master file retained to help recovery in the event of possible corruption of the master file, or it may be part of a program in which an error is being traced.

Library File: a library file generally contains application programs, utility programs and system software packages.

3.1 Methods of File Organization

File Structures: a file structure is the organization of data on a secondary storage device in such a way that it minimizes the access time and the storage space. A file structure is a combination of representations for data in files and of operations for accessing the data. A file structure allows applications to read, write and modify data. It might also support finding the data that matches some search criteria, or reading through the data in some particular order. File organization may be sequential, index sequential, inverted list or random. Each method has its own uses and abuses. A file is organized to ensure that records are available for processing. Before a file is created, the application in which the file will be used must be carefully examined. Clearly, a fundamental consideration in this examination will concern the data to be recorded on the file; but an equally important and less obvious consideration concerns how the data are to be placed on the file.

3.1.1 Sequential File Organization

This is the simplest method to store and retrieve data from a file. Sequential organization simply means storing and sorting records in physical order on tape or disk. In a sequential organization, records can be added only at the end of the file. That is, in a sequential file, records are stored one after the other without concern for the actual value of the data in the records. It is not possible to insert a record in the middle of the file without rewriting the file. During updating, records from both the master and transaction files are matched, one record at a time, resulting in an updated master file.

It is a characteristic of sequential files that all records are stored by position: the first one is at the first position, the second one occupies the second position, and so on. There are no addresses or location assignments in sequential files. To read a sequential file, the system always starts at the beginning of the file. If the record sought is somewhere in the file, the system reads its way up to it, one record at a time. For example, if a particular record happens to be the fifteenth one in a file, the system starts at the first one and reads ahead one record at a time until the fifteenth one is reached. It cannot jump directly to the fifteenth record in a sequential file without starting from the beginning.

In a sequential file the records are arranged into ascending or descending order according to a key field. This key field may be numeric, alphabetic, or a combination of both, but it must occupy the same place in each record, as it forms the basis for determining the order in which the records will appear on the file. Sequential files are generally maintained on magnetic tape, disk or a mass storage system. The advantages and disadvantages of sequential file organization are given below.

Advantages:
* The approach is simple to understand.
* Locating a record requires only the record key.
* It is efficient and economical when the activity rate is high.
* Relatively inexpensive I/O media and devices may be used.
* Files may be relatively easy to reconstruct, since a good measure of built-in backup is usually available.

Disadvantages:
* The entire file must be processed even when the activity rate is low.
* Transactions must be sorted and placed in sequence prior to processing.
* The timeliness of data in the file deteriorates while batches are being accumulated.
* Data redundancy is typically high, since the same data may be stored in several files sequenced on different keys.

3.1.2 Random or Direct File Organization

When sequential files are a disadvantage for a proposed system, another file organization, called direct organization, is used. As with a sequential file, each record in a direct file must contain a key field; however, the records need not appear on the file in key field sequence. In addition, any record stored on a direct file can be accessed directly if its location or address is known, i.e. all previous records need not be accessed. The problem, however, is to determine how to store the data records so that, given the key field of the desired record, its storage location on the file can be determined. In other words, if the program knows the record key, it can determine the location address of a record and retrieve it independently of any other record in the file.

It would be ideal if the key field could also be the location of the record on the file. This method is known as the direct addressing method. It is quite a simple method, but the requirements it places on the key field often prevent its use, so it is rarely used. Therefore, before a direct organized file can be created, a formula or method must be devised to convert the key field value of a record to the address or location of the record on the file. This formula or method is generally called an algorithm, otherwise known as hash addressing. Hashing refers to the process of deriving a storage address from a record key. There are many algorithms to determine the storage location using the key field; some of them are:

Division by prime: in this procedure, the actual key is divided by a prime number. Here modular division is used: the quotient is discarded and the storage location is signified by the remainder. For example, if the record key is 4537 and the prime 97 is chosen, then 4537 divided by 97 leaves a remainder of 75, so the record is stored at location 75. If the key field consists of a large number of digits, for instance 10 digits (e.g. 2345632278), then strip off the first or last 4 digits and then apply the division-by-prime method. Other common algorithms include folding, extraction and squaring.

The advantages and disadvantages of direct file organization are as follows:

Advantages:
* Immediate access to records for inquiry and updating purposes is possible.
* Immediate updating of several files as a result of a single transaction is possible.
* The time taken for sorting transactions can be saved.

Disadvantages:
* Records in the on-line file may be exposed to the risk of a loss of accuracy, and a procedure for special backup and reconstruction is required.
* Compared with sequential organization, it may be less efficient in using storage space.
* Adding and deleting records is more difficult than with sequential files.
* Relatively expensive hardware and software resources are required.

3.1.3 Index Sequential File Organization

The third way of accessing records stored in the system is through an index. The basic form of an index includes a record key and the storage address for the record. To find a record when the storage address is unknown, it would otherwise be necessary to scan the records.
However, if an index is used, the search will be faster, since it takes less time to search an index than an entire file of data. Indexed files offer the simplicity of sequential files while at the same time offering a capability for direct access. The records must initially be stored on the file in sequential order according to a key field. In addition, as the records are being recorded on the file, one or more indexes are established by the system to associate the key field value(s) with the storage location of the record on the file. These indexes are then used by the system to allow a record to be accessed directly.

To find a specific record when the file is stored under an indexed organization, the index is searched first to find the key of the record wanted. When it is found, the corresponding storage address is noted and the program can then access the record directly. This method uses a sequential scan of the index, followed by direct access to the appropriate record. The index helps to speed up the search compared with a sequential file, but it is slower than direct addressing. Indexed files are generally maintained on magnetic disk or on a mass storage system.

The primary differences between direct and indexed organized files are as follows. Records may be accessed only randomly from a direct organized file, whereas records may be accessed sequentially or randomly from an indexed organized file. Direct organized files utilize an algorithm to determine the location of a record, whereas indexed organized files utilize an index to locate a record to be randomly accessed. The advantages and disadvantages of indexed sequential file organization are as follows:

Advantages:
* Permits the efficient and economical use of sequential processing techniques when the activity rate is high.
* Permits quick access to records in a relatively efficient way when this activity is a small fraction of the total workload.

Disadvantages:
* Less efficient in the use of storage space than some other alternatives.
* Access to records may be slower using indexes than when transform (hashing) algorithms are used.
* Relatively expensive hardware and software resources are required.

Indexing and Hashing:

An index for a file in a database system works in much the same way as the index in a book. The index is much smaller than the book, reducing the effort needed to find the words we are looking for. Database system indices play the same role as book indices or card catalogs in libraries. There are two basic kinds of indices (a short sketch of declaring each kind follows the list of evaluation factors below):
* Ordered indices - based on a sorted ordering of the values.
* Hash indices - based on a uniform distribution of the values across a range of buckets. The bucket to which a value is assigned is determined by a function called a hash function.

No one technique is the best; rather, each technique is best suited to particular database applications. Each technique must be evaluated on the basis of these factors:
* Access types: the types of access that are supported efficiently. Access types can include finding records with a specified attribute value and finding records whose attribute values fall in a specified range.
* Access time: the time it takes to find a particular data item, or set of items, using the technique.
* Insertion time: the time it takes to insert a new data item. This value includes the time it takes to find the correct place to insert the new data item, as well as the time it takes to update the index structure.
* Deletion time: the time it takes to delete a data item. This value includes the time it takes to find the item to be deleted, as well as the time it takes to update the index structure.
* Space overhead: the additional space occupied by an index structure. Provided that the amount of additional space is moderate, it is usually worthwhile to sacrifice the space to achieve improved performance.
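As referred to above, the sketch below declares one index of each kind on a hypothetical ACCOUNT table. CREATE INDEX is supported by virtually every SQL DBMS; the USING clause for choosing a hash index is PostgreSQL-specific syntax, shown here only as an illustration.

-- Ordered (B+ tree) index: supports equality and range searches on the key
CREATE INDEX acc_branch_idx ON ACCOUNT (BRANCH_NAME);

-- Hash index (PostgreSQL syntax): supports only equality searches on the key
CREATE INDEX acc_no_hash_idx ON ACCOUNT USING HASH (ACC_NO);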
An attribute or set of attributes used to look up records in a file is called a "search key." Note that this definition of "key" differs from that used in "primary key," "candidate key" and "superkey." Using our notion of a search key, we see that if there are several indices on a file, there are several search keys.

Ordered Indices: to gain fast random access to records in a file, we can use an index structure. Each index structure is associated with a particular search key. Just like the index of a book or a library catalog, an ordered index stores the values of the search keys in sorted order, and associates with each search key the records that contain it. The records in the indexed file may themselves be stored in some sorted order. A file may have several indices, on different search keys. If the file containing the records is sequentially ordered, a "primary index" is an index whose search key also defines the sequential order of the file. (The term "primary index" is sometimes used to mean an index on a primary key; however, such usage is nonstandard and should be avoided.) Primary indices are also called "clustering indices." The search key of a primary index is usually the primary key, although that is not necessarily so. Indices whose search key specifies an order different from the sequential order of the file are called "secondary indices," or "nonclustering" indices.

Primary Index: "index-sequential files" are files that are ordered sequentially and have a primary index on the search key. They are designed for applications that require both sequential processing of the entire file and random access to individual records.

Dense and Sparse Indices:

An "index record," or "index entry," consists of a search-key value and pointers to one or more records with that value as their search-key value. The pointer to a record consists of the identifier of a disk block and an offset within the disk block to identify the record within the block. There are two types of ordered indices that we can use:

Dense index: an index record appears for every search-key value in the file. In a dense primary index, the index record contains the search-key value and a pointer to the first data record with that search-key value. The rest of the records with the same search-key value are stored sequentially after the first record since, because the index is a primary one, the records are sorted on the same search key. Dense index implementations may store a list of pointers to all records with the same search-key value; doing so is not essential for primary indices.
Multilevel Indices:

Even if we use a sparse index, the index itself may become too large for efficient processing. It is not unreasonable, in practice, to have a file with 100,000 records, with 10 records stored in each block. If we have one index record per block, the index has 10,000 records. Index records are smaller than data records, so let us assume that 100 index records fit on a block. Thus, our index occupies 100 blocks. Such large indices are stored as sequential files on disk.

If an index is sufficiently small to be kept in main memory, the search time to find an entry is low. However, if the index is so large that it must be kept on disk, a search for an entry requires several disk block reads. If the index occupies N blocks, binary search requires as many as ceil(log2 N) blocks to be read. Indices with two or more levels are called "multilevel" indices. Searching for records with a multilevel index requires significantly fewer I/O operations than does searching for records by binary search. Multilevel indices are closely related to tree structures, such as the binary trees used for in-memory indexing.

Index Update:

Regardless of what form of index is used, every index must be updated whenever a record is either inserted into or deleted from the file. We first describe algorithms for updating single-level indices.

Insertion:

First, the system performs a lookup using the search-key value that appears in the record to be inserted. The actions the system takes next depend on whether the index is dense or sparse.

Dense indices:
* If the search-key value does not appear in the index, the system inserts an index record with the search-key value in the index at the appropriate position.
* Otherwise the following actions are taken:
  + If the index record stores pointers to all records with the same search-key value, the system adds a pointer to the new record to the index record.
  + Otherwise, the index record stores a pointer to only the first record with the search-key value. The system then places the record being inserted after the other records with the same search-key value.

Sparse indices:
We assume that the index stores an entry for each block. If the system creates a new block, it inserts the first search-key value (in search-key order) appearing in the new block into the index. On the other hand, if the new record has the least search-key value in its block, the system updates the index entry pointing to the block; if not, the system makes no change to the index.
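As a concrete illustration of the dense-index insertion case, the short Python sketch below inserts a (search-key, pointer) pair into a dense index kept as a sorted in-memory list. It assumes the variant, mentioned above, in which the index stores pointers to all records with the same search-key value; the sample keys and pointers are invented for the example.

from bisect import bisect_left

# Dense index: sorted list of [search-key, [record pointers]] entries.
dense_index = [[10, [0]], [17, [2]], [25, [4, 5]]]

def dense_insert(key, pointer):
    """Add a pointer for `key`, creating a new index entry if the key is new."""
    keys = [k for k, _ in dense_index]
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        dense_index[i][1].append(pointer)        # key already indexed
    else:
        dense_index.insert(i, [key, [pointer]])  # new entry at its sorted position

dense_insert(17, 9)   # existing key: the pointer list grows
dense_insert(21, 3)   # new key: an entry is inserted between 17 and 25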
Deletion:

To delete a record, the system first looks up the record to be deleted. The actions the system takes next depend on whether the index is dense or sparse.

Dense indices:
* If the deleted record was the only record with its particular search-key value, then the system deletes the corresponding index record from the index.
* Otherwise the following actions are taken:
  + If the index record stores pointers to all records with the same search-key value, the system deletes the pointer to the deleted record from the index record.
  + Otherwise, the index record stores a pointer to only the first record with the search-key value. In this case, if the deleted record was the first record with the search-key value, the system updates the index record to point to the next record.

Sparse indices:
* If the index does not contain an index record with the search-key value of the deleted record, nothing needs to be done to the index. Otherwise the system takes the following actions:
  + If the deleted record was the only record with its search key, the system replaces the corresponding index record with an index record for the next search-key value (in search-key order). If the next search-key value already has an index entry, the entry is deleted instead of being replaced.
  + Otherwise, if the index record for the search-key value points to the record being deleted, the system updates the index record to point to the next record with the same search-key value.

Insertion and deletion algorithms for multilevel indices are a simple extension of the scheme just described.

Secondary Indices:

Secondary indices must be dense, with an index entry for every search-key value and a pointer to every record in the file. A primary index may be sparse, storing only some of the search-key values, since it is always possible to find records with intermediate search-key values by a sequential access to a part of the file. If a secondary index stores only some of the search-key values, records with intermediate search-key values may be anywhere in the file and, in general, we cannot find them without searching the entire file.

Secondary indices improve the performance of queries that use keys other than the search key of the primary index. However, they impose a significant overhead on the database. The designer of a database decides which secondary indices are desirable on the basis of an estimate of the relative frequency of queries and modifications.

3.1.4 Multiple Key File Organization

B+ tree Index Files:

The main disadvantage of the index-sequential file organization is that performance degrades as the file grows, both for index lookups and for sequential scans through the data. Although this degradation can be remedied by reorganization of the file, frequent reorganizations are undesirable. The B+ tree index structure is the most widely used of several index structures that maintain their efficiency despite insertion and deletion of data. A B+ tree index takes the form of a balanced tree in which every path from the root of the tree to a leaf of the tree is of the same length. Each non-leaf node in the tree has between ceil(n/2) and n children, where n is fixed for a particular tree.

The B+ tree structure imposes performance overhead on insertion and deletion, and adds space overhead. This overhead is acceptable even for frequently modified files, since the cost of file reorganization is avoided. Furthermore, since nodes may be as much as half empty, there is some wasted space; this space overhead, too, is acceptable given the performance benefits of the B+ tree structure.

Static Hashing:

One disadvantage of sequential file organization is that we must access an index structure to locate data, or must use binary search, and that results in more I/O operations. File organizations based on the technique of hashing allow us to avoid accessing an index structure. Hashing also provides a way of constructing indices. (A minimal sketch of static hashing appears after the list below.)

Dynamic Hashing:

Most databases grow larger over time. If we are to use static hashing for such a database, we have three classes of options:

* Choose a hash function based on the current file size. This option will result in performance degradation as the database grows.
* Choose a hash function based on the anticipated size of the file at some point in the future. Although performance degradation is avoided, a significant amount of space may be wasted initially.
* Periodically reorganize the hash structure in response to file growth. Such a reorganization involves choosing a new hash function, re-computing the hash function on every record in the file, and generating new bucket assignments. This reorganization is a massive, time-consuming operation. Furthermore, it is necessary to forbid access to the file during reorganization.
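The sketch below illustrates static hashing in Python. The fixed bucket count and the simple modulo hash function are assumptions made only for the example; the sketch also shows why a fixed bucket count eventually hurts, since every bucket's chain grows as the file grows.

NUM_BUCKETS = 8                      # fixed when the file is created (assumed)

def h(search_key):
    """Static hash function: map a search-key value to one of the buckets."""
    return hash(search_key) % NUM_BUCKETS

buckets = [[] for _ in range(NUM_BUCKETS)]

def insert(search_key, record):
    buckets[h(search_key)].append((search_key, record))

def lookup(search_key):
    """Examine only the records in one bucket, not the entire file."""
    return [rec for key, rec in buckets[h(search_key)] if key == search_key]

# As the file grows, the records pile up in the same eight buckets, which is
# the degradation that dynamic hashing schemes are designed to avoid.
for account_no in range(1, 101):
    insert(account_no, {"account_no": account_no, "balance": 1000})
print(len(lookup(42)))               # prints: 1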
Multiple-Key Access:

For certain types of queries, multiple indices can be used. Example:

SELECT * FROM ACCOUNT
WHERE BRANCH_NAME = 'CHENNAI' AND BALANCE = 1000

Possible strategies for processing this query using indices on single attributes:
* Use the index on BRANCH_NAME to find all accounts of the CHENNAI branch; test BALANCE = 1000 on each of them.
* Use the index on BALANCE to find all accounts with balances of 1000; test BRANCH_NAME = 'CHENNAI' on each of them.
* Use the BRANCH_NAME index to find pointers to all records pertaining to the CHENNAI branch. Similarly, use the index on BALANCE. Take the intersection of both sets of pointers obtained.

Indices on Multiple Keys:

Composite search keys are search keys containing more than one attribute. Example: (BRANCH_NAME, BALANCE).

Indices on Multiple Attributes:

Suppose we have an index on the combined search key (BRANCH_NAME, BALANCE). With the WHERE clause

WHERE BRANCH_NAME = 'CHENNAI' AND BALANCE = 1000

the index on (BRANCH_NAME, BALANCE) can be used to fetch only records that satisfy both conditions. Using separate indices is less efficient: we may fetch many records (or pointers) that satisfy only one of the conditions. The combined search key (BRANCH_NAME, BALANCE) can also efficiently handle

WHERE BRANCH_NAME = 'CHENNAI' AND BALANCE < 1000
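The third strategy, intersecting pointer sets, can be made concrete with a small Python sketch. The index contents and record identifiers below are invented purely for the illustration; each index maps an attribute value to the set of pointers to records having that value.

branch_index = {
    "CHENNAI": {101, 102, 105, 109},
    "MADURAI": {103, 104},
}
balance_index = {
    1000: {102, 104, 109},
    5000: {101, 105},
}

# WHERE BRANCH_NAME = 'CHENNAI' AND BALANCE = 1000:
# intersect the two pointer sets, then fetch only the surviving records.
matching_pointers = branch_index["CHENNAI"] & balance_index[1000]
print(sorted(matching_pointers))   # prints: [102, 109]

A composite index on (BRANCH_NAME, BALANCE) would return exactly these pointers directly, without first materializing the two larger intermediate sets.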
3.2 Management considerations

Evaluation of DBMS:

Evaluation is done based on the following groups of features:
* Data definition
* Physical definition
* Accessibility
* Transaction handling
* Utilities
* Development

The individual features usually examined under each group (together with a group of other, general features) are listed below.

Data definition: primary key enforcement, foreign key specification, data types available, data type extensibility, domain specification, ease of restructuring, integrity controls, view mechanism, data dictionary, data independence, type of data model used, schema evolution.

Physical definition: file structures available, file structure maintenance, ease of reorganization, indexing, variable-length fields/records, data compression, encryption routines, memory requirements, storage requirements.

Accessibility: query language (SQL-92/SQL3 compliant), other system interfacing, interfacing to 3GLs, multiuser support, security (access controls, authorization mechanism).

Transaction handling: backup and recovery routines, checkpointing facility, logging facility, granularity of concurrency, deadlock resolution strategy, advanced transaction models, parallel query processing.

Utilities: performance measuring, tuning, load/unload facilities, user usage monitoring, database administration support.

Development: 4GL/5GL tools, CASE tools, Windows capabilities, stored procedures, triggers, and rules.

Other features: interoperability with other DBMSs and other systems, Internet support, replication utilities, distributed capabilities, portability, hardware required, operating system required, network support, object-oriented capabilities, architecture (2- or 3-tier client/server), performance, transaction throughput, maximum number of concurrent users, extensible query optimization, scalability, vendor stability, user base, training and user support, documentation, cost, online help, standards used, version management.

Evaluation of Data Model:

An optimal data model should satisfy the criteria listed below:

* Structural validity: consistency with the way the enterprise defines and organizes information.
* Simplicity: ease of understanding by information systems professionals and non-technical users.
* Expressability: ability to distinguish between different data, relationships between data, and constraints.
* Non-redundancy: exclusion of extraneous information; in particular, the representation of any one piece of information exactly once.
* Sharability: not specific to any particular application or technology and thereby usable by many.
* Extensibility: ability to evolve to support new requirements with minimal effect on existing users.
* Integrity: consistency with the way the enterprise uses and manages information.
* Diagrammatic representation: ability to represent a model using an easily understood diagrammatic notation.

Database Administration:

A database administrator (DBA) is a person who is responsible for the environmental aspects of a database. In general, these include:

* Recoverability: creating and testing backups.
* Integrity: verifying or helping to verify data integrity.
* Security: defining and/or implementing access controls to the data.
* Availability: ensuring maximum uptime.
* Performance: ensuring maximum performance given budgetary constraints.
* Development and testing support: helping programmers and engineers to efficiently utilize the database.

The role of a database administrator has changed according to the technology of database management systems (DBMSs) as well as the needs of the owners of the databases. For example, although logical and physical database design are traditionally the duties of a database analyst or database designer, a DBA may be tasked to perform those duties. The duties of a database administrator vary and depend on the job description, corporate and Information Technology (IT) policies, and the technical features and capabilities of the DBMS being administered. They nearly always include disaster recovery (backups and testing of backups), performance analysis and tuning, data dictionary maintenance, and some database design.

Some of the roles of the DBA may include:

* Installation of new software: It is primarily the job of the DBA to install new versions of DBMS software, application software, and other software related to DBMS administration. It is important that the DBA or other IS staff members test this new software before it is moved into a production environment.
* Configuration of hardware and software with the system administrator: In many cases the system software can only be accessed by the system administrator. In this case, the DBA must work closely with the system administrator to perform software installations and to configure hardware and software so that they function optimally with the DBMS.
* Security administration: One of the main duties of the DBA is to monitor and administer DBMS security. This involves adding and removing users, administering quotas, auditing, and checking for security problems.
* Data analysis: The DBA will frequently be called on to analyze the data stored in the database and to make recommendations relating to the performance and efficiency of that data storage. This might relate to the more effective use of indexes, enabling "Parallel Query" execution, or other DBMS-specific features.
* Database design (preliminary): The DBA is often involved at the preliminary database-design stages. Through the involvement of the DBA, many problems that might otherwise occur can be eliminated. The DBA knows the DBMS and system, can point out potential problems, and can help the development team with special performance considerations.
* Data modeling and optimization: By modeling the data, it is possible to optimize the system layout to take the most advantage of the I/O subsystem.
* Administration of existing enterprise databases, and the analysis, design, and creation of new databases.

Storage Structure:

* A database file is partitioned into fixed-length storage units called blocks. Blocks are the units of both storage allocation and data transfer.
* A database system seeks to minimize the number of block transfers between the disk and memory. We can reduce the number of disk accesses by keeping as many blocks as possible in main memory.
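Keeping recently used blocks in main memory is the job of a buffer manager. The fragment below is only a minimal sketch of that idea: the block size, buffer capacity, least-recently-used replacement policy, and the read_block_from_disk helper are all assumptions made for the illustration, not a description of any particular DBMS.

from collections import OrderedDict

BLOCK_SIZE = 4096          # bytes per block (assumed)
BUFFER_CAPACITY = 100      # number of blocks kept in main memory (assumed)

def read_block_from_disk(block_number):
    """Placeholder for a real disk read of one fixed-length block."""
    return bytearray(BLOCK_SIZE)

class BufferManager:
    """Keep recently used blocks in memory; evict the least recently used."""

    def __init__(self, capacity=BUFFER_CAPACITY):
        self.capacity = capacity
        self.blocks = OrderedDict()   # block_number -> block contents

    def get_block(self, block_number):
        if block_number in self.blocks:
            self.blocks.move_to_end(block_number)   # hit: no disk transfer
            return self.blocks[block_number]
        block = read_block_from_disk(block_number)  # miss: one disk transfer
        self.blocks[block_number] = block
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)         # evict least recently used
        return block

buffer_manager = BufferManager()
buffer_manager.get_block(7)   # first access reads the block from disk
buffer_manager.get_block(7)   # second access is served from memory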
