Database Principles: Fundamentals of Design, Implementation, and Management, 3rd Edition, Carlos Coronel


Australia Brazil Mexico South Africa Singapore United Kingdom United States

Copyright 2020 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
This is an electronic version of the print textbook. Due to electronic rights restrictions,
some third party content may be suppressed. Editorial review has deemed that any suppressed
content does not materially affect the overall learning experience. The publisher reserves the right
to remove content from this title at any time if subsequent rights restrictions require it. For
valuable information on pricing, previous editions, changes to current editions, and alternate
formats, please visit www.cengage.com/highered to search by ISBN#, author, title, or keyword for
materials in your areas of interest.

Important Notice: Media content referenced within the product description or the product
text may not be available in the eBook version.

Database Principles: Fundamentals of Design, Implementation, and Management
Third Edition

US Authors: Carlos Coronel, Steven Morris
Adapters: Keeley Crockett, Craig Blewett

Publisher: Marinda Louw
Marketing Manager: Anna Reading
Senior Content Project Manager: Sue Povey
Manufacturing Manager: Eyvett Davis
Typesetter: SPi-Global
Cover Designer: Simon Levy Associates
Cover Image(s): Vijay Kumar/Getty Images

Adapted from Database Systems: Design, Implementation, & Management 13th Edition, by Carlos Coronel, Steven Morris. Copyright Cengage Learning, Inc., 2019. All Rights Reserved.

ALL RIGHTS RESERVED. No part of this work may be reproduced, transmitted, stored, distributed or used in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of Cengage Learning or under license in the U.K. from the Copyright Licensing Agency Ltd.

The Author(s) and the Adapter(s) have asserted the right under the Copyright Designs and Patents Act 1988 to be identified as Author(s) and Adapter(s) of this Work.

For product information and technology assistance,

For permission to use material from this text or product and for permission queries, email emea.permissions@cengage.com

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

ISBN: 978-1-4737-6804-8

Cengage Learning, EMEA
Cheriton House, North Way
Andover, Hampshire, SP10 5BE
United Kingdom

Cengage Learning is a leading provider of customized learning solutions with employees residing in nearly 40 different countries and sales in more than 125 countries around the world. Find your local representative at: www.cengage.co.uk.

Cengage Learning products are represented in Canada by Nelson Education, Ltd.

To learn more about Cengage platforms and services, register or access your online learning solution, or purchase materials for your course, visit www.cengage.com.

Printed in China at RR Donnelley
Print Number: 01 Print Year: 2020

Part I Database Systems

1 The Database Approach
2 Data Models
3 Relational Model Characteristics
4 Relational Algebra and Calculus

Part II Design Concepts

5 Data Modelling with Entity Relationship Diagrams
6 Data Modelling Advanced Concepts
7 Normalising Database Designs

Part III Database Programming

8 Beginning Structured Query Language
9 Procedural Language SQL and Advanced SQL

Part IV Database Design

10 Database Development Process
11 Conceptual, Logical, and Physical Database Design

Part V Database Transactions and Performance Tuning

12 Managing Transactions and Concurrency
13 Managing Database and SQL Performance

Part VI Database Management

14 Distributed Databases
15 Databases for Business Intelligence
16 Big Data and NoSQL
17 Database Connectivity and Web Technologies

Glossary
Index

Appendices (Available online)

Appendix A: Designing Databases with Visio Professional: A Tutorial
Appendix B: The University Lab: Conceptual, Logical, and Physical Database Design
Appendix C: Global Tickets Ltd: Conceptual, Logical, and Physical Database Design
Appendix D: Converting an ER Model into a Database Structure
Appendix E: Comparison of ER Modelling Notations
Appendix F: Client/Server Systems
Appendix G: Object-Orientated Databases
Appendix H: Databases in e-Commerce
Appendix I: The Hierarchical Database Model
Appendix J: The Network Database Model
Appendix K: Database Administration
Appendix L: Data Warehouse Implementation Factors
Appendix M: Creating a New Database Using Oracle 12c
Appendix N: A Guide to Using SQL Developer with Oracle 12c
Appendix O: Building a Simple Object-Relational Database Using Oracle Objects
Appendix P: Microsoft Access Tutorial
Appendix Q: Working with MongoDB
Appendix R: Working with Neo4j

Preface
Changes to the Third Edition
Acknowledgements
About the Authors
Walk Through Tour
Dedication
Teaching and Learning Support Resources

Part I Database Systems
Business Vignette: The Relational Revolution: An Historical Journey

1 The Database Approach
Preview
1.1 Data vs information
1.2 Introducing the database and the DBMS
1.3 Why database design is important
1.4 Historical roots: files and data processing
1.5 Problems with file system data management
1.6 Database systems
1.7 Preparing for your database professional career
Summary
Key terms
Further reading
Review questions
Problems

2 Data Models
Preview
2.1 The importance of data models
2.2 Data model basic building blocks
2.3 Business rules
2.4 The evolution of data models
2.5 Degrees of data abstraction
Summary
Key terms
Further reading
Review questions
Problems

3 Relational Model Characteristics
Preview
3.1 A logical view of data
3.2 Keys
3.3 Integrity rules
3.4 The data dictionary and the system catalogue
3.5 Relationships within the relational database
3.6 Data redundancy revisited
3.7 Indexes
3.8 Codd's relational database rules
Summary
Key terms
Further reading
Review questions
Problems

4 Relational Algebra and Calculus
Preview
4.1 Relational operators
4.2 Joins
4.3 Constructing queries using relational algebraic expressions
4.4 Relational calculus
Summary
Key terms
Further reading
Review questions
Problems

Part II Design Concepts
Business Vignette: Using Data to Improve the Lives of Children and Women

5 Data Modelling with Entity Relationship Diagrams
Preview
5.1 The entity relationship (ER) model
5.2 Developing an ER diagram
5.3 Database design challenges: conflicting goals
Summary
Key terms
Further reading
Review questions
Problems

6 Data Modelling Advanced Concepts
Preview
6.1 The extended entity relationship model
6.2 Entity clustering
6.3 Entity integrity: selecting primary keys
6.4 Design cases: learning flexible database design
6.5 Data modelling checklist
Summary
Key terms
Further reading
Review questions
Problems
Case studies

7 Normalising Database Designs
Preview
7.1 Database tables and normalisation
7.2 The need for normalisation
7.3 The normalisation process
7.4 Improving the design
7.5 Surrogate key considerations
7.6 Higher-level normal forms
7.7 Normalisation and database design
7.8 Denormalisation
Summary
Key terms
Further reading
Review questions
Problems

Part III Database Programming
Business Vignette: Open Source Databases

8 Beginning Structured Query Language
Preview
8.1 Introduction to SQL
8.2 Data definition commands
8.3 Data manipulation commands
8.4 Select queries
8.5 Advanced data definition commands
8.6 Advanced select queries
8.7 Virtual tables: creating a view
8.8 Joining database tables
Summary
Key terms
Further reading
Review questions
Problems

9 Procedural Language SQL and Advanced SQL
Preview
9.1 Relational set operators
9.2 SQL join operators
9.3 Subqueries and correlated queries
9.4 SQL functions
9.5 Oracle sequences
9.6 Updatable views
9.7 Procedural SQL
9.8 Embedded SQL
Summary
Key terms
Further reading
Review questions
Problems
Case

Part IV Database Design
Business Vignette: EM-DAT: The International Disaster Database for Disaster Preparedness

10 Database Development Process
Preview
10.1 The information system
10.2 The systems development life cycle (SDLC)
10.3 The database life cycle (DBLC)
10.4 Database design strategies
10.5 Centralised vs decentralised design
10.6 Database administration
Summary
Key terms
Further reading
Review questions
Problems

11 Conceptual, Logical, and Physical Database Design
Preview
11.1 Conceptual design
11.2 Logical database design
11.3 Physical database design
Summary
Key terms
Further reading
Review questions
Problems

Part V Database Transactions and Performance Tuning
Business Vignette: From Data Warehouse to Data Lake

12 Managing Transactions and Concurrency
Preview
12.1 What is a transaction?
12.2 Concurrency control
12.3 Concurrency control with locking methods
12.4 Concurrency control with time stamping methods
12.5 Concurrency control with optimistic methods
12.6 ANSI levels of transaction isolation
12.7 Database recovery management
Summary
Key terms
Further reading
Review questions
Problems

13 Managing Database and SQL Performance
Preview
13.1 Database performance-tuning concepts
13.2 Query processing
13.3 Indexes and query optimisation
13.4 Optimiser choices
13.5 SQL performance tuning
13.6 Query formulation
13.7 DBMS performance tuning
13.8 Query optimisation example
Summary
Key terms
Further reading
Review questions
Problems

Part VI Database Management
Business Vignette: The Facebook-Cambridge Analytica Data Scandal and the GDPR

14 Distributed Databases
Preview
14.1 The evolution of distributed database management systems
14.2 DDBMS advantages and disadvantages
14.3 Distributed processing and distributed databases
14.4 Characteristics of distributed database management systems
14.5 DDBMS components
14.6 Levels of data and process distribution
14.7 Distributed database transparency features
14.8 Distribution transparency
14.9 Transaction transparency
14.10 Performance and failure transparency
14.11 Distributed database design
14.12 The CAP theorem
14.13 Database security
14.14 Distributed databases within the cloud
14.15 C.J. Date's 12 commandments for distributed databases
Summary
Key terms
Further reading
Review questions
Problems

15 Databases for Business Intelligence
Preview
15.1 The need for data analysis
15.2 Business intelligence
15.3 Decision support data
15.4 The data warehouse
15.5 Star schemas
15.6 Data analytics
15.7 Online analytical processing
15.8 SQL analytic functions
15.9 Data visualisation
Summary
Key terms
Further reading
Review questions
Problems

16 Big Data and NoSQL
Preview
16.1 Big data
16.2 Hadoop
16.3 NoSQL databases
16.4 NewSQL databases
16.5 Working with document databases using MongoDB
16.6 Working with graph databases using Neo4j
Summary
Key terms
Review questions

17 Database Connectivity and Web Technologies
Preview
17.1 Database connectivity
17.2 Database internet connectivity
17.3 Extensible markup language (XML)
17.4 Cloud computing services
17.5 The semantic web
Summary
Key terms
Further reading
Review questions
Problems

Glossary
Index

Appendices (Available online)

Appendix A: Designing Databases with Visio Professional: A Tutorial
Appendix B: The University Lab: Conceptual, Logical, and Physical Database Design
Appendix C: Global Tickets Ltd: Conceptual, Logical, and Physical Database Design
Appendix D: Converting an ER Model into a Database Structure
Appendix E: Comparison of ER Modelling Notations
Appendix F: Client/Server Systems
Appendix G: Object-Orientated Databases
Appendix H: Databases in e-Commerce
Appendix I: The Hierarchical Database Model
Appendix J: The Network Database Model
Appendix K: Database Administration
Appendix L: Data Warehouse Implementation Factors
Appendix M: Creating a New Database Using Oracle 12c
Appendix N: A Guide to Using SQL Developer with Oracle 12c
Appendix O: Building a Simple Object-Relational Database Using Oracle Objects
Appendix P: Microsoft Access Tutorial
Appendix Q: Working with MongoDB
Appendix R: Working with Neo4j


Preface

We are excited to introduce the third edition of Database Principles, which is designed to provide a
solid and practical foundation for the design, implementation and management of database systems.
This foundation is built on the notion that, while databases are very practical things, their successful
creation depends on understanding the important concepts that define them.
This edition is suitable for a first course in databases at undergraduate level and will also provide
essential material for conversion postgraduate courses. Providing comprehensive and practical coverage
of core database concepts, it is an ideal text not only for those studying database management systems
in the context of computer science, but also those on courses in the areas of business technology,
introductory data science and data analytics.

The Approach: Continued Emphasis on the Stages of Design

As the title suggests, Database Principles: Design, Implementation, and Management covers three
broad aspects of database systems. However, for several important reasons, special attention is given
to database design:

The availability of excellent database software enables even database-inexperienced people to
create databases and database applications. Unfortunately, the "create without design" approach
usually paves the way to any number of database disasters. In our experience, many, if not most,
database system failures are traceable to poor design and cannot be solved with the help of even
the best programmers and managers. Nor is better DBMS software likely to overcome problems
created or magnified by poor design. Using an analogy, even the best bricklayers and carpenters
can't create a good building from a bad blueprint.

Most difficult problems associated with database system management seem to be triggered
by poorly designed databases. It hardly seems worthwhile to use scarce resources to develop
excellent and extensive database system management skills in order to exercise them on crises
induced by poorly designed databases.

Design provides an excellent means of communication. Clients are more likely to get what they
need when database system design is approached carefully and thoughtfully. In fact, clients may
discover how their organisations really function once a good database design is completed.

Familiarity with database design techniques promotes one's understanding of current database
technologies. For example, because data warehouses derive much of their data from operational
databases, data warehouse concepts, structures, and procedures make more sense when the
operational database's structure and implementation are understood.

Because the practical aspects of database design are stressed, we have covered design concepts and
procedures in detail, making sure that the numerous end-of-chapter problems are sufficiently challenging
for students to develop real and useful design skills. We also make sure that students understand
the potential and actual conflicts between database design elegance, information requirements, and
transaction processing speed. For example, it makes little sense to design databases that meet design
elegance standards while they fail to meet end-user information requirements. Therefore, we explore
the use of carefully defined trade-offs to ensure that the databases are capable of meeting end-user
requirements while conforming to high design standards.
This edition retains the use of UML (Unified Modelling Language) notation for data modelling.
Continual development by the Object Management Group has led to UML becoming an International
Standard (UML 2.5.1 is available as the 2017 edition standard: ISO/IEC 19505-1 and 19505-2), which
is continually reviewed. In keeping with the second edition, UML has continued to be used to produce
entity relationship models within this third edition. However, as organisations still use both Chen and
Crow's Foot notation approaches to data modelling in order to maintain legacy systems, it is important
that familiarity is maintained. Appendix E, Comparison of ER Modelling Notations, contains coverage
of both these notations.

Changes to the Third Edition

In this third edition, we have added some new features and continued to strengthen the already strong
database design coverage. Here are just a few of the highlights:

To support the growth of Big Data and NoSQL technology, we have added a new Chapter 16: Big
Data and NoSQL. The chapter focuses in greater depth on the characteristics of Big Data and the
technologies that have been developed to support its use, including Hadoop and MongoDB.

New and expanded coverage of data visualisation tools and techniques in Chapter 15, Databases
for Business Intelligence.

New and updated Business Vignettes to provide topical discussion points in the classroom.

Coverage of MongoDB with hands-on exercises for querying MongoDB databases (Appendix Q).

An additional appendix containing coverage of Neo4j with hands-on exercises for querying graph
databases (Appendix R).

Acknowledgements

The publisher acknowledges the contribution of the following lecturers, who provided invaluable
feedback on the second and third editions:

Emilia Mwim, UNISA

Patricia Alexander, University of Pretoria

Judy van Biljon, UNISA

Casper Wessels, Central University of Technology

Theo Macdonald, University of the Free State

Ismael Essop, University of Greenwich

Chris Jakeman, Peterborough Regional College

Andy Davies, Blackburn College

Mick Ridley, University of Bradford

Ray Turner, University of Essex

Mark Green, Oxford Brookes University

Duncan McPhee, University of Glamorgan

For this edition, I would like to say a special thanks to Pamela Quick, who previously worked as a
Senior Lecturer in the School of Computing, Maths and Digital Technology at Manchester Metropolitan
University. Her years of experience within the database field have been very valuable, specifically the
coverage of relational algebra.
On this third edition, I have been lucky to work with a very patient, supportive and professional
Publisher, Marinda Louw. Marinda provided fantastic support in answering all my emails. It has been
a pleasure working with you.
Last, and certainly not least, thank you to my family (my ohana) for your patience and support.

Keeley Crockett
January 2020

About the Authors

Carlos Coronel is currently the Lab Director for the College of Business Computer Labs at Middle
Tennessee State University. He has over 25 years of experience in various fields as a Database
Administrator, Network Administrator, Web Manager and Technology Specialist, and has taught
courses in Web development, database design and development, and data communications at the
undergraduate and graduate levels.

Steven Morris completed his Bachelor of Science and PhD from Auburn University. He has taught
Database Design and Development, Database Programming with Advanced SQL and PL/SQL, Systems
Analysis and Design, and Principles of MIS at Middle Tennessee State University. Steven has published
many articles, and currently serves on the review boards of several journals.

Dr Keeley Crockett is a Reader in Computational Intelligence in the School of Computing, Mathematics
and Digital Technology at Manchester Metropolitan University. She gained a BSc Degree (Hons) in
Computation from UMIST in 1993, and a PhD in the field of machine learning in 1998 entitled "Fuzzy
Rule Induction from Data Domains". She has been teaching within the field of database systems
and data engineering for 20 years to both undergraduate and postgraduate students. She leads the
Computational Intelligence Research Lab, which has established a strong international presence for its
research into Adaptive Psychological Profiling using artificial intelligence, fuzzy systems, and natural
language dialogue systems. She has published over 125 refereed conference papers and journal articles
in major international conferences and journals. She is an active volunteer in the IEEE, undertaking many
roles such as being a member of the IEEE Women in Engineering Leadership committee and the IEEE
Women in Computational Intelligence subcommittee. Keeley is also proud to
be a STEM Ambassador with a passion for outreach in computer science in rural schools.

Dr Craig Blewett has been researching and teaching in the area of Information Systems and Technology
in South Africa for over 25 years. His Master's explored the application of Artificial Intelligence to
database transaction management. His PhD, in education technology, resulted in the development of
the Activated Classroom Teaching (ACT) model, a unique approach to teaching with technology. Craig
is the founder of multiple technology companies and is the author of numerous books covering topics
such as computer literacy, database systems, teaching with technology, running, and active living. He
is also an internationally acclaimed speaker who is using his innovative approaches to help change
education in our rapidly changing digital world.

BUSINESS VIGNETTE

THE RELATIONAL REVOLUTION: AN HISTORICAL JOURNEY

Until the late 1970s, databases stored large amounts of data in structures that were inflexible and difficult to navigate. Programmers needed to know what clients wanted to do with the data before the database was designed. Adding or changing the way the data were stored or analysed was time-consuming and expensive.

In 1970, Edgar "Ted" Codd, a mathematician employed by IBM, published a groundbreaking article entitled "A Relational Model of Data for Large Shared Data Banks". At the time, nobody realised that Codd's theories would spark a technological revolution on par with the development of personal computers and the internet. Don Chamberlin, co-inventor of SQL, the most popular database query language today, explains: "There was this guy Ted Codd who had some kind of strange mathematical notation, but nobody took it very seriously."

Then Ted Codd organised a symposium, and Chamberlin listened as Codd reduced complicated five-page programs to one line. "And I said, 'Wow'," Chamberlin recalls. The symposium convinced IBM to fund System R, a research project that built a prototype of a relational database, which would eventually lead to the creation of SQL and DB2. IBM, however, kept System R on the back burner for a number of years, which turned out to be a crucial decision, because the company had a vested interest in IMS, a reliable, high-end database system that had been released in 1968.

At about the same time as System R started up, two professors from the University of California at Berkeley, who had read Codd's work, established a similar project called Ingres. The competition between the two tight-knit groups fuelled a series of papers. Unaware of the market potential of this research, IBM allowed its staff to publish these papers. Among those reading the papers was Larry Ellison, who had just founded a small company called Software Development Laboratories.

Recruiting programmers from System R and Ingres, and securing funding from the CIA and the Navy, Ellison was able to market the first SQL-based relational database in 1979, well before IBM. By 1983, the company (Software Development Laboratories) had released a portable version of the database, had grossed over 3 910 000 annually, and had changed its name to Oracle.

Business Vignettes illustrate the part topics with a genuine scenario and show how the subject integrates with the real world.

CHAPTER 1
The Database Approach

IN THIS CHAPTER, YOU WILL LEARN:

The difference between data and information
What a database is, what the different types of databases are, and why they are valuable assets for decision making
The importance of database design
How modern databases evolved from file systems
About flaws in file system data management
What the database system's main components are and how a database system differs from a file system
The main functions of a database management system (DBMS)
The role of open source database systems
The importance of data governance and data quality

PREVIEW

Good decisions require good information, which is derived from raw facts known as data. Data are likely to be managed most efficiently when they are stored in a database.

In this chapter, you learn what a database is, what it does and why it yields better results than other data management methods. You will also learn about different types of databases and why database design is so important.

Databases evolved from computer file systems. Although file system data management is now largely outmoded, understanding the characteristics of file systems is important because they are the source of serious data management limitations. In this chapter, you will also learn how the database system approach helps eliminate most of the shortcomings of file system data management.

Chapter Previews set the scene for the chapter and provide an overview of the chapter's contents.

CHAPTER 3
Relational Model Characteristics

IN THIS CHAPTER, YOU WILL LEARN:

That the relational database model takes a logical view of data
That the relational model's basic components are relations implemented through tables in a relational DBMS
How relations are organised in tables composed of rows (tuples) and columns (attributes)
Key terminology used in describing relations
About the role of the data dictionary, and the system catalogue
How data redundancy is handled in the relational database model
Why indexing is important

The criticisms of field definitions and naming conventions shown in the file structure of Figure 1.3 are not unique to file systems. Because such conventions will prove to be important later, they are introduced early. You will revisit field definitions and naming conventions when you learn about database design in Chapter 5, Data Modelling with Entity Relationship Diagrams, and in Chapter 6, Data Modelling Advanced Concepts; and when you learn about database implementation issues in Chapter 11, Conceptual, Logical and Physical Database Design. Regardless of the data environment, the design, whether it involves a file system or a database, must always reflect the designer's documentation needs and the end user's reporting and processing requirements. Both types of needs are best served by adhering to proper field definitions and naming conventions.

Online Content: Appendices A to P are available on the online platform accompanying this book.

NOTE

No naming convention can fit all requirements for all systems. Some words or phrases are reserved for the DBMS's internal use. For example, the name ORDER generates an error in some DBMSs. Similarly, your DBMS might interpret a hyphen (-) as a command to subtract. Therefore, the field CUS-NAME would be interpreted as a command to subtract the NAME field from the CUS field. Because neither field exists, you would get an error message. On the other hand, CUS_NAME would work fine because it uses an underscore.
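The behaviour described in this note can be tried out directly. The following is a minimal sketch using Python's built-in sqlite3 module; SQLite is chosen here only for convenience and is an assumption, since the text does not name a specific DBMS, and other systems may accept different names or report different errors. The CUSTOMER and DEMO tables, the run helper and the sample values are all hypothetical.

```python
import sqlite3


def run(conn, sql):
    """Run one statement; return (rows, None) on success or (None, message) on error."""
    try:
        return conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        return None, str(exc)


conn = sqlite3.connect(":memory:")

# ORDER is a reserved word in SQL, so SQLite rejects it as a bare table name.
_, reserved_error = run(conn, "CREATE TABLE ORDER (ORDER_ID INTEGER)")

# An underscore is an ordinary identifier character, so CUS_NAME is accepted.
conn.execute("CREATE TABLE CUSTOMER (CUS_NAME TEXT)")
conn.execute("INSERT INTO CUSTOMER VALUES ('Example Customer')")
underscore_rows, _ = run(conn, "SELECT CUS_NAME FROM CUSTOMER")

# CUS-NAME is parsed as "CUS minus NAME"; neither column exists in CUSTOMER.
_, hyphen_error = run(conn, "SELECT CUS-NAME FROM CUSTOMER")

# With numeric columns literally named CUS and NAME (hypothetical),
# the same text is evaluated as a subtraction.
conn.execute("CREATE TABLE DEMO (CUS INTEGER, NAME INTEGER)")
conn.execute("INSERT INTO DEMO VALUES (10, 3)")
difference_rows, _ = run(conn, "SELECT CUS-NAME FROM DEMO")

print("Reserved word:", reserved_error)
print("Underscore:", underscore_rows)
print("Hyphen, no such columns:", hyphen_error)
print("Hyphen, columns exist:", difference_rows)
```

In this sketch the hyphenated name never refers to a single field: it either fails because the operands do not exist or silently returns a difference, which is arguably the more dangerous outcome.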

PREVIEW

1.5.3 Data Redundancy

In Chapter 2, Data Models, you learnt that the relational data models structural

and data independence allow you to examine the models logical structure without

The file systems structure and lack of security make it difficult to combine data from multiple sources.

considering the physical aspects of data storage and retrieval. You also learnt that the

The organisational structure promotes the storage of the same basic data in different locations.
ERM may be used to depict entities and their relationships graphically through an ERD.

(Database professionals use the term islands of information for such scattered data locations.) As
In this chapter, you will learn some important details about the relational models logical

it is unlikely that data stored in different locations will always be updated consistently, the islands of

structure and more about how the ERD can be used to design a relational database.
information often contain different versions of the same data. For example, in Figures 1.3 and 1.4, the

You will learn how the relational databases basic data components fit into a
agent names and phone numbers occur in both the CUSTOMER and the AGENT files. You need only

logical construct known as a table. You will discover that one important reason for the

one correct copy of the agent names and phone numbers. Having them occur in more than one place
relational database models simplicity is that its tables can be treated as logical rather

produces data redundancy. Data redundancy exists when the same data are stored unnecessarily at
than physical units. You will also learn how the independent tables within the database

different places.

can be related to one another.

Uncontrolled data redundancy sets the stage for:

After learning about tables, their components and their relationships, you are

introduced to the basic concepts that shape the design of tables. Because the table is
Data inconsistency. Data inconsistency exists when different and conflicting versions of the same

such an integral part of relational database design, you will also learn the characteristics
data appear in different places. For example, suppose you change an agents phone number or

of well-designed and poorly designed tables. address in the AGENT file. If you forget to make corresponding changes in the CUSTOMER file,

Finally, you are introduced to some basic concepts that will become your gateway the files contain different data for the same agent. Reports will yield inconsistent results depending

to the next few chapters. For example, you will examine different kinds of relationships on which version of the data is used.

and the way in which those relationships might be handled in the relational database

Poor data security. Having multiple copies of data increases the chances of a copy of the data
environment.

being susceptible to unauthorised access.

Learning Objectives appear at the start of each chapter Online Content boxes draw attention to relevant material
to help you monitoryour understandingand progress onthe online platformfor this book.
through each chapter. Each chapter also ends with a Notes highlight important facts about the concepts
summary section that recaps the key content for revision introduced in the chapter.
purposes.
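The NOTE in the sample page above warns that reserved words and hyphens can break field names. As a minimal sketch, the Python snippet below tries the names from that NOTE against SQLite, which is our choice of DBMS for illustration only; the book does not name a product, and other systems may behave differently. The table names T1 and T2 and the helper try_sql are hypothetical.

```python
import sqlite3

def try_sql(conn, sql):
    """Run one statement; return 'ok' or the DBMS error message."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

results = {
    # ORDER is a reserved word, so SQLite rejects it as an unquoted table name.
    "table ORDER": try_sql(conn, "CREATE TABLE ORDER (ORD_NUM INTEGER)"),
    # The hyphen is parsed as an operator, not as part of the field name.
    "field CUS-NAME": try_sql(conn, "CREATE TABLE T1 (CUS-NAME TEXT)"),
    # An underscore is an ordinary identifier character.
    "field CUS_NAME": try_sql(conn, "CREATE TABLE T2 (CUS_NAME TEXT)"),
    # In a query, CUS-NAME is read as CUS minus NAME; neither column exists.
    "query CUS-NAME": try_sql(conn, "SELECT CUS-NAME FROM T2"),
}

for name, outcome in results.items():
    print(f"{name:15} -> {outcome}")
```

Only the CUS_NAME statement should succeed here; the exact wording of the error messages depends on the DBMS and its version.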
Copyright 2020 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

TABLE 2.3 Levels of data abstraction

Model | Degree of Abstraction | Focus | Independent of
External | High | End-user views | Hardware and software
Conceptual | | Global view of data (independent of database model) | Hardware and software
Internal | | Specific database model | Hardware
Physical | Low | Storage and access methods | Neither hardware nor software

SUMMARY

A data model is a (relatively) simple abstraction of a complex real-world data environment. Database designers use data models to communicate with applications programmers and end users. The basic data-modelling components are entities, attributes, relationships and constraints. Business rules are used to identify and define the basic modelling components within a specific real-world environment.

The hierarchical and network data models were early models that are no longer used, but some of the concepts are found in current data models.

The relational model is the current database implementation standard. In the relational model, the end user perceives the data as being stored in tables. Tables are related to each other by means of common values in common attributes. The entity relationship (ER) model is a popular graphical tool for data modelling that complements the relational model. The ER model allows database designers to visually present different views of the data as seen by database designers, programmers and end users and to integrate the data into a common framework.

The object-orientated data model (OODM) uses objects as the basic modelling structure. An object resembles an entity in that it includes the facts that define it. But unlike an entity, the object also includes information about relationships between the facts as well as relationships with other objects, thus giving its data more meaning.

The relational model has adopted many object-orientated (OO) extensions to become the extended relational data model (ERDM). At this point, the OODM is largely used in specialised engineering and scientific applications, while the ERDM is primarily geared to business applications. Although the most likely future scenario is an increasing merger of OODM and ERDM technologies, both are overshadowed by the need to develop internet access strategies for databases.

NoSQL databases are a new generation of databases that do not use the relational model and are geared to support the very specific needs of Big Data organisations. NoSQL databases offer distributed data stores that provide high scalability, availability and fault tolerance by sacrificing data consistency and shifting the burden of maintaining relationships and data integrity to the program code.

Data modelling requirements are a function of different data views (global vs local) and the level of data abstraction. The American National Standards Institute Standards Planning and Requirements Committee (ANSI/SPARC) describes three levels of data abstraction: external, conceptual and internal. There is also a fourth level of data abstraction (the physical level). This lowest level of data abstraction is concerned exclusively with physical storage methods.

User queries can be written as relational algebraic expressions. In order to write such an expression, the following steps should be taken:

List all the attributes we need to give the answer.

Select all the relations we need, based on the list of attributes.

Specify the relational operators and the intermediate results that are needed.

Relational calculus is a formal language based upon a branch of mathematical logic called predicate calculus.

Tuple relational calculus allows users to describe what they want, rather than how to compute it, and underlies the appearance of Structured Query Language (SQL). Expressions in tuple relational calculus return tuples for which a given predicate is true.

Domain relational calculus is different from tuple relational calculus as it uses domain variables that take on values from an attribute domain.

TABLE 4.1 Summary of relational operators

Relational Operator | Symbol | Description
SELECT | σ | Selects a subset of tuples from a relation.
PROJECT | Π | Selects a subset of columns from a relation.
DIFFERENCE | − | Selects tuples in Relation1 but not in Relation2*.
INTERSECT | ∩ | Selects tuples that appear in both Relation1 and Relation2*.
UNION | ∪ | Selects tuples that appear in Relation1 or in Relation2, excluding duplicate tuples*.
CARTESIAN PRODUCT | × | Computes all the possible combinations of tuples.
THETA JOIN | θ | Allows two relations to be combined using one of the comparison operators {=, <, ≤, ≥, ≠, >}. When the operator is = the join is known as an EQUIJOIN.
NATURAL JOIN | |X| | A version of the EQUIJOIN which selects those tuples where Relation1Tuple.Y = Relation2Tuple.Y. Y is a set of common attributes to both relations which must share the same domain. Duplicate columns are removed.
OUTERJOIN | | Based on the θ-JOIN and natural JOIN, the OUTERJOIN in addition selects all the tuples in Relation1 that have no corresponding values in the relation Relation2.
DIVIDE | ÷ | Selects tuples in Relation1 that match every row in Relation2.
EXISTENTIAL | ∃ | A formula must be true for at least one instance.
UNIVERSAL | ∀ | The formula must be true for all instances.

* In the case of these operators, relations must be union-compatible.

KEY TERMS

closure, difference, DIVISION, domain relational calculus, equijoin, INTERSECT, join column(s), left outer join, natural join, PROJECT, predicate calculus, relational algebra, relational algebraic expression, relational schema, RESTRICT, right outer join, SELECT, safe expression, set theory, theta join, tuple relational calculus, UNION, union-compatible

Summary Each chapter ends with a comprehensive summary that provides a thorough recap of the issues in each chapter, helping you to assess your understanding and revise key content.

Key Terms are listed at the end of the chapter and explained in full in a Glossary at the end of the book, enabling you to find explanations of key terms quickly.
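The SELECT, PROJECT and NATURAL JOIN operators from Table 4.1 in the sample page above can be sketched in plain Python by treating a relation as a list of dictionaries. This is an illustrative sketch rather than the book's notation: the function names and the AGENT and CUSTOMER sample rows are hypothetical.

```python
def select(relation, predicate):
    """SELECT: keep the tuples (rows) that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """PROJECT: keep only the named columns, dropping duplicate tuples."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

def natural_join(rel1, rel2):
    """NATURAL JOIN: match tuples on all common attributes; shared columns appear once."""
    if not rel1 or not rel2:
        return []
    common = [a for a in rel1[0] if a in rel2[0]]
    return [
        {**r1, **r2}
        for r1 in rel1
        for r2 in rel2
        if all(r1[a] == r2[a] for a in common)
    ]

# Hypothetical sample relations.
AGENT = [
    {"AGENT_CODE": 501, "AGENT_NAME": "Alby"},
    {"AGENT_CODE": 502, "AGENT_NAME": "Hahn"},
]
CUSTOMER = [
    {"CUS_CODE": 10010, "CUS_NAME": "Ramas", "AGENT_CODE": 502},
    {"CUS_CODE": 10011, "CUS_NAME": "Dunne", "AGENT_CODE": 501},
    {"CUS_CODE": 10012, "CUS_NAME": "Smith", "AGENT_CODE": 502},
]

# "Which customers does agent Hahn serve?" as join, then select, then project.
joined = natural_join(CUSTOMER, AGENT)
hahn_customers = project(
    select(joined, lambda row: row["AGENT_NAME"] == "Hahn"),
    ["CUS_NAME"],
)
print(hahn_customers)
```

The query follows the three steps listed in the summary: pick the attributes needed (CUS_NAME, AGENT_NAME), pick the relations that hold them, then chain the operators, keeping each intermediate result as a relation.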

KEY TERMS

query, query language, query result set, record, semi-structured data, single-user database, social media, structural dependence, structural independence, Structured Query Language (SQL), transactional database, workgroup database, XML database

FURTHER READING

Date, C.J. Date on Database: Writings 2000–2006. Apress, 2006.

Online Content Answers to selected Review Questions and Problems for this chapter are available on the online platform accompanying this book.

REVIEW QUESTIONS

1 Discuss each of the following terms:

a data

b field

c record

d file

2 What is data redundancy and which characteristics of the file system can lead to it?

3 Discuss the lack of data independence in file systems.

4 What is a DBMS, and what are its functions?

5 What is structural independence, and why is it important?

6 Explain the difference between data and information.

7 What is the role of a DBMS, and what are its advantages?

8 List and describe the different types of databases.

9 What are the main components of a database system?

10 What is metadata?

11 Explain why database design is important.

12 What are the potential costs of implementing a database system?

13 Use examples to compare and contrast structured and unstructured data. Which type is more prevalent in a typical business environment?

14 What are the six levels on which the quality of data can be examined?

15 Explain what is meant by data governance.

PROBLEMS

Online Content The file structures you see in this problem set are simulated in a Microsoft Access database named Ch01_Problems, available on the online platform for this book.

FIGURE P1.1 The file structure for Problems 1–4

PROJECT_CODE | PROJECT_MANAGER | MANAGER_PHONE | MANAGER_ADDRESS | PROJECT_BID_PRICE
21-5Z | Holly B. Naidu | 33-5-59200506 | 180 Boulevard Dr, Phoenix, 64700 | 13 179 975.00
25-2D | Jane D. Grant | 0181-898-9909 | 218 Clark Blvd., London, NW3 | 9 787 037.00
25-5A | Menzi F. Zulu | 0181-227-1245 | 124 River Dr., Durban, 4001 | 25 458 005.00
25-9T | Holly B. Naidu | 33-5-59200506 | 180 Boulevard Dr, Phoenix, 64700 | 16 887 181.00
27-4Q | Menzi F. Zulu | 0181-227-1245 | 124 River Dr., Durban, 4001 | 8 078 124.00
29-2D | Holly B. Naidu | 33-5-59200506 | 180 Boulevard Dr, Phoenix, 64700 | 20 014 885.00
31-7P | William K. Moor | 39-064885889 | Via Valgia Silvilla 23, Roma, 00179 | 44 516 677.00

3 If you wanted to produce a listing of the file contents by last name, area code, city, county or postal code, how would you alter the file structure?

4 What data redundancies do you detect, and how could those redundancies lead to anomalies?

FIGURE P1.2 The file structure for Problems 5–8

PROJ_NUM | PROJ_NAME | EMP_NUM | EMP_NAME | JOB_CODE | JOB_CHG_HOUR | PROJ_HOURS | EMP_PHONE
1 | Hurricane | 101 | John D. Dlamini | EE | 65.00 | 13.3 | 31-20-6226060
1 | Hurricane | 105 | David F. Schwann | CT | 40.00 | 16.2 | 0191-234-1123
1 | Hurricane | 110 | Anne R. Ramoras | CT | 40.00 | 14.3 | 34-934412463
2 | Coast | 101 | John D. Dlamini | EE | 65.00 | 19.8 | 31-20-6226060
2 | Coast | 108 | June H. Ndlovu | EE | 65.00 | 17.5 | 0161-554-7812
3 | Satellite | 110 | Anne R. Ramoras | CT | 42.00 | 11.6 | 34-934412463
3 | Satellite | 105 | David F. Schwann | CT | 6.00 | 23.4 | 0191-234-1123
3 | Satelite | 123 | Mary D. Chen | EE | 65.00 | 19.1 | 0181-233-5432
3 | Satellite | 112 | Allecia R. Smith | BE | 65.00 | 20.7 | 0181-678-6879

Further Reading allows you to explore the subject further, and acts as a starting point for projects and assignments.

Review Questions help reinforce and test your knowledge and understanding, and provide a basis for group discussions and activities.

Problems become progressively more complex as students draw on the lessons learnt from the completion of preceding problems.

To my son, Kona, of whom I am so proud: keep following your dreams.

To Craig, my best friend and patient husband. Thank you for supporting my crazy busy life; without
you nothing would be possible. In memory of my father, Frank Crockett, who inspired me to be the
person I am today. To my mother, Norma Crockett, who is the angel in my life. Thank you for always
being there for me.

To my mother- and father-in-law, Jackie and Bill Smith, who have provided me with much love and
support.

In memory of Leslie Crockett, a true gentleman and much-loved uncle.

To my family and friends, all of whom have painted rainbows in my life.

Much love and aloha to you all.

Keeley Crockett

Cengage's peer-reviewed content for higher and further education courses is accompanied by a range of digital teaching and learning support resources. The resources are carefully tailored to the specific needs of the instructor, student and the course.

Examples of the kind of resources provided include:

A password-protected area for instructors with, for example, a test bank, PowerPoint slides and an instructor's manual.

An open-access area for students including, for example, online appendices, useful weblinks and glossary terms.

Lecturers: to discover the dedicated teaching digital support resources accompanying this textbook please register here for access: cengage.com/dashboard/#login

Students: to discover the dedicated learning digital support resources accompanying this textbook, please search for Database Principles: Fundamentals of Design, Implementation, and Management on: cengage.com


Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it

1 The Database Approach
2 Data Models
3 Relational Model Characteristics
4 Relational Algebra and Calculus

THE RELATIONAL REVOLUTION

AN HISTORICAL JOURNEY
Until the late 1970s, databases stored large amounts of data in structures that were inflexible and
difficult to navigate. Programmers needed to know what clients wanted to do with the data before
the database was designed. Adding or changing the way the data were stored or analysed was
time-consuming and expensive.
In 1970, Edgar 'Ted' Codd, a mathematician employed by IBM, published a groundbreaking
article entitled 'A Relational Model of Data for Large Shared Data Banks'. At the time, nobody
realised that Codd's theories would spark a technological revolution on par with the development
of personal computers and the internet. Don Chamberlin, co-inventor of SQL, the most popular
database query language today, explains: 'There was this guy Ted Codd who had some kind of
strange mathematical notation, but nobody took it very seriously.'
Then Ted Codd organised a symposium, and Chamberlin listened as Codd reduced complicated
five-page programs to one line. 'And I said, "Wow,"' Chamberlin recalls. The symposium convinced
IBM to fund System R, a research project that built a prototype of a relational database, which
would eventually lead to the creation of SQL and DB2. IBM, however, kept System R on the back
burner for a number of years, which turned out to be a crucial decision, because the company
had a vested interest in IMS, a reliable, high-end database system that had been released in 1968.
At about the same time as System R started up, two professors from the University of California
at Berkeley, who had read Codd's work, established a similar project called Ingres. The competition
between the two tight-knit groups fuelled a series of papers. Unaware of the market potential of
this research, IBM allowed its staff to publish these papers. Among those reading the papers was
Larry Ellison, who had just founded a small company called Software Development Laboratories.
Recruiting programmers from System R and Ingres, and securing funding from the CIA and the
Navy, Ellison was able to market the first SQL-based relational database in 1979, well before IBM.
By 1983, the company (Software Development Laboratories) had released a portable version
of the database, had grossed over 13 910 000 annually, and had changed its name to Oracle.

Spurred on by competition, IBM finally released SQL/DS, its first relational database, in 1980.¹
In 2008, a group of leading database researchers met in Berkeley and issued a report declaring
that the industry had reached an exciting turning point and was on the verge of another database
revolution.²
In 2010, Oracle acquired MySQL as part of its acquisition of Sun. It has since maintained the
free open-source MySQL Community Edition while providing several versions (Standard Edition,
Enterprise Edition and Cluster Edition) for commercial customers. In 2019, the release of MySQL
Document Store brought together the SQL and the NoSQL languages, enabling developers to link
SQL relational tables to schema-less NoSQL databases.³ Oracle's latest offering is Oracle Database
19c, where the 'c' represents cloud; new versions now come out every year.
In our historical journey, we must also mention PostgreSQL, developed in 1986 as part of the
POSTGRES project at the University of California at Berkeley. PostgreSQL⁴ is a free, open source,
object-relational database that extends the traditional SQL language by allowing creation of new
datatypes and functions, and the ability to write code in different programming languages. It is a
strong competitor to MySQL, given that it has had over 33 years of active development.
Analysts, journalists and business leaders continually see new developments with data
acquisition and its management, such as the explosion of unstructured data, the growing
importance of business intelligence, and the emergence of cloud technologies, which may require
the development of new database models. Although traditional relational databases meet rigorous
standards for data integrity and consistency, they do not handle unstructured data at scale as well as new
database models such as NoSQL. NoSQL is also known as a non-relational database, which
allows the storage and retrieval of unstructured data using a dynamic schema. A key question
asked by database developers today is whether they need a NoSQL database or an SQL database
for their application. For example, Twitter and Facebook, which do not require high levels of data
consistency and integrity, have adopted NoSQL databases. In 2019, businesses are opting for
SQL and NoSQL multiple database combinations, which suggests that one size does not fit all.
As of March 2019, the most popular database management systems worldwide were Oracle,
MySQL, Microsoft SQL Server and PostgreSQL.⁵ So, what is the future? Disruptive database technologies
are required for business to remain competitive and the key is real-time data. Alternative database
models such as cloud database platforms, which have the capability for real-time data analytics,
are certain to feature. Big data has a role to play as additional data sources must be processed using
data pipelines, all in accordance with the General Data Protection Regulation (GDPR).
The relational model will survive, but it will also adapt at unprecedented speed.

1 'IBM and Oracle Trade Barbs over Databases', https://phys.org/news/2007-05-ibm-oracle-barbs-databases.html
2 Rakesh Agrawal et al., 'The Claremont Report on Database Research', http://db.cs.berkeley.edu/claremont/claremontreport08.pdf
3 MySQL Editions, www.mysql.com/products/
4 PostgreSQL, www.postgresql.org/about/
5 'Top 10 Databases for 2019', The Database Journal, www.databasejournal.com/features/oracle/slideshows/top-10-2019-databases.html

IN THIS CHAPTER, YOU WILL LEARN:

The difference between data and information

What a database is, what the different types of databases are, and why they are valuable assets for decision making

The importance of database design

How modern databases evolved from file systems

About flaws in file system data management

What the database system's main components are and how a database system differs from a file system

The main functions of a database management system (DBMS)

The role of open source database systems

The importance of data governance and data quality

Preview
Good decisions require good information, which is derived from raw facts known as
data. Data are likely to be managed most efficiently when they are stored in a database.
In this chapter, you learn what a database is, what it does and why it yields better
results than other data management methods. You will also learn about different types
of databases and why database design is so important.
Databases evolved from computer file systems. Although file system data
management is now largely outmoded, understanding the characteristics of file systems
is important because they are the source of serious data management limitations. In
this chapter, you will also learn how the database system approach helps eliminate
most of the shortcomings of file system data management.

1.1 DATA VS INFORMATION

To understand what drives database design, you need to understand the difference between data and
information. Data are raw facts. The word 'raw' indicates that the facts have not yet been processed
to reveal their meaning. For example, suppose that you want to know what the users of a computer
lab think of its services. Typically, you would begin by surveying users to assess the computer lab's
performance. Figure 1.1, Panel (a), shows the Web survey form that enables users to respond to
your questions. When the survey form has been completed, the form's raw data are saved to a data
repository, such as the one shown in Figure 1.1, Panel (b). Although you now have the facts in hand,
they are not particularly useful in this format: reading page after page of zeros and ones is not likely to
provide much insight. Therefore, you transform the raw data into a data summary like the one shown
in Figure 1.1, Panel (c). It is now possible to get quick answers to questions such as, 'What is the
composition of our lab's customer base?' In this case, you can quickly determine that most of your
customers are second-year undergraduates (38 per cent) and first-year undergraduates (32 per cent).
And, because graphics can enhance your ability to extract meaning from data quickly, you show the
data summary bar graph in Figure 1.1, Panel (d).

FIGURE 1.1 Transforming raw data into information

Panels: (a) initial survey screen; (b) raw data; (c) information in summary format; (d) information in graphic format

Information is the result of processing raw data to reveal its meaning. Data processing may be as
simple as organising data to reveal patterns or as complex as making forecasts or drawing inferences
using statistical modelling. Such information can then be used as the foundation for decision making.
For example, the data summary for each question on the survey form can point out the lab's strengths
and weaknesses, helping you to make informed decisions to better meet the needs of lab customers.
Raw data must be properly formatted for storage, processing and presentation. For example, the
student classification in Figure 1.1, Panel (c) is formatted to show the results based on the classifications
undergraduate years 1 to 3, postgraduates and a category 'other'. The respondents' yes/no responses
may need to be converted to a Y/N format for data storage. More complex formatting is required when
working with complex data types such as sounds, videos or images.
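The step from raw data to information described above can be sketched in a few lines of Python. The sample responses and category labels are invented for illustration (they are not the data behind Figure 1.1). The snippet converts yes/no answers to a Y/N storage format and then summarises the student classifications as percentages.

```python
from collections import Counter

# Hypothetical raw survey rows: (student classification, "satisfied?" answer).
raw_data = [
    ("Year 1", "yes"), ("Year 2", "no"), ("Year 2", "yes"),
    ("Year 3", "yes"), ("Postgraduate", "no"), ("Year 2", "yes"),
    ("Year 1", "yes"), ("Other", "no"),
]

# Formatting for storage: convert each yes/no response to Y/N.
stored = [(group, "Y" if answer == "yes" else "N") for group, answer in raw_data]

# Processing: count respondents per classification and express as percentages.
counts = Counter(group for group, _ in stored)
total = sum(counts.values())
summary = {group: round(100 * n / total, 1) for group, n in counts.items()}

# Presentation: largest group first, as a data summary would show it.
for group, pct in sorted(summary.items(), key=lambda item: -item[1]):
    print(f"{group:13} {pct:5.1f} per cent")
```

The list of tuples plays the role of the raw data repository, while the summary dictionary is the information: the same facts, processed so that the composition of the customer base can be read at a glance.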
In this information age, production of accurate, relevant and timely information is the key to good
decision making. In turn, good decision making is the key to business survival in a global market. We are
now said to be entering the 'knowledge age'.⁶ Data are the foundation of information, which is the
bedrock of knowledge, that is, the body of information and facts about a specific subject. Knowledge
implies familiarity, awareness and understanding of information as it applies to an environment. A key
characteristic of knowledge is that new knowledge can be derived from old knowledge.
Let's summarise some key points:

Data constitute the building blocks of information.

Information is produced by processing data.

Information is used to reveal the meaning of data.

Accurate, relevant and timely information is the key to good decision making.

Good decision making is the key to organisational survival in a global environment.

Timely and useful information requires accurate data. Such data must be generated properly, and they
must be stored in a format that is easy to access and process. And, like any basic resource, the data
environment must be managed carefully. Data management is a discipline that focuses on the proper
generation, storage and retrieval of data. Given the crucial role that data play, it should not surprise you
that data management is a core activity for any business, government agency, service organisation or
charity.

1.1.1 Data Quality and Data Governance

The quality of the data within the database is essential if the organisation is to make accurate short- and
long-term business decisions. Data must be fit for purpose and this often means that it can be used to
develop new strategies which aim to increase the income generation of an organisation. Data quality
can be examined at a number of different levels, including:

Accuracy: Is the data accurate and has it been obtained from a verifiable source?

Relevance: Is the data relevant to the organisation?

Completeness: Is the required data being stored?

Timeliness: Is the data updated frequently in order to meet the business requirements?

Uniqueness: Is the data unique and without redundancy?

Unambiguous: Is the meaning of the data clear?

6 Peter Drucker coined the phrase 'knowledge worker' in 1959 in his book Landmarks of Tomorrow. In 1994,
Ms Esther Dyson, Mr George Gilder, Dr George Keyworth and Dr Alvin Toffler introduced the concept of the
'knowledge age'.

The above list is not exhaustive. Most countries will have their own laws on the storage of data which an
organisation must adhere to. For example, the General Data Protection Regulation (GDPR), which governs
collecting and processing data, became a legal requirement for all organisations in Europe from 25 May
2018. One of the major changes detailed in Article 22 of the GDPR includes the rights of an individual not
to be subject to automated decision making, which includes profiling, unless explicit consent is given.
Individuals who are subject to such decision making have the right to ask for an explanation of how the
decision is reached and organisations must utilise appropriate mathematical and statistical procedures.
South Africa has the Protection of Personal Information Act (POPIA) which was signed into law in 2013.
POPIA promotes the protection of personal information by public and private bodies.
Data governance is the term used to describe a strategy or methodology defined by an organisation
to safeguard data quality. Each organisation produces its own data governance strategy that will involve
the development of a series of policies and procedures for managing availability, usability, quality,
integrity and security of data within the organisation. For example, the strategy defines who owns
the data within the organisation and who is authorised to create, update and delete records in the
database. Master Data Management (MDM) is a component of a data governance strategy that provides
the technological foundation for implementation of the strategy. MDM ensures that data is consistent
and accurate across all systems within an organisation and provides technology to allow the auditing,
reporting and compliance of data.
Creating a data governance strategy is a complex and time-consuming task and will involve many people
working at different levels within the organisation. Once the strategy has been developed and put into
operation, it will take the organisation several months to ensure that all data complies with the strategy.
Once in place, the policies and the procedures of the strategy should be regularly measured and monitored
to ensure that they are being followed. This will allow continual monitoring of the data governance strategy
to ensure that it is still relevant and up to date for the purpose of the organisation. Data profiling and data
quality tools are often used as part of the monitoring process to keep track of data over time.

1.2 INTRODUCING THE DATABASE AND THE DBMS

Efficient data management typically requires the use of a computer database. A database is a shared,
integrated computer structure that stores a collection of:

end-user data, or raw facts of interest to the end user

metadata, or data about data, through which the end-user data are integrated and managed.

The metadata provide a description of the data characteristics and the set of relationships that link
the data found within the database. In a sense, a database resembles a very well-organised electronic
filing cabinet in which powerful software, known as a database management system, helps manage
the cabinet's contents. A database management system (DBMS) is a collection of programs that
manages the database structure and controls access to the data stored in the database.
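To make the distinction between end-user data and metadata concrete, the sketch below uses Python's built-in sqlite3 module, with SQLite standing in for the DBMS (an assumption made for illustration; the text does not prescribe a product). The CUSTOMER table and its rows are hypothetical. The first query returns end-user data; the second asks the DBMS for the metadata it keeps about that table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE CUSTOMER ("
    " CUS_CODE INTEGER PRIMARY KEY,"
    " CUS_NAME TEXT NOT NULL,"
    " CUS_PHONE TEXT)"
)
conn.executemany(
    "INSERT INTO CUSTOMER VALUES (?, ?, ?)",
    [(10010, "Ramas", "0181-894-1238"), (10011, "Dunne", "0161-894-2285")],
)

# End-user data: the raw facts of interest to the end user.
end_user_data = conn.execute(
    "SELECT CUS_CODE, CUS_NAME, CUS_PHONE FROM CUSTOMER ORDER BY CUS_CODE"
).fetchall()

# Metadata: data about data (column names, types, constraints).
# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
metadata = [
    {
        "name": col[1],
        "type": col[2],
        "not_null": bool(col[3]),
        "primary_key": bool(col[5]),
    }
    for col in conn.execute("PRAGMA table_info(CUSTOMER)")
]

print(end_user_data)
for column in metadata:
    print(column)
```

The application never touches the storage files directly: both the facts and their description come back through the DBMS, which is the intermediary role the next section describes.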

1.2.1 Role and Advantages of the DBMS

Figure 1.2 illustrates that the DBMS serves as the intermediary between the user and the database.
The DBMS receives all application requests and translates them into the complex operations required
to fulfil those requests. The DBMS hides much of the database's internal complexity from the
application programs and users. The application program might be written by a programmer using a
programming language such as Python, Visual Basic, C++ or Java, or it might be created through a
DBMS utility program.

FIGURE 1.2 The DBMS manages the interaction between the end user and the database

[Diagram: end users submit application requests to the DBMS (database management system), which passes data to and from the database structure. The database structure holds the metadata and the end-user data (Customers, Invoices, Products).]

Having a DBMS between the end user's applications and the database offers some important
advantages. First, the DBMS enables the data in the database to be shared among multiple applications
or users. Second, the DBMS integrates the many different users' views of the data into a single all-encompassing
data repository.
Because data are the crucial raw material from which information is derived, you need a good way of
managing such data. As you will discover in this book, the DBMS helps make data management more
efficient and effective. In particular, a DBMS provides advantages such as:

Improved data sharing. The DBMS helps create an environment in which end users have better
access to more and better-managed data. Such access makes it possible for end users to
respond quickly to changes in their environment.

Better data integration. Wider access to well-managed data promotes an integrated view of the
organisation's operations and a clearer view of the big picture. It becomes much easier to see how
actions in one segment of the company affect other segments.

Minimised data inconsistency. Data inconsistency exists when different versions of the same
data appear in different places. For example, data inconsistency exists when a company's sales
department stores a sales representative's name as 'Thobile Cele' and the company's personnel
department stores that same person's name as 'Bathobile M. Cele', or when the company's
regional sales office shows the price of product X as R390.00 in South African currency and
its national sales office shows the same product's price as R350.00. The probability of data
inconsistency is greatly reduced in a properly designed database.

Improved data access. The DBMS makes it possible to produce quick answers to ad hoc queries.
From a database perspective, a query is a specific request for data manipulation (for example,
to read or update the data) issued to the DBMS. Simply put, a query is a question and an ad hoc
query is a spur-of-the-moment question. The DBMS sends back an answer (called the query result
set) to the application. For example, end users, when dealing with large amounts of sales data,
might want quick answers to questions (ad hoc queries) such as:

What was the volume of sales by product during the past six months?

What is the sales bonus figure for each of our salespeople during the past three months?

How many of our customers have credit balances of R5 000 (or 3 000) or more?

Improved decision making. Better-managed data and improved data access make it possible to
generate better-quality information, on which better decisions are based.

Increased end-user productivity. The availability of data, combined with the tools that transform
data into usable information, empowers end users to make quick, informed decisions that can be
the difference between success and failure in the global economy.

The advantages of using a DBMS are not limited to the few just listed. In fact, you will discover many
more advantages as you learn more about the technical details of databases and their proper design.
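To make the idea of an ad hoc query concrete, the following minimal Python sketch uses the standard sqlite3 module as a stand-in DBMS. The customer table, its column names and the three sample rows are assumptions invented for illustration; they are not taken from the book's databases.

```python
import sqlite3

# Hypothetical table and sample rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cus_name TEXT, cus_balance REAL)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Ramas", 7500.00), ("Dlamini", 1200.00), ("Naidoo", 5000.00)],
)


def count_customers_with_balance(connection, minimum):
    """Ad hoc query: how many customers have a balance of `minimum` or more?"""
    row = connection.execute(
        "SELECT COUNT(*) FROM customer WHERE cus_balance >= ?", (minimum,)
    ).fetchone()
    return row[0]


# The question is stated once; the DBMS works out how to answer it
# and sends back the query result set.
print(count_customers_with_balance(conn, 5000))
```

A different spur-of-the-moment question only needs a different query string or parameter; no new retrieval program has to be written.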

1.2.2 Types of Databases

A DBMS can support many different types of databases. Databases can be classified according to
the number of users supported, where the data are located, the type of data stored, the intended data
usage and the degree to which the data are structured.
The number of users determines whether the database is classified as single-user or multi-user.
A single-user database supports only one user at a time. In other words, if user A is using the database,
users B and C must wait until user A is done. A single-user database that runs on a personal computer
is called a desktop database. In contrast, a multi-user database supports multiple users at the same
time. When the multi-user database supports a relatively small number of users (usually fewer than 50)
or a specific department within an organisation, it is called a workgroup database. When the database
is used by the entire organisation and supports many users (more than 50, usually hundreds) across
many departments, the database is known as an enterprise database.
Location might also be used to classify the database. For example, a database that supports data
located at a single site is called a centralised database. A database that supports data distributed
across several different sites is called a distributed database. The extent to which a database can be
distributed, and the way in which such distribution is managed, is addressed in detail in Chapter 14,
Distributed Databases.
The most popular way of classifying databases today, however, is based on how they will be used
and on the time sensitivity of the information gathered from them. For example, transactions such as
product or service sales, payments and supply purchases reflect critical day-to-day operations. Such
transactions must be recorded accurately and immediately. A database that is designed primarily to
support a company's day-to-day operations is classified as an operational database, also referred
to as an online transaction processing (OLTP), transactional or production database.
Typically, analytical databases comprise two main components: a data warehouse and an online
analytical processing (OLAP) front end. The data warehouse is a specialised database that stores
data in a format optimised for decision support. The data warehouse contains historical data obtained
from the operational databases as well as data from other external sources. Online analytical processing
is a set of tools that work together to provide an advanced data analysis environment for retrieving,
processing and modelling data from the data warehouse. In recent times, this area of database
application has grown in importance and usage, to the point that it has evolved into its own discipline:
business intelligence. The term business intelligence describes a comprehensive approach to
capturing and processing business data with the purpose of generating information to support business
decision making. (See Chapter 15, Databases for Business Intelligence.)
Databases can also be classified to reflect the degree to which the data are structured. Unstructured
data are data that exist in their original (raw) state, that is, in the format in which they were collected.
Therefore, unstructured data exist in a format that does not lend itself to the processing that yields
information. Structured data are the result of formatting unstructured data to facilitate its storage and
use, and the generation of information. You apply structure (format) based on the type of processing
that you intend to perform on the data. Some data might not be ready (unstructured) for some types of
processing, but they might be ready (structured) for other types of processing. For example, the data
value 37890 might refer to a postal code, a sales value or a product code. If this value represents a
postal code or a product code and is stored as text, you cannot perform mathematical computations
with it. On the other hand, if this value represents a sales transaction, it must be formatted as numeric.
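The 37890 example can be sketched in Python. The function name add_amount is hypothetical; the point is only that the same characters behave differently depending on whether they are stored as text or as a number.

```python
def add_amount(value, amount):
    """Return value + amount, or None when value is stored as text."""
    try:
        return value + amount
    except TypeError:
        # Text such as a postal code cannot take part in arithmetic.
        return None


postal_code = "37890"   # stored as text: structured for lookup, not for maths
sales_value = 37890     # stored as a number: structured for computation

print(add_amount(sales_value, 110))   # arithmetic works on the numeric value
print(add_amount(postal_code, 110))   # the text value yields None
```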
To illustrate the concept of structure further, imagine a stack of printed paper invoices. If you
merely want to store these invoices as images for future retrieval and display, you can scan them and
save them in a graphic format. On the other hand, if you want to derive information such as monthly
totals and average sales, such graphic storage would not be useful. Instead, you could store the
invoice data in a (structured) spreadsheet format so that you can perform the requisite computations.
Actually, most data you encounter are best classified as semi-structured. Semi-structured data have
already been processed to some extent. For example, if you look at a typical Web page, the data are
presented in a prearranged format to convey some information. The database types mentioned thus
far focus on the storage and management of highly structured data. However, corporations are not
limited to the use of structured data. They also use semi-structured and unstructured data. Just think
of the valuable information that can be found in company emails, memos and documents such as
procedures, rules and Web pages. Unstructured and semi-structured data storage and management
needs are being addressed through a new generation of databases known as XML databases.
Extensible Markup Language (XML) is a special language used to represent and manipulate data
elements in a textual format. An XML database supports the storage and management of semi-structured
XML data. XML databases will be discussed in more detail in Chapter 16, Database
Connectivity and Web Technologies.
Analytical databases focus primarily on storing historical data and business metrics used exclusively
for tactical or strategic decision making. Such analysis typically requires extensive data massaging
(data manipulation) to produce information on which to base pricing decisions, sales forecasts, market
strategies and so on. Analytical databases allow the end user to perform advanced data analysis of
business data using sophisticated tools.
In contrast, a data warehouse focuses primarily on storing data used to generate information
required to make tactical or strategic decisions. Such decisions typically require extensive data
massaging (data manipulation) to extract information to formulate pricing decisions, sales forecasts,
market positioning, etc. Most decisions supported by data are based on historical data obtained from
operational databases. Additionally, the data warehouse can store data derived from many sources.
To make it easier to retrieve such data, the data warehouse structure is quite different from that of a
transactional database. The design, implementation and use of data warehouses are covered in detail
in Chapter 15, Databases for Business Intelligence.
Table 1.1 compares features of several well-known database management systems.

TABLE 1.1 Types of databases

Product | Number of users: single user | Number of users: multi-user (workgroup) | Number of users: multi-user (enterprise) | Data location: centralised | Data location: distributed | Data usage: operational | Data usage: analytical | XML

MS Access: X X X X X
MS SQL Server: X3 X X X X X X X
IBM DB2: X3 X X X X X X X
MySQL: X X X X X X X X
Oracle RDBMS: X3 X X X X X X X

All the database management systems shown in Table 1.1 (except MySQL) are provided by
commercial vendors and require a significant investment from a company in order to buy the actual
DBMS, its applications and ongoing support and maintenance. MySQL⁷ is an open source database
system which allows users to build and modify a database of their choice, distribute the database and
improve the actual MySQL DBMS product. The idea is that users can develop the database system
for any purpose, look at the source code and make any improvements, which will then be released
back to the general public.
The main benefit of open source software is that it is free to acquire and use the product itself. However,
there will be costs involved in the development and ongoing support of the software. The term LAMP is
used to define the most popular open source software, namely: Linux, Apache Web server, MySQL DBMS
and the Perl/PHP/Python development languages. Together this software stack provides the basic building
blocks for developing websites. Typically, open source database management system products such as
MySQL and PostgreSQL⁸ are easier to use than large-scale vendor DBMS products as they stick to the
basic fundamental database principles. This makes them ideal for smaller companies and organisations to
develop database-centred applications quickly. A disadvantage of open source software is that it does not
provide the robust functionality and durability required by large-scale commercial systems.
With the emergence of the World Wide Web and internet-based technologies as the basis for the new
social media generation, great amounts of data are being stored and analysed. Social media refers to
Web and mobile technologies that enable 'anywhere, anytime, always on' human interactions. Websites
such as Google, Facebook, Instagram, Twitter and LinkedIn capture vast amounts of data about end
users and consumers. These data grow exponentially and require the use of specialised database
systems. Over the past few years, this new breed of specialised database has grown in sophistication
and widespread usage. Currently, this new type of database is known as a NoSQL database. The
term NoSQL⁹ (Not only SQL) is generally used to describe a new generation of database management
systems that is not based on the traditional relational database model. You will learn more about NoSQL
in Chapter 16, Big Data and NoSQL.

7 mysql.com Available: www.mysql.com/

8 PostGres Available: www.postgresql.org/

9 NoSQL Available: http://nosql-database.org/

NOTE

Most of the database design, implementation and management issues addressed in this book are based on
production (transactional) databases. The focus on production databases is based on two considerations.
First, production databases are the databases most frequently encountered in common activities such as
enrolling in a class, registering a car, buying a product or making a bank deposit or withdrawal. Second, data
warehouse databases derive most of their data from production databases, and if production databases are
poorly designed, the data warehouse databases based on them will lose their reliability and value as well.

1.3 WHY DATABASE DESIGN IS IMPORTANT

Database design refers to the activities that focus on the design of the database structure that will be
used to store and manage end-user data. A good database, that is, a database that meets all user
requirements, does not just happen; its structure must be designed carefully. In fact, database design is
such a crucial aspect of working with databases that most of this book is dedicated to the development
of good database design techniques. Even a good DBMS will perform poorly with a badly designed
database.
Proper database design requires the database designer to identify precisely the database's
expected use. Designing a transactional database emphasises accurate and consistent data and
operational speed. The design of a data warehouse database recognises the use of historical and
aggregated data. Designing a database to be used in a centralised, single-user environment requires
a different approach from that used in the design of a distributed, multi-user database. This book
emphasises the design of transactional, centralised, single-user and multi-user databases. Chapters 14
and 15 also examine critical issues confronting the designer of distributed and data warehouse
databases.
A well-designed database facilitates data management and generates accurate and valuable
information. A poorly designed database is likely to become a breeding ground for difficult-to-trace
errors that may lead to bad decision making, and bad decision making can lead to the failure of an
organisation. Database design is simply too important to be left to luck. That's why university students
study database design, why organisations of all types and sizes send personnel to database design
seminars, and why database design consultants often make an excellent living.

1.4 HISTORICAL ROOTS: FILES AND DATA PROCESSING

Understanding what a database is, what it does and the proper way to use it can be clarified by
considering what a database is not. A brief explanation of the evolution of file system data processing
can be helpful in understanding the data access limitations that databases attempt to overcome.
Understanding these limitations is relevant to database designers and developers because database
technologies do not make these problems magically disappear; database technologies simply make
it easier to create solutions that avoid these problems. Creating database designs that avoid the
pitfalls of earlier systems requires that the designer understands these problems and how to avoid
them; otherwise, the database technologies are no better (and are potentially even worse!) than the
technologies and techniques they have replaced.

1.4.1 Manual File Systems

To be successful, an organisation must develop systems for handling core business tasks. Historically,
such systems were often manual, paper-and-pencil systems. The papers within these systems were
organised to facilitate the expected use of the data. Typically, this was accomplished through a
system of file folders and filing cabinets. As long as a collection of data was relatively small and an
organisation's business users had few reporting requirements, the manual system served its role well
as a data repository. However, as organisations grew and as reporting requirements became more
complex, keeping track of data in a manual file system became more difficult. Therefore, companies
looked to computer technology for help.

1.4.2 Computerised File Systems

Generating reports from manual file systems was slow and cumbersome. In fact, some business
managers faced government-imposed reporting requirements that led to weeks of intensive effort
each quarter, even when a well-designed manual system was used. Therefore, a data processing
(DP) specialist was hired to create a computer-based system that would track data and produce
required reports. Initially, the computer files within the file system were similar to the manual files. A
simple example of a customer data file for a small insurance company is shown in Figure 1.3. (You will
discover later that the file structure shown in Figure 1.3, although typically found in early file systems, is
unsatisfactory for a database.) The description of computer files requires a specialised vocabulary. Every
discipline develops its own terminology to enable its practitioners to communicate clearly. The basic file
vocabulary shown in Table 1.2 will help you to understand subsequent discussions more easily.

Online Content The databases used in the chapters are available on the online platform
accompanying this book. Throughout the book, Online Content boxes highlight material related
to chapter content located on the online platform. Please see the prelims for details on how to
access these useful resources.

TABLE 1.2 Basic file terminology

Term Definition

Data Raw facts, such as a telephone number, a birth date, a customer name and a year-to-date (YTD)
sales value. Data have little meaning unless they have been organised in some logical manner. The
smallest piece of data that can be recognised by the computer is a single character, such as the
letter A, the number 5 or a symbol such as /. A single character requires 1 byte of computer storage.

Field A character or group of characters (alphabetic or numeric) that has a specific meaning. A field is used
to define and store data.

Record A logically connected set of one or more fields that describes a person, place or thing. For example,
the fields that constitute a record for a customer named J. D. Rudd might consist of J. D. Rudd's
name, address, phone number, date of birth, credit limit and unpaid balance.

File A collection of related records. For example, a file might contain data about vendors of ROBCOR
Company, or a file might contain the records for the students currently enrolled at Gigantic University.

FIGURE 1.3 Contents of the CUSTOMER file

C_NAME | C_PHONE | C_ADDRESS | C_POSTCODE | A_NAME | A_PHONE | TP | AMT | REN
Alfred A. Ramas | 32-3-8891367 | Stationsplein 2, Sea Point, Cape Town | 2880 | Leah F. Hahn | 27-21-410-7100 | T1 | 100.00 | 05-Apr-2018
Mpu K. Dlamini | 0181-894-1238 | Box 12A Rd, Highgate, Johannesburg | N6 4WE | Alex B. Alby | 0161-228-1249 | T1 | 250.00 | 16-Jun-2018
Loli W. Ndlovu | 32-3-8890340 | Rijksweg 58, Pretoria | 2880 | Nkita F. Brown | 27-12-410-7100 | S2 | 150.00 | 29-Jan-2018
Paul F. Olowski | 31-20-6226060 | Martin Rd, Westville, Durban | 1018 | Nkita F. Brown | 27-21-410-7100 | S1 | 300.00 | 14-Oct-2018
Fatima Naidoo | 0161-222-1672 | Box 111 Dr., Chatsworth, Durban | M15 REE | Alex B. Alby | 0181-228-1249 | T1 | 100.00 | 28-Dec-2018
Amy B. O'Brian | 0181-442-3381 | 387 Troll Dr., Highgate, East London | N6 LOP | Menzi T. Ndlovu | 0181-123-5589 | T2 | 850.00 | 22-Sep-2018
James G. Khumalo | 33-5-59200506 | 19 East Block Street, Mitchells Plain | 647000 | Nkita F. Brown | 27-21-410-7100 | S1 | 120.00 | 25-Mar-2018
Saajidah Mahraj | 39-064885889 | 3 Baobab Street, Queenswood, Pretoria | 00179 | Menzi T. Ndlovu | 0181-123-5589 | S1 | 250.00 | 17-Jul-2018
Anne G. Farriss | 0181-382-7185 | 2119 Elm St., Parkview, Johannesburg | NW3 RTA | Alex B. Alby | 0161-228-1249 | T2 | 100.00 | 03-Dec-2018
Olette K. Snyman | 34-934412463 | 35 Libertas Avenue, Stellenbosch | 08001 | Menzi T. Ndlovu | 0181-123-5589 | S2 | 500.00 | 14-Mar-2018

C_NAME = Customer name
C_PHONE = Customer phone
C_ADDRESS = Customer address
C_POSTCODE = Customer postcode
A_NAME = Agent name
A_PHONE = Agent phone
TP = Insurance type
AMT = Insurance policy amount, in thousands of euro
REN = Insurance renewal date

Using the proper file terminology given in Table 1.2, you can identify the file components shown in
Figure 1.3. The CUSTOMER file shown in Figure 1.3 contains ten records. Each record is composed
of nine fields: C_NAME, C_PHONE, C_ADDRESS, C_POSTCODE, A_NAME, A_PHONE, TP, AMT and
REN. The ten records are stored in a named file. Because the file in Figure 1.3 contains customer data,
its filename is CUSTOMER.
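The file, record and field vocabulary can be illustrated with a short Python sketch. It assumes, purely for illustration, that the CUSTOMER file is stored as semicolon-separated text; the two sample records are the first two rows of Figure 1.3, and the function name read_records is hypothetical.

```python
import csv
import io

# The nine fields of a CUSTOMER record, as named in Figure 1.3.
FIELD_NAMES = [
    "C_NAME", "C_PHONE", "C_ADDRESS", "C_POSTCODE",
    "A_NAME", "A_PHONE", "TP", "AMT", "REN",
]

# Assumed storage format: one record per line, fields separated by semicolons.
CUSTOMER_FILE = """\
Alfred A. Ramas;32-3-8891367;Stationsplein 2, Sea Point, Cape Town;2880;Leah F. Hahn;27-21-410-7100;T1;100.00;05-Apr-2018
Mpu K. Dlamini;0181-894-1238;Box 12A Rd, Highgate, Johannesburg;N6 4WE;Alex B. Alby;0161-228-1249;T1;250.00;16-Jun-2018
"""


def read_records(text):
    """Return the file contents as a list of records (one dict per record)."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=FIELD_NAMES, delimiter=";")
    return list(reader)


records = read_records(CUSTOMER_FILE)
for record in records:
    # Each record is a set of named fields describing one customer.
    print(record["C_NAME"], "renews on", record["REN"])
```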

When business users wanted data from the computerised file, they sent requests for the data to
the DP specialist. For each request, the DP specialist had to create programs to retrieve the data
from the file, manipulate it in whatever manner the user had requested and present it as a printed
report. If a request was for a report that had been run previously, the DP specialist could rerun
the existing program and provide the printed results to the user. As other business users saw the
new and innovative ways in which customer data were being reported, they wanted to be able to
view their data in similar fashions. This generated more requests for the DP specialist to create
more computerised files of other business data, which in turn meant that more data management
programs had to be created, and more requests for reports. For example, the sales department at
the insurance company created a file named SALES, which helped track daily sales efforts. The sales
department's success was so obvious that the personnel department manager demanded access to
the DP specialist to automate payroll processing and other personnel functions. Consequently, the
DP specialist was asked to create the AGENT file shown in Figure 1.4. The data in the AGENT file
were used to do electronic fund transfers (EFTs), keep track of taxes paid and summarise insurance
coverage, among other tasks.

FIGURE 1.4 Contents of the AGENT file

A_NAME | A_PHONE | A_ADDRESS | POSTCODE | HIRED | YTD_PAY | YTD_IT | YTD_NI | YTD_SLS | DEP
Alex B. Alby | 0161-228-1249 | Deken Van Erpstraat 20, Best | 5492 | 01-Nov-2001 | 20 806.00 | 5 201.00 | 1 664.00 | 103 963.00 | 3
Nkita F. Brown | 27-21-410-7100 | West Quay Road, Waterfront, Cape Town | 8002 | 23-May-2004 | 25 230.00 | 6 308.00 | 2 018.00 | 108 844.00 | 0
Menzi T. Ndlovu | 0181-123-5589 | 452 Elm St., Parkview, Johannesburg | 2193 | 15-Jun-2003 | 18 169.00 | 4 542.00 | 1 453.00 | 99 548.00 | 2

A_NAME = Agent name
A_PHONE = Agent phone
A_ADDRESS = Agent address
POSTCODE = Agent postcode
HIRED = Agent date of hire
YTD_PAY = Year-to-date pay
YTD_IT = Year-to-date income tax paid
YTD_NI = Year-to-date national insurance paid
YTD_SLS = Year-to-date sales
DEP = Number of dependents

As the number of files increased, a small file system, like the one shown in Figure 1.5, evolved. Each file
in the system used its own application programs to store, retrieve and modify data. And each file was
owned by the individual or the department that commissioned its creation.
As the file system grew, the demand for the DP specialist's programming skills grew even faster, and
the DP specialist was authorised to hire additional programmers. The size of the file system also required
a larger, more complex computer. The new computer and the additional programming staff caused the
DP specialist to spend less time programming and more time managing technical and human resources.
Therefore, the DP specialist's job evolved into that of a data processing (DP) manager, who supervised
a DP department. In spite of these organisational changes, however, the DP department's primary
activity remained programming, and the DP manager inevitably spent much time as a supervising senior
programmer and program troubleshooter.

FIGURE 1.5 A simple file system

[Figure: the sales department uses file management programs and a file report program with the CUSTOMER and SALES files; the personnel department uses its own file management programs and file report program with the AGENT file.]

1.5 PROBLEMS WITH FILE SYSTEM DATA MANAGEMENT

The file system method of organising and managing data was a definite improvement on a manual
system and served a useful purpose in data management for over two decades, a very long timespan
in the computer era. Nonetheless, many problems and limitations became evident in this approach. A
critique of the file system method serves two major purposes:

Understanding the shortcomings of the file system enables you to understand the development of
modern databases.

Many of the problems are not unique to file systems. Failure to understand such problems is likely
to lead to their duplication in a database environment, even though database technology makes it
easy to avoid them.

The following problems severely challenge the types of information that can be created from the data
as well as the accuracy of the information:

Lengthy development times. The first and most glaring problem with the file system approach
is that even the simplest data-retrieval task requires extensive programming. With the older file
systems, programmers had to specify what must be done and how to do it. As you will learn in
upcoming chapters, modern databases use a non-procedural data manipulation language that
allows the user to specify what must be done without specifying how.

Difficulty in getting quick answers. The need to write programs to produce even the simplest
reports makes ad hoc queries impossible. DP specialists who work with mature file systems often
receive numerous requests for new reports. They are often forced to say that the report will be
ready next week or even next month. If you need the information now, getting it next week or
next month will not serve your information needs.

Complex system administration. System administration becomes more difficult as the number of files in
the system expands. Even a simple file system with a few files requires creating and maintaining several
file management programs. Each file must have its own file management programs that allow the user
to add, modify and delete records; to list the file contents; and to generate reports. Because ad hoc
queries are not possible, the file reporting programs can multiply quickly. The problem is compounded
by the fact that each department in the organisation owns its data by creating its own files.

Lack of security and limited data sharing. Another fault of a file system data repository is a lack of
security and limited data sharing. Data sharing and security are closely related. Sharing data among
multiple geographically dispersed users introduces a lot of security risks. In terms of creating data
management and reporting programs, security and data-sharing features are difficult to program
and consequently are often omitted from a file system environment. Such features include effective
password protection, the ability to lock out parts of files or parts of the system itself, and other
measures designed to safeguard data confidentiality. Even when an attempt is made to improve
system and data security, the security devices tend to be limited in scope and effectiveness.

Extensive programming. Making changes to an existing file structure can be difficult in a file
system environment. For example, changing just one field in the original CUSTOMER file would
require a program that:

1 Reads a record from the original file.

2 Transforms the original data to conform to the new structure's storage requirements.
3 Writes the transformed data into the new file structure.
4 Repeats the preceding steps for each record in the original file.

In fact, any change to a file structure, no matter how minor, forces modifications in all of the programs
that use the data in that file. Modifications are likely to produce errors (bugs), and additional time is
spent using a debugging process to find those errors. Those limitations, in turn, lead to problems of
structural and data dependence.
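The four-step conversion program described above can be sketched in Python. The sketch assumes a simplified two-field, semicolon-separated record (customer name and AMT) and a hypothetical change in which AMT stops being stored in thousands and is stored in whole currency units instead; the function name convert_file is also an assumption.

```python
import csv
import io


def convert_file(old_file, new_file):
    """Copy every record from old_file to new_file in the new structure."""
    reader = csv.reader(old_file, delimiter=";")
    writer = csv.writer(new_file, delimiter=";", lineterminator="\n")
    converted = 0
    for name, amt in reader:                      # steps 1 and 4: read each record in turn
        new_amt = str(int(float(amt) * 1000))     # step 2: transform to the new storage format
        writer.writerow([name, new_amt])          # step 3: write into the new file structure
        converted += 1
    return converted


old = io.StringIO("Alfred A. Ramas;100.00\nMpu K. Dlamini;250.00\n")
new = io.StringIO()
count = convert_file(old, new)
print(count, "records converted")
print(new.getvalue())
```

Even in this small sketch, every other program that reads AMT would still expect the old format, which is why a minor structural change ripples through the whole file system.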

1.5.1 Structural and Data Dependence

A file system exhibits structural dependence; that is, access to a file is dependent on its structure. For
example, adding a customer date-of-birth field to the CUSTOMER file shown in Figure 1.3 would require
the four steps described in the previous section. Given this change, none of the previous programs
will work with the new CUSTOMER file structure. Therefore, all of the file system programs must be
modified to conform to the new file structure. In short, because the file system application programs
are affected by change in the file structure, they exhibit structural dependence. Conversely, structural
independence exists when it is possible to make changes in the file structure without affecting the
application program's ability to access the data.
Even changes in file data characteristics, such as changing a field from integer to decimal, require
changes in all programs that access the file. Because all data access programs are subject to change when
any of the file's data storage characteristics change (that is, changing the data type), the file system is said
to exhibit data dependence. Conversely, data independence exists when it is possible to make changes
in the data storage characteristics without affecting the application program's ability to access the data.
The practical significance of data dependence is the difference between the logical data format
(how the human being views the data) and the physical data format (how the computer sees the
data). Any program that accesses a file system's file must tell the computer not only what to do, but also
how to do it. Consequently, each program must contain lines that specify the opening of a specific file
type, its record specification and its field definitions. Data dependence makes the file system extremely
cumbersome from a programming and data management point of view.
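Data dependence can be demonstrated with a small Python sketch that uses the standard struct module. The physical layout is an assumption made for illustration: a 20-byte name followed by an amount, stored first as a 4-byte integer and later changed to an 8-byte decimal (double). All names here are hypothetical.

```python
import struct

# Assumed original physical format: 20-byte name + 4-byte integer amount.
RECORD_V1 = struct.Struct("<20si")
# Assumed format after the amount field changes from integer to decimal.
RECORD_V2 = struct.Struct("<20sd")


def pack_v1(name, amount):
    """Write one record in the original physical format."""
    return RECORD_V1.pack(name.encode("ascii"), amount)


def read_v1(raw):
    """The 'old program': it hard-codes the original record layout."""
    name, amount = RECORD_V1.unpack(raw)
    return name.rstrip(b"\x00").decode("ascii"), amount


def old_program_can_read(raw):
    """Report whether the old program can still read the given record."""
    try:
        read_v1(raw)
        return True
    except struct.error:
        return False


print(old_program_can_read(pack_v1("Ramas", 100)))               # original format
print(old_program_can_read(RECORD_V2.pack(b"Ramas", 100.5)))     # changed format
```

Because the layout is written into the program itself, changing one field's data type breaks the program until it is modified, which is the behaviour the text calls data dependence.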

1.5.2 Field Definitions and Naming Conventions

At first glance, the CUSTOMER file shown in Figure 1.3 appears to have served its purpose well:
requested reports could usually be generated. But suppose you want to create a customer phone
directory based on the data stored in the CUSTOMER file. Storing the customer name as a single field
turns out to be a liability because the directory must break up the field contents to list the last names,
first names and initials in alphabetical order. Or suppose you want to get a customer listing by area
code. Including the area code in the phone number field is inefficient.
Similarly, producing a listing of customers by city is a more difficult task than is necessary. From
the user's point of view, a much better (more flexible) record definition would be one that anticipates
reporting requirements by breaking up fields into their component parts. Thus, the CUSTOMER file's
fields might be listed as shown in Table 1.3.

TABLE 1.3 Sample customer file fields

Field Contents Sample entry

CUS_LNAME Customer last name Ramas

CUS_FNAME Customer first name Alfred

CUS_INITIAL Customer initial A

CUS_AREACODE Customer area code 1615

CUS_PHONE Customer phone 0161-234-5678

CUS_ADDRESS Customer street address or box number 123 Green Meadow Lane

CUS_CITY Customer city East London

CUS_COUNTY Customer county/district Eastern Cape

CUS_POSTCODE Customer postcode 3001
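The cost of storing the customer name as a single field shows up as soon as a program has to take it apart. The Python sketch below is a rough illustration only: the function name split_customer_name is hypothetical, and the splitting rule (first word, optional middle initial, last word) is an assumption that would fail for names with more than three parts.

```python
def split_customer_name(c_name):
    """Break a single C_NAME value into the component fields of Table 1.3."""
    parts = c_name.split()
    initial = parts[1].rstrip(".") if len(parts) == 3 else ""
    return {
        "CUS_LNAME": parts[-1],
        "CUS_FNAME": parts[0],
        "CUS_INITIAL": initial,
    }


names = ["Alfred A. Ramas", "Mpu K. Dlamini", "Fatima Naidoo"]
customers = [split_customer_name(name) for name in names]

# With separate fields, an alphabetical directory is a simple sort.
directory = sorted(customers, key=lambda c: (c["CUS_LNAME"], c["CUS_FNAME"]))
for entry in directory:
    print(entry["CUS_LNAME"], entry["CUS_FNAME"], entry["CUS_INITIAL"])
```

If the fields had been stored separately in the first place, the fragile splitting step would not be needed at all.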

Selecting proper field names is also important. For example, make sure that the field names are
reasonably descriptive. In examining the file structure shown in Figure 1.3, it is not obvious that the
field name REN represents the customer's insurance renewal date. Using the field name CUS_RENEW_DATE
would be better for two reasons. First, the prefix CUS can be used as an indicator of the field's
origin, which is the CUSTOMER file. Therefore, you know that the field in question yields a CUSTOMER
property. Second, the RENEW_DATE portion of the field name is more descriptive of the field's contents.
With proper naming conventions, the file structure becomes self-documenting. That is, by simply looking
at the field names, you can determine which files the fields belong to and what information the fields
are likely to contain.
Some software packages place restrictions on the length of field names, so it is wise to be as
descriptive as possible within those restrictions. In addition, very long field names make it difficult to
fit more than a few fields on a page, thus making output spacing a problem. For example, the field
name CUSTOMER_INSURANCE_RENEWAL_DATE, while being self-documenting, is less desirable
than CUS_RENEW_DATE.
Another problem in Figure 1.3's CUSTOMER file is the difficulty of finding desired data efficiently.
The CUSTOMER file currently does not have a unique record identifier. For example, it is possible to
have several customers named James G. Khumalo. Consequently, the addition of a CUS_ACCOUNT
field that contains a unique customer account number would be appropriate.
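
A short Python sketch shows why a name cannot serve as a record identifier. The account numbers and the two lookup helpers are hypothetical; only the field name CUS_ACCOUNT and the customer name come from the text.

```python
# Hypothetical customer records; the account numbers are made up.
customers = [
    {"CUS_ACCOUNT": 10010, "CUS_FNAME": "James", "CUS_INITIAL": "G", "CUS_LNAME": "Khumalo"},
    {"CUS_ACCOUNT": 10011, "CUS_FNAME": "James", "CUS_INITIAL": "G", "CUS_LNAME": "Khumalo"},
]


def find_by_name(records, fname, initial, lname):
    """Return every record that matches the full name (may be several)."""
    wanted = (fname, initial, lname)
    return [
        r for r in records
        if (r["CUS_FNAME"], r["CUS_INITIAL"], r["CUS_LNAME"]) == wanted
    ]


def find_by_account(records, account):
    """Return the single record with this account number, or None."""
    matches = [r for r in records if r["CUS_ACCOUNT"] == account]
    return matches[0] if matches else None


# The name is ambiguous: two different customers match.
name_matches = find_by_name(customers, "James", "G", "Khumalo")
print(len(name_matches))

# The account number identifies exactly one customer.
account_match = find_by_account(customers, 10011)
print(account_match["CUS_ACCOUNT"])
```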

The criticisms of field definitions and naming conventions shown in the file structure of Figure 1.3
are not unique to file systems. Because such conventions will prove to be important later, they are
introduced early. You will revisit field definitions and naming conventions when you learn about database
design in Chapter 5, Data Modelling with Entity Relationship Diagrams, and in Chapter 6, Data Modelling
Advanced Concepts; and when you learn about database implementation issues in Chapter 11,
Conceptual, Logical, and Physical Database Design. Regardless of the data environment, the design,
whether it involves a file system or a database, must always reflect the designer's documentation
needs and the end user's reporting and processing requirements. Both types of needs are best served
by adhering to proper field definitions and naming conventions.

Online Content

Appendices A to R are available on the online platform accompanying this book.

NOTE

No naming convention can fit all requirements for all systems. Some words or phrases are reserved for
the DBMS's internal use. For example, the name ORDER generates an error in some DBMSs. Similarly,
your DBMS might interpret a hyphen (-) as a command to subtract. Therefore, the field CUS-NAME would
be interpreted as a command to subtract the NAME field from the CUS field. Because neither field exists,
you would get an error message. On the other hand, CUS_NAME would work fine because it uses an
underscore.
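
The behaviour described in this note can be reproduced with Python's built-in sqlite3 module. This is only a sketch using SQLite as one example DBMS; the table names and the try_sql helper are assumptions, and other DBMSs may report different errors or accept names that SQLite rejects.

```python
import sqlite3

# An in-memory SQLite database with one hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (CUS_NAME TEXT)")
conn.execute("INSERT INTO customer VALUES ('Ramas')")


def try_sql(statement):
    """Run a statement and return 'ok' or the DBMS error message."""
    try:
        conn.execute(statement)
        return "ok"
    except sqlite3.OperationalError as exc:
        return str(exc)


# The underscore is part of the name, so the field is found.
underscore_result = try_sql("SELECT CUS_NAME FROM customer")

# The hyphen is read as subtraction: CUS minus NAME. Neither field exists.
hyphen_result = try_sql("SELECT CUS-NAME FROM customer")

# ORDER is a reserved word in SQLite, so it cannot be used as a bare field name.
reserved_result = try_sql("CREATE TABLE bad (ORDER TEXT)")

print(underscore_result)
print(hyphen_result)
print(reserved_result)
```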

1.5.3 Data Redundancy

The file system's structure and lack of security make it difficult to combine data from multiple sources.
The organisational structure promotes the storage of the same basic data in different locations.

Database professionals use the term islands of information for such scattered data locations. As
it is unlikely that data stored in different locations will always be updated consistently, the islands of
information often contain different versions of the same data. For example, in Figures 1.3 and 1.4, the
agent names and phone numbers occur in both the CUSTOMER and the AGENT files. You need only
one correct copy of the agent names and phone numbers. Having them occur in more than one place
produces data redundancy. Data redundancy exists when the same data are stored unnecessarily at
different places.
Uncontrolled data redundancy sets the stage for:

Data inconsistency. Data inconsistency exists when different and conflicting versions of the same
data appear in different places. For example, suppose you change an agent's phone number or
address in the AGENT file. If you forget to make corresponding changes in the CUSTOMER file,
the files contain different data for the same agent. Reports will yield inconsistent results depending
on which version of the data is used.

Poor data security. Having multiple copies of data increases the chances of a copy of the data
being susceptible to unauthorised access.

NOTE

Data that display data inconsistency are also referred to as data that lack data integrity. Data integrity is
defined as the condition in which all of the data in the database are consistent with the real-world events
and conditions. In other words,

Data are accurate; there are no data inconsistencies.

Data are verifiable; the data will always yield consistent results.

Data entry errors are more likely to occur when complex entries (such as 12-digit phone numbers)
are made in several different files and/or recur frequently in one or more files. In fact, the CUSTOMER
file shown in Figure 1.3 contains just such an entry error: the third record in the CUSTOMER file
has transposed digits in the agent's phone number (27-12-410-7100 rather than 27-21-410-1700).
It is possible to enter a non-existent sales agent's name and phone number into the CUSTOMER
file, but customers are not likely to be impressed if the insurance agency supplies the name and
phone number of an agent who does not exist. And should the personnel manager allow a non-existent
agent to accrue bonuses and benefits? In fact, a data entry error such as an incorrectly
spelled name or an incorrect phone number yields the same kind of data integrity problems.

Data anomalies. The dictionary defines anomaly as an abnormality. Ideally, a field value change
should be made in only a single place. Data redundancy, however, fosters an abnormal condition
by forcing field value changes in many different locations. Look at the CUSTOMER file in Figure 1.3.
If agent Nikita F. Brown decides to get married and move, the agent name, address and phone are
likely to change. Instead of making just a single name and/or phone/address change in a single
file (AGENT), you also must make the change each time that agent's name, phone number and
address occur in the CUSTOMER file. You could be faced with the prospect of making hundreds of
corrections, one for each of the customers served by that agent! The same problem occurs when
an agent decides to quit. Each customer served by that agent must be assigned a new agent.
Any change in any field value must be correctly made in many places to maintain data integrity.
A data anomaly develops when all of the required changes in the redundant data are not made
successfully. The data anomalies found in Figure 1.3 are commonly defined as follows:

- Update anomalies. If agent Nikita F. Brown has a new phone number, that number must be
entered in each of the CUSTOMER file records in which Ms Brown's phone number is shown. In
this case, only three changes must be made. In a large file system, such changes might occur in
hundreds or even thousands of records. Clearly, the potential for data inconsistencies is great.

- Insertion anomalies. For example, if only the CUSTOMER file existed, to add a new agent, you
would also add a dummy customer data entry to reflect the new agent's addition. Again, the
potential for creating data inconsistencies would be great.

- Deletion anomalies. If you delete Amy B. O'Brian, Saajidah Maharaj and Olette K. Snyman, then
you will also delete Menzi T. Ndlovu's agent data. Clearly, this is not desirable.
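
The update and deletion anomalies can be sketched in a few lines of Python. Every name, phone number and field name here is a made-up placeholder rather than data from Figures 1.3 and 1.4, and the two structures are deliberately simplified stand-ins for the AGENT and CUSTOMER files.

```python
# Hypothetical, simplified copies of the AGENT and CUSTOMER files.
# All names, phone numbers and field names are made up for illustration.
agent_file = {
    "Agent A": "000-111-1111",
    "Agent B": "000-222-2222",
}
customer_file = [
    {"CUS_NAME": "Customer 1", "AGENT_NAME": "Agent A", "AGENT_PHONE": "000-111-1111"},
    {"CUS_NAME": "Customer 2", "AGENT_NAME": "Agent A", "AGENT_PHONE": "000-111-1111"},
    {"CUS_NAME": "Customer 3", "AGENT_NAME": "Agent A", "AGENT_PHONE": "000-111-1111"},
    {"CUS_NAME": "Customer 4", "AGENT_NAME": "Agent B", "AGENT_PHONE": "000-222-2222"},
]


def inconsistent_agents(agents, customers):
    """Return agents whose phone in CUSTOMER disagrees with the AGENT file."""
    return sorted(
        {
            c["AGENT_NAME"]
            for c in customers
            if agents.get(c["AGENT_NAME"]) != c["AGENT_PHONE"]
        }
    )


# Update anomaly: the phone number is changed in AGENT but not in CUSTOMER.
agent_file["Agent A"] = "000-111-9999"
stale_after_update = inconsistent_agents(agent_file, customer_file)

# Restoring consistency means touching every redundant copy.
copies_changed = 0
for record in customer_file:
    if record["AGENT_NAME"] == "Agent A":
        record["AGENT_PHONE"] = agent_file["Agent A"]
        copies_changed += 1
stale_after_fix = inconsistent_agents(agent_file, customer_file)

# Deletion anomaly: if CUSTOMER were the only file, deleting Agent B's last
# customer would also delete the only record of Agent B.
remaining = [c for c in customer_file if c["CUS_NAME"] != "Customer 4"]
agents_still_known = {c["AGENT_NAME"] for c in remaining}

print(stale_after_update, copies_changed, stale_after_fix, agents_still_known)
```

With only four customers the repair is trivial, but the number of copies to change grows with the number of customers each agent serves.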

1.6 DATABASE SYSTEMS

The problems inherent in file systems make using a database system very desirable. Traditional
file systems often made reference to several files such as the customer master file, the product
master file and the transaction file, which were stored separately. However, unlike the file system,

prepared so as to display not only the shape of the
crowns, but also the number and character of the
roots by which they are implanted.
Bay II. Classification of Mammals.
In bay No. II the two wall-cases contain a
collection arranged to show in a serial manner the
orders and sub-orders of existing Mammals, by
examples selected to illustrate the predominating
characters by which these are distinguished. A brief
popular account of the characteristics of the group,
and a map showing its geographical distribution,
are placed with each. This is intended to serve not
only for an introduction to the study of the class by
visitors to the museum, but also as a guide to a
method of arrangement which may be adopted in
smaller institutions.
Among the illustrations of the order Primates is
placed the skeleton of a young Chimpanzee
dissected by Dr. Tyson, which formed the subject of
his work on the “Anatomy of a Pigmie,” published in
1699, the earliest scientific description of any Man-
like Ape.
Skin of Mammals.
The central case of this bay contains
illustrations of the outer covering or skin and its
modifications in the class of Mammals, divided into
the following sections:
1. Expansion of skin to aid in locomotion, as the
webs between the fingers of swimming and flying
animals, the parachutes of flying animals.
2. The development of bony plates in the skin,
found among Mammals only in the Armadillos and
their allies. The cast of a section of the tail of a
gigantic extinct species (Glyptodon) shows a bony
external as well as an internal skeleton.
3. The outer covering modified into true scales,
much resembling in structure the nails of the
human hand. This occurs in only one family of
Mammals, the Pangolins, or Manidæ.
4. Hair in various forms, including bristles and
spines. The two kinds of hair composing the
external clothing of most Mammals, the long, stiffer
outer hair, and the short, soft under-fur, are shown
by various examples.
5. The special epidermal appendages found in
nearly all Mammals on the ends of the fingers and
toes, called according to the various forms they
assume, nails, claws, or hoofs.
6. The one or two unpaired horns of the
Rhinoceroses, shown by sections to consist of a
solid mass of hair-like epidermic fibres.
7. The horns of Oxen, Goats, and Antelopes,
each consisting of a hollow conical sheath of horn,
covering a permanent projection of the frontal bone
(the horn-core).
8. The antlers of Deer, forming solid, bony, and
generally branched projections, covered during
growth with soft hairy skin, and in most cases shed
and renewed annually.
On the wall is arranged a series of antlers of an
individual Stag or Red Deer (Cervus elaphus),
grown and shed (except the last) in thirteen
successive years, showing the changes which took
place in their size and form, and the development
of the branches, or tines, in each year. In old age
the number of these tines tends to diminish.
On the north side of the table-case are shown
dissections of the principal internal organs of
Mammals.
Bay III. General structure of Birds.
Bay No. III is devoted to the class of Birds. An
Albatross (Diomedea exulans) mounted with the
wings expanded shows the most important
characters by which a Bird is externally
distinguished from other animals. The body is
clothed with feathers, which (in the majority of
Birds), by their great size and special arrangement
upon the fore-limbs, enable these to act as organs
of flight. The mouth is in the form of a horny beak.
A nestling Albatross shows that at this stage of its
existence the bird is not clothed with ordinary
feathers, but with soft down, which serves to keep
the body warm, although it confers no power of
flight. An Emu and an Apteryx in the lower
compartment of the case display the exceptional
condition (found only in a comparatively few
members of the class) of Birds with wings so small
as to be concealed beneath the general feathery
covering of the body, and quite useless. In the
Penguins, of which two species are shown in the
case, the wings are reduced to the condition of fins,
and are serviceable only for progress through water.
In the first wall-case the principal features of
the skeleton of the class are shown. Sections of
bones exhibit the large air-cavities within; a
complete skeleton of an Eagle, with the bones
separated and named, and mounted skeletons of
the Ostrich, Penguin, Pelican, Vulture, Night-Parrot,
Fowl, etc., show the chief modifications of the
skeleton. The Apteryx possesses the smallest, and
the Frigate-bird the longest bones of the wing, the
correspondence of which can be readily traced by
means of the labels attached to them. The under
surfaces of the skulls of various birds are shown
with the different bones coloured to indicate their
limits and relations; these are followed by a series
of the different types of sternum or breast-bone.
The second wall-case contains further
illustrations of the anatomy of Birds. In the left-
hand part a series of wings of Birds displays the
form characteristic of different groups; while above
them are a few of the different types of tails,
supplementing the series of tails in the table-case.
Very instructive is a series of skins of white chickens
of the same brood at different ages, displaying the
gradual replacement of the down by the adult
plumage.
The table-case in the middle of the bay
contains illustrations of the external characters, the
beak, the feathers, and the tail, as well as of the
fore and hind limbs, or wings and feet. By the aid
of the explanatory labels, the essential characters
and the principal modifications of all these parts
may easily be followed.
Two cases on the wall in the vestibule leading
to the Fish Gallery illustrate the chief modifications
of the eggs of Birds, and their differences in
structure, number, form, size, texture of surface,
and colour. On the side of the main staircase
opposite are specimens illustrating the parasitic
nesting habits of certain Cuckoos and various other
Birds; while near by is a remarkably fine series of
the eggs of Cuckoos with those of the Birds among
which they were respectively deposited. On the
opposite (east) side of the staircase the visitor will
find a case showing the remarkable variation in
colouring and markings displayed by the eggs of
the Guillemot.
Bay IV. General structure of Reptiles and Amphibians.
The fourth bay on the west side of the hall
exhibits the leading peculiarities in the structure of
Reptiles and Amphibians. Owing to the large
number of groups in the former class now extinct,
many fossil specimens, or plaster reproductions of
the same, are shown. The wall-case on the south
side of this bay illustrates the different ordinal
groups of Reptiles—living and extinct. Very
instructive are the skeletons of Tortoises and
Turtles, showing the relations of the vertebræ and
limb-bones to the bony part of the shell. Lizards
and Snakes are mostly represented by coloured
casts. The extinct Dinosaurs are represented by a
small-sized model of Iguanodon, together with a
photograph of the skeleton and a plaster-cast of the
bones of the hind-foot showing the three toes.
The adjacent side of the table-case shows the
modifications of the backbone, or vertebral column,
of the ribs, and of the limbs, in the different groups
of the class. Specially noticeable are examples of
five types of Skink-like Lizards, exhibiting the
gradual diminution in the size of the limbs and their
final disappearance.
The opposite, or north, side of the table-case
displays the different modifications of the skull and
teeth of living and extinct Reptiles. In some, like
Crocodiles and Ichthyosaurs, the jaws are armed
with a full series of sharply pointed teeth, while in
others, like the Tortoises and Turtles, they are
devoid of teeth and encased in horn. Very
remarkable is the approximation to a carnivorous
mammalian type presented by the dentition of
some of the extinct mammal-like Reptiles, or
Theromorphs, and equally noticeable are the palatal
crushing teeth of certain other extinct Reptiles
known as Placodus and Cyamodus. The peculiar
dentition of the New Zealand Tuatera, and likewise
that of its extinct European and Indian ally
Hyperodapedon (fig. 9), are also shown.

Fig. 9.—Skull of the Giant Tuatera (Hyperodapedon

gordoni), from the Triassic Sandstone of
Lossiemouth, Elgin, (¼ nat. size). A, upper surface
of skull; B, palatal aspect of skull; C, under side of
front of lower jaw; Pmx, premaxillary bone; Mx,
maxillary; Pl, palatal teeth; Md, lower jaw; O,
orbit, or eye-socket; N, nostrils; S, temporal pit;
S’, lateral temporal fossa.

The brain and other internal organs of Reptiles
are displayed in the left half of the wall-case on the
north side of this bay, in which are also shown the
eggs of many species, in some cases with the
embryo.
Fig. 10.—Skeleton of the Great Blue Shark (Carcharodon
rondeletii), with portion of backbone on a large scale. pl,
functional upper jaw, and su, its reflected portion; md,
lower jaw; hy, ceratohyal; br, branchial arches; co,
pectoral girdle; ph, cartilaginous portion of pectoral, or
front paired fin; r, dermal portion of pectoral fin; pu,
pelvic, or hind paired, fin; c, centra, or bodies, of the
vertebræ; na, neural, or upper, and ha, hæmal, or lower,
arch. The median fins are not lettered.

In the right half of the same case are exhibited
a number of preparations showing the external
form and internal structure of Frogs and
Salamanders, or Amphibians, living and extinct. The
Giant Salamander of Japan (Megalobatrachus or
Cryptobranchus) is represented by a stuffed
specimen; but the Newts, Salamanders, and Frogs
are shown in spirit. Very curious is the almost
colourless and blind Olm (Proteus) from the caves
of Carniola; as also are the so-called Cœcilians, or
Apoda, which have the habits and, in some degree,
the appearance of large worms. Special specimens
exhibit the structure of the extinct Labyrinthodonts,
in which the hinder half of the skull is completely
roofed over by bone; while the teeth in many
instances exhibit a curious in-folded arrangement
from which the group derives its name.
Bay V. Structure of Fishes.
The last bay (No. V) on the west side of the
Central Hall is devoted to the display of the form
and structure of Fishes.
The wall-case on the left side of this bay
exhibits the external form of several characteristic
types of Fishes, such as the Pike, Cod, Turbot, Dog-
fish, and Skate, with the names of the various fins
affixed. A striking specimen is the skeleton—mainly
cartilaginous—of the Great Blue Shark
(Carcharodon rondeletii), fig. 10, which occupies
the greater portion of this case. It should be noted
that, as in all Sharks and Rays, the upper jaw does
not correspond with that of the higher Vertebrates;
and particular attention should be devoted to the
structure and arrangement of the arches supporting
the gills.
In the south side of the table-case in this bay
are shown a number of dissections, mounted in
spirit, displaying the different types of skeletal
structure presented by the fins in various groups of
Fishes. One of the most remarkable of these types
occurs in Ceratodus forsteri, the Queensland Lung-
fish, in which the skeleton of the fin consists of a
central jointed rod, from each side of which diverge
narrower jointed rods. Alongside are specimens
showing special modifications of certain fins, as in
the Flying Fish (fig. 11) and Flying Gurnard (fig.
12), for the purpose of sustaining the body in the
air, or, as in Pentanemus, to serve as organs of
touch. Specimens of the West Indian Goby and the
Lump-Sucker show modifications of the pelvic fins
in connection with a sucker on the lower surface of
the body; while other preparations display the
pectoral (Doras) and pelvic fins (Monocentris)
reduced to the condition of saw-like spines.
The structure of the skull of Fishes is illustrated
in another part of the same side of this case. From
this the visitor may learn how the primitive
cartilaginous skull of the Sharks (fig. 10), Rays,
Chimæras, and Lung-fishes has been gradually
modified, by the addition of superficial sheathing-
bones, into the bony skull of modern Fishes, such
as the Cod and Perch.

Fig. 11.—The Flying Fish (Exocœtus).

Fig. 12.—The Flying Gurnard (Dactylopterus).

The north side of the table-case in bay V is
mainly devoted to the display of the different types
of scales, spines, and teeth found among Fishes. In
one corner are the enamelled “ganoid” scales of the
modern American Bony Pike (Lepidosteus) and the
African Bichir (Polypterus) alongside those of
certain extinct forms. A scale of the Tarpon, or
King-of-the-Herrings, illustrates the largest
development in point of size of the modern
“cycloid” type. Spines of the Porcupine-fish show an
extreme development of this kind of structure.
Diagrams and spirit-preparations illustrate the mode
of attachment and succession of fish-teeth. A large
series of the teeth of Sharks and Rays displays the
gradual passage from those of the ordinary pointed
form to others arranged in a pavement-like manner
and adapted solely for crushing. Both types occur in
the Port Jackson Shark (fig. 13), but those of some
Rays are solely of the pavement modification. Very
remarkable is the dental structure in the Parrot-fish.
The west end of this side of the case shows the
various modifications assumed by the teeth of the
modern Bony Fishes; among which, as exemplified
by the Wrasse, teeth are developed on the bones of
the throat, as well as on those of the jaws.
Throughout this case specimens, or models, of the
teeth of extinct Fishes are placed side by side with
those of their nearest living relatives.
Fig. 13.—A Jaw of the Port Jackson
Shark (Cestracion philippi),
showing sharp teeth in front and
crushing ones behind.

The wall-case on the north side of this bay
shows the history of the development of various
Fishes, together with the form and structure of the
gills, brain, heart, digestive system, and other
organs.
Lancelet.
A small case affixed to the pillar at the entrance
of the fifth bay illustrates the structure of the
Lancelet (Branchiostoma, or Amphioxus), by the aid
of spirit-specimens, enlarged models, and coloured
diagrams. One of the most remarkable features in
the structure of this strange and primitive little
creature is the outer cavity enclosing the part of the
body which contains the large and complex
pharynx. The Lancelet was formerly included
among the Fishes, but is now accorded the rank of
a class (Cephalochorda) to itself.
Leaving bay VI, next the principal staircase on
the east side of the central hall, which is devoted to
illustrations of heredity, especially in relation to the
Mendelian theory, and to modes of Flight in
Vertebrates and Insects, we pass on to a table-case
assigned to the illustration of “Mimicry” and kindred
phenomena. Most of the examples shown occur
among Insects; but one example among Mammals
and a second in Birds are illustrated. Very striking is
a coloured sketch showing a group of red and black
caterpillars from Singapore grouped side by side on
the stem of a plant so as to present a remarkable
similarity to a succulent fruit.
Bay VIII.
In bay VIII, on the eastern side of the central
hall, is displayed an exhibition illustrating trees,
native to or grown in Britain. The winter and
summer states are indicated by photographs, and
the foliage, flowers, fruits, seedlings, and texture of
wood and bark by specimens, models, and
drawings.
Bay IX.
Bays IX and X are intended to illustrate
the general characters of the great groups of the
Vegetable Kingdom. Bay IX, in course of
arrangement, is devoted to the Cryptogams (Ferns,
Mosses, Fungi, Seaweeds, and Lichens).
At the back of the bay is a fine polished section
of a buttress from the base of the Tapang (Abauria
excelsa), the largest tree in Borneo, which attains a
height of 250 feet.
Bay X. Seed-bearing Plants.
The last bay (No. X) is devoted to the Seed-bearing
Plants, which are characterised by the
formation of a seed—the result of the fertilisation of
an ovule by the male cell which is developed in the
pollen. The series begins on the left hand side with
the Pteridosperms, an extinct group combining the
characters of Ferns and Seed-plants and forming a
link between them. Then follow the Gymnosperms
(Cycads, Pines, Firs, etc.), in which the seed is
borne naked on an open scale which generally
forms, with others like it, the characteristic cone.
Certain points in the development of pollen and
ovule recall similar stages in the Fern group, and
indicate that the Gymnosperms stand nearer to the
Cryptogams than do the Angiosperms, the other
and larger group of Seed-plants. The Gymnosperms
are also the older group, and contain many extinct
forms. In the Angiosperms the seed is enclosed in
the fruit, and in the development of pollen and
ovule almost all traces of a cryptogamic ancestry
have been lost; the great development of the
flower is a characteristic feature of the
Angiosperms. The arrangement of the vegetative
parts of the plant is based on its separation into
root, stem, and leaf. In the right-hand wall-case the
upper series of specimens illustrates the leaf, its
form, veining, direction, the characters of its stalk
and stipules, its modification for special purposes,
and its arrangement on the stem and in the bud.
Below, the stem and root are similarly treated, and
above are some anatomical drawings. The display
of the root is continued in the lower part of the
opposite wall-case. In the central case the chief
types of the flower with its parts, the fruit, and the
seed are exhibited.
At the back of the bay is a large transverse
section of the Karri tree (Eucalyptus diversicolor) of
Western Australia, a species which grows to a
height of 400 feet. The tree from which the section
was cut was about 200 years old when felled.
The Introductory Collection of Minerals will be
found in the gallery devoted to the Mineral
Department (see p. 90).
The North Hall.
Domesticated Animals, Hybrids, and Economic Zoology.
The North Hall, or that portion of the building
6
situated to the northward of the principal staircase,
is used for the exhibition of the more important
breeds of Domesticated Animals, as well as of
examples of Hybrids and other Abnormalities. A
series of specimens illustrative of Economic Zoology
is likewise temporarily placed here.
The examples of Domesticated Mammals
include Horses, Cattle, Sheep, Goats, Llamas, Dogs,
Cats, and Rabbits. One of the main objects of this
series is to show the leading characteristics of the
well-established breeds, both British and foreign. In
addition to Domesticated Animals properly so
called, there are also exhibited examples of what
may be termed Semi-domesticated Animals, such
as white or parti-coloured Rats and Mice.
7
The skulls and skeletons of celebrated Horses
of all breeds, including those of the Thoroughbreds
“Persimmon” (presented by His Majesty King
Edward VII.), “Stockwell,” “Bend Or,” and
“Ormonde,” and of the Shire “Blaisdon Conqueror,”
form a notable feature of the series. In another
case is exhibited the dentition of the Horse at
different periods of existence; while on the opposite
side of the same is illustrated the evolution of the
Horse from three-toed and four-toed ancestors, and
also certain peculiarities distinguishing the skulls of
Thoroughbreds and Arabs from those of most other
breeds.
Among the more notable exhibits are a
mounted specimen of a Spanish Fighting Bull,
which belongs to an altogether peculiar breed, and
heads of Spanish Draught Cattle, presented by H.M.
King Edward VII. Among the Sheep, attention may
be directed to the four-horned, fat-tailed, and fat-
rumped breeds, and also to the small breed from
the island of Soa, as well as the curious spiral
horned Wallachian Sheep. The so-called wild cattle
of Chillingham Park are included in this series, since
they are not truly wild animals, but are descended
from a domesticated breed. The celebrated
Greyhound “Fullerton” is shown among the series of
Dogs, which also comprises examples of the Afghan
Greyhound, and of the Slughi or Arab Greyhound.
Small-sized models of Cattle, Horses, Sheep, and
Pigs also form a feature of the series.
A hybrid between the Zebra and the Ass is
shown in one of the cases; while photographs
illustrate the results of experiments undertaken by
Professor Ewart in cross-breeding between
Burchell’s Zebra and the Horse. An example of the
Lion-Tiger hybrids born many years ago in Atkins’
menagerie is likewise shown.
A fine series of hybrid Ducks and hybrid
Pheasants is exhibited in the north hall.
Skeletons of Man and Horse.
Facing the visitor as he approaches the middle
of the north hall are the skeletons of a Man and of
a Horse, arranged for comparison with each other,
and also to show the position of the bones of both
in relation to the external surface. In the case of
the Horse, the skin of the same animal from which
the skeleton was prepared was carefully mounted,
and, when dry, divided in the middle line; one half,
lined with velvet, has been placed behind the
skeleton. In the case of the Man, the external
surface is shown by a papier-maché model,
similarly lined and placed in a corresponding
position. As all the principal bones of both skeletons
have their names attached, a study of this group
will not only afford a lesson in comparative
anatomy, but be of practical utility to the artist.
Section of “Big Tree.”
Against the wall dividing the north hall from the
central hall is placed a section of a very large
Wellingtonia or “Big Tree” (Sequoia gigantea),
which was cut down in 1892 near Fresno, in
California. It is about fifteen feet in diameter, and
perfectly sound to the centre, showing distinctly
1,335 rings of annual growth, which afford exact
evidence of the age of the tree. An instantaneous
photograph, taken while the tree was being felled,
is placed near by, and shows its general appearance
when living. The height of the tree was 276 feet.
The exhibits of Economic Zoology at present
occupy the northern division of this hall. In the
western wall-case are specimens showing the
injuries caused to trees by various insects. The
table-cases contain examples of the damage done
in Britain to fruit, roots, corn, and garden and
vegetable produce, with specimens of the insects,
and hints as to methods of destruction. There are
also examples of injury done by insects abroad to
cotton, tea, coffee, etc. In the cases under the
windows are various parasites affecting man and
domesticated animals.

Staircase and Corridors.

Statue of Darwin.
On the first landing of the great staircase,
facing the centre of the hall, is placed the seated
marble statue of Charles Darwin (b. 1809, d. 1882),
to whose labours the study of natural history owes
so vast an impulse. The statue was executed by Sir
J. E. Boehm, R.A., as part of the “Darwin Memorial”
raised by public subscription. It was unveiled and
placed under the care of the Trustees of the
Museum on the 9th of June, 1885, when an
address was delivered on behalf of the Memorial
Committee by the late Professor Huxley, P.R.S., to
which His late Majesty King Edward VII. (then
Prince of Wales), as representing the Trustees,
replied.
Statue of Banks.
Above the first landing the staircase divides into
two flights, each leading to one of the corridors
which flank the west and east sides of the hall and
give access to the galleries of the first floor of the
building. Near the southern ends of these corridors
two staircases join to form a central flight leading
to the second floor. On the landing at the top is a
marble statue by Chantrey of Sir Joseph Banks (b.
1743, d. 1820), who for 41 years presided over the
Royal Society and was Trustee of the Museum. His
botanical collections are preserved in the adjoining
gallery, but his library of works on natural history,
also bequeathed to the Museum, remains at
Bloomsbury, where the statue, erected by public
subscription in 1826, stood until it was removed to
its present situation in 1886. On the wall above is
displayed a series of unusually fine heads of Indian
Big Game Animals, bequeathed by Mr. A. O. Hume,
C.B., in 1912.
Fig. 14.—A Female Okapi (Okapia johnstoni).

African Antelopes.
The west, south, and east corridors contain a
portion of the collection of mounted Mammals for
which there is not room in the gallery immediately
adjoining. The specimens placed here include a
large number of species of the finest African
Antelopes, animals remarkable for their beauty, for
their former countless numbers, and for their
threatened extermination in consequence of the
inroads of civilized man into their domain.
Giraffes and Okapi.
In a case at the head of the staircase leading to
the east corridor are several mounted specimens of
Giraffes, and near by a skeleton of the same.
Alongside the former is placed a case containing
the heads and necks, together with skulls, of the
various local races of Giraffes; while in a third are
displayed three specimens of their near ally the
Okapi (fig. 14) of the Congo Forest, as well as a
skeleton of the same.
Collections of Humming-Birds.
The collection of Humming-Birds (Trochilidæ)
arranged and mounted by the late Mr. John Gould,
and purchased for the Museum in 1881, is
principally shown in the vestibule leading from the
hall to the Fish-gallery, but a few cases are placed
on the pillars of the staircase. Another large
collection of these birds, presented in 1913 by Mr.
E. J. Balston, of Maidstone, is exhibited in the
corridor leading to the Whale-room.

WEST WING.
The whole of the west wing of the building is
devoted to the collections of recent Zoology.

(A) Ground Floor.

8
Bird Gallery.
The ground floor is entered from the west side
(left hand) of the central hall, near the main
entrance of the building. The long gallery,
extending the entire length of the front of the wing
as far as the west pavilion, is assigned to the
exhibited collection of Birds, the study-series of the
same group being kept in cabinets in a room
behind.
Systematic Series in Wall-cases.
The wall-cases contain mounted specimens of
all the principal genera, placed in systematic order,
beginning with the Crows and Birds of Paradise on
the left hand on entering, and ending with the
Ostriches, Emus, etc., on the right.
British Museum (Natural History)
Ground Floor.
Fig. 15.—The Great Auk or Gare-Fowl (Plautus, or Alca, impennis), and its egg.

Among the multitude of species exhibited in this
gallery, which form, however, but a small proportion
of the different kinds of Birds known to inhabit the
globe, only a few of the more striking can be
mentioned. The various types of the Birds-of-Prey
are very fully represented: from the Condor of the
Andes, the large Sea-Eagle of Bering Strait, and the
Great Eagle-Owl of Europe (all of which are placed
in separate cases), to the Dwarf Falcon in case 53,
which is not much larger than a sparrow, and preys
upon insects. Among the large group of Perching-
Birds, attention may be directed to the cases of
Birds of Paradise and Bower-Birds in the first bay on
the left. In separate cases in the sixth bay on the
opposite side of the gallery are placed skeletons of
the Dodo and Solitaire, large Pigeon-like birds with
wings too small for flight, once inhabiting the
islands of Mauritius and Rodriguez, respectively, but
now extinct. Other cases on the right-hand side of
the gallery are occupied by the Game-Birds, and
the Wading and Swimming Birds. Here may be
noticed a nearly complete series of the genera of
Pheasants and Pigeons, showing the various forms.
Special attention may be directed to the Great Auk
(fig. 15), from the Northern Atlantic, which became
extinct only in the last century. Casts of the eggs
(fig. 16) of this curious bird are also exhibited. A
case in the 7th bay contains a series of Penguins,
flightless birds which may be regarded as
representing the northern Auks and Guillemots in
the southern oceans. Particularly interesting is the
great Emperor Penguin, which lays its eggs and
rears its young in winter amidst the ice of the
Antarctic. Most of the specimens exhibited were
obtained during the British Antarctic Expedition of
1839–43, under the command of Captain Sir James
Clark Ross.
Other noteworthy types are the Great Bustard,
once an inhabitant of England, and the Flamingos;
a pair of the latter being exhibited with their nest.
In the first two bays on the right side of the
gallery are placed specimens of the Ostrich group,
characterised by the flat or raft-like form of the
breast-bone. Owing to the rudimentary character of
their wings, these Birds lack the power of flight.
They include the largest existing Birds, the
Ostriches, Emus, and Cassowaries, as well as the
small Kiwis (Apteryx) of New Zealand, together with
the extinct Moas (Dinornis, etc.), of the same
country, and the Roc (Æpyornis) of Madagascar. A
fossil egg of the latter is placed alongside eggs of
the existing species of the group.
Fig. 16.—Egg of the Great Auk or Gare-Fowl: Size of nature.

Groups of British Birds and Nests.
Down the middle line of the gallery, as well as
in many of the bays, are placed groups showing the
nesting-habits of various species of British birds.
The great value of these groups consists in their
absolute truthfulness to nature. The surroundings
are not selected by chance or from imagination, but
in every case are carefully executed reproductions
of those that were present round the individual
nest. When it has been possible, the actual rocks,
trees, or grass, have been preserved, but in cases
where these could not be used, they have been
accurately modelled from nature. Great care has
also been taken in preserving the natural form and
characteristic attitudes of the Birds themselves.
Among the more attractive cases are, near the
centre of the gallery, a pair of Puffins feeding their
single young one, and Black-throated Divers with
their eggs in a hollow in the grass on the edge of a
mountain-loch in Sutherland. Hen-harriers—the
male grey and the female brown—are shown with
their nest among the heather from the moorland of
the same county. On the left of these is a Peregrine
Falcon’s eyrie, on the ledge of a rocky cliff,
containing three white downy nestlings. Near by are
various species of Ducks, notably the Red-headed
Pochard on the sedgy border of a Norfolk mere. In
the last bay but one on the right side is a nest of
the Heron, in a fir-tree, with the two old birds and
three nearly fledged young. Various species of Gulls
and a particularly beautiful group of Arctic Terns
from the Shetland Islands are exhibited in the
middle line towards the west end of the gallery and
in the eighth and ninth bays. In the eighth bay on
the right side and in the adjoining passage are
Plovers, Sandpipers, Snipes, etc., some of which
(especially the Ringed and Kentish Plovers) show
the wonderful adaptation of the colouring of the
eggs and young birds to their natural surroundings
for the purpose of concealment. In the second
passage leading to the Coral-gallery are Ptarmigan
and Capercaillie from Scotland, and in the adjacent
part of the middle line Wood-Pigeons and Turtle-
Doves building their simple, flat nests of sticks in
ivy-clad trees. In the fourth, sixth and seventh bays
on the left are Sand-Martins and Kingfishers,
showing, by means of sections of the banks of sand
or earth, the form and depth of the hole in which
the eggs are placed; and also nests of the Swift,
Swallow, and House-Martin, all in portions of
human habitations.
9
Pavilion, with British Land and Fresh-water Vertebrates.
The “pavilion” at the west end of the Bird-gallery
is devoted to the exhibition of the land and
fresh-water Vertebrated Animals of the British
Islands. The larger Mammals and Fishes occupy the
wall-case on the north side, which is surmounted
with horns. In the two pairs of centre cases is
exhibited the series of British Birds, supplemented
by the groups, to which reference has been made
already. The wall-case on the north side of the
archway contains a group of Gannets and other
sea-birds from the Bass Rock in the Firth of Forth.
On the opposite side are two striking groups with
the surroundings true to nature, the one of the
Golden Eagle and the other of the Buzzard, both
taken in Scotland. Other groups in the pavilion
display the Kestrel, the Peregrine Falcon, and the
Merlin amid natural surroundings. Among the
Mammals, especial attention may be directed to a
case of British Hares and Rabbits. In another case
may be seen a female Badger and her young; in a
third is a group of Otters; in a fourth a vixen Fox
with her cubs; in a fifth a Mole-hill with its
inhabitants; in a sixth a pair of Martens; in a
seventh Polecats and their young; while other cases
are devoted to Stoats, Weasels, Hedgehogs,
Squirrels, Rats, Mice, etc.
Here it may be mentioned that the animal
inhabitants of any country or district are collectively
termed its “fauna.” The British Islands in this
respect belong to the great zoological region called
Palæarctic, or Eastern Holarctic, embracing all
Europe, the north of Africa, and the western and
northern portions of Asia. As in the case of other
islands, the species belonging to groups in which
the power of locomotion is limited to land or fresh-
water are not numerous compared with those
inhabiting large continental tracts. Their numbers
can only increase under exceptional circumstances,
and have a tendency to diminish as the growth of
human population and increase of the area of
cultivated land gradually reduce their native haunts.
In this way the Brown Bear, the Wolf, the Beaver,
and the Wild Boar have disappeared from Britain
within the historic period, while other species, such
as the Badger, Marten, and Wild Cat, with difficulty
maintain a more or less precarious existence. All
these were originally derived from the mainland of
Europe, probably before the formation of the
channel which now separates it from Great Britain.
The wider and older channel which separates
Ireland from Great Britain has been a greater
barrier to the emigration of animal life than that
between the latter and the Continent, many species
(as the Polecat, Wild Cat, Mole, Squirrel, Dormouse,
Harvest-Mouse, Water-Rat, Short-tailed Field-
Mouse, Brown Hare, Roedeer, as well as Snakes
and Toads) never having crossed what is now the
Irish Sea, unless by human agency.
On the other hand, those species that have the
power of travelling through the air or traversing the
ocean are far less fixed in their habitat; and it
results from this that the list of so-called “British
Birds” receives accessions from time to time from
stragglers which find their way from the European
continent or Asia, or even across the Atlantic.
Slight but permanent variations from the
continental type may be recognised in many native
British species, some of the most marked among
vertebrated animals being the Irish Stoat, the
Squirrel, the Red Grouse, the St. Kilda Wren, the
Coal-Tit, the Goldcrest, and several species of fresh-
water fishes, mostly belonging to the genera Salmo
and Coregonus. Some of the latter, such as the
Vendace, the Gwyniad, and their allies, of which
specimens are exhibited in the wall-case in the
pavilion, have an extremely local distribution, being
found only in certain small groups of mountain
lakes.
Of the Seals, only two species are really natives
of Britain, the Common Seal (Phoca vitulina) and
the great Grey Seal (Halichœrus grypus);
specimens of both these are shown in the pavilion.
Those desirous of studying more minutely the
characteristics of British Mammals should examine
the series of skins and skulls exhibited in a special
case on the right side of the central west window.
10
Coral Gallery.
Parallel with the Bird-gallery, on the north side
(right on entering), and approached by several
passages, is a long narrow gallery containing the
collection of Corals and Sponges and allied types.
Commencing at the eastern end, some of the
lowest forms of animal life are exhibited in the wall-
case and table-cases; they belong to a group called
Protozoa, and, for the greater part, are so minute,
that they can be studied only with the microscope;
their structure is therefore illustrated chiefly by
means of models and figures. The next divisions of
the gallery are occupied by the Sponges, most
conspicuous among these being a series showing
the variations of the common Bath-Sponge (cases 1
and 2), the beautiful flinty Venus’ Flower-basket or
Euplectella (fig. 17), the Japanese Glass-rope
Sponge or Hyalonema (case 3), and the gigantic
Neptune’s Cup or Poterion, of which several
specimens are placed on separate stands. Special
interest attaches to the case showing the different
kinds of Sponges used in commerce.
Fig. 17.—Venus’ Flower-Basket (Euplectella imperialis and E. aspergillum). (One-sixth natural size.)
Fig. 18.—Brain-Coral (Meandrina cerebriformis).

Nearly the whole of the remainder of the gallery
is given up to Corals. In life these organisms display
an immense variety of form and colour, sometimes
presenting a marvellous resemblance to vegetable
growths; but the part exhibited in the gallery is
merely the dried, hard, horny, or stony basis or
supporting skeleton, either of isolated individuals,
or of colonies. Corals are allied to the well-known
Sea-anemones of the British and other coasts; the
combined skeletons of myriads of these animals
form the coral-reefs which constitute the bases of
thousands of islands in the Indo-Pacific Ocean.
Among the larger reef-making species are the
Brain-Corals (Meandrina), one of which is shown in
figure 18. Near the west end of the gallery is placed
a magnificent specimen of the Black Coral of the
Mediterranean (Gerardia savalia), obtained off the
coast of the island of Eubœa in the Ægean Sea.
The drawing in the case shows a magnified view of
the “animals” or polyps of this species as they
appear in life. In case 13 are specimens and