Chapter 3 Database Management System

Chapter 3 discusses Database Management Systems (DBMS) and their role in organizing and managing data effectively, contrasting them with traditional file environments that often lead to data redundancy and inconsistency. It highlights the capabilities of DBMS, including data centralization, reduced redundancy, and improved data security, while also introducing relational and object-oriented databases. Additionally, the chapter covers the importance of designing databases with a focus on relationships among data and the necessity of normalization for efficient data management.

Uploaded by

monjumanoor23
Chapter 3: Database Management Systems
Contents:
Organizing Data in Traditional File Environment
Database Management Systems
Capabilities of Database Management Systems
Data Warehouse-Data Mart, Data Mining
ORGANIZING DATA IN A TRADITIONAL FILE ENVIRONMENT

 An effective information system provides users with accurate, timely, and
relevant information.
 Accurate information is free of errors.
 Information is timely when it is available to decision makers when it is needed.
 Information is relevant when it is useful and appropriate for the types of work and
decisions that require it.
 It is surprising to learn that many businesses don’t have timely, accurate, or
relevant information because the data in their information systems have been
poorly organized and maintained.
 That’s why data management is so essential.
FILE ORGANIZATION TERMS AND CONCEPTS
 A computer system organizes data in a hierarchy that starts with bits and bytes
and progresses to fields, records, files, and databases (see Figure 6-1 in next
slide).
 A bit represents the smallest unit of data a computer can handle.
 A group of bits, called a byte, represents a single character, which can be a letter,
a number, or another symbol.
 A grouping of characters into a word, a group of words, or a complete number
(such as a person’s name or age) is called a field.
 A group of related fields, such as the student’s name, the course taken, the date,
and the grade, comprises a record;
 a group of records of the same type is called a file.
 For example, the records in previous figure could constitute a student course file.
 A group of related files makes up a database. The student course file illustrated in
figure 6-1 could be grouped with files on students’ personal histories and financial
backgrounds to create a student database.
 A record describes an entity.
 An entity is a person, place, thing, or event on which we store and maintain
information.
 Each characteristic or quality describing a particular entity is called an
attribute. For example, Student ID, Course, Date, and Grade are attributes of
the entity COURSE.
 The specific values that these attributes can have are found in the fields of
the record describing the entity COURSE.
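
The hierarchy described above can be sketched in a few lines of Python. This is only an illustration: the class name `CourseRecord`, the dictionary keys, and all sample values are hypothetical and are not taken from Figure 6-1.

```python
from dataclasses import dataclass, fields


@dataclass
class CourseRecord:
    """One record describing the entity COURSE; each attribute is a field."""
    student_id: int
    course: str
    date: str
    grade: str


# A file is a group of records of the same type (sample values are made up).
student_course_file = [
    CourseRecord(1001, "IS 101", "Fall", "B+"),
    CourseRecord(1002, "IS 101", "Fall", "A"),
]

# A database is a group of related files.
student_database = {
    "course": student_course_file,
    "personal_history": [],
    "financial": [],
}

# The attributes of the entity are the field names of the record.
attribute_names = [f.name for f in fields(CourseRecord)]
print(attribute_names)
```

Each `CourseRecord` instance plays the role of a record, the list plays the role of a file, and the dictionary groups related files into a database.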
PROBLEMS WITH THE TRADITIONAL FILE ENVIRONMENT

 In most organizations, systems tended to grow independently without a
company-wide plan.
 Accounting, finance, manufacturing, human resources, and sales and marketing all
developed their own systems and data files.
 Figure 6-2 illustrates the traditional approach to information processing
 Each application, of course, required its own files and its own computer program
to operate.
 For example, the human resources functional area might have a personnel master file, a
payroll file, a medical insurance file, a pension file, a mailing list file, and so forth until
tens, perhaps hundreds, of files and programs existed.
 In the company as a whole, this process led to multiple master files created, maintained, and
operated by separate divisions or departments.
 As this process goes on for 5 or 10 years, the organization is saddled with hundreds
of programs and applications that are very difficult to maintain and manage.
 The resulting problems are data redundancy and inconsistency, program-data
dependence, inflexibility, poor data security, and an inability to share data among
applications.
(1) Data Redundancy and Inconsistency
 Data redundancy is the presence of duplicate data in multiple data files so
that the same data are stored in more than one place or location.
 Data redundancy occurs when different groups in an organization
independently collect the same piece of data and store it independently of
each other.
 Data redundancy wastes storage resources and also leads to data
inconsistency, where the same attribute may have different values.
 For example, the same attribute, Student_ID, may have different names in
different systems throughout the organization.
 Additional confusion might result from using different coding systems to represent
values for an attribute. For instance, the sales, inventory, and manufacturing
systems of a clothing retailer might use different codes to represent clothing size.
One system might represent clothing size as “extra large,” whereas another might
use the code “XL” for the same purpose.
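
The clothing-size example can be made concrete with a small sketch. The two dictionaries below stand in for departmental files; the SKU keys, the values, and the function name `find_inconsistencies` are assumptions made for illustration.

```python
# Two departmental files that store the same attribute (hypothetical data).
sales_file = {"SKU-1": "extra large", "SKU-2": "medium"}
inventory_file = {"SKU-1": "XL", "SKU-2": "medium"}


def find_inconsistencies(file_a, file_b):
    """Return the keys stored in both files whose values differ."""
    shared_keys = file_a.keys() & file_b.keys()
    return sorted(k for k in shared_keys if file_a[k] != file_b[k])


conflicts = find_inconsistencies(sales_file, inventory_file)
print(conflicts)
```

Both files describe the same item, yet a simple comparison flags it as a conflict because each system uses its own coding scheme.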
(2) Program-Data Dependence

 Program-data dependence refers to the coupling of data stored in files and
the specific programs required to update and maintain those files such that
changes in programs require changes to the data.
 Every traditional computer program has to describe the location and nature of
the data with which it works.
 In a traditional file environment, any change in a software program could
require a change in the data accessed by that program.
 One program might be modified from a five-digit to a nine-digit ZIP code.
 If the original data file were changed from five-digit to nine-digit ZIP codes,
then other programs that required the five-digit ZIP code would no longer
work properly.
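
The ZIP code example can be sketched as follows, assuming a fixed-width record layout that is hard-coded into the program. The layout, the function name `parse_record`, and the sample records are hypothetical.

```python
# The program itself describes where each field lives in the record:
# name in columns 0-9, ZIP code in columns 10-14, city from column 15 on.
def parse_record(record):
    name = record[0:10].strip()
    zip_code = record[10:15]
    city = record[15:].strip()
    return name, zip_code, city


old_record = "Ann Lee   " + "10001" + "New York"      # five-digit ZIP
new_record = "Ann Lee   " + "100011234" + "New York"  # nine-digit ZIP

print(parse_record(old_record))
print(parse_record(new_record))  # the extra ZIP digits spill into the city
```

Once the file is rewritten with nine-digit ZIP codes, a program that still assumes five digits misreads the city field, even though nothing in that program changed.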
(3) Lack of Flexibility

 A traditional file system can deliver routine scheduled reports after extensive
programming efforts, but it cannot deliver ad hoc reports or respond to
unanticipated information requirements in a timely fashion.
 The information required by ad hoc requests is somewhere in the system but
may be too expensive to retrieve.
 Several programmers might have to work for weeks to put together the
required data items in a new file.
(4) Poor Security

 Because there is little control or management of data,
access to and dissemination of information may be out of
control.
 Management may have no way of knowing who is
accessing or even making changes to the organization’s
data.
(5) Lack of Data Sharing and Availability

 Because pieces of information in different files and different parts of
the organization cannot be related to one another, it is virtually
impossible for information to be shared or accessed in a timely
manner.
 Information cannot flow freely across different functional areas or
different parts of the organization.
 If users find different values of the same piece of information in two
different systems, they may not want to use these systems because
they cannot trust the accuracy of their data.
THE DATABASE APPROACH TO DATA MANAGEMENT
 Database technology cuts through many of the problems of traditional file
organization.
 A more rigorous definition of a database is a collection of data organized to
serve many applications efficiently by centralizing the data and controlling
redundant data.
 Rather than storing data in separate files for each application, data are
stored so as to appear to users as being stored in only one location.
 A single database services multiple applications.
 For example, instead of a corporation storing employee data in separate
information systems and separate files for personnel, payroll, and benefits, the
corporation could create a single common human resources database.
DATABASE MANAGEMENT SYSTEMS

 A database management system (DBMS) is software that permits an
organization to centralize data, manage them efficiently, and provide access
to the stored data by application programs.
 The DBMS acts as an interface between application programs and the physical
data files.
 When the application program calls for a data item, such as gross pay, the
DBMS finds this item in the database and presents it to the application
program.
 Using traditional data files, the programmer would have to specify the size
and format of each data element used in the program and then tell the
computer where they were located.
 The DBMS relieves the programmer or end user from the
task of understanding where and how the data are
actually stored by separating the logical and physical
views of the data.
 The logical view presents data as they would be perceived
by end users or business specialists, whereas the physical
view shows how data are actually organized and
structured on physical storage media.
 The database management software makes the physical database
available for different logical views required by users.
 For example, for the human resources database illustrated in Figure
6-3, a benefits specialist might require a view consisting of the
employee’s name, social security number, and health insurance
coverage.
 A payroll department member might need data such as the
employee’s name, social security number, gross pay, and net pay.
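
One way to picture separate logical views over a single physical table is with SQL views. The sketch below uses Python's built-in `sqlite3` module; the table layout, the view names, and the sample row are assumptions and do not reproduce Figure 6-3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One physical table holds all employee data.
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT,
        ssn         TEXT,
        health_plan TEXT,
        gross_pay   REAL,
        net_pay     REAL
    );
    INSERT INTO employee
        VALUES (1, 'Pat Doe', '000-00-0000', 'Plan A', 5000.0, 3900.0);

    -- Each group of users gets its own logical view of the same data.
    CREATE VIEW benefits_view AS
        SELECT name, ssn, health_plan FROM employee;
    CREATE VIEW payroll_view AS
        SELECT name, ssn, gross_pay, net_pay FROM employee;
""")

benefits_row = conn.execute("SELECT * FROM benefits_view").fetchone()
payroll_row = conn.execute("SELECT * FROM payroll_view").fetchone()
print(benefits_row)
print(payroll_row)
```

Neither user needs to know how or where the `employee` table is stored; each one queries only the view that matches the job.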
How a DBMS Solves the Problems of the Traditional File Environment
 A DBMS reduces data redundancy and inconsistency by minimizing isolated files in
which the same data are repeated.
 The DBMS may not enable the organization to eliminate data redundancy entirely,
but it can help control redundancy.
 Even if the organization maintains some redundant data, using a DBMS eliminates
data inconsistency because the DBMS can help the organization ensure that every
occurrence of redundant data has the same values.
 The DBMS uncouples programs and data, enabling data to stand on their own.
 Access and availability of information will be increased and program development
and maintenance costs reduced because users and programmers can perform ad
hoc queries of data in the database.
 The DBMS enables the organization to centrally manage data, their use, and
security.
Relational DBMS

 Contemporary DBMS use different database models to keep track of entities,
attributes, and relationships.
 The most popular type of DBMS today for PCs as well as for larger computers and
mainframes is the relational DBMS.
 Relational databases represent data as two-dimensional tables (called relations).
 Tables may be referred to as files.
 Each table contains data on an entity and its attributes.
 Microsoft Access is a relational DBMS for desktop systems, whereas DB2, Oracle
Database, and Microsoft SQL Server are relational DBMS for large mainframes and
midrange computers.
 MySQL is a popular open-source DBMS, and Oracle Database Lite is a DBMS for
small handheld computing devices.
Let’s look at how a relational database organizes data about
suppliers and parts (see Figure 6-4).
Operations of a Relational DBMS

 Relational database tables can be combined easily to deliver data
required by users, provided that any two tables share a common data
element.
 Suppose we want to find in this database the names of suppliers who
could provide us with part number 137 or part number 150.
 We would need information from two tables: the SUPPLIER table and
the PART table.
 Note that these two files have a shared data element:
Supplier_Number.
 In a relational database, three basic operations are used to develop useful
sets of data: select, join, and project.
 The select operation creates a subset consisting of all records in the file that
meet stated criteria. Select creates, in other words, a subset of rows that
meet certain criteria. In our example, we want to select records (rows) from
the PART table where the Part_Number equals 137 or 150.
 The join operation combines relational tables to provide the user with more
information than is available in individual tables. In our example, we want to
join the now-shortened PART table (only parts 137 or 150 will be presented)
and the SUPPLIER table into a single new table.
 The project operation creates a subset consisting of columns in a table,
permitting the user to create new tables that contain only the information
required. In our example, we want to extract from the new table only the
following columns: Part_Number, Part_Name, Supplier_Number, and
Supplier_Name.
 See the figure in next slide
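
The three operations can also be sketched directly in Python on small in-memory tables. The rows below are made up for illustration and are not the data from the figure; the helper names `select`, `join`, and `project` simply mirror the operation names.

```python
# Hypothetical rows standing in for the PART and SUPPLIER tables.
PART = [
    {"Part_Number": 137, "Part_Name": "Part X", "Supplier_Number": 1},
    {"Part_Number": 145, "Part_Name": "Part Y", "Supplier_Number": 3},
    {"Part_Number": 150, "Part_Name": "Part Z", "Supplier_Number": 2},
]
SUPPLIER = [
    {"Supplier_Number": 1, "Supplier_Name": "Supplier A", "Supplier_City": "City A"},
    {"Supplier_Number": 2, "Supplier_Name": "Supplier B", "Supplier_City": "City B"},
    {"Supplier_Number": 3, "Supplier_Name": "Supplier C", "Supplier_City": "City C"},
]


def select(table, predicate):
    """Keep only the rows that meet the stated criteria."""
    return [row for row in table if predicate(row)]


def join(left, right, key):
    """Combine rows from two tables that share the same value for key."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def project(table, columns):
    """Keep only the named columns of every row."""
    return [{c: row[c] for c in columns} for row in table]


selected = select(PART, lambda row: row["Part_Number"] in (137, 150))
joined = join(selected, SUPPLIER, "Supplier_Number")
result = project(
    joined, ["Part_Number", "Part_Name", "Supplier_Number", "Supplier_Name"]
)
for row in result:
    print(row)
```

The join works only because both tables carry the shared data element Supplier_Number.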
Object-Oriented DBMS

 Many applications today and in the future require databases that can store
and retrieve not only structured numbers and characters but also drawings,
images, photographs, voice, and full-motion video.
 DBMS designed for organizing structured data into rows and columns are not
well suited to handling graphics-based or multimedia applications.
 Object-oriented databases are better suited for this purpose.
 An object-oriented DBMS stores the data and procedures that act on those
data as objects that can be automatically retrieved and shared.
 Object-oriented database management systems (OODBMS) are becoming
popular because they can be used to manage the various multimedia
components or Java applets used in Web applications, which typically
integrate pieces of information from a variety of sources.
Databases in the Cloud

 Suppose your company wants to use cloud computing services.
 Is there a way to manage data in the cloud?
 The answer is a qualified “Yes.”
 Cloud computing providers offer database management services, but these
services typically have less functionality than their on-premises counterparts.
 At the moment, the primary customer base for cloud-based data management
consists of Web-focused start-ups or small to medium-sized businesses looking
for database capabilities at a lower price than a standard relational DBMS.
 Amazon Web Services has both a simple non-relational database called
SimpleDB and a Relational Database Service, which is based on an online
implementation of the MySQL open source DBMS.
CAPABILITIES OF DATABASE MANAGEMENT SYSTEMS

 A DBMS includes capabilities and tools for organizing, managing, and accessing the data in the
database.
 The most important are its data definition language, data dictionary, and data manipulation
language.
 DBMS have a data definition capability to specify the structure of the content of the
database.
 It would be used to create database tables and to define the characteristics of the fields in
each table.
 This information about the database would be documented in a data dictionary.
 A data dictionary is an automated or manual file that stores definitions of data elements and
their characteristics.
 Microsoft Access has a rudimentary data dictionary capability that displays information about
the name, description, size, type, format, and other properties of each field in a table (see
Figure 6-6).
 Data dictionaries for large corporate databases may capture additional information, such as
usage, ownership (who in the organization is responsible for maintaining the data),
authorization, security, and the individuals, business functions, programs, and reports that
use each data element.
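
As a rough illustration of data definition and a rudimentary data dictionary, the sketch below defines a table with SQLite (through Python's built-in `sqlite3` module) and then reads the field definitions back with `PRAGMA table_info`. The table and column names are assumptions, and SQLite is used only as a convenient stand-in for the DBMS products named in this chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: specify the structure of a table and its fields.
conn.execute("""
    CREATE TABLE supplier (
        supplier_number INTEGER PRIMARY KEY,
        supplier_name   TEXT NOT NULL,
        supplier_city   TEXT
    )
""")

# PRAGMA table_info yields one row per field:
# (cid, name, type, notnull, dflt_value, pk)
data_dictionary = [
    {
        "field": name,
        "type": col_type,
        "required": bool(notnull),
        "primary_key": bool(pk),
    }
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(supplier)"
    )
]
for entry in data_dictionary:
    print(entry)
```

A corporate data dictionary would record far more (ownership, usage, authorization), but the idea is the same: definitions of data elements are stored and can be looked up.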
Querying and Reporting
 A DBMS includes tools for accessing and manipulating
information in databases.
 Most DBMS have a specialized language called a data
manipulation language that is used to add, change,
delete, and retrieve the data in the database.
 This language contains commands that permit end users
and programming specialists to extract data from the
database to satisfy information requests and develop
applications.
 The most prominent data manipulation language today is
Structured Query Language, or SQL.
Figure 6-7 illustrates the SQL query that would produce the
new resultant table in Figure 6-5.
Figure 6-8 illustrates how the same query to select parts and suppliers
would be constructed using the Microsoft Access query-building tools.
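
The four data manipulation tasks (add, change, delete, retrieve) can be tried out with SQL through Python's built-in `sqlite3` module. This is a sketch under assumptions: the rows are made up, and the final query is one plausible way to ask for parts 137 and 150 with their suppliers, not a reproduction of the query in Figure 6-7.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (
        Supplier_Number INTEGER PRIMARY KEY,
        Supplier_Name   TEXT
    );
    CREATE TABLE part (
        Part_Number     INTEGER PRIMARY KEY,
        Part_Name       TEXT,
        Supplier_Number INTEGER REFERENCES supplier(Supplier_Number)
    );
""")

# Add data.
conn.executemany(
    "INSERT INTO supplier VALUES (?, ?)",
    [(1, "Supplier A"), (2, "Supplier B")],
)
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?)",
    [(137, "Part X", 1), (145, "Part Y", 2), (150, "Part Z", 2)],
)

# Change and delete data.
conn.execute("UPDATE part SET Part_Name = 'Part X (rev 2)' WHERE Part_Number = 137")
conn.execute("DELETE FROM part WHERE Part_Number = 145")

# Retrieve data: select rows, join the two tables, project four columns.
rows = conn.execute("""
    SELECT part.Part_Number, part.Part_Name,
           supplier.Supplier_Number, supplier.Supplier_Name
    FROM part
    JOIN supplier ON part.Supplier_Number = supplier.Supplier_Number
    WHERE part.Part_Number IN (137, 150)
    ORDER BY part.Part_Number
""").fetchall()
for row in rows:
    print(row)
```

The single SELECT statement expresses the select, join, and project operations at once: WHERE picks the rows, JOIN combines the tables, and the column list keeps only the fields required.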
DESIGNING DATABASES

 To create a database, you must understand the relationships among
the data, the type of data that will be maintained in the database,
how the data will be used, and how the organization will need to
change to manage data from a company-wide perspective.
 The database requires both a conceptual design and a physical design.
 The conceptual, or logical, design of a database is an abstract model
of the database from a business perspective, whereas the physical
design shows how the database is actually arranged on direct-access
storage devices.
Normalization and Entity-Relationship Diagrams
 The conceptual database design describes how the data elements in the database are to
be grouped.
 The design process identifies relationships among data elements and the most efficient
way of grouping data elements together to meet business information requirements.
 The process also identifies redundant data elements and the groupings of data elements
required for specific application programs.
 Groups of data are organized, refined, and streamlined until an overall logical view of the
relationships among all the data in the database emerges.
 To use a relational database model effectively, complex groupings of data must be
streamlined to minimize redundant data elements and awkward many-to-many
relationships.
 The process of creating small, stable, yet flexible and adaptive data structures from
complex groups of data is called normalization.
 Figures 6-9 and 6-10 illustrate this process.
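
A minimal sketch of the idea, assuming a flat order file in which order, part, and supplier facts are repeated on every line. All names and values are hypothetical and are not taken from Figures 6-9 and 6-10.

```python
# Unnormalized rows: order, part, and supplier facts repeat on every line.
order_lines = [
    {"order": 1, "order_date": "2024-01-05", "part": 137, "part_name": "Part X",
     "supplier": 1, "supplier_name": "Supplier A", "qty": 2},
    {"order": 1, "order_date": "2024-01-05", "part": 150, "part_name": "Part Z",
     "supplier": 2, "supplier_name": "Supplier B", "qty": 1},
    {"order": 2, "order_date": "2024-01-09", "part": 137, "part_name": "Part X",
     "supplier": 1, "supplier_name": "Supplier A", "qty": 5},
]

# Normalized structures: each fact is stored once, keyed by its identifier.
orders = {r["order"]: {"order_date": r["order_date"]} for r in order_lines}
parts = {
    r["part"]: {"part_name": r["part_name"], "supplier": r["supplier"]}
    for r in order_lines
}
suppliers = {r["supplier"]: {"supplier_name": r["supplier_name"]} for r in order_lines}

# The line-item table resolves the many-to-many link between orders and parts.
line_items = [
    {"order": r["order"], "part": r["part"], "qty": r["qty"]} for r in order_lines
]

print(len(orders), len(parts), len(suppliers), len(line_items))
```

After the split, a supplier's name lives in exactly one place, so changing it cannot leave stale copies behind in other rows.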
USING DATABASES TO IMPROVE BUSINESS PERFORMANCE AND DECISION MAKING
 Businesses use their databases to keep track of basic transactions, such as
paying suppliers, processing orders, keeping track of customers, and paying
employees.
 But they also need databases to provide information that will help the
company run the business more efficiently, and help managers and employees
make better decisions.
 If a company wants to know which product is the most popular or who is its
most profitable customer, the answer lies in the data.
 For example, by analyzing data from customer credit card purchases, Louise’s
Trattoria, a Los Angeles restaurant chain, learned that quality was more
important than price for most of its customers, who were college educated
and liked fine wine.
 Acting on this information, the chain introduced vegetarian dishes, more
seafood selections, and more expensive wines, raising sales by more than 10
percent.
 In a large company, with large databases or large systems
for separate functions, such as manufacturing, sales, and
accounting, special capabilities and tools are required for
analyzing vast quantities of data and for accessing data
from multiple systems.
 These capabilities include data warehousing, data mining,
and tools for accessing internal databases through the
Web.
Why DATA WAREHOUSES?
 Suppose you want concise, reliable information about current operations,
trends, and changes across the entire company. If you worked in a large
company, obtaining this might be difficult because data are often maintained
in separate systems, such as sales, manufacturing, or accounting.
 Some of the data you need might be found in the sales system, and other
pieces in the manufacturing system.
 Many of these systems are older legacy systems that use outdated data
management technologies or file systems where information is difficult for
users to access.
 You might have to spend an inordinate amount of time locating and gathering
the data you need, or you would be forced to make your decision based on
incomplete knowledge.
 If you want information about trends, you might also have trouble finding
data about past events because most firms only make their current data
immediately available. Data warehousing addresses these problems.
What Is a Data Warehouse?

 A data warehouse is a database that stores current and historical data of
potential interest to decision makers throughout the company.
 The data originate in many core operational transaction systems, such as
systems for sales, customer accounts, and manufacturing, and may include
data from Web site transactions.
 The data warehouse consolidates and standardizes information from different
operational databases so that the information can be used across the
enterprise for management analysis and decision making.
 Figure 6-12 illustrates how a data warehouse works.
 The data warehouse makes the data available for anyone to access as
needed, but the data cannot be altered.
 A data warehouse system also provides a range of ad hoc and standardized
query tools, analytical tools, and graphical reporting facilities.
 Many firms use intranet portals to make the data warehouse information
widely available throughout the firm.
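
The consolidate-and-standardize step can be pictured with a small sketch. It assumes two operational systems that code clothing size differently; the code table `SIZE_CODES`, the function `standardize`, and all rows are hypothetical. Read-only mappings are used to mimic the point that warehouse data can be read but not altered.

```python
from types import MappingProxyType

# One agreed coding scheme for the whole enterprise (assumed for illustration).
SIZE_CODES = {"extra large": "XL", "xl": "XL", "large": "L", "l": "L"}

# Rows as they arrive from two operational systems (made-up data).
store_sales = [{"sku": "S1", "size": "extra large", "units": 3}]
web_sales = [{"sku": "S1", "size": "XL", "units": 2}]


def standardize(row, source):
    """Return a read-only row that uses the shared size codes."""
    return MappingProxyType({
        "sku": row["sku"],
        "size": SIZE_CODES[row["size"].lower()],
        "units": row["units"],
        "source": source,
    })


# The warehouse is an immutable collection of standardized rows.
warehouse = tuple(
    [standardize(r, "store") for r in store_sales]
    + [standardize(r, "web") for r in web_sales]
)

total_xl_units = sum(r["units"] for r in warehouse if r["size"] == "XL")
print(total_xl_units)
```

Because both sources now use the same code, a single query can total sales across systems that previously could not be compared.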
Data Marts
 Companies often build enterprise-wide data warehouses, where a central data warehouse
serves the entire organization, or they create smaller, decentralized warehouses called
data marts.
 A data mart is a subset of a data warehouse in which a summarized or highly focused
portion of the organization’s data is placed in a separate database for a specific population
of users.
 For example, a company might develop marketing and sales data marts to deal with
customer information.
 Before implementing an enterprise-wide data warehouse, bookseller Barnes & Noble
maintained a series of data marts—one for point-of-sale data in retail stores, another for
college bookstore sales, and a third for online sales.
 A data mart typically focuses on a single subject area or line of business, so it usually can
be constructed more rapidly and at lower cost than an enterprise-wide data warehouse.
TOOLS FOR BUSINESS INTELLIGENCE: MULTIDIMENSIONAL DATA ANALYSIS AND DATA MINING

 Once data have been captured and organized in data
warehouses and data marts, they are available for further
analysis using tools for business intelligence.
 Business intelligence tools enable users to analyze data to
see new patterns, relationships, and insights that are
useful for guiding decision making.
 Principal tools for business intelligence include software
for database querying and reporting, tools for
multidimensional data analysis (online analytical
processing), and tools for data mining.
Online Analytical Processing (OLAP)

 Suppose your company sells four different products (nuts,
bolts, washers, and screws) in the East, West, and Central
regions.
 If you wanted to ask a fairly straightforward question,
such as how many washers were sold during the past
quarter, you could easily find the answer by querying your
sales database.
 But what if you wanted to know how many washers were sold in
each of your sales regions and to compare actual results with
projected sales?
 To obtain the answer, you would need online
analytical processing (OLAP).
 OLAP supports multidimensional data analysis,
enabling users to view the same data in different
ways using multiple dimensions.
 Each aspect of information—product, pricing,
cost, region, or time period—represents a
different dimension.
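
A very small sketch of multidimensional analysis: the same sales facts are totaled along different dimensions, and actual units are set beside projected units. The figures are invented, and `roll_up` is a hypothetical helper, not part of any OLAP product.

```python
from collections import defaultdict

# Each fact: (product, region, actual_units, projected_units); made-up numbers.
sales = [
    ("washers", "East", 120, 100),
    ("washers", "West", 80, 90),
    ("washers", "Central", 60, 60),
    ("bolts", "East", 200, 210),
    ("washers", "East", 30, 40),
]

# Position of each dimension inside a fact tuple.
DIMENSION_INDEX = {"product": 0, "region": 1}


def roll_up(facts, *dimensions):
    """Total actual and projected units for each combination of dimensions."""
    totals = defaultdict(lambda: [0, 0])
    for fact in facts:
        key = tuple(fact[DIMENSION_INDEX[d]] for d in dimensions)
        totals[key][0] += fact[2]
        totals[key][1] += fact[3]
    return {key: tuple(value) for key, value in totals.items()}


by_product_region = roll_up(sales, "product", "region")
by_region = roll_up(sales, "region")

print(by_product_region[("washers", "East")])  # (actual, projected)
print(by_region[("East",)])
```

Changing the arguments to `roll_up` is the toy equivalent of viewing the same data from a different angle.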
Data Mining
 Traditional database queries answer such questions as, “How many units of product
number 403 were shipped in February 2010?” OLAP, or multidimensional analysis,
supports much more complex requests for information, such as “Compare sales of
product 403 relative to plan by quarter and sales region for the past two years.” With
OLAP and query-oriented data analysis, users need to have a good idea about the
information for which they are looking.
 Data mining is more discovery-driven.
 Data mining provides insights into corporate data that cannot be obtained with OLAP by
finding hidden patterns and relationships in large databases and inferring rules from
them to predict future behavior.
 The patterns and rules are used to guide decision making and forecast the effect of
those decisions.
 The types of information obtainable from data mining include associations, sequences,
classifications, clusters, and forecasts.
(1) Associations

 Associations are occurrences linked to a single event.
when corn chips are purchased, a cola drink is purchased 65 percent of the
time, but when there is a promotion, cola is purchased 85 percent of the
time.
 This information helps managers make better decisions because they have
learned the profitability of a promotion.
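
The percentage in the corn chips example is what data mining tools usually call the confidence of an association rule: of the baskets that contain the first item, the share that also contain the second. A minimal sketch with invented baskets (the numbers do not match the 65 percent in the example):

```python
# Each set is one shopping basket (made-up data).
baskets = [
    {"corn chips", "cola"},
    {"corn chips", "cola", "salsa"},
    {"corn chips", "salsa"},
    {"cola"},
    {"corn chips", "cola"},
]


def confidence(all_baskets, antecedent, consequent):
    """Share of baskets containing antecedent that also contain consequent."""
    with_antecedent = [b for b in all_baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    hits = sum(consequent in b for b in with_antecedent)
    return hits / len(with_antecedent)


chips_to_cola = confidence(baskets, "corn chips", "cola")
print(chips_to_cola)
```

Running the same calculation on baskets recorded during a promotion, and comparing the two figures, gives the kind of before-and-after evidence described above.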
(2) Sequences

 In sequences, events are linked over time.
 We might find, for example, that if a house
is purchased, a new refrigerator will be
purchased within two weeks 65 percent of
the time, and an oven will be bought within
one month of the home purchase 45
percent of the time.
(3) Classification

 Classification recognizes patterns that describe the group to which an
item belongs by examining existing items that have been classified
and by inferring a set of rules.
 For example, businesses such as credit card or telephone companies
worry about the loss of steady customers.
 Classification helps discover the characteristics of customers who are
likely to leave and can provide a model to help managers predict who
those customers are so that the managers can devise special
campaigns to retain such customers.
(4) Clustering

 Clustering works in a manner similar to
classification when no groups have yet been
defined.
 A data mining tool can discover different
groupings within data, such as finding affinity
groups for bank cards or partitioning a database
into groups of customers based on demographics
and types of personal investments.
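
As a toy illustration of discovering groups that were not defined in advance, the sketch below splits a list of customer ages into two clusters with a simplified one-dimensional k-means procedure. The technique, the function name `two_means`, and the data are assumptions chosen for illustration; commercial data mining tools use more elaborate methods.

```python
def two_means(values, iterations=10):
    """Split numbers into two groups around two iteratively updated centers."""
    centers = [min(values), max(values)]
    groups = ([], [])
    for _ in range(iterations):
        groups = ([], [])
        for value in values:
            # Assign each value to the nearer of the two centers.
            nearer = 0 if abs(value - centers[0]) <= abs(value - centers[1]) else 1
            groups[nearer].append(value)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            sum(group) / len(group) if group else center
            for group, center in zip(groups, centers)
        ]
    return groups


customer_ages = [22, 25, 27, 58, 61, 64]  # made-up data
younger, older = two_means(customer_ages)
print(younger, older)
```

No one told the procedure what the groups should be; the split emerges from the data.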
(5) Forecasting

 Although these applications involve predictions,
forecasting uses predictions in a different way.
 It uses a series of existing values to forecast what
other values will be.
 For example, forecasting might find patterns in
data to help managers estimate the future value
of continuous variables, such as sales figures.
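
One simple way to forecast from a series of existing values is to fit a straight-line trend by least squares and extend it. The sketch below assumes evenly spaced periods and at least two distinct observations; the function name `linear_forecast` and the sales figures are hypothetical, and real forecasting tools use richer models.

```python
def linear_forecast(values, steps_ahead=1):
    """Fit y = intercept + slope * x by least squares and extrapolate."""
    n = len(values)
    xs = range(n)  # periods are numbered 0, 1, 2, ...
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)


quarterly_sales = [100.0, 110.0, 120.0, 130.0]  # made-up figures
next_quarter = linear_forecast(quarterly_sales)
print(next_quarter)
```

Here the series grows by the same amount each quarter, so the fitted line simply continues that pattern into the next period.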
Thank
You
