Lecture Series 4 - Mid 2 - Data Resources - (Book - CH 5)
Example: “A” is a character that represents the uppercase letter “A”. It can be encoded in
binary as 01000001, which corresponds to its ASCII (American Standard Code for Information
Interchange) code, 65.
Field – a grouping of related characters, such as a last name or a salary, that represents an
attribute of some entity.
Example: A field is the smallest meaningful unit of data and holds a single attribute or property.
Think of a student database, where the “Name” field contains the names of all students.
Fundamental Data Concepts
Record – a grouping of attributes/fields that describe an entity or object.
Example: In the student database example, “Name” was a field. A single record about a student
might contain fields such as Age, Name, Section, Grade, etc.
Example: Think about the student database example we are discussing, where a file might be a collection
of records for all the students enrolled in a particular course.
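The hierarchy described above (character, field, record, file, database) can be sketched in a few lines of Python. This is only an illustration: the student names, values, and the `students_in_course` and `attendance` table names are made up for the example and are not from the lecture.

```python
# Character: the smallest unit, stored as a numeric code.
char = "A"
ascii_code = ord(char)                    # numeric ASCII code of "A"
binary_code = format(ascii_code, "08b")   # the same code as 8 binary digits

# Field: a grouping of characters that describes one attribute.
name_field = "Alice"

# Record: a grouping of fields that describes one entity (one student).
record = {"Name": name_field, "Age": 20, "Section": "B", "Grade": "A"}

# File: a collection of related records (students enrolled in one course).
course_file = [
    record,
    {"Name": "Bob", "Age": 21, "Section": "B", "Grade": "B+"},
]

# Database: one or more logically related files or tables.
database = {"students_in_course": course_file, "attendance": []}

print(char, ascii_code, binary_code)
print(len(course_file), "records in the course file")
```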
Database Development
Database – a collection of logically related data elements, organized and stored in a way that
allows users to efficiently retrieve, manage, and manipulate data. It might be made up of
one or more files or tables.
Example: Think about a student information system where the entire student database is made
up of multiple files, each containing a different type of data, such as student records, course outlines,
attendance records, section numbers, semester numbers, payment information, etc.
Database Administrator (DBA) – controls development and administration of the database.
Example: Think about your university database system, where a DBA ensures that student data is
properly stored, secured, and easily accessible to academics, administration, students, and other relevant
stakeholders.
Data Definition Language (DDL) is a set of commands in SQL that are used to
create, modify, and delete the structure of a database.
DDL commands do not manipulate the data in the database, but they do define how
the data is organized and stored.
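As a minimal sketch of DDL in action, the snippet below uses Python's built-in sqlite3 module as a stand-in for a full DBMS. The `student` table and its columns are hypothetical. It creates, modifies, and deletes a table structure without ever touching a row of data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# CREATE: define a new table structure.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# ALTER: modify the structure by adding a column.
conn.execute("ALTER TABLE student ADD COLUMN section TEXT")

# Inspect the structure: the second item of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]

# DDL defined the structure only; no data rows were created.
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

# DROP: delete the structure.
conn.execute("DROP TABLE student")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(columns, row_count, tables_left)
```

The `PRAGMA table_info` call is specific to SQLite; other systems such as MySQL expose the same information through their own commands.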
A Data Dictionary contains information about the relationships between the data elements.
Data dictionary columns: Data Element | Description | Type | Length | Constraints/Validation
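To make the data dictionary idea concrete, here is a small sketch in which each data element is described by its description, type, length, and a constraint, and a helper checks a record against those rules. Every entry and the `validate` helper are hypothetical, chosen only to mirror the column headings above.

```python
# Hypothetical data dictionary: every entry below is illustrative only.
DATA_DICTIONARY = {
    "student_id": {"description": "Unique student number",
                   "type": str, "length": 8, "required": True},
    "name":       {"description": "Full name of the student",
                   "type": str, "length": 50, "required": True},
    "section":    {"description": "Course section code",
                   "type": str, "length": 2, "required": False},
}

def validate(record):
    """Return a list of rule violations for one record (empty list = valid)."""
    errors = []
    for element, rule in DATA_DICTIONARY.items():
        value = record.get(element)
        if value is None:
            if rule["required"]:
                errors.append(f"{element}: missing required value")
            continue
        if not isinstance(value, rule["type"]):
            errors.append(f"{element}: expected {rule['type'].__name__}")
        elif len(value) > rule["length"]:
            errors.append(f"{element}: longer than {rule['length']} characters")
    return errors

print(validate({"student_id": "20240001", "name": "Alice"}))
print(validate({"name": "Bob", "section": "ABC"}))
```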
A relationship in MySQL helps to combine data from two different tables. Each relationship consists of
fields in two tables with corresponding data. When a user uses related tables in a query,
the relationship lets MySQL determine which records from each table to combine in the result set.
MySQL Join Example
Left Outer Join - MySQL
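The join described above can be sketched as follows. The original slide's example is not available here, so the `student` and `payment` tables are assumptions, and Python's sqlite3 module stands in for MySQL; the `LEFT OUTER JOIN` syntax shown is standard SQL that MySQL also accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE payment (student_id INTEGER, amount INTEGER);
INSERT INTO student VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO payment VALUES (1, 500);
""")

# The relationship is student.id = payment.student_id.
# A left outer join keeps every student, even one with no payment row.
rows = conn.execute("""
    SELECT s.name, p.amount
    FROM student AS s
    LEFT OUTER JOIN payment AS p ON p.student_id = s.id
    ORDER BY s.id
""").fetchall()

for name, amount in rows:
    print(name, amount)
```

Bob has no matching payment, so his row appears with an empty (NULL) amount instead of being dropped, which is what distinguishes a left outer join from an inner join.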
DBMS
Under the database management approach, data records are consolidated into databases that can be accessed by
many different application programs. A database management system (DBMS) is a set of computer programs
that control the creation, maintenance, and use of the databases of an organization and its end users. Four
major DBMS facilities include:
Database Maintenance: Updating the databases and other maintenance are conducted by
transaction processing programs.
Application Development: A DBMS makes application development much easier and quicker
by allowing developers to include data manipulation language (DML) statements in their
programs that let the DBMS perform necessary data-handling activities.
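A rough sketch of the application development facility: the program below embeds DML statements (INSERT, UPDATE, SELECT, DELETE) and leaves the actual data handling to the DBMS. The function names and the `student` table are hypothetical, and sqlite3 is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, section TEXT)")

def enroll(conn, name, section):
    """INSERT a new student and return the generated id."""
    cur = conn.execute(
        "INSERT INTO student (name, section) VALUES (?, ?)", (name, section)
    )
    return cur.lastrowid

def move_section(conn, student_id, new_section):
    """UPDATE the section of an existing student."""
    conn.execute(
        "UPDATE student SET section = ? WHERE id = ?", (new_section, student_id)
    )

def students_in(conn, section):
    """SELECT the names of all students in one section."""
    query = "SELECT name FROM student WHERE section = ? ORDER BY name"
    return [row[0] for row in conn.execute(query, (section,))]

def withdraw(conn, student_id):
    """DELETE a student record."""
    conn.execute("DELETE FROM student WHERE id = ?", (student_id,))

alice = enroll(conn, "Alice", "A")
bob = enroll(conn, "Bob", "B")
move_section(conn, alice, "B")
print(students_in(conn, "B"))
```

The application code never reads or writes storage directly; it only states what it wants in DML and the DBMS does the rest.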
Data Across Enterprise
Start of a Business Intelligence –
A data warehouse stores data that has been extracted from various operational, external, and other
databases within the organization.
To create a data warehouse, data from various databases are captured, cleaned (e.g., sorted, filtered,
converted), and transformed into data that can be better used for analysis. The data is then stored in the
enterprise data warehouse, from where it can be moved into data marts or to an analytical data store that
holds data to support certain types of analysis.
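The capture, clean, transform, and store steps can be sketched with plain Python lists. The two source extracts (`sales_db`, `web_db`), their fields, and the per-customer data mart are all invented for illustration; a real warehouse would use dedicated tooling.

```python
# Hypothetical extracts from two operational sources (all values are text).
sales_db = [
    {"customer": " alice ", "amount": "120.50", "date": "2024-01-05"},
    {"customer": "BOB", "amount": "80", "date": "2024-01-03"},
    {"customer": "", "amount": "15", "date": "2024-01-04"},  # no customer name
]
web_db = [{"customer": "Alice", "amount": "30.00", "date": "2024-01-02"}]

def clean(row):
    """Convert raw text into consistent types; return None to filter a row out."""
    name = row["customer"].strip().title()
    if not name:
        return None
    return {"customer": name, "amount": float(row["amount"]), "date": row["date"]}

# Capture: pull rows from every source.
captured = sales_db + web_db

# Clean: convert, filter out unusable rows, then sort by date.
cleaned = [row for row in map(clean, captured) if row is not None]
cleaned.sort(key=lambda row: row["date"])

# Store: the cleaned rows become the warehouse.
warehouse = cleaned

# Data mart: a subset or summary for one purpose, here totals per customer.
data_mart = {}
for row in warehouse:
    data_mart[row["customer"]] = data_mart.get(row["customer"], 0.0) + row["amount"]

print(len(warehouse), "rows in warehouse")
print(data_mart)
```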
Metadata, which defines the data in the data warehouse, is stored in a metadata directory that is used to
support data administration. A variety of analytical software tools can then be used to query, report on, and
analyze the data.
One such means for analyzing data in a data warehouse is called data mining.
Six Major Databases
Operational Databases: These databases store detailed data needed to support the operations of
the entire organization. They are also called subject area databases (SADB), transaction
databases, and production databases. These also include databases of Internet and electronic
commerce activity, such as click stream data or data describing online behavior of visitors at a
company’s website.
Data Warehouse Databases: These store data from current and previous years that has been
extracted from the various operational and management databases of the organization. As a
standardized and integrated central source of data, warehouses can be used by managers for
pattern processing, where key factors and trends about operations can be identified from the
historical record.
Data Marts: These are subsets of the data included in a data warehouse that focus on specific
aspects of a company, e.g., a department or a business process.
Distributed Databases: These are the databases of local workgroups and departments at regional
offices, branch offices, and other work sites needed to complete the task at hand. They include relevant
information from other organizational databases combined with data and information generated only at
the particular site. These databases can reside on network servers, on the World Wide Web, or on
Intranets and Extranets.
End User Databases: These consist of a variety of data files developed by end users at their
workstations. For example, an end user in sales might combine information on a customer’s order
history with her own notes and impressions from face-to-face meetings to improve follow-up.
External Databases: Many organizations make use of privately generated and owned online databases
or data banks that specialize in a particular area of interest. Access is usually through a subscription for
continuing links or a one-time fee for a specific piece of information (like the results of a single search).
Other sources like those found on the Web are free.
Database Structures
Figure: Hierarchical, Network, and Relational structures.
The relationships among the records stored in databases are based upon one of several logical database
structures or models. These fundamental database structures are described below.
Hierarchical Structure: Under this tree-like structure, each data element is related only to one element
above it, a so-called one-to-many relationship. All records are dependent and arranged in multilevel
structures.
Network Structure: This structure features a many-to-many arrangement whereby the DBMS can access
a data element by following one of several paths.
Relational Structure: This model has become the most popular structure and is used by most
microcomputer DBMS packages. All data elements within the database are viewed as being stored in the form
of simple tables. The DBMS can link data elements from various tables to provide information to end users.
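The three structures can be contrasted with ordinary Python data structures. This is a loose analogy rather than how any particular DBMS stores data, and the department, project, and employee names are hypothetical.

```python
# Hierarchical: a tree in which each element has exactly one parent.
hierarchical = {
    "Department": {
        "Project A": {"Employee 1": {}, "Employee 2": {}},
        "Project B": {"Employee 3": {}},
    }
}

# Network: many-to-many, so a record can be reached along several paths.
network = {
    "Project A": ["Employee 1", "Employee 2"],
    "Project B": ["Employee 2", "Employee 3"],
}

def projects_of(employee):
    """Follow every path that leads to the given employee."""
    return sorted(p for p, members in network.items() if employee in members)

# Relational: simple tables, linked through shared key values.
employees = [
    {"emp_id": 1, "name": "Employee 1"},
    {"emp_id": 2, "name": "Employee 2"},
]
assignments = [
    {"emp_id": 1, "project": "Project A"},
    {"emp_id": 2, "project": "Project A"},
    {"emp_id": 2, "project": "Project B"},
]

def join(employees, assignments):
    """Link the two tables on emp_id, as a DBMS would for a query."""
    names = {e["emp_id"]: e["name"] for e in employees}
    return [(names[a["emp_id"]], a["project"]) for a in assignments]

print(projects_of("Employee 2"))
print(join(employees, assignments))
```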
Database Development
Figure: The database development process. 1. Data Planning produces the Enterprise Model;
2. Requirements Specification produces the Description of User Needs; 3. Conceptual Design produces
Conceptual Data Models; 4. Logical Design produces Logical Models; 5. Physical Design produces
Physical Models.
Database planning, beyond that of a personal or small business end user database created by a
database management package, typically requires use of a top-down data planning process based
upon the systems development model covered earlier:
Data Planning: At this stage, planners develop a model of business processes. This results in an
enterprise model of business processes with documentation.
Requirements Specification: This stage defines the information needs of end users in a business
process. Description of needs may be provided in natural language or using the tools of a
particular design methodology.
Conceptual Design: This stage expresses all information requirements in the form of a high-
level model.
Logical Design: This stage translates the conceptual model into the data model of a DBMS.
Physical Design: This stage determines the data storage structures and access methods.
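The last three stages can be sketched end to end. The sketch assumes a tiny hypothetical conceptual model with two entities, translates it into relational table definitions (logical design), and then adds an index as one example of an access method (physical design). The `TYPE_MAP` and `to_ddl` names are inventions for this example, and sqlite3 stands in for the target DBMS.

```python
import sqlite3

# Conceptual design: entities and attributes, independent of any DBMS.
conceptual_model = {
    "student": {"id": "identifier", "name": "text"},
    "course": {"id": "identifier", "title": "text"},
}

# Logical design: translate the conceptual model into the DBMS's data model.
TYPE_MAP = {"identifier": "INTEGER PRIMARY KEY", "text": "TEXT"}

def to_ddl(model):
    """Produce one CREATE TABLE statement per entity."""
    statements = []
    for entity, attributes in model.items():
        cols = ", ".join(f"{attr} {TYPE_MAP[kind]}" for attr, kind in attributes.items())
        statements.append(f"CREATE TABLE {entity} ({cols})")
    return statements

ddl = to_ddl(conceptual_model)

# Physical design: decide storage structures and access methods,
# here an index to speed up lookups by student name.
conn = sqlite3.connect(":memory:")
for statement in ddl:
    conn.execute(statement)
conn.execute("CREATE INDEX idx_student_name ON student (name)")

# The second item of each index_list row is the index name.
indexes = [row[1] for row in conn.execute("PRAGMA index_list(student)")]

print(ddl)
print(indexes)
```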