
Chapter 6-Foundation of BI

The document discusses the capabilities and benefits of database management systems (DBMS) over traditional file environments. It describes how a DBMS centralizes data, controls redundancy, and improves security, flexibility, and data sharing. It also covers relational databases, database design, non-relational databases, data warehouses, blockchain, and business intelligence tools like Hadoop.

Uploaded by

forkrocksyo
Copyright
© All Rights Reserved

Chapter 6: Foundations of BI: Databases and Information Management

6.1 What are the problems of managing data resources in a traditional file environment?
→ An effective information system provides users with accurate, timely, and relevant information.
File organization terms and concepts
- A computer system organizes data in a hierarchy that starts with bits and bytes and progresses to fields, records, files, and databases.
Bit: the smallest unit of data a computer can handle.
Byte: a group of bits that represents a single character (a letter, number, or symbol).
Field: a group of characters forming a word, a group of words, or a complete number (e.g., a person’s name or age).
Record: a group of related fields.
File: a group of records of the same type.
Database: a group of related files; an organized collection of data stored centrally to serve various information system applications.
Entity: a person, place, thing, or event on which we store and maintain information (e.g., an employee).
Attribute: each characteristic or quality describing an entity (e.g., an employee’s name and hire date).
Key field: an identifier field used to retrieve, update, or sort a record.

Why collect and store data?


- Collecting data is valuable because you can use it to make informed decisions. The more relevant,
high-quality data you have, the more likely you are to make good choices

Problems with the traditional file environment


● Files are maintained separately by different departments.
● Each functional area in a corporation creates its own data files; after a few years the organization is left with many separate, overlapping files that are hard to maintain.

→ Data redundancy and inconsistency


- Data redundancy: the presence of duplicate data in multiple data files, so that the same data are stored in more than one place. It wastes storage resources and leads to data inconsistency (the same data held in different formats, or data updated in one file but not the other).
→ Program-data dependence
- The coupling of data stored in files and the specific programs required to update and maintain those files.
- Changes in programs require changes to the data accessed by those programs.
→ Lack of flexibility
- Can deliver only routine scheduled reports.
- A traditional file system cannot deliver ad hoc reports or respond to unanticipated information requirements in a timely fashion.
→ Poor Security
- Management might have no way of knowing who is accessing or even making changes to the
organization’s data.
→ Lack of data sharing and availability
- Because pieces of information in different files and different parts of the organization cannot be related to one another, it is virtually impossible for information to be shared or accessed in a timely manner. Information cannot flow easily across different parts of the organization.

6.2 What are the major capabilities of database management systems (DBMS)?

Database Management Systems


Database: a collection of data organized to serve many applications efficiently by centralizing the data and controlling redundant data.

Database management system (DBMS): software that enables an organization to centralize data, manage them efficiently, and provide access to the stored data by application programs.

- The DBMS acts as an interface between application programs and the physical data files.
- Collects the organization’s data (about customers, employees, etc.) and stores them in one place.
- Separates the logical view of the data (how data are perceived by end users or business specialists) from the physical view (how data are actually organized and structured on physical storage media). This relieves the programmer of having to understand where and how the data are actually stored.
- Example: the DBMS sits between an application system (e.g., the accounting department’s) and the human resources database (where all the information is kept). Applications send requests to the DBMS, and it retrieves only the information they ask for.
- Solves the problems of the traditional file environment:
- Controls redundancy (the same data stored in different places) by keeping one copy of the data
- Eliminates the inconsistencies that redundant copies cause
- Is also more secure

Relational DBMS (pg. 218)

- The most popular type of DBMS today, for PCs as well as for larger computers, is the relational DBMS. It represents data as two-dimensional tables (called relations).
- Each table (which may be referred to as a file) contains data on an entity and its attributes
- Microsoft Access is a relational DBMS for desktop systems
- Table: grid of columns and rows of data
- Rows (tuples): records for different entities
- Fields (columns): each represents an attribute of the entity
- Key field: field used to uniquely identify each record
- Primary key: the key field that serves as the unique identifier for each row of a table
- Foreign key: a primary key from one table that appears in a second table as a look-up field to identify records from the original table
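The primary key and foreign key ideas above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the supplier and part tables, their columns, and the sample rows are all made up for the example.

```python
import sqlite3

# Hypothetical supplier and part tables, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("""
    CREATE TABLE supplier (
        supplier_number INTEGER PRIMARY KEY,      -- primary key
        supplier_name   TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE part (
        part_number     INTEGER PRIMARY KEY,      -- primary key
        part_name       TEXT NOT NULL,
        supplier_number INTEGER NOT NULL
            REFERENCES supplier(supplier_number)  -- foreign key
    )""")
conn.execute("INSERT INTO supplier VALUES (1, 'Acme Supply')")
conn.execute("INSERT INTO part VALUES (10, 'Hinge', 1)")

# A part that points at a supplier that does not exist is rejected.
try:
    conn.execute("INSERT INTO part VALUES (11, 'Latch', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The foreign key lets us look up the supplier record for a part.
supplier_name = conn.execute("""
    SELECT s.supplier_name
    FROM part AS p JOIN supplier AS s
      ON p.supplier_number = s.supplier_number
    WHERE p.part_number = 10""").fetchone()[0]
print(rejected, supplier_name)
```

The point of the sketch is that supplier_number is a primary key in one table and a look-up field in the other.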

Capabilities of database management systems


● Data definition capability: specifies the structure of the database.
● Data dictionary: an automated or manual file that stores definitions of data elements and their characteristics
● Querying and reporting (pg. 222)
- Data manipulation language: used to add, change, delete, and retrieve the data in the database.
- Structured Query Language (SQL) is the most prominent data manipulation language.
● Many DBMS have report generation capabilities for creating polished reports (e.g., Microsoft Access)
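A small sketch of these capabilities, assuming SQLite (through Python's sqlite3 module) as the DBMS. The employee table and its rows are hypothetical, and SQLite's own catalog stands in for a data dictionary here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: specify the structure of a (hypothetical) employee table.
conn.execute("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        hire_date   TEXT
    )""")

# Data dictionary: SQLite keeps its own catalog of column definitions.
# Each row is (cid, name, type, notnull, default_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

# Data manipulation: add, change, delete, and retrieve data.
conn.execute("INSERT INTO employee VALUES (1, 'Ana Ruiz', '2021-03-01')")
conn.execute("INSERT INTO employee VALUES (2, 'Ben Cole', '2022-07-15')")
conn.execute("UPDATE employee SET hire_date = '2021-04-01' WHERE employee_id = 1")
conn.execute("DELETE FROM employee WHERE employee_id = 2")
remaining = conn.execute("SELECT employee_id, name, hire_date FROM employee").fetchall()
print(columns, remaining)
```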
Operations of a relational DBMS
● 3 basic operations are used to develop useful sets of data
→ select: creates a subset consisting of all records that meet stated criteria
→ join: combines relational tables to provide the user with more information than is available in the individual tables
→ project: creates a subset consisting of columns in a table, producing new tables with only the information specified.
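The three operations can be tried out with a few lines of SQL. This is a sketch using Python's sqlite3 module with invented supplier and part tables; the project step is done in plain Python on the joined rows just to keep the three steps visibly separate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT);
    CREATE TABLE part (part_number INTEGER PRIMARY KEY, part_name TEXT,
                       supplier_number INTEGER REFERENCES supplier(supplier_number));
    INSERT INTO supplier VALUES (1, 'Acme Supply'), (2, 'Bolt Works');
    INSERT INTO part VALUES (10, 'Hinge', 1), (11, 'Latch', 2), (12, 'Seal', 1);
""")

# Select: keep only the rows (records) that meet a criterion.
selected = conn.execute(
    "SELECT * FROM part WHERE part_number IN (10, 11) ORDER BY part_number").fetchall()

# Join: combine the two tables through the shared supplier_number column.
joined = conn.execute("""
    SELECT p.part_number, p.part_name, s.supplier_number, s.supplier_name
    FROM part AS p JOIN supplier AS s ON p.supplier_number = s.supplier_number
    WHERE p.part_number IN (10, 11)
    ORDER BY p.part_number""").fetchall()

# Project: keep only the columns that are wanted.
projected = [(part_number, part_name, supplier_name)
             for part_number, part_name, _, supplier_name in joined]
print(projected)
```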

Designing databases:
To create a database, you need to understand the relationships among the data, the type of data that will be maintained in the database, and how the data will be used.
● Conceptual design: an abstract model of the database from a business perspective. Describes how the data elements in the database are to be grouped
● Entity-relationship diagram: a methodology for documenting the data model by illustrating relationships. Identifies relationships among data elements and the most efficient way of grouping data elements together
- If the business doesn’t get its data model right, the system won’t be able to serve the business well
● Normalization: the process of creating small, stable data structures from complex groups of data
● Physical design: a detailed description of how the data will actually be arranged and stored on physical devices
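To make normalization concrete, here is a rough sketch in plain Python. It assumes a made-up order record that repeats part and supplier details on every line, and splits it into four small tables. The field names and the normalize helper are hypothetical.

```python
# An un-normalized order record: repeating groups of part and supplier data.
# All names and values are hypothetical.
raw_orders = [
    {"order_number": 1, "order_date": "2024-01-05",
     "lines": [
         {"part_number": 10, "part_name": "Hinge", "quantity": 4,
          "supplier_number": 1, "supplier_name": "Acme Supply"},
         {"part_number": 11, "part_name": "Latch", "quantity": 2,
          "supplier_number": 1, "supplier_name": "Acme Supply"},
     ]},
    {"order_number": 2, "order_date": "2024-01-09",
     "lines": [
         {"part_number": 10, "part_name": "Hinge", "quantity": 1,
          "supplier_number": 1, "supplier_name": "Acme Supply"},
     ]},
]

def normalize(orders):
    """Split nested order records into four small tables keyed by their identifiers."""
    order_table, line_items, parts, suppliers = {}, [], {}, {}
    for order in orders:
        order_table[order["order_number"]] = {"order_date": order["order_date"]}
        for line in order["lines"]:
            line_items.append((order["order_number"], line["part_number"], line["quantity"]))
            parts[line["part_number"]] = {"part_name": line["part_name"],
                                          "supplier_number": line["supplier_number"]}
            suppliers[line["supplier_number"]] = {"supplier_name": line["supplier_name"]}
    return order_table, line_items, parts, suppliers

order_table, line_items, parts, suppliers = normalize(raw_orders)
print(len(order_table), len(line_items), len(parts), len(suppliers))
```

After the split, the supplier name is stored once instead of three times, so it only has to be updated in one place.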
Non-relational databases and databases in the cloud
● Non-relational databases (“NoSQL”)
- Use a more flexible data model. Designed for managing large data sets across many distributed machines and for easily scaling up or down
- Handle large volumes of unstructured as well as structured data
● Databases in the cloud
- Appeal to start-ups and smaller businesses. Example: Amazon Relational Database Service
- Databases can also run in private clouds
→ A distributed database is one that is stored in multiple physical locations.
Blockchain (pg. 226)
- Rather than storing data in one place, the data are stored in many places, and the participants can verify the data.
- Data storage is shared with other parties.
- A distributed database technology that enables firms and organizations to create and verify transactions on a network nearly instantaneously without a central authority.
- The blockchain maintains a continuously growing list of records called blocks.
- Benefits to firms using blockchain databases:
- can reduce the cost of verifying users and validating transactions, and the risks of storing and processing transaction information across thousands of firms.
- Foundation of Bitcoin and other cryptocurrencies
- Used for financial transactions, supply chains, and medical records
- Gives the responsibility for data storage and security to multiple parties at the same time
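The "growing list of blocks" idea can be sketched as a simple hash chain. This is a heavily simplified, single-machine illustration, not how any real blockchain network is implemented: there is no network, no consensus, and the block layout and function names are assumptions made for the example.

```python
import hashlib
import json

def block_hash(index, data, previous_hash):
    """Hash a block's contents together with the hash of the block before it."""
    payload = json.dumps({"index": index, "data": data, "previous_hash": previous_hash},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def add_block(chain, data):
    """Append a new block that points back to the current last block."""
    previous_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({"index": index, "data": data, "previous_hash": previous_hash,
                  "hash": block_hash(index, data, previous_hash)})

def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    for i, block in enumerate(chain):
        expected_previous = chain[i - 1]["hash"] if i > 0 else "0" * 64
        if block["previous_hash"] != expected_previous:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["previous_hash"]):
            return False
    return True

chain = []
add_block(chain, "A pays B 10")
add_block(chain, "B pays C 4")
valid_before = is_valid(chain)

chain[0]["data"] = "A pays B 1000"  # tamper with an earlier record
valid_after = is_valid(chain)
print(valid_before, valid_after)
```

Because each block stores the hash of the one before it, anyone holding a copy can detect a changed record by recomputing the hashes.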
Business intelligence infrastructure
→ An array of tools for obtaining information from separate systems and from big data
● Data warehouse
- A database that stores current and historical data of potential interest to decision makers, drawn from many core operational transaction systems
- Data are available for anyone to access as needed, but the data cannot be altered. Provides analysis and reporting tools
- Data marts: subsets of data warehouses
- A summarized or highly focused portion of the firm’s data for use by a specific population of users, e.g., data about a specific product
- Typically focuses on a single subject or line of business

● Hadoop (pg. 229)
- For handling unstructured and semi-structured data in vast quantities, as well as structured data, organizations are using Hadoop. It breaks a big data problem down into subproblems, distributes them among up to thousands of inexpensive computer processing nodes, and then combines the result into a smaller data set that is easier to analyze.
- Examples: searching for directions on Google, connecting with a friend on Facebook
- Key services: Hadoop Distributed File System (HDFS) for data storage, MapReduce for parallel data processing, and HBase, Hadoop’s non-relational database
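The break-down, distribute, and combine pattern can be imitated on one machine in plain Python. This sketch does not use Hadoop or its APIs. The function names and the sample text are made up, and each chunk simply stands in for the work that would be sent to one node.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(mapped_chunks):
    """Shuffle: group the intermediate pairs by key (word)."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for word, count in pairs:
            groups[word].append(count)
    return groups

def reduce_phase(groups):
    """Reduce: combine each group into a single, smaller result."""
    return {word: sum(counts) for word, counts in groups.items()}

# Each chunk stands in for the piece of data handed to one processing node.
chunks = ["big data big", "data lake", "big lake"]
word_counts = reduce_phase(shuffle(map_phase(chunk) for chunk in chunks))
print(word_counts)
```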
● In-memory computing
- Another way of facilitating big data analysis is to use in-memory computing, which relies primarily on
a computer’s main memory (RAM) for data storage to avoid delays in retrieving data
- Complex business calculations that used to take hours or days can be completed within seconds.
● Analytics platforms
- high-speed analytic platforms using both relational and non-relational technology that are optimized
for analyzing large data sets.
Data lake: A data lake is a repository for raw unstructured data or structured data that for the most part has
not yet been analyzed, and the data can be accessed in many ways.

→ Tools for analyzing and providing access to vast amounts of data to help users make better business decisions
- OLAP, data mining, text mining, web mining

Online analytical processing (OLAP)

- OLAP supports multidimensional data analysis, enabling users to view the same data in different ways using multiple dimensions.
- Each aspect of information (product, pricing, region, cost, time) is a different dimension
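A rough sketch of viewing the same data along different dimensions, in plain Python. The sales figures, dimension names, and the roll_up helper are invented for the example. Real OLAP tools work on far larger, pre-organized data.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, month, units_sold).
sales = [
    ("nuts", "East", "Jan", 120),
    ("nuts", "West", "Jan", 80),
    ("bolts", "East", "Jan", 50),
    ("nuts", "East", "Feb", 100),
    ("bolts", "West", "Feb", 70),
]
DIMENSIONS = {"product": 0, "region": 1, "month": 2}

def roll_up(facts, *dimensions):
    """Total units sold for each combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[DIMENSIONS[name]] for name in dimensions)
        totals[key] += fact[3]
    return dict(totals)

by_product = roll_up(sales, "product")
by_region_month = roll_up(sales, "region", "month")
print(by_product)
print(by_region_month)
```

The same five facts answer "how much of each product?" and "how much per region per month?" without changing the underlying data.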
Data mining
- Provides insight into corporate data that cannot be obtained with OLAP by finding hidden patterns and relationships in data sets (e.g., customer buying patterns)
- Infers rules from them to predict future behaviour
- Types of information obtainable from data mining: associations, sequences, classifications, clusters, and forecasts
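As an illustration of the "associations" type, here is a toy sketch that counts which items show up together in shopping baskets. The baskets are made up, and the confidence helper is just one simple way to express "when X is bought, how often is Y bought too".

```python
from itertools import combinations
from collections import Counter

# Hypothetical shopping baskets.
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"cola", "bread"},
]

item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2))

def confidence(if_item, then_item):
    """Share of baskets containing if_item that also contain then_item."""
    pair = tuple(sorted((if_item, then_item)))
    return pair_counts[pair] / item_counts[if_item]

chips_to_cola = confidence("chips", "cola")
salsa_to_chips = confidence("salsa", "chips")
print(chips_to_cola, salsa_to_chips)
```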
Text mining:
- Extracts key elements from large unstructured data sets and discovers patterns and relationships
Web mining (pg. 235)
- Discovery and analysis of useful patterns and information from the web
- Helps businesses understand customer behavior and evaluate the effectiveness of a particular website
- Example: marketers use the Google Trends service, which tracks the popularity of various words and phrases used in Google search queries, to learn what people are interested in and what they are interested in buying.
Sentiment analysis
- Software that mines text comments in email, blogs, social media conversations, or surveys to detect favourable and unfavourable opinions about specific subjects
- Example: Kraft Foods uses a Community Intelligence Portal and sentiment analysis to tune into consumer conversations about its products across numerous social networks, blogs, and other websites
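A toy sketch of the idea, assuming a tiny hand-made word list. Commercial sentiment analysis software is far more sophisticated, and the word lists, comments, and sentiment function here are invented purely for illustration.

```python
import re

# Tiny hand-made word lists, invented for this sketch.
FAVOURABLE = {"love", "great", "tasty", "good"}
UNFAVOURABLE = {"hate", "bad", "bland", "awful"}

def sentiment(comment):
    """Label a comment by counting favourable and unfavourable words."""
    words = re.findall(r"[a-z']+", comment.lower())
    score = sum(w in FAVOURABLE for w in words) - sum(w in UNFAVOURABLE for w in words)
    if score > 0:
        return "favourable"
    if score < 0:
        return "unfavourable"
    return "neutral"

comments = [
    "Love the new flavour, really tasty!",
    "Bland and awful. I hate it.",
    "Bought it yesterday.",
]
labels = [sentiment(c) for c in comments]
print(labels)
```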

Databases and the web

- Many companies use the web to make some internal databases available to customers or partners
- Advantages of using the web for database access: ease of use of browser software; the web interface requires few or no changes to the database; it is inexpensive to add a web interface to a system
- In a client/server environment, the DBMS resides on a dedicated computer called a database server.
Establishing an information policy
- The firm’s rules, procedures, and roles for sharing, managing, and standardizing data
→ An information policy specifies the organization’s rules for sharing, disseminating, acquiring, standardizing, classifying, and inventorying information.
→ Data administration: responsible for establishing the specific policies and procedures through which data can be managed as an organizational resource
→ Database administration: creating and maintaining the database
→ Data governance: deals with the policies and processes for managing the availability, usability, and security of data, especially regarding government regulations
→ Data cleansing, also known as data scrubbing, consists of activities for detecting and correcting data in a database that are incorrect, incomplete, improperly formatted, or redundant
Data quality
→ Data quality audit: a structured survey of the accuracy and level of completeness of the data in an information system
- Data that are inaccurate or inconsistent with other sources of information lead to incorrect decisions and financial losses
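The detection half of data cleansing can be sketched in a few lines of Python. This is an assumption-laden example: the customer records, the two format rules, and the audit helper are all hypothetical, and it only flags problems rather than correcting them.

```python
import re

# Hypothetical customer records with typical quality problems.
records = [
    {"id": 1, "name": "Ana Ruiz", "email": "ana@example.com", "joined": "2023-05-01"},
    {"id": 2, "name": "", "email": "ben@example.com", "joined": "2023-06-12"},
    {"id": 3, "name": "Cy Dunn", "email": "cy.example.com", "joined": "2023-07-03"},
    {"id": 4, "name": "Dee Park", "email": "dee@example.com", "joined": "03/08/2023"},
    {"id": 5, "name": "Ana Ruiz", "email": "ana@example.com", "joined": "2023-05-01"},
]

EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def audit(rows):
    """Return {record id: [problems found]} for rows that fail a check."""
    problems, seen = {}, set()
    for row in rows:
        found = []
        if not row["name"].strip():
            found.append("incomplete: missing name")
        if not EMAIL.match(row["email"]):
            found.append("incorrect: malformed email")
        if not ISO_DATE.match(row["joined"]):
            found.append("improperly formatted: date is not YYYY-MM-DD")
        fingerprint = (row["name"], row["email"], row["joined"])
        if fingerprint in seen:
            found.append("redundant: duplicate of an earlier record")
        seen.add(fingerprint)
        if found:
            problems[row["id"]] = found
    return problems

problems = audit(records)
clean_share = 1 - len(problems) / len(records)  # share of records with no flagged problem
print(problems, clean_share)
```

A figure like clean_share is the kind of number a data quality audit might report, although a real audit would also check the values against other sources.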
