Ramy Mahmoud 52117
Group c
Data Mining as the Evolution of Information Technology
Data mining can be viewed as a
result of the natural evolution of information technology. The database and data management
industry evolved in the development of several critical functionalities (Figure 1.1): data
collection and database creation, data management (including data storage and retrieval and
database transaction processing), and advanced data analysis (involving data warehousing and
data mining). The early development of data collection and database creation mechanisms
served as a prerequisite for the later development of effective mechanisms for data storage and
retrieval, as well as query and transaction processing. Nowadays numerous database systems
offer query and transaction processing as common practice. Advanced data analysis has
naturally become the next step.
Database Data
A database system, also called a database management system (DBMS), consists of a collection
of interrelated data, known as a database, and a set of software programs to manage and access
the data. The software programs provide mechanisms for defining database structures and data
storage; for specifying and managing concurrent, shared, or distributed data access; and for
ensuring consistency and security of the information stored despite system crashes or attempts
at unauthorized access. A relational database is a collection of tables, each of which is assigned a
unique name. Each table consists of a set of attributes (columns or fields) and usually stores a
large set of tuples (records or rows). Each tuple in a relational table represents an object
identified by a unique key and described by a set of attribute values. A semantic data model,
such as an entity-relationship (ER) data model, is often constructed for relational databases. An
ER data model represents the database as a set of entities and their relationships.
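To make the terms above concrete, the following is a minimal sketch using Python's built-in sqlite3 module. The tables customer, item, and purchases, along with their columns and sample values, are illustrative assumptions and are not taken from the text. The sketch shows named tables, attributes, tuples, a unique key, and a relationship between two entities.

```python
import sqlite3

# In-memory database; every table and column name here is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Entity tables: each has a unique name, a set of attributes, and a key.
conn.execute("""
    CREATE TABLE customer (
        cust_id INTEGER PRIMARY KEY,  -- unique key identifying each tuple
        name    TEXT NOT NULL,
        city    TEXT
    )
""")
conn.execute("""
    CREATE TABLE item (
        item_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        price   REAL NOT NULL
    )
""")

# Relationship table: links the two entities, as in an ER model.
conn.execute("""
    CREATE TABLE purchases (
        cust_id INTEGER NOT NULL REFERENCES customer(cust_id),
        item_id INTEGER NOT NULL REFERENCES item(item_id),
        qty     INTEGER NOT NULL,
        PRIMARY KEY (cust_id, item_id)
    )
""")

# Each inserted row is one tuple (record) described by its attribute values.
conn.execute("INSERT INTO customer VALUES (1, 'Ann Smith', 'Vancouver')")
conn.execute("INSERT INTO item VALUES (10, 'Laptop', 899.0)")
conn.execute("INSERT INTO purchases VALUES (1, 10, 2)")

# The key must be unique: a second tuple with cust_id = 1 is rejected.
try:
    conn.execute("INSERT INTO customer VALUES (1, 'Bob Lee', 'Chicago')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

customer_rows = conn.execute(
    "SELECT cust_id, name, city FROM customer"
).fetchall()
print(customer_rows)
print(duplicate_rejected)
```

The rejected insert illustrates how the DBMS, rather than the application, enforces the rule that each tuple is identified by a unique key.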
Relational data can be accessed by database queries written in a relational query language (e.g.,
SQL) or with the assistance of graphical user interfaces. A given query is transformed into a set
of relational operations, such as join, selection, and projection, and is then optimized for
efficient processing. A query allows retrieval of specified subsets of the data. Suppose that your
job is to analyze the AllElectronics data. Through the use of relational queries, you can ask things
like, “Show me a list of all items that were sold in the last quarter.” Relational languages also use
aggregate functions such as sum, avg (average), count, max (maximum), and min (minimum).
Using aggregates allows you to ask: “Show me the total sales of the last month, grouped by
branch,” or “How many sales transactions occurred in the month of December?” or “Which
salesperson had the highest sales?”
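The example questions above can be sketched as SQL queries run through Python's built-in sqlite3 module. The branch and sale tables, their columns, the sample rows, and the fixed 2023 date ranges are all assumptions made for illustration; they are not the actual AllElectronics schema.

```python
import sqlite3

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE branch (branch_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sale (
        sale_id     INTEGER PRIMARY KEY,
        branch_id   INTEGER REFERENCES branch(branch_id),
        salesperson TEXT,
        item        TEXT,
        amount      REAL,
        sale_date   TEXT  -- ISO 8601 date string
    );
    INSERT INTO branch VALUES (1, 'Downtown'), (2, 'Airport');
    INSERT INTO sale VALUES
        (1, 1, 'Lee',   'Laptop',  900.0, '2023-11-20'),
        (2, 1, 'Kim',   'Phone',   600.0, '2023-12-03'),
        (3, 2, 'Lee',   'Monitor', 250.0, '2023-12-15'),
        (4, 2, 'Patel', 'Laptop',  950.0, '2023-12-28');
""")

# Selection (WHERE) and projection (the column list):
# "all items that were sold in the last quarter" (here assumed to be Q4 2023).
items_last_quarter = [row[0] for row in conn.execute("""
    SELECT DISTINCT item
    FROM sale
    WHERE sale_date BETWEEN '2023-10-01' AND '2023-12-31'
    ORDER BY item
""")]

# Join plus the sum aggregate:
# "total sales of the last month, grouped by branch" (here December 2023).
sales_by_branch = conn.execute("""
    SELECT b.name, SUM(s.amount)
    FROM sale AS s
    JOIN branch AS b ON b.branch_id = s.branch_id
    WHERE s.sale_date BETWEEN '2023-12-01' AND '2023-12-31'
    GROUP BY b.name
    ORDER BY b.name
""").fetchall()

# The count aggregate: "how many sales transactions occurred in December?"
december_count = conn.execute("""
    SELECT COUNT(*) FROM sale WHERE strftime('%m', sale_date) = '12'
""").fetchone()[0]

# sum with ordering: "which salesperson had the highest sales?"
top_salesperson = conn.execute("""
    SELECT salesperson, SUM(amount) AS total
    FROM sale
    GROUP BY salesperson
    ORDER BY total DESC
    LIMIT 1
""").fetchone()

print(items_last_quarter)
print(sales_by_branch)
print(december_count)
print(top_salesperson)
```

Each query is declarative: it states which subset or summary of the data is wanted, and the database system decides how to carry out the underlying join, selection, and projection operations.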
Reference:
Jiawei Han, Micheline Kamber, and Jian Pei