Chapter 1 - Introduction

The document introduces machine learning, emphasizing its necessity in processing vast amounts of data for better decision-making in organizations. It explains the relationship between machine learning and other fields such as artificial intelligence, data science, and statistics, while detailing types of machine learning including supervised, unsupervised, and semi-supervised learning. Additionally, it outlines key concepts, algorithms, and the importance of data quality in the learning process.

Uploaded by

Manu Manoj

MODULE 1

INTRODUCTION:

1.1 Need for Machine Learning


Business organizations use huge amounts of data in their daily activities. Earlier, the full
potential of this data was not utilized, for two reasons. First, the data was
scattered across different archive systems, and organizations were not able to integrate
these sources fully. Second, there was a lack of awareness of software tools that could help
unearth useful information from the data.

Machine learning has become popular for three reasons:

1. High volume of available data to manage: Big companies such as Facebook,
Twitter, and YouTube generate huge amounts of data that grow at a phenomenal
rate. It is estimated that the data approximately doubles every year.
2. Reduced cost of storage: The cost of storage, and of hardware in general, has
dropped. Therefore, it is now easier to capture, process, store, distribute, and
transmit digital information.
3. Availability of complex algorithms: Especially with the advent of deep learning,
many algorithms are now available for machine learning.

With the popularity and ready adoption of machine learning by business organizations, it
has become a dominant technology trend. Before starting the machine learning journey,
let us establish these terms: data, information, knowledge, intelligence, and wisdom. A
knowledge pyramid is shown in Figure 1.1.


All facts are data. Data can be numbers or text that can be processed by a computer. Today,
organizations are accumulating vast and growing amounts of data from sources such as flat
files, databases, and data warehouses in different storage formats.

Processed data is called information. This includes patterns, associations, or relationships
among data. For example, sales data can be analysed to extract information such as which
product sells fastest. Condensed information is called knowledge. For example, the historical
patterns and future trends obtained from the above sales data can be called knowledge. Unless
knowledge is extracted, data is of no use. Similarly, knowledge is not useful unless it is put into
action. Intelligence is knowledge applied for action; an actionable form of knowledge is
called intelligence. Computer systems have been successful up to this stage. The ultimate
objective of the knowledge pyramid is wisdom, which represents the maturity of mind that is,
so far, exhibited only by humans.

Here comes the need for machine learning. The objective of machine learning is to process
these archival data so that organizations can take better decisions, design new products,
improve business processes, and develop effective decision support systems.

1.2 Machine Learning Explained

Machine learning is an important sub-branch of Artificial Intelligence (AI). A frequently
quoted definition of machine learning is attributed to Arthur Samuel, one of the pioneers of
Artificial Intelligence. He stated that "Machine learning is the field of study that gives
computers the ability to learn without being explicitly programmed."

The key to this definition is that systems should learn by themselves, without explicit
programming. It is widely known that to perform a computation, one needs to write
programs that tell the computer how to do that computation.

In conventional programming, after understanding the problem, a detailed design of the
program, such as a flowchart or an algorithm, needs to be created and converted into
programs using a suitable programming language. This approach can be difficult for many
real-world problems such as puzzles, games, and complex image recognition applications.
Initially, artificial intelligence aimed to understand these problems and develop
general-purpose rules manually. These rules were then formulated into logic and implemented
in a program to create intelligent systems. This idea of developing intelligent systems by
using logic and reasoning, converting an expert's knowledge into a set of rules and programs,
is called an expert system. An expert system like MYCIN was designed for medical diagnosis
after converting the expert knowledge of many doctors into a system. The name MYCIN is
derived from the fact that the names of many antibiotics end with 'mycin'. However, this
approach did not progress much, as the programs lacked real intelligence.

Just as humans take decisions based on experience, computers build models based on
patterns extracted from the input data and then use these models for prediction
and for taking decisions. For computers, the learnt model is equivalent to human experience.
This is shown in Figure 1.2.

Often, the quality of data determines the quality of experience and, therefore, the quality of
the learning system. In statistical learning, the relationship between the input x and output
y is modelled as a function in the form y=f(x). Here, f is the learning function that maps the
input x to output y. Learning of function f is the crucial aspect of forming a model in
statistical learning. In machine learning, this is simply called mapping of input to output.
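
As a minimal sketch of what "learning the function f" can mean, the following fits a straight line to a handful of (x, y) pairs by least squares. The data points and the names `fit_line` and `f` are hypothetical and chosen only for illustration; a straight line is just one possible form of f.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations that happen to follow y = 2x + 1
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)


def f(x):
    """The learnt mapping from input x to output y."""
    return slope * x + intercept


print(f(10))  # prediction for an input that was not in the data
```

The program is never told the rule y = 2x + 1; the slope and intercept are recovered from the data alone.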

The learning program summarizes the raw data in a model. Formally stated, a model is an
explicit description of patterns within the data in the form of:

1. Mathematical equation
2. Relational diagrams like trees/graphs
3. Logical if/else rules, or
4. Groupings called clusters

In summary, a model can be a formula, procedure, or representation that can generate
decisions from data. The difference between a pattern and a model is that the former is local
and applicable only to certain attributes, whereas the latter is global and fits the entire
dataset. For example, a model can be helpful to examine whether a given email is spam or not.
The point is that the model is generated automatically from the given data.
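
To make the idea of an automatically generated if/else model concrete, here is a small sketch that learns a single threshold rule for spam detection. The feature (a count of suspicious words per email), the data, and the function names are all hypothetical; real spam filters use far richer features and models.

```python
def learn_threshold_rule(values, labels):
    """Pick the threshold t for which the rule 'value > t means spam' is most accurate."""
    best_t, best_correct = None, -1
    for t in sorted(set(values)):
        correct = sum((v > t) == is_spam for v, is_spam in zip(values, labels))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


# Hypothetical training data: suspicious-word count and whether the email was spam
counts = [0, 1, 1, 2, 5, 6, 7, 9]
is_spam = [False, False, False, False, True, True, True, True]

threshold = learn_threshold_rule(counts, is_spam)


def classify(count):
    """The learnt model: a single if/else rule."""
    return "spam" if count > threshold else "not spam"


print(threshold, classify(8), classify(1))
```

The rule itself is ordinary if/else logic; what makes it a learnt model is that the threshold was chosen from the data rather than written by hand.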


1.3 Machine Learning in Relation to other Fields

Machine learning primarily uses concepts from Artificial Intelligence, Data Science, and
Statistics. It is the result of combining ideas from diverse fields.

1.3.1 Machine Learning and Artificial Intelligence

Machine learning is an important branch of AI, which is a much broader subject. The aim of AI is
to develop intelligent agents. An agent can be a robot, a human, or any autonomous system.
Initially, the idea of AI was ambitious, that is, to develop intelligent systems like human beings.
The focus was on logic and logical inferences. The field has seen many ups and downs. The down
periods were called AI winters.

The resurgence in AI happened due to the development of data-driven systems. The aim is to find
relations and regularities present in the data. Machine learning is the subbranch of AI whose
aim is to extract patterns for prediction. It is a broad field that includes learning from
examples and other areas like reinforcement learning. The relationship of AI and machine
learning is shown in Figure 1.3. The model can take an unknown instance and generate results.

Deep learning is a subbranch of machine learning. In deep learning, the models are
constructed using neural network technology. Neural networks are based on models of the human
neuron. Many neurons are connected into a network, and activation functions determine which
further neurons are triggered to perform tasks.


1.3.2 Machine Learning, Data Science, Data Mining, and Data Analytics

Data science is an ‘umbrella’ term that encompasses many fields. Machine learning starts with
data. Therefore, data science and machine learning are interlinked. Machine learning is a
branch of data science. Data science deals with the gathering of data for analysis. It is a broad
field that includes:

Big Data: Data science is concerned with the collection of data. Big data is a field of data
science that deals with the following characteristics of data:

1. Volume: Huge amount of data is generated by big companies like Facebook, Twitter,
YouTube.

2. Variety: Data is available in a variety of forms, such as images and videos, and in different formats.

3. Velocity: It refers to the speed at which the data is generated and processed.

Big data is used by many machine learning algorithms for applications such as language
translation and image recognition. Big data influences the growth of subjects like Deep learning.
Deep learning is a branch of machine learning that deals with constructing models using neural
networks.

Data Mining: Data mining has its origins in business. Just as mining the earth yields
precious resources, it is often believed that unearthing the data produces hidden
information that would otherwise have eluded the attention of the management. Nowadays,
many consider data mining and machine learning to be the same. There is no difference between
these fields except that data mining aims to extract the hidden patterns that are present in the
data, whereas machine learning aims to use them for prediction.

Data Analytics: Another branch of data science is data analytics. It aims to extract useful
knowledge from crude data. There are different types of analytics. Predictive data analytics is
used for making predictions. Machine learning is closely related to this branch of analytics and
shares almost all of its algorithms.

Pattern Recognition: This is an engineering field. It uses machine learning algorithms to extract
features for pattern analysis and pattern classification. One can view pattern recognition as a
specific application of machine learning.

These relations are summarized in Figure 1.4.


1.3.3 Machine Learning and Statistics

Statistics is a branch of mathematics that has a solid theoretical foundation regarding
statistical learning. Like machine learning (ML), it learns from data. The difference
lies in the approach: statistical methods look for regularities in data, called
patterns, by first setting a hypothesis and then performing experiments to verify and
validate that hypothesis in order to find relationships among data.

Statistics requires knowledge of statistical procedures and the guidance of a good
statistician. It is mathematics intensive, and its models are often complicated equations
that involve many assumptions. Statistical methods are developed in relation to the data
being analysed. In addition, statistical methods are coherent and rigorous. They have strong
theoretical foundations and interpretations that require strong statistical knowledge.
Machine learning, comparatively, makes fewer assumptions and requires less statistical
knowledge. But it often requires interaction with various tools to automate the process
of learning.

Nevertheless, there is a school of thought that machine learning is just the latest version
of 'old Statistics' and hence this relationship should be recognized.


1.4 Types of Machine Learning

Learning, like adaptation, occurs as the result of interaction of the program with its
environment. It can be compared with the interaction between a teacher and a student.
There are four types of machine learning as shown in Figure 1.5.

Labelled and Unlabelled Data: Data is a raw fact. Normally, data is represented in the form
of a table. Each row of the table represents a data point, also referred to as a sample or
an example. Features are attributes or characteristics of an object.
Normally, the columns of the table are attributes. Out of all attributes, one attribute is
important and is called a label. The label is the feature that we aim to predict. Thus, there
are two types of data: labelled and unlabelled.

Labelled Data: To illustrate labelled data, let us take one example dataset called the Iris
flower dataset or Fisher's Iris dataset. The dataset has 150 samples of Iris (50 from each of
three classes) with four attributes: the length and width of the sepals and petals. The target
variable is called class. The three classes are Iris setosa, Iris virginica, and Iris versicolor.

The partial data of Iris dataset is shown in Table 1.1.
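
As a rough illustration of the difference between labelled and unlabelled data, the sketch below stores three Iris samples with and without their class. The rows are taken from the commonly distributed version of the dataset and are not necessarily the ones shown in Table 1.1; the variable names are illustrative.

```python
# Each sample: (sepal length, sepal width, petal length, petal width) in cm
labelled = [
    ((5.1, 3.5, 1.4, 0.2), "Iris setosa"),
    ((7.0, 3.2, 4.7, 1.4), "Iris versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "Iris virginica"),
]

# The same samples with the label column removed form an unlabelled dataset
unlabelled = [features for features, _label in labelled]

for features, label in labelled:
    print(features, "->", label)
print(unlabelled)
```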

Figure 1.6: (a) Labelled Dataset (b) Unlabelled Dataset

1.4.1 Supervised Learning

Supervised algorithms use a labelled dataset. As the name suggests, there is a supervisor
or teacher component in supervised learning. A supervisor provides labelled data so
that a model can be constructed and then evaluated on test data.

In supervised learning algorithms, learning takes place in two stages. In layman
terms, during the first stage, the teacher communicates the information that the student
is supposed to master. The student receives the information and understands it. During
this stage, the teacher has no knowledge of whether the information is grasped by the
student. In the second stage, the teacher asks the student a set of questions to find out
how much of the information has been grasped. Based on these questions, the student is
tested, and the teacher informs the student about the assessment. This kind of learning is
typically called supervised learning.


Supervised learning has two methods:


1. Classification
2. Regression

Classification

Classification is a supervised learning method. The input attributes of classification
algorithms are called independent variables. The target attribute is called the label or
dependent variable. The relationship between the input and target variables is represented in
the form of a structure called a classification model. The focus of classification is to
predict the 'label', which is in a discrete form (a value from a finite set of values). An
example is shown in Figure 1.7, where a classification algorithm takes a set of labelled images
of dogs and cats to construct a model that can later be used to classify an unknown test image.

In classification, learning takes place in two stages. During the first stage, called the
training stage, the learning algorithm takes a labelled dataset and starts learning. After the
samples of the training set are processed, the model is generated. In the second stage, the
constructed model is given a test or unknown sample and assigns it a label. This is the
classification process.

This is illustrated in the above Figure 1.7. Initially, the classification learning algorithm learns
with the collection of labelled data and constructs the model. Then, a test case is selected, and
the model assigns a label.
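
The two stages can be sketched with a nearest-neighbour classifier, one of the simplest distance-based methods. This is only an illustration: the two numeric features (weight in kg and ear length in cm), the data, and the function names are hypothetical, and real image classifiers work on pixel data rather than two hand-picked measurements.

```python
import math


def train(labelled_samples):
    """Training stage: a 1-nearest-neighbour model simply stores the labelled samples."""
    return list(labelled_samples)


def predict(model, features):
    """Testing stage: assign the label of the closest stored sample."""
    nearest = min(model, key=lambda item: math.dist(item[0], features))
    return nearest[1]


# Hypothetical labelled data: (weight in kg, ear length in cm) -> label
training_set = [
    ((30.0, 10.0), "dog"),
    ((25.0, 12.0), "dog"),
    ((4.0, 6.0), "cat"),
    ((5.0, 7.0), "cat"),
]

model = train(training_set)
print(predict(model, (4.5, 6.5)))    # unknown test sample
print(predict(model, (28.0, 11.0)))  # another unknown test sample
```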

The classification models can be categorized based on the implementation technology like
decision trees, probabilistic methods, distance measures, and soft computing methods.


Classification models can also be classified as generative models and discriminative models.
Generative models deal with the process of data generation and its distribution. Probabilistic
models are examples of generative models. Discriminative models do not care about the
generation of data. Instead, they simply concentrate on classifying the given data.

Some of the key classification algorithms are:

• Decision Tree
• Random Forest
• Support Vector Machines
• Naïve Bayes
• Artificial Neural Network and Deep Learning networks like CNN

Regression Models

Regression models, unlike classification algorithms, predict continuous variables like price. In
other words, the output is a number. A fitted regression model is shown in Figure 1.8 for a
dataset that represents week as input x and product sales as output y.

The regression model takes input x and generates a model in the form of a fitted line of the
form y=f(x). Here, x is the independent variable that may be one or more attributes and y is the
dependent variable. In Figure 1.8, linear regression takes the training set and tries to fit it
with a line: product sales = 0.66 * Week + 0.54. Here, 0.66 and 0.54 are the regression
coefficients that are learnt from data. The advantage of this model is that a prediction for
product sales (y) can be made for unknown week data (x). For example, the prediction for the
unknown eighth week can be made by substituting x = 8 in the regression formula to get y. Both
regression and classification models are supervised algorithms. Both have a supervisor, and the
concepts of training and testing are applicable to both.
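
The substitution for the eighth week can be worked through in a few lines. The coefficients are the ones quoted above for Figure 1.8; the function name `predict_sales` is illustrative.

```python
SLOPE = 0.66      # regression coefficient for the week
INTERCEPT = 0.54  # constant term


def predict_sales(week):
    """Evaluate the fitted line: product sales = 0.66 * week + 0.54."""
    return SLOPE * week + INTERCEPT


# 0.66 * 8 + 0.54 = 5.28 + 0.54 = 5.82
print(round(predict_sales(8), 2))
```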

1.4.2 Unsupervised Learning


The second kind of learning is by self-instruction. As the name suggests, there is no
supervisor or teacher component.

Cluster analysis and dimensionality reduction algorithms are examples of unsupervised
algorithms.

Cluster Analysis

Cluster analysis is an example of unsupervised learning. It aims to group objects into
disjoint clusters or groups. Cluster analysis clusters objects based on their attributes. All
the data objects of a partition are similar in some aspect and differ significantly from the
data objects in the other partitions.
Some examples of clustering processes are segmentation of a region of
interest in an image, detection of abnormal growth in a medical image, and determining
clusters of signatures in a gene database.
An example of a clustering scheme is shown in Figure 1.9, where the clustering
algorithm takes a set of dog and cat images and groups them into two clusters: dogs and
cats. It can be observed that the samples belonging to a cluster are similar and that samples
differ radically across clusters.
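
As an illustration, here is a bare-bones sketch of k-means, a widely used clustering algorithm. The two-dimensional points are hypothetical, and the initialisation (taking the first k points as centroids) is a deliberately naive assumption; practical implementations choose starting centroids more carefully.

```python
import math


def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns (labels, centroids)."""
    centroids = [points[i] for i in range(k)]  # naive start: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical unlabelled data with two visible groups
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

No labels are supplied at any point; the grouping emerges from the distances between the points alone.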


Some of the key clustering algorithms are:

• k-means algorithm
• Hierarchical algorithms

Dimensionality Reduction

Dimensionality reduction algorithms are examples of unsupervised algorithms. They take
higher-dimensional data as input and output the data in a lower dimension by taking advantage
of the variance of the data. It is the task of reducing the dataset to fewer features without
losing generality.
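
One way to use the variance of the data is principal component analysis (PCA), which is not named in the text above but is a standard example. The sketch below handles only the special case of reducing two features to one, where the direction of greatest variance has a closed form; the data and the function name are hypothetical.

```python
import math


def pca_2d_to_1d(points):
    """Project 2-D points onto their direction of greatest variance."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centred = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix [[a, b], [b, c]]
    a = sum(x * x for x, _ in centred) / n
    c = sum(y * y for _, y in centred) / n
    b = sum(x * y for x, y in centred) / n
    # Largest eigenvalue and a matching eigenvector
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b * b)
    if b != 0:
        vx, vy = b, lam - a
    elif a >= c:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    vx, vy = vx / norm, vy / norm
    projected = [x * vx + y * vy for x, y in centred]
    return projected, (vx, vy)


# Hypothetical data lying exactly on the line y = 2x
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
projected, direction = pca_2d_to_1d(points)
print(direction)
print(projected)
```

Because these points lie on a single line, one number per point is enough to describe them, and the projection keeps all of the variance.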

The differences between supervised and unsupervised learning are listed in the following Table
1.2.

Table 1.2: Differences between Supervised and Unsupervised Learning

1.4.3 Semi-supervised Learning


There are circumstances where the dataset has a huge collection of unlabelled data and
only some labelled data. Labelling is a costly process and is difficult for humans to perform.
Semi-supervised algorithms use the unlabelled data by assigning pseudo-labels to it. Then, the
labelled and pseudo-labelled datasets can be combined.
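
A minimal sketch of pseudo-labelling follows. It assumes, purely for illustration, that each unlabelled sample receives the label of its nearest labelled sample; other schemes train a full classifier first and keep only its confident predictions. The data and function names are hypothetical.

```python
import math


def nearest_label(labelled, features):
    """Label of the labelled sample closest to the given features."""
    return min(labelled, key=lambda item: math.dist(item[0], features))[1]


def pseudo_label(labelled, unlabelled):
    """Assign pseudo-labels to unlabelled samples and combine both sets."""
    pseudo = [(x, nearest_label(labelled, x)) for x in unlabelled]
    return labelled + pseudo


# Hypothetical data: a few labelled samples and more unlabelled ones
labelled = [((1.0, 1.0), "A"), ((9.0, 9.0), "B")]
unlabelled = [(1.5, 0.5), (8.0, 9.5), (2.0, 2.0)]

combined = pseudo_label(labelled, unlabelled)
for features, label in combined:
    print(features, label)
```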

1.4.4 Reinforcement Learning

Reinforcement learning mimics human beings. Just as human beings use ears and eyes to
perceive the world and take actions, reinforcement learning allows the agent to interact
with the environment to get rewards. The agent can be a human, an animal, a robot, or any
independent program. The rewards enable the agent to gain experience. The agent aims
to maximize the reward.
The reward can be positive or negative (punishment). When the rewards are higher,
the behaviour gets reinforced and learning becomes possible.
Consider the following example of a grid game, as shown in Figure 1.10.

In this grid game, the gray tile indicates danger, black is a block, and the tile with
diagonal lines is the goal. The aim is to start, say, from the bottom-left tile and use the
actions left, right, top, and bottom to reach the goal state.
To solve this sort of problem, there is no data. The agent interacts with the
environment to get experience. In the above case, the agent tries to create a model by
simulating many paths and finding rewarding paths. This experience helps in
constructing a model.
In summary, unlike supervised learning, there is no supervisor
or labelled dataset. Many sequential decisions need to be taken to reach the final
decision. Therefore, reinforcement algorithms are reward-based, goal-oriented
algorithms.
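
The idea of simulating many paths and reinforcing the rewarding ones can be sketched with Q-learning, one specific reinforcement learning algorithm that the text does not name. The grid below is hypothetical and is not the layout of Figure 1.10: the positions of the block, danger, and goal tiles, the reward values, and the learning parameters are all assumptions made for illustration.

```python
import random

ROWS, COLS = 3, 4
START, GOAL, DANGER, BLOCK = (2, 0), (0, 3), (1, 3), (1, 1)  # hypothetical layout
ACTIONS = {"top": (-1, 0), "bottom": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Apply an action; returns (next_state, reward, episode_finished)."""
    dr, dc = ACTIONS[action]
    nxt = (state[0] + dr, state[1] + dc)
    # Moves off the grid or into the block leave the agent where it is
    if not (0 <= nxt[0] < ROWS and 0 <= nxt[1] < COLS) or nxt == BLOCK:
        nxt = state
    if nxt == GOAL:
        return nxt, 1.0, True
    if nxt == DANGER:
        return nxt, -1.0, True
    return nxt, -0.04, False  # small cost per move encourages short paths


def train(episodes=3000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values Q(state, action) from simulated episodes."""
    rng = random.Random(seed)
    q = {((r, c), a): 0.0 for r in range(ROWS) for c in range(COLS) for a in ACTIONS}
    for _ in range(episodes):
        state = START
        for _ in range(100):
            if rng.random() < epsilon:  # explore a random action
                action = rng.choice(list(ACTIONS))
            else:  # exploit the best known action
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the best learnt action from the start tile."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


q_table = train()
path = greedy_path(q_table)
print(path)
```

No labelled examples are supplied anywhere; the table of action values is built purely from the rewards the agent collects while moving around the grid.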

1.5 Challenges of Machine Learning


Problems that can be Dealt with Machine learning

Humans are better than computers in many aspects, such as recognition. However, deep
learning systems challenge human beings in this aspect as well; machines can recognize
human faces in a second. Still, there are tasks where humans are better, as machine learning
systems require quality data for model construction. The quality of a learning system
depends on the quality of its data. This is a challenge. Some of the challenges are listed below:
1. Problems- Machine learning can deal with ‘well-posed’ problems where
specifications are complete and available. Computers cannot solve ‘ill-posed’ problems.
2. Huge data- This is a primary requirement of machine learning. Availability of quality
data is a challenge. Quality data should be large and should not have data
problems such as missing or incorrect data.
3. High computation power- With the availability of Big Data, the computational resource
requirement has also increased. Systems with Graphics Processing Unit (GPU) or even
Tensor Processing Unit (TPU) are required to execute machine learning algorithms. Also,
machine learning tasks have become complex and hence time complexity has increased,
and that can be solved only with high computing power.
4. Complexity of the algorithms- The selection of algorithms, describing the algorithms,
application of algorithms to solve machine learning tasks, and comparison of algorithms
have become necessary for machine learning professionals and data scientists. Algorithms
have become a big topic of discussion, and it is a challenge for machine learning
professionals to design, select, and evaluate optimal algorithms.
5. Bias/Variance- Bias is the error due to overly simple assumptions in the model, and
variance is the error due to the model's sensitivity to the particular training data.
Reducing one tends to increase the other, which leads to a problem called the
bias/variance tradeoff. A model that fits the training data well but fails on test
data, that is, lacks generalization, is said to overfit. The reverse problem is
underfitting, where the model is too simple and fails even on the training data.
Overfitting and underfitting are great challenges for machine learning algorithms.
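
The contrast between overfitting and underfitting can be sketched by comparing training and test errors of three deliberately simple models. Everything here is hypothetical: the data, the use of mean squared error, and the choice of a lookup table as an extreme overfitting model and a constant as an extreme underfitting model.

```python
def mse(model, data):
    """Mean squared error of a model over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


# Hypothetical data that roughly follows y = 2x
train_set = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]
test_set = [(5, 10.1), (6, 11.9)]

# Overfitting extreme: memorize every training pair, know nothing else
table = dict(train_set)
def memorizer(x):
    return table.get(x, 0.0)

# Underfitting extreme: always predict the training mean
mean_y = sum(y for _, y in train_set) / len(train_set)
def constant(x):
    return mean_y

# A reasonable middle ground: least-squares line
mean_x = sum(x for x, _ in train_set) / len(train_set)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in train_set)
         / sum((x - mean_x) ** 2 for x, _ in train_set))
intercept = mean_y - slope * mean_x
def line(x):
    return slope * x + intercept

for name, model in [("memorizer", memorizer), ("constant", constant), ("line", line)]:
    print(name, round(mse(model, train_set), 4), round(mse(model, test_set), 4))
```

The memorizer has zero training error but a large test error, the constant model is poor on both sets, and the fitted line keeps both errors small.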

1.6 Machine Learning Process


The emerging process model for data mining solutions in business organizations is
CRISP-DM. Since machine learning is like data mining, except for the aim, this process can
be used for machine learning. CRISP-DM stands for Cross-Industry Standard Process for Data
Mining. This process involves six steps. The steps are shown in Figure 1.11 and listed below.


1. Understanding the business - This step involves understanding the objectives and
requirements of the business organization. Generally, a single data mining algorithm is
enough for giving the solution. This step also involves the formulation of the problem
statement for the data mining process.

2. Understanding the data - It involves steps like data collection, study of the
characteristics of the data, formulation of hypotheses, and matching of patterns to the
selected hypothesis.

3. Preparation of data - This step involves producing the final dataset by cleaning the
raw data and preparation of data for the data mining process. The missing values may
cause problems during both training and testing phases. Missing data forces classifiers
to produce inaccurate results. This is a perennial problem for the classification models.
Hence, suitable strategies should be adopted to handle the missing data.

4. Modelling - This step involves applying a data mining algorithm to the
data to obtain a model or pattern.


5. Evaluate - This step involves the evaluation of the data mining results using statistical
analysis and visualization methods. The performance of the classifier is determined by
evaluating the accuracy of the classifier. The process of classification is a fuzzy issue. For
example, classification of emails requires extensive domain knowledge and requires
domain experts. Hence, performance of the classifier is very crucial.

6. Deployment -This step involves the deployment of results of the data mining algorithm
to improve the existing process or for a new situation.
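
Two of the steps above lend themselves to a short sketch: handling missing values during data preparation, and measuring classifier accuracy during evaluation. Filling gaps with the column mean is only one possible strategy and is an assumption here, as are the data and the function names.

```python
def impute_mean(column):
    """Replace missing values (None) with the mean of the known values."""
    known = [v for v in column if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in column]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Data preparation: a hypothetical attribute with two missing entries
raw_column = [4.0, None, 6.0, None, 5.0]
clean_column = impute_mean(raw_column)
print(clean_column)

# Evaluation: hypothetical classifier output compared with the true labels
predicted = ["spam", "ham", "spam", "ham"]
actual = ["spam", "ham", "ham", "ham"]
print(accuracy(predicted, actual))
```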

1.7 Machine Learning Applications

Machine Learning technologies are used widely now in different domains. Machine learning
applications are everywhere! One encounters many machine learning applications in the
day-to-day life. Some applications are listed below:
1. Sentiment analysis- This is an application of natural language processing (NLP) where
the words of documents are converted to sentiments like happy, sad, and angry,
which are captured effectively by emoticons. For movie reviews or product reviews,
five stars or one star are automatically attached using sentiment analysis programs.
2. Recommendation systems- These are systems that make personalized purchases
possible. For example, Amazon recommends related books or books
bought by people who have the same taste as you, and Netflix suggests shows or
related movies that match your taste. The recommendation systems are based on machine
learning.
3. Voice assistants- Products like Amazon Alexa, Microsoft Cortana, Apple Siri, and
Google Assistant are all examples of voice assistants. They take speech commands
and perform tasks. These chatbots are the result of machine learning technologies.
4. Technologies like Google Maps and those used by Uber are all examples of
machine learning that locate and navigate the shortest paths to reduce travel time.

The machine learning applications are enormous. The following Table 1.4 summarizes
some of the machine learning applications.
