Machine Learning Unit 4
UNIT 4
REGULARIZATION - SOLVING THE PROBLEM OF OVERFITTING:
REGULARIZATION:
There are several types of regularization techniques that can be used to solve the
problem of overfitting, including:
1. L1 regularization: adds a penalty proportional to the sum of the absolute values of the weights, which pushes many weights to exactly zero and produces sparse models.
2. L2 regularization: adds a penalty proportional to the sum of the squared weights, which shrinks the weights toward zero without eliminating them (see the sketch after this list).
3. Dropout: randomly deactivates a fraction of the neurons during each training step, which prevents the network from relying too heavily on any single neuron.
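As a minimal illustration (not from the notes), the sketch below shows how an L2 penalty changes an ordinary gradient-descent update for linear regression; the function name `ridge_gradient_step`, the penalty strength, and the toy data are made up for the example and assume NumPy is available.

```python
import numpy as np

def ridge_gradient_step(w, X, y, lr=0.01, lam=0.1):
    """One gradient-descent step for linear regression with an L2 penalty.

    Loss: mean squared error + lam * ||w||^2.
    The extra 2 * lam * w term shrinks the weights toward zero.
    """
    n = X.shape[0]
    grad_mse = (2.0 / n) * X.T @ (X @ w - y)   # gradient of the data-fit term
    grad_l2 = 2.0 * lam * w                     # gradient of the L2 penalty
    return w - lr * (grad_mse + grad_l2)

# Tiny usage example on random data
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
w = np.zeros(3)
for _ in range(500):
    w = ridge_gradient_step(w, X, y)
print(w)  # weights are pulled slightly toward zero compared to plain least squares
```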
PERCEPTRON:
The perceptron works by taking a weighted sum of the input features, and if
the sum exceeds a certain threshold, it activates the output (for example, by
predicting a 1). If the sum does not exceed the threshold, it does not activate
the output (for example, by predicting a 0).
Perceptrons are used for tasks such as binary classification, where the goal is
to predict whether an input belongs to one of two classes (e.g. spam vs. not
spam). They are trained using an algorithm called the perceptron learning rule,
which adjusts the weights of the perceptron based on the error between the
predicted output and the true label.
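For concreteness, here is a minimal sketch of the perceptron learning rule described above, trained on a made-up toy problem (the logical AND function); the function name and hyperparameters are illustrative and NumPy is assumed.

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    """Train a single perceptron with the perceptron learning rule.

    X: (n_samples, n_features) inputs, y: 0/1 labels.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # Threshold activation: predict 1 if the weighted sum exceeds 0
            pred = 1 if xi @ w + b > 0 else 0
            error = yi - pred               # 0 if correct, +/-1 if wrong
            w += lr * error * xi            # move the boundary toward the mistake
            b += lr * error
    return w, b

# Usage on a linearly separable toy problem (logical AND)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, y)
print([1 if x @ w + b > 0 else 0 for x in X])  # [0, 0, 0, 1]
```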
Perceptrons are simple and efficient, but they have some limitations. They can
only learn linear decision boundaries, which means they are not suitable for
tasks where the decision boundary is non-linear (the classic example is the XOR
function). Additionally, a single perceptron produces only one binary output, so
it can only separate two classes, which makes it less powerful than multi-layer
neural networks that can learn non-linear boundaries and classify multiple
classes at once.
The perceptron model is one of the simplest types of artificial neural networks. It is
a supervised learning algorithm for binary classifiers, and can be viewed as a single-layer
neural network with four main components: input values, weights and bias, net sum (the
weighted sum of the inputs), and an activation function.
How does it work?
Single Layer Perceptron Model: This is one of the simplest types of artificial neural
networks (ANNs). A single-layer perceptron model consists of a feed-forward network
and includes a threshold transfer function inside the model. The main objective of the
single-layer perceptron model is to classify linearly separable objects with binary
outcomes.
o Forward Stage: In the forward stage, activations flow from the input layer
through the network and terminate at the output layer.
o Backward Stage: In the backward stage, weight and bias values are modified
according to the error: the difference between the actual output and the
desired output is propagated backward, starting at the output layer and
ending at the input layer.
A perceptron model has the following limitations: it produces only a hard 0/1 threshold decision, and it can only classify sets of inputs that are linearly separable; non-linearly separable problems such as XOR cannot be solved by a single perceptron.
NEURAL NETWORK:
A neural network is a machine learning model inspired by the structure and function
of the human brain. It is composed of layers of interconnected "neurons," which
process and transmit information.
Feedforward neural networks are the most basic type of neural network. They
consist of an input layer, one or more hidden layers, and an output layer. The input
layer receives the input data, and the hidden layers process the data using weights
and biases. The output layer produces the final output based on the processed data.
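To make the layered structure concrete, here is a small sketch (not from the notes) of a forward pass through a feedforward network with one hidden layer; the layer sizes, weights, and names are made up and NumPy is assumed.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward(x, params):
    """Forward pass of a tiny feedforward network:
    input layer -> one hidden layer (ReLU) -> output layer (sigmoid)."""
    W1, b1, W2, b2 = params
    h = relu(x @ W1 + b1)                        # hidden layer: weights, bias, activation
    out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # output layer as a probability
    return out

# Usage with randomly initialized weights: 4 inputs -> 8 hidden units -> 1 output
rng = np.random.default_rng(0)
params = (rng.normal(size=(4, 8)), np.zeros(8),
          rng.normal(size=(8, 1)), np.zeros(1))
x = rng.normal(size=(5, 4))             # a batch of 5 samples
print(forward(x, params).shape)         # (5, 1)
```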
Convolutional neural networks (CNNs) are used for tasks such as image
classification and object detection. They are designed to process data with a grid-like
topology, such as an image, and are particularly effective at identifying patterns and
features in images.
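The core operation in a CNN is sliding a small filter over a grid of pixels. The sketch below (illustrative, not from the notes) implements a plain 2D cross-correlation in NumPy and applies a simple vertical-edge filter to a made-up 6x6 image.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D cross-correlation: slide the kernel over the image and
    take a weighted sum at each position (the core CNN operation)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge detector applied to a tiny 6x6 "image"
image = np.zeros((6, 6))
image[:, 3:] = 1.0
kernel = np.array([[1.0, -1.0]])
print(conv2d(image, kernel))   # non-zero only where the intensity changes (the edge)
```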
Recurrent neural networks (RNNs) are used for tasks such as language translation
and text generation. They are designed to process sequential data, such as a time
series or a sentence, and are able to maintain a state or "memory" across time.
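The "memory" of an RNN is simply a hidden state that is carried from one timestep to the next. The minimal sketch below (names and sizes are made up, NumPy assumed) shows one step of a simple recurrent cell and how the state is threaded through a sequence.

```python
import numpy as np

def rnn_step(x_t, h_prev, Wx, Wh, b):
    """One step of a simple recurrent cell: the new hidden state ("memory")
    combines the current input with the previous hidden state."""
    return np.tanh(x_t @ Wx + h_prev @ Wh + b)

# Run a random sequence of 5 timesteps through the cell
rng = np.random.default_rng(0)
Wx, Wh, b = rng.normal(size=(3, 4)), rng.normal(size=(4, 4)), np.zeros(4)
h = np.zeros(4)
for x_t in rng.normal(size=(5, 3)):
    h = rnn_step(x_t, h, Wx, Wh, b)   # h carries information across timesteps
print(h)
```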
Neural networks are trained using a process called backpropagation, which involves
adjusting the weights and biases of the neurons in the network based on the error
between the predicted output and the true label. This process allows the network to
learn to make accurate predictions on new data.
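The sketch below illustrates one backpropagation step for a network with a single hidden layer, using the standard binary cross-entropy gradient at a sigmoid output; the layer sizes, learning rate, and the XOR usage example are illustrative assumptions, not something specified in the notes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(x, y, W1, b1, W2, b2, lr=0.5):
    """One backpropagation step: forward pass, then propagate the output error
    back through the layers and update weights and biases by gradient descent."""
    # Forward pass
    h = sigmoid(x @ W1 + b1)              # hidden layer activations
    p = sigmoid(h @ W2 + b2)              # predicted probability
    # Backward pass (binary cross-entropy loss, sigmoid output)
    d_out = p - y                         # error at the output layer
    d_hid = (d_out @ W2.T) * h * (1 - h)  # error propagated to the hidden layer
    # Gradient-descent updates
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * x.T @ d_hid
    b1 -= lr * d_hid.sum(axis=0)
    return W1, b1, W2, b2

# Usage: learn XOR, which a single perceptron cannot represent
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])
rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
for _ in range(5000):
    W1, b1, W2, b2 = backprop_step(X, y, W1, b1, W2, b2)
print(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).round(2))  # ~ [[0], [1], [1], [0]]
```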
MULTI-CLASS CLASSIFICATION:
Multi-class classification is a machine learning problem in which the goal is to predict the class of
an input sample from a set of predefined classes. For example, a multi-class classification model
might be trained to predict the type of animal in an image (e.g. dog, cat, bird) based on the
features in the image. Common approaches include:
1. One-vs-rest (OvR) approach: This approach involves training a separate binary classifier for each
class, where the class is treated as the positive class and all other classes are treated as the
negative class. During prediction, the input is passed through each classifier, and the class with
the highest predicted probability is chosen as the final prediction.
2. One-vs-one (OvO) approach: This approach involves training a separate binary classifier for
each pair of classes. For example, if there are three classes, three classifiers would be trained: one
to distinguish class 1 from class 2, one to distinguish class 1 from class 3, and one to distinguish
class 2 from class 3. During prediction, the input is passed through each classifier, and the class
that is predicted the most times is chosen as the final prediction.
3. Multinomial logistic regression: This approach involves training a single model that can predict
the probability of each class for an input sample. The class with the highest predicted probability
is chosen as the final prediction.
4. Support vector machines (SVMs): This approach involves training a classifier that finds the
hyperplane in a high-dimensional space that maximally separates the different classes. During
prediction, the input is mapped to the high-dimensional space and classified based on which side
of the hyperplane it falls on.
Choosing the right approach for a multi-class classification problem depends on the nature of
the data and the complexity of the task. It may be necessary to try multiple approaches and
compare their performance in order to find the best solution.
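For concreteness (this example is not part of the notes), the sketch below compares the one-vs-rest, one-vs-one, and multinomial strategies on scikit-learn's built-in Iris dataset, which has three classes; it assumes scikit-learn is installed.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

# Iris has three classes, so it is a simple multi-class problem
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "one-vs-rest": OneVsRestClassifier(LogisticRegression(max_iter=1000)),
    "one-vs-one": OneVsOneClassifier(LogisticRegression(max_iter=1000)),
    "multinomial": LogisticRegression(max_iter=1000),  # single softmax model
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, model.score(X_test, y_test))  # accuracy of each strategy
```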
There are several common activation functions that are used in neural networks,
including:
1. Sigmoid function: The sigmoid function maps input values to a range between 0
and 1, which makes it useful for classification tasks where the output is a probability.
It has a smooth, s-shaped curve and is differentiable, which makes it well-suited for
training with gradient descent. However, it can saturate for large input values and
produce slow convergence, which limits its use in deep networks.
2. Tanh function: The tanh function maps input values to a range between -1 and 1,
which produces zero-centered outputs that are often convenient for hidden layers. It has a
smooth, s-shaped curve and is differentiable, which makes it well-suited for training
with gradient descent. It has a stronger gradient than the sigmoid function, which
helps it saturate less quickly and converge faster, although it can still saturate for
large inputs.
3. ReLU function: The ReLU (Rectified Linear Unit) function outputs the input directly
when it is positive and outputs 0 for negative input values. It is fast to compute and
does not saturate for positive inputs, which makes it well-suited for use in deep networks.
However, it can produce "dead neurons" if a neuron's inputs are consistently negative
(so its gradient is always zero), which can degrade the performance of the network.
4. Leaky ReLU function: The Leaky ReLU function is a variant of the ReLU function that
outputs a small, non-zero value for negative inputs (for example, 0.01 times the input)
rather than outputting 0. This helps to mitigate the problem of dead neurons and improve
the performance of the network.
Choosing the right activation function for a particular task depends on the nature of
the data and the complexity of the model. It may be necessary to try multiple
activation functions and compare their performance in order to find the best
solution.
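As a small illustration (not from the notes), the NumPy sketch below implements the four activation functions listed above so their output ranges can be compared side by side; the 0.01 slope for Leaky ReLU is a common default, not something specified in the notes.

```python
import numpy as np

# Minimal NumPy versions of the four activation functions described above
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))       # output in (0, 1)

def tanh(z):
    return np.tanh(z)                      # output in (-1, 1), zero-centered

def relu(z):
    return np.maximum(0.0, z)              # passes positives, zeroes out negatives

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)   # small slope for negative inputs

z = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
for fn in (sigmoid, tanh, relu, leaky_relu):
    print(fn.__name__, np.round(fn(z), 3))
```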
DROPOUT AS REGULARIZATION: