
MACHINE LEARNING

UNIT 4
REGULARIZATION - SOLVING THE PROBLEM OF OVERFITTING:

REGULARIZATION:

Regularization refers to techniques that are used to calibrate machine learning models in order to minimize the adjusted loss function and prevent overfitting or underfitting.

Overfitting is a common problem in machine learning, where a model performs well on the training data but poorly on unseen data. It occurs when the model is too complex, with too many parameters, and has learned patterns in the training data that do not generalize to new examples.

Regularization is a technique used to prevent overfitting by adding a penalty to the model's objective function during training. The penalty term helps to constrain the model and prevent it from becoming too complex.

There are several types of regularization techniques that can be used to solve the
problem of overfitting, including:

1. L1 regularization:

 L1 regularization, also called Lasso regression, adds the “absolute value of magnitude” of the coefficients as a penalty term to the loss function.
 This method adds a penalty term to the objective function that is proportional to the absolute value of the model's weights. It encourages the model to use only a small number of the available features, which can help to reduce overfitting (a code sketch follows this list).

2. L2 regularization:

 L2 regularization, also called Ridge regression, adds the “squared magnitude” of the coefficients as the penalty term to the loss function.
 This method adds a penalty term to the objective function that is proportional to the square of the model's weights. It encourages the model to have small, non-zero weights, which can help to reduce overfitting.
3. Early stopping:

 This method involves monitoring the model's performance on a validation set during training and stopping the training process when the model's performance starts to degrade. This can help to prevent the model from overfitting to the training data.

4. Dropout:

 This method involves randomly dropping out (setting to zero) a certain percentage of the neurons in the model during training. This helps to prevent the model from relying too heavily on any one feature, which can help to reduce overfitting.

By using regularization techniques, it is possible to train a more generalizable and robust model that is less prone to overfitting.
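
As a concrete illustration of the L1 and L2 penalties described above, the NumPy sketch below adds each penalty to a mean-squared-error loss for a linear model. The penalty strength `lam`, the `penalized_mse` helper, and the toy data are assumptions made for this example, not part of the notes above.

```python
import numpy as np

def penalized_mse(w, X, y, lam=0.1, penalty="l2"):
    """Mean squared error plus an L1 (Lasso) or L2 (Ridge) penalty on the weights."""
    residual = X @ w - y
    mse = np.mean(residual ** 2)
    if penalty == "l1":
        return mse + lam * np.sum(np.abs(w))   # L1: sum of absolute weights
    return mse + lam * np.sum(w ** 2)          # L2: sum of squared weights

# Toy data: 5 samples, 3 features (illustrative values only).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = rng.normal(size=5)
w = rng.normal(size=3)

print(penalized_mse(w, X, y, penalty="l1"))
print(penalized_mse(w, X, y, penalty="l2"))
```

Minimizing this penalized loss instead of the plain MSE is what pushes the weights toward sparsity (L1) or toward small magnitudes (L2).
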
PERCEPTRON IN MACHINE LEARNING:
 Perceptron is a type of artificial neural network that is used for supervised learning tasks. It is a single-layer feedforward network that consists of one or more inputs, a single output, and weights that are used to make predictions.

 The perceptron works by taking a weighted sum of the input features, and if
the sum exceeds a certain threshold, it activates the output (for example, by
predicting a 1). If the sum does not exceed the threshold, it does not activate
the output (for example, by predicting a 0).
 Perceptrons are used for tasks such as binary classification, where the goal is
to predict whether an input belongs to one of two classes (e.g. spam vs. not
spam). They are trained using an algorithm called the perceptron learning rule,
which adjusts the weights of the perceptron based on the error between the
predicted output and the true label.
 Perceptrons are simple and efficient, but they have some limitations. They can only learn linear decision boundaries, which means they are not suitable for tasks where the decision boundary is non-linear. Additionally, a single perceptron can only separate two classes at a time, which makes it less powerful than multi-layer neural networks that can classify multiple classes at once.

The perceptron model is also treated as one of the best and simplest types of artificial neural networks. It is a supervised learning algorithm for binary classifiers. Hence, we can consider it a single-layer neural network with four main parameters: input values, weights and bias, net sum, and an activation function.
How does it work?
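
A minimal sketch of the mechanism described above: a weighted sum of the inputs plus a bias is passed through a hard-limit (step) function, and the perceptron learning rule nudges the weights by the prediction error. The learning rate, epoch count, and the AND-gate data are illustrative assumptions.

```python
import numpy as np

def predict(x, w, b):
    # Weighted sum of inputs; activate (output 1) only if it exceeds the threshold 0.
    return 1 if np.dot(w, x) + b > 0 else 0

# Toy task: learn the AND function (linearly separable, so a perceptron can solve it).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(10):                      # a few passes over the data suffice here
    for xi, yi in zip(X, y):
        error = yi - predict(xi, w, b)   # perceptron learning rule:
        w += lr * error * xi             # adjust weights in proportion to the error
        b += lr * error

print([predict(xi, w, b) for xi in X])   # expected: [0, 0, 0, 1]
```
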

Types of Perceptron Models

Based on the layers, perceptron models are divided into two types. These are as follows:

1. Single-layer Perceptron Model
2. Multi-layer Perceptron Model

Single-Layer Perceptron Model: This is one of the simplest types of artificial neural networks (ANNs). A single-layer perceptron model consists of a feed-forward network and also includes a threshold transfer function inside the model. The main objective of the single-layer perceptron model is to analyze linearly separable objects with binary outcomes.

Multi-Layer Perceptron Model: A multi-layer perceptron model has the same basic structure as a single-layer perceptron model but with a greater number of hidden layers. It is trained with the backpropagation algorithm, which executes in two stages as follows:

o Forward Stage: Activation functions start from the input layer in the forward
stage and terminate on the output layer.
o Backward Stage: In the backward stage, weight and bias values are modified as per the model's requirement. In this stage, the error between the actual output and the desired output is propagated backward, starting at the output layer and ending at the input layer.
A perceptron model has the following limitations:

o The output of a perceptron can only be a binary number (0 or 1) due to the hard-limit transfer function.
o A perceptron can only classify linearly separable sets of input vectors. If the input vectors are not linearly separable, it is not easy to classify them properly.

NEURAL NETWORK:

A neural network is a machine learning model inspired by the structure and function
of the human brain. It is composed of layers of interconnected "neurons," which
process and transmit information.

Neural networks are capable of learning to perform a wide range of tasks by analyzing and interpreting data, such as recognizing patterns, making predictions, and classifying inputs. They are particularly well-suited for tasks that require the processing of large and complex datasets, such as image and speech recognition.

Advantages of Artificial Neural Networks:

 Parallel processing capability
 Storing data on the entire network
 Having a memory distribution
 Having fault tolerance

Disadvantages of Artificial Neural Networks:

 Assurance of proper network structure
 Unrecognized behavior of the network
 Difficulty of showing the issue to the network
 The duration of the network is unknown
There are several types of neural networks, including:

 feedforward neural networks,
 convolutional neural networks, and
 recurrent neural networks.

Feedforward neural networks are the most basic type of neural network. They
consist of an input layer, one or more hidden layers, and an output layer. The input
layer receives the input data, and the hidden layers process the data using weights
and biases. The output layer produces the final output based on the processed data.

Convolutional neural networks (CNNs) are used for tasks such as image
classification and object detection. They are designed to process data with a grid-like
topology, such as an image, and are particularly effective at identifying patterns and
features in images.

Recurrent neural networks (RNNs) are used for tasks such as language translation
and text generation. They are designed to process sequential data, such as a time
series or a sentence, and are able to maintain a state or "memory" across time.

Neural networks are trained using a process called backpropagation, which involves
adjusting the weights and biases of the neurons in the network based on the error
between the predicted output and the true label. This process allows the network to
learn to make accurate predictions on new data.
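
To make the forward pass and backpropagation concrete, here is a minimal one-hidden-layer network trained on the XOR problem with NumPy. The layer sizes, learning rate, and iteration count are illustrative assumptions, not prescribed values.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: XOR, which a single perceptron cannot learn but a hidden layer can.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden (4 units)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output
lr = 1.0

for _ in range(10000):
    # Forward pass: compute activations layer by layer.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: propagate the error from the output layer toward the input layer.
    d_out = (out - y) * out * (1 - out)          # gradient at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)           # gradient at the hidden layer

    # Gradient-descent updates of weights and biases.
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(out.round(3).ravel())   # should approach [0, 1, 1, 0]
```
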

MULTI CLASS CLASSIFICATION:

Multi-class classification is a machine learning problem in which the goal is to predict the class of
an input sample from a set of predefined classes. For example, a multi-class classification model
might be trained to predict the type of animal in an image (e.g. dog, cat, bird) based on the
features in the image.

There are several approaches to solving a multi-class classification problem, including:

1. One-vs-rest (OvR) approach: This approach involves training a separate binary classifier for each class, where that class is treated as the positive class and all other classes are treated as the negative class. During prediction, the input is passed through each classifier, and the class with the highest predicted probability is chosen as the final prediction (see the sketch after this list).
2. One-vs-one (OvO) approach: This approach involves training a separate binary classifier for
each pair of classes. For example, if there are three classes, three classifiers would be trained: one
to distinguish class 1 from class 2, one to distinguish class 1 from class 3, and one to distinguish
class 2 from class 3. During prediction, the input is passed through each classifier, and the class
that is predicted the most times is chosen as the final prediction.
3. Multinomial logistic regression: This approach involves training a single model that can predict
the probability of each class for an input sample. The class with the highest predicted probability
is chosen as the final prediction.
4. Support vector machines (SVMs): This approach involves training a classifier that finds the
hyperplane in a high-dimensional space that maximally separates the different classes. During
prediction, the input is mapped to the high-dimensional space and classified based on which side
of the hyperplane it falls on.

Choosing the right approach for a multi-class classification problem depends on the nature of
the data and the complexity of the task. It may be necessary to try multiple approaches and
compare their performance in order to find the best solution.
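
As one way to compare approaches, the sketch below uses scikit-learn (assumed to be available) to contrast the one-vs-rest scheme with multinomial logistic regression on the small iris dataset; the dataset and hyperparameters are stand-ins chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)   # 3 classes of iris flowers
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One-vs-rest: one binary logistic-regression classifier per class.
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X_train, y_train)

# Multinomial logistic regression: a single model that predicts per-class probabilities.
multinomial = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print("OvR accuracy:        ", ovr.score(X_test, y_test))
print("Multinomial accuracy:", multinomial.score(X_test, y_test))
```
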

NON-LINEARITY WITH ACTIVATION FUNCTIONS:


Activation functions are used in artificial neural networks to introduce nonlinearity
into the model. They are applied to the output of a neuron before it is passed on to
the next layer in the network.

The use of activation functions allows neural networks to model complex relationships in data, as they enable the network to learn nonlinear decision boundaries. Without activation functions, neural networks would only be able to learn linear decision boundaries, which limits their ability to solve more complex problems.

There are several common activation functions that are used in neural networks,
including:

1. Sigmoid function: The sigmoid function maps input values to a range between 0
and 1, which makes it useful for classification tasks where the output is a probability.
It has a smooth, s-shaped curve and is differentiable, which makes it well-suited for
training with gradient descent. However, it can saturate for large input values and
produce slow convergence, which limits its use in deep networks.
2. Tanh function: The tanh function maps input values to a range between -1 and 1, which makes it useful when zero-centered outputs are desired. It has a smooth, s-shaped curve and is differentiable, which makes it well-suited for training with gradient descent. It has a stronger gradient than the sigmoid function, which helps it escape the saturation problem and converge faster.
3. ReLU function: The ReLU (Rectified Linear Unit) function maps input values to the
positive range, and outputs 0 for negative input values. It is fast to compute and
does not saturate, which makes it well-suited for use in deep networks. However, it
can produce "dead neurons" if the input values are consistently negative, which can
degrade the performance of the network.
4. Leaky ReLU function: The Leaky ReLU function is a variant of the ReLU function that
allows small negative input values to pass through, rather than outputting 0. This
helps to mitigate the problem of dead neurons and improve the performance of the
network.

Choosing the right activation function for a particular task depends on the nature of the data and the complexity of the model. It may be necessary to try multiple activation functions and compare their performance in order to find the best solution (simple implementations are sketched below).
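
The four activation functions above can be written down directly; this small NumPy sketch evaluates each on a few sample inputs. The leak slope of 0.01 for Leaky ReLU is a common but assumed default.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))       # squashes inputs to (0, 1)

def tanh(z):
    return np.tanh(z)                      # squashes inputs to (-1, 1), zero-centered

def relu(z):
    return np.maximum(0.0, z)              # passes positives, zeros out negatives

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)   # small negative slope avoids dead neurons

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for f in (sigmoid, tanh, relu, leaky_relu):
    print(f.__name__, f(z).round(3))
```
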

DROPOUT AS REGULARIZATION:

Dropout is a regularization technique that is used to prevent overfitting in neural networks. It works by randomly setting a certain percentage of the neurons in the network to zero during training, which forces the network to rely on a different set of neurons for each training example.

Dropout can be thought of as a form of model ensemble, as it effectively trains multiple models (each with a different subset of neurons) and averages their predictions to make the final prediction. This helps to reduce the dependence of the model on any one neuron, which can help to reduce overfitting and improve the generalizability of the model.

To use dropout, a dropout rate is chosen, which determines the percentage of neurons that will be set to zero. For example, a dropout rate of 0.2 means that 20% of the neurons will be set to zero during training. The dropout rate is typically chosen through hyperparameter tuning, which involves experimenting with different values to find the one that produces the best results.

Dropout is typically applied to the fully-connected layers of a neural network, but it can also be applied to the convolutional layers in a convolutional neural network. It is typically used in conjunction with other regularization techniques, such as weight decay (also known as L2 regularization) and early stopping, to further improve the generalizability of the model.
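
A minimal sketch of how dropout could be applied to one layer's activations, using the common "inverted dropout" formulation so that no rescaling is needed at test time. The 0.2 rate matches the example above; the `dropout` helper and everything else here are illustrative assumptions.

```python
import numpy as np

def dropout(activations, rate=0.2, training=True, seed=None):
    """Inverted dropout: zero out a fraction `rate` of units and rescale the rest."""
    if not training:
        return activations                   # at test time, dropout is a no-op
    rng = np.random.default_rng(seed)
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob    # rescale so expected values are unchanged

h = np.ones((2, 5))                   # pretend these are one layer's activations
print(dropout(h, rate=0.2, seed=0))   # ~20% of entries zeroed, survivors scaled to 1.25
print(dropout(h, training=False))     # unchanged at inference
```
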
