
UNIT – 1

Deep learning:
Deep Learning is a type of Machine Learning where computers learn to solve
problems by mimicking how the human brain works. It uses neural networks,
which are made up of layers of connected nodes (like neurons) that process
data and find patterns.
Example:
Imagine you want a computer to recognize whether a picture is of a cat or a
dog. You show it thousands of pictures of cats and dogs. The computer’s neural
network learns by analyzing these pictures layer by layer, identifying features
like ears, fur, or tails. Over time, it becomes so good at recognizing these
features that it can correctly predict whether a new, unseen picture is a cat or a
dog.
Deep Learning is powerful because it can automatically discover important
features in the data without requiring humans to define them explicitly. This
makes it useful for tasks like image recognition, voice assistants, and self-
driving cars.
Fundamentals of Deep Learning:
Deep Learning is a subfield of Machine Learning that uses artificial neural
networks to model and solve complex problems. Here's a simple breakdown of
its fundamentals:
1. Neural Networks
 Core Concept: Mimics the structure of the human brain using layers of
neurons (nodes) to process information.
 Structure:
o Input Layer: Takes data as input.
o Hidden Layers: Process data using weights, biases, and activation
functions.
o Output Layer: Produces the result (e.g., classification or
prediction).

2. Key Components
 The essential elements of deep learning include weights, biases, activation functions, loss functions, and optimizers. Weights and biases are parameters adjusted during training to minimize errors. Activation functions, such as ReLU, Sigmoid, and Tanh, introduce non-linearity, allowing the network to handle complex data. Loss functions measure the error between predicted and actual outputs, while optimizers like Gradient Descent and Adam are used to update the network's parameters efficiently.

3. Training Process
 Forward Propagation: Data flows through the network to produce an
output.
 Backpropagation: The network adjusts weights by computing gradients
(derivatives) to reduce the error.
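To make this concrete, here is a minimal NumPy sketch (not from the notes; the toy data, layer sizes, and learning rate are illustrative assumptions) of a one-hidden-layer network trained with forward propagation, a mean-squared-error loss, and backpropagation with gradient descent:

import numpy as np

# Toy dataset (illustrative): 4 samples, 2 features, binary targets
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input layer -> hidden layer
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden layer -> output layer
lr = 0.5                                        # learning rate

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    # Forward propagation: data flows input -> hidden -> output
    h = sigmoid(X @ W1 + b1)
    y_hat = sigmoid(h @ W2 + b2)
    loss = np.mean((y_hat - y) ** 2)            # mean squared error

    # Backpropagation: push the error backwards with the chain rule
    d_out = (y_hat - y) * y_hat * (1 - y_hat)   # error at the output layer
    d_hid = (d_out @ W2.T) * h * (1 - h)        # error propagated to the hidden layer

    # Gradient descent: adjust weights and biases to reduce the error
    W2 -= lr * (h.T @ d_out) / len(X)
    b2 -= lr * d_out.mean(axis=0)
    W1 -= lr * (X.T @ d_hid) / len(X)
    b1 -= lr * d_hid.mean(axis=0)

print("final training loss:", loss)

In practice, deep learning frameworks compute these gradients automatically, but the loop above shows the same forward pass, error measurement, and weight update described in this section.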

4. Types of Neural Networks

 Feedforward Neural Networks (FNN)
In Feedforward Neural Networks, data flows only in one direction, from input to output, making them simple to use for tasks like classification and regression. They don't have feedback loops and work well for non-sequential problems.

 Convolutional Neural Networks (CNN)
CNNs are designed for image processing, using convolutional layers to detect patterns like edges and textures. They excel in tasks like image classification and object detection.

 Recurrent Neural Networks (RNN)
RNNs are suited for sequential data, remembering previous inputs to make predictions, such as in text generation or time series analysis.

5. Applications
Image Recognition
Deep Learning is widely used for tasks like identifying objects, faces, or scenes
in images, enabling applications such as facial recognition and object detection.

Natural Language Processing (NLP)
NLP leverages Deep Learning to understand and generate human language,
powering applications like chatbots, language translation, and sentiment
analysis.
Self-driving Cars
Deep Learning enables autonomous vehicles to process and interpret sensor
data, navigate environments, and make decisions in real-time.
Speech Recognition
With Deep Learning, systems can accurately convert spoken language into text,
allowing applications like voice assistants and transcription services.
Healthcare Diagnostics
Deep Learning helps analyze medical images, such as X-rays and MRIs, to
detect diseases and assist in accurate diagnoses and treatment planning.

Artificial Intelligence:
Artificial Intelligence (AI) is the simulation of human intelligence in machines
that are programmed to think, learn, and make decisions. These systems use
algorithms to process data and perform tasks that would typically require
human reasoning, such as recognizing patterns, understanding language, and
solving problems.
Example
For instance, AI is used in virtual assistants like Siri and Alexa, which understand
voice commands and provide relevant information or perform tasks such as
setting reminders or answering questions.
Why Artificial Intelligence?
Artificial Intelligence is important because it helps automate tasks, improves
accuracy, and enhances decision-making. By analyzing large amounts of data,
AI can uncover insights that humans might miss, leading to smarter solutions in
various industries. AI reduces the workload on humans for repetitive tasks,
allowing them to focus on more creative and complex activities. Additionally, it
adapts and learns from new information, continuously improving its
performance over time, making it highly valuable in fields like healthcare,
finance, and customer service.
Types of Artificial Intelligence:

Types of Artificial Intelligence Based on Capabilities
1.Narrow AI
Narrow AI is designed for a specific task, such as image recognition or speech
recognition. It excels at one task and performs well within those constraints.
Example: A recommendation system on a streaming platform.
2.General AI
General AI, often considered theoretical, refers to systems with the ability to
perform any intellectual task a human can do, such as reasoning, problem-
solving, and adapting to new situations.
Example: No existing system fully embodies General AI yet.
3.Superintelligent AI
Superintelligent AI surpasses human intelligence in all areas, making decisions
far beyond human capability. It is a speculative concept with ethical and safety
concerns.
Example: Still hypothetical and yet to be developed.
Types of Artificial Intelligence Based on Functionality
1. Reactive Machines
Reactive Machines are the most basic form of AI, which can only react to
specific inputs without understanding past experiences. They don’t store
memories and operate solely based on predefined rules.
Example: IBM’s Deep Blue, which defeated chess grandmasters by
evaluating possible moves without retaining past games.
2. Limited Memory
Limited Memory AI can learn from past experiences to make decisions
but only for a short duration. It stores data temporarily and updates
based on new input.
Example: Autonomous vehicles that use past driving data to predict and
navigate future situations.
3. Theory of Mind
AI with a Theory of Mind aims to understand and predict human
emotions, intentions, and behaviors. It focuses on social interaction and
empathy, allowing more personalized and human-like responses.
Example: AI-based customer service bots that adapt based on customer
sentiment and conversation history.
4. Self-Awareness
Self-Aware AI is a theoretical concept where the system gains a sense of
consciousness, self-awareness, and understanding of its own existence
and emotions. Currently, it is not yet achieved, but it would allow
machines to have a deeper level of understanding and independent
thought.
Example: Fictional or conceptual discussions, as no such system exists
today.
History of Machine Learning:
Definition of Machine Learning
Machine Learning is a branch of artificial intelligence that allows computers to
learn from data and make decisions without being explicitly programmed. It
involves using algorithms to identify patterns and make predictions or decisions
based on the information provided.
Example
For instance, a recommendation system used by streaming platforms like
Netflix or Amazon learns what content you enjoy by analyzing your past
behavior, such as which movies or products you’ve viewed or purchased. Based
on this data, it recommends similar items, helping to improve your experience.

History of Machine Learning


1950s-1960s
Machine Learning began in the 1950s, when early computers started performing simple tasks like recognizing patterns. In 1950, Alan Turing introduced the idea of machines learning like humans. Early approaches included the perceptron and simple rule-based systems.
1970s-1980s
During the 70s and 80s, machine learning became more advanced. Multi-layer neural networks and the backpropagation algorithm were developed, allowing computers to handle tasks like pattern and image recognition. However, computing power was still limited.
1990s
In the 90s, support vector machines (SVM) and ensemble methods improved
machine learning. Computers became faster, allowing larger datasets to be
processed. This helped with applications like bioinformatics and fraud
detection.
2000s
The 2000s brought a rise in big data, making machine learning essential for handling vast amounts of information. Faster hardware and open-source libraries such as scikit-learn made it easier to build models, laying the groundwork for the deep learning advances that followed.
2010s
Machine learning gained widespread popularity with advances in cloud computing and AI applications. Deep learning, especially CNNs, became the standard approach for tasks like image recognition, and companies like Google and Amazon used machine learning for search engines, recommendations, and social media. RNNs and, later, transformers emerged for natural language processing.
2020s
Today, machine learning is used in self-driving cars, healthcare, and AI like GPT
models. New techniques like federated learning and reinforcement learning are
advancing rapidly, shaping the future of AI.
Overall conclusion:
Machine Learning has evolved significantly over the years, starting from
simple rule-based systems in the 1950s to advanced deep learning models
today. Early research focused on basic algorithms, while advancements in
computing power and data led to the development of neural networks and
big data solutions. In recent years, Machine Learning has become essential in
fields like healthcare, finance, and artificial intelligence, shaping the future of
technology.

Probabilistic Modelling in Machine Learning


Probabilistic modelling in machine learning is a method that uses probability
theory to handle uncertainty and make predictions. It helps systems
understand and quantify the likelihood of different outcomes based on data,
allowing for more informed decision-making.
 Probabilistic models consider both observed data and prior knowledge
to improve predictions.
 They are widely used in tasks where uncertainty is inherent, such as
forecasting, recommendation systems, and risk assessment.
Key Concepts:
1. Probability Distributions
Probability distributions are mathematical functions that describe how
data is distributed. Common examples include:
o Gaussian Distribution: Used for continuous data, modeling things
like heights or test scores.
o Binomial Distribution: Used for discrete data, like the number of
successes in a fixed number of trials.
2. Bayesian Inference
Bayesian inference incorporates prior beliefs or assumptions with new
evidence to update probabilities. For example, in medical diagnosis,
Bayesian models combine symptoms with historical patient data to
estimate disease likelihoods.
3. Graphical Models
Graphical models, such as Bayesian Networks, visually represent
dependencies between variables. This allows for easy modeling of
complex relationships and helps in understanding how different factors
influence outcomes.
4. Uncertainty and Learning
Probabilistic models excel in managing uncertainty by making predictions
based on probabilities. They are especially useful when data is
incomplete or noisy, offering more flexible and robust solutions.
Example
In credit scoring, probabilistic models assess the risk of a loan default by
evaluating factors like income, debt, and repayment history, combining
historical data to make more reliable predictions.
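As a concrete illustration of Bayesian inference, the following short sketch applies Bayes' rule to a hypothetical medical test; the prevalence and test accuracies are made-up numbers, not figures from the notes:

# Bayes' rule: P(disease | positive) = P(positive | disease) * P(disease) / P(positive)
prior = 0.01           # assumed prevalence of the disease (prior belief)
sensitivity = 0.95     # assumed P(positive test | disease)
false_positive = 0.05  # assumed P(positive test | no disease)

# Total probability of seeing a positive test
p_positive = sensitivity * prior + false_positive * (1 - prior)

# Posterior: the belief updated by the new evidence
posterior = sensitivity * prior / p_positive
print(f"P(disease | positive test) = {posterior:.3f}")   # about 0.161

Even with an accurate test, the posterior stays modest because the prior probability of the disease is low, which is exactly the kind of reasoning probabilistic models make explicit.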

EARLY NEURAL NETWORKS


• Early neural networks in machine learning refer to the initial attempts to
build artificial neural networks inspired by the biological neurons in the
brain. These early networks were first proposed in the 1940s and 1950s by
researchers such as Warren McCulloch and Walter Pitts, and later refined by
Frank Rosenblatt in the form of the perceptron.

• The basic idea behind these early neural networks was to create
computational models that could learn from data and make predictions based
on that learning. They were composed of interconnected nodes, called
neurons, that were organized into layers. Each neuron took input from the
neurons in the previous layer and produced output that was fed into the
neurons in the next layer.

• Early neural networks were trained using a technique called supervised learning, in which the network was presented with input-output pairs and adjusted its internal weights to minimize the error between its predicted output and the actual output. In the earliest models, such as the perceptron, this was done with simple weight-update rules; the backpropagation algorithm, which propagates the error backwards through the network and adjusts the weights layer by layer, was developed later and made it practical to train multi-layer networks.
• One limitation of early neural networks was that they could only learn linearly separable functions, meaning that they could only classify data that can be separated by a single straight line (or, more generally, a flat hyperplane). This limitation was addressed in the 1980s with the development of more sophisticated architectures, such as multi-layer perceptrons trained with backpropagation and, later, convolutional neural networks, which allowed more complex non-linear functions to be learned.

• Despite their limitations, early neural networks represented an important milestone in the development of machine learning and paved the way for the development of more powerful and sophisticated neural network models.
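The perceptron learning rule described above can be sketched in a few lines; the toy AND dataset, learning rate, and number of epochs below are illustrative assumptions:

import numpy as np

# Toy linearly separable data (illustrative): the logical AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

w = np.zeros(2)    # weights
b = 0.0            # bias
lr = 0.1           # learning rate

for epoch in range(10):
    for xi, target in zip(X, y):
        prediction = 1 if xi @ w + b > 0 else 0   # step activation
        error = target - prediction
        w += lr * error * xi                      # perceptron learning rule
        b += lr * error

print("weights:", w, "bias:", b)
print("predictions:", [1 if xi @ w + b > 0 else 0 for xi in X])

Because AND is linearly separable, this simple rule converges; it would fail on a non-linearly-separable problem such as XOR, which is exactly the limitation noted above.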

Kernel Method
A Kernel Method is a technique in machine learning that maps data into
higher-dimensional spaces using mathematical functions called kernels. These
kernels allow algorithms to handle non-linear relationships by transforming
data into a space where linear methods can effectively solve complex
problems.
Key Points of Kernel Methods
1. Support Vector Machine (SVM)
o SVM uses kernel methods to find the optimal hyperplane that
separates data into different classes.
o It maps the data into a higher-dimensional space using kernel
functions (e.g., RBF, linear) to handle non-linear separable data.
o Example: Classifying handwritten digits (e.g., MNIST dataset).
2. Adaptive Filter
o Used in signal processing and noise reduction.
o Kernels help adapt filters to specific data patterns by minimizing
errors based on kernel similarity.
o Example: Removing noise from audio signals while preserving
important features.
3. Kernel Perceptron
o A kernelized version of the perceptron that replaces explicit feature mappings with kernel functions, allowing it to learn non-linear decision boundaries without computing the higher-dimensional mapping directly.
o It is effective for handling complex, non-linear data.
o Example: Classifying images or speech with complex data relationships.
4. Principal Component Analysis (PCA)
o Kernel PCA extends standard PCA with kernel functions to reduce the dimensionality of data while preserving important (possibly non-linear) structure, projecting it into a lower-dimensional subspace.
o This method is useful for feature extraction and visualization of complex data.
o Example: Reducing the dimensions of high-dimensional datasets in machine learning.
5. Spectral Clustering
o Kernel methods are used to transform the similarity between data
points into a kernel matrix, which is then used for clustering tasks
based on graph representations.
o It helps uncover clusters in data that may not be linearly
separable.
o Example: Identifying groups in social network analysis or image
segmentation.
Conclusion
Kernel methods provide powerful tools for handling non-linear problems and
transforming data into spaces where linear techniques can be applied
effectively. These methods are widely used in various machine learning tasks,
including classification, dimensionality reduction, and clustering.
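As a brief illustration (assuming scikit-learn is available; the dataset and parameters are arbitrary choices, not from the notes), an SVM with an RBF kernel can separate data that no single straight line can:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaving half-moons: not separable by a single straight line
X, y = make_moons(n_samples=300, noise=0.2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# The RBF kernel implicitly maps the data into a higher-dimensional space
model = SVC(kernel="rbf", C=1.0, gamma="scale")
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))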

Decision Tree classification algorithm:


A Decision Tree is a machine learning algorithm used for both classification and
regression tasks. It splits data into branches based on decision rules derived
from feature values, forming a tree-like structure that makes predictions by
traversing from the root to leaf nodes.
>>In a Decision tree, there are two types of nodes: the Decision Node and the Leaf Node. Decision nodes are used to make any decision and have multiple branches, whereas Leaf nodes are the output of those decisions and do not contain any further branches.

Why use Decision Trees?

There are various algorithms in Machine learning, so choosing the best algorithm for the given dataset and problem is the main point to remember while creating a machine learning model. Below are two reasons for using a decision tree:
>> Decision trees usually mimic the way humans think while making a decision, so they are easy to understand.
>> The logic behind a decision tree can be easily followed because it shows a tree-like structure.
Decision Tree Terminologies
>> Root Node: Root node is from where the decision tree starts. It
represents the entire dataset, which further gets divided into two or more
homogeneous sets.
>> Leaf Node: Leaf nodes are the final output node, and the tree cannot be
segregated further after getting a leaf node.
>>Splitting: Splitting is the process of dividing the decision node/root node
into sub-nodes according to the given conditions.
>>Branch/Sub Tree: A tree formed by splitting the tree.
>>Pruning: Pruning is the process of removing the unwanted branches
from the tree.
>> Parent/Child node: The root node of the tree is called the parent node,
and other nodes are called the child nodes.
How Does the Decision Tree Algorithm Work?
In a decision tree, the goal is to predict the class of a given dataset by splitting
it into subsets based on feature values. The algorithm starts at the root node
and follows branches down the tree until it reaches a leaf node.
Steps in Decision Tree Algorithm
Step-1: Start with the Root Node
 Begin with the root node, S, which contains the entire dataset.
Step-2: Select the Best Attribute
 Use an Attribute Selection Measure (ASM) to determine the best
attribute that provides the most information gain or reduces impurity
(e.g., Gini Impurity or Entropy).
Step-3: Divide the Dataset
 Split the dataset S into subsets based on the possible values of the best
attribute. For example, if the attribute is "Age," subsets could be divided
into different age groups (e.g., ≤ 30, 30-50, > 50).
Step-4: Create a Decision Tree Node
 Create a decision tree node using the best attribute chosen in step 2.
This node represents the split based on that attribute.
Step-5: Recursively Build the Tree
 Repeat the process for each subset generated in step 3.
 Continue creating new decision nodes by selecting the best attribute for
each subset until a stage is reached where no further splits are possible
(i.e., leaf node).
Step-6: Generate Leaf Nodes
 Once no further splits are possible, the final nodes are called leaf nodes,
which represent the predicted class for that subset.

Example:
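A minimal scikit-learn sketch of the steps above (the Iris dataset, entropy criterion, and depth limit are illustrative assumptions, not part of the original example):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# criterion="entropy" uses information gain as the attribute selection measure
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(X_train, y_train)

print("test accuracy:", tree.score(X_test, y_test))
print(export_text(tree, feature_names=load_iris().feature_names))   # the learned splits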

Random Forest in Machine Learning


Random Forest is a powerful machine learning algorithm used for both
Classification and Regression tasks. It is part of ensemble learning techniques,
which combine multiple base models (in this case, decision trees) to improve
the overall performance of the model.
 It builds decision trees on different samples and takes the majority vote for classification and the average of their outputs for regression.
 One of the most important features of the random forest algorithm is that it can handle datasets containing continuous variables, as in the case of regression, and categorical variables, as in the case of classification.
 It generally performs well on both classification and regression tasks.
Steps involved in the random forest algorithm:
Step 1: In the random forest model, a subset of data points and a subset of features is selected for constructing each decision tree. Simply put, n random records and m features are taken from a dataset having k records.
Step 2: Individual decision trees are constructed for each sample.
Step 3: Each decision tree will generate an output.
Step 4: The final output is based on majority voting for classification or averaging for regression.
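These steps correspond to what a library implementation carries out internally; a short scikit-learn sketch (the dataset and hyperparameters are illustrative assumptions) looks like this:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# n_estimators = number of decision trees; each tree is trained on a bootstrap
# sample and considers a random subset of features at each split, and the
# final class is decided by majority voting across the trees
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=1)
forest.fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))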

Advantages of Random Forest


 High Accuracy: Combines multiple trees for more accurate predictions.
 Robustness: Less prone to overfitting compared to a single decision tree.
 Handling Missing Data: Can handle datasets with missing values
efficiently.
 Versatility: Works well with both numerical and categorical data.
Disadvantages:
 Computational Cost: Requires more resources as the number of trees
grows.
 Overfitting: Although rare, excessive trees can lead to overfitting,
especially if hyperparameters are not tuned properly.
 Interpretability: While highly accurate, the model can be less
interpretable compared to simpler models.
Applications of Random Forest
 Classification: Image recognition, fraud detection, customer
segmentation.
 Regression: Predicting house prices, sales forecasting, time series
analysis.

What is Gradient Boosting Machine (GBM)?


Gradient Boosting Machine (GBM) is an ensemble learning technique used
primarily for regression and classification tasks. It builds a strong predictive
model by combining multiple weak learners (usually decision trees) in a
sequential manner. GBM focuses on minimizing the loss function iteratively by
correcting errors made by previous models.

How does GBM work?

Generally, most supervised learning algorithms are based on a single predictive model, such as linear regression, a penalized regression model, or a decision tree. But some supervised algorithms in ML depend on combining various models together through an ensemble: multiple base models contribute their predictions, and the boosting algorithm combines them into a single, stronger prediction.
Gradient boosting machines consist of three elements:
1. Loss function
2. Weak learners
3. Additive model
Let's understand these three elements in detail.
1. Loss Function
 Definition: A loss function measures the difference between the
predicted output and the actual output. GBM uses the loss function to
guide the training of each weak learner.
 Role: At each step, GBM adjusts the weights of the errors to minimize
the loss function.
Example:
o For regression, Mean Squared Error (MSE) measures how far the predicted values are from the actual values.
o For classification, Cross-Entropy Loss measures how well the
model predicts the probability of each class.
2. Weak Learner
 Definition: A weak learner is a simple and shallow model (e.g., a decision
tree) that performs slightly better than random guessing.
 Role: Each weak learner is trained to minimize the errors from the
previous models.
 Process: After each weak learner is trained, it is added to the ensemble
to correct the errors and improve the overall prediction.
3. Additive Model
 Definition: GBM builds an additive model by combining multiple weak
learners in a sequential manner.
 Working: Each new model attempts to correct the errors made by
previous models, thus improving the overall performance.
o Step-by-Step Process:
 Fit the first weak learner to the data.
 Fit the second weak learner to the residuals from the first
model.
 Continue this process, fitting a new model for each stage,
until the error is minimized.
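This additive, residual-fitting process can be sketched directly; the following is a simplified squared-error version with made-up toy data and hyperparameters (real GBM libraries add many refinements):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy regression data (illustrative): y = x^2 plus noise
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.3, size=200)

learning_rate = 0.1
prediction = np.full_like(y, y.mean())    # initial model: just predict the mean
trees = []

for stage in range(100):
    residuals = y - prediction                      # errors of the current ensemble
    weak = DecisionTreeRegressor(max_depth=2)       # shallow tree = weak learner
    weak.fit(X, residuals)                          # fit the weak learner to the residuals
    prediction += learning_rate * weak.predict(X)   # add its scaled correction
    trees.append(weak)

print("final training MSE:", np.mean((y - prediction) ** 2))

Each shallow tree corrects part of the remaining error, and the learning rate controls how large each correction step is.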
Advantages of GBM
 High Accuracy: Strong predictive performance, especially when
combined with many trees.
 Robust to Overfitting: Can be tuned to balance bias and variance
effectively.
 Flexibility: Works with both regression and classification tasks.
Disadvantages
 High Computational Cost: Training multiple weak learners can be time-
consuming, especially with large datasets.
 Sensitivity: Requires careful tuning of hyperparameters such as learning
rate, number of trees, and depth.

FUNDAMENTALS IN MACHINE LEARNING:


Machine learning is a field of computer science and artificial intelligence that
enables computer systems to automatically learn and improve from experience
without being explicitly programmed. The core concepts and fundamentals of
machine learning are as follows:

• Data: Machine learning requires a significant amount of data to train models. This data must be accurately labeled and relevant to the problem at hand.
• Algorithms: Machine learning algorithms are the set of rules and
instructions that enable computers to learn from the data. There are three
types of algorithms: supervised learning, unsupervised learning, and
reinforcement learning.

• Model: A machine learning model is the output generated by an algorithm that learns from the data. The model is used to make predictions or decisions based on new data.
• Training: The process of training a machine learning model involves
feeding large amounts of data into an algorithm to enable it to learn and
improve its accuracy over time.

• Validation: Once the model is trained, it needs to be validated to ensure its accuracy and reliability in making predictions on new data.

• Testing: Testing a machine learning model involves evaluating its performance on a separate dataset, which was not used in training or validation.
• Feature Engineering: Feature engineering is the process of selecting and
extracting relevant features or variables from the data that can help improve
the accuracy of the model.

• Overfitting and Underfitting: Overfitting occurs when the model is too complex and captures noise or random variations in the training data, leading to poor generalization on new data. Underfitting occurs when the model is too simple and fails to capture the underlying patterns in the data.
• Hyperparameter Tuning: Machine learning algorithms have several
hyperparameters that need to be optimized to achieve the best performance.
Hyperparameter tuning involves adjusting these parameters to improve the
model's accuracy.
• Deployment: Finally, the trained machine learning model needs to be
deployed in a real-world application, where it can make predictions or
decisions based on new data.
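A compact scikit-learn sketch that ties several of these concepts together: training, validation via cross-validation, hyperparameter tuning, and final testing on held-out data (the dataset, model, and parameter grid below are illustrative assumptions):

from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)                       # data

# Hold out a test set for the final evaluation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Feature scaling (a simple form of feature engineering) plus the model
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Hyperparameter tuning: cross-validated search over the regularization strength C
search = GridSearchCV(pipeline, {"logisticregression__C": [0.01, 0.1, 1, 10]}, cv=5)
search.fit(X_train, y_train)                            # training + validation

print("best hyperparameters:", search.best_params_)
print("test accuracy:", search.score(X_test, y_test))   # testing on unseen data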

FOUR BRANCHES OF MACHINE LEARNING:


1.Supervised Machine Learning
2.Unsupervised Machine Learning
3.Semi-Supervised Machine Learning
4.Reinforcement Learning
Supervised Machine Learning
 Definition: In supervised learning, a model is trained on a labeled
dataset, where both input data and corresponding output are provided.
The model learns to map inputs to outputs by minimizing a loss function.
 Let's understand supervised learning with an example. Suppose we have an input dataset of cat and dog images. First, we provide training to the machine to understand the images, using features such as the shape and size of the tail of a cat and a dog, the shape of the eyes, colour, and height (dogs are taller, cats are smaller). After training is complete, we input the picture of a cat and ask the machine to identify the object and predict the output. The machine is now well trained, so it checks all the features of the object, such as height, shape, colour, eyes, ears, and tail, finds that it is a cat, and puts it in the Cat category. This is how the machine identifies objects in Supervised Learning.
Categories of Supervised Machine Learning
 Supervised machine learning can be classified into two types of
problems, which are given below:
 Classification
 Regression
 Example: Predicting whether a customer will buy a product based on
their past behavior.
2.Unsupervised Machine Learning
Unsupervised learning is different from the Supervised learning technique; as
its name suggests, there is no need for supervision. It means, in unsupervised
machine learning, the machine is trained using the unlabeled dataset, and the
machine predicts the output without any supervision.
Let's take an example to understand it more precisely: suppose there is a basket of fruit images, and we input it into the machine learning model. The images are totally unknown to the model, and the task of the machine is to find the patterns and categories of the objects.
Categories of Unsupervised Machine Learning
Unsupervised Learning can be further classified into two types, which are given
below:
> Clustering
> Association
Example: Grouping similar articles for content recommendations.

3. Semi-Supervised Learning
Semi-Supervised learning is a type of Machine Learning algorithm that lies
between Supervised and Unsupervised machine learning. It represents the
intermediate ground between Supervised (With Labelled training data) and
Unsupervised learning (with no labelled training data) algorithms and uses the
combination of labelled and unlabeled datasets during the training period.
To overcome the drawbacks of supervised learning and unsupervised learning
algorithms, the concept of Semi-supervised learning is introduced. The main
aim of semi-supervised learning is to effectively use all the available data,
rather than only labelled data like in supervised learning.
We can picture these approaches with an analogy. Supervised learning is like a student learning under the supervision of an instructor at home and at college. If the student analyses the same concept on their own without any help from the instructor, that is unsupervised learning. Under semi-supervised learning, the student first studies the concept under the guidance of an instructor at college and then revises it on their own.
Example: Categorizing medical images for disease detection using a small set of
labeled images and many unlabeled ones for feature extraction.
Image Classification: Using limited labeled images with a vast set of unlabeled
images to train a model.
Speech Recognition: Improving accuracy by supplementing small labeled
datasets with a large collection of unlabeled audio data.

4. Reinforcement Learning
Reinforcement learning works on a feedback-based process, in which an AI agent (a software component) automatically explores its surroundings by trial and error: taking actions, learning from experience, and improving its performance. The agent gets rewarded for each good action and punished for each bad action, so the goal of a reinforcement learning agent is to maximize the rewards. In reinforcement learning there is no labelled data as in supervised learning; agents learn from their experience only.
Due to the way it works, reinforcement learning is employed in fields such as game theory, operations research, information theory, and multi-agent systems.
Categories of Reinforcement Learning:
1.Positive Reinforcement Learning
2.Negative Reinforcement Learning
Example: AI optimizing movement in a maze to reach the goal while avoiding
obstacles.
Evaluating Machine Learning Models
Evaluating machine learning models is crucial to understand their performance
and reliability before deploying them. Evaluation ensures that the model can
make accurate predictions, generalize well to new data, and meet the
objectives of the task. Below are key methods and metrics used for evaluation:
1. Train-Test Split
 Definition: Divide the dataset into two parts—training set (to train the
model) and testing set (to evaluate its performance).
 Purpose: Helps assess how well the model generalizes to unseen data.
2. Cross-Validation
 Definition: A method to split the data into multiple subsets or "folds."
The model is trained on some folds and validated on the remaining folds,
repeating the process across all folds.
 Purpose: Reduces bias and provides a more robust estimate of model
performance.
 Common Methods: K-Fold Cross-Validation, Stratified K-Fold.
3. Confusion Matrix
 Definition: A table used to evaluate the performance of a classification
model by showing true positives, true negatives, false positives, and false
negatives.
 Purpose: Helps analyze errors and understand where the model
struggles.
4. Overfitting and Underfitting Analysis
 Overfitting: Model performs well on training data but poorly on test
data.
 Underfitting: Model performs poorly on both training and test data.
Solution: Use regularization techniques, tune hyperparameters, or
gather more data.
5. Time and Resource Efficiency
 Evaluate the model's training and inference time, as well as its
computational requirements, especially for deployment in resource-
constrained environments.
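A short sketch of these evaluation tools using scikit-learn (the synthetic dataset and model choice are illustrative assumptions):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic binary classification data (illustrative)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)   # train-test split

model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation on the training data
scores = cross_val_score(model, X_train, y_train, cv=5)
print("mean cross-validation accuracy:", scores.mean())

# Confusion matrix on the held-out test set
model.fit(X_train, y_train)
print(confusion_matrix(y_test, model.predict(X_test)))   # rows: actual, columns: predicted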

Overfitting and Underfitting in Machine Learning


Overfitting and Underfitting are the two main problems that occur in
machine learning and degrade the performance of the machine learning
models.
The main goal of every machine learning model is to generalize well. Generalization refers to the ability of an ML model to provide suitable output for new, unseen input. It means that after being trained on the dataset, the model can produce reliable and accurate output on new data. Hence, underfitting and overfitting are the two conditions that need to be checked to judge whether the model is generalizing well or not.
Before discussing overfitting and underfitting, let's define some basic terms that will help in understanding this topic:
1.Signal: It refers to the true underlying pattern of the data that helps
the machine learning model to learn from the data.
2. Noise: Noise is unnecessary and irrelevant data that reduces the
performance of the model.
3.Bias: Bias is a prediction error that is introduced in the model due to
oversimplifying the machine learning algorithms. Or it is the difference
between the predicted values and the actual values.
4.Variance: Variance is high when the machine learning model performs well on the training dataset but does not perform well on the test dataset.
Overfitting in Machine Learning
Overfitting happens when a model learns too much from the training
data, including noise and irrelevant details. This makes it perform poorly
on new, unseen data (high variance). It occurs often with complex
algorithms, like non-linear methods, which have more freedom to fit the
data too closely.
How to Avoid Overfitting:
 Use simpler models (e.g., linear algorithms for linear data).
 Limit model complexity (e.g., set a maximum depth for decision trees).
Reasons for Overfitting:
1. High variance and low bias.
2. The model is too complex.
3. The training dataset is too small or too noisy.

Underfitting
Underfitting in machine learning occurs when a model is too simple to
capture the underlying patterns in the data, leading to poor performance
on both the training and testing datasets. It happens when the model
fails to learn adequately, resulting in inaccurate predictions. Signs of underfitting include:
1. Low Training Accuracy: The model cannot even perform well on the
training data.
2. Low Testing Accuracy: It struggles even more on unseen test data.
3. Oversimplification: The model makes overly generalized assumptions,
ignoring important patterns in the data.
Reasons for Underfitting
1. The model is too simple, so it may not be capable of representing the complexities in the data.
2. The input features used to train the model are not adequate representations of the underlying factors influencing the target variable.
3. The size of the training dataset is not enough.
4. Excessive regularization is used to prevent overfitting, which constrains the model from capturing the data well.
5. Features are not scaled.
Techniques to Reduce Underfitting
1. Increase model complexity.
2. Increase the number of features, performing feature engineering.
3. Remove noise from the data.
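A small experiment (with an assumed synthetic dataset and arbitrary depth settings) that makes the underfitting/overfitting trade-off visible by varying the complexity of a decision tree:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Noisy synthetic data (illustrative)
X, y = make_moons(n_samples=500, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for depth in (1, 4, None):   # too simple, moderate, unlimited complexity
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
    print(f"max_depth={depth}: train={tree.score(X_train, y_train):.2f}, "
          f"test={tree.score(X_test, y_test):.2f}")

# Typically, depth 1 underfits (both scores are low), while unlimited depth
# overfits (training accuracy near 1.0 but a noticeably lower test accuracy).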
