Unit - 1 DL
Deep learning:
Deep Learning is a type of Machine Learning where computers learn to solve
problems by mimicking how the human brain works. It uses neural networks,
which are made up of layers of connected nodes (like neurons) that process
data and find patterns.
Example:
Imagine you want a computer to recognize whether a picture is of a cat or a
dog. You show it thousands of pictures of cats and dogs. The computer’s neural
network learns by analyzing these pictures layer by layer, identifying features
like ears, fur, or tails. Over time, it becomes so good at recognizing these
features that it can correctly predict whether a new, unseen picture is a cat or a
dog.
Deep Learning is powerful because it can automatically discover important
features in the data without requiring humans to define them explicitly. This
makes it useful for tasks like image recognition, voice assistants, and self-
driving cars.
Fundamentals of Deep Learning:
Deep Learning is a subfield of Machine Learning that uses artificial neural
networks to model and solve complex problems. Here's a simple breakdown of
its fundamentals:
1. Neural Networks
Core Concept: Mimics the structure of the human brain using layers of
neurons (nodes) to process information.
Structure:
o Input Layer: Takes data as input.
o Hidden Layers: Process data using weights, biases, and activation
functions.
o Output Layer: Produces the result (e.g., classification or
prediction).
2. Key Components
Weights and Biases: Learnable parameters that control how strongly signals pass between neurons.
Activation Functions: Introduce non-linearity so the network can learn complex patterns.
Loss Function: Measures the error between the network's output and the expected result.
3. Training Process
Forward Propagation: Data flows through the network to produce an
output.
Backpropagation: The network adjusts weights by computing gradients
(derivatives) to reduce the error.
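These two steps can be illustrated with a minimal NumPy sketch of a tiny two-layer network; the layer sizes, learning rate, toy data, and number of epochs below are arbitrary choices for illustration, not values from these notes.

```python
import numpy as np

# Toy data: 4 samples, 2 input features, binary target (XOR-like pattern)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input layer -> hidden layer (4 neurons)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden layer -> output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for epoch in range(5000):
    # Forward propagation: data flows input -> hidden -> output
    h = sigmoid(X @ W1 + b1)
    y_hat = sigmoid(h @ W2 + b2)

    # Backpropagation: compute gradients of the squared error and adjust the weights
    error = y_hat - y
    grad_out = error * y_hat * (1 - y_hat)          # gradient at the output layer
    grad_hidden = (grad_out @ W2.T) * h * (1 - h)   # gradient propagated back to the hidden layer

    W2 -= lr * h.T @ grad_out
    b2 -= lr * grad_out.sum(axis=0)
    W1 -= lr * X.T @ grad_hidden
    b1 -= lr * grad_hidden.sum(axis=0)

print(np.round(y_hat, 2))  # predictions move toward [0, 1, 1, 0] as the error shrinks
```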
5. Applications
Image Recognition
Deep Learning is widely used for tasks like identifying objects, faces, or scenes
in images, enabling applications such as facial recognition and object detection.
Artificial Intelligence:
Artificial Intelligence (AI) is the simulation of human intelligence in machines
that are programmed to think, learn, and make decisions. These systems use
algorithms to process data and perform tasks that would typically require
human reasoning, such as recognizing patterns, understanding language, and
solving problems.
Example
For instance, AI is used in virtual assistants like Siri and Alexa, which understand
voice commands and provide relevant information or perform tasks such as
setting reminders or answering questions.
Why Artificial Intelligence?
Artificial Intelligence is important because it helps automate tasks, improves
accuracy, and enhances decision-making. By analyzing large amounts of data,
AI can uncover insights that humans might miss, leading to smarter solutions in
various industries. AI reduces the workload on humans for repetitive tasks,
allowing them to focus on more creative and complex activities. Additionally, it
adapts and learns from new information, continuously improving its
performance over time, making it highly valuable in fields like healthcare,
finance, and customer service.
Types of Artificial Intelligence:
• The basic idea behind early neural networks was to create computational models that could learn from data and make predictions based on that learning. They were composed of interconnected nodes, called neurons, organized into layers. Each neuron took input from the neurons in the previous layer and produced output that was fed into the neurons in the next layer.
Kernel Method
A Kernel Method is a technique in machine learning that maps data into
higher-dimensional spaces using mathematical functions called kernels. These
kernels allow algorithms to handle non-linear relationships by transforming
data into a space where linear methods can effectively solve complex
problems.
Key Points of Kernel Methods
1. Support Vector Machine (SVM)
o SVM uses kernel methods to find the optimal hyperplane that
separates data into different classes.
o It maps the data into a higher-dimensional space using kernel functions (e.g., RBF, linear) to handle non-linearly separable data.
o Example: Classifying handwritten digits (e.g., MNIST dataset).
2. Adaptive Filter
o Used in signal processing and noise reduction.
o Kernels help adapt filters to specific data patterns by minimizing
errors based on kernel similarity.
o Example: Removing noise from audio signals while preserving
important features.
3. Kernel Perceptron
o A perceptron variant that uses kernel functions, rather than an explicit weight vector, to map data into a higher-dimensional space for classification.
o It is effective for handling complex, non-linear data.
o Example: Classifying images or speech with complex data
relationships.
4. Kernel Principal Component Analysis (Kernel PCA)
o Kernel PCA applies kernel methods to reduce the dimensionality of data while preserving important features, projecting it onto a lower-dimensional subspace.
o This method is useful for feature extraction and visualization of
complex data.
o Example: Reducing dimensions of high-dimensional datasets in
machine learning.
5. Spectral Clustering
o Kernel methods are used to transform the similarity between data
points into a kernel matrix, which is then used for clustering tasks
based on graph representations.
o It helps uncover clusters in data that may not be linearly
separable.
o Example: Identifying groups in social network analysis or image
segmentation.
Conclusion
Kernel methods provide powerful tools for handling non-linear problems and
transforming data into spaces where linear techniques can be applied
effectively. These methods are widely used in various machine learning tasks,
including classification, dimensionality reduction, and clustering.
Example:
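Below is a minimal scikit-learn sketch of three of the kernel methods listed above (kernel SVM, Kernel PCA, and spectral clustering) on a toy non-linearly separable dataset; the dataset, the RBF kernel choice, and the parameter values are illustrative assumptions, not part of these notes.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC
from sklearn.decomposition import KernelPCA
from sklearn.cluster import SpectralClustering
from sklearn.model_selection import train_test_split

# Toy non-linearly separable data: two interleaving half-moons
X, y = make_moons(n_samples=300, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# SVM with an RBF kernel: the kernel implicitly maps the data into a
# higher-dimensional space where a linear separator exists.
svm = SVC(kernel="rbf", gamma=2.0).fit(X_train, y_train)
print("SVM test accuracy:", svm.score(X_test, y_test))

# Kernel PCA: non-linear dimensionality reduction / feature extraction
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=2.0)
X_kpca = kpca.fit_transform(X)
print("Kernel PCA output shape:", X_kpca.shape)

# Spectral clustering: groups points using a kernel (affinity) matrix
labels = SpectralClustering(n_clusters=2, affinity="rbf", gamma=2.0,
                            random_state=42).fit_predict(X)
print("Cluster sizes:", np.bincount(labels))
```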
3. Semi-Supervised Learning
Semi-supervised learning is a type of machine learning that lies between supervised and unsupervised learning. It occupies the intermediate ground between supervised learning (with labelled training data) and unsupervised learning (with no labelled training data), and it uses a combination of labelled and unlabelled datasets during training.
The concept of semi-supervised learning was introduced to overcome the drawbacks of supervised and unsupervised learning algorithms. Its main aim is to make effective use of all the available data, rather than only the labelled data used in supervised learning.
We can picture these algorithms with an example. Supervised learning is like a student studying under the supervision of an instructor at home and at college. If the student analyses the same concept on their own, without any help from the instructor, that is unsupervised learning. Under semi-supervised learning, the student first studies the concept under the guidance of an instructor at college and then revises it on their own.
Example: Categorizing medical images for disease detection using a small set of
labeled images and many unlabeled ones for feature extraction.
Image Classification: Using limited labeled images with a vast set of unlabeled
images to train a model.
Speech Recognition: Improving accuracy by supplementing small labeled
datasets with a large collection of unlabeled audio data.
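A minimal sketch of the idea using scikit-learn's LabelSpreading, where unlabelled samples are marked with -1; the digits dataset and the 10% labelling fraction are illustrative assumptions, not part of these notes.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.semi_supervised import LabelSpreading

# Small image dataset (8x8 digit images)
X, y = load_digits(return_X_y=True)

# Pretend only 10% of the samples are labelled; the rest are marked -1 (unlabelled)
rng = np.random.default_rng(0)
y_partial = np.full_like(y, -1)
labeled_idx = rng.choice(len(y), size=len(y) // 10, replace=False)
y_partial[labeled_idx] = y[labeled_idx]

# Train on the mix of labelled and unlabelled data
model = LabelSpreading(kernel="knn", n_neighbors=7)
model.fit(X, y_partial)

# Check how well labels propagated to the initially unlabelled samples
unlabeled = y_partial == -1
acc = (model.transduction_[unlabeled] == y[unlabeled]).mean()
print(f"Accuracy on initially unlabelled samples: {acc:.2f}")
```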
4. Reinforcement Learning
Reinforcement learning works on a feedback-based process in which an AI agent (a software component) automatically explores its surroundings by hit and trial: taking actions, learning from experience, and improving its performance. The agent is rewarded for each good action and penalized for each bad one, so the goal of a reinforcement learning agent is to maximize its total reward. In reinforcement learning there is no labelled data as in supervised learning; the agent learns only from its own experience.
Because of the way it works, reinforcement learning is employed in fields such as game theory, operations research, information theory, and multi-agent systems.
Categories of Reinforcement Learning:
1. Positive Reinforcement Learning
2. Negative Reinforcement Learning
Example: AI optimizing movement in a maze to reach the goal while avoiding
obstacles.
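The maze example can be sketched with tabular Q-learning; the 4x4 grid, reward values, and hyperparameters below are illustrative assumptions, not part of these notes.

```python
import numpy as np

# A tiny 4x4 grid maze: the agent starts at (0, 0) and must reach the goal at (3, 3).
GOAL, SIZE = (3, 3), 4
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

Q = np.zeros((SIZE, SIZE, len(ACTIONS)))       # Q-value table: state x action
alpha, gamma, epsilon = 0.1, 0.9, 0.2          # learning rate, discount, exploration rate
rng = np.random.default_rng(0)

for episode in range(2000):
    state = (0, 0)
    while state != GOAL:
        # Epsilon-greedy choice: explore sometimes, otherwise exploit the best known action
        if rng.random() < epsilon:
            a = int(rng.integers(len(ACTIONS)))
        else:
            a = int(np.argmax(Q[state]))

        # Take the action; moves off the grid keep the agent in place
        nr = min(max(state[0] + ACTIONS[a][0], 0), SIZE - 1)
        nc = min(max(state[1] + ACTIONS[a][1], 0), SIZE - 1)
        next_state = (nr, nc)
        reward = 10 if next_state == GOAL else -1   # reward reaching the goal, punish each extra step

        # Q-learning update: move Q toward reward + discounted best future value
        Q[state][a] += alpha * (reward + gamma * Q[next_state].max() - Q[state][a])
        state = next_state

print("Greedy action from the start state:", int(np.argmax(Q[0, 0])))
```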
Evaluating Machine Learning Models
Evaluating machine learning models is crucial to understand their performance
and reliability before deploying them. Evaluation ensures that the model can
make accurate predictions, generalize well to new data, and meet the
objectives of the task. Below are key methods and metrics used for evaluation:
1. Train-Test Split
Definition: Divide the dataset into two parts—training set (to train the
model) and testing set (to evaluate its performance).
Purpose: Helps assess how well the model generalizes to unseen data.
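A minimal sketch of a train-test split using scikit-learn; the Iris dataset and the 80/20 split ratio are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Hold out 20% of the data for testing; train only on the remaining 80%
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Accuracy on unseen test data:", model.score(X_test, y_test))
```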
2. Cross-Validation
Definition: A method to split the data into multiple subsets or "folds."
The model is trained on some folds and validated on the remaining folds,
repeating the process across all folds.
Purpose: Reduces bias and provides a more robust estimate of model
performance.
Common Methods: K-Fold Cross-Validation, Stratified K-Fold.
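A minimal sketch of 5-fold stratified cross-validation using scikit-learn; the dataset and the choice of classifier are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_iris(return_X_y=True)

# 5 folds: train on 4 folds, validate on the remaining fold, repeat for every fold
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)

print("Fold accuracies:", scores.round(3))
print("Mean accuracy:", scores.mean().round(3))
```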
3. Confusion Matrix
Definition: A table used to evaluate the performance of a classification
model by showing true positives, true negatives, false positives, and false
negatives.
Purpose: Helps analyze errors and understand where the model
struggles.
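A minimal sketch that computes a confusion matrix for a binary classifier using scikit-learn; the breast-cancer dataset and the logistic-regression pipeline are illustrative assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = make_pipeline(StandardScaler(),
                      LogisticRegression(max_iter=1000)).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes:
# [[true negatives, false positives],
#  [false negatives, true positives]]
print(confusion_matrix(y_test, y_pred))
```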
4. Overfitting and Underfitting Analysis
Overfitting: Model performs well on training data but poorly on test
data.
Underfitting: Model performs poorly on both training and test data.
Solution: Use regularization techniques, tune hyperparameters, or
gather more data.
5. Time and Resource Efficiency
Evaluate the model's training and inference time, as well as its
computational requirements, especially for deployment in resource-
constrained environments.
Underfitting
Underfitting in machine learning occurs when a model is too simple to
capture the underlying patterns in the data, leading to poor performance
on both the training and testing datasets. It happens when the model
fails to learn adequately, resulting in inaccurate predictions.
1. Low Training Accuracy: The model cannot even perform well on the
training data.
2. Low Testing Accuracy: It struggles even more on unseen test data.
3. Oversimplification: The model makes overly generalized assumptions,
ignoring important patterns in the data.
Reasons for Underfitting
1. The model is too simple, so it is not capable of representing the complexities in the data.
2. The input features used to train the model are not adequate representations of the underlying factors influencing the target variable.
3. The size of the training dataset is not sufficient.
4. Excessive regularization is used to prevent overfitting, which constrains the model and prevents it from capturing the data well.
5. Features are not scaled.
Techniques to Reduce Underfitting
1. Increase model complexity (see the sketch below).
2. Increase the number of features by performing feature engineering.
3. Remove noise from the data.
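A minimal sketch of the first technique, increasing model complexity: a plain linear model underfits quadratic data, while adding polynomial features reduces the underfitting. The synthetic data and the degree-2 choice are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic non-linear data: y = x^2 plus noise
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(-3, 3, size=(200, 1)), axis=0)
y = X.ravel() ** 2 + rng.normal(scale=1.0, size=200)

# A plain linear model is too simple for quadratic data and underfits (R^2 near 0)
underfit = LinearRegression().fit(X, y)
print("Linear model R^2:", round(underfit.score(X, y), 3))

# Increasing model complexity with polynomial features reduces underfitting
better = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)
print("Degree-2 model R^2:", round(better.score(X, y), 3))
```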