
UNIT I

The document provides an overview of deep learning, artificial intelligence (AI), and machine learning (ML), explaining their definitions, types, and applications. It highlights the differences between supervised and unsupervised learning, discusses bias-variance tradeoff, and addresses the limitations and advantages of machine learning and deep learning. Additionally, it touches on the history of deep learning and its impact on various fields such as healthcare, finance, and natural language processing.

Uploaded by

Manasi Pawar
Copyright © All Rights Reserved

Deep Learning

Unit I: Foundations of Deep Learning

Artificial Intelligence:[Important]
• Artificial Intelligence (AI) is the practice of building human-like intelligence into machines
through a set of rules (algorithms).
• The term combines two words: “Artificial”, meaning something made by humans rather than
occurring naturally, and “Intelligence”, meaning the ability to understand or think.
• Another definition: “AI is the study of training machines (computers) to mimic the human brain
and its thinking capabilities”.
• AI focuses on three major skills: learning, reasoning, and self-correction, with the aim of
performing as effectively as possible.
Types:
• Weak (narrow) AI: systems built for specific tasks, such as Siri, Google Assistant, and chatbots
• Strong AI (hypothetical): AI with human-like reasoning across tasks
Examples of AI:
• Optical character recognition (OCR): Uses AI to extract text and data from images and
documents
• Voice assistants: Siri and Alexa use AI to interpret and answer spoken requests
• Customer service chatbots: Help users navigate websites
Uses of AI:
• AI can help businesses become more efficient and profitable.
• AI can help with data collection, analysis, and decision-making.
• AI can help with problem-solving and creativity.
Machine Learning:[Important]

• Machine Learning (ML) is the study of methods that let a system (computer) learn automatically
from experience and improve, without being explicitly programmed.
• ML is a subset of AI. It focuses on developing programs that can access data and learn from it
themselves.
• The learning process observes data to identify patterns, then uses the examples provided to
make better decisions in the future.
• The major aim of ML is to let systems learn from experience with as little human intervention
or assistance as possible.
• Types of machine learning:
• Supervised learning: where labeled data is provided to train the model to predict
specific outcomes.
• Unsupervised learning: where unlabeled data is used to discover hidden patterns and
group data points.
• Reinforcement learning: where the model learns by receiving rewards or penalties
based on its actions.
Example applications of machine learning:
• Image recognition: Identifying objects in images, like recognizing faces in a photo.
• Spam filtering: Classifying emails as spam or not spam.
• Recommendation systems: Suggesting products to users based on their past behavior.
• Fraud detection: Identifying fraudulent transactions in financial systems.
• Medical diagnosis: Analyzing medical scans to detect diseases.
Deep Learning:[Important]
• Deep Learning (DL) is a subfield of Machine Learning that uses neural networks, loosely inspired
by the neurons in the human brain, to learn from data.
• Deep Learning is transforming the way machines understand, learn from, and interact with complex
data. By stacking many layers of artificial neurons, it enables computers to uncover patterns and
make informed decisions from vast amounts of unstructured data.
• DL algorithms process information in stages to identify patterns, somewhat as the human brain
does, and classify the information accordingly.
• DL typically works on larger datasets than classical ML, and the features used for prediction
are learned by the machine itself rather than specified by hand.
Deep Learning has revolutionized many fields, including:
• Computer Vision (image classification, facial recognition)
• Natural Language Processing (NLP) (chatbots, speech recognition, translation)
• Healthcare (disease detection, drug discovery)
• Finance (fraud detection, stock price prediction)
• Self-Driving Cars (object detection, decision-making)

Deep learning automates feature extraction by passing data through multiple layers of neurons. This
hierarchical learning process allows the model to capture complex relationships in data, which often
makes it more effective than traditional machine learning methods for tasks involving images, audio,
text, and other unstructured data.
Deep Learning models consist of artificial neural networks with multiple layers:
1. Input Layer – Receives raw data (images, text, etc.).
2. Hidden Layers – Extract features and patterns.
3. Output Layer – Makes predictions.
Each neuron in a layer processes its input using:
• Weights & Biases – Adjusted during training to improve accuracy.
• Activation Functions – Transform the neuron's weighted sum into its output and so determine how
strongly the neuron fires (e.g., ReLU, Sigmoid).
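The layer structure and the per-neuron computation described above can be sketched in a few lines of plain Python. This is only an illustration: the function names (neuron, dense_layer) and all weights, biases, and inputs are made-up values, not taken from any trained model or library.

```python
import math

def relu(z):
    """ReLU activation: keeps positive values, clips negatives to 0."""
    return max(0.0, z)

def sigmoid(z):
    """Sigmoid activation: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)

def dense_layer(inputs, weight_rows, biases, activation):
    """A layer is several neurons that all read the same inputs."""
    return [neuron(inputs, w, b, activation) for w, b in zip(weight_rows, biases)]

# Hypothetical numbers chosen only for illustration.
x = [1.0, 2.0]                                        # input layer: raw features
hidden = dense_layer(x, [[0.5, -1.0], [1.0, 1.0]], [0.0, -1.0], relu)  # hidden layer
output = neuron(hidden, [1.0, 1.0], -2.0, sigmoid)    # output layer: one prediction
```

During training, the weights and biases would be adjusted to reduce prediction error; here they are fixed by hand to keep the sketch short.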
Difference Between Artificial Intelligence, Machine Learning, and
Deep Learning [Important]
Supervised and Unsupervised Learning
--Supervised:
Supervised learning, as the name indicates, involves a supervisor acting as a teacher. We train the
machine using well-labelled data, meaning each training example is already tagged with the correct
answer.
The supervised learning algorithm analyses these training examples and learns a mapping from inputs
to labels. When the machine is then given a new set of examples (data), it uses that mapping to
predict their outcomes.
For example, a labeled dataset of images of elephants, camels, and cows would have each image tagged
with “Elephant”, “Camel”, or “Cow”.
Supervised learning is classified into two categories of algorithms:
• Regression: A regression problem is when the output variable is a real value, such as
“dollars” or “weight”.
• Classification: A classification problem is when the output variable is a category, such as
“Yes” or “No”, or “disease” or “no disease”.
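As a rough illustration of the two categories, the sketch below fits a straight line to labeled numeric data (regression) and copies the label of the closest labeled example (classification). The helper names fit_line and nearest_label, the 1-nearest-neighbour method, and all data values are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def nearest_label(sample, examples):
    """1-nearest-neighbour classification: copy the label of the closest example."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    _, label = min(examples, key=lambda ex: sq_dist(sample, ex[0]))
    return label

# Regression: the label is a real value (hypothetical weight in kg given height in cm).
slope, intercept = fit_line([150, 160, 170, 180], [50, 60, 70, 80])
predicted_weight = slope * 175 + intercept

# Classification: the label is a category (hypothetical [height m, weight kg] features).
animals = [([3.0, 5000.0], "Elephant"), ([2.0, 500.0], "Camel"), ([1.4, 700.0], "Cow")]
predicted_animal = nearest_label([2.9, 4800.0], animals)
```

In both cases the correct answers are supplied with the training data, which is what makes the learning supervised.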

--Unsupervised:
Unsupervised learning is a type of machine learning that works with data that has no labels or
categories. The main goal is to find patterns and relationships in the data without any guidance.
In this approach, the machine analyzes unorganized information and groups it based on similarities,
patterns, or differences.
Unlike supervised learning, there is no teacher and no labeled training data. The machine must
uncover hidden structures in the data on its own.
For example, unsupervised learning can analyze animal data and group the animals by their traits and
behavior. These groups could correspond to different species, making it possible to organize the
animals without pre-existing labels.
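The grouping idea can be sketched with a very small version of k-means clustering on one-dimensional data. The function name kmeans_1d, the starting centroids, and the animal weights are hypothetical; real use would normally rely on a library implementation and several features.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Tiny k-means for 1-D data: assign to nearest centroid, then re-average."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical animal weights in kg, with no species labels attached.
weights_kg = [4.0, 5.0, 6.0, 480.0, 500.0, 520.0]
centroids, clusters = kmeans_1d(weights_kg, centroids=[0.0, 100.0])
```

The algorithm is never told which animal is which; the two groups emerge only from the similarity of the values.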
-----Compare Supervised and Unsupervised:[Important]

Parameters | Supervised machine learning | Unsupervised machine learning
Input data | Algorithms are trained using labeled data. | Algorithms are applied to data that is not labeled.
Computational complexity | Simpler method. | Computationally complex.
Accuracy | Generally more accurate. | Generally less accurate.
No. of classes | No. of classes is known. | No. of classes is not known.
Data analysis | Uses offline analysis. | Uses real-time analysis of data.
Algorithms used | Linear and logistic regression, KNN, random forest, multi-class classification, decision tree, Support Vector Machine, neural network, etc. | K-Means clustering, hierarchical clustering, Apriori algorithm, etc.
Output | Desired output is given. | Desired output is not given.
Training data | Uses labeled training data to infer the model. | No labeled training data is used.
Complex model | It is not possible to learn models as large and complex as with unsupervised learning. | It is possible to learn larger and more complex models.
Model | The model can be tested against known labels. | The model cannot be tested against known labels.
Called as | Often associated with classification. | Often associated with clustering.
Example | Optical character recognition. | Finding a face in an image.
Supervision | Needs supervision to train the model. | Does not need any supervision to train the model.
Classification | Divided into two types: 1. Regression 2. Classification | Divided into two types: 1. Clustering 2. Association
Feedback | Has a feedback mechanism. | Has no feedback mechanism.
Time consumption | More time consuming. | Less time consuming.

Bias, Variance, and Tradeoff [Important]


Bias:
Bias is the difference between the model's average prediction and the correct value it is trying to
predict.
A model with high bias has a large error on both training and test data.
An algorithm should therefore have low bias to avoid the problem of underfitting.
With high bias, the predictions tend to follow a straight line that does not fit the data in the
data set accurately. Such fitting is known as underfitting.
This happens when the hypothesis is too simple or linear in nature.
[Figure: High Bias in the Model]
In such a problem, the hypothesis is a simple one, such as a linear function of the input.

Variance:
The variance of a model is the variability of its prediction for a given data point, that is, how
much the prediction changes when the model is trained on different training data.
A model with high variance fits the training data very closely and is therefore unable to predict
accurately on data it hasn’t seen before.
As a result, such models perform very well on training data but have high error rates on test data.
A model with high variance is said to overfit the data.
Overfitting means fitting the training set accurately with a complex curve and a high-order
hypothesis. This is not a solution, because the error on unseen data is high.
When training a model, variance should be kept low.
[Figure: High Variance in the Model]
In such a problem, the hypothesis is a complex one, such as a high-degree polynomial.
Bias-Variance Tradeoff:
If the algorithm is too simple (a hypothesis with a linear equation), it may have high bias and low
variance, and is therefore error-prone.
If the algorithm is too complex (a hypothesis with a high-degree equation), it may have high variance
and low bias. In that case, the model will not perform well on new entries.
Between these two conditions lies a balance known as the trade-off, or bias-variance trade-off.
The trade-off arises from model complexity: an algorithm can’t be more complex and less complex at
the same time, so lowering bias tends to raise variance and vice versa.
[Figure: the ideal tradeoff between bias and variance]
We try to minimize the total error of the model by choosing the complexity that best balances bias
and variance.
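The "total error" mentioned above is commonly written as the standard bias-variance decomposition of expected squared error. The notation is an assumption added here, not part of the original notes: f is the true function, \hat{f} is the learned model, y = f(x) + \varepsilon is the observed value, and \sigma^2 is the variance of the noise \varepsilon.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Only the first two terms depend on the model, so tuning model complexity means trading one against the other while the noise term stays fixed.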

--Hyperparameter: [Not Important]


In machine learning, hyperparameters are configuration variables that are set before training a
model. They control aspects of the learning process such as model complexity, learning rate, and
architecture. Unlike model parameters, which are adjusted automatically during training,
hyperparameters are not learned from the data, although they strongly influence the model's
performance.
Key points about hyperparameters:
• Set before training:
Unlike model parameters, hyperparameters are chosen manually by the user before the machine
learning model starts training.
• Impact on learning process:
They influence how the model learns from the data, affecting the speed and quality of training.
• Examples:
Common hyperparameters include the number of layers in a neural network, learning rate, batch size,
kernel size in a convolutional neural network, and the maximum depth of a decision tree.
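The distinction between hyperparameters and model parameters can be seen in a minimal gradient-descent sketch. The function name train, the data, and the specific learning rates are assumptions chosen for illustration only.

```python
def train(xs, ys, learning_rate, epochs):
    """Fit y = w * x by gradient descent on mean squared error.

    learning_rate and epochs are hyperparameters (chosen before training);
    w is a model parameter (learned from the data).
    """
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]   # hypothetical data with true w = 2
w_good = train(xs, ys, learning_rate=0.05, epochs=200)     # converges close to 2
w_slow = train(xs, ys, learning_rate=0.0001, epochs=200)   # still far from 2
```

The same data and the same model give very different results depending only on the learning rate, which is why hyperparameters usually need tuning.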

Overfitting and Underfitting: [Important]


1. What is Overfitting?
Overfitting occurs when a machine learning model learns too much from the training data, including
noise and random fluctuations, instead of capturing the underlying pattern. As a result, the model
performs exceptionally well on training data but poorly on new, unseen data (test data) because it fails
to generalize.
Causes of Overfitting:
• The model is too complex (e.g., deep decision trees, large neural networks).
• There is too little training data, and the model memorizes it instead of learning patterns.
• The model is trained for too many epochs (in deep learning).
• High variance in the model, meaning it is too sensitive to small changes in data.
How to Detect Overfitting?
• Training accuracy is very high, but test accuracy is low.
• The model performs poorly on unseen data.
• The loss decreases significantly on training data but remains high on validation data.
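The detection rule above (good on training data, poor on unseen data) can be sketched with a deliberately overfit model that simply memorizes its training points. The data, the noise pattern, and the memorizer function are hypothetical, and error is measured here as mean squared error rather than accuracy.

```python
def mse(model, xs, ys):
    """Mean squared error of model(x) against the true ys."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data: true relation y = 2x, training labels carry +/-1 noise.
train_x = [float(i) for i in range(10)]
train_y = [2 * x + (1 if i % 2 == 0 else -1) for i, x in enumerate(train_x)]
test_x = [i + 0.4 for i in range(9)]
test_y = [2 * x for x in test_x]

def memorizer(x):
    """Returns the label of the closest training point (memorizes the noise too)."""
    i = min(range(len(train_x)), key=lambda j: abs(x - train_x[j]))
    return train_y[i]

train_error = mse(memorizer, train_x, train_y)   # zero: every point is memorized
test_error = mse(memorizer, test_x, test_y)      # much larger on unseen points
gap = test_error - train_error                   # a large gap is the warning sign
```

A training error of zero alongside a clearly non-zero test error is the pattern to look for.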
How to Prevent Overfitting?

• Use More Training Data: A larger dataset helps the model learn the general pattern.
• Regularization Techniques: Apply L1 (Lasso) or L2 (Ridge) regularization to reduce the
complexity of the model.
• Early Stopping: Stop training when validation loss stops improving.
• Dropout (for Neural Networks): Randomly deactivate neurons during training to prevent reliance
on specific features.
• Reduce Model Complexity: Use fewer layers, nodes, or a smaller decision tree.
• Cross-Validation: Use k-fold cross-validation to ensure the model generalizes well.
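Of the techniques listed, early stopping is the easiest to sketch without a training framework. The function name early_stopping_epoch, the patience value, and the validation losses below are assumptions for illustration; deep learning libraries provide their own versions of this logic.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch whose weights should be kept.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_epoch, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch

# Hypothetical validation losses: improving at first, then rising as the model overfits.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.58, 0.65]
best = early_stopping_epoch(val_losses)
```

With these numbers the loop stops at epoch 5 and reports epoch 3 as the best one, so the later, overfit epochs are never used.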

2. What is Underfitting?
Underfitting happens when a machine learning model is too simple and fails to capture the underlying
pattern in the data. It results in poor performance on both training and test data because the model
hasn’t learned enough from the data.
Causes of Underfitting:
• The model is too simple (e.g., using linear regression for a non-linear problem).
• Too few features are used, leading to insufficient information for learning.
• Insufficient training time (not enough epochs in deep learning).
• High bias in the model, meaning it makes strong assumptions and fails to capture complexity.
How to Detect Underfitting?
• Training and test accuracy are both low.
• The model fails to capture the patterns even in training data.
• Increasing training data does not improve performance.
How to Prevent Underfitting?

• Increase Model Complexity: Use a more complex algorithm (e.g., switch from linear regression
to polynomial regression).
• Feature Engineering: Add more relevant features to help the model learn better patterns.
• Train for More Epochs: Allow the model to learn longer (for deep learning models).
• Reduce Regularization: If regularization is too strong, it may limit the model’s ability to learn.
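A small sketch of the first two remedies: a model that is too simple underfits even its own training data, and adding a more suitable feature fixes it. The helper names, the no-intercept model, and the data (y = x²) are hypothetical choices that keep the arithmetic short.

```python
def fit_single_feature(features, ys):
    """Least-squares weight for the model y = w * feature (no intercept)."""
    return sum(f * y for f, y in zip(features, ys)) / sum(f * f for f in features)

def training_mse(features, ys, w):
    """Mean squared error of y = w * feature on the training data."""
    return sum((w * f - y) ** 2 for f, y in zip(features, ys)) / len(ys)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [x ** 2 for x in xs]            # hypothetical non-linear target: y = x^2

# Too simple: a straight line through the origin underfits even the training data.
w_linear = fit_single_feature(xs, ys)
linear_error = training_mse(xs, ys, w_linear)

# Engineered feature x^2: the model can now represent the pattern.
squared = [x ** 2 for x in xs]
w_squared = fit_single_feature(squared, ys)
squared_error = training_mse(squared, ys, w_squared)
```

The high training error of the linear model is the sign of underfitting; it disappears once the model is given a feature that matches the shape of the data.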

-----------------Limitations of machine learning [Important]----------


1. Lack of Transparency and Interpretability
One of the main drawbacks of machine learning is its lack of transparency and interpretability.
Because they don't reveal how a judgment was made, machine learning algorithms are frequently called
"black boxes." This makes it challenging to understand how a model reached a particular conclusion,
which can be problematic when explanations are required.
2. Overfitting and Underfitting
Machine learning algorithms frequently suffer from two problems: overfitting and underfitting.
Overfitting is a condition where a model performs poorly on new, unseen data because it is too
complex and has fit the training data too closely. Underfitting, on the other hand, happens when a
model is overly simplistic and unable to recognize the underlying patterns in the data, resulting in
subpar performance on both the training data and fresh data.
3. Limited Data Availability
A major challenge for machine learning is the lack of available data. Machine learning algorithms
need a lot of data to learn and produce precise predictions. In many fields, however, there may not
be enough data available, or access to it may be restricted.
4. Computational Resources
Machine learning algorithms can be computationally expensive, and they may require a lot of
resources to be trained successfully. This can be a major barrier, particularly for individuals or
smaller companies who lack access to high-performance computing resources.
5. Ethical Considerations
Machine learning models can have major social, ethical, and legal repercussions when used to make
judgments that affect people's lives. Privacy, security, and data ownership must also be addressed
when adopting machine learning models.
6. Lack of Causality
The main purpose of machine learning algorithms is to find patterns and correlations in data. On
their own, however, they cannot establish causal links between different variables.

----------------History of deep learning [NOT IMP]---------------------


Early developments
• In 1943, Walter Pitts and Warren McCulloch created a computational model of the human brain's
neural networks. They used a combination of algorithms and mathematics called "threshold logic"
to mimic thought processes.
• In the 1940s–1960s, deep learning was known as cybernetics.
Later developments
• In 1986, David Rumelhart, Geoffrey Hinton, and Ronald Williams published a paper that
popularized the back-propagation training algorithm.
• In 1986, Rina Dechter introduced the term "deep learning" to the machine learning
community.
• In 2000, Igor Aizenberg and colleagues introduced the term "deep learning" in the context of
artificial neural networks.
• In 2006, deep learning began to experience a resurgence.

---------Advantage and challenges of deep learning[Important] ----


Advantages of Deep Learning:
1. Automatic feature learning: Deep learning algorithms can automatically learn features from
the data, which means that they don’t require the features to be hand-engineered. This is
particularly useful for tasks where the features are difficult to define, such as image
recognition.
2. Handling large and complex data: Deep learning algorithms can handle large and complex
datasets that would be difficult for traditional machine learning algorithms to process. This
makes it a useful tool for extracting insights from big data.
3. Improved performance: Deep learning algorithms have been shown to achieve state-of-the-
art performance on a wide range of problems, including image and speech recognition, natural
language processing, and computer vision.
4. Handling non-linear relationships: Deep learning can uncover non-linear relationships in
data that would be difficult to detect through traditional methods.
5. Handling structured and unstructured data: Deep learning algorithms can handle both
structured and unstructured data such as images, text, and audio.
6. Predictive modeling: Deep learning can be used to make predictions about future events or
trends, which can help organizations plan for the future and make strategic decisions.
7. Scalability: Deep learning models can be easily scaled to handle an increasing amount of
data and can be deployed on cloud platforms and edge devices.
8. Generalization: Deep learning models can generalize well to new situations or contexts, as
they are able to learn abstract and hierarchical representations of the data.

Challenges of Deep Learning:

1. High computational cost: Training deep learning models requires significant computational
resources, including powerful GPUs and large amounts of memory. This can be costly and
time-consuming.
2. Overfitting: Overfitting occurs when a model is trained too well on the training data and
performs poorly on new, unseen data. This is a common problem in deep learning, especially
with large neural networks, and can be caused by a lack of data, a complex model, or a lack of
regularization.
3. Lack of interpretability: Deep learning models, especially those with many layers, can be
complex and difficult to interpret. This can make it difficult to understand how the model is
making predictions and to identify any errors or biases in the model.
4. Dependence on data quality: Deep learning algorithms rely on the quality of the data they
are trained on. If the data is noisy, incomplete, or biased, the model’s performance will be
negatively affected.
5. Data privacy and security concerns: As deep learning models often rely on large amounts
of data, there are concerns about data privacy and security. Misuse of data by malicious actors
can lead to serious consequences like identity theft, financial loss and invasion of privacy.
6. Lack of domain expertise: Deep learning requires a good understanding of the domain and
the problem you are trying to solve. If the domain expertise is lacking, it can be difficult to
formulate the problem and select the appropriate algorithm.
7. Unforeseen consequences: Deep learning models can lead to unintended consequences, for
example, a biased model can discriminate against certain groups of people, leading to ethical
concerns.
8. Limited to the data they are trained on: Deep learning models can only make predictions based
on the data they have been trained on. They may not be able to generalize to new situations or
contexts that were not represented in the training data.
9. Black box models: Some deep learning models are considered “black-box” models, as it is
difficult to understand how the model makes its predictions and to identify the factors that
influence them.

-------How deep learning works in three figures [Important]--------


---------Common Architectural Principles of Deep Network [Important]-----
-------------------Applications of Deep learning [Important]-----------

----------------Popular industry tools [Important]----------------------


1. TensorFlow:
• Developed by: Google
• Key Features: Open-source, robust, highly scalable, supports large-scale deep learning
projects, strong GPU acceleration with CUDA integration, flexible dataflow graphs for
complex model design.
• Use Cases: Image recognition, natural language processing, time series analysis, scientific
computing.

2. PyTorch:
• Key Features: Dynamic computation graph, highly flexible for research and prototyping,
Pythonic API, strong community support, good for custom model development
• Strengths: Excellent for research-oriented projects where rapid experimentation and
customization are crucial
• Use Cases: Computer vision, natural language processing, generative models
3. Keras:
• Built on top of: TensorFlow (primarily)
• Key Features: User-friendly interface, rapid prototyping, easy to learn, high-level API for
building neural networks, good for beginners
• Strengths: Streamlines model development with a simple syntax, ideal for quick
experimentation and proof-of-concept projects

4. Caffe:
• Focus Area: Image recognition, computer vision
• Key Features: Optimized for fast training and inference, modular architecture, well-suited
for large-scale image processing tasks
• Strengths: High performance for image-centric deep learning applications, particularly
convolutional neural networks (CNNs)
