Unit 1 Notes

Machine Learning (ML) is a subset of Artificial Intelligence that utilizes algorithms to learn from data and make predictions through supervised and unsupervised learning. Its applications span various industries including healthcare, finance, retail, and education, with emerging technologies like deep learning and reinforcement learning driving advancements. The evolution of ML has progressed from foundational theories to modern trends, impacting numerous domains and addressing global challenges.

Uploaded by

B06Shifa Fatima

MACHINE LEARNING NOTES

UNIT-I

Definition of Machine Learning: Machine Learning is a subset of Artificial Intelligence
in which algorithms learn from data and generate predictions.
In supervised learning, algorithms learn patterns from existing labelled data in order to
predict outputs for new inputs; in unsupervised learning, algorithms discover general
patterns in data that has no labels.

The scope of Machine Learning (ML) is vast and rapidly expanding across various domains due
to its ability to process data, identify patterns, and make predictions. Here are some key
areas where ML plays a crucial role:

1. Artificial Intelligence Development


• ML is the backbone of AI systems, enabling computers to learn from data and make decisions.

• ML powers AI applications like natural language processing (NLP), computer vision, speech
recognition, and robotics.

2. Industries and Applications

a. Healthcare

• Disease diagnosis: Predict diseases using patient data (e.g., cancer detection, diabetes prediction).

• Drug discovery: Accelerate the process of finding new drugs through predictive modeling.
• Personalized medicine: Tailor treatments to individuals based on genetic data.

• Medical imaging: Analyze X-rays, MRIs, and CT scans for abnormalities.

b. Finance

• Fraud detection: Identify unusual patterns in transactions.

• Algorithmic trading: Optimize financial portfolios and trading strategies.


• Credit scoring: Assess loan risks and predict defaults.

• Customer segmentation: Improve marketing campaigns.

c. Retail and E-commerce

• Recommendation systems: Suggest products based on customer behavior.


• Demand forecasting: Optimize inventory and supply chains.

• Dynamic pricing: Adjust prices based on market trends and customer demand.

d. Manufacturing

• Predictive maintenance: Anticipate equipment failures before they occur.

• Quality control: Automate defect detection in production lines.

• Process optimization: Enhance production efficiency and reduce costs.

e. Transportation

• Autonomous vehicles: Enable self-driving cars using computer vision and decision-making
algorithms.
• Traffic management: Optimize traffic flow and reduce congestion.

• Route planning: Improve logistics and delivery efficiency.

f. Education

• Personalized learning: Adapt educational content to individual learning styles.

• Automated grading: Reduce manual effort for assessments.


• Student performance analysis: Identify at-risk students and improve outcomes.

g. Energy

• Smart grids: Predict energy demand and optimize supply.


• Renewable energy forecasting: Improve solar and wind power management.

• Energy consumption analysis: Reduce waste and increase efficiency.

3. Emerging Technologies
• Deep Learning: Advanced neural networks for solving complex problems in vision, NLP, and
robotics.

• Reinforcement Learning: Autonomous systems learning by interacting with their environment
(e.g., AlphaGo, and the human-feedback fine-tuning behind OpenAI’s ChatGPT).

• Edge AI: Deploying ML models on devices like smartphones, IoT devices, and sensors.
• Quantum Machine Learning: Leveraging quantum computing to solve high-dimensional ML
problems.

4. Research and Development


• ML accelerates scientific discovery in fields like genomics, astronomy, and materials science.
• It is vital for advancing AI ethics, explainable AI, and improving model robustness.

5. Societal and Global Impact


• Climate modeling: Predict weather patterns and analyze climate change impacts.
• Disaster response: Assist in early warning systems for natural disasters.
• Social media analytics: Detect fake news, monitor trends, and analyze sentiment.

• Language translation: Break language barriers with tools like Google Translate.

Future Scope
• Democratization of ML: Easier access to ML tools and platforms for non-experts.
• AI in governance: Improving policymaking and public services.

• Augmented reality (AR) and virtual reality (VR): Enhancing immersive experiences.

• Human-AI collaboration: Creating systems that augment human intelligence.

With continuous advancements, machine learning is expected to transform industries, redefine how we work,
and address global challenges in the coming decades.

The history and evolution of Machine Learning (ML) is a fascinating journey that spans decades,
driven by advancements in mathematics, computer science, and artificial intelligence. Here’s an overview:

1. Pre-1950s: Foundations
• Mathematical Foundations:

• ML’s roots lie in statistics and probability theory, developed in the 18th and 19th centuries (e.g.,
Bayes’ theorem in the 1760s).

• Early concepts of learning were explored in psychology and neuroscience.

• Automata Theory:
• In the 1940s, researchers like Warren McCulloch and Walter Pitts developed a theoretical model of
artificial neurons, laying the foundation for neural networks.

• Alan Turing (1950):

• Turing proposed the idea of “learning machines” in his paper “Computing Machinery and
Intelligence” and introduced the Turing Test.

2. 1950s–1970s: Early Development


• Perceptron (1958):
• Frank Rosenblatt developed the Perceptron, the first algorithm modelled after a biological neural
network, capable of simple pattern recognition.

• Symbolic AI (1950s–60s):

• Early AI focused on symbolic reasoning and logic rather than learning from data (e.g., early expert
systems).
• Limitations of Perceptrons (1969):
• Marvin Minsky and Seymour Papert’s book Perceptrons showed that single-layer perceptrons
couldn’t solve non-linear problems, causing a decline in neural network research.

• Bayesian Methods:

• Statistical methods, including Bayesian inference, began gaining traction for probabilistic
reasoning (Bayesian networks themselves followed in the 1980s).

3. 1980s: The Revival of Neural Networks


• Backpropagation (1986):

• David Rumelhart, Geoffrey Hinton, and Ronald J. Williams popularized backpropagation,
enabling multi-layer neural networks to learn and solve complex problems.

• Expert Systems:

• Rule-based systems, like MYCIN, became popular for decision-making in specific domains like
medicine.
• Evolutionary Algorithms:

• Algorithms inspired by natural selection, like Genetic Algorithms, gained attention.

4. 1990s: Machine Learning Comes Into Focus


• Shift from Symbolic AI to ML:
• Researchers shifted their focus from rule-based AI to data-driven approaches.

• Support Vector Machines (1992):


• Developed by Vladimir Vapnik, SVMs introduced a powerful framework for classification tasks.

• Ensemble Methods:
• Techniques like Bagging (1994) and Boosting (1997) improved predictive performance by
combining multiple models.

• Reinforcement Learning:
• Sutton and Barto popularized Reinforcement Learning, with applications in robotics and gaming.

5. 2000s: Big Data and Computational Power


• Data Explosion:

• The internet and advancements in storage technology led to an explosion of data, creating
opportunities for ML.

• Deep Learning Foundations:

• Neural networks were extended with more layers, leading to the rise of deep learning.

• Unsupervised Learning:
• Algorithms like k-means clustering and Gaussian Mixture Models became prominent for
exploring unlabeled data.

• Kernel Methods:
• Kernelized approaches allowed SVMs to handle non-linear problems efficiently.

6. 2010s: The Deep Learning Revolution


• Breakthrough in Neural Networks:

• GPUs enabled faster computation, driving advancements in deep learning.

• AlexNet (2012):

• Alex Krizhevsky’s deep convolutional network won the ImageNet competition, proving deep
learning’s potential.

• Natural Language Processing (NLP):


• Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTMs), and later transformers
(like BERT and GPT) revolutionized NLP.

• Generative Models:
• Techniques like GANs (2014) and Variational Autoencoders (VAEs) emerged for generating
realistic images, videos, and text.

• Reinforcement Learning Milestones:

• DeepMind’s AlphaGo (2016) defeated the world champion in Go, showcasing the power of
combining deep learning with reinforcement learning.

• Cloud ML Platforms:
• Companies like Google, Amazon, and Microsoft launched cloud-based ML services, democratizing
access to ML tools.

7. 2020s: Modern Trends


• Transformers and Large Language Models (LLMs):

• GPT models (e.g., ChatGPT) and similar architectures revolutionized conversational AI and
generative tasks.

• Self-supervised Learning:

• Models learn patterns from unlabeled data, reducing the need for expensive labeling.

• Edge AI:
• ML models are deployed on edge devices like smartphones, enabling real-time processing.
• Ethical AI:

• Focus on fairness, accountability, and transparency in AI systems.

• Multi-modal Learning:

• Models like OpenAI’s CLIP can process multiple types of data (e.g., images and text)
simultaneously.
Key Milestones in Evolution:

Year  | Milestone                       | Impact
1958  | Perceptron                      | First neural network model.
1986  | Backpropagation                 | Enabled training of multi-layer networks.
1997  | Deep Blue defeats Kasparov      | Highlighted AI’s capability in strategy.
2012  | AlexNet wins ImageNet           | Catalyzed the deep learning revolution.
2016  | AlphaGo defeats Lee Sedol       | Reinforcement learning breakthrough.
2020s | GPT models (e.g., GPT-3, GPT-4) | Redefined NLP and generative AI.

Conclusion

Machine Learning has evolved from theoretical concepts to a transformative force impacting almost every
domain. With continued advancements, ML is poised to become even more pervasive, solving complex
global challenges and reshaping industries.

Supervised Machine Learning


Supervised learning is the type of machine learning in which
machines are trained using well-"labelled" training data and, on the
basis of that data, predict the output. Labelled data means that the
input data is already tagged with the correct output.

In supervised learning, the training data provided to the machine
works as the supervisor that teaches it to predict the output
correctly. It applies the same concept as a student learning under
the supervision of a teacher.

Supervised learning is a process of providing input data as well as
correct output data to the machine learning model. The aim of a
supervised learning algorithm is to find a mapping function that maps
the input variable (x) to the output variable (y).

In the real world, supervised learning can be used for risk
assessment, image classification, fraud detection, spam filtering,
etc.

How Does Supervised Learning Work?

In supervised learning, models are trained using a labelled dataset,
from which the model learns about each type of data. Once the training
process is completed, the model is tested on test data (a held-out
portion of the labelled data that was not used for training), and
then it predicts the output.

The working of supervised learning can be easily understood by the
below example:

Suppose we have a dataset of different types of shapes, which
includes squares, rectangles, triangles, and other polygons such as
hexagons. The first step is to train the model on each shape.

o If the given shape has four sides, and all the sides are
equal, then it will be labelled as a square.
o If the given shape has three sides, then it will be labelled
as a triangle.
o If the given shape has six equal sides, then it will be
labelled as a hexagon.

Now, after training, we test our model using the test set, and the
task of the model is to identify the shape.

The machine is already trained on all types of shapes, and when it
finds a new shape, it classifies the shape on the basis of its number
of sides, and predicts the output.
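
The shape example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real learning algorithm from these notes: each shape is described only by its number of sides and whether all sides are equal, the tiny labelled dataset is invented, and the hypothetical helpers `train` and `predict` simply memorise and look up which label goes with which feature pair.

```python
# Hypothetical labelled training data: (number_of_sides, all_sides_equal) -> label.
training_data = [
    ((4, True), "square"),
    ((3, False), "triangle"),
    ((3, True), "triangle"),
    ((6, True), "hexagon"),
]

def train(examples):
    """Learn a lookup from features to labels (a stand-in for a real model)."""
    model = {}
    for features, label in examples:
        model[features] = label
    return model

def predict(model, features):
    """Return the learned label, or 'unknown' for features never seen in training."""
    return model.get(features, "unknown")

model = train(training_data)
print(predict(model, (4, True)))   # square
print(predict(model, (5, True)))   # unknown
```

A real model would generalise beyond the exact feature pairs it has seen; the lookup table here only shows the train-then-predict flow.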

Steps Involved in Supervised Learning:


o First, determine the type of training dataset.
o Collect/gather the labelled training data.
o Split the labelled dataset into a training dataset, a
validation dataset, and a test dataset.
o Determine the input features of the training dataset, which
should carry enough information for the model to predict
the output accurately.
o Determine a suitable algorithm for the model, such as a
support vector machine, decision tree, etc.
o Execute the algorithm on the training dataset. Sometimes
we also need a validation set, held out from the training
data, to tune control parameters (hyperparameters).
o Evaluate the accuracy of the model on the test set. If the
model predicts the correct output for most test examples,
the model is accurate.
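
The splitting and evaluation steps can be sketched as follows. This is only an illustration under assumptions: the 60/20/20 proportions, the function names `split_dataset` and `accuracy`, and the toy data are all invented for the example and are not prescribed by these notes.

```python
import random

def split_dataset(examples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle labelled examples and split them into train, validation and test lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

def accuracy(predict, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(1 for x, y in examples if predict(x) == y)
    return correct / len(examples)

# Invented toy data: the label is 1 when the single input feature is at least 5.
data = [(x, int(x >= 5)) for x in range(10)]
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))  # 6 2 2

# A hand-written rule standing in for a trained model.
print(accuracy(lambda x: int(x >= 5), test))  # 1.0
```

Keeping the test set separate from the training and validation sets is what makes the final accuracy an honest estimate on unseen data.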

Types of Supervised Machine Learning Algorithms:

Supervised learning can be further divided into two types of
problems:

1. Regression

Regression algorithms are used when the output variable is continuous
and there is a relationship between the input variable and the output
variable. They are used for the prediction of continuous quantities,
such as in weather forecasting, market trends, etc. Below are some
popular regression algorithms which come under supervised learning:

o Linear Regression
o Regression Trees
o Non-Linear Regression
o Bayesian Linear Regression
o Polynomial Regression
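
As a small illustration of the first item in the list, linear regression with one input variable can be fitted with ordinary least squares. The function name `fit_line` and the data are assumptions made up for this sketch.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)          # 2.0 1.0
print(slope * 4.0 + intercept)   # 9.0 (prediction for a new input x = 4)
```
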
2. Classification

Classification algorithms are used when the output variable is
categorical, which means it takes one of two or more classes, such
as Yes-No, Male-Female, True-False, etc. Spam filtering is a typical
example. Below are some popular classification algorithms which come
under supervised learning:

o Random Forest
o Decision Trees
o Logistic Regression
o Support Vector Machines
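
To make the idea of a classifier concrete, here is a sketch of a decision stump, the simplest possible decision tree: a single threshold on a single feature. The function names, the spam-word feature, and the data are all invented for illustration.

```python
def fit_stump(xs, labels):
    """Pick the threshold on one feature that misclassifies the fewest examples.

    The stump predicts class 1 when x >= threshold and class 0 otherwise.
    """
    best_threshold, best_errors = None, len(xs) + 1
    for threshold in sorted(set(xs)):
        errors = sum(1 for x, y in zip(xs, labels) if int(x >= threshold) != y)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def predict(threshold, x):
    """Classify a single value with the learned threshold."""
    return int(x >= threshold)

# Invented data: suspicious words per email, and whether it was spam (1) or not (0).
suspicious_words = [0, 1, 2, 6, 7, 9]
is_spam = [0, 0, 0, 1, 1, 1]
threshold = fit_stump(suspicious_words, is_spam)
print(threshold)               # 6
print(predict(threshold, 8))   # 1
print(predict(threshold, 1))   # 0
```

Full decision trees and random forests repeat this kind of split many times over many features.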

Advantages of Supervised learning:


o With the help of supervised learning, the model can predict
the output on the basis of prior experience.
o In supervised learning, we can have an exact idea about the
classes of objects.
o Supervised learning models help us to solve various
real-world problems such as fraud detection, spam filtering,
etc.

Disadvantages of supervised learning:


o Supervised learning models can struggle with very complex
tasks.
o Supervised learning may not predict the correct output if
the test data differs substantially from the training data.
o Training requires a lot of computation time.
o In supervised learning, we need enough knowledge about the
classes of objects.

Unsupervised Machine Learning


In the previous topic, we learned about supervised machine learning,
in which models are trained using labeled data that acts as the
supervisor. But there may be many cases in which we do not have
labeled data and need to find the hidden patterns in the given
dataset. To solve such cases in machine learning, we need
unsupervised learning techniques.

What is Unsupervised Learning?


As the name suggests, unsupervised learning is a machine learning
technique in which models are not supervised using a labelled
training dataset. Instead, the models themselves find the hidden
patterns and insights in the given data. It can be compared to the
learning which takes place in the human brain while learning new
things. It can be defined as:

Unsupervised learning is a type of machine learning in which
models are trained using an unlabeled dataset and are allowed to
act on that data without any supervision.

Unsupervised learning cannot be directly applied to a regression
or classification problem because, unlike supervised learning, we
have the input data but no corresponding output data. The goal of
unsupervised learning is to find the underlying structure of the
dataset, group the data according to similarities, and represent
the dataset in a compressed format.

Example: Suppose the unsupervised learning algorithm is given an
input dataset containing images of different types of cats and
dogs. The algorithm is never told which images are cats and which
are dogs, which means it has no prior idea about the features of
the dataset. The task of the unsupervised learning algorithm is to
identify the image features on its own. It performs this task by
clustering the image dataset into groups according to the
similarities between images.

Why use Unsupervised Learning?


Below are some main reasons which describe the importance of
Unsupervised Learning:

o Unsupervised learning is helpful for finding useful
insights from the data.
o Unsupervised learning is similar to how a human learns to
think from their own experiences, which makes it closer to
real AI.
o Unsupervised learning works on unlabeled and uncategorized
data, which makes it all the more important.
o In the real world, we do not always have input data with the
corresponding output, so to solve such cases, we need
unsupervised learning.

Working of Unsupervised Learning


Working of unsupervised learning can be understood by the below
diagram:

Here, we have taken unlabeled input data, which means it is not
categorized and the corresponding outputs are not given. This
unlabeled input data is fed to the machine learning model in order
to train it. First, the model interprets the raw data to find the
hidden patterns in it, and then a suitable algorithm is applied,
such as k-means clustering, hierarchical clustering, etc.

Once the suitable algorithm is applied, it divides the data
objects into groups according to the similarities and differences
between the objects.
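
The grouping step can be illustrated with a very small k-means sketch on one-dimensional data. The function name `kmeans_1d`, the starting centroids, and the data points are assumptions chosen for the example; practical implementations handle many dimensions and choose starting centroids more carefully.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Very small k-means for 1-D data, starting from the given initial centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster ends up empty
                centroids[i] = sum(cluster) / len(cluster)
    return centroids, clusters

# Invented unlabeled data with two visible groups.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

No labels are used anywhere: the two groups emerge purely from the distances between the points.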

Types of Unsupervised Learning Algorithm:


The unsupervised learning algorithm can be further categorized into
two types of problems:

o Clustering: Clustering is a method of grouping objects into
clusters such that objects with the most similarities remain
in one group and have few or no similarities with the objects
of another group. Cluster analysis finds the commonalities
between the data objects and categorizes them as per the
presence and absence of those commonalities.
o Association: An association rule is an unsupervised
learning method used for finding relationships between
variables in a large database. It determines the sets of
items that occur together in the dataset. Association rules
make marketing strategy more effective. For example, people
who buy item X (say, bread) also tend to purchase item Y
(butter/jam). A typical example of association rule learning
is Market Basket Analysis.
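
The bread-and-butter example can be quantified with two standard measures, support and confidence. These measure names do not appear in the notes and are introduced here as an assumption, as are the function names and the invented shopping baskets.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Invented shopping baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk", "jam"},
]
print(support(baskets, {"bread", "butter"}))       # 0.5
print(confidence(baskets, {"bread"}, {"butter"}))  # about 0.667
```

Here two of the four baskets contain both bread and butter, and two of the three bread baskets also contain butter, so the rule "bread -> butter" holds about two times out of three.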

Unsupervised Learning algorithms:


Below is the list of some popular unsupervised learning algorithms:

o K-means clustering
o KNN (k-nearest neighbors); normally a supervised method, though
nearest-neighbor distances are also used in unsupervised tasks
such as anomaly detection
o Hierarchical clustering
o Anomaly detection
o Neural Networks
o Principal Component Analysis
o Independent Component Analysis
o Apriori algorithm
o Singular value decomposition

Advantages of Unsupervised Learning


o Unsupervised learning is used for more complex tasks as
compared to supervised learning because, in unsupervised
learning, we don't have labeled input data.
o Unsupervised learning is preferable as it is easy to get
unlabeled data in comparison to labeled data.

Disadvantages of Unsupervised Learning


o Unsupervised learning is intrinsically more difficult than
supervised learning as it does not have corresponding
output.
o The result of the unsupervised learning algorithm might be
less accurate as input data is not labeled, and algorithms
do not know the exact output in advance.

For 3rd block:

V(s1) = max [R(s, a) + γV(s')], here γ = 0.9 (say), V(s') = 0.9, and
R(s, a) = 0, because there is no reward at this state either.

V(s1) = max[0 + 0.9 × 0.9] => V(s1) = max[0.81] => V(s1) = 0.81

For 4th block:

V(s5) = max [R(s, a) + γV(s')], here γ = 0.9 (say), V(s') = 0.81, and
R(s, a) = 0, because there is no reward at this state either.

V(s5) = max[0 + 0.9 × 0.81] => V(s5) = max[0.729] => V(s5) ≈ 0.73

For 5th block:

V(s9) = max [R(s, a) + γV(s')], here γ = 0.9 (say), V(s') = 0.73, and
R(s, a) = 0, because there is no reward at this state either.

V(s9) = max[0 + 0.9 × 0.73] => V(s9) = max[0.657] => V(s9) ≈ 0.66
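
The block-by-block arithmetic can be checked with a short sketch. The helper name `bellman_backup` is an assumption; each call applies V(s) = max [R(s, a) + γV(s')] for a state with a single useful action and zero immediate reward, as in the calculations here.

```python
def bellman_backup(options, gamma):
    """V(s) = max over actions of [R(s, a) + gamma * V(s')].

    `options` holds one (reward, next_state_value) pair per available action.
    """
    return max(reward + gamma * next_value for reward, next_value in options)

gamma = 0.9
v_s1 = bellman_backup([(0.0, 0.9)], gamma)    # neighbouring state has value 0.9
v_s5 = bellman_backup([(0.0, v_s1)], gamma)   # next block back along the path
v_s9 = bellman_backup([(0.0, v_s5)], gamma)   # one block further back
print(round(v_s1, 2), round(v_s5, 2), round(v_s9, 2))  # 0.81 0.73 0.66
```

Each step back from the goal multiplies the value by γ, which is why states further from the reward are worth less.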

Consider the below image:

Now, we will move further to the 6th block. Here the agent may
change the route, because it always tries to find the optimal path.
So now, let's consider the block next to the fire pit.

Now, the agent has three options to move: if he moves to the blue
box, he will feel a bump; if he moves to the fire pit, he will get
the -1 reward. But here we are taking only positive rewards, so he
will move upwards only. The complete block values will be calculated
using this formula. Consider the below image:

Types of Reinforcement learning


There are mainly two types of reinforcement learning, which are:

o Positive Reinforcement
o Negative Reinforcement

Positive Reinforcement:

Positive reinforcement means adding something to increase the
tendency that the expected behavior will occur again. It impacts
the behavior of the agent positively and increases the strength of
the behavior.

This type of reinforcement can sustain the changes for a long time,
but too much positive reinforcement may lead to an overload of
states that can diminish the results.

Negative Reinforcement:

Negative reinforcement is the opposite of positive reinforcement:
it increases the tendency that the specific behavior will occur
again by avoiding a negative condition.

It can be more effective than positive reinforcement depending
on the situation and behavior, but it provides reinforcement only to
meet the minimum behavior.

How to represent the agent state?


We can represent the agent state using the Markov state, which
contains all the required information from the history. The state
S_t is a Markov state if it satisfies the following condition:

P[S_{t+1} | S_t] = P[S_{t+1} | S_1, ..., S_t]

The Markov state follows the Markov property, which says that, given
the present, the future is independent of the past. The RL setting
considered here works on fully observable environments, where the
agent can observe the environment and act to reach a new state. The
complete process is known as the Markov Decision Process, which is
explained below:

Markov Decision Process


A Markov Decision Process, or MDP, is used to formalize
reinforcement learning problems. If the environment is completely
observable, then its dynamics can be modeled as a Markov process.
In an MDP, the agent constantly interacts with the environment and
performs actions; at each action, the environment responds and
generates a new state.

MDP is used to describe the environment for RL, and almost all
RL problems can be formalized using an MDP.

An MDP is a tuple of four elements (S, A, P_a, R_a):

o A finite set of states S
o A finite set of actions A
o The probability P_a of transitioning from state S to state
S' due to action a
o The reward R_a received after transitioning from state S to
state S' due to action a

MDP uses the Markov property, so to better understand the MDP, we
need to learn about that property.

Markov Property:
It says that "If the agent is present in the current state s1,
performs an action a1, and moves to the state s2, then the state
transition from s1 to s2 depends only on the current state and the
action taken; future actions and states do not depend on past
actions, rewards, or states."

Or, in other words, as per the Markov property, the current state
transition does not depend on any past action or state. Hence, an
MDP is an RL problem that satisfies the Markov property. For
example, in a chess game, the players only focus on the current
state and do not need to remember past actions or states.

Finite MDP:

A finite MDP is when there are finite states, finite rewards, and
finite actions. In RL, we consider only the finite MDP.
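
A finite MDP can be written down directly as a small data structure. The sketch below is one possible representation under assumptions: the class name `FiniteMDP`, its field names, and the two-state example are invented, and the dictionaries play the role of the transition probabilities and rewards described for the MDP tuple.

```python
from dataclasses import dataclass

@dataclass
class FiniteMDP:
    states: list
    actions: list
    # transitions[(s, a)] maps each next state s' to its transition probability.
    transitions: dict
    # rewards[(s, a, s')] is the reward for that transition (0 if absent).
    rewards: dict

    def expected_reward(self, state, action):
        """Expected immediate reward of taking `action` in `state`."""
        return sum(
            prob * self.rewards.get((state, action, next_state), 0.0)
            for next_state, prob in self.transitions[(state, action)].items()
        )

# Invented two-state example.
mdp = FiniteMDP(
    states=["s1", "s2"],
    actions=["stay", "go"],
    transitions={
        ("s1", "stay"): {"s1": 1.0},
        ("s1", "go"): {"s2": 0.8, "s1": 0.2},
        ("s2", "stay"): {"s2": 1.0},
        ("s2", "go"): {"s1": 1.0},
    },
    rewards={("s1", "go", "s2"): 1.0},
)
print(mdp.expected_reward("s1", "go"))  # 0.8
```
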

Markov Process:
A Markov process is a memoryless process with a sequence of random
states S_1, S_2, ..., S_t that satisfies the Markov property. A
Markov process is also known as a Markov chain, which is a tuple
(S, P) of a state set S and a transition function P. These two
components (S and P) define the dynamics of the system.
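
A Markov chain (S, P) can be simulated in a few lines. The weather states, the probabilities, and the function name `simulate_chain` are assumptions invented for this sketch; the point is that each step looks only at the current state.

```python
import random

def simulate_chain(transition, start, steps, seed=0):
    """Sample a state sequence; each next state depends only on the current state."""
    rng = random.Random(seed)
    state = start
    path = [state]
    for _ in range(steps):
        next_states = list(transition[state].keys())
        probabilities = list(transition[state].values())
        state = rng.choices(next_states, weights=probabilities, k=1)[0]
        path.append(state)
    return path

# Invented weather chain: S = {sunny, rainy}, P given as nested dictionaries.
P = {
    "sunny": {"sunny": 0.9, "rainy": 0.1},
    "rainy": {"sunny": 0.5, "rainy": 0.5},
}
path = simulate_chain(P, start="sunny", steps=5)
print(path)
```
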

Reinforcement Learning Algorithms


Reinforcement learning algorithms are mainly used in AI
applications and gaming applications. The main used algorithms are:

o Q-Learning:
o Q-learning is an Off policy RL algorithm, which is
used for the temporal difference Learning. The
temporal difference learning methods are the way of
comparing temporally successive predictions.
MACHINE LEARNING NOTES
o It learns the value function Q (S, a), which means how
good to take action "a" at a particular state "s."
o The below flowchart explains the working of Q-
learning:

o State Action Reward State action (SARSA):


o SARSA stands for State Action Reward State action,
which is an on-policy temporal difference learning
method. The on-policy control method selects the
action for each state while learning using a specific
policy.
o The goal of SARSA is to calculate the Q π (s, a) for
the selected current policy π and all pairs of (s-a).
o The main difference between Q-learning and SARSA
algorithms is that unlike Q-learning, the maximum
reward for the next state is not required for updating
the Q-value in the table.
MACHINE LEARNING NOTES
o In SARSA, new action and reward are selected using the
same policy, which has determined the original action.
o The SARSA is named because it uses the quintuple Q(s,
a, r, s', a'). Where,
s: original state
a: Original action
r: reward observed while following the states
s' and a': New state, action pair.
o Deep Q-Network (DQN):
o As the name suggests, DQN is Q-learning that uses a
neural network.
o For an environment with a large state space, defining
and updating a Q-table becomes a challenging and
complex task.
o A DQN addresses this issue: instead of defining a
Q-table, a neural network approximates the Q-value of
each action in a given state.
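
The difference between the Q-learning and SARSA updates listed above
can be sketched as two small Python functions. This is a minimal
sketch, not a reference implementation: the dictionary-based Q-table,
the learning rate alpha, the default values, and the function names
are assumptions introduced for illustration.

```python
def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy TD update: bootstrap from the best action in s_next."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q[(s, a)]

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD update: bootstrap from the action a_next actually chosen."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q[(s, a)]

if __name__ == "__main__":
    actions = ["left", "right"]
    # Hypothetical Q-table: state 1 already has estimates, state 0 has none.
    Q = {(0, "left"): 0.0, (0, "right"): 0.0,
         (1, "left"): 1.0, (1, "right"): 5.0}
    Q_sarsa = dict(Q)
    # Same transition (s=0, a="right", r=1.0, s'=1) under both rules.
    print(q_learning_update(Q, 0, "right", 1.0, 1, actions))
    print(sarsa_update(Q_sarsa, 0, "right", 1.0, 1, "left"))
```

For the same transition, Q-learning uses the largest Q-value in the
next state, while SARSA uses the Q-value of the action a' that the
policy actually picked, so the two rules can move Q(s, a) by
different amounts.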
Now, we will expand on Q-learning.

Q-Learning Explanation:
o Q-learning is a popular model-free reinforcement learning
algorithm based on the Bellman equation.
o The main objective of Q-learning is to learn a policy that
tells the agent which action to take under which
circumstances to maximize the reward.
o It is an off-policy RL algorithm that attempts to find the
best action to take in the current state.
o The goal of the agent in Q-learning is to maximize the
value of Q.
o The Q-value used in Q-learning can be derived from the
Bellman equation. Consider the Bellman equation given below:

V(s) = max_a [ R(s, a) + γ Σ_s' P(s, a, s') V(s') ]

The equation has several components: the reward R(s, a), the
discount factor (γ), the transition probability P(s, a, s'), and
the end states s'. It does not yet contain a Q-value, so first
consider the image below:

In the above image, an agent has three value options: V(s1),
V(s2), and V(s3). Because this is an MDP, the agent cares only
about the current state and the future state. The agent can move
in any direction (Up, Left, or Right), so it needs to decide where
to go to follow the optimal path. Here the agent moves according
to the transition probabilities and changes state. To choose
specific moves instead, we need to restate the problem in terms of
the Q-value. Consider the image below:

Q represents the quality of the actions at each state. So instead
of using a value at each state, we use a pair of state and action,
i.e., Q(s, a). The Q-value specifies which action is more
lucrative than the others, and the agent takes its next move
according to the best Q-value. The Bellman equation can be used
for deriving the Q-value.

To perform any action, the agent gets a reward R(s, a) and ends
up in a certain state s', so the Q-value equation will be:

Q(s, a) = R(s, a) + γ Σ_s' P(s, a, s') V(s')

Hence, we can say that V(s) = max_a [Q(s, a)], which gives:

Q(s, a) = R(s, a) + γ Σ_s' P(s, a, s') max_a' Q(s', a')

The above formula is used to estimate the Q-values in Q-Learning.
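
The relation V(s) = max [Q(s, a)] can be checked numerically on a
tiny MDP. The sketch below is illustrative only: the two states, the
"stay" and "go" actions, the rewards, the transition probabilities,
and the value γ = 0.9 are all made-up assumptions, and the backup
used is the standard Bellman form Q(s, a) = R(s, a) + γ Σ P V(s').

```python
GAMMA = 0.9  # discount factor (assumed value)

# Hypothetical MDP: P[s][a] is a list of (probability, next_state),
# R[s][a] is the immediate reward for taking action a in state s.
P = {
    "s1": {"stay": [(1.0, "s1")], "go": [(0.8, "s2"), (0.2, "s1")]},
    "s2": {"stay": [(1.0, "s2")], "go": [(1.0, "s1")]},
}
R = {
    "s1": {"stay": 0.0, "go": 0.0},
    "s2": {"stay": 1.0, "go": 0.0},
}

def q_value(V, s, a):
    """Q(s, a) = R(s, a) + gamma * sum over s' of P(s, a, s') * V(s')."""
    return R[s][a] + GAMMA * sum(p * V[s2] for p, s2 in P[s][a])

def value_iteration(n_sweeps=200):
    """Repeatedly apply V(s) = max_a Q(s, a), then read off the Q-values."""
    V = {s: 0.0 for s in P}
    for _ in range(n_sweeps):
        V = {s: max(q_value(V, s, a) for a in P[s]) for s in P}
    Q = {(s, a): q_value(V, s, a) for s in P for a in P[s]}
    return V, Q

if __name__ == "__main__":
    V, Q = value_iteration()
    for (s, a), q in sorted(Q.items()):
        print(s, a, round(q, 3))
```

After enough sweeps, the largest Q-value in each state matches V(s),
and the action that attains it is the move the agent should prefer.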

What is 'Q' in Q-learning?

The Q stands for quality in Q-learning, which means it specifies
the quality of an action taken by the agent.
Q-table:
A Q-table or matrix is created while performing Q-learning. The
table is indexed by state-action pair, i.e., [s, a], and its
values are initialized to zero. After each action, the table is
updated, and the Q-values are stored within the table.

The RL agent uses this Q-table as a reference table to select the
best action based on the Q-values.
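
The Q-table workflow described above (initialize to zero, update
after each action, look up the best action) can be sketched with
tabular Q-learning. This is a minimal sketch under stated
assumptions: the five-state corridor environment, the epsilon-greedy
exploration rule, and the values of alpha, gamma, and epsilon are
hypothetical choices made for the example, not part of these notes.

```python
import random

N_STATES = 5       # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = [0, 1]   # 0 = move left, 1 = move right

def env_step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # Q-table indexed as q_table[s][a], initialized to zero.
    q_table = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)          # explore
            else:
                best = max(q_table[state])            # exploit, random tie-break
                action = rng.choice(
                    [a for a in ACTIONS if q_table[state][a] == best])
            next_state, reward, done = env_step(state, action)
            # No future value is added once the goal state is reached.
            future = 0.0 if done else gamma * max(q_table[next_state])
            q_table[state][action] += alpha * (
                reward + future - q_table[state][action])
            state = next_state
    return q_table

if __name__ == "__main__":
    for s, row in enumerate(train()):
        print(s, [round(q, 3) for q in row])
```

Once training finishes, the agent selects its move in state s by
taking the action with the largest entry in row s of the table.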

Difference between Reinforcement Learning and Supervised Learning
Reinforcement Learning and Supervised Learning are both parts of
machine learning, but the two types of learning differ sharply.
RL agents interact with the environment, explore it, take actions,
and get rewarded, whereas supervised learning algorithms learn
from a labeled dataset and, on the basis of that training, predict
the output.

The differences between RL and Supervised Learning are given in
the table below:

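Reinforcement Learning               | Supervised Learning
-------------------------------------+------------------------------------
RL works by interacting with the     | Supervised learning works on an
environment.                         | existing dataset.
-------------------------------------+------------------------------------
The RL algorithm works like the      | Supervised learning works like a
human brain does when making some    | human learning under the
decisions.                           | supervision of a guide.
-------------------------------------+------------------------------------
No labeled dataset is present.       | A labeled dataset is present.
-------------------------------------+------------------------------------
No previous training is provided to  | Training is provided to the
the learning agent.                  | algorithm so that it can predict
                                     | the output.
-------------------------------------+------------------------------------
RL helps to take decisions           | In supervised learning, a decision
sequentially.                        | is made when an input is given.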

Reinforcement Learning Applications

1. Robotics:
o RL is used in robot navigation, Robo-soccer, walking,
juggling, etc.
2. Control:
o RL can be used for adaptive control, such as factory
processes and admission control in telecommunication.
Piloting a helicopter is another example of reinforcement
learning.
3. Game Playing:
o RL can be used in game playing, such as tic-tac-toe, chess,
etc.
4. Chemistry:
o RL can be used for optimizing chemical reactions.
5. Business:
o RL is now used for business strategy planning.
6. Manufacturing:
o In various automobile manufacturing companies, robots use
deep reinforcement learning to pick goods and put them in
containers.
7. Finance Sector:
o RL is currently used in the finance sector for evaluating
trading strategies.

Conclusion:
From the above discussion, we can say that Reinforcement Learning
is one of the most interesting and useful parts of machine
learning. In RL, the agent learns by exploring the environment
without any human intervention. It is one of the main learning
approaches used in Artificial Intelligence. But there are some
cases where it should not be used: for example, if you have enough
data to solve the problem, other ML algorithms can be used more
efficiently. The main issue with RL algorithms is that some
factors, such as delayed feedback, may affect the speed of
learning.
