ML - Unit-3 - Reinforcement Learning
Unit- 3
Dr. Prabhakaran
Assistant Professor, Department of Computer Application
1. Understand RL task formulation.
2. Understand Tabular based solutions.
3. Identify Function approximation solutions.
4. Devise Policy gradient from basic (REINFORCE)
towards advanced topics.
5. Understand Model-based reinforcement
learning.
INTRODUCTION TO RL & MARKOV DECISION
PROCESS
MODEL-FREE PREDICTION & MODEL-FREE
CONTROL
VALUE FUNCTION APPROXIMATION & POLICY
GRADIENT METHODS
INTEGRATING PLANNING WITH LEARNING &
HIERARCHICAL RL
DEEP RL & MULTI-AGENT RL
INTRODUCTION TO RL & MARKOV DECISION
PROCESS
User Interaction
Control
Finance
Technology
RL – Applications and Scope
Artificial Neural Networks – car sales prediction
Deep Neural Networks – classification
Prophet time series – crime rate
Prophet time series – tomato / crops
LeNet deep network – traffic sign classification
NLP – email spam filters
NLP – reviews
User-based collaborative filtering – recommendation
Taxonomy of AI
Model-Free
Model Based
Value Based
Policy Based
Off Policy
On Policy
Learning Comparison
Supervised learning:
A situation in which sample (input, output) pairs of the function to be learned
can be perceived.
Unsupervised learning
Hidden patterns in the data can be found using the unsupervised learning model.
Reinforcement Learning
When the agent acts on its environment, it receives some evaluation of its
action (the reinforcement), but it is not told which action is the correct one
to achieve its goal.
Learning Comparison
RL model
Each percept (e) is enough to determine the state (the
state is accessible)
The agent can decompose the reward component
from a percept.
The agent's task is to find an optimal policy, mapping
states to actions, that maximizes a long-run measure of
the reinforcement
Think of the reinforcement as a reward
This can be modelled as an MDP!
Markov Decision Process
Control Tasks
State (St)
Action (At)
Rewards (Rt)
Agent
Environment
Markov Decision Process
- Templates
- MDP
Discrete-time stochastic control process
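For reference, and following the standard textbook notation (the tuple is not spelled out on the slide itself), an MDP can be written as

M = (S, A, P, R, \gamma)

where S is the set of states, A the set of actions, P(s' \mid s, a) the state-transition probabilities, R(s, a) the reward function, and \gamma the discount factor with 0 \le \gamma \le 1.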
Markov Decision Process
Finite vs Infinite
Markov Decision Process
Episodic vs Continuing
Markov Decision Process
Trajectory vs Episode
Markov Decision Process
Rewards and Returns
Markov Decision Process
Discount Factor
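Using the standard definitions, the return G_t is the discounted sum of future rewards:

G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \dots = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}, \qquad 0 \le \gamma \le 1

A discount factor close to 0 makes the agent myopic; a value close to 1 makes it weigh future rewards almost as strongly as immediate ones.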
Markov Decision Process
Policy
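In the usual notation, a policy maps states to (a distribution over) actions:

\pi(a \mid s) = P(A_t = a \mid S_t = s)   (stochastic policy)
a = \pi(s)   (deterministic policy)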
Markov Decision Process
State Values
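The state-value function of a policy \pi is the expected return when starting in state s and following \pi thereafter:

v_\pi(s) = E_\pi[ G_t \mid S_t = s ] = E_\pi[ \sum_{k=0}^{\infty} \gamma^k R_{t+k+1} \mid S_t = s ]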
Markov Decision Process
Bellman Equation
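In its standard form, the Bellman expectation equation writes v_\pi recursively in terms of its successor states:

v_\pi(s) = \sum_{a} \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a) [ r + \gamma v_\pi(s') ]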
Solving MDP
MDP – Bellman Optimality Equations
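The Bellman optimality equations replace the expectation over the policy with a maximum over actions:

v_*(s) = \max_a \sum_{s', r} p(s', r \mid s, a) [ r + \gamma v_*(s') ]
q_*(s, a) = \sum_{s', r} p(s', r \mid s, a) [ r + \gamma \max_{a'} q_*(s', a') ]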
MODEL-FREE PREDICTION
Model-free reinforcement learning is a category of reinforcement learning algorithms that do not require a model of the environment to operate. Model-free algorithms learn directly from experience or trial-and-error and use the feedback they receive to update their internal policies or value functions.
MODEL-FREE PREDICTION
Model-free prediction is the problem of estimating the value function of a given policy without a model of the environment.
The simplest method is Monte Carlo learning.
MODEL-FREE PREDICTION
The main benefit of the model-free approach is its computational efficiency. Because of this low computational demand, a model-free algorithm can usually support a larger representation than a model-based algorithm.
'Model-Free' reinforcement learning algorithms
Monte Carlo Methods
1. MC Prediction
2. MC Estimation of Action Value
3. MC Control
4. MC Control without Exploring Starts
5. Off-Policy Prediction via Importance Sampling
6. Incremental Implementations
7. Off-Policy MC Control
8. Discounting-aware importance sampling
Monte Carlo Methods
- Estimating value functions
- Discovering optimal policies
# Do not assume complete knowledge of the environment; they require only experience.
# Learning from actual experience needs no prior knowledge of the environment's dynamics.
# Although a model may still be required, it only has to generate sample transitions, not the complete probability distributions.
# Value estimates are obtained by averaging the sample returns (see the sketch after this list).
- To ensure that well-defined returns are available, they are defined for episodic tasks, i.e. experience is divided into episodes.
- Incremental episode-by-episode, not step-by-step.
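To make "averaging the sample returns" concrete, below is a minimal sketch of first-visit Monte Carlo prediction in Python. It assumes a hypothetical generate_episode() callable that runs the policy being evaluated for one episode and returns a list of (state, reward) pairs; it is an illustration of the idea, not a reference implementation.

from collections import defaultdict

def mc_prediction(generate_episode, num_episodes, gamma=0.9):
    # Running totals used to average the sample returns per state.
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    V = defaultdict(float)  # value estimate for each visited state

    for _ in range(num_episodes):
        # One episode = [(state, reward), ...], where reward is the reward
        # received after leaving that state (R_{t+1} in the notation above).
        episode = generate_episode()
        G = 0.0
        # Walk the episode backwards, accumulating the discounted return.
        for t in reversed(range(len(episode))):
            state, reward = episode[t]
            G = gamma * G + reward
            # First-visit MC: only use the return from the first occurrence
            # of the state in this episode.
            if state not in (s for s, _ in episode[:t]):
                returns_sum[state] += G
                returns_count[state] += 1
                V[state] = returns_sum[state] / returns_count[state]
    return V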
Monte Carlo Prediction
1. Prediction One
2. Prediction Two