Lecture Reinforcement Learning

Edge AI and Robotics Teaching Kit

Lecture 5.1
Reinforcement Learning
The Edge AI and Robotics Teaching Kit is licensed by NVIDIA and UMBC under the
Creative Commons Attribution-NonCommercial 4.0 International License.

Topics

• Describe the concept of Reinforcement Learning
• Reinforcement Learning Algorithms and Approaches
• Deep Learning
• States, Actions, Rewards
• Lab and Example Environments

Learning Objectives - Reinforcement Learning

Explain the concepts of Reinforcement Learning
Explain different reinforcement learning approaches
Describe DQN and how Q-Learning is leveraged
Gain hands-on experience training agents using sample environments in OpenAI Gym
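
As a reminder of how Q-Learning feeds into DQN, the two standard formulas can be written side by side. This is a sketch in common textbook notation, not taken from the slides: the learning rate \alpha, the network parameters \theta, and the target-network parameters \theta^{-} are symbols assumed here.

```latex
% Tabular Q-learning update (learning rate \alpha, discount factor \gamma)
Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right]

% DQN: the table is replaced by a network Q(s,a;\theta),
% trained to minimize the squared temporal-difference error
L(\theta) = \mathbb{E}_{(s,a,r,s')} \left[ \left( r + \gamma \max_{a'} Q(s',a';\theta^{-}) - Q(s,a;\theta) \right)^{2} \right]
```

Both expressions use the same bootstrapped target, r + \gamma \max_{a'} Q(s',a'); DQN only changes how Q is represented and fitted.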

Reinforcement Learning Concepts

Concepts

• Environment – attributes
• Agents
• States/Actions
• Learning – policies, functions, models
• Objective
• Rewards

© D. Poole and A. Mackworth 2019, Artificial Intelligence: Foundations of Computational Agents
Reinforcement Learning

[Figure: diagram with Agent and Environment blocks]

Reinforcement Learning

What should an agent do given:

• Prior knowledge – possible states, baseline, possible actions
• Observations – current state, immediate reward
• Goal – an optimal sequence of actions that maximizes the mean cumulative discounted reward

We can train this agent by approximating its environment.
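
The "cumulative discounted reward" in the goal above can be made concrete with a few lines of Python. This is a minimal sketch: the function name `discounted_return` and the sample rewards are illustrative assumptions, not part of the lecture.

```python
def discounted_return(rewards, gamma):
    """Return sum over t of gamma**t * rewards[t] for one episode."""
    total = 0.0
    # Work backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episode: no reward for two steps, then +1 at the end.
episode_rewards = [0.0, 0.0, 1.0]
print(discounted_return(episode_rewards, gamma=0.9))
```

Averaging this quantity over many episodes gives the mean cumulative discounted reward that the agent tries to maximize.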

© D. Poole and A. Mackworth 2019, Artificial Intelligence: Foundations of Computational Agents
Reinforcement Learning Loop

Figure 1.2 The reinforcement learning control loop

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
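
The control loop (the agent observes a state, chooses an action, and the environment returns a reward and the next state) can be sketched without any library. The environment `EchoEnv` below is a hypothetical toy invented for illustration; only the shape of the loop in `run_episode` is the point.

```python
import random


class EchoEnv:
    """Hypothetical toy environment: reward 1 when the action equals the observed bit."""

    def __init__(self, horizon=10, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0
        self.state = 0

    def reset(self):
        self.t = 0
        self.state = self.rng.randint(0, 1)
        return self.state

    def step(self, action):
        reward = 1.0 if action == self.state else 0.0
        self.t += 1
        self.state = self.rng.randint(0, 1)  # next observation
        done = self.t >= self.horizon
        return self.state, reward, done


def run_episode(env, policy):
    """One pass through the loop: observe, act, receive reward, repeat until done."""
    state = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(state)                  # agent chooses an action
        state, reward, done = env.step(action)  # environment responds
        total_reward += reward
    return total_reward


print(run_episode(EchoEnv(), policy=lambda s: s))
```

A learning agent would replace the fixed `policy` with one that is updated from the rewards it collects.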
Rewards and Values

Figure 1.4 Rewards r and values V(s) for each state s in a simple grid-world
environment. The value of a state is calculated from the rewards using
Equation 1.10 with γ = 0.9 while using a policy π that always takes the
shortest path to the goal state with r = +1.

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
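
To see how values like those in the figure arise, the sketch below computes V(s) for states at increasing distance from the goal. It assumes (this is not stated on the slide) that the only nonzero reward is the +1 received on the move into the goal, so a state d moves away has value γ^(d-1). The function name `shortest_path_value` is illustrative.

```python
def shortest_path_value(steps_to_goal, gamma=0.9, goal_reward=1.0):
    """Value of a state under the shortest-path policy.

    Assumes reward 0 on every move except the final move into the goal.
    """
    if steps_to_goal <= 0:
        return 0.0  # the goal itself is terminal; nothing more to collect
    return gamma ** (steps_to_goal - 1) * goal_reward


for distance in range(1, 5):
    print(distance, round(shortest_path_value(distance), 4))
```

Values shrink geometrically with distance, which is why states far from the goal look less attractive for γ < 1.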
Approaches

Reinforcement Learning Approaches

Figure 1.5 Deep reinforcement learning algorithm families

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.

Neural Networks Leveraged for RL

Figure 12.4 Neural network families

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
States

Simple Environment

Figure 3.1 Simple environment: five states, two actions per state

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.

Simple Environment

Figure 3.2 Optimal Q-values for the simple environment from Figure 3.1, γ = 0.9

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.

Simple Environment - Learning

Figure 3.3 Learning the Q*(s, a) for the simple environment from Figure 3.1

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
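
Learning Q*(s, a) from experience can be illustrated with tabular Q-learning. The sketch below does not reproduce the environment of Figure 3.1, whose transitions are not given in these notes. Instead it assumes a hypothetical five-state chain with two actions (left, right), where reaching the last state yields reward +1 and ends the episode. The hyperparameters are likewise illustrative.

```python
import random

N_STATES, GOAL = 5, 4   # hypothetical chain: states 0..4, state 4 is terminal
LEFT, RIGHT = 0, 1


def step(state, action):
    """Deterministic transition; reward +1 only when the goal is reached."""
    next_state = min(state + 1, GOAL) if action == RIGHT else max(state - 1, 0)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Random exploration is enough here: Q-learning is off-policy.
            action = rng.choice((LEFT, RIGHT))
            next_state, reward, done = step(state, action)
            # Bootstrapped target; no future value after a terminal state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
for s in range(GOAL):
    print(s, [round(v, 3) for v in q_table[s]])
```

In this toy chain the learned values for moving right approach 1, γ, γ², and γ³ as the state gets farther from the goal, so the greedy policy always moves right.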
Simple Environment – Optimal Values

Figure 3.4 Optimal Q-values for the simple environment from Figure 3.1, γ = 0
(left), γ = 1 (right)

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
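
The effect of the two extreme discount factors can be reproduced on a small example. The chain below is hypothetical and is not the environment from Figure 3.1: states 1 to 3 are non-terminal, entering state 0 pays +1, and entering state 4 pays +10. Q-value iteration is used here only because the model is known; the names and rewards are assumptions for illustration.

```python
TERMINALS = {0: 1.0, 4: 10.0}  # hypothetical: reward received on entering a terminal state
ACTIONS = (-1, +1)             # move left or right along states 0..4


def optimal_q(gamma, sweeps=50):
    """Q-value iteration on the hypothetical chain.

    Returns {state: [Q(state, left), Q(state, right)]} for states 1..3.
    """
    q = {s: [0.0, 0.0] for s in (1, 2, 3)}
    for _ in range(sweeps):
        new_q = {}
        for s in q:
            row = []
            for move in ACTIONS:
                nxt = s + move
                if nxt in TERMINALS:
                    row.append(TERMINALS[nxt])       # episode ends, no bootstrapping
                else:
                    row.append(gamma * max(q[nxt]))  # zero step reward plus discounted future
            new_q[s] = row
        q = new_q
    return q


for gamma in (0.0, 1.0):
    print(gamma, optimal_q(gamma))
```

With γ = 0 the agent in state 1 prefers the immediate +1 on the left. With γ = 1 it prefers to walk right toward the delayed +10.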
Processing of Data

Figure 14.2 Information flow from the world to an algorithm

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Reinforcement Learning - GPU

Environment

OpenAI Gym

Gym (openai.com)

For Jetson – GitHub project: dusty-nv/jetson-reinforcement: Deep reinforcement learning GPU libraries for NVIDIA Jetson TX1/TX2 with PyTorch, OpenAI Gym, and Gazebo robotics simulator. (github.com)

Sample notebook tutorial using GPU: jetson-reinforcement/intro-DQN.ipynb at master · dusty-nv/jetson-reinforcement (github.com)

With ROS and Gazebo:
https://github.com/AcutronicRobotics/gym-gazebo2/blob/dashing/docker/README.md

OpenAI Gym - CartPole

Figure 1.1 CartPole-v0 is a simple toy environment. The objective is to balance a pole for 200 time steps by controlling the left-right motion of a cart.

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
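
A random agent is a common first experiment with CartPole. The sketch below assumes the `gym` package is installed and that `CartPole-v0` is registered in that version. Because the return values of `reset()` and `step()` differ between older and newer Gym releases, the helper handles both shapes. The function names `run_random_episode` and `demo` are illustrative.

```python
def run_random_episode(env, max_steps=200):
    """Run one episode with uniformly random actions and return the total reward."""
    result = env.reset()
    # Newer Gym returns (observation, info); older Gym returns only the observation.
    observation = result[0] if isinstance(result, tuple) else result
    total_reward = 0.0
    for _ in range(max_steps):
        action = env.action_space.sample()
        outcome = env.step(action)
        if len(outcome) == 5:
            # Newer API: (observation, reward, terminated, truncated, info)
            observation, reward, terminated, truncated, _ = outcome
            done = terminated or truncated
        else:
            # Older API: (observation, reward, done, info)
            observation, reward, done, _ = outcome
        total_reward += reward
        if done:
            break
    return total_reward


def demo():
    """Not called automatically; requires the gym package (an assumption of this sketch)."""
    import gym

    env = gym.make("CartPole-v0")
    print(run_random_episode(env))
    env.close()
```

Calling `demo()` on a machine with Gym installed prints the reward collected by one random episode. A random agent usually drops the pole well before the 200-step limit, which gives a baseline to compare a trained agent against.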
OpenAI Gym - CartPole

(a) t = 1 (b) t = 2 (c) t = 3 (d) t = 4
Figure 14.7 Four consecutive frames of the CartPole environment

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.

OpenAI Gym - LunarLander

Figure B.3 The LunarLander-v2 environment. The objective is to steer and land the lander between the flags using minimal fuel, without crashing.

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.

OpenAI Gym - Environments

CartPole, Atari Breakout, BipedalWalker

Figure 1.3 Three example environments with different states, actions, and rewards. These environments are available in OpenAI Gym.

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.

Additional Information

Foundations of Deep Reinforcement Learning
https://www.pearson.com/us/higher-education/program/Graesser-Foundations-of-Deep-Reinforcement-Learning-Theory-and-Practice-in-Python/PGM2027228.html

NVIDIA Technical Blog – Deep Learning in a Nutshell: Reinforcement Learning
https://developer.nvidia.com/blog/deep-learning-nutshell-reinforcement-learning/

Thank You
Edge AI and Robotics Teaching Kit
