Results / Interpretation
Figure 1: Reward per episode plotted during training. The increase in average reward
indicates effective learning and convergence of the agent toward a stable policy.
Figure 2: Evaluation result showing the agent achieving the maximum possible reward
consistently across all 10 test episodes.
Figure 3: Snapshot of the CartPole agent during execution. The agent maintains balance
and avoids episode termination by executing a stable policy.
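For reference, the rendered behavior shown in Figure 3 can be reproduced with a short rollout script. The sketch below assumes the trained model was saved to a file named "dqn_cartpole"; that file name is hypothetical and not taken from the report.

    import gymnasium as gym
    from stable_baselines3 import DQN

    # Create the environment with on-screen rendering enabled.
    env = gym.make("CartPole-v1", render_mode="human")
    model = DQN.load("dqn_cartpole", env=env)  # hypothetical file name

    # Roll out one episode with the greedy (deterministic) policy.
    obs, _ = env.reset()
    done = False
    while not done:
        action, _ = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
    env.close()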
4.4 Observations
Based on the training logs, evaluation metrics, and rendered gameplay, the following
observations were made:
- The agent successfully learned a near-optimal policy to solve the CartPole-v1 task.
- The reward curve shows consistent convergence with minimal oscillations or instability.
- The evaluation confirms a 100% episode success rate, with the maximum allowable
reward (500) achieved in each test run.
- Visual inspection validates the effectiveness of the learned policy from a behavioral
standpoint.
- The training setup, though relatively lightweight, was sufficient to yield a high-
performing RL agent in a constrained control setting.
These results collectively demonstrate that Deep Q-Networks, when applied correctly
with tuned hyperparameters, are effective at solving classic control problems like
CartPole.
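As a concrete illustration of such a setup, the following is a minimal training sketch using Gymnasium and Stable Baselines3. The hyperparameter values are illustrative rather than the project's actual tuned configuration, and the log directory and model file names are hypothetical.

    import gymnasium as gym
    from stable_baselines3 import DQN

    env = gym.make("CartPole-v1")

    model = DQN(
        "MlpPolicy",
        env,
        learning_rate=1e-3,            # illustrative value, not the project's tuned setting
        buffer_size=50_000,            # experience replay buffer size
        learning_starts=1_000,         # random steps collected before updates begin
        batch_size=64,
        gamma=0.99,
        train_freq=4,
        target_update_interval=500,    # how often the separate target network is refreshed
        exploration_fraction=0.1,      # fraction of training spent annealing epsilon
        exploration_final_eps=0.05,
        tensorboard_log="./dqn_cartpole_tb/",  # hypothetical log directory
        verbose=1,
    )

    model.learn(total_timesteps=100_000)   # 100,000 training steps, as used in this project
    model.save("dqn_cartpole")             # hypothetical file name

Training progress can then be inspected in TensorBoard by pointing it at the chosen log directory.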
5. Conclusion
This project applied the Deep Q-Network (DQN) algorithm to the CartPole-v1 control
task using a structured reinforcement learning approach. The problem was framed as a
Markov Decision Process (MDP), and the agent's objective was to learn an optimal
control policy for balancing the pole through trial and error using Q-learning with
function approximation.
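For reference, the standard Q-learning target used with function approximation in DQN is

    y = r + \gamma \max_{a'} Q_{\theta^-}(s', a'),
    \qquad
    L(\theta) = \mathbb{E}_{(s, a, r, s') \sim \mathcal{D}} \big[ (y - Q_\theta(s, a))^2 \big]

where the expectation is over transitions sampled from the replay buffer D and theta^- denotes the parameters of the target network, which are updated only periodically.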
The training phase used a neural network to approximate Q-values for state-action
pairs, trained on minibatches sampled from an experience replay buffer, with targets
computed by a separate, periodically updated target network. Over 100,000 training
steps, the agent progressively improved its performance, as evidenced by the
TensorBoard reward graphs and evaluation metrics.
The final model consistently achieved the maximum reward of 500 in every test episode
with zero standard deviation, confirming both convergence and stability. Visual rendering
further supported the effectiveness of the learned policy, showing the agent keeping the
pole balanced for the full length of each rendered episode.
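The evaluation result quoted above (mean reward of 500 with zero standard deviation over 10 test episodes) corresponds to the kind of check sketched below using Stable Baselines3's evaluate_policy helper; the model file name is again hypothetical.

    import gymnasium as gym
    from stable_baselines3 import DQN
    from stable_baselines3.common.evaluation import evaluate_policy

    env = gym.make("CartPole-v1")
    model = DQN.load("dqn_cartpole", env=env)  # hypothetical file name

    # Deterministic evaluation over 10 test episodes, as in Figure 2.
    mean_reward, std_reward = evaluate_policy(
        model, env, n_eval_episodes=10, deterministic=True
    )
    print(f"mean reward: {mean_reward:.1f} +/- {std_reward:.1f}")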
This work successfully aligns with the objectives of the Reinforcement Learning course.
It demonstrates:
- The transition from theoretical understanding to hands-on application,
- Familiarity with popular libraries like Gymnasium and Stable Baselines3,
- Mastery of monitoring, debugging, and evaluating RL models using tools such as
TensorBoard.