Reinforcement Learning in A Id - 12008003

Uploaded by

saminalrashid
Copyright
© All Rights Reserved

PRESENTATION ON A.

Presented By
Abdullah Al Rashid Samir
Id:12008003
WHAT IS ARTIFICIAL INTELLIGENCE (AI)?

• Artificial intelligence is a field of science concerned with
building computers and machines that can reason, learn, and
act in such a way that would normally require human
intelligence or that involves data whose scale exceeds what
humans can analyze.
• AI is a broad field that encompasses many different disciplines,
including computer science, data analytics and statistics,
hardware and software engineering, linguistics, neuroscience,
and even philosophy and psychology.
TYPES OF LEARNING

• 1. Supervised Learning

• 2. Unsupervised Learning

• 3. Reinforcement Learning
SUPERVISED LEARNING

• Supervised learning is a type of machine learning where a
model is trained on labeled data. In supervised learning, each
training example is a pair consisting of an input object (often a
vector) and the desired output value (label). The model learns
a mapping from inputs to the outputs, using this labeled data
to adjust its parameters. The aim is for the model to accurately
predict the output labels for new, unseen data based on the
patterns learned during training.
METHODS OF SUPERVISED
LEARNING

• Classification: Predicting discrete labels, such as identifying
whether an email is spam or not.
• Regression: Predicting continuous values, like estimating
house prices based on features like location, size, and
condition.
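
The difference between the two methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy data, the helper names `fit_line` and `nearest_neighbor_label`, and the choice of a least-squares line and a 1-nearest-neighbour rule are all invented for the example and do not come from the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b


def nearest_neighbor_label(x, train_xs, train_labels):
    """1-nearest-neighbour classification on a single feature."""
    best = min(range(len(train_xs)), key=lambda i: abs(train_xs[i] - x))
    return train_labels[best]


# Regression: house size -> price (a continuous value). Toy numbers.
sizes = [50, 80, 100, 120]
prices = [100, 160, 200, 240]
a, b = fit_line(sizes, prices)
predicted_price = a * 90 + b

# Classification: count of suspicious words -> "spam" / "ham" (a discrete label).
counts = [0, 1, 7, 9]
labels = ["ham", "ham", "spam", "spam"]
predicted_label = nearest_neighbor_label(8, counts, labels)
```

The regression output is a number on a continuous scale, while the classifier can only return one of the labels it saw during training.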
POPULAR ALGORITHMS OF
SUPERVISED LEARNING

• 1. Linear Regression
• 2. Decision Trees
• 3. Support Vector Machines (SVM)
ADVANTAGES OF SUPERVISED
LEARNING

• 1. Can produce highly accurate models when sufficient labeled
data is available.
• 2. Effective for both classification and regression tasks.
• 3. Allows for continuous model improvement with labeled data.
• 4. Performance can be easily evaluated using metrics like
accuracy, precision, and recall.
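
As a rough illustration of the metrics named above, the following sketch computes accuracy, precision, and recall from true and predicted labels. The function name `classification_metrics`, the `positive` parameter, and the example labels are assumptions made for this example.

```python
def classification_metrics(y_true, y_pred, positive="spam"):
    """Return (accuracy, precision, recall) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


y_true = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]
y_pred = ["spam", "spam", "ham", "spam", "ham", "ham", "ham", "ham"]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
```

Here 6 of 8 predictions are correct (accuracy 0.75), and both precision and recall are 2/3 because there is one false positive and one false negative.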
DISADVANTAGES OF SUPERVISED
LEARNING

• 1. Large amounts of labeled data are needed, which can be
time-consuming and expensive to obtain.
• 2. May not perform well on data or tasks that differ from what
it saw during training.
• 3. Models can overfit if trained on noisy or irrelevant data.
• 4. Training can be slow for complex models or large datasets.
UNSUPERVISED LEARNING

• The model is trained on unlabeled data. The goal is to discover
patterns or relationships in the data without explicit labels.
• Unsupervised learning in artificial intelligence is a type of
machine learning that learns from data without human
supervision. Unlike supervised learning, unsupervised machine
learning models are given unlabeled data and allowed to
discover patterns and insights without any explicit guidance or
instruction.
METHODS OF UNSUPERVISED LEARNING

• 1. Clustering
• 2. Association
CLUSTERING

• Clustering is a technique for exploring raw, unlabeled data and
breaking it down into groups (or clusters) based on similarities
or differences. It is used in a variety of applications, including
customer segmentation, fraud detection, and image analysis.
Clustering algorithms split data into natural groups by finding
similar structures or patterns in uncategorized data.
• Clustering is one of the most popular unsupervised machine
learning approaches. There are several types of unsupervised
learning algorithms that are used for clustering, which include
exclusive, overlapping, hierarchical, and probabilistic.
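
A minimal sketch of exclusive clustering, assuming one-dimensional data and a k-means style procedure (assign each point to its nearest centroid, then move each centroid to the mean of its points). The function name `kmeans_1d`, the fixed iteration count, and the sample points are assumptions for illustration.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Tiny k-means on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


points = [1, 2, 3, 10, 11, 12]
centroids, clusters = kmeans_1d(points, centroids=[1, 12])
```

No labels are supplied: the two groups emerge only from the distances between the points.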
ASSOCIATION

• Association rule mining is a rule-based approach to reveal interesting
relationships between data points in large datasets. Unsupervised learning
algorithms search for frequent if-then associations—also called rules—to
discover correlations and co-occurrences within the data and the different
connections between data objects.
• It is most commonly used to analyze retail baskets or transactional
datasets to represent how often certain items are purchased together.
These algorithms uncover customer purchasing patterns and previously
hidden relationships between products that help inform recommendation
engines or other cross-selling opportunities. You might be most familiar
with these rules from the “Frequently bought together” and “People who
bought this item also bought” sections on your favorite online retail shop.
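
One common way to score an if-then rule is by its support (how often the items appear together) and its confidence (how often the "then" part holds when the "if" part does). These two measures are not named in the slides; the sketch below, with hypothetical helper names and toy baskets, only illustrates how a "frequently bought together" rule could be quantified.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) over the transactions."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Toy shopping baskets, invented for illustration.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]

rule_support = support({"bread", "butter"}, baskets)
rule_confidence = confidence({"bread"}, {"butter"}, baskets)
```

For the rule "if bread then butter", the pair appears in 2 of 4 baskets (support 0.5) and in 2 of the 3 baskets that contain bread (confidence 2/3).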
COMMON ALGORITHMS OF
UNSUPERVISED LEARNING

• 1. K-means clustering
• 2. Principal Component Analysis (PCA)
ADVANTAGES OF UNSUPERVISED
LEARNING

• 1. Better suited for more complex processing tasks
• 2. Useful for identifying previously undetected patterns
• 3. Can help identify features useful for categorizing data
DISADVANTAGES OF UNSUPERVISED
LEARNING

• 1. Results may be unpredictable or difficult to understand
• 2. Difficult to measure accuracy or effectiveness due to lack of
predefined answers during training
REINFORCEMENT LEARNING

• An agent learns by interacting with its environment and
receiving feedback in the form of rewards or punishments. The
goal is to learn a strategy (or policy) that maximizes
cumulative rewards.
INTRODUCTION TO REINFORCEMENT
LEARNING

• Reinforcement learning (RL) is a type of machine learning
process that focuses on decision making by autonomous
agents.
• Reinforcement learning (RL) is a machine learning (ML)
technique that trains software to make decisions that achieve
the best possible results.
• Reinforcement Learning (RL) is a branch of machine learning
focused on making decisions to maximize cumulative rewards
in a given situation.
• Autonomous Agent: An autonomous agent is any system that
can make decisions and act in response to its environment
independent of direct instruction by a human user.
KEY CONCEPTS OF REINFORCEMENT LEARNING

• According to IBM:
• Beyond the agent-environment-goal triumvirate, four principal
sub-elements characterize reinforcement learning problems:
• - Policy
• - Reward signal
• - Value function
• - Model
KEY CONCEPTS OF REINFORCEMENT LEARNING

• According to Amazon and GeeksforGeeks:
• Agent: The learner or decision-maker.
• Environment: Everything the agent interacts with.
• State: A specific situation in which the agent finds itself.
• Action: One of the possible moves the agent can make.
• Reward: Feedback from the environment based on the
action taken.
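
The five terms above fit together in a simple interaction loop. The sketch below uses a hypothetical `Corridor` environment (positions 0 to 4, reward on reaching position 4) and a hypothetical `run_episode` helper; neither comes from the cited sources, they only show where agent, environment, state, action, and reward appear in code.

```python
import random


class Corridor:
    """Hypothetical environment: positions 0..4, goal at position 4."""

    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); walls clamp the position.
        self.state = max(0, min(4, self.state + action))
        done = self.state == 4
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Let the agent (policy) act in the environment for one episode."""
    state = env.reset()
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        action = policy(state)                   # agent picks an action
        state, reward, done = env.step(action)   # environment responds
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps


def always_right(state):
    return +1


def random_policy(state):
    return random.choice([-1, +1])


total_reward, steps = run_episode(Corridor(), always_right)
```

Swapping `always_right` for `random_policy` changes only the agent; the environment and the loop stay the same.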
RL PROCESS

• Two main ideas shape the decision-making process in RL:
• Markov decision process
• Exploration-exploitation trade-off
RL PROCESS AND MARKOV DECISION
PROCESS
TYPES OF ALGORITHMS IN RL

• Algorithms can be grouped into two broad categories:
• 1) Model-based RL
• 2) Model-free RL
MODEL-BASED RL

• Model-based RL is typically used when environments are
well-defined and unchanging and where real-world environment
testing is difficult.
• The agent first builds an internal representation (model) of the
environment. It builds this model as follows:
• 1) It takes actions within the environment and notes the new
state and reward value.
• 2) It associates the action-state transition with the reward
value.
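
The two steps above can be sketched as building a lookup table from logged experience. This is one possible reading, assuming a small discrete environment: the function name `build_model`, the tuple layout, and the toy log are all assumptions for illustration.

```python
from collections import defaultdict


def build_model(transitions):
    """Estimate transition probabilities and mean rewards from experience.

    transitions: iterable of (state, action, next_state, reward) tuples.
    Returns {(state, action): ({next_state: probability}, mean_reward)}.
    """
    next_counts = defaultdict(lambda: defaultdict(int))
    reward_sum = defaultdict(float)
    visits = defaultdict(int)

    # Step 1: note the new state and reward observed after each action.
    for state, action, next_state, reward in transitions:
        next_counts[(state, action)][next_state] += 1
        reward_sum[(state, action)] += reward
        visits[(state, action)] += 1

    # Step 2: associate each state-action pair with its outcomes and reward.
    model = {}
    for key, counts in next_counts.items():
        n = visits[key]
        probabilities = {s2: c / n for s2, c in counts.items()}
        model[key] = (probabilities, reward_sum[key] / n)
    return model


# Toy experience log, invented for illustration.
log = [
    ("A", "go", "B", 0.0),
    ("A", "go", "B", 0.0),
    ("A", "go", "C", 1.0),
    ("B", "go", "C", 1.0),
]
model = build_model(log)
probabilities, mean_reward = model[("A", "go")]
```

Once such a model exists, the agent can plan against it instead of testing every action in the real environment.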
MODEL-FREE RL

• Model-free RL is best to use when the environment is large,
complex, and not easily describable. It’s also ideal when the
environment is unknown and changing, and environment-based
testing does not come with significant downsides.
• The agent doesn’t build an internal model of the environment
and its dynamics. Instead, it uses a trial-and-error approach
within the environment. It scores and notes state-action pairs—
and sequences of state-action pairs—to develop a policy.
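
Tabular Q-learning is one well-known model-free method that scores state-action pairs by trial and error; the slides do not name a specific algorithm, so it is used here only as an example. The corridor environment (states 0 to 4, reward on reaching state 4), the hyperparameters, and the purely random behaviour during training are assumptions chosen to keep the sketch short.

```python
import random

ACTIONS = (-1, +1)   # left, right
GOAL = 4             # hypothetical corridor: states 0..4, reward at state 4


def step(state, action):
    """Environment dynamics; the learner never inspects this function."""
    next_state = max(0, min(GOAL, state + action))
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=300, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    # One score per state-action pair, all starting at zero.
    q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    for _episode in range(episodes):
        state = 0
        for _step in range(1000):
            # Behave randomly; Q-learning is off-policy, so it can still
            # learn the value of acting greedily.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
            if done:
                break
    return q


q = q_learning()
# The learned policy: in each state, take the highest-scoring action.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
```

The resulting policy is read directly from the scores, without ever building a model of how the corridor works.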
EXPLORATION VS EXPLOITATION

• Because an RL agent has no manually labeled input data guiding its
behavior, it must explore its environment, attempting new actions to
discover those that receive rewards. From these reward signals, the agent
learns to prefer actions for which it was rewarded in order to maximize its
gain (exploitation). But the agent must also keep exploring new states and
actions, so that it can use that experience to improve its decision-making.
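
A common way to balance the two is an epsilon-greedy rule: with a small probability the agent tries a random action, otherwise it picks the action with the best estimated reward. The slides do not prescribe this rule; the sketch below, on a hypothetical two-armed bandit with invented reward probabilities, is only one simple illustration.

```python
import random


def epsilon_greedy_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a bandit with Bernoulli rewards."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms   # running average reward per arm
    pulls = [0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                            # explore
        else:
            arm = max(range(n_arms), key=lambda i: estimates[i])   # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        pulls[arm] += 1
        # Incremental mean update of the estimate for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return estimates, pulls


estimates, pulls = epsilon_greedy_bandit(true_means=[0.2, 0.8])
```

With `epsilon=0` the agent could lock onto whichever arm paid off first; the occasional random pull is what lets it discover the better arm.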
APPLICATIONS OF REINFORCEMENT
LEARNING

• i) Robotics: Automating tasks in structured environments like
manufacturing.
• ii) Game Playing: Developing strategies in complex games like chess.
• iii) Industrial Control: Real-time adjustments in operations like refinery
controls.
• iv) Personalized Training Systems: Customizing instruction based on
individual needs.
DISADVANTAGES

• 1. Reinforcement learning is not well suited to simple problems.
• 2. Reinforcement learning needs a lot of data and a lot of computation.
• 3. Reinforcement learning is highly dependent on the quality of the reward
function. If the reward function is poorly designed, the agent may not learn the
desired behavior.
• 4. Reinforcement learning can be difficult to debug and interpret. It is not
always clear why the agent is behaving in a certain way, which can make it
difficult to diagnose and fix problems.
ADVANTAGES

• 1. Reinforcement learning can be used to solve very complex problems that
are hard to solve with conventional techniques.
• 2. The model can correct the errors that occurred during the training process.
• 3. In RL, training data is obtained via the direct interaction of the agent with the
environment.
• 4. Reinforcement learning can handle environments that are non-deterministic,
meaning that the outcomes of actions are not always predictable. This is useful
in real-world applications where the environment may change over time or is
uncertain.
• 5. Reinforcement learning is a flexible approach that can be combined with
other machine learning techniques, such as deep learning, to improve
performance.
REINFORCEMENT LEARNING VS. SUPERVISED
LEARNING

• In supervised learning, you define both the input and the expected associated
output. For instance, you can provide a set of images labeled dogs or cats, and
the algorithm is then expected to identify a new animal image as a dog or cat.
• Supervised learning algorithms learn patterns and relationships between the
input and output pairs. Then, they predict outcomes based on new input data.
It requires a supervisor, typically a human, to label each data record in a
training data set with an output.
• In contrast, RL has a well-defined end goal in the form of a desired result but
no supervisor to label associated data in advance. During training, instead of
trying to map inputs to known outputs, it maps inputs to possible
outcomes. By rewarding desired behaviors, you give more weight to the best
outcomes.
REINFORCEMENT LEARNING VS.
UNSUPERVISED LEARNING

• Unsupervised learning algorithms receive inputs with no
specified outputs during the training process. They find hidden
patterns and relationships within the data using statistical
means. For instance, you could provide a set of documents,
and the algorithm may group them into categories it identifies
based on the words in the text. You do not get any specific
outcomes; they fall within a range.
• Conversely, RL has a predetermined end goal. While it takes an
exploratory approach, the explorations are continuously
validated and improved to increase the probability of reaching
the end goal. It can teach itself to reach very specific
outcomes.
GENERAL MODEL

• What is a general model?
• A general model of learning in AI refers to systems that learn
patterns from data, adapt to new tasks, and improve
performance through experience without explicit programming.
STEPS IN GENERAL MODEL OF
LEARNING IN AI

• 1. Data Collection
• 2. Data Preprocessing
• 3. Model Training
• 4. Model Evaluation
• 5. Model Deployment
HOW FEEDBACK IS INCORPORATED
INTO THE LEARNING PROCESS

• In AI, feedback is incorporated through mechanisms like
reinforcement learning or model retraining, where the system
uses real-world results or user input to adjust its behavior.
• This iterative process helps refine the model's predictions and
performance over time, ensuring continuous improvement.
LEARNING AUTOMATA

• What are learning automata?
• Learning automata are finite-state adaptive systems
that interact iteratively with a general environment.
• A learning automaton is a type of adaptive decision-making
model that learns the optimal actions through interactions with
its environment. This learning paradigm is particularly useful
when dealing with complex, uncertain environments where
explicit programming is challenging. Automata adapt over time
to make the most effective decisions based on rewards or
penalties provided by the environment.
LEARNING PROCESS

• The automaton receives feedback from the environment in the
form of rewards or penalties, which inform whether its actions
were successful or not.
• Reinforcement signals: Responses from the environment can
be categorized as reward (success), penalty (failure), or both,
depending on how closely the selected action aligns with
optimal performance.
APPLICATIONS IN AI

• Learning automata are used in various AI applications such as
adaptive control systems, game theory, reinforcement
learning, and distributed network control. They are
particularly valuable in scenarios where the system needs to
learn optimal behavior over time.
EXAMPLE

• A two-action learning automaton could have a binary
choice (e.g., "turn left" or "turn right") and adapt based on
feedback. After a series of actions and responses, the
automaton would “learn” to choose the more favorable action.

• GAMING BOT
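
The two-action example can be sketched with a linear reward-inaction update, one classical learning-automaton scheme: when an action is rewarded its probability is increased, and on a penalty nothing changes. The slides do not say which update rule is meant, so this choice, the function name `learning_automaton`, the reward probabilities, and the learning rate are all assumptions for illustration.

```python
import random


def learning_automaton(reward_probs, steps=2000, learning_rate=0.02, seed=0):
    """Two-action automaton; returns the final probability of action 0.

    reward_probs[i] is the (hidden) chance that action i is rewarded.
    Action 0 stands for "turn left", action 1 for "turn right".
    """
    rng = random.Random(seed)
    p_left = 0.5   # start with no preference
    for _ in range(steps):
        action = 0 if rng.random() < p_left else 1
        rewarded = rng.random() < reward_probs[action]
        if rewarded:
            if action == 0:
                # Move probability mass toward "turn left".
                p_left += learning_rate * (1.0 - p_left)
            else:
                # Move probability mass toward "turn right".
                p_left -= learning_rate * p_left
        # On a penalty the probabilities stay unchanged ("inaction").
    return p_left


p_left = learning_automaton(reward_probs=(0.2, 0.8))
p_right = 1.0 - p_left
```

Because "turn right" is rewarded more often in this toy setup, the automaton's preference drifts toward it over repeated trials.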
GENETIC ALGORITHM

• It is a search heuristic that is inspired by Charles Darwin’s
theory of natural evolution. This algorithm reflects the process
of natural selection where the fittest individuals are selected
for reproduction in order to produce offspring of the next
generation.
• Genetic algorithms (GAs) are optimization techniques inspired by
the principles of natural selection and genetics. They are a part
of the broader family of evolutionary algorithms and are widely
used in AI to find approximate solutions to complex problems.
Genetic algorithms are particularly useful for optimization
problems with large, multi-dimensional search spaces.
BASIC STRUCTURE OF GENETIC
ALGORITHMS

• The main steps include selection, crossover, and mutation:
• Selection: Choosing individuals based on their "fitness," or
how well they perform in the problem context.
• Crossover (Recombination): Mixing the genes of two
individuals to produce new offspring that inherit traits from
both parents.
• Mutation: Randomly altering parts of an individual’s genetic
code to introduce diversity and explore new solutions.
ALGORITHM

• 1. Randomly initialize population P
• 2. Determine fitness of population
• 3. Repeat until convergence:
• a. Select parents from population
• b. Crossover and generate new population
• c. Perform mutation on new population
• d. Calculate fitness for new population
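
The steps above can be sketched on a toy problem. The example assumes the OneMax task (maximize the number of 1 bits in a bit string), tournament selection, one-point crossover, and a fixed number of generations in place of a convergence test; none of these specifics come from the slides, and all names and parameter values are hypothetical.

```python
import random


def fitness(individual):
    # OneMax toy problem: fitness is the number of 1 bits.
    return sum(individual)


def select_parent(population, rng):
    # Tournament selection of size 2: keep the fitter of two random picks.
    a, b = rng.choice(population), rng.choice(population)
    return a if fitness(a) >= fitness(b) else b


def genetic_algorithm(n_bits=20, pop_size=30, generations=60,
                      mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # 1. Randomly initialize the population.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    # 2. Determine fitness (tracked here via the best individual so far).
    best = max(population, key=fitness)
    # 3. Repeat; a fixed generation count stands in for "until convergence".
    for _generation in range(generations):
        children = []
        for _child in range(pop_size):
            mother = select_parent(population, rng)       # a. selection
            father = select_parent(population, rng)
            cut = rng.randrange(1, n_bits)                # b. one-point crossover
            child = mother[:cut] + father[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]                    # c. mutation
            children.append(child)
        population = children
        best = max(population + [best], key=fitness)      # d. fitness of new population
    return best


best = genetic_algorithm()
```

For a real problem, only `fitness` and the encoding of an individual would need to change; the selection, crossover, and mutation loop stays the same.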
EXAMPLE

• 1. Google’s DeepMind
• 2. Tesla’s self-driving tasks
• 3. Traveling Salesperson Problem (TSP)
SOURCES

• IBM
• https://www.ibm.com/topics/reinforcement-learning
• Amazon
• https://aws.amazon.com/what-is/reinforcement-learning/
• GeeksforGeeks
• https://www.geeksforgeeks.org/what-is-reinforcement-learning/
• Google cloud
• https://cloud.google.com/learn/what-is-artificial-intelligence
SOURCES

• Javatpoint
• https://www.javatpoint.com/genetic-algorithm-in-machine-learning
• IBM. (n.d.). Genetic Algorithms for Optimization. Retrieved from
IBM Research Blog.
• Tutorials point
• Géron, A. (2019). Hands-On Machine Learning with Scikit-
Learn, Keras, and TensorFlow. O'Reilly Media.
