AI Chapter 5

This document provides an overview of machine learning basics, including different types of learning. It discusses supervised learning, where a teacher provides training examples and answers to learn patterns. Unsupervised learning is used when no answers are given, to discover hidden patterns in data by clustering. Evaluation measures for supervised learning include accuracy, precision, and recall to assess performance on training and testing data. Overfitting can occur if the learned patterns only apply to the specific training examples.

Uploaded by

Ahmed Kedir

Chapter Five

Machine Learning Basics


Learning
• Definition: “learning is a goal-directed process of a system that
improves the knowledge or the knowledge representation of the
system by exploring experience and prior knowledge”
• Acquisition of new declarative knowledge
• Development of motor and cognitive skills through instruction and practice
• Organization of new knowledge into general, effective representations
• Discovery of new facts and theories through observation and experimentation
Overview of Learning
• Learn what?
• Facts about the world (KB)
• Decision-making strategy
• Probabilities, costs, functions, states, …
• Learn from what?
• Training data
• Often freely available for common problems
• Real world
• Learn how?
• Need some form of feedback to the agent
Learning from Examples/Observation
• Observational Learning: To learn by watching and imitating others. For example, child
tries to learn by mimicking her parent.
Knowledge in Learning
• Supervised Learning: It involves a teacher that is more knowledgeable than the learner (e.g. the ANN itself). The teacher feeds the network example data for which the teacher already knows the answers.
For example, in pattern recognition the ANN makes guesses while recognizing. The teacher then provides the ANN with the answers, and the network compares its guesses with the teacher's "correct" answers and makes adjustments according to the errors.
• For supervised learning, the aim is to find a simple hypothesis approximately consistent with the training examples
Cont’d
• Unsupervised Learning: It is required when there is no example data set with known answers, for example when searching for a hidden pattern. In this case, clustering, i.e. dividing a set of elements into groups according to some previously unknown pattern, is carried out on the existing data sets.
• Reinforcement Learning: This strategy is built on observation. The ANN makes a decision by observing its environment. If the observation is negative, the network adjusts its weights so that it can make a different, better decision the next time.
Supervised Learning
• Given a training corpus of data-output pairs
• x & y values
• Email & spam/not spam
• Variable values & decision
• Learn the relationship mapping the data to the output
• f(x)
• Spam features
• Decision rules
Supervised Learning Example
• 2D state space with binary classification
• Learn a function to separate both classes
[Figure: scatter of + and – examples in a 2D state space, with the two classes separable by a boundary]
Supervised Learning Example
• Multiple variables and their values used to decide whether an email is spam
Email UW Pics Words Attach Multi Spam?
E1 N Y 58 Y Y Yes
E2 N N 132 N Y No
E3 N Y 1049 Y Y Yes
E4 N N 18 Y Y Yes
E5 Y N 26 Y Y No
E6 N Y 32 N Y Yes
E7 Y N 44 N Y No
E8 N Y 256 N Y Yes
E9 Y Y 2789 Y N No
E10 N Y 857 Y N No
E11 N N 732 N N No
E12 N Y 541 N Y Yes
Supervised Learning Example
• Build a decision tree:

UW?
├─ Yes → Not Spam
└─ No → Multi?
    ├─ No → Not Spam
    └─ Yes → Attach?
        ├─ Yes → Spam
        └─ No → Pictures?
            ├─ Yes → Spam
            └─ No → Not Spam
Supervised Learning Algorithm
• Many possible learning techniques, depending on the problem and the data
• Start with an inaccurate initial hypothesis
• Refine it to reduce error or increase accuracy
• End with a trade-off between accuracy and simplicity
[Figure: the +/– scatter plot again, with the separating hypothesis being refined]
Supervised Learning Algorithm
• Learning a decision tree follows the same general algorithm
• Start with all emails at root
• Pick attribute that will teach us the most
• Highest information gain, i.e. difference of probability of each class
• Branch using that attribute
• Repeat until a trade-off is reached between accuracy of the leaves and the depth limit / relevance of attributes
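The "highest information gain" step is commonly computed as the entropy reduction from branching on an attribute. Here is a minimal sketch under that assumption (the helper names and the dict-based encoding are ours, not from the slides), applied to the UW column of the email table above:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, labels, attribute):
    """Entropy reduction from branching on `attribute`.
    `examples` is a list of dicts, `labels` the matching class labels."""
    n = len(labels)
    remainder = 0.0
    for value in set(e[attribute] for e in examples):
        subset = [lab for e, lab in zip(examples, labels) if e[attribute] == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# UW column and Spam? column of the email table (Y = yes)
emails = [{"UW": uw} for uw in "NNNNYNYNYNNN"]
spam = list("YNYYNYNYNNNY")
print(information_gain(emails, spam, "UW"))  # ≈ 0.311 bits
```

Branching on UW gains about 0.311 bits, because every UW=Y email in the table is not spam.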
Supervised Learning Evaluation
• Statistical measures of agent’s performance
• RMS (root mean square) error between f(x) and y
• Making correct decision
• With as few decision rules as possible
• Shallowest tree possible
• Accuracy of a classification
• Precision and recall of a classification
Precision and Recall
• Binary classification: distinguish + (our target) from
– (everything else)
• Classifier makes mistakes
• Classifies some + as – and some – as +
• Define four categories:

                        Actual value
                        +                  –
  Classified as +       True Positives     False Positives
  Classified as –       False Negatives    True Negatives
Precision and Recall
• Precision
• Proportion of selected items the classifier got right
• TP / (TP + FP)
• Recall
• Proportion of target items the classifier selected
• TP / (TP + FN)
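The two formulas can be checked with a small sketch (the function name and the '+'/'-' label encoding are our own):

```python
def precision_recall(actual, classified):
    """Precision and recall for binary labels '+' and '-'."""
    tp = sum(1 for a, c in zip(actual, classified) if a == "+" and c == "+")
    fp = sum(1 for a, c in zip(actual, classified) if a == "-" and c == "+")
    fn = sum(1 for a, c in zip(actual, classified) if a == "+" and c == "-")
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# 4 actual targets; the classifier selects 4 items, 3 of them correctly
p, r = precision_recall(list("++++----"), list("+++--+--"))
print(p, r)  # 0.75 0.75
```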
Overfitting
• A common problem with supervised learning is over-specializing the
relation learned to the training data
• Learning from irrelevant features of the data
• Email features such as: paragraph indentation, number of typos, letter “x” in
sender address, …
• Works well on training data
• Because of poor sampling or random chance
• Fails in real-world tests
Testing Data
• Evaluate the relation learned using unseen test data
• i.e. data that was not used in training
• Therefore the system is not overfitted to it
• Split training data beforehand, keep part away for testing
• Only works once!
• If you reuse testing data, you are overfitting your system for that test!!
• Never do that!!!
Cross-Validation
• Shortcomings of holding out test data
• Test only works once
• Training on less data, therefore result less accurate
• n-fold cross-validation
• Split the training corpus into n parts
• Train with n-1, test with 1
• Run n tests, each time using a different test part
• Final training with all data and best features
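The folds described above can be sketched as follows; `train_fn` and `accuracy_fn` are placeholder hooks for whatever learner is being evaluated, and the majority-class learner below is a toy stand-in, not anything from the slides:

```python
from collections import Counter

def cross_validate(data, labels, train_fn, accuracy_fn, n=5):
    """n-fold cross-validation: train on n-1 parts, test on the held-out part."""
    scores = []
    for i in range(n):
        test_idx = set(range(i, len(data), n))  # every n-th item forms fold i
        train_x = [x for j, x in enumerate(data) if j not in test_idx]
        train_y = [y for j, y in enumerate(labels) if j not in test_idx]
        test_x = [x for j, x in enumerate(data) if j in test_idx]
        test_y = [y for j, y in enumerate(labels) if j in test_idx]
        model = train_fn(train_x, train_y)
        scores.append(accuracy_fn(model, test_x, test_y))
    return sum(scores) / n

# Toy learner: always predict the majority class seen in training
def train_majority(xs, ys):
    return Counter(ys).most_common(1)[0][0]

def accuracy(model, xs, ys):
    return sum(1 for y in ys if y == model) / len(ys)

data = list(range(10))
labels = ["spam"] * 7 + ["ham"] * 3
print(cross_validate(data, labels, train_majority, accuracy, n=5))  # 0.7
```

Each of the 5 runs tests on a different fifth of the data, so every item is used for testing exactly once.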
Unsupervised Learning
• Given a training corpus of data points
• Observed value of random variables in Bayesian network
• Series of data points
• Learn underlying pattern in the data
• Existence and conditional probability of hidden variables
• Number of classes and classification rules
Unsupervised Learning Example
• 2D state space with unclassified observations
• Learn number and form of clusters
• Problem of unsupervised clustering
• Many algorithms proposed for it
• More research still being done for better algorithms, different kinds of data, …
[Figure: scatter of unlabeled points forming several clusters]
Unsupervised Learning Algorithm
• Define a similarity measure to compare pairs of elements
• Starting with no clusters:
  • Pick a seed element
  • Group similar elements with it until a threshold is reached
  • Pick a new seed from the free elements and start again
[Figure: the cluster scatter plot, with clusters grown around seed elements]
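The seed-growing procedure above can be sketched like this (the function name, the 1-D points, and the negative-distance similarity are invented for illustration):

```python
def seed_cluster(points, similar, threshold):
    """Greedy seed clustering: pick a seed, absorb every free element
    at least `threshold`-similar to it, then restart with a new seed."""
    free = list(points)
    clusters = []
    while free:
        seed = free.pop(0)
        cluster = [seed] + [p for p in free if similar(seed, p) >= threshold]
        free = [p for p in free if similar(seed, p) < threshold]
        clusters.append(cluster)
    return clusters

# 1-D points; similarity = negative distance, so -2.0 means "within 2 units"
pts = [1.0, 1.5, 2.0, 10.0, 10.5, 20.0]
print(seed_cluster(pts, lambda a, b: -abs(a - b), threshold=-2.0))
# → [[1.0, 1.5, 2.0], [10.0, 10.5], [20.0]]
```

Note the result depends on the seed order and the threshold, which is exactly why the choice of similarity measure matters.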
Unsupervised Learning Algorithm
• Starting with one all-encompassing cluster:
  • Find the cluster with the highest internal dissimilarity
  • Find the most dissimilar pair of elements inside that cluster
  • Split it into two clusters
  • Repeat until all clusters have internal homogeneity
• Merge homogeneous clusters
[Figure: the cluster scatter plot, with the single cluster recursively split]
Unsupervised Learning Evaluation
• Need to evaluate fitness of relationship learned
• Number of clusters vs. their internal properties
• Difference between clusters vs. internal homogeneity
• Number of parameters vs. number of hidden variables in Bayesian network
• No way of knowing what the optimal solution is
Reinforcement Learning
• Given a set of possible actions, the resulting state of the environment,
and rewards or punishment for each state
• Taxi driver: tips, car repair costs, tickets
• Checkers: advantage in number of pieces
• Learn to maximize the rewards and/or minimize the punishments
• Maximize tip, minimize damage to car and police tickets: drive properly
• Protect own pieces, take enemy pieces: good play strategy
Reinforcement Learning

• Learning by trial and error
• Try something, see the result
• Speeding results in tickets, going through a red light results in car damage,
quick and safe drive results in tips
• Checkers pieces in the center of the board are soon lost, pieces on the side
are kept longer, sacrifice some pieces to take a greater number of enemy
pieces
• Sacrifice known rewarding actions to explore new, potentially more
rewarding actions
• Develop strategies to maximize rewards while minimizing penalties
over the long-term
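The explore/exploit trade-off above can be sketched as a simple trial-and-error learner; the action set and reward numbers are invented for illustration, and the running-average update is one common choice, not the slides' prescribed method:

```python
import random

def trial_and_error(rewards, episodes=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy trial and error: usually exploit the best-known action,
    occasionally explore, and keep a running average reward per action."""
    rng = random.Random(seed)
    values = [0.0] * len(rewards)
    counts = [0] * len(rewards)
    for _ in range(episodes):
        if rng.random() < epsilon:
            a = rng.randrange(len(rewards))  # explore a random action
        else:
            a = max(range(len(rewards)), key=lambda i: values[i])  # exploit
        r = rewards[a] + rng.gauss(0, 1)  # noisy reward from the environment
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]  # update running average
    return values

# Invented taxi actions: speed (tickets), run red light (damage), drive safely (tips)
vals = trial_and_error([-1.0, -5.0, 2.0])
print(vals.index(max(vals)))  # 2: "drive safely" ends up valued highest
```

The epsilon parameter is the "sacrifice known rewarding actions to explore" knob: with epsilon = 0 the agent never discovers that the third action pays best.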
Summary of Learning
                      Supervised                 Unsupervised       Reinforcement
Training data         Data and correct output    Data               States, actions, and rewards
Learning target       Data-output relationship   Patterns in data   Policy
Evaluation            Statistics                 Fitness            Reward value
Typical application   Classifiers                Clustering         Controllers
Learning Probabilistic Models
• A deterministic mathematical model is meant to yield a single
solution describing the outcome of some "experiment" given
appropriate inputs. A probabilistic model is, instead, meant to give a
distribution of possible outcomes (i.e. it describes all outcomes and
gives some measure of how likely each is to occur)
Deep Learning
• Deep Learning is a subfield of machine learning concerned with
algorithms inspired by the structure and function of the brain
called artificial neural networks.
Neural Networks
• An Artificial Neural Network (ANN) is an information processing
paradigm that is inspired by the way biological nervous systems, such
as the brain, process information. The key element of this paradigm is
the novel structure of the information processing system.
• A neural network is composed of a number of nodes, or units,
connected by links.
• Each link has a numeric weight associated with it.
• Weights are the primary means of long-term storage in neural
networks, and learning usually takes place by updating the weights.
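As a sketch of "learning by updating the weights", here is a single unit whose weighted sum of inputs passes through a sigmoid activation; the initial weights, learning rate, and single training example are arbitrary illustrative values:

```python
import math

def unit_output(inputs, weights, bias):
    """One unit: weighted sum of inputs through a sigmoid activation."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# Learning = nudging the link weights to reduce error on an example
inputs, target = [1.0, 0.0], 1.0
weights, bias, rate = [0.5, -0.5], 0.0, 0.5
for _ in range(100):
    out = unit_output(inputs, weights, bias)
    grad = (target - out) * out * (1 - out)  # error times sigmoid derivative
    weights = [w + rate * grad * x for w, x in zip(weights, inputs)]
    bias += rate * grad
print(unit_output(inputs, weights, bias))  # close to the target of 1.0
```

The weights are the long-term storage: after training, the updated weights alone encode what was learned from the example.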
Convolutional Neural Network
• A Convolutional Neural Network (ConvNet/CNN) is a Deep Learning
algorithm which can take in an input image, assign importance (learnable
weights and biases) to various aspects/objects in the image and be able to
differentiate one from the other. The pre-processing required in a ConvNet is
much lower as compared to other classification algorithms. While in primitive
methods filters are hand-engineered, with enough training, ConvNets have
the ability to learn these filters/characteristics.
• The architecture of a ConvNet is analogous to that of the connectivity pattern
of Neurons in the Human Brain and was inspired by the organization of the
Visual Cortex. Individual neurons respond to stimuli only in a restricted region
of the visual field known as the Receptive Field. A collection of such fields
overlap to cover the entire visual area.
Recurrent neural networks and LSTMs
• RNNs are a powerful and robust type of neural network, and are among the most promising algorithms in use because they have an internal memory.
• Because of their internal memory, RNNs can remember important things about the input they received, which allows them to be very precise in predicting what's coming next. This is why they are the preferred algorithm for sequential data like time series, speech, text, financial data, audio, video, weather and much more. Recurrent neural networks can form a much deeper understanding of a sequence and its context compared to other algorithms.
Long Short-Term Memory (LSTM)

• Long short-term memory networks (LSTMs) are an extension of recurrent neural networks that basically extends the memory. They are therefore well suited to learning from important experiences that have very long time lags in between.
• LSTMs enable RNNs to remember inputs over a long period of time. This is
because LSTMs contain information in a memory, much like the memory of a
computer. The LSTM can read, write and delete information from its memory.
• In an LSTM you have three gates: input, forget and output gate. These gates
determine whether or not to let new input in (input gate), delete the
information because it isn’t important (forget gate), or let it impact the
output at the current timestep (output gate).
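The three gates can be sketched in scalar form; real LSTMs use vectors and weight matrices, and the weights below are arbitrary illustrative values, not learned ones:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, W):
    """One LSTM time step (scalar form). W maps gate name -> (w_x, w_h, b)."""
    def gate(name, act):
        w_x, w_h, b = W[name]
        return act(w_x * x + w_h * h_prev + b)
    i = gate("input", sigmoid)        # let the new input in?
    f = gate("forget", sigmoid)       # keep or delete the old memory?
    o = gate("output", sigmoid)       # let the memory affect the output?
    g = gate("candidate", math.tanh)  # proposed new memory content
    c = f * c_prev + i * g            # read/write/delete the memory cell
    h = o * math.tanh(c)              # output at the current time step
    return h, c

# Arbitrary illustrative weights, identical for all four components
W = {k: (1.0, 0.5, 0.0) for k in ("input", "forget", "output", "candidate")}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 0.5]:  # feed a short input sequence
    h, c = lstm_step(x, h, c, W)
print(h, c)
```

The cell state `c` is the memory carried across time steps; the forget gate `f` scales how much of it survives, which is what lets gradients flow over long time lags.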
Thank You
