AI Chapter 5
[Figure: example decision tree for email classification — branching on attributes such as "Attach" and "Pictures", with leaves labelled Spam / Not Spam]
Supervised Learning Algorithm
• Many possible learning techniques, depending on the problem and the data
• Start with an inaccurate initial hypothesis
• Refine to reduce error or increase accuracy
• End with a trade-off between accuracy and simplicity

[Figure: scatter of + and − training points with a hypothesis boundary being refined]
Supervised Learning Algorithm
• Learning a decision tree follows the same general algorithm
• Start with all emails at the root
• Pick the attribute that will teach us the most
• Highest information gain, i.e. the greatest reduction in uncertainty about the class
• Branch using that attribute
• Repeat until a trade-off between accuracy of the leaves and depth limit / relevance of attributes
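As a concrete illustration of picking the attribute with the highest information gain, here is a minimal Python sketch (the email data and attribute names are invented for the example, not from the slides):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, attribute, label_key="label"):
    """Entropy reduction obtained by branching on `attribute`.

    `examples` is a list of dicts, e.g. {"attach": True, "label": "spam"}.
    """
    labels = [e[label_key] for e in examples]
    base = entropy(labels)
    remainder = 0.0
    for v in {e[attribute] for e in examples}:
        subset = [e[label_key] for e in examples if e[attribute] == v]
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder

# Toy training corpus (hypothetical values)
emails = [
    {"attach": True,  "pictures": True,  "label": "spam"},
    {"attach": True,  "pictures": False, "label": "spam"},
    {"attach": False, "pictures": True,  "label": "not spam"},
    {"attach": False, "pictures": False, "label": "not spam"},
    {"attach": False, "pictures": True,  "label": "spam"},
]

# Pick the attribute that teaches us the most at the root
best = max(["attach", "pictures"], key=lambda a: information_gain(emails, a))
```

In this toy corpus "attach" separates the classes better than "pictures", so it would be chosen as the root attribute.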
Supervised Learning Evaluation
• Statistical measures of agent’s performance
• RMS (root mean square) error between f(x) and y
• Making correct decisions
• With as few decision rules as possible
• Shallowest tree possible
• Accuracy of a classification
• Precision and recall of a classification
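The RMS error mentioned above can be computed in a few lines (a minimal sketch; `predictions` are the learned f(x) values and `targets` the true y values):

```python
import math

def rms_error(predictions, targets):
    """Root mean square error between learned f(x) and true y."""
    n = len(targets)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n)

err = rms_error([2.5, 0.0, 2.0], [3.0, -0.5, 2.0])
```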
Precision and Recall
• Binary classification: distinguish + (our target) from – (everything else)
• Classifier makes mistakes
• Classifies some + as – and some – as +
• Define four categories:
                          Actual value
                      +                 –
Classified as  +   True Positives   False Positives
               –   False Negatives  True Negatives
Precision and Recall
• Precision
• Proportion of selected items the classifier got right
• TP / (TP + FP)
• Recall
• Proportion of target items the classifier selected
• TP / (TP + FN)
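These two definitions can be sketched directly in code (a minimal illustration; the labels "+" / "–" follow the slides, the function name is invented):

```python
def precision_recall(predicted, actual, positive="+"):
    """Precision and recall from parallel lists of predicted/actual labels.

    Precision = TP / (TP + FP): proportion of selected items that are right.
    Recall    = TP / (TP + FN): proportion of target items that were selected.
    """
    tp = sum(1 for p, a in zip(predicted, actual) if p == positive and a == positive)
    fp = sum(1 for p, a in zip(predicted, actual) if p == positive and a != positive)
    fn = sum(1 for p, a in zip(predicted, actual) if p != positive and a == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# TP = 2, FP = 1, FN = 1
predicted = ["+", "+", "+", "-"]
actual    = ["+", "-", "+", "+"]
p, r = precision_recall(predicted, actual)
```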
Overfitting
• A common problem with supervised learning is over-specializing the
relation learned to the training data
• Learning from irrelevant features of the data
• Email features such as: paragraph indentation, number of typos, letter “x” in
sender address, …
• Works well on training data
• Because of poor sampling or random chance
• Fails in real-world tests
Testing Data
• Evaluate the relation learned using unseen test data
• i.e. data that was not used in training
• Therefore the system is not overfitted to it
• Split training data beforehand, keep part away for testing
• Only works once!
• If you reuse testing data, you are overfitting your system to that test!!
• Never do that!!!
Cross-Validation
• Shortcomings of holding out test data
• Test only works once
• Training on less data, therefore the result is less accurate
• n-fold cross-validation
• Split the training corpus into n parts
• Train with n-1, test with 1
• Run n tests, each time using a different test part
• Final training with all data and best features
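The n-fold splitting step can be sketched as follows (a minimal illustration; real setups usually also shuffle the corpus first):

```python
def n_fold_splits(data, n):
    """Yield (train, test) pairs for n-fold cross-validation.

    Each of the n runs holds out a different part for testing
    and trains on the remaining n-1 parts.
    """
    folds = [data[i::n] for i in range(n)]  # n roughly equal parts
    for i in range(n):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

splits = list(n_fold_splits(list(range(10)), 5))
```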
Unsupervised Learning
• Given a training corpus of data points
• Observed values of random variables in a Bayesian network
• Series of data points
• Learn underlying pattern in the data
• Existence and conditional probability of hidden variables
• Number of classes and classification rules
Unsupervised Learning Example
• 2D state space with unclassified observations
• Learn the number and form of clusters
• Problem of unsupervised clustering
• Many algorithms proposed for it
• More research still being done for better algorithms, different kinds of data, …

[Figure: scatter of unlabelled 2D points forming several natural clusters]
Unsupervised Learning Algorithm
• Define a similarity measure to compare pairs of elements
• Starting with no clusters
• Pick a seed element
• Group similar elements until a threshold
• Pick a new seed from the free elements and start again

[Figure: the same scatter of points, grouped incrementally around seed elements]
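The seed-based procedure above can be sketched as follows (a minimal illustration; the Euclidean distance measure and the threshold value are assumptions, not prescribed by the slides):

```python
import math

def seed_cluster(points, threshold):
    """Threshold clustering of 2D points: pick a seed, group all free
    points within `threshold` of it, then pick a new seed and repeat."""
    def similar(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1]) <= threshold

    free = list(points)
    clusters = []
    while free:
        seed = free.pop(0)                              # pick a seed element
        cluster = [seed] + [p for p in free if similar(seed, p)]
        free = [p for p in free if not similar(seed, p)]  # remaining free elements
        clusters.append(cluster)
    return clusters

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
clusters = seed_cluster(points, 2.0)
```

Note that the result depends on which seeds happen to be picked first; more refined algorithms revisit or merge clusters afterwards.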
Unsupervised Learning Algorithm
• Starting with one all-encompassing cluster
• Find the cluster with the highest internal dissimilarity
• Find the most dissimilar pair of elements inside that cluster
• Split it into two clusters
• Repeat until all clusters have internal homogeneity
• Merge homogeneous clusters

[Figure: the same scatter of points, split top-down into homogeneous clusters]
Unsupervised Learning Evaluation
• Need to evaluate fitness of relationship learned
• Number of clusters vs. their internal properties
• Difference between clusters vs. internal homogeneity
• Number of parameters vs. number of hidden variables in a Bayesian network
• No way of knowing what is the optimal solution
Reinforcement Learning
• Given a set of possible actions, the resulting state of the environment,
and rewards or punishment for each state
• Taxi driver: tips, car repair costs, tickets
• Checkers: advantage in number of pieces
• Learn to maximize the rewards and/or minimize the punishments
• Maximize tips, minimize damage to the car and police tickets: drive properly
• Protect own pieces, take enemy pieces: good play strategy
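One standard way to learn from rewards and punishments is tabular Q-learning, sketched here on a toy corridor world (the environment, rewards, and parameter values are invented for the example, not from the slides):

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2):
    """Tabular Q-learning on a corridor of states 0..n_states-1.

    Actions are -1 (left) and +1 (right); reaching the rightmost state
    gives reward +1, stepping off the left end gives punishment -1.
    """
    q = {(s, a): 0.0 for s in range(n_states) for a in (-1, 1)}
    rng = random.Random(0)  # seeded for reproducibility
    for _ in range(episodes):
        s = n_states // 2                  # start in the middle
        for _ in range(200):               # step cap per episode
            if rng.random() < epsilon:     # explore
                a = rng.choice((-1, 1))
            else:                          # exploit current estimates
                a = max((-1, 1), key=lambda x: q[(s, x)])
            s2 = s + a
            if s2 < 0:
                reward, done = -1.0, True          # punishment
            elif s2 == n_states - 1:
                reward, done = 1.0, True           # reward
            else:
                reward, done = 0.0, False
            best_next = 0.0 if done else max(q[(s2, -1)], q[(s2, 1)])
            q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
            if done:
                break
            s = s2
    return q

q = q_learning()
```

After training, the learned values prefer moving toward the rewarding state, i.e. the agent has learned to maximize rewards and minimize punishments.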