Week 8
Learning
Nitin Indurkhya
(you can google my name and read about my background)
Acknowledgement
• This set of slides is based on input from A/Prof. Vivek Balachandran,
Prof. Yu Chien Siang and A/Prof. Guo Huaqun!
Course Logistics
• All details are in the module profile in the LMS
• Lab work will involve Python programming
• Work in same groups as in part-1 of the course
• Let's do a quick poll!
Quick Mentimeter Poll…
Assessment of part-2
• Quizzes each week (5%)
– Note that there is no quiz in week 12
– Quiz 11 will have 2 marks, other quizzes are worth 1 mark
– Quizzes will be from 11:30am to 11:35am
– Machine-marking
– Absolutely no makeup; approved MCs will be extrapolated
• Group Coursework (25%)
• Final exam in week-13 (20%)
– Individual assessment
– One-hour exam (MCQs and short answers) using a lockdown
browser at NYP
• Similar to what you did in the first half
Learning Outcomes (for today)
• Define the concept of Machine Learning
• Understand the three types of Machine Learning
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
• Understand the concepts behind supervised learning algorithms and apply them in the
lab assignment and coursework
• K Nearest Neighbors
• Decision Tree
• Random Forest
Machine Learning
• What?
• Like humans: learning from the past
• The science of getting computers to learn without
being explicitly programmed
• Machine Learning is an application of AI wherein
the system gets the ability to automatically learn
and improve based on experience
• Machine Learning Applications
• Top 5 Machine Learning Application For 2022
• https://www.youtube.com/watch?v=aKhz79s-Row
Dimensions of ‘Learning’
• Has one acquired KNOWLEDGE
• Can one UNDERSTAND (meta-knowledge)
• Can one REFINE existing knowledge (background)
• What is the best REPRESENTATION (for specific knowledge)
• What are the different SOURCES (modalities)
• Is PERFORMANCE (of a specific skill) improving
• …many others…
Learning and Intelligence
• Generally accepted that ‘ability to learn’ is CENTRAL to ‘intelligence’
• Hence ‘Machine Learning’ is accepted as CENTRAL to ‘Artificial
Intelligence’ (AI)
• Over time, many so-called ‘Intelligent’ activities were reassessed:
– Game-playing (Chess is the most famous example)
– Information retrieval
• Evolving perspective of ‘Intelligence’
– Whatever CANNOT be done by a computer is ‘intelligence’
– Once an AI-task is ‘solved’ it is no longer considered part of AI
• Hence evolving view of ‘learning’!!!
Machine Learning Definition
• Arthur Samuel (1959). Machine Learning: Field of study that gives
computers the ability to learn without being explicitly programmed.
Traditional Programming:
• Data + Program → Computer → Output
• Example: finding the square root of a number
Machine Learning:
• Input Data + Corresponding Output → Computer → Program
• Example: classification, e.g. 'predicting' the weather today
‘Machine Learning’ as ‘ML’
• Recall the term ‘Dynamic Programming’ in Algorithms
– Just a name, other programming paradigms were not any less ‘dynamic’.
• ML is a VERY SPECIFIC view of the phrase ‘Machine Learning’
• Objective is to build a MODEL and use it for NEW CASES
• Model is built using a spreadsheet of data (mostly numeric)
• Problems posed as CLASSIFICATION or REGRESSION tasks.
• Very constrained classes of models are considered.
• Has very little to do with LEARNING and PREDICTION
Key points of the ML you will learn
• Spreadsheet of data used as input
– May need some transformations
• Predictions are NOT causal, merely correlations
• Connection of data to the real-world is not always clear-cut
• Iterative nature of modeling task
– Tweaking parameters, adjusting inputs, testing incessantly
• Focus is on a mature set of ML methods
– You won’t just be using them as black boxes but will learn HOW they work!
Supervised Learning
• The model learns by using labelled data, mapping labelled inputs to known outputs.
Unsupervised Learning
• Model learns through observation & finds structures in data.
• Model is given a dataset and is left to automatically find patterns and relationships in that dataset by creating clusters.
Reinforcement Learning
• Reinforcement learning
• involves an agent that interacts with its
environment by producing actions &
discovering errors or rewards.
• It is like being stuck on an isolated island,
where you must explore the environment
and learn how to live and adapt to the
living conditions on your own.
• Model learns through the trial-and-error
method
• It learns on the basis of reward or penalty
given for every action it performs
Types of Machine Learning
• Supervised Learning
  – Definition: the machine learns by using labeled data
  – Types of problems: classification or regression
  – Types of data: labelled data
  – Training: external supervision
  – Approach: map labeled input to known output
• Unsupervised Learning
  – Definition: the machine is trained on unlabeled data without any guidance
  – Types of problems: association or clustering
  – Types of data: unlabelled data
  – Training: no supervision
  – Approach: understand patterns and discover output
• Reinforcement Learning
  – Definition: involves an agent that interacts with its environment by producing actions & discovering errors or rewards
  – Types of problems: reward-based
  – Types of data: no pre-defined data
  – Training: no supervision
  – Approach: follow trial-and-error method
Supervised learning process
• 2 Stage process
• Learning (training): Learn a model using the training data
• Testing: Test the model using unseen test data to assess the model
accuracy
[Diagram: two-stage supervised learning pipeline]
• Training: training input → feature extractor → features; the features and their labels are fed to the machine learning algorithm, which produces a classifier model.
• Testing: test input → feature extractor → features → classifier model → predicted label.
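A minimal sketch of this two-stage process in Python. It assumes scikit-learn is available (the labs use Python; the specific library and the built-in iris dataset are illustrative assumptions, not course requirements):

# Stage 1: learn a classifier model from labelled training data
# Stage 2: test it on unseen data to estimate accuracy
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)                      # features + labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = KNeighborsClassifier(n_neighbors=3)            # any classifier fits this pipeline
model.fit(X_train, y_train)                            # learning (training) stage
print("Test accuracy:", model.score(X_test, y_test))   # testing stage on unseen data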
Instance-based learning
• Instance-based Learning
• Learning=storing all training instances
• Classification = estimating the target function value for a new
instance
• Referred to as “Lazy” learning
• Model is created at the point of classification
• Supervised learning
• https://www.youtube.com/watch?v=4HKqjENq9OU
• KNN Algorithm - How KNN Algorithm Works With Example | Data
Science For Beginners (27 min)
KNN Algorithm
• Features
• All instances correspond to points in an n-dimensional Euclidean space
• Classification is delayed till a new instance arrives
• Classification done by comparing feature vectors of the different points
• Target function may be discrete or real-valued
• For discrete-valued targets, KNN returns the most common value among the k
nearest training examples.
K-Nearest Neighbor (How it works)
1 Nearest Neighbor
K-Nearest Neighbor (How it works)
3 Nearest Neighbor
KNN Algorithm
• Training algorithm
• For each training example <x,f(x)> add the example to the list
• Classification algorithm
• Given a query instance xq to be classified
• Let x1,..,xk be k instances which are nearest to xq
• f̂(x_q) ← argmax_{v ∈ V} Σ_{i=1..k} δ(v, f(xi))
  – where V is the set of class values and δ(a, b) = 1 if a = b, 0 otherwise
  – i.e. return the most common class value among the k nearest training examples
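As a concrete illustration, a minimal from-scratch sketch of this classification rule (Euclidean distance, majority vote). The function and variable names are illustrative, not from the course materials:

import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x_q, k=3):
    # distance from the query instance x_q to every stored training example
    dists = np.linalg.norm(X_train - x_q, axis=1)
    # indices of the k nearest training examples
    nearest = np.argsort(dists)[:k]
    # argmax over v of the sum of delta(v, f(xi)) == majority vote among the k neighbours
    return Counter(y_train[nearest]).most_common(1)[0][0]

# tiny usage example with two classes
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.1], [4.8, 5.3]])
y_train = np.array(["A", "A", "B", "B"])
print(knn_classify(X_train, y_train, np.array([1.1, 0.9]), k=3))   # -> "A"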
What is a good value for k?
• Determined experimentally
• Start with k=1 and use a test set to validate the error rate of the
classifier
• Repeat with k=k+1
• Choose the value of k for which the error rate is minimum
• Is K = 10 or K = 11, better?
• How to test efficacy
• N-fold cross validation!
N-fold Cross Validation
• Split data into N blocks
• Perform the classification N times
• Each round of testing has
• N-1 blocks used for Training
• 1 block used for Testing
• Total of N results are obtained
• Average the error across all the N classifications
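Putting the two ideas together, a sketch of choosing k experimentally with N-fold cross-validation. It assumes scikit-learn; the built-in iris data is only a placeholder for your own dataset:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
N = 10                                   # number of folds (blocks)

scores_by_k = {}
for k in range(1, 21):                   # start with k=1 and keep incrementing
    model = KNeighborsClassifier(n_neighbors=k)
    # each of the N rounds trains on N-1 blocks and tests on the held-out block
    scores_by_k[k] = cross_val_score(model, X, y, cv=N).mean()

best_k = max(scores_by_k, key=scores_by_k.get)   # k with the lowest cross-validated error
print("best k:", best_k, "cross-validated accuracy:", round(scores_by_k[best_k], 3))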
Distance Calculation
• All instances correspond to points in an n-dimensional Euclidean space X = (X1, X2, …, Xn)
• Distance between two instances X and Y: d(X, Y) = sqrt( Σ_{i=1..n} (Xi - Yi)^2 )
• The target function may be discrete or real-valued, e.g. f : ℝ^n → ℝ
• For a real-valued target, KNN returns the mean of the k nearest training values:
  – f̂(x_q) ← ( Σ_{i=1..k} f(xi) ) / k
https://study.com/academy/lesson/discrete-continuous-functions-definition-examples.html
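A small sketch of these formulas: the Euclidean distance between feature vectors and the mean-of-k prediction for a real-valued target. Names and data are purely illustrative:

import numpy as np

def euclidean_distance(x, y):
    # d(x, y) = sqrt( sum_i (x_i - y_i)^2 )
    return np.sqrt(np.sum((x - y) ** 2))

def knn_regress(X_train, f_train, x_q, k=3):
    dists = np.array([euclidean_distance(x, x_q) for x in X_train])
    nearest = np.argsort(dists)[:k]
    # f_hat(x_q) = (1/k) * sum of f(xi) over the k nearest neighbours
    return f_train[nearest].mean()

X_train = np.array([[1.0], [2.0], [3.0], [10.0]])
f_train = np.array([1.1, 1.9, 3.2, 9.8])
print(knn_regress(X_train, f_train, np.array([2.5]), k=3))   # about 2.07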
Curse of Dimensionality
• Imagine instances are described by 20 features
(attributes) but only 3 are relevant to target function
• Curse of dimensionality: nearest neighbor is easily
misled when instance space is high-dimensional
• Dominated by large number of irrelevant features
• https://deepai.org/machine-learning-glossary-and-terms/curse-of-dimensionality
Possible solutions
• Weight features
  – Use cross-validation to automatically choose weights z1,…,zn
• Feature subset selection (a small sketch follows below)
[Figure: two wind turbines that seem very close to each other in two dimensions but separate when viewed in a third dimension; this is the same effect the curse of dimensionality has on data]
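One possible sketch of feature subset selection before KNN, assuming scikit-learn. Irrelevant noise features are appended to the iris data and then filtered out with a univariate score, so distances are computed only on informative features:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
X_noisy = np.hstack([X, rng.normal(size=(X.shape[0], 16))])    # add 16 irrelevant features

plain = KNeighborsClassifier(n_neighbors=5)
selected = make_pipeline(SelectKBest(f_classif, k=4),          # keep the 4 best-scoring features
                         KNeighborsClassifier(n_neighbors=5))

print("all features:   ", cross_val_score(plain, X_noisy, y, cv=5).mean())
print("selected subset:", cross_val_score(selected, X_noisy, y, cv=5).mean())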
KNN – When?
• When to use KNN
• Classification problems
• Data has a manageable, well-defined feature space
• Lots of training data
Advantages:
• Training is very fast
• Learn complex target functions
• Do not lose information
Disadvantages:
• Slow at query time (or classification)
• Easily fooled by irrelevant features (attributes)
Why try Distance methods?
• Easy to apply
• No “model” to build usually
• Most packages have polished implementations
• Non-parametric in nature
• Doesn't make assumptions about population
distribution
• Quite competitive (surprisingly!)
• Theoretical result: E(1-NN) <= 2*E(Bayes), i.e. the asymptotic 1-NN error is at most twice the optimal Bayes error
• Will discuss Bayes rule later!!
Decision Tree
• Tree structure
• Inverted tree starting with a root node
[Figure: example decision tree with root node Age (Old / Young); the Old branch splits on Sex (Female / Male), and the leaves are labelled Healthy or Diseased]
Decision Tree dissected
[Figure: decision tree with Outlook at the root and Yes/No leaf nodes]
Building Decision Tree
• Top-down tree construction
• At start, all training examples are at the root.
• Partition the examples recursively by choosing one attribute each time.
https://medium.com/datadriveninvestor/tree-algorithms-id3-c4-5-c5-0-and-cart-413387342164
Which attribute to select?
[Figure: the weather dataset (9 Yes, 5 No) split two candidate ways, by Outlook (Sunny / Overcast / Rain) and by Windy]
• Note: log2(x) = log(x) / log(2)
Information Gain
• Entropy tells how pure or impure one subset is
• How to combine entropy of all subsets?
• Aggregate information from several different subsets
• Average them?
• Not a simple average (Why?)
• Weight on the entropy value for each subset
• Proportional size of the subset
• Information Gain
• Entropy difference before and after the split
– Gain(S, A) = Entropy(S) - Σ_{v ∈ Values(A)} ( |Sv| / |S| ) · Entropy(Sv)
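These two formulas translate directly into a few lines of Python; the helper names below are illustrative only:

import math

def entropy(counts):
    # entropy of a node, given the count of examples in each class
    total = sum(counts)
    return -sum(c / total * math.log(c / total, 2) for c in counts if c > 0)   # log2 x = log x / log 2

def information_gain(parent_counts, subset_counts):
    # Gain(S, A) = Entropy(S) - sum over subsets of (|Sv| / |S|) * Entropy(Sv)
    total = sum(parent_counts)
    expected = sum(sum(sub) / total * entropy(sub) for sub in subset_counts)
    return entropy(parent_counts) - expected

# Outlook example from the next slides: 9 Yes / 5 No at the root,
# Sunny (2 Yes, 3 No), Overcast (4 Yes, 0 No), Rain (3 Yes, 2 No)
print(information_gain([9, 5], [[2, 3], [4, 0], [3, 2]]))   # about 0.25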
Find the Gain of Outlook
[Figure: root node with 9 Yes / 5 No, split on Outlook into Sunny (2 Yes, 3 No), Overcast (4 Yes, 0 No) and Rain (3 Yes, 2 No)]
• Entropy of the set of outcomes
  – Before the split happens (at the root node)
• Entropy of Sunny
• Entropy of Overcast
• Entropy of Rain
• Expected information for the attribute subsets
• Information gain
Find the Gain of Outlook
[Figure: root node with 9 Yes / 5 No, split on Outlook into Sunny (2 Yes, 3 No), Overcast (4 Yes, 0 No) and Rain (3 Yes, 2 No)]
• Entropy of the set of outcomes (E)
  – E(9/14, 5/14)  (X)
• Entropy of Sunny (E1)
  – E(2/5, 3/5)
• Entropy of Overcast (E2)
  – E(4/4, 0/4)
• Entropy of Rain (E3)
  – E(3/5, 2/5)
• Expected information for the attribute subsets
  – (5/14)*E1 + (4/14)*E2 + (5/14)*E3  (Y)
• Information gain:
  – X - Y
Find the Gain of Outlook
[Figure: root node with 9 Yes / 5 No, split on Outlook into Sunny (2 Yes, 3 No), Overcast (4 Yes, 0 No) and Rain (3 Yes, 2 No)]
• Entropy of the set of outcomes
  – E(9/14, 5/14) = -9/14 log2(9/14) - 5/14 log2(5/14)
    = -9/14 log(9/14)/log2 - 5/14 log(5/14)/log2 = 0.94
• Entropy of Sunny
  – E(2/5, 3/5) = -2/5 log2(2/5) - 3/5 log2(3/5)
    = -2/5 log(2/5)/log2 - 3/5 log(3/5)/log2 = 0.971
• Entropy of Overcast
  – E(4/4, 0/4) = -4/4 log2(4/4) - 0/4 log2(0/4) = 0   (taking 0 log 0 as 0)
• Entropy of Rain
  – E(3/5, 2/5) = -3/5 log2(3/5) - 2/5 log2(2/5)
    = -3/5 log(3/5)/log2 - 2/5 log(2/5)/log2 = 0.971
• Expected information for the attribute subsets
  – (5/14)*0.971 + (4/14)*0 + (5/14)*0.971 = 0.69
• Information gain:
  – Gain(S, Outlook) = 0.94 - 0.69 = 0.25
Find the Gain of Temperature
[Figure: root node with 9 Yes / 5 No, split on Temperature into Hot (2 Yes, 2 No), Mild (4 Yes, 2 No) and Cold (3 Yes, 1 No)]
• Entropy of the set of outcomes
  – E(9/14, 5/14) = -9/14 log2(9/14) - 5/14 log2(5/14)
    = -9/14 log(9/14)/log2 - 5/14 log(5/14)/log2 = 0.94
• Entropy of Hot
  – E(2/4, 2/4) = -2/4 log2(2/4) - 2/4 log2(2/4) = 1
• Entropy of Mild
  – E(4/6, 2/6) = -4/6 log2(4/6) - 2/6 log2(2/6)
    = -4/6 log(4/6)/log2 - 2/6 log(2/6)/log2 = 0.92
• Entropy of Cold
  – E(3/4, 1/4) = -3/4 log2(3/4) - 1/4 log2(1/4)
    = -3/4 log(3/4)/log2 - 1/4 log(1/4)/log2 = 0.81
• Expected information for the attribute subsets
  – (4/14)*1 + (6/14)*0.92 + (4/14)*0.81 = 0.29 + 0.39 + 0.23 = 0.91
• Information gain:
  – Gain(S, Temperature) = 0.94 - 0.91 = 0.03
Find the Gain of Humidity
[Figure: root node with 9 Yes / 5 No and Entropy(9/14, 5/14), split on Humidity into High (7 examples: 3 Yes, 4 No, Entropy(3/7, 4/7)) and Normal (7 examples: 6 Yes, 1 No, Entropy(6/7, 1/7))]
• Entropy of the set of outcomes
  – E(9/14, 5/14) = -9/14 log2(9/14) - 5/14 log2(5/14)
    = -9/14 log(9/14)/log2 - 5/14 log(5/14)/log2 = 0.94
• High Humidity entropy
  – E(3/7, 4/7) = -3/7 log2(3/7) - 4/7 log2(4/7)
    = -3/7 log(3/7)/log2 - 4/7 log(4/7)/log2 = 0.985
• Normal Humidity entropy
  – E(6/7, 1/7) = -6/7 log2(6/7) - 1/7 log2(1/7)
    = -6/7 log(6/7)/log2 - 1/7 log(1/7)/log2 = 0.592
• Expected information for the attribute subsets
  – (7/14)*0.985 + (7/14)*0.592 = 0.79
• Information gain:
  – Gain(S, Humidity) = 0.94 - 0.79 = 0.15
Find the Gain of Windy
[Figure: root node with 9 Yes / 5 No, split on Windy into True (3 Yes, 3 No) and False (6 Yes, 2 No)]
• Entropy of the set of outcomes
  – Es
• True entropy
  – E1
• False entropy
  – E2
• Expected information for the attribute subsets
• Information gain:
  – Gain(S, Windy) = ?
Computing information gain
• Information gain for each attribute
• Gain("Outlook") = 0.25
• Gain(“Temperature”) = 0.03
• Gain(“Humidity”) = 0.15
• Gain(“Windy”) =
• Find the node with the maximum gain
• The root node is Outlook!
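A small self-contained sketch that reproduces this comparison from the Yes/No counts on the previous slides and picks the attribute with the maximum gain (the entropy and gain helpers are repeated so it runs on its own):

import math

def entropy(counts):
    total = sum(counts)
    return -sum(c / total * math.log(c / total, 2) for c in counts if c > 0)

def information_gain(parent_counts, subset_counts):
    total = sum(parent_counts)
    expected = sum(sum(sub) / total * entropy(sub) for sub in subset_counts)
    return entropy(parent_counts) - expected

# Yes/No counts per attribute value, taken from the worked slides
splits = {
    "Outlook":     [[2, 3], [4, 0], [3, 2]],   # Sunny, Overcast, Rain
    "Temperature": [[2, 2], [4, 2], [3, 1]],   # Hot, Mild, Cold
    "Humidity":    [[3, 4], [6, 1]],           # High, Normal
    "Windy":       [[3, 3], [6, 2]],           # True, False
}
gains = {name: information_gain([9, 5], subsets) for name, subsets in splits.items()}
print(gains)
print("root attribute:", max(gains, key=gains.get))   # Outlook (~0.25) has the largest gain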
Decision Tree
[Figure: the tree so far, with Outlook at the root; the Sunny branch splits on Humidity (High → No, Normal → Yes), the Overcast branch is the leaf Yes, and the Rain branch splits on Windy into Yes/No leaves]
• Overcast node already ended up having leaf node 'Yes'
• Two subtrees of Sunny and Rain to compute information gain for:
  – Humidity
  – Temperature
  – Windy
Overfitting
• Overfitting: A tree may overfit the training data
• Symptoms: tree too deep and too many branches, some may reflect anomalies due
to noise or outliers
• Keep splitting until each node contains 1 example
• Singleton = pure
• Good accuracy on training data but poor on test data
• Two approaches to avoid overfitting
• Pre-pruning: Halt tree construction early
• Stop splitting when not statistically significant
• Difficult to decide because we do not know what may happen subsequently if we keep
growing the tree.
• Post-pruning: Remove branches or sub-trees from a “fully grown” tree.
• This method is commonly used
• Uses a statistical method to estimate the errors at each node for pruning.
• A validation set may be used for pruning as well.
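A sketch of both approaches with scikit-learn's DecisionTreeClassifier (the library and the built-in dataset are illustrative assumptions): stopping criteria such as max_depth act as pre-pruning, while ccp_alpha applies cost-complexity post-pruning to a fully grown tree:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

full = DecisionTreeClassifier(random_state=0)                       # grown until leaves are pure
pre = DecisionTreeClassifier(max_depth=4, min_samples_leaf=5,       # pre-pruning: halt construction early
                             random_state=0)
post = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0)       # post-pruning of the fully grown tree

for name, model in [("full", full), ("pre-pruned", pre), ("post-pruned", post)]:
    model.fit(X_train, y_train)
    print(name, "train:", round(model.score(X_train, y_train), 3),
                "test:", round(model.score(X_test, y_test), 3))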
An example
Postpruning
• Postpruning waits until the full decision tree has been built and then
prunes back its branches
• Two techniques:
• Subtree Replacement
• Subtree Raising
Subtree Replacement
• Entire subtree is replaced by a single leaf node
[Figure: tree before replacement, showing an internal node C with children 1, 2 and 3, alongside sibling leaves 4 and 5]
Subtree Replacement
• Node 2 replaced the subtree
• Generalizes tree a little more, but may increase accuracy
[Figure: tree after replacement, with the subtree collapsed so the leaves are 2, 4 and 5]
Subtree Raising
• Entire subtree is raised onto another node
[Figure: tree before raising, showing node C with children 1, 2 and 3, alongside sibling leaves 4 and 5]
Subtree Raising
• Entire subtree is raised onto another node
[Figure: tree after raising, with subtree C raised in place of node B and its children 1, 2 and 3 retained]
Random Forest (RF)
• Ensemble Classifier
• Consists of many decision trees
• Created from subsets of data
• Random sampling of subsets
• Classification
• Classify using each of the trees in the random forest
• Each classifier predicts the outcome
• Final decision by voting
• The method combines Breiman's "bagging" idea and the random selection
of features
• https://www.youtube.com/watch?v=eM4uJ6XGnSM
• Random Forest Algorithm - Random Forest Explained (45 min)
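A minimal sketch of a random forest with scikit-learn (assumed library; the dataset is just a built-in example). Many trees are grown on bootstrap samples with a random subset of features considered at each split, and the final prediction is the majority vote of the trees:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=100,      # number of decision trees in the ensemble
                                max_features="sqrt",   # random subset of features tried at each split
                                random_state=0)
forest.fit(X_train, y_train)

print("test accuracy:", forest.score(X_test, y_test))
# final decision for one instance = vote across the individual trees
print("predicted class for first test instance:", forest.predict(X_test[:1])[0])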
An example!
[Figure: example random forest of decision trees, each splitting on Age]