
Introduction to Supervised Machine Learning

Nitin Indurkhya
(you can google my name and read about my background)
Acknowledgement
• This set of slides is based on input from A/Prof. Vivek Balachandran,
Prof. Yu Chien Siang and A/Prof. Guo Huaqun!
Course Logistics
• All details are in the module profile in the LMS
• Lab work will involve Python programming
• Work in the same groups as in part 1 of the course
• Let's do a quick poll!
Quick Mentimeter Poll…
Assessment of part-2
• Quizzes each week (5%)
– Note that there is no quiz in week 12
– Quiz 11 will have 2 marks, other quizzes are worth 1 mark
– Quizzes will be from 11:30am to 11:35am
– Machine-marking
– Absolutely no makeups; approved MCs will be extrapolated
• Group Coursework (25%)
• Final exam in week-13 (20%)
– Individual assessment
– One-hour exam (MCQs and short answers) using a lockdown browser in NYP
• Similar to what you did in the first half
Learning Outcomes (for today)
• Define the concept of Machine Learning
• Understand three types of Machine Learning
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
• Understand how supervised learning algorithms work and apply them in the lab assignment and coursework
• K Nearest Neighbors
• Decision Tree
• Random Forest
Machine Learning
• What?
• Like humans: learning from the past
• The science of getting computers to learn without
being explicitly programmed
• Machine Learning is an application of AI wherein
the system gets the ability to automatically learn
and improve based on experience
• Machine Learning Applications
• Top 5 Machine Learning Application For 2022
• https://www.youtube.com/watch?v=aKhz79s-Row
Dimensions of ‘Learning’
• Has one acquired KNOWLEDGE
• Can one UNDERSTAND (meta-knowledge)
• Can one REFINE existing knowledge (background)
• What is the best REPRESENTATION (for specific knowledge)
• What are the different SOURCES (modalities)
• Is PERFORMANCE (of a specific skill) improving
• …many others…
Learning and Intelligence
• Generally accepted that ‘ability to learn’ is CENTRAL to ‘intelligence’
• Hence ‘Machine Learning’ is accepted as CENTRAL to ‘Artificial
Intelligence’ (AI)
• Over time, many so-called ‘Intelligent’ activities were reassessed:
– Game-playing (Chess is the most famous example)
– Information retrieval
• Evolving perspective of ‘Intelligence’
– Whatever CANNOT be done by a computer is ‘intelligence’
– Once an AI-task is ‘solved’ it is no longer considered part of AI
• Hence evolving view of ‘learning’!!!
Machine Learning Definition
• Arthur Samuel (1959). Machine Learning: Field of study that gives
computers the ability to learn without being explicitly programmed.

• Tom Mitchell (1998). Well-posed learning problem: A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
Disciplines relevant to Machine Learning
• Artificial intelligence
• Bayesian methods
• Control theory
• Information theory
• Computational complexity theory
• Philosophy
• Psychology and neurobiology
• Statistics
Machine learning = Pattern Recognition

• "Pattern recognition has its origins in engineering, whereas machine learning grew out of computer science. However, these activities can be viewed as two facets of the same field, and together they have undergone substantial development over the past ten years."

---- Bishop, "Pattern Recognition and Machine Learning"


Statistical learning ≈ Machine learning

• Statistical learning is an area that overlaps heavily with machine learning
• Most machine learning methods come from statistical learning (e.g. clustering)
• The difference:
• Statistical learning focuses on the development and optimization of statistical models (theory)
• Machine learning focuses on model application (practice)
Traditional Programming vs Machine Learning Paradigm

Traditional Programming:
• Data + Program → Computer → Output
• Example: finding the square root of a number

Machine Learning:
• Input Data + Corresponding Output → Computer → Program (model)
• Example: 'predicting' the weather today (a classification task)
‘Machine Learning’ as ‘ML’
• Recall the term ‘Dynamic Programming’ in Algorithms
– Just a name, other programming paradigms were not any less ‘dynamic’.
• ML is a VERY SPECIFIC view of the phrase ‘Machine Learning’
• Objective is to build a MODEL and use it for NEW CASES
• Model is built using a spreadsheet of data (mostly numeric)
• Problems posed as CLASSIFICATION or REGRESSION tasks.
• Very constrained classes of models are considered.
• Has very little to do with LEARNING and PREDICTION
Key points of the ML you will learn
• Spreadsheet of data used as input
– May need some transformations
• Predictions are NOT causal, merely correlations
• Connection of data to the real-world is not always clear-cut
• Iterative nature of modeling task
– Tweaking parameters, adjusting inputs, testing incessantly
• Focus is on a mature set of ML methods
– You won’t just be using them as black boxes but will learn HOW they work!
Supervised Learning

• Supervised learning: a method in which the machine learns using labelled data
• It is like learning under the guidance of a teacher
• The training dataset is like a teacher which is used to train the machine
• The model is trained on a pre-defined dataset before it starts making decisions when given new data
Unsupervised Learning

• Unsupervised learning: a method in which the machine is trained on unlabelled data or without any guidance
• It is like learning without a teacher
• The model learns through observation and finds structures in data
• The model is given a dataset and is left to automatically find patterns and relationships in that dataset by creating clusters
Reinforcement Learning
• Reinforcement learning
• involves an agent that interacts with its
environment by producing actions &
discovers errors or rewards.
• It is like being stuck in an isolated island,
where you must explore the environment
and learn how to live and adapt to the
living conditions on your own.
• Model learns through the trial-and-error
method
• It learns on the basis of reward or penalty
given for every action it performs
Types of Machine Learning

• Definition
  – Supervised Learning: the machine learns by using labeled data
  – Unsupervised Learning: the machine is trained on unlabeled data without any guidance
  – Reinforcement Learning: involves an agent that interacts with its environment by producing actions and discovering errors or rewards
• Types of problems – Supervised: Classification or Regression; Unsupervised: Association or Clustering; Reinforcement: Reward-based
• Types of data – Supervised: Labelled data; Unsupervised: Unlabelled data; Reinforcement: No pre-defined data
• Training – Supervised: External supervision; Unsupervised: No supervision; Reinforcement: No supervision
• Approach – Supervised: Map labeled input to known output; Unsupervised: Understand patterns and discover output; Reinforcement: Follow the trial-and-error method
Supervised learning process
• 2 Stage process
• Learning (training): Learn a model using the training data
• Testing: Test the model using unseen test data to assess the model
accuracy
[Figure: the supervised learning pipeline]
• Training: training input → feature extractor → features, which together with the labels are fed to the machine learning algorithm
• Testing: test input → feature extractor → features → classifier model → predicted label
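Below is a minimal sketch of this two-stage process in Python with scikit-learn (an assumption; the slides only say the labs use Python). The dataset and the choice of classifier are illustrative only.

```python
# A minimal sketch of the two-stage supervised learning process.
# Dataset, feature names and classifier choice are illustrative only.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)          # features and labels

# Stage 1: learning (training) on labelled training data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=42)
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

# Stage 2: testing on unseen data to assess model accuracy
y_pred = model.predict(X_test)
print("Test accuracy:", accuracy_score(y_test, y_pred))
```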
Instance-based learning
• Instance-based Learning
• Learning=storing all training instances
• Classification=assigning target function to a new
instance
• Referred to as “Lazy” learning
• Model is created at the point of classification

• A disadvantage of instance-based methods is that the cost of classifying new instances can be high
• Nearly all computation takes place at classification
time rather than learning time
• Slower in classification
K Nearest Neighbors
• Most basic instance-based method

• Data is represented in a vector space

• Supervised learning

• https://www.youtube.com/watch?v=4HKqjENq9OU
• KNN Algorithm - How KNN Algorithm Works With Example | Data
Science For Beginners (27 min)
KNN Algorithm
• Features
• All instances correspond to points in an n-dimensional Euclidean space
• Classification is delayed till a new instance arrives
• Classification done by comparing feature vectors of the different points
• Target function may be discrete or real-valued
• For discrete-valued, the KNN returns the most common value among the k
nearest training examples.
K-Nearest Neighbor (How it works)

[Figure: classifying a query point with its 1 nearest neighbor]
[Figure: classifying the same query point with its 3 nearest neighbors]
KNN Algorithm
• Training algorithm
• For each training example <x,f(x)> add the example to the list
• Classification algorithm
• Given a query instance xq to be classified
• Let x1,..,xk be k instances which are nearest to xq
  f̂(x_q) = argmax_{v ∈ V} Σ_{i=1..k} δ(v, f(x_i))

• where δ(a, b) = 1 if a = b, else δ(a, b) = 0
• V = finite set of classes/labels
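A from-scratch sketch of this classification rule, assuming numeric feature vectors and Euclidean distance; the function and variable names are illustrative, not from the slides.

```python
# Store all training examples, then label a query point with the most
# common class among its k nearest neighbours (majority vote).
from collections import Counter
import numpy as np

def knn_classify(X_train, y_train, x_query, k=3):
    # distance from the query to every stored training instance
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # indices of the k nearest training examples
    nearest = np.argsort(dists)[:k]
    # argmax over classes of the vote counts delta(v, f(x_i))
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[1.0, 1.0], [1.2, 0.8], [6.0, 6.0], [5.8, 6.2]])
y_train = np.array(["A", "A", "B", "B"])
print(knn_classify(X_train, y_train, np.array([1.1, 0.9]), k=3))  # -> "A"
```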


What is a good value for k?
• Determined experimentally
• Start with k=1 and use a test set to validate the error rate of the
classifier
• Repeat with k=k+1
• Choose the value of k for which the error rate is minimum

• Is K = 10 or K = 11, better?
• How to test efficacy
• N-fold cross validation!
N-fold Cross Validation
• Split the data into N blocks
• Perform the classification N times
• Each round of testing has
• N-1 blocks used for Training
• 1 block used for Testing
• Total of N results are obtained
• Average the error across all N classifications (see the sketch below)
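A hedged sketch of using N-fold cross-validation to choose k, as described above: try several k values and keep the one with the lowest average error across the folds. scikit-learn, the dataset and the range of k values are all illustrative assumptions.

```python
# Pick k for KNN by N-fold cross-validation.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
N = 5  # number of folds/blocks

best_k, best_error = None, 1.0
for k in range(1, 16):
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=N)
    error = 1.0 - scores.mean()      # average error across the N folds
    if error < best_error:
        best_k, best_error = k, error

print("best k:", best_k, "cv error:", round(best_error, 3))
```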
Distance Calculation
• All instances correspond to points in an n-dimensional Euclidean space X = (X1, X2, ..., Xn)
• Distance between two instances is measured with the Euclidean distance
• For two points (x1, x2) and (y1, y2):

  D = sqrt((x1 - y1)² + (x2 - y2)²)
Continuous-valued target functions
• KNN can also approximate a continuous (real-valued) target function f
• Calculate the mean value of the k nearest training examples rather than their most common value

  f̂(x_q) = ( Σ_{i=1..k} f(x_i) ) / k

https://study.com/academy/lesson/discrete-continuous-functions-definition-examples.html
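A minimal sketch of KNN for a continuous-valued target, where the prediction is simply the mean of the k nearest training targets; the data and names are made up for illustration.

```python
import numpy as np

def knn_regress(X_train, y_train, x_query, k=3):
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dists)[:k]
    return y_train[nearest].mean()   # average of the k nearest f(x_i)

X_train = np.array([[1.0], [2.0], [3.0], [10.0]])
y_train = np.array([1.1, 1.9, 3.2, 10.5])
print(knn_regress(X_train, y_train, np.array([2.5]), k=3))  # ~2.07
```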


Curse of Dimensionality
• Imagine instances are described by 20 features
(attributes) but only 3 are relevant to target function
• Curse of dimensionality: nearest neighbor is easily
misled when instance space is high-dimensional
• Dominated by large number of irrelevant features
• https://deepai.org/machine-learning-glossary-and-terms/curse-of-dimensionality

Possible solutions
• Weight features
• Use cross-validation to automatically choose the weights z1,…,zn
• Feature subset selection

[Figure: two wind turbines seem very close to each other in two dimensions but separate when viewed in a third dimension; the curse of dimensionality has the same effect on data]
KNN – When?
• When to use KNN
• Classification problems
• Data has a definite, manageable feature space
• Lots of training data
Advantages:
• Training is very fast
• Learn complex target functions
• Do not lose information
Disadvantages:
• Slow at query time (or classification)
• Easily fooled by irrelevant features (attributes)
Why try Distance methods?
• Easy to apply
• No “model” to build usually
• Most packages have polished implementations
• Non-parametric in nature
• Doesn't make assumptions about population
distribution
• Quite competitive (surprisingly!)
• Theoretical result: E(Bayes) <= E(1-NN) <= 2*E(Bayes) (asymptotically)
• Will discuss Bayes rule later!!
Decision Tree

[Figure: example tree; root node Age: Old → test Sex (Female → Diseased, Male → Healthy), Young → Healthy]

• Tree structure
• Inverted tree starting with a root node
• An internal node is a test on an attribute
• A branch represents an outcome of the test
• A leaf node represents a class label or class label distribution
• At each node, one attribute is chosen to split training examples into distinct classes as much as possible
• A new case is classified by following a matching path to a leaf node
Training: make a decision tree
• Create a list of attributes that can be measured
• Decide on the target attributes that specify different classes
• Create an experience table with these attributes that we have seen in
the past
• Convert the experience table into a decision tree
• e.g. using the ID3 algorithm
Experience Table: To play or not!

• Previous weather data, with the days James played tennis
• This data is used to train the model
• Given a new set of weather data, predict if James will play tennis

Outlook   Temperature  Humidity  Windy  Play?
sunny     hot          high      false  No
sunny     hot          high      true   No
overcast  hot          high      false  Yes
rain      mild         high      false  Yes
rain      cool         normal    false  Yes
rain      cool         normal    true   No
overcast  cool         normal    true   Yes
sunny     mild         high      false  No
sunny     cool         normal    false  Yes
rain      mild         normal    false  Yes
sunny     mild         normal    true   Yes
overcast  mild         high      true   Yes
overcast  hot          normal    false  Yes
rain      mild         high      true   No
Decision Tree sample

Outlook
• Sunny → Humidity
  • High → No
  • Normal → Yes
• Overcast → Yes
• Rain → Wind
  • True → No
  • False → Yes
Decision Tree dissected

• Each internal node tests an attribute (e.g. Outlook, Humidity)
• Each branch corresponds to an attribute value (e.g. Sunny, Overcast, Rain; High, Normal)
• Each leaf node gives a class/decision (e.g. Yes)
A fine new day!

• Test case: sunny, hot, normal, true
• This combination is different from every row in the training data (see the experience table above)
Decision Tree classification

• Test case
  • Outlook = Sunny
  • Humidity = Normal
  • Temperature = hot
  • Windy = true
• Will he play?
  • Yes! (Outlook = Sunny → Humidity = Normal → Yes)
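As a check, here is a hedged sketch of building such a tree from the experience table and classifying the new day. scikit-learn's DecisionTreeClassifier is used in place of ID3 (an assumption) and needs one-hot encoded categorical inputs; the column names come from the table above, everything else is illustrative.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# The experience table from the slides.
data = pd.DataFrame({
    "Outlook":  ["sunny","sunny","overcast","rain","rain","rain","overcast",
                 "sunny","sunny","rain","sunny","overcast","overcast","rain"],
    "Temperature": ["hot","hot","hot","mild","cool","cool","cool",
                    "mild","cool","mild","mild","mild","hot","mild"],
    "Humidity": ["high","high","high","high","normal","normal","normal",
                 "high","normal","normal","normal","high","normal","high"],
    "Windy": [False,True,False,False,False,True,True,
              False,False,False,True,True,False,True],
    "Play": ["No","No","Yes","Yes","Yes","No","Yes",
             "No","Yes","Yes","Yes","Yes","Yes","No"],
})

X = pd.get_dummies(data.drop(columns="Play"))   # one-hot encode attributes
y = data["Play"]
tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

# The new day: sunny, hot, normal humidity, windy.
new_day = pd.DataFrame([{"Outlook": "sunny", "Temperature": "hot",
                         "Humidity": "normal", "Windy": True}])
new_X = pd.get_dummies(new_day).reindex(columns=X.columns, fill_value=0)
print(tree.predict(new_X))   # expected: ['Yes'], matching the hand-built tree
```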
Building Decision Tree
• Top-down tree construction
• At start, all training examples are at the root.
• Partition the examples recursively by choosing one attribute each time.

• Bottom-up tree pruning
• Remove subtrees or branches, in a bottom-up manner, to improve the estimated accuracy on new cases.
Deciding on the splitting attribute
• At each node, available attributes are evaluated on the basis of
separating the classes of the training examples. A Goodness function
is used for this purpose.

• Typical goodness functions:
• information gain (ID3/C4.5)
• information gain ratio
• gini index

https://medium.com/datadriveninvestor/tree-algorithms-id3-c4-5-c5-0-and-cart-413387342164
Which attribute to select?

[Figure: the four candidate splits of the 14 training days, on Outlook (Sunny/Overcast/Rain), Windy (True/False), Humidity (High/Normal) and Temperature (Hot/Mild/Cool), showing the Yes/No outcomes that fall under each branch]
Attribute selection
• Which is the best attribute?
• The one which will result in the smallest tree
• Select the attribute that produces the “purest” nodes (all yes or all no)
• Measured by uncertainty (impurity)
• Popular measure: information gain (the reduction in entropy)
• Information gain increases with the average purity of the subsets that an
attribute produces
• Strategy: choose attribute that results in greatest information gain
Which attribute to split on?

Full set: 9 Yes / 5 No
• Splitting on Outlook: Sunny (2 Yes / 3 No), Overcast (4 Yes / 0 No), Rain (3 Yes / 2 No)
• Splitting on Windy: True (3 Yes / 3 No), False (6 Yes / 2 No)

• Measure the purity of the split
• We are more certain about Yes/No after the split
  • Pure set (4 yes / 0 no): completely certain (100%)
  • Impure set (3 yes / 3 no): completely uncertain (50%)
• Cannot simply use P("yes" | set)
  • Must be symmetric: (4 yes / 0 no) is as pure as (0 yes / 7 no)
Entropy
Entropy is the measure of randomness or unpredictability in the dataset
• S is a sample of training examples in a subset
• p+ is the proportion of positive examples
• p- is the proportion of negative examples
• Entropy measures the impurity of subset S
Entropy(S) = -p+ log2 p+ - p- log2 p-

• Interpretation: if an item X belongs to S, how many bits are needed to tell whether X is positive or negative?
• Impure (3 yes / 3 no):
  • Entropy(S) = -(3/6) log2(3/6) - (3/6) log2(3/6) = -log2(1/2) = 1 bit
• Pure (4 yes / 0 no):
  • Entropy(S) = -(4/4) log2(4/4) - (0/4) log2(0/4) = -log2(1) - 0 = 0 bits

(Note: log2 x = log x / log 2)
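A small Python sketch of this entropy formula for a subset given its counts of positive (Yes) and negative (No) examples; the function and variable names are illustrative.

```python
import math

def entropy(pos, neg):
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:                      # 0 * log2(0) is treated as 0
            p = count / total
            result -= p * math.log2(p)
    return result

print(entropy(3, 3))   # impure set (3 yes / 3 no)  -> 1.0 bit
print(entropy(4, 0))   # pure set   (4 yes / 0 no)  -> 0.0 bits
print(entropy(9, 5))   # the full play-tennis set   -> ~0.94 bits
```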
Information Gain
• Entropy tells how pure or impure one subset is
• How to combine entropy of all subsets?
• Aggregate information from several different subsets
• Average them?
• Not a simple average (Why?)
• Weight on the entropy value for each subset
• Proportional size of the subset
• Information Gain
• Entropy difference before and after the split

  Gain(S, A) = Entropy(S) - Σ_{v ∈ Values(A)} (|S_v| / |S|) · Entropy(S_v)
Find the Gain of Outlook

Splitting on Outlook: S = {9 Yes, 5 No}; Sunny (2 Yes / 3 No), Overcast (4 Yes / 0 No), Rain (3 Yes / 2 No)

  Gain(S, A) = Entropy(S) - Σ_v (|S_v| / |S|) · Entropy(S_v)
  with A = Outlook, V = {Sunny, Overcast, Rain} and S_v the subset of S for each v in V

• Entropy of the set of outcomes, before the split (at the root node)
  • E(9/14, 5/14) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.94
• Entropy of Sunny
  • E(2/5, 3/5) = -(2/5) log2(2/5) - (3/5) log2(3/5) = 0.971
• Entropy of Overcast
  • E(4/4, 0/4) = -(4/4) log2(4/4) - (0/4) log2(0/4) = 0
• Entropy of Rain
  • E(3/5, 2/5) = -(3/5) log2(3/5) - (2/5) log2(2/5) = 0.971
• Expected information for the attribute subsets
  • (5/14)·0.971 + (4/14)·0 + (5/14)·0.971 = 0.69
• Information gain
  • Gain(S, Outlook) = 0.94 - 0.69 = 0.25
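A small sketch that reproduces this calculation in Python: an entropy helper (repeated here so the snippet is self-contained) plus an information_gain function that subtracts the size-weighted subset entropies, checked against Gain(S, Outlook) ≈ 0.25. Names are illustrative.

```python
import math

def entropy(pos, neg):
    """Entropy(S) = -p+ log2 p+ - p- log2 p-, with 0*log2(0) treated as 0."""
    total = pos + neg
    return -sum((c / total) * math.log2(c / total) for c in (pos, neg) if c)

def information_gain(parent_counts, subset_counts):
    """Gain(S, A) = Entropy(S) - sum over v of (|S_v|/|S|) * Entropy(S_v)."""
    parent_total = sum(parent_counts)
    gain = entropy(*parent_counts)
    for pos, neg in subset_counts:
        gain -= (pos + neg) / parent_total * entropy(pos, neg)
    return gain

# S = {9 Yes, 5 No}; Outlook splits it into Sunny 2/3, Overcast 4/0, Rain 3/2
print(round(information_gain((9, 5), [(2, 3), (4, 0), (3, 2)]), 2))  # ~0.25
```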
Find the Gain of Temperature

Splitting on Temperature: Hot (2 Yes / 2 No), Mild (4 Yes / 2 No), Cool (3 Yes / 1 No)

• Entropy of the set of outcomes
  • E(9/14, 5/14) = 0.94 (as before)
• Entropy of Hot
  • E(2/4, 2/4) = -(2/4) log2(2/4) - (2/4) log2(2/4) = 1
• Entropy of Mild
  • E(4/6, 2/6) = -(4/6) log2(4/6) - (2/6) log2(2/6) = 0.92
• Entropy of Cool
  • E(3/4, 1/4) = -(3/4) log2(3/4) - (1/4) log2(1/4) = 0.81
• Expected information for the attribute subsets
  • (4/14)·1 + (6/14)·0.92 + (4/14)·0.81 = 0.29 + 0.39 + 0.23 = 0.91
• Information gain
  • Gain(S, Temperature) = 0.94 - 0.91 = 0.03
Find the Gain of Humidity

Splitting on Humidity: High (3 Yes / 4 No), Normal (6 Yes / 1 No)

• Entropy of the set of outcomes
  • E(9/14, 5/14) = 0.94 (as before)
• Entropy of High
  • E(3/7, 4/7) = -(3/7) log2(3/7) - (4/7) log2(4/7) = 0.985
• Entropy of Normal
  • E(6/7, 1/7) = -(6/7) log2(6/7) - (1/7) log2(1/7) = 0.592
• Expected information for the attribute subsets
  • (7/14)·0.985 + (7/14)·0.592 = 0.79
• Information gain
  • Gain(S, Humidity) = 0.94 - 0.79 = 0.15
Find the Gain of Windy

Splitting on Windy: True (3 Yes / 3 No), False (6 Yes / 2 No)

• Entropy of the set of outcomes: Es
• Entropy of True: E1
• Entropy of False: E2
• Expected information for the attribute subsets
• Information gain: Gain(S, Windy) = ? (left as an exercise)
Computing information gain
• Information gain for each attributes
• Gain(“Outlook’) = 0.25
• Gain(“Temperature”) = 0.03
• Gain(“Humidity”) = 0.15
• Gain(“Windy”) = ? (exercise)
• Find the node with the maximum gain
• The root node is Outlook!
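The same helpers can check all four gains. The sketch below reuses the entropy() and information_gain() functions from the earlier sketch (run that first); the (Yes, No) counts per attribute value are read off the experience table.

```python
# (Yes, No) counts per attribute value, taken from the experience table.
splits = {
    "Outlook":     [(2, 3), (4, 0), (3, 2)],   # sunny, overcast, rain
    "Temperature": [(2, 2), (4, 2), (3, 1)],   # hot, mild, cool
    "Humidity":    [(3, 4), (6, 1)],           # high, normal
    "Windy":       [(3, 3), (6, 2)],           # true, false
}
for attribute, counts in splits.items():
    print(attribute, round(information_gain((9, 5), counts), 2))
# Outlook should come out highest (~0.25), so it is chosen as the root node.
```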
Decision Tree

[Figure: partial tree with Outlook at the root; the Sunny branch splits on Humidity (High → No, Normal → Yes), the Overcast branch is a 'Yes' leaf, and the Rain branch splits on Windy (True → No, False → Yes)]

• The Overcast node already ends in the leaf node 'Yes'
• For the two remaining subtrees (Sunny and Rain), compute the information gain again over the remaining attributes:
  • Humidity
  • Temperature
  • Windy
Overfitting
• Overfitting: A tree may overfit the training data
• Symptoms: tree too deep and too many branches, some may reflect anomalies due
to noise or outliers
• Keep splitting until each node contains 1 example
• Singleton = pure
• Good accuracy on training data but poor on test data
• Two approaches to avoid overfitting
• Pre-pruning: Halt tree construction early
• Stop splitting when not statistically significant
• Difficult to decide because we do not know what may happen subsequently if we keep
growing the tree.
• Post-pruning: Remove branches or sub-trees from a “fully grown” tree.
• This method is commonly used
• Uses a statistical method to estimate the errors at each node for pruning.
• A validation set may be used for pruning as well.
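A hedged sketch of the two strategies in scikit-learn terms: pre-pruning by limiting growth (max_depth, min_samples_leaf) and post-pruning via cost-complexity pruning (ccp_alpha) of a fully grown tree. The dataset and parameter values are illustrative assumptions, not from the slides.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
pre  = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5,
                              random_state=0).fit(X_train, y_train)
post = DecisionTreeClassifier(ccp_alpha=0.01,
                              random_state=0).fit(X_train, y_train)

# The fully grown tree typically fits training data best but may overfit;
# the pruned trees trade some training accuracy for better generalization.
for name, model in [("fully grown", full), ("pre-pruned", pre), ("post-pruned", post)]:
    print(name, "train:", round(model.score(X_train, y_train), 3),
          "test:", round(model.score(X_test, y_test), 3))
```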
Postpruning
• Postpruning waits until the full decision tree has been built and then prunes its branches
• Two techniques:
• Subtree Replacement
• Subtree Raising
Subtree Replacement

• An entire subtree is replaced by a single leaf node

[Figure: before pruning, the subtree rooted at node C (children 1, 2, 3) sits alongside nodes 4 and 5]

• Node 2 replaces the subtree: the tree now has leaf 2 alongside nodes 4 and 5
• Generalizes the tree a little more, but may increase accuracy on new cases
Subtree Raising

• An entire subtree is raised onto another node

[Figure: the subtree rooted at node C (children 1, 2, 3) is raised to take the place of a node (B) higher up in the tree]
Random Forest (RF)
• Ensemble Classifier
• Consists of many decision trees
• Created from subsets of data
• Random sampling of subsets
• Classification
• Classify using each of the trees in the random forest
• Each classifier predicts the outcome
• Final decision by voting
• The method combines Breiman's "bagging" idea and the random selection
of features
• https://www.youtube.com/watch?v=eM4uJ6XGnSM
• Random Forest Algorithm - Random Forest Explained (45 min)
An example!

[Figure: three decision trees grown from different random subsets of the data; one splits on Age and then Height, one on Age and then Sex, and one on Work status and then Height]

• New sample: old, retired, male, short
• Each tree predicts an outcome for the sample (here the trees predict Diseased and Healthy)
• The final decision is made by majority rule → Diseased
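A minimal sketch of the same idea with scikit-learn's RandomForestClassifier: many trees grown on random samples of rows and random subsets of features, with the final prediction taken by majority vote. The dataset and parameter values are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,      # number of decision trees in the ensemble
    max_features="sqrt",   # random selection of features at each split
    bootstrap=True,        # bagging: each tree sees a random sample of rows
    random_state=0,
).fit(X_train, y_train)

# Each tree votes; predict() returns the majority class across all trees.
print("test accuracy:", round(forest.score(X_test, y_test), 3))
```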


Advantages of Random Forests
The advantages of random forest are:
• It is one of the most accurate learning algorithms available. For many
data sets, it produces a highly accurate classifier.
• It runs efficiently on large databases.
• It can handle thousands of input variables without variable deletion.
• It has an effective method for estimating missing data and maintains
accuracy when a large proportion of the data is missing.
• It is faster due to the smaller size of the trees.
Summary
• Machine Learning
• Supervised Algorithm
• Supervised learning process – Train and Test
• KNN
• Lazy learning
• N-fold cross validation
• Decision Trees
• Tree structure
• Entropy/Information gain
• Overfitting
• Random Forest
• Subsets of data
• Collection of Trees
