
5. Random Forest

Random forest is an ensemble learning method that constructs multiple decision trees and outputs the class that is the mode of the classes of the individual trees. It works by constructing trees using randomly selected subsets of the training data and features for each tree. This helps reduce variance and helps avoid overfitting. Random forest can be used for both classification and regression problems and is useful for feature selection. It has applications in banking, medicine, stock markets, and e-commerce for tasks like prediction, identification, and classification.

Uploaded by

patil_555
Copyright
© © All Rights Reserved

Machine Learning

Sunbeam Infotech www.sunbeaminfo.com


Random Forest


[Figure: data (rows) -> models (learners) -> predictions on test data -> combining results using mean/mode -> final prediction]

Overview

§ A random forest classifier creates a set of decision trees, each from a randomly selected subset of the training set

§ It then aggregates the votes from the different decision trees (taking the mode) to decide the final class of the test object

§ This works well because a single decision tree may be prone to noise, but the aggregate of many decision trees reduces the effect of noise, giving more accurate results



How does it work?

§ Suppose the training set is given as [X1, X2, X3, X4] with corresponding labels [L1, L2, L3, L4]. Random forest may create three decision trees, each taking a subset as input, for example:

[X1, X2, X3]
[X1, X2, X4]
[X2, X3, X4]

§ Finally, it combines the trees and predicts based on the majority of votes from the decision trees
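
The subset-and-vote idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the numeric inputs, the "low"/"high" labels, and the helper names `fit_stump` and `predict` are hypothetical, and each tree is reduced to a single split (a stump). A real random forest draws its subsets at random, usually with replacement, and also samples features.

```python
from collections import Counter

def fit_stump(xs, ys):
    """Fit a one-split decision tree (a stump) on 1-D inputs."""
    pairs = list(zip(xs, ys))
    values = sorted(set(xs))
    majority = Counter(ys).most_common(1)[0][0]
    best = None  # (errors, threshold, left_label, right_label)
    # Candidate thresholds are midpoints between consecutive distinct inputs.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = Counter(y for x, y in pairs if x <= t)
        right = Counter(y for x, y in pairs if x > t)
        left_label = left.most_common(1)[0][0]
        right_label = right.most_common(1)[0][0]
        errors = len(pairs) - left[left_label] - right[right_label]
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    if best is None:  # all inputs identical: always predict the majority label
        return lambda x: majority
    _, t, left_label, right_label = best
    return lambda x: left_label if x <= t else right_label

# Hypothetical 1-D data standing in for [X1, X2, X3, X4] and [L1, L2, L3, L4].
X = [1.0, 2.0, 8.0, 9.0]
L = ["low", "low", "high", "high"]

# The three subsets from the text, as 0-based indices:
# [X1, X2, X3], [X1, X2, X4], [X2, X3, X4].
subsets = [(0, 1, 2), (0, 1, 3), (1, 2, 3)]
trees = [fit_stump([X[i] for i in s], [L[i] for i in s]) for s in subsets]

def predict(trees, x):
    """Return the majority vote (mode) of the trees' predictions."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

print(predict(trees, 1.5), predict(trees, 8.5))
```

With this toy data the three stumps split at 5, 5.5 and 5, so for an input such as 5.2 they disagree (high, low, high) and the majority vote decides.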



Case Study

§ A certain country has a population of 118 million

§ Salary data is collected with the following attributes

§ Age, Gender, Education, Residence, Industry

§ Salary bands:

§ Band 1: Below $40,000 (lower)

§ Band 2: $40,000 to $150,000 (middle)

§ Band 3: More than $150,000 (upper)



Case Study

§ The following are the outputs of the 5 different CART models.


[Figure: outputs of the five CART decision trees (DT1 to DT5)]


Case Study

§ Using these 5 CART models, we need to come up with a single set of probabilities of belonging to each of the salary classes

§ For simplicity, we will just take the mean of the probabilities in this case study. Besides the simple mean, a voting method can also be considered to come up with the final prediction.

§ To come up with the final prediction, let's locate the following profile in each CART model:
§ 1. Age: 35 years -> 60%
§ 2. Gender: Male -> 70%
§ 3. Highest Educational Qualification: Diploma holder -> 80%
§ 4. Industry: Manufacturing -> 60%
§ 5. Residence: Metro -> 70%



Case Study

§ For each of these CART models, the following is the distribution across salary bands:

[Table: distribution across the three salary bands for the matching segment in each of the five CART models]



§ The final probability is simply the average of the probabilities for the same salary band across the different CART models
§ As this analysis shows, there is a 70% chance of this individual falling in class 1 (less than $40,000) and around a 24% chance of the individual falling in class 2
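
The averaging step, and the voting alternative mentioned earlier, might look like the following sketch. The probabilities in `model_probs` are made-up illustrative numbers, not the values from the case study table, and the function names are hypothetical.

```python
from collections import Counter

BANDS = ["Band 1", "Band 2", "Band 3"]

# Hypothetical per-model probabilities for one profile, one row per CART model.
# These are illustrative numbers, not the values from the case study.
model_probs = [
    [0.5, 0.4, 0.1],
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.8, 0.1, 0.1],
    [0.3, 0.5, 0.2],
]

def mean_probabilities(model_probs):
    """Average each band's probability across the models."""
    n_models = len(model_probs)
    n_bands = len(model_probs[0])
    return [sum(p[k] for p in model_probs) / n_models for k in range(n_bands)]

def majority_vote(model_probs):
    """Let each model vote for its most probable band; return the winning index."""
    votes = Counter(max(range(len(p)), key=p.__getitem__) for p in model_probs)
    return votes.most_common(1)[0][0]

averaged = mean_probabilities(model_probs)
for band, prob in zip(BANDS, averaged):
    print(f"{band}: {prob:.2f}")
print("Vote winner:", BANDS[majority_vote(model_probs)])
```

Averaging keeps the full probability for every band, whereas voting keeps only each model's top choice, so the two methods can disagree when the models are split.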



Advantages of Random Forest algorithm

§ For classification problems, the Random Forest algorithm helps avoid the overfitting problem

§ The same Random Forest algorithm can be used for both classification and regression tasks

§ The Random Forest algorithm can be used to identify the most important features in the training dataset, in other words, for feature selection
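
One rough way to see how a forest can rank features is to count how often each feature is chosen for a split across the trees. The sketch below does this with one-split trees on bootstrap samples. It is a simplified stand-in for the impurity-based importances that libraries typically report, and the data and function names are hypothetical.

```python
import random
from collections import Counter

def best_split_feature(rows, labels):
    """Return the index of the feature whose best single threshold
    misclassifies the fewest rows (ties go to the lower index)."""
    best = None  # (errors, feature)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = Counter(y for r, y in zip(rows, labels) if r[f] <= t)
            right = Counter(y for r, y in zip(rows, labels) if r[f] > t)
            errors = len(rows) - max(left.values()) - max(right.values())
            if best is None or errors < best[0]:
                best = (errors, f)
    return None if best is None else best[1]

def split_frequency_importance(rows, labels, n_trees=25, seed=0):
    """Fraction of trees in which each feature gave the best split."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_trees):
        # Bootstrap sample: draw row indices with replacement.
        idx = [rng.randrange(len(rows)) for _ in range(len(rows))]
        f = best_split_feature([rows[i] for i in idx], [labels[i] for i in idx])
        if f is not None:
            counts[f] += 1
    total = sum(counts.values())
    return [counts[f] / total for f in range(len(rows[0]))]

# Hypothetical data: feature 0 is uninformative (alternates 0/1),
# feature 1 determines the label.
rows = [(i % 2, i) for i in range(20)]
labels = ["high" if i >= 10 else "low" for i in range(20)]

importance = split_frequency_importance(rows, labels)
print(importance)
```

On this toy data the informative feature should receive most of the weight, which is the signal one would use to keep it and drop the other.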



Applications of Random Forest

§ In banking, the Random Forest algorithm is used to find loyal customers, meaning customers who take out plenty of loans and pay interest to the bank properly, and fraudulent customers, meaning customers with bad records such as failing to pay back a loan on time or engaging in risky behavior.
§ In medicine, the Random Forest algorithm can be used both to identify the correct combination of components in a medicine and to identify diseases by analyzing the patient's medical records.
§ In the stock market, the Random Forest algorithm can be used to identify a stock's behavior and the expected loss or profit.
§ In e-commerce, the Random Forest algorithm can be used to predict whether a customer will like the recommended products, based on the experience of similar customers.



[Handwritten notes: regularization; overfitting vs. underfitting (training accuracy vs. testing accuracy); ensemble; CV]
