Medical Insurance Cost Prediction

Uploaded by

kowirix805
SUMMER INTERNSHIP PROJECT

PRESENTED BY,
B182006 (M.Praveen)
B182096 (A.Prathyusha)
Dept of ECE.
Table of Contents
 Abstract

 Work flow

 Proposed system

 Attribute information

 Implementation

 Introduction to machine learning model

 Conclusion
Abstract

Health insurance companies have a tough task in determining premiums for their
customers. While health care law in the United States does set some rules for
the companies to follow when determining premiums, it is really up to the companies
to decide which factors they want to give more weight to.

Using Linear Regression (a Machine Learning technique), this project tries to determine
the factors (independent variables) that are most significant to an insurance company.
Work flow
PROPOSED SYSTEM
 The system starts by collecting the data and selecting the important attributes.
 The data is then pre-processed into the required format.
 The data is then divided into two parts: training data and testing data.
 The algorithms are applied and the model is trained using the training data.
 The accuracy of the system is obtained by testing it on the testing data.
 This system is implemented using the following modules:
1. Collection of data set
2. Selection of attributes
3. Data pre-processing
4. Balancing of data
5. Insurance cost prediction
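
The train/test division described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the helper name `train_test_split`, the 80/20 ratio, the seed, and the records themselves are all assumptions made for the example.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Made-up records of the form (age, bmi, charges), for illustration only.
records = [(20 + i, 20.0 + i, 1000.0 * (i + 1)) for i in range(10)]

train_rows, test_rows = train_test_split(records, test_fraction=0.2)
print(len(train_rows), "training rows,", len(test_rows), "testing rows")
```

The model is fitted only on `train_rows`; `test_rows` is held back so that the reported accuracy reflects data the model has not seen.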
DATA SET INFORMATION
ATTRIBUTE INFORMATION:

 Age

 Sex

 BMI

 Number of Children

 Smoker

 Region

 Charges
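
Sex, Smoker and Region are categorical attributes, so they have to be turned into numbers before a regression model can use them. A minimal sketch of one possible encoding is shown below. The dictionary keys, the category labels ("male", "yes") and the region names are assumptions made for illustration; the actual values depend on the data set.

```python
def encode_record(record, regions):
    """Turn one raw record (a dict) into a list of numeric features."""
    features = [
        float(record["age"]),
        float(record["bmi"]),
        float(record["children"]),
        1.0 if record["sex"] == "male" else 0.0,    # binary flag
        1.0 if record["smoker"] == "yes" else 0.0,  # binary flag
    ]
    # One-hot encode region: one 0/1 column per known region.
    features += [1.0 if record["region"] == r else 0.0 for r in regions]
    return features

regions = ["north", "south"]  # hypothetical region labels
sample = {"age": 30, "sex": "female", "bmi": 24.5, "children": 1,
          "smoker": "yes", "region": "south", "charges": 5000.0}  # made-up record

x = encode_record(sample, regions)  # independent variables
y = sample["charges"]               # dependent variable (target)
print(x, y)
```

Charges is kept separate because it is the value to be predicted, not an input.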
Implementation

 Collecting the required data

 Preparing and pre-processing the data, as required for good accuracy

 Choosing the model that gives the best accuracy with minimal error while
reducing overfitting

 Training the model

 Evaluating the model with the validation data


MACHINE LEARNING

 Machine Learning is a subfield of Artificial Intelligence, which is broadly
defined as the capability of a machine to imitate intelligent human behaviour.

 Machine learning is used in:

o Internet search engines

o Email filters to sort out spam

o Banking software to detect unusual transactions

o Many apps on our phones, such as speech recognition and photo detection
Model /Algorithm used : Linear Regression

 Linear Regression is one of the simplest Machine Learning algorithms based on
“Supervised Learning”.

 Linear Regression predicts the value of a dependent variable (Y) based on a
given independent variable (X).

 When there is a single input variable, the method is referred to as Simple
Linear Regression.

 When there are multiple input variables, the method is referred to as Multiple
Linear Regression.
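
To make the multiple-input case concrete, here is a minimal sketch of a least-squares fit with two input variables, using only the Python standard library. It solves the normal equations by Gaussian elimination; the function names and the data are made up for illustration, and this is not the project's own implementation.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_multiple_linear_regression(rows, ys):
    """Least-squares fit of y = w0 + w1*x1 + w2*x2 + ... (normal equations)."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 gives the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Made-up data generated from y = 1 + 2*x1 + 10*x2.
rows = [(1.0, 0.0), (2.0, 1.0), (3.0, 0.0), (4.0, 1.0), (5.0, 1.0)]
ys = [3.0, 15.0, 7.0, 19.0, 21.0]

weights = fit_multiple_linear_regression(rows, ys)
print([round(w, 6) for w in weights])  # intercept first, then one weight per input
```

Each fitted weight indicates how much the prediction changes when that input increases by one unit while the others stay fixed.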
Algorithms used :

 Multiple Linear Regression

 Decision Tree

 Random Forest
Linear Regression Equation :
Y = A+B*X

where,
X : input variable (Training data)
B : coefficient of X
A : Intercept
Y : Predicted value
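
As a worked example of the equation Y = A + B*X, the sketch below estimates A and B by ordinary least squares for a single input variable. The function name and the data points are made up for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (A, B) for Y = A + B*X using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # coefficient of X
    a = mean_y - b * mean_x  # intercept
    return a, b

# Made-up training data lying exactly on Y = 1 + 2*X.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

a, b = fit_simple_linear_regression(xs, ys)
predicted = a + b * 5.0  # prediction for a new input X = 5
print(a, b, predicted)
```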
Decision Tree

 A Decision tree is a supervised learning technique in machine learning.
 It is used for both classification and regression. In
this algorithm, the data is split according to the
parameters.
 A Decision tree is a tree that contains nodes
and leaves.
 At the leaves we get the outcomes or decisions, and
at the nodes the data is split.
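
The splitting at a node can be illustrated with a tree of depth one (a single node and two leaves). The sketch below picks the threshold that minimises the squared error of the two leaves, which is a common criterion for regression trees; it is an assumption here, since the slides do not name the criterion used. The function names and data are made up.

```python
def squared_error(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) with the lowest squared error."""
    candidates = sorted(set(xs))
    if len(candidates) < 2:
        raise ValueError("need at least two distinct input values to split")
    best = None
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2  # midpoint between neighbouring values
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        error = squared_error(left) + squared_error(right)
        if best is None or error < best[0]:
            best = (error, threshold, sum(left) / len(left), sum(right) / len(right))
    return best[1:]

# Made-up data: two clearly separated groups.
xs = [20, 25, 30, 50, 55, 60]
ys = [100.0, 110.0, 105.0, 300.0, 310.0, 305.0]

threshold, left_mean, right_mean = best_split(xs, ys)
print(threshold, left_mean, right_mean)
```

The node tests whether the input is at most `threshold`; each leaf predicts the mean of the training targets that reached it. A full tree repeats this split inside each leaf.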
Random Forest

 Random Forest is a supervised learning algorithm. It is based on the concept of
ensemble learning.

 It uses the bagging technique to improve the performance of Decision trees.

 It is flexible and easy to use.

 Random Forest creates Decision trees on randomly selected data samples, gets a
prediction from each tree, and combines them: by voting for classification, or by
averaging for regression.
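
The bagging idea can be sketched with the standard library alone. In the illustration below each "tree" is a one-split regression tree trained on a bootstrap sample, and the forest averages their predictions. This is a simplification: a full Random Forest grows deeper trees and also samples the features at each split. All names, the number of trees, the seed and the data are assumptions made for the example.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression tree: (threshold, left_mean, right_mean)."""
    overall = sum(ys) / len(ys)
    best = (float("inf"), None, overall, overall)  # fallback: constant prediction
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]

def predict_stump(stump, x):
    t, lm, rm = stump
    if t is None:  # sample had a single distinct input value
        return lm
    return lm if x <= t else rm

def fit_forest(xs, ys, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """Regression forests average the predictions of the individual trees."""
    return sum(predict_stump(s, x) for s in forest) / len(forest)

# Made-up data: two clearly separated groups.
xs = [20, 25, 30, 50, 55, 60]
ys = [100.0, 105.0, 110.0, 300.0, 305.0, 310.0]

forest = fit_forest(xs, ys)
low, high = predict_forest(forest, 22), predict_forest(forest, 58)
print(low, high)
```

Because every tree sees a slightly different sample, the averaged prediction is usually more stable than that of any single tree.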
Performance Evaluation

 Correlation between predicted and actual results

We can calculate the R² score, a regression evaluation metric, to check each
model's performance, see how the model behaves, and decide which model fits best.
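
For reference, the R² score compares the model's squared error with that of always predicting the mean of the actual values: R² = 1 - SS_res / SS_tot. A minimal sketch with made-up numbers:

```python
def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Made-up actual and predicted values, for illustration only.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

score = r2_score(actual, predicted)
print(round(score, 3))
```

A score of 1 means perfect predictions, 0 means the model is no better than predicting the mean, and the model with the highest score on the testing data is the best fit among those compared.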
Conclusion

 In this project, three regression models are evaluated on individual Medical
Insurance data. The Medical Insurance data was used to develop the three
regression models, and the premiums predicted by these models were
compared with the actual premiums to compare their accuracies. The
Random Forest regression model was found to be the best performing
model.

 The effect of the various independent variables on the premium amount was
checked.
Future Scope:

 Premium amount prediction focuses on a person's own health, rather
than on other companies' insurance terms and conditions.

 The models can be applied to the data collected in coming years to
predict the premium.

 This can help not only people but also insurance companies to work in
tandem for better and more health-centric insurance amounts.
THANK YOU
