Data Science R SLB

Uploaded by

rkddyuvan

PROGRAM NAME : Data Science using R

DEPARTMENT : Computer Science Engineering / Information Technology

DURATION OF THE PROGRAM : 40 Hours

Description:
Data science is a "concept to unify statistics, data analysis, machine learning and their
related methods" in order to "understand and analyze actual phenomena" with data. It employs
techniques and theories drawn from many fields within the context of mathematics, statistics,
computer science, and information science. Fueled by big data and AI, demand for data science
skills is growing rapidly, according to job sites, while the supply of skilled applicants is
growing at a slower pace. It's a great time to be a data scientist entering the job market. ... "More
employers than ever are looking to hire data scientists."

Course Outcome:

1. Understand concepts around Business Intelligence and Business Analytics

2. Explore Recommendation Systems with techniques such as Association Rule Mining, user-based
collaborative filtering, and item-based collaborative filtering, among others

3. Apply various supervised machine learning techniques

4. Perform Analysis of Variance (ANOVA)

5. Learn where to use algorithms such as Decision Trees, Logistic Regression, Support Vector Machines,
and Ensemble Techniques

6. Use various packages in R to create fancy plots


SESSION PLAN :

Module 1:-
Statistics Theory:
1 Introduction to Data Science
2 Scope of Data Science
3 Application of Data Science
4 Introduction to Statistics
5 Graphical and Tabular Descriptive Statistics
6 Probability
7 Probability Distribution
8 Hypothesis Testing
9 Statistical Tests (Z-Test, Chi-Square, T-Tests, etc.)
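
As an illustration of the hypothesis testing and statistical tests topics above, the following is a minimal sketch in base R. The choice of the built-in mtcars data set and the questions asked of it are assumptions made for this example; they are not part of the syllabus.

```r
# Two-sample t-test on the built-in mtcars data set:
# do automatic (am == 0) and manual (am == 1) cars differ in mean mpg?
auto_mpg   <- mtcars$mpg[mtcars$am == 0]
manual_mpg <- mtcars$mpg[mtcars$am == 1]

t_result <- t.test(auto_mpg, manual_mpg)   # Welch t-test is the default
print(t_result$p.value)

# Chi-square test of independence between cylinder count and transmission type
cyl_am_table <- table(mtcars$cyl, mtcars$am)
# Small cell counts trigger an approximation warning, suppressed here
chi_result <- suppressWarnings(chisq.test(cyl_am_table))
print(chi_result$p.value)
```

A small p-value suggests rejecting the null hypothesis (equal means for the t-test, independence for the chi-square test) at the chosen significance level.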

Module 2:-
R Programming, Data Handling and Basic Statistics
1. Introduction to the Analytics Tool (R)
a. Introduction to Data Analysis
b. Introduction to R programming
c. R Environment and Basic Commands

2. Data Handling in R
a. Importing data
b. Sampling
c. Data Exploration
d. Creating calculated fields
e. Sorting & removing duplicates

3. Basic Descriptive Statistics
a. Population and Sample
b. Measures of Central tendency
c. Measures of dispersion

4. Reporting and Data Validation
a. Percentiles & Quartiles
b. Box plots and outlier detection
c. Creating Graphs and Reporting
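
The data handling and reporting topics in this module could be practiced along the following lines. This is a sketch only: the `sales` data frame and its column names are hypothetical and invented for illustration.

```r
# Hypothetical toy data; column names are illustrative assumptions
sales <- data.frame(
  id    = c(1, 2, 2, 3, 4, 5),
  units = c(10, 5, 5, 8, 12, 100),
  price = c(2.5, 4.0, 4.0, 3.0, 2.0, 1.5)
)

sales <- sales[!duplicated(sales), ]          # remove duplicate rows
sales$revenue <- sales$units * sales$price    # create a calculated field
sales <- sales[order(-sales$revenue), ]       # sort by revenue, descending

# Percentiles and quartiles of units sold
quartiles <- quantile(sales$units, probs = c(0.25, 0.5, 0.75))

# Values beyond the boxplot whiskers (1.5 times the box length by default)
outliers <- boxplot.stats(sales$units)$out

print(quartiles)
print(outliers)
```

Calling `boxplot(sales$units)` would draw the same outlier as an individual point beyond the whisker.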

Module 3:-
Project-1 - Data Exploration, Validation and Cleaning Project
1. Project on Data handling
2. Data exploration
3. Data validation
4. Missing values identification
5. Outliers identification
6. Data Cleaning
7. Basic Descriptive statistics
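
A possible starting point for the missing values and data cleaning steps of this project is sketched below. It assumes the built-in airquality data set as a stand-in for the project data, and median imputation is just one simple choice among many.

```r
# airquality ships with base R and contains missing values
missing_per_column <- colSums(is.na(airquality))
print(missing_per_column)

# Option 1: keep only rows with no missing values
complete_rows <- airquality[complete.cases(airquality), ]

# Option 2: impute missing Ozone readings with the column median
imputed <- airquality
imputed$Ozone[is.na(imputed$Ozone)] <- median(imputed$Ozone, na.rm = TRUE)

# Basic descriptive statistics after cleaning
print(summary(imputed$Ozone))
```

Dropping rows is simpler but discards data; imputation keeps every row at the cost of distorting the distribution of the imputed column.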

Module 4:-
Regression Analysis & Logistic Regression Model Building
1. Regression Analysis
a. Correlation
b. Simple Regression models
c. R-Square
d. Multiple regression
e. Multicollinearity
f. Individual Variable Impact

2. Logistic Regression
a. Need for Logistic Regression
b. Logistic regression models
c. Validation of logistic regression models
d. Multicollinearity in logistic regression
e. Individual Impact of variables
f. Confusion Matrix
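
The regression topics above might be explored with a sketch like the following, using only base R. The mtcars data set, the chosen predictors, and the 0.5 classification cutoff are assumptions made for illustration.

```r
# Multiple linear regression: mpg explained by weight and horsepower
lin_fit   <- lm(mpg ~ wt + hp, data = mtcars)
r_squared <- summary(lin_fit)$r.squared
print(r_squared)

# Logistic regression: probability of a manual transmission (am == 1)
log_fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)

# Classify with a 0.5 probability cutoff (an assumption, tune as needed)
pred_class <- factor(as.integer(fitted(log_fit) > 0.5), levels = c(0, 1))
actual     <- factor(mtcars$am, levels = c(0, 1))

# Confusion matrix: rows are actual classes, columns are predicted classes
conf_matrix <- table(actual = actual, predicted = pred_class)
accuracy    <- sum(diag(conf_matrix)) / sum(conf_matrix)
print(conf_matrix)
print(accuracy)
```

`summary(lin_fit)` and `summary(log_fit)` report coefficient estimates and p-values, which is one way to examine the individual impact of each variable. The accuracy here is measured on the training data, so it is an optimistic estimate.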

Decision Trees & Model Selection


1. Decision Trees
a. Segmentation
b. Entropy
c. Building Decision Trees
d. Validation of Trees
e. Fine tuning and Prediction using Trees

2. Model Selection and Cross validation
a. How to validate a model?
b. What is the best model?
c. Types of data
d. Types of errors
e. The problem of overfitting
f. The problem of underfitting
g. Bias Variance Tradeoff
h. Cross validation
i. Bootstrapping
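
To make the cross validation topic concrete, here is a minimal k-fold sketch in base R. The linear model, the mtcars data set, the seed, and k = 5 are all assumptions chosen for illustration.

```r
set.seed(42)   # arbitrary seed so the fold assignment is reproducible
k <- 5

# Assign each row to one of k folds at random
fold_id <- sample(rep(1:k, length.out = nrow(mtcars)))

# For each fold: train on the other folds, measure error on the held-out fold
fold_rmse <- sapply(1:k, function(fold) {
  train <- mtcars[fold_id != fold, ]
  test  <- mtcars[fold_id == fold, ]
  fit   <- lm(mpg ~ wt + hp, data = train)
  sqrt(mean((test$mpg - predict(fit, newdata = test))^2))
})

cv_rmse <- mean(fold_rmse)   # average out-of-fold error
print(fold_rmse)
print(cv_rmse)
```

Comparing `cv_rmse` with the error on the training data gives a rough check for overfitting: a model whose training error is much lower than its cross-validated error is likely fitting noise.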

Module 5:-
Project-2 - Predictive Modeling Project
1. Objective
2. Model building-1
3. Model building-2
4. Model validation
5. Variable selection
6. Model calibration
7. Out of time validation

Module 6:-
Neural Network, SVM and Random Forest
A. Neural Networks
a. Neural network Intuition
b. Neural network and vocabulary
c. Neural network algorithm
d. Math behind neural network algorithm
e. Building the neural networks
f. Validating the neural network model
g. Neural network applications
h. Image recognition using neural networks

B. SVM
a. Introduction
b. The decision boundary with largest margin
c. SVM - The large margin classifier
d. SVM algorithm
e. The kernel trick
f. Building SVM model
g. Conclusion
