Data Science R SLB
Description:
Data science is a "concept to unify statistics, data analysis, machine learning and their
related methods" in order to "understand and analyze actual phenomena" with data. It employs
techniques and theories drawn from many fields within the context of mathematics, statistics,
computer science, and information science. Fueled by big data and AI, demand for data science
skills is growing exponentially, according to job sites, while the supply of skilled applicants is
growing at a slower pace. That makes it a great time to enter the job market as a data scientist:
"More employers than ever are looking to hire data scientists."
Course Outcome:
2. Explore Recommendation Systems using techniques such as Association Rule Mining, user-based
collaborative filtering, and item-based collaborative filtering, among others
5. Learn where to use algorithms such as Decision Trees, Logistic Regression, Support Vector
Machines, and Ensemble Techniques
Module 1:-
Statistics Theory:
1 Introduction to Data Science
2 Scope of Data Science
3 Application of Data Science
4 Introduction to Statistics
5 Graphical and Tabular Descriptive Statistics
6 Probability
7 Probability Distribution
8 Hypothesis Testing
9 Statistical Tests (Z-Test, Chi-Square, T-Tests, etc.)
Module 2:-
R Programming, Data Handling and Basic Statistics
1. Introduction to Analytics Tool (R)
   a. Introduction to Data Analysis
   b. Introduction to R programming
   c. R Environment and Basic Commands
2. Data Handling in R
   a. Importing data
   b. Sampling
   c. Data Exploration
   d. Creating calculated fields
   e. Sorting & removing duplicates
Module 3:-
Project-1 - Data Exploration, Validation and Cleaning Project
1. Project on Data handling
2. Data exploration
3. Data validation
4. Missing values identification
5. Outliers identification
6. Data Cleaning
7. Basic Descriptive statistics
Module 4:-
Regression Analysis & Logistic Regression Model Building
1. Regression Analysis
   a. Correlation
   b. Simple Regression models
   c. R-Square
   d. Multiple regression
   e. Multicollinearity
   f. Individual Variable Impact
2. Logistic Regression
   a. Need for logistic regression
   b. Logistic regression models
   c. Validation of logistic regression models
   d. Multicollinearity in logistic regression
   e. Individual Impact of variables
   f. Confusion Matrix
Module 5:-
Project-2 - Predictive Modeling Project
1. Objective
2. Model building-1
3. Model building-2
4. Model validation
5. Variable selection
6. Model calibration
7. Out of time validation
Module 6:-
Neural Network, SVM and Random Forest
A. Neural Networks
   a. Neural network Intuition
   b. Neural network and vocabulary
   c. Neural network algorithm
   d. Math behind neural network algorithm
   e. Building the neural networks
   f. Validating the neural network model
   g. Neural network applications
   h. Image recognition using neural networks
B. SVM
   a. Introduction
   b. The decision boundary with largest margin
   c. SVM: The large margin classifier
   d. SVM algorithm
   e. The kernel trick
   f. Building SVM model
   g. Conclusion