DSE - Course Outline

The document provides an overview of the curriculum for an Introduction to Python course held over 4 days. Day 1 covers basic Python commands, variables, data types, and pseudo code. Day 2 focuses on lists, tuples, dictionaries, conditional statements, and loops. Day 3 has students writing programs using loops and conditionals. Day 4 introduces functions, including built-in and user-defined functions. Students complete lab exercises each day to practice the concepts.

Uploaded by Sachin R

GREAT LEARNING COURSES CURRICULUM VITAE

WEEK – 1
ITP
DAY-1
• Intro to Python
• Basic Commands - Hello World
• Variables
• Basic arithmetic and logical operators (int, float)
• Data Types - int, float, strings
• Concatenation, subsetting, position, length, etc.
• Appreciation of programming using pseudocode (introduction) - if-else, loops (deck)

Lab Exercises (2 hrs)

DAY-2
• Lists, Tuples, Dictionaries
• Indexing
• Arithmetic operators
• Logical operators
• Comparison operators
• Intro to conditional statements (if-else, elif), nested conditionals
• Intro to basic for and while loops, break

Lab Exercises (2 hrs)

DAY-3
• Converting pseudocode into programs using loops and if-else
• List Comprehension
- Use cases
- List comprehension vs loops

Lab Exercises (2 hrs)
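
As a minimal sketch of the "list comprehension vs loops" topic above, the same result can be built both ways. The example data (squares of the even numbers below 10) is an assumption chosen for illustration, not part of the course material.

```python
# Squares of the even numbers from 0 to 9, written two ways.

# 1) Explicit for loop with an if condition
squares_loop = []
for n in range(10):
    if n % 2 == 0:
        squares_loop.append(n * n)

# 2) Equivalent list comprehension: expression, loop, then filter
squares_comp = [n * n for n in range(10) if n % 2 == 0]

print(squares_loop)
print(squares_comp)
```

The comprehension is usually preferred when the loop body is a single expression; a plain loop tends to read better once the body needs several statements.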

DAY-4
• Understanding the concept of functions
• Exploring commonly used built-in functions (min, max, sort, etc.)
• Programming user-defined functions
• Working with functions with and without arguments
• Functions with return values
• Understanding lambda functions
• Overview of the map, reduce and filter functions

Lab Exercises - Mini Case Study
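
The function topics listed for this day can be sketched in a few lines. The names `greet`, `shout` and the `names` list are hypothetical, introduced only for illustration.

```python
from functools import reduce

names = ["alice", "bob", "charlie"]

# Built-in functions
shortest = min(names, key=len)
longest = max(names, key=len)
by_length = sorted(names, key=len)

# User-defined function without arguments
def greet():
    return "Hello, World"

# User-defined function with an argument and a return value
def shout(name):
    return name.upper()

# map applies a function to every item
lengths = list(map(len, names))

# filter keeps the items for which the lambda returns True
long_names = list(filter(lambda n: len(n) > 3, names))

# reduce folds the list into a single value (total character count)
total_chars = reduce(lambda acc, n: acc + len(n), names, 0)

print(greet(), shout("bob"), lengths, long_names, total_chars)
```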


WEEK – 2
NumPy, Pandas, Visualization
DAY-1
• Indexing and slicing NumPy arrays
• Working with 2-dimensional arrays (slicing, indexing, comparison operations)
• Arithmetic operations on 2-dimensional arrays
• Iterating over 2-dimensional arrays using for loops
• Operations on 2-dimensional arrays (stacking and splitting: vstack, hstack, vsplit and hsplit)

Lab Exercises (2 hrs)
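
A short sketch of the 2-dimensional array operations named above, assuming NumPy is installed. The arrays `a` and `b` are made-up examples.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

print(a[0, 1])   # single element: row 0, column 1
print(a[:, 0])   # slice: the first column
print(a > 2)     # element-wise comparison gives a boolean array
print(a + b)     # element-wise arithmetic

# Iterating over a 2-D array yields one row at a time
for row in a:
    for value in row:
        print(value)

# Stacking: vstack adds rows, hstack adds columns
v = np.vstack((a, b))   # shape (4, 2)
h = np.hstack((a, b))   # shape (2, 4)

# Splitting: the inverse operations, here into 2 equal parts
top, bottom = np.vsplit(v, 2)
left, right = np.hsplit(h, 2)
```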

DAY - 2

• Data structures explained with examples
• Creating and manipulating DataFrames
• Reading data from various sources
• Indexing, sorting, ranking

Lab Exercises (2 hours)

DAY -3

• Merge, join, concatenate
• Reshaping, pivoting, duplicating, mapping, replacing
• Summary statistics (Mean, Median, Mode, Skewness, Kurtosis)

Lab Exercises (2 hours)
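
The merge and concatenate topics above might look like this in pandas. This is a sketch that assumes pandas is installed; the `customers` and `orders` tables are invented for illustration.

```python
import pandas as pd

customers = pd.DataFrame(
    {"customer_id": [1, 2, 3], "name": ["Asha", "Ben", "Chen"]}
)
orders = pd.DataFrame(
    {"customer_id": [1, 1, 3], "amount": [250, 100, 75]}
)

# Left merge keeps every customer; Ben has no orders, so his amount is NaN
merged = customers.merge(orders, on="customer_id", how="left")

# Concatenate stacks frames row-wise; drop_duplicates removes repeated rows
stacked = pd.concat([orders, orders], ignore_index=True)
deduplicated = stacked.drop_duplicates()

# Summary statistics skip missing values by default
print(merged["amount"].mean(), merged["amount"].median())
```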

DAY-4

• Visualization libraries
• Overview of the seaborn and matplotlib packages
• Types of plots available in these visualization libraries
• Distribution plots – Histogram, frequency polygon
• Representing data using charts – bar chart, pie chart
• Checking for data anomalies and outliers – box plots
• Association between variables – correlation heatmap, scatter plots, pair plots

Lab Exercises (2 hours)


EXPLORATORY DATA ANALYSIS
DAY - 1
• What is EDA?
• EDA plots
• Getting data from multiple sources (CSV, text, URL, etc.)
• Types of variables – Categorical and Continuous
• Measures of central tendency - Mean, median, mode
• Measures of dispersion – Quartiles, percentiles, standard deviation, variance, coefficient of variation
• Coefficient of correlation, skewness and kurtosis
• Understanding different types of data and measurements, and dtype conversions

Lab Exercises (2 hours)
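
The measures of central tendency and dispersion listed above can be computed with Python's standard `statistics` module. The sketch below uses a small made-up data set and treats it as a full population (hence `pvariance` and `pstdev`); that choice is an assumption for illustration.

```python
import statistics as st

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = st.mean(data)           # 5
median = st.median(data)       # 4.5
mode = st.mode(data)           # 4

variance = st.pvariance(data)  # 4 (population variance)
std_dev = st.pstdev(data)      # 2.0 (population standard deviation)

# Coefficient of variation: spread relative to the mean
cv = std_dev / mean

# Quartiles: the three cut points that divide the data into four parts
q1, q2, q3 = st.quantiles(data, n=4)

print(mean, median, mode, variance, std_dev, cv, (q1, q2, q3))
```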

DAY - 2
Univariate
• Visualising and understanding the data at hand: summary statistics, moments, distributions
• Data transformation – z-score, normalisation
• Label encoding, one-hot encoding, replacing data
• Coefficient of correlation, skewness and kurtosis
• Scaling vs normalization

Lab Exercises (2 hours)
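
A plain-Python sketch of the transformations and encodings named above. The `values` and `colours` lists are made-up, and min-max scaling is used here as one common meaning of "normalisation"; the course may define it differently.

```python
import statistics as st

values = [2, 4, 4, 4, 5, 5, 7, 9]

# z-score: subtract the mean, divide by the standard deviation
mu = st.mean(values)
sigma = st.pstdev(values)
z_scores = [(x - mu) / sigma for x in values]

# Min-max normalisation: rescale to the range [0, 1]
lo, hi = min(values), max(values)
min_max = [(x - lo) / (hi - lo) for x in values]

colours = ["red", "green", "red", "blue"]
categories = sorted(set(colours))  # ['blue', 'green', 'red']

# Label encoding: one integer per category
label_encoded = [categories.index(c) for c in colours]

# One-hot encoding: one 0/1 column per category
one_hot = [[int(c == cat) for cat in categories] for c in colours]

print(z_scores)
print(min_max)
print(label_encoded)
print(one_hot)
```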

DAY - 3
Bivariate
• Feature-to-feature relationships
• Correlation and Frequency tables
• Seasonality and looking at trended data
• Multivariate analysis

Lab Exercises (2 hours)

DAY - 4
Wrangling
• Missing value treatment: various approaches
• Outlier treatment: various approaches
• Data Imbalance treatment
• Feature engineering, Introduction to Test and Train

Lab Exercises (2 hours)


SQL
DAY-1
• Introduction to Database
• Types of Data models
• Database Operations
• Advantages of DBMS
• Introduction to Normalization
• Need for Normalization/Denormalization
• Normalization:
- 1st Normal Form
- 2nd Normal Form
- 3rd Normal Form
• DDL - Understand how to create different database objects
• DDL to Create and Manage Tables
• DDL statements - CREATE, ALTER, DROP, TRUNCATE, COMMENT, RENAME
• Types of Constraints
• DML - Describe how to manipulate data
• Explain how to retrieve data from tables using filters
• In-class Lab Exercise: SQL queries using SELECT, WHERE and ORDER BY clauses
• Take-home Lab Exercise: SQL queries using SELECT, WHERE and ORDER BY clauses

DAY -2

• Functions in SQL
• Types of single-row functions
• Explain how to use single-row functions in SQL
• Explain how to use group functions in SQL
• In-class Lab Exercise: SQL single-row functions and group functions
• Take-home Lab Exercise: SQL single-row functions and group functions

DAY-3

• Explain how to retrieve data from multiple tables using joins
• Apply set operations on tables
• Views, advantages of views, creating a view, creating a complex view, rules for performing DML operations on a view
• Using the CHECK OPTION clause, indexes, creating an index
• In-class Lab Exercises: SQL JOINs
• Take-home Lab Exercises: SQL JOINs
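
To try the join topics without installing a database server, one option is Python's built-in `sqlite3` module. The sketch below is an assumption about tooling (the course may use a different DBMS), and the `departments` and `employees` tables are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE departments (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL
);
CREATE TABLE employees (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    dept_id  INTEGER REFERENCES departments(dept_id)
);
INSERT INTO departments VALUES (1, 'Sales'), (2, 'Finance');
INSERT INTO employees VALUES (10, 'Asha', 1), (11, 'Ben', 2), (12, 'Chen', NULL);
""")

# INNER JOIN returns only employees that have a matching department
inner_rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employees e
    INNER JOIN departments d ON e.dept_id = d.dept_id
    ORDER BY e.emp_name
""").fetchall()

# LEFT JOIN keeps every employee; unmatched departments come back as NULL
left_rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employees e
    LEFT JOIN departments d ON e.dept_id = d.dept_id
    ORDER BY e.emp_name
""").fetchall()

conn.close()
print(inner_rows)
print(left_rows)
```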

DAY-4

• Show how to use subqueries
• Group functions in subqueries
• Subqueries within subqueries; using the ANY, ALL, IN and NOT IN operators; multiple-column subqueries; NULL values in subqueries; correlated subqueries; the EXISTS operator
• Subqueries with aggregate functions
• Subqueries with conditional logic
• Joining subqueries
• SQL window functions
• SQL string functions to clean data
• In-class Lab Exercises: Subqueries
• Take-home Lab Exercises: Subqueries


Stats for ML + Linear Regression
DAY - 1:
• Data and measurement
• Types of data measurement scales – Discrete, Continuous and categorical data
• Sample vs. Population
• Description of Data
• Measures of central tendency – mean, median and mode
• Measures of dispersion – quartiles, standard deviation and variance
• Graphically examining dispersion – Histogram and frequency polygon
• Measuring association between variables – Covariance and coefficient of correlation
• Summarizing data – The 5-point summary
• Intro to basics of probability
• Definition of probability - Mathematical vs statistical interpretation
• Multiple events - Addition and multiplication in probability
• Probability distribution - Enumeration of all possible outcomes
• Introduction to the Normal distribution
• Sampling distribution and population parameters
• Comparing sample and population - The concept of testing hypotheses
• Sampling error
• Defining a confidence interval
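
As a rough sketch of the last item, a 95% confidence interval for a mean can be computed from a sample using the normal approximation. The `sample` values are made up, and the normal critical value is an assumption made for simplicity; with a sample this small, a t-based interval would usually be more appropriate.

```python
import math
import statistics as st

sample = [52, 48, 50, 51, 49, 53, 47, 50, 52, 48]

n = len(sample)
mean = st.mean(sample)
s = st.stdev(sample)  # sample standard deviation (n - 1 denominator)

# Critical value for a two-sided 95% interval, roughly 1.96
z = st.NormalDist().inv_cdf(0.975)

# Margin of error = critical value * standard error of the mean
margin = z * s / math.sqrt(n)
ci = (mean - margin, mean + margin)

print(mean, ci)
```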

DAY - 2:
• Introduction to hypothesis testing
• Defining a null and alternative hypothesis
• Types of alternative hypothesis - One-tail vs two-tail test
• Type 1 and type 2 error
• Hypothesis testing applications using the z test
• Interpreting test results
• P-value vs confidence interval approach
• Testing hypotheses for a sample - the t distribution
• Testing joint hypotheses - One-way ANOVA and F-test
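
A one-sample, two-tailed z test can be sketched with the standard library alone. The function name `one_sample_z_test` and the numbers passed to it are hypothetical, and the test assumes the population standard deviation is known.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, pop_mean, pop_std, n):
    """Return the z statistic and two-tailed p-value."""
    standard_error = pop_std / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Two-tailed: probability of a value at least this extreme on either side
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# H0: the population mean is 50; H1: it is not 50
z, p = one_sample_z_test(sample_mean=52, pop_mean=50, pop_std=5, n=25)

alpha = 0.05
reject_null = p < alpha
print(z, p, reject_null)
```

Here the z statistic is 2.0 and the p-value is just under 0.05, so the null hypothesis is rejected at the 5% level but would not be at the 1% level.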
DAY - 3:
• Examining causal relationships between variables
• Introduction to the concept of regression
• Review - Equation of a straight line
• Visualizing regression line as an average line
• Error variance and minimization
• The methodology behind OLS estimation
• Linear regression with a single independent variable
• Fitting a regression line to a dataset
• Sample vs Population regression function
• Hypothesis testing in the context of regression
• Interpreting ANOVA for a regression model
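
For a single independent variable, the OLS estimates have a closed form: the slope is the sum of cross-deviations divided by the sum of squared deviations of x, and the intercept makes the line pass through the point of means. The sketch below assumes a tiny made-up data set; `fit_simple_ols` is a hypothetical helper name.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) minimising the sum of squared errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

intercept, slope = fit_simple_ols(x, y)
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

print(intercept, slope)
print(sum(residuals))  # close to zero for an OLS fit with an intercept
```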
DAY - 4:
• Extending linear regression to more than one independent variable
• Assumptions in linear regression –
- Multicollinearity - effects and detection, VIF
- Autocorrelation - Why is it a problem, Durbin-Watson test
- Heteroscedasticity - What to do when error terms have unequal variance
• Hypothesis testing in Multiple regression
• Testing individual coefficients vs joint hypothesis
• ANOVA for multiple regression - F test
• The coefficient of determination – R^2
• The adjusted R^2
• Interpreting summary results after fitting a regression model
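
The two goodness-of-fit measures above can be written out directly from their definitions. In this sketch `y_hat` holds fitted values from an assumed one-variable model on made-up data; `k` is the number of independent variables.

```python
def r_squared(y, y_hat):
    """1 - (residual sum of squares / total sum of squares)."""
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k):
    """Penalise R^2 for the number of independent variables k."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]  # assumed fitted values

r2 = r_squared(y, y_hat)
adj_r2 = adjusted_r_squared(r2, n=len(y), k=1)

print(r2, adj_r2)
```

Adding a variable never lowers R^2, but it lowers the adjusted R^2 unless the new variable improves the fit enough to offset the penalty.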
Regression + Feature Engineering
Day 1: -
• Introduction to machine learning
• Supervised vs unsupervised learning
• Looking at regression through the perspective of machine learning
• Accuracy scores as a metric of model performance
• Measuring the importance of individual variables in a regression model
• Review - testing for individual significance vs joint significance
• Using the adjusted R^2 to compare models with different numbers of independent variables
• Approaches to feature selection
• Forward and backward selection
• Parameter tuning and Model evaluation

Day 2: -
• Extending linear regression
• Data transformations and normalization
• Log transformation of dependent and independent variables
• Case study:
• Dealing with categorical independent variables
• One-hot encoding vs dummy variable regression
• Case study on linear regression

Day 3: -
• Modelling probabilistic dependent variables
• The sigmoid function and odds ratio
• The concept of logit
• The failure of OLS in estimating parameters for a logistic regression
• Introduction to the concept of Maximum likelihood estimation
• Advantages of the maximum likelihood approach
• Modelling a logistic regression problem with a case study
• Making predictions and evaluating parameters
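
The sigmoid, odds and logit are closely related: the logit is the log of the odds, and the sigmoid is its inverse. A minimal sketch, with hypothetical coefficients `b0` and `b1` standing in for values a fitted model would supply:

```python
import math

def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))

def odds(p):
    """Odds in favour of an event with probability p."""
    return p / (1 - p)

def logit(p):
    """Log-odds; the inverse of the sigmoid."""
    return math.log(odds(p))

# Hypothetical coefficients for a one-variable logistic model
b0, b1 = -3.0, 0.5
x = 8
probability = sigmoid(b0 + b1 * x)
predicted_class = 1 if probability >= 0.5 else 0

print(probability, odds(probability), logit(probability), predicted_class)
```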

Day 4: -
• Extending the logistic model to multi-class predictions
• The one-vs-all approach
• Multiclass classification
• Case study on Logistic regression – binary classification and multiclass one-vs-all
SUPERVISED LEARNING CLASSIFICATION
DAY - 1:

• Classification Problems – Examples
• Binary classification vs Multi-class classification
• Decision trees – Simple decision trees. Visualizing decision trees, nodes and splits
• Working of the Decision tree algorithm
• Importance and usage of Entropy and Gini index
• Manually calculating entropy and the Gini index, and working out how to split decision nodes
• Evaluating decision tree models
• Accuracy metrics – precision, recall and confusion matrix
• Interpretation of accuracy metrics
• Building a robust decision tree model. k-fold cross validation - Advantages over a simple train-test split
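
The manual entropy and Gini calculations mentioned above can be checked with a few lines of Python. The label counts below (9 "yes", 5 "no", and one candidate split) are made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

parent = ["yes"] * 9 + ["no"] * 5
left = ["yes"] * 6 + ["no"] * 1
right = ["yes"] * 3 + ["no"] * 4

print(entropy(parent), gini(parent))
print(information_gain(parent, [left, right]))
```

A decision tree compares candidate splits this way and picks the one with the highest gain (or the lowest weighted impurity).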

DAY - 2:
• CART - Extending decision trees to regression problems
• Advantages of using CART
• Multiple decision trees - Introduction to the Random forests algorithm
• The KNN Algorithm and its workings
• Choosing the optimum number of neighbours. Iterating over various values for K
• Measures of distance used in KNN
• Evaluating a KNN Model - Accuracy metrics and k-fold cross validation
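
The workings of KNN fit in a short function: measure the distance from the query to every training point, take the k closest, and return the majority label. This is a bare-bones sketch on made-up 2-D points using Euclidean distance, not an optimised implementation.

```python
import math
from collections import Counter

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def knn_predict(train_points, train_labels, query, k):
    """Return the majority label among the k nearest training points."""
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )
    nearest_labels = [label for _, label in neighbours[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["A", "A", "A", "B", "B", "B"]

# Iterating over several values of K for the same query point
for k in (1, 3, 5):
    print(k, knn_predict(points, labels, (2, 2), k))
```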

DAY - 3:

• Bayes' theorem. Prior probability
• The Gaussian Naive Bayes classifier
• Assumptions of the Naive Bayes classifier
• Functioning of the Naive Bayes algorithm
• Evaluating the model - Precision, Recall, Accuracy metrics and k-fold cross validation
• ROC Curve and AUC for binary classification for Naive Bayes
• Extending Bayesian Classification for multiclass classification
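
Bayes' theorem updates a prior probability with evidence. The sketch below works through a two-class case with made-up spam-filter numbers; the function name `posterior` and its parameters are hypothetical.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """P(class | evidence) for a two-class problem via Bayes' theorem."""
    # Total probability of the evidence across both classes
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence

# Assumed numbers: 20% of mail is spam; a given word appears in
# 60% of spam messages and 5% of non-spam messages.
p_spam_given_word = posterior(
    prior=0.2, likelihood=0.6, likelihood_given_not=0.05
)

print(p_spam_given_word)  # about 0.75
```

Naive Bayes applies the same update with several features at once, multiplying their likelihoods under the assumption that features are independent given the class.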

DAY - 4:

• Using the 2 classification case studies and working through all the classification algorithms, i.e., CART, KNN and Naive Bayes, sequentially
• Comparing and evaluating various classification algorithms
• Feature selection for classification algorithms
• Advantages and pitfalls of each algorithm. When to use which type of model?
UNSUPERVISED LEARNING
DAY - 1:

• What is Unsupervised learning?
• The two major Unsupervised Learning problems - Dimensionality reduction and clustering
• Clustering algorithms
• The different approaches to clustering – Hierarchical and K-means clustering
• Hierarchical clustering - The concept of agglomerative and divisive clustering
• Agglomerative Clustering – Working of the basic algorithms
• Distance matrix - Interpreting dendrograms
• Choosing the threshold to determine the optimum number of clusters
• Case Study on Agglomerative clustering

DAY - 2:

• The K-means algorithm
• Measures of distance – Euclidean, Manhattan and Minkowski distance
• The concept of within-cluster sums of squares
• Using the elbow plot to select the optimum number of clusters
• Case study on k-means clustering
• Comparison of k-means and agglomerative approaches to clustering
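
The three distance measures are one family: Minkowski distance with exponent 1 is Manhattan distance, and with exponent 2 it is Euclidean distance. The sketch below also computes the within-cluster sum of squares (WCSS) that the elbow plot tracks; the clusters are made-up and assumed to be already assigned.

```python
def minkowski(p, q, r):
    """Minkowski distance of order r between points p and q."""
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1 / r)

def manhattan(p, q):
    return minkowski(p, q, 1)

def euclidean(p, q):
    return minkowski(p, q, 2)

def wcss(clusters):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    total = 0.0
    for points in clusters:
        centroid = [sum(coord) / len(points) for coord in zip(*points)]
        total += sum(
            sum((a - c) ** 2 for a, c in zip(point, centroid))
            for point in points
        )
    return total

print(manhattan((0, 0), (3, 4)))
print(euclidean((0, 0), (3, 4)))

clusters = [[(1, 1), (1, 3)], [(5, 5), (7, 5)]]
print(wcss(clusters))
```

An elbow plot repeats the WCSS calculation for k = 1, 2, 3, ... clusters and looks for the point where adding another cluster stops reducing it much.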

DAY - 3:

• Noise in the data and dimensionality reduction
• Capturing Variance - The concept of principal components
• Assumptions in using PCA
• The working of the PCA algorithm
• Eigenvectors and orthogonality of principal components
• What is a complexity curve?
• Advantages of using PCA
• Build a model using principal components and compare it with a normal model. What is the difference?
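
One common way to carry out PCA is an eigen-decomposition of the covariance matrix. The sketch below assumes NumPy is installed and uses a small made-up 2-D data set; library implementations typically use an SVD instead, so treat this as an illustration of the idea rather than how any particular package works.

```python
import numpy as np

X = np.array([
    [2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
    [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9],
])

# 1) Centre each feature on zero
X_centered = X - X.mean(axis=0)

# 2) Covariance matrix of the features (columns)
cov = np.cov(X_centered, rowvar=False)

# 3) Eigen-decomposition; eigh is meant for symmetric matrices
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 4) Sort components by variance captured, largest first
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Share of total variance captured by each principal component
explained_ratio = eigenvalues / eigenvalues.sum()

# 5) Project onto the first principal component (2-D -> 1-D)
X_projected = X_centered @ eigenvectors[:, :1]

print(explained_ratio)
print(eigenvectors.T @ eigenvectors)  # approximately the identity matrix
```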

DAY - 4:

• Putting it all together
• The relationship between unsupervised and supervised learning
• Case study on Dimensionality reduction followed by a supervised learning model
• Case study on Clustering followed by a classification model
TABLEAU
The topics below are to be covered, broken up across 4 days.
• Introduction to Visualization, Rules of Visualization
• Data Types, Sources, Connections, Loading, Reshaping
• Data Aggregation
• Working with Continuous and Discrete Data
• Using Filters
• Using Calculated Fields and parameters
• Creating Tables and Charts
• Building dashboards and storyboards
• Sharing your work and publishing for a wider audience

• Importing data from different sources
• Examining imported data
• Linking multiple data tabs/tables
• Understanding ‘Show Me’ features
• Understanding how data is aggregated by Tableau by default – how and when to change it
• Examine a problem statement
• Design a dashboard with appropriate charts
• Build different charts
• Build a storyboard
• Publish
