ZG536 L1 Introduction 140124

This document introduces a Foundations of Data Science course. It outlines the course objectives, evaluation criteria and instructor details, and gives an overview of key data science concepts, including definitions, applications, roles, skills, challenges and differences compared with other domains. Statistics, Python programming and Excel skills are listed as prerequisites.

Uploaded by

Vaishnavi Appaya

ZG 536

Foundations of Data Science


Pravin Mhaske
BITS Pilani, Pilani Campus

M1 Data Science Foundations


Lecture 1 - Introduction
Course Objectives

No    Objective
CO1   Get introduced to the field of Data Science and the roles, process and challenges involved therein
CO2   Explore and experience the steps involved in data preparation and exploratory data analysis
CO3   Learn to select and apply the proper analytics technique for various scenarios, assess the model’s performance and interpret the results of the predictive model
CO4   Gain familiarity with the general deployment considerations of predictive models
CO5   Appreciate the importance of techniques like data visualization and storytelling with data for effective presentation of the outcomes to stakeholders



Evaluation

No    Name                                 Type                  Duration   Weight   Day, Date, Session, Time
EC1   Experiential Learning Assignment 1   Take Home-Online                 25%      To be announced
      Experiential Learning Assignment 2
EC2   Mid-Semester Exam                    Open or Closed Book   2 hours    30%
EC3   Comprehensive Exam                   Open Book             2 hours    45%



The instructor

Pravin Mhaske

Qualification: Bachelor of Engineering (Mechanical), Master of Science (Business Analytics)
Experience: 21 years (Industry), 6 years (Teaching)
Teaching interests: Statistics, Data Science, Machine Learning, Business Analytics
Pedagogy: Concepts, foundation, intuition, hands-on practice, experiential learning



What exactly is Data Science?

• An interdisciplinary field that uses algorithms, procedures, and processes to examine large amounts of data
• Study of data to extract meaningful insights for business
• Using data to solve problems and make decisions!
• Applied Statistics!

Breaking it down:
• Data: Everything is data. Structured, unstructured.
• Scientific methods: Scientific approach, questions, data collection, analysis, interpretation, conclusion
• Statistics: Patterns, trends, insights
• Domain expertise: SME, actionable and relevant insights
• Programming: Process and manipulate data



Applications
• Every domain!
• Healthcare: Better operations, early detection, prevention
• Retail: Customer behavior, STP, customer experience
• Banking and Finance: Financial advice and planning, predictions, fraud detection
• Transportation: Optimization, better planning
• Manufacturing: Fault detection, IoT, operations and process improvement
• Meteorology: Weather, seismic, geospatial data
• Social media/TC: Sentiment analysis, demand
• Energy and utility: Consumption, control
• Public services: Planning, development
• Sports, Entertainment: Strategy, content creation, demand analysis
• Politics?



Some Examples
• Recommender systems: Amazon, Netflix, YouTube
• Personalization: Learning, ads, promotions and discounts
• Decision making: Google Maps
• Fraud detection: transactions
• Dynamic pricing: Surge pricing
• Smart homes, voice assistants
• Social media trends
• Spam mail filters
• Traffic lights
• Online dating



Why learn Data Science?

• Career opportunities
• Rapid digital evolution
• Data is growing
• Flexibility – all industries, freelancing
• Demand-Supply gap
• Analytical, scientific approach
• Being logical and sensible
• Life skill - Solving real life problems



DS, AI, ML, DL, Analytics?
• Data Science: Processing, analyzing, insights
• Business Analytics: Solving problems, making decisions
• Artificial Intelligence: Machines simulate human behavior
• Machine Learning: Computers learn from data without being explicitly programmed
• Deep Learning: Artificial neural networks
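
The Machine Learning entry above can be made concrete with a minimal sketch: rather than hard-coding a rule, the program estimates it from examples. The data (hours studied vs. marks scored) and the function name fit_line are made up for illustration, and ordinary least squares is used only as one simple instance of learning from data, not as a method prescribed by the course.

```python
from statistics import mean

def fit_line(xs, ys):
    """Estimate slope and intercept of y = slope*x + intercept by least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical training examples: hours studied vs. marks scored
hours = [1, 2, 3, 4, 5]
marks = [52, 54, 56, 58, 60]

# The relationship is learned from the examples, not written by hand
slope, intercept = fit_line(hours, marks)

# Use the learned relationship on an input the program has not seen
predicted = slope * 6 + intercept
print(f"Predicted marks for 6 hours: {predicted:.1f}")
```

Changing the training examples changes the learned slope and intercept without touching the code, which is the essential difference from a hand-written rule.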



DS/ML project flow



Popular Roles and Skills

Data Engineer
Tools: SQL, Python, Hive, Pig, Java, Hadoop, Spark, Kafka, Azkaban, Airflow, AWS, GCP, Azure
Skills: Data Warehousing, Ability to write, analyze, and debug SQL queries, Big Data platforms like Hadoop, Spark, Kafka, Flume, Pig, Hive, etc., Experience in handling data pipeline and workflow management tools like Azkaban, Luigi, Airflow, etc., Strong Communication Skills

Data/BI Analyst
Tools: SQL, Excel, Python/R, Tableau/PowerBI/QlikView, Basics of Big Data, Basics of Cloud
Skills: Programming skills in Python/R, Solid understanding of database management systems, Proficient SQL/HQL skills, Good data visualization skills and proficient with Tableau/PowerBI/QlikView, etc., Basic understanding of predictive modelling

ML Engineer
Tools: Python, Machine Learning Algorithms, DL/NLP, Java, DBMS, Cloud Architecture, Big Data Architectures, AWS/GCP/Azure
Skills: Understanding of data structures, data modeling and software architecture. Deep knowledge of math, probability, statistics and algorithms. Ability to write robust code in Python, Java and R. Familiarity with machine learning frameworks (like Keras or PyTorch) and libraries (like scikit-learn)

Business Analyst
Tools: Excel, Visio, SQL, Tableau
Skills: Domain understanding, Requirement Gathering, Requirement Elicitation, Process Excellence, User Acceptance Testing, Documentation Prowess, Basic Data Analysis Skills



Data Scientist

Wears many hats!


1. Data Acquisition and Preparation: Data Sources, Cleaning,
Preprocessing, Integration, Wrangling
2. Data Analysis: EDA, insights, patterns
3. Modeling: Statistical/hypothesis testing, ML models - Building,
testing, tuning, deploying
4. Communication: Story-telling, visualization, audience
5. Collaboration: Stakeholders
6. Solutions: Practical, relevant
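
Steps 1 to 4 above can be sketched end to end on a toy dataset. Everything here is an illustrative assumption: the records (ad spend vs. sales), the choice to drop incomplete rows, and the straight-line model. Only the Python standard library is used, so the sketch runs without any extra installation.

```python
from statistics import mean, median

# Hypothetical raw records; None marks a missing value
raw = [
    {"spend": 10, "sales": 25},
    {"spend": 20, "sales": 45},
    {"spend": None, "sales": 50},  # incomplete record
    {"spend": 30, "sales": 65},
    {"spend": 40, "sales": 85},
]

# 1. Data preparation: drop records with missing fields
clean = [r for r in raw if r["spend"] is not None and r["sales"] is not None]
spend = [r["spend"] for r in clean]
sales = [r["sales"] for r in clean]

# 2. Data analysis (EDA): a few summary statistics
summary = {
    "n": len(clean),
    "mean_sales": mean(sales),
    "median_spend": median(spend),
}

# 3. Modeling: fit a least-squares line on the first three records,
#    then test it on the held-out last record
train_x, train_y = spend[:3], sales[:3]
x_bar, y_bar = mean(train_x), mean(train_y)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(train_x, train_y)) / sum(
    (x - x_bar) ** 2 for x in train_x
)
intercept = y_bar - slope * x_bar
test_error = abs((slope * spend[3] + intercept) - sales[3])

# 4. Communication: state the finding in plain language
print(f"Kept {summary['n']} of {len(raw)} records.")
print(f"Each extra unit of spend is associated with about {slope:.1f} units of sales.")
print(f"Absolute error on the held-out record: {test_error:.1f}")
```

Real projects would typically use libraries such as pandas and scikit-learn for these steps; the plain-Python version only shows how the stages connect.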



Data Scientist Skills



Data Science Vs other domains

1. Interdisciplinary: Statistics, Mathematics, Computer Science, Programming, and domain-specific expertise
2. Focus: Data
3. Problem solving: Real world challenges. Always new.
4. Evolution: Tools, techniques, algorithms
5. Lifelong learning: No crash course!
6. All industries
7. No defined scope
8. No single correct solution
9. Answer to many questions is ‘depends!’



Challenges

1. Data: Acquisition, access, quality, volume
2. Technical: Tools, algorithms
3. Explainable AI: Interpretability and explainability
4. Communication: Stakeholders
5. Privacy and Security
6. Continuous learning



Prerequisites for the course

• Statistics
• Python programming
• Excel
• Anaconda installation
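
One way to confirm that the Python and Anaconda prerequisites are in place is a short check script. This is a sketch, not part of the course material: the package list (numpy, pandas, matplotlib) is an assumption based on what the Anaconda distribution bundles, and check_environment is a hypothetical helper name.

```python
import sys
from importlib.util import find_spec

# Assumed package list: bundled with Anaconda, not taken from the slides
PACKAGES = ["numpy", "pandas", "matplotlib"]

def check_environment(packages):
    """Return a dict mapping each package name to True if it can be located."""
    return {name: find_spec(name) is not None for name in packages}

print("Python version:", sys.version.split()[0])
for name, found in check_environment(PACKAGES).items():
    print(f"{name}: {'found' if found else 'missing'}")
```

Run it from the Anaconda Prompt or a Jupyter notebook; any package reported as missing can usually be added with conda install.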



Exercise
