CSL-410-L02


Program: B.Tech (CSE), IV Semester, II Year

CSL-410: Data Science using Python


Unit No. 1
Introduction of Data Science

Lecture No. 02

Dr. Sanjay Jain


Associate Professor, CSA/SOET
Outline
• Data Science Components

• Roles of Data Science Experts

• Data Scientist

• References
Course Outcome
• CO.1: Understanding: Understand the basic concepts of Data Science, its application
areas, and the tools used for Data Science.
• CO.2: Applying: Use NumPy for handling numerical data and pandas for handling
data from data files.
• CO.3: Analyzing: Analyze the domain of the data, and clean and prepare the
data for Data Science.
• CO.4: Evaluating: Evaluate and summarize the data using statistical and
visualization tools.
• CO.5: Creating: Create datasets for machine learning models.
Data Science Components

<CO: 1> <Reference No.: R1,R3,R4>


• Data (and Its Various Types)
The raw dataset is the foundation of Data Science. It can be of various types, such as
structured data (mostly in tabular form) and unstructured data (images, videos, emails,
PDF files, etc.).
• Programming (Python and R)
Data management and analysis are done through computer programming. In Data Science, the
two most popular programming languages are Python and R.
• Statistics and Probability
Data is manipulated to extract information from it. The mathematical foundation of
Data Science is statistics and probability. Without a clear knowledge of statistics
and probability, there is a high chance of misinterpreting data and reaching
incorrect conclusions. That is why statistics and probability play a crucial role
in Data Science.
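
The point about misinterpreting data can be illustrated with a minimal sketch. It compares the mean and the median of a small salary sample that contains one outlier. The salary figures are hypothetical and chosen only for illustration; the functions come from Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical monthly salaries (in thousands); the last value is an outlier.
salaries = [30, 32, 35, 31, 34, 33, 300]

avg = mean(salaries)    # pulled upwards by the single outlier
mid = median(salaries)  # middle value of the sorted sample

print(f"mean   = {avg:.1f}")
print(f"median = {mid}")
```

Reporting only the mean here would suggest a "typical" salary more than twice as large as what most people in the sample earn, which is the kind of wrong conclusion the slide warns about.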

Data Science Components
• Machine Learning
As a Data Scientist, you will regularly use Machine Learning algorithms such as
regression and classification methods. It is very important for a Data Scientist to
know Machine Learning as part of the job so that they can derive valuable insights
and predictions from the available data.
• Big Data
Raw data is often compared with crude oil: just as refined oil is extracted from crude
oil, different kinds of information can be extracted from raw data by applying Data
Science. Tools used by Data Scientists to process big data include Java, Hadoop, R,
Pig, Apache Spark, etc.
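
As a rough illustration of the regression methods mentioned under Machine Learning, the sketch below fits a straight line to a few points by ordinary least squares using only plain Python. The helper `fit_line` and the hours/scores data are hypothetical, introduced here only for illustration; in practice a library such as scikit-learn would normally be used.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied vs. exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)
predicted = slope * 6 + intercept  # predicted score for 6 hours of study

print(f"score = {slope:.2f} * hours + {intercept:.2f}")
print(f"predicted score for 6 hours: {predicted:.1f}")
```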

Role of Data Science Experts

• Data Analyst
A Data Analyst is responsible for mining huge amounts of data, looking for patterns,
relationships, trends, and so on, and producing compelling visualizations and reports
that support business decisions.
• Data Engineer
A Data Engineer is responsible for working with large amounts of data. He/she should be
able to carry out data cleansing, data extraction, and data preparation so that
businesses can work with large amounts of data.
• Machine Learning Expert
A Machine Learning expert works with various Machine Learning algorithms such as
regression, clustering, classification, decision trees, random forests, and so on.
• Data Scientist
A Data Scientist works with huge amounts of data to come up with compelling business
insights by applying various tools, techniques, methodologies, algorithms, and so on.

Comparison of Data Science with Data Analytics

Criteria              Data Science                                      Data Analytics

Skills Needed         Data capturing, statistics, and problem-solving   Analytical, mathematical, and statistical skills

Type of Data Used     All types of data                                 Mostly structured and numeric data

Standard Life Cycle   Explore, discover, investigate, and visualize     Report, predict, prescribe, and optimize

Who is a Data Scientist?
• Data Scientists are IT professionals whose main role in an organization is to gather,
wrangle, and analyze large volumes of structured and unstructured data. They need this
data for several purposes, including building hypotheses, analyzing market and customer
patterns, and making inferences.
• Their role requires a combination of mathematical, statistical, and computer science
knowledge for analyzing, processing, and modeling data. The processed data is then
used to make predictions that help organizations come up with efficient plans
for the company’s welfare.
• These experts use their skills and techniques to extract and manage data for boosting
business efficiency. They draw on their experience, contextual knowledge, current
market trends, and assumptions about the existing data to find solutions to the
challenges the organization is facing. To do so, they use predictive
analysis, Machine Learning (ML) algorithms, and other advanced analytical
technologies.

What does a Data Scientist do?
• A Data Scientist must assume many roles while working in an organization, including
that of an analyst, mathematician, computer scientist, and trendspotter. To fulfill these
many roles on a daily basis, they have several responsibilities in the organization. Let’s
take a look at some of the most common and significant ones:
– Collect large volumes of quantitative and qualitative data and transform it into a
readable and usable format
– Use data-driven methods to resolve business issues
– Work with Python, SAS, R, and other programming languages
– Apply probability distributions and statistical tests
– Make use of Deep Learning, ML, and analytical techniques
– Analyze patterns and trends in data to help in building business efficiency

What does a Data Scientist do?
• The typical workflow of a Data Scientist is outlined below:
– Step 1: Discover data
– Step 2: Perform ETL (extract, transform, and load) for data preparation
– Step 3: Use visualization tools to apply Exploratory Data Analysis (EDA) for
planning the model
– Step 4: Use the necessary tools to build the model
– Step 5: Deliver the results using data visualization tools
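
The ETL and exploration steps above can be sketched in a few lines of standard-library Python. This is only a toy illustration under stated assumptions: the CSV text, the column names, and the `extract`, `transform`, and `load` helpers are all hypothetical, and an in-memory list stands in for a real data store.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data; in practice this would come from a file or a database.
RAW_CSV = """city,temperature_c
Delhi,31.5
Mumbai,
Bhopal,29.0
Chennai,33.5
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing temperature and convert the rest to float."""
    cleaned = []
    for row in rows:
        value = row["temperature_c"].strip()
        if value:
            cleaned.append({"city": row["city"], "temperature_c": float(value)})
    return cleaned


def load(rows, store):
    """Load: append cleaned rows to an in-memory store (stand-in for a database)."""
    store.extend(rows)
    return store


store = load(transform(extract(RAW_CSV)), [])

# A very small exploratory summary of the loaded data.
temps = [row["temperature_c"] for row in store]
summary = {"count": len(temps), "min": min(temps), "max": max(temps), "mean": mean(temps)}
print(summary)
```

Keeping the three stages as separate functions makes it easy to swap any one of them, for example reading from a real file in `extract`, without touching the others.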

How to become a Data Scientist?
• You have read about Data Scientists in terms of who they are and what they do. Now,
you may wonder what steps need to be followed to become one. What are the skills and
qualifications required? And so on. Following are the steps that will lead you to become
a Data Scientist:
– Get a bachelor’s degree in statistics, mathematics, or computer science
– Acquire the necessary skills expected from a Data Scientist
– Gain practical experience as a Data Scientist
• These are the skills that companies look for in a Data Scientist:
– Mathematical and statistical skills
– Programming skills in R, Python, etc.
– Knowledge of PostgreSQL, MySQL, or any other database
– Experience with data visualization tools and reporting techniques
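
The database skill listed above can be practised without installing a server. The sketch below uses SQLite through Python's built-in `sqlite3` module as a stand-in for PostgreSQL or MySQL; the `students` table and its rows are hypothetical, and the SQL shown (CREATE, INSERT, GROUP BY with AVG) is the standard kind that carries over to those systems.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, branch TEXT, marks REAL)")

# Hypothetical rows, inserted with parameter placeholders.
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("Asha", "CSE", 82.0), ("Ravi", "CSE", 74.0), ("Meena", "ECE", 91.0)],
)

# Average marks per branch.
rows = conn.execute(
    "SELECT branch, AVG(marks) FROM students GROUP BY branch ORDER BY branch"
).fetchall()
conn.close()

for branch, avg_marks in rows:
    print(branch, avg_marks)
```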

References
• Data Science with Python by Aaron England, Mohamed Noordeen Alaudeen, and
Rohan Chopra. Packt Publishing, July 2019.
• https://intellipaat.com/blog/what-is-data-science/
• https://onlinecourses.nptel.ac.in/noc20_cs36/
Learning Outcomes

Students have learned and understood the following:

• Data Science components

• Roles of Data Science Experts

• Data Scientist
Thank You
