Data Science

This document provides an overview of data science. It defines data science as using data to solve real-world problems. It discusses the evolution from data mining to data science and provides examples of data science applications at companies like Netflix, Flipkart, and Facebook. It also outlines the data science process, different types of data, and popular tools used in data science like Python, R, Tableau, and Spark. Finally, it focuses on R programming, describing its features, installation, and use of packages.

DATA SCIENCE

PRESENTED BY
DR. HARSHA PATIL
ASHOKA CENTER FOR BUSINESS AND COMPUTER STUDIES,
NASHIK
Chapter 1

• What is data science?
• Scenarios in data science
• How data science helps organizations
• Types of data: structured, unstructured, and machine-generated
• Understanding the data science process
• Setting the research goal
• Data processing in data science
• Data science projects
What is Data Science?

• The science that helps solve real-world problems using data.
• Companies and organizations use data (their own historical data, other companies' data, openly available data, or survey data) to improve their products.
History of Data Science: Data Mining to Data Science

• In 1996, an article was published with the title “From Data Mining to Knowledge Discovery in Databases”.
• Data mining: the practice of examining large databases in order to generate new information.
From data to knowledge:

• Data: raw facts
• SQL: information
• Data mining: new information
• Data science: knowledge
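The progression from raw facts to information to new information can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the purchase records, table name, and column names are hypothetical, and the "mining" step is a simple count of products bought together, not a full data mining algorithm.

```python
import sqlite3
from collections import Counter
from itertools import combinations

# Raw facts: hypothetical purchase records (order id, product).
purchases = [
    (1, "bread"), (1, "butter"),
    (2, "bread"), (2, "butter"), (2, "jam"),
    (3, "bread"), (3, "butter"),
    (4, "milk"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (order_id INTEGER, product TEXT)")
conn.executemany("INSERT INTO purchases VALUES (?, ?)", purchases)

# Information: SQL summarizes the raw facts (units sold per product).
units_sold = dict(
    conn.execute(
        "SELECT product, COUNT(*) FROM purchases GROUP BY product"
    ).fetchall()
)
conn.close()

# New information: group products by order, then count co-purchased pairs.
baskets = {}
for order_id, product in purchases:
    baskets.setdefault(order_id, set()).add(product)

pair_counts = Counter()
for items in baskets.values():
    for pair in combinations(sorted(items), 2):
        pair_counts[pair] += 1

top_pair, top_count = pair_counts.most_common(1)[0]
print(units_sold)
print("Most frequent pair:", top_pair, "bought together", top_count, "times")
```

Turning such a pattern into a decision, for example placing the two products side by side, is the kind of knowledge the last step of the list refers to.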
Data Science
In 2001:

• Computer science + data mining = data science
• “Data science, also known as data-driven science, is an interdisciplinary field about scientific methods, processes, and systems to extract knowledge or insights from data in various forms, either structured or unstructured, similar to data mining.” (Wikipedia)
What is Data Science? (continued)

• Empowerment of statistical analysis techniques using computing technology
• Evolution of Web 2.0 (2004)
• The 3 V's of big data (volume, velocity, variety)
Types of Data

• Structured
• Semi-structured
• Unstructured
• Human-generated
• Machine-generated
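As a rough illustration of these categories, the following Python sketch handles one small sample of each kind. All sample values (customer rows, JSON records, the review text, the sensor log line) are made up for the example.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields (hypothetical CSV).
csv_text = "customer_id,city,amount\n101,Nashik,250\n102,Pune,400\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
total_amount = sum(int(row["amount"]) for row in rows)

# Semi-structured: self-describing keys, but fields can differ per record.
json_text = '[{"id": 101, "tags": ["new"]}, {"id": 102, "email": "a@example.com"}]'
records = json.loads(json_text)
all_keys = sorted({key for record in records for key in record})

# Unstructured, human-generated: free text with no predefined fields.
review = "Great phone, great battery, but the camera is average."
word_count = len(review.split())

# Machine-generated: a hypothetical sensor log line in key=value form.
log_line = "sensor=7 temp=21.5"
reading = dict(field.split("=") for field in log_line.split())

print(total_amount, all_keys, word_count, reading)
```

The structured sample can be summed directly by column name, while the free-text review needs parsing before even a simple word count is possible.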
The four levels of data

• The nominal level
• The ordinal level
• The interval level
• The ratio level
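The four levels differ in which summary operations are meaningful. A minimal Python sketch, using made-up values for each level, might look like this:

```python
from statistics import mean, mode

# Nominal: labels without order; counting and the mode make sense.
cities = ["Nashik", "Pune", "Nashik", "Mumbai"]
most_common_city = mode(cities)

# Ordinal: ordered categories; the median is meaningful, differences are not.
size_order = ["small", "medium", "large"]
shirt_sizes = ["small", "large", "medium", "medium", "large"]
ranks = sorted(size_order.index(size) for size in shirt_sizes)
median_size = size_order[ranks[len(ranks) // 2]]

# Interval: differences are meaningful, but zero is arbitrary (degrees Celsius).
temps_c = [10.0, 20.0, 30.0]
mean_temp = mean(temps_c)
temp_range = max(temps_c) - min(temps_c)

# Ratio: a true zero exists, so ratios are meaningful (weight in kilograms).
weights_kg = [40.0, 80.0]
weight_ratio = weights_kg[1] / weights_kg[0]

print(most_common_city, median_size, mean_temp, temp_range, weight_ratio)
```

Saying that 80 kg is twice 40 kg is valid at the ratio level, whereas saying that 20 °C is "twice as hot" as 10 °C is not, because the Celsius zero point is arbitrary.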
How Does Data Science Help Organizations?

Let us first look at some case studies.
Case Study: Data Science at Netflix
Case Study: Data Science at Flipkart
Ranking system at Flipkart
Case Study: Data Science at Facebook
Understanding the Data Science Process
Research Goal and Data Processing in Data Science
Data Science Projects

• Movie Recommendation System Project
• Customer Segmentation using Machine Learning
• Sentiment Analysis Model in R
• Uber Data Analysis Project
• Credit Card Fraud Detection Project in R
Applications of DS

• Image recognition and speech recognition
• Gaming world
• Internet search
• Transport
• Healthcare
• Recommendation systems
• Risk detection
DS Tool Box

• Python programming
• R programming
• SAS
• Tableau Public
• Microsoft Excel
• RapidMiner
• Apache Spark
• KNIME
R for Data Science

• R is a programming language for statistical computing, widely used by data miners and statisticians for data analysis.
• Developed in 1995 by Ross Ihaka and Robert Gentleman; the name ‘R’ was derived from the first letters of their names.
Features of R

• Open source
• Complete language

• Analytical support
• Facilitates interaction with databases

• Supports extensions
• Simple and easy to understand
Installing R

• Getting started with R
• Overview of R
• Why R for data science
• Downloading and installing R
• Installing RStudio
• Project workspace setup
• Understanding R packages
• Installing packages
• Loading libraries and installed packages
