Digital Data Part 4

This chapter introduces data science, its life cycle, and essential tools such as Python, R, and Google Colaboratory. It outlines the stages of data science projects, including data acquisition, exploration, analysis, and visualization. The chapter also emphasizes the application of data science across various fields like healthcare, business, and education.

Uploaded by

Manivannan B
Copyright
© All Rights Reserved

DATA SCIENCE PART 4

Chapter Outline
15.1 Introduction to data science
15.2 NumPy
15.3 Pandas
15.4 Exploratory data analysis
15.5 Data visualization
15.6 Summary
Data science provides the ability to derive insights and make informed decisions
from data. It helps people make decisions in disciplines as diverse as
healthcare, business, education, politics, environmental science, and social
sciences.

This chapter aims to introduce the field of data science and the data science life
cycle. The resources provided in this chapter are meant to guide readers using
Python to further explore data science.
Learning objectives
By the end of this section you should be able to:

Describe data science.
Identify different stages of the data science life cycle.
Name data science tools and software.
Use Google Colaboratory to run code.
Data science life cycle
Data science is a multidisciplinary field that combines collecting, processing, and
analyzing large volumes of data to extract insights and drive informed
decision-making. Data science is increasingly being adopted in many different
fields, such as healthcare, economics, education, and social sciences, to name a
few.

The data science life cycle is the framework that data scientists follow to
complete a data science project. It is an iterative process that starts with data
acquisition, followed by data exploration. The data acquisition stage may involve
obtaining data from an existing source or collecting data through surveys and
other domain-specific means of data collection. During the data exploration
stage, data scientists clean the data so that they are in the right format for the
data analysis stage, and they may also visualize the data for further inspection.
Once the data are cleaned, data scientists can perform data analysis, which
involves using the data to generate insights or build a predictive model. The
results are then shared with stakeholders through reports and presentations. The
animation below demonstrates different stages of the data science life cycle.
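
The stages described above can also be sketched in a few lines of Python. This is a minimal illustration and not a prescribed workflow: the survey data, the column names, and the function names (acquire, clean, analyze, report) are all hypothetical, and only the Python standard library is used so the example runs anywhere.

```python
import csv
import io
import statistics

# Data acquisition: hypothetical survey data, embedded as CSV text
# so the example is self-contained. One score is missing on purpose.
RAW_CSV = """student,hours_studied,exam_score
Ana,2,65
Ben,4,
Chloe,6,88
Dev,8,92
"""


def acquire(text):
    """Read CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Data exploration (cleanup): drop rows with a missing score
    and convert the numeric fields from strings to floats."""
    cleaned = []
    for row in rows:
        if row["exam_score"] == "":
            continue
        cleaned.append({
            "student": row["student"],
            "hours_studied": float(row["hours_studied"]),
            "exam_score": float(row["exam_score"]),
        })
    return cleaned


def analyze(rows):
    """Data analysis: compute simple summary statistics."""
    scores = [row["exam_score"] for row in rows]
    return {
        "count": len(scores),
        "mean_score": statistics.mean(scores),
        "max_score": max(scores),
    }


def report(summary):
    """Sharing results: format the summary as a one-line report."""
    return (f"{summary['count']} students, "
            f"mean score {summary['mean_score']:.1f}, "
            f"highest {summary['max_score']:.0f}")


rows = acquire(RAW_CSV)
cleaned_rows = clean(rows)
summary = analyze(cleaned_rows)
print(report(summary))  # 3 students, mean score 81.7, highest 92
```

Because the life cycle is iterative, a real project would usually loop back: a surprising result in the analysis step often sends the data scientist back to acquire more data or to clean it differently.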

Concepts in Practice
What is data science?
What is the first stage of any data science life cycle?
a. data visualization
b. data cleanup
c. data acquisition
How many stages does the data science life cycle have?
a. 3
b. 4
c. 5
What does a data scientist do in the data exploration stage?
a. document insights and visualization
b. analyze data
c. data cleaning and visualization
Data science tools
Several tools and software are commonly used in data science. Here are some
examples.

Python programming language: Python is widely used in data science. It has a
large ecosystem of libraries designed for data analysis, machine learning, and
visualization. Some popular Python libraries for data science include NumPy,
Pandas, Matplotlib, Seaborn, and scikit-learn. In this chapter, you will explore
some of these libraries.
R programming language: R is commonly used in statistical computing and data
analysis, and it offers a wide range of packages and libraries tailored for data
manipulation, statistical modeling, and visualization.
Jupyter Notebook/JupyterLab: Jupyter Notebook and JupyterLab are web-based
interactive computing environments that support multiple programming
languages, including Python and R. They allow a programmer to create
documents that contain code, visualizations, and text, making them suitable for
data exploration, analysis, and reporting.
Google Colaboratory: Google Colaboratory is a cloud-based Jupyter Notebook
environment that allows a programmer to write, run, and share Python code
online. In this chapter, you will use Google Colaboratory to practice data science
concepts.

Kaggle Kernels: Kaggle Kernels is an online data science platform that provides a
collaborative environment for building and running code. It supports Python and
R and offers access to datasets, pre-installed libraries, and computational
resources. Kaggle also hosts data science competitions and provides a platform
for sharing and discovering data science projects.
Excel/Sheets: Microsoft Excel and Google Sheets are widely used spreadsheet
applications that offer basic data analysis and visualization capabilities. They can
help beginners get started with data manipulation, basic statistical calculations,
and simple visualizations.
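
To see which of the Python libraries named above are available in a given environment, a short check like the following may help. It is a sketch only: the dictionary LIBRARIES and the function check_libraries are hypothetical names introduced for this example. The check relies on importlib.util.find_spec from the standard library, which returns None when a top-level module cannot be found.

```python
import importlib.util

# Import names for the libraries named in the list above.
# Note that scikit-learn is imported under the name "sklearn".
LIBRARIES = {
    "NumPy": "numpy",
    "Pandas": "pandas",
    "Matplotlib": "matplotlib",
    "Seaborn": "seaborn",
    "scikit-learn": "sklearn",
}


def check_libraries(libraries):
    """Map each library name to True if its module can be imported."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in libraries.items()
    }


availability = check_libraries(LIBRARIES)
for name, installed in availability.items():
    status = "available" if installed else "not installed"
    print(f"{name}: {status}")
```

The output depends on the environment in which the code runs, so no expected output is shown here.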
Checkpoint
Concepts in Practice
Data science tools and software
Among Python, R, and Java, which is the most popular language in data
science?
a. Python
b. R
c. Java

Which of the following is a data science-related library in Python?
a. list
b. NumPy
c. array
Google Colaboratory can be used for reporting and sharing insights.
a. true
b. false
Programming practice with Google Colaboratory
Open the Google Colaboratory document below. To open the Colaboratory
document, you need to log in to a Google account, or create one if you do not
already have one. Run all cells. You may also try creating new cells or
modifying existing cells. To save a copy of your edits, go to "File > Save a Copy
in Drive", and the edited file will be stored in your own Google Drive.
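
When creating a new cell, a small snippet such as the one below can serve as a first experiment. It is only a suggestion and is not part of the Colaboratory document referenced above; the function name environment_summary is hypothetical. It reports which Python version and operating system the notebook is running on, using only the standard library.

```python
import platform
import sys


def environment_summary():
    """Collect basic facts about the Python environment running this cell."""
    return {
        "python_version": platform.python_version(),
        "operating_system": platform.system(),
        "executable": sys.executable,
    }


info = environment_summary()
for key, value in info.items():
    print(f"{key}: {value}")
```

The printed values depend on where the cell runs, so they will differ between a hosted notebook and a local installation.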
