
CSL 410 L15


Program: B.Tech (CSE), IV Semester, II Year

CSL-410: Data Science using Python

Unit No. 2
Pandas: Dataframes

Lecture No. 15

Dr. Sanjay Jain

Associate Professor, CSA/SOET
Outlines
• Introduction
• Create an Empty DataFrame
• Create a DataFrame from Lists
• Create a DataFrame from dict of ndarrays/Lists
• Create a DataFrame from List of Dicts
• Create a DataFrame from Dict of Series
• Column Selection, Addition, and Deletion
• Row Selection, Addition, and Deletion
• Examples
• References
Student Effective Learning Outcomes (SELO)
01: Ability to understand subject-related concepts clearly, along with
contemporary issues.
02: Ability to use updated tools, techniques, and skills for effective
domain-specific practices.
03: Understanding of available tools and products and the ability to use them
effectively.
DataFrame: Introduction
• A DataFrame is a two-dimensional data structure, i.e., data is aligned in a
tabular fashion in rows and columns.
• Features of DataFrame
– Columns can be of different types
– Size-mutable
– Labeled axes (rows and columns)
– Arithmetic operations can be performed on rows and columns
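
The features listed above can be illustrated with a short sketch. The student names, column names, and values here are made up for illustration and are not part of the lecture data.

```python
import pandas as pd

# Hypothetical student data: each column holds a different type.
df = pd.DataFrame({'Name': ['Asha', 'Ravi'],
                   'Age': [19, 20],
                   'Marks': [81.5, 74.0]})
print(df.dtypes)

# Size-mutable: a column can be added after the DataFrame is created.
df['Bonus'] = [2.0, 3.0]

# Labeled axes: rows and columns each carry their own labels.
print(list(df.index), list(df.columns))

# Arithmetic on whole columns is applied element by element.
df['Total'] = df['Marks'] + df['Bonus']
print(df)
```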

<SELO: 1> <Reference No.: R1,R4>

DataFrame: Introduction
• Structure: Let us assume that we are creating a data frame with student’s
data.

<SELO: 1> <Reference No.: R1,R4>

pandas.DataFrame()
• A pandas DataFrame can be created using the following constructor:
pandas.DataFrame(data, index, columns, dtype, copy)

S.No.  Parameter  Description
1      data       The data can take various forms such as ndarray, Series, map, list, dict, constants, or another DataFrame.
2      index      Row labels for the resulting frame. Optional; defaults to np.arange(n) if no index is passed.
3      columns    Column labels for the resulting frame. Optional; defaults to np.arange(n) if no column labels are passed.
4      dtype      Data type of each column.
5      copy       Whether to copy the input data. The default was False in older pandas versions.
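
As a minimal sketch of how these parameters fit together, the following call passes all five. The array contents and the labels 'r1', 'r2', 'x', and 'y' are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd

values = np.array([[1, 2], [3, 4]])

# data, index, columns, dtype and copy passed explicitly.
df = pd.DataFrame(data=values, index=['r1', 'r2'], columns=['x', 'y'],
                  dtype=float, copy=True)
print(df)

# Because the data was copied, changing the source array afterwards
# does not change the DataFrame.
values[0, 0] = 99
print(df.loc['r1', 'x'])

# When index and columns are omitted, both default to 0..n-1.
default = pd.DataFrame(values)
print(list(default.index), list(default.columns))
```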

<SELO: 1> <Reference No.: R1,R4>

Create DataFrame

• A pandas DataFrame can be created using various inputs like:

– Lists
– dict
– Series
– Numpy ndarrays
– Another DataFrame
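
Lists, dicts, and dicts of Series are covered in later slides. For the remaining inputs, a rough sketch might look like the following; the variable names and values are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# From a NumPy ndarray: 3 rows and 2 columns.
arr = np.arange(6).reshape(3, 2)
from_array = pd.DataFrame(arr, columns=['x', 'y'])

# From a single Series: the Series name becomes the column label.
from_series = pd.DataFrame(pd.Series([10, 20, 30], name='score'))

# From another DataFrame: 'columns' selects which columns to keep.
from_frame = pd.DataFrame(from_array, columns=['y'])

print(from_array)
print(from_series)
print(from_frame)
```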

<SELO: 1> <Reference No.: R1,R4>

DataFrame : Create an Empty DataFrame
• The most basic DataFrame that can be created is an empty DataFrame.
• Example:
# import the pandas library under the alias pd
import pandas as pd
df = pd.DataFrame()
print(df)
• Outcome:
Empty DataFrame
Columns: []
Index: []

<SELO: 1> <Reference No.: R1,R4>

DataFrame : Create a DataFrame from Lists
• The DataFrame can be created using a single list or a list of lists.
• Example:
import pandas as pd
data = [1,2,3,4,5]
df = pd.DataFrame(data)
print(df)
• Outcome:
0
0 1
1 2
2 3
3 4
4 5

<SELO: 1> <Reference No.: R1,R4>

DataFrame : Create a DataFrame from Lists
• The DataFrame can be created using a single list or a list of lists.
• Example:
import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
print(df)
• Outcome:
Name Age
0 Alex 10
1 Bob 12
2 Clarke 13

<SELO: 1> <Reference No.: R1,R4>

DataFrame : Create a DataFrame from Lists
• The DataFrame can be created using a single list or a list of lists.
• Example:
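import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
# Cast only the numeric column to float. Recent pandas versions reject
# dtype=float in the constructor here because 'Name' holds strings.
df = pd.DataFrame(data,columns=['Name','Age']).astype({'Age': float})
print(df)
• Outcome:
Name Age
0 Alex 10.0
1 Bob 12.0
2 Clarke 13.0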

<SELO: 1> <Reference No.: R1,R4>

DataFrame : Create a DataFrame from Dict of ndarrays / Lists

• All the ndarrays (or lists) must be of the same length. If an index is passed,
its length must equal the length of the arrays. If no index is passed, the index
defaults to range(n), where n is the array length.
• Example:
import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df= pd.DataFrame(data)
print(df)
• Outcome:
Name Age
0 Tom 28
1 Jack 34
2 Steve 29
3 Ricky 42
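
The index rule stated above can be checked with a small sketch. The labels 'rank1' to 'rank4' are assumptions added for illustration.

```python
import pandas as pd

data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]}

# An explicit index must have one label per row.
df = pd.DataFrame(data, index=['rank1', 'rank2', 'rank3', 'rank4'])
print(df)

# An index whose length differs from the lists is rejected.
try:
    pd.DataFrame(data, index=['rank1', 'rank2'])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
print(type(mismatch_error).__name__)
```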

<SELO: 1> <Reference No.: R1,R4>

DataFrame : Create a DataFrame from List of Dicts

• A list of dictionaries can be passed as input data to create a DataFrame. The
dictionary keys are taken as column names by default.
• Example:
import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
print(df)
• Outcome:
a b c
0 1 2 NaN
1 5 10 20.0

<SELO: 1> <Reference No.: R1,R4>

DataFrame : Create a DataFrame from List of Dicts

• The following example shows how to create a DataFrame with a list of
dictionaries, row indices, and column indices.
• Example:
import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
# Two column labels, both matching dictionary keys
df1 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b'])
# Two column labels, one of which ('b1') is not a dictionary key
df2 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b1'])
print(df1)
print(df2)

<SELO: 1> <Reference No.: R1,R4>

DataFrame : Create a DataFrame from List of Dicts

• Outcome:
a b
first 1 2
second 5 10
a b1
first 1 NaN
second 5 NaN
Note: df2 is created with a column label ('b1') that is not a dictionary
key, so that column is filled with NaN. df1 is created with column labels
that match the dictionary keys, so no NaN values appear.

<SELO: 1> <Reference No.: R1,R4>

DataFrame : Create a DataFrame from Dict of Series

• A dictionary of Series can be passed to form a DataFrame. The resultant
index is the union of all the Series indexes passed.
• Example:
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df)
• Outcome:
one two
a 1.0 1
b 2.0 2
c 3.0 3
d NaN 4
<SELO: 1> <Reference No.: R1,R4>
DataFrame : Create a DataFrame from Dict of Series

• Observe that for the Series 'one' there is no label 'd', so in the result
the value for label 'd' in column 'one' is NaN.
• Column Selection:
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df ['one'])
• Outcome:
a 1.0
b 2.0
c 3.0
d NaN
Name: one, dtype: float64
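
Selecting one column label returns a Series, as shown above. As a small sketch of a related case, passing a list of labels returns a DataFrame instead; the variable names are illustrative.

```python
import pandas as pd

d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)

single = df['one']            # one label: a Series
several = df[['two', 'one']]  # list of labels: a DataFrame, in the order given

print(type(single).__name__)
print(type(several).__name__)
print(several)
```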
<SELO: 1> <Reference No.: R1,R4>
DataFrame : Create a DataFrame from Dict of Series

Column Addition:
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
# Adding a new column to an existing DataFrame object with a column label
# by passing a new Series
print ("Adding a new column by passing as Series:")
df['three']=pd.Series([10,20,30],index=['a','b','c'])
print(df)
print ("Adding a new column using the existing columns in DataFrame:")
df['four']=df['one']+df['three']
print(df)
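
A column can also be added from a plain list or a single value rather than a Series. The following is a minimal sketch under that assumption; the column names 'five' and 'six' are made up for illustration.

```python
import pandas as pd

d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)

# A list is assigned by position, so it needs one value per row.
df['five'] = [100, 200, 300, 400]

# A single value is repeated for every row.
df['six'] = 0

# A Series is aligned by label: row 'd' has no match and becomes NaN.
df['three'] = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
print(df)
```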
<SELO: 1> <Reference No.: R1,R4>
DataFrame : Create a DataFrame from Dict of Series

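Column Addition:
• Outcome:
Adding a new column by passing as Series:
one two three
a 1.0 1 10.0
b 2.0 2 20.0
c 3.0 3 30.0
d NaN 4 NaN
Adding a new column using the existing columns in DataFrame:
one two three four
a 1.0 1 10.0 11.0
b 2.0 2 20.0 22.0
c 3.0 3 30.0 33.0
d NaN 4 NaN NaN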

<SELO: 1> <Reference No.: R1,R4>

DataFrame : Create a DataFrame from Dict of Series

Column Deletion:
# Using the previous DataFrame, we will delete a column
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd']),
'three' : pd.Series([10,20,30], index=['a','b','c'])}
df = pd.DataFrame(d)
print ("Our dataframe is:")
print (df)
# using del function
print ("Deleting the first column using DEL function:")
del df['one']
print (df)
# using pop function
print ("Deleting another column using POP function:")
df.pop('two')
print (df)
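
Both del and pop change the DataFrame in place. As a hedged sketch, the example below shows that pop also returns the removed column, and uses drop(columns=...) as an alternative that leaves the original untouched. The data values are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'one': [1, 2], 'two': [3, 4], 'three': [5, 6]})

# pop removes the column in place and returns it as a Series.
removed = df.pop('two')
print(removed)

# drop returns a new DataFrame; df itself keeps its columns.
trimmed = df.drop(columns=['three'])
print(list(df.columns))
print(list(trimmed.columns))
```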

<SELO: 1> <Reference No.: R1,R4>

DataFrame : Create a DataFrame from Dict of Series

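Column Deletion:
• Outcome:
Our dataframe is:
one two three
a 1.0 1 10.0
b 2.0 2 20.0
c 3.0 3 30.0
d NaN 4 NaN
Deleting the first column using DEL function:
two three
a 1 10.0
b 2 20.0
c 3 30.0
d 4 NaN
Deleting another column using POP function:
three
a 10.0
b 20.0
c 30.0
d NaN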

<SELO: 1> <Reference No.: R1,R4>

DataFrame : Create a DataFrame from Dict of Series

Row Selection, Addition, and Deletion :

• Selection by Label: Rows can be selected by passing row label to a loc
function.
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print (df.loc['b'])
Output:
one 2.0
two 2.0
Name: b, dtype: float64
• Note: The result is a series with labels as column names of the DataFrame.
And, the Name of the series is the label with which it is retrieved.

<SELO: 1> <Reference No.: R1,R4>

DataFrame : Create a DataFrame from Dict of Series

Row Selection, Addition, and Deletion :

• Selection by integer location: Rows can be selected by passing integer
location to an iloc function.
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print (df.iloc[2])
Output:
one 3.0
two 3.0
Name: c, dtype: float64
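
The two access styles can be compared directly. In this sketch, label 'c' sits at integer position 2, so both calls should return the same row; the variable names are illustrative.

```python
import pandas as pd

d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)

by_label = df.loc['c']      # row selected by its label
by_position = df.iloc[2]    # row selected by its integer position
print(by_label.equals(by_position))

# A list of labels or positions returns a DataFrame with those rows.
two_rows = df.loc[['a', 'c']]
print(two_rows)
```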

<SELO: 1> <Reference No.: R1,R4>

DataFrame : Create a DataFrame from Dict of Series

Row Selection, Addition, and Deletion :

• Slice Rows: Multiple rows can be selected using ‘ : ’ operator.
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print (df[2:4])
Output:
one two
c 3.0 3
d NaN 4
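
Slices behave differently depending on whether they use positions or labels. A minimal sketch, using iloc and loc to make the difference explicit:

```python
import pandas as pd

d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)

# Positional slice: the end position is excluded (rows 2 and 3).
by_position = df.iloc[2:4]

# Label slice: the end label is included (rows 'b' and 'c').
by_label = df.loc['b':'c']

print(by_position)
print(by_label)
```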

<SELO: 1> <Reference No.: R1,R4>

DataFrame : Create a DataFrame from Dict of Series

Row Selection, Addition, and Deletion :

• Addition of Rows: New rows can be added to a DataFrame by concatenating
another DataFrame with pd.concat, which places the new rows at the end. (The
DataFrame.append method used in older tutorials was removed in pandas 2.0.)
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns=['a','b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a','b'])
df = pd.concat([df, df2])
print(df)
Output:
a b
0 1 2
1 3 4
0 5 6
1 7 8
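
In the output above, the row labels 0 and 1 each appear twice. If fresh labels are wanted instead, one option is pd.concat with ignore_index=True, sketched here with the same data:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])

# ignore_index=True discards the old row labels and numbers the rows 0..n-1.
combined = pd.concat([df, df2], ignore_index=True)
print(combined)
```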
<SELO: 1> <Reference No.: R1,R4>
DataFrame : Create a DataFrame from Dict of Series

Row Selection, Addition, and Deletion :

• Deletion of Rows: Use an index label to delete or drop rows from a DataFrame. If
the label is duplicated, multiple rows will be dropped.
• In the previous example, the labels are duplicated. Let us drop a label
and see how many rows are dropped.
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns=['a','b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a','b'])
# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
df = pd.concat([df, df2])
# Drop rows with label 0
df = df.drop(0)
print(df)
Output:
a b
1 3 4
1 7 8
Note: Two rows were dropped because those two contain the same label 0.
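
If only one of the duplicated rows should be removed, the labels need to be made unique first. A possible sketch, assuming renumbering the rows is acceptable, uses reset_index(drop=True) before drop:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
combined = pd.concat([df, df2])          # row labels: 0, 1, 0, 1

# Dropping label 0 removes both rows that carry it.
without_zero = combined.drop(0)

# After renumbering, label 0 refers to a single row.
renumbered = combined.reset_index(drop=True)
only_first_removed = renumbered.drop(0)

print(without_zero)
print(only_first_removed)
```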
<SELO: 1> <Reference No.: R1,R4>
Learning Outcomes

The students have learned and understood the following:

• Introduction
• Create an Empty DataFrame
• Create a DataFrame from Lists
• Create a DataFrame from dict of ndarrays/Lists
• Create a DataFrame from List of Dicts
• Create a DataFrame from Dict of Series
• Column Selection, Addition, and Deletion
• Row Selection, Addition, and Deletion
References

1. Data Science with Python by Aaron England, Mohamed Noordeen Alaudeen, and Rohan Chopra. Packt Publishing, July 2019.
2. https://intellipaat.com/blog/what-is-data-science/
3. https://onlinecourses.nptel.ac.in/noc20_cs36/
Thank you
