Pandas.ipynb - Colab (1)
import numpy as np
import pandas as pd
data = pd.Series([5,14,99,888])
data
0 5
1 14
2 99
3 888
dtype: int64
data[3]
888
data.values
array([  5,  14,  99, 888])
data.index
RangeIndex(start=0, stop=4, step=1)
As with a NumPy array, data can be accessed by the associated index via the familiar Python square-bracket notation:
data[1]
14
data[1:3]
1 14
2 99
dtype: int64
data = pd.Series([0.25, 0.5, 0.75, 1.0],
                 index=['a', 'b', 'c', 'd'])
data
a 0.25
b 0.50
c 0.75
d 1.00
dtype: float64
data['b']
0.5
students = pd.Series({'Ram': 123, 'Shyam': 124, 'Arun': 125})
students
Ram 123
Shyam 124
Arun 125
dtype: int64
By default, a Series will be created whose index is drawn from the dictionary's keys (in insertion order in modern Pandas). From here, typical dictionary-style item access can be performed:
students['Ram']
123
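As a short sketch of the dictionary-to-Series conversion (the `marks` dict below is illustrative, not the notebook's data), an explicit index passed alongside the dictionary selects and orders the entries:

```python
import pandas as pd

# illustrative dict mapping names to values (assumed data)
marks = {'Ram': 123, 'Shyam': 124, 'Arun': 125}

# an explicit index selects and orders which keys to keep
s = pd.Series(marks, index=['Arun', 'Ram'])
print(s)
```

Keys named in the explicit index but missing from the dictionary would appear with NaN values.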
# create a DataFrame from a dict of lists
df = pd.DataFrame({'Name': ['tom', 'nick', 'juli'],
                   'Age': [10, 15, 14]})
# print dataframe.
df
Name Age
0 tom 10
1 nick 15
2 juli 14
df.index
RangeIndex(start=0, stop=3, step=1)
Additionally, the DataFrame has a columns attribute, which is an Index object holding the column labels:
df.columns
Index(['Name', 'Age'], dtype='object')
Thus the DataFrame can be thought of as a generalization of a two-dimensional NumPy array, where both the rows and columns have a
generalized index for accessing the data.
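As a minimal sketch of that view (toy values assumed for illustration), NumPy-style positional access and label-based access reach the same cell:

```python
import numpy as np
import pandas as pd

# assumed toy values: two rows, two labeled columns
arr = np.array([[10, 80], [15, 70]])
df = pd.DataFrame(arr, columns=['Age', 'Mark'], index=['tom', 'nick'])

print(arr[0, 1])              # plain 2D positional indexing
print(df.loc['tom', 'Mark'])  # explicit row/column labels
print(df.iloc[0, 1])          # implicit integer positions
```

All three expressions select the same underlying value, which is what makes the "generalized 2D array" picture useful.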
df['Name']
0 tom
1 nick
2 juli
Name: Name, dtype: object
A DataFrame is a collection of Series objects, and a single-column DataFrame can be constructed from a single Series :
pd.DataFrame(students, columns=['rollno'])
rollno
Ram 123
Shyam 124
Arun 125
Even if some keys in the dictionary are missing, Pandas will fill them in with NaN (i.e., "not a number") values:
pd.DataFrame([{'a': 1, 'b': 2}, {'b': 3, 'c': 4}])
a b c
0 1.0 2 NaN
1 NaN 3 4.0
Given a two-dimensional array of data, we can create a DataFrame with any specified column and index names. If omitted, an integer index will
be used for each:
np.random.rand(3, 2)
array([[0.48925761, 0.81202557],
[0.37526746, 0.9834642 ],
[0.10226165, 0.37402615]])
pd.DataFrame(np.random.rand(3, 2),
columns=['foo', 'bar'],
index=['a', 'b', 'c'])
foo bar
a 0.965321 0.512423
b 0.969355 0.437354
c 0.196705 0.719428
We covered structured arrays in Structured Data: NumPy's Structured Arrays. A Pandas DataFrame operates much like a structured array, and
can be created directly from one:
A = np.zeros(3, dtype=[('A', '<i8'), ('B', '<f8')])
A
array([(0, 0.), (0, 0.), (0, 0.)], dtype=[('A', '<i8'), ('B', '<f8')])
pd.DataFrame(A)
A B
0 0 0.0
1 0 0.0
2 0 0.0
# index pair assumed for illustration; their intersection matches the output below
indA = pd.Index([1, 3, 5, 7, 9])
indB = pd.Index([2, 3, 5, 7, 11])
indA & indB # intersection - common elements
<ipython-input-71-b0dd807d5915>:1: FutureWarning: Index.__and__ operating as a set operation is deprecated, in the future this will be a logical operation matching Series.__and__. Use index.intersection(other) instead.
Int64Index([3, 5, 7], dtype='int64')
import pandas as pd
data = pd.Series([0.25, 0.5, 0.75, 1.0],
index=['a', 'b', 'c', 'd'])
data
a 0.25
b 0.50
c 0.75
d 1.00
dtype: float64
# masking
data[data == 0.5]
b 0.5
dtype: float64
# fancy indexing
data[['a', 'd']]
a 0.25
d 1.00
dtype: float64
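Masks can also be combined with the element-wise `&` and `|` operators; a sketch reusing the same Series (note the parentheses, since `&` binds tighter than the comparison operators):

```python
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0], index=['a', 'b', 'c', 'd'])

# parentheses are required: & binds tighter than > and <
sel = data[(data > 0.3) & (data < 0.8)]
print(sel)
```

Using Python's `and`/`or` here would raise an error, because those operate on whole objects rather than element-wise.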
data = pd.Series(['a', 'b', 'c'], index=[1, 3, 5])
data
1 a
3 b
5 c
dtype: object
data[1] # explicit index when indexing
'a'
data[1:3] # implicit index when slicing
3 b
5 c
dtype: object
Because of this potential confusion in the case of integer indexes, Pandas provides some special indexer attributes that explicitly expose
certain indexing schemes.
First, the loc attribute allows indexing and slicing that always references the explicit index:
data.loc[1] #Explicit
'a'
data.loc[1:3]
1 a
3 b
dtype: object
The iloc attribute allows indexing and slicing that always references the implicit Python-style index:
data.iloc[1:3] #Implicit
3 b
5 c
dtype: object
data = pd.DataFrame({'Name': ['Ram', 'Shyam', 'Arun', 'Gopal'],
                     'Rollno': [123, 124, 125, 235],
                     'FDS_Mark': [80, 70, 35, 95],
                     'DS_Mark': [85, 75, 60, 70]})
data
Name Rollno FDS_Mark DS_Mark
0 Ram 123 80 85
1 Shyam 124 70 75
2 Arun 125 35 60
3 Gopal 235 95 70
data['Rollno']
0 123
1 124
2 125
3 235
Name: Rollno, dtype: int64
data['FDS_Mark']/100 # marks as a fraction
0 0.80
1 0.70
2 0.35
3 0.95
Name: FDS_Mark, dtype: float64
data['DS_Mark']-15
0 70
1 60
2 45
3 55
Name: DS_Mark, dtype: int64
data['Total_mark']=data['FDS_Mark']+data['DS_Mark']
data
data['Total_mark'].mean()
142.5
data['Total_mark'].median()
155.0
data['Total_mark'].mode()
0 165
dtype: int64
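The three aggregations above can be checked on a standalone Series whose values mirror the Total_mark column (165, 145, 95, 165):

```python
import pandas as pd

# totals assumed to match FDS_Mark + DS_Mark from the data above
totals = pd.Series([165, 145, 95, 165])

print(totals.mean())    # 570 / 4
print(totals.median())  # midpoint of the two middle values, 145 and 165
print(totals.mode())    # most frequent value(s), returned as a Series
```

Note that mode() always returns a Series, since a dataset can have more than one most-frequent value.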
fillna(): Return a copy of the data with missing values filled or imputed
import numpy as np
data = pd.Series([1, np.nan, 'hello', None])
data
0 1
1 NaN
2 hello
3 None
dtype: object
data.isnull()
0 False
1 True
2 False
3 True
dtype: bool
data.dropna()
0 1
2 hello
dtype: object
data
0 1
1 NaN
2 hello
3 None
dtype: object
data.fillna(0)
0 1
1 0
2 hello
3 0
dtype: object
# forward-fill (recent Pandas prefers data.ffill() over method='ffill')
data.fillna(method='ffill')
0 1
1 1
2 hello
3 hello
dtype: object
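fillna also works column-wise on a DataFrame; a short sketch (assumed toy data) passing a dict of per-column fill values, with one column filled by its own mean:

```python
import numpy as np
import pandas as pd

# assumed toy frame with gaps in both columns
df = pd.DataFrame({'A': [1.0, np.nan, 3.0],
                   'B': [np.nan, 5.0, np.nan]})

# fill column A with a constant and column B with its column mean
filled = df.fillna({'A': 0, 'B': df['B'].mean()})
print(filled)
```

Like the Series version, this returns a filled copy and leaves the original DataFrame untouched.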