Pandas

The document provides an overview of the Pandas library for data manipulation in Python, covering installation, key data structures (Series, DataFrame, Index), and various operations such as sorting, statistical functions, and indexing. It includes code examples for creating Series and DataFrames, reindexing, and performing statistical calculations. The document serves as a guide for beginners to understand and utilize Pandas for data analysis.

Uploaded by

shinchanshinchan941

1. Data Manipulation Using Pandas Library

2. Learning Objectives
 Introduction to Pandas
 Installation of Pandas
 Pandas Objects
 Pandas Sort
 Working with Text Data
 Statistical Function
 Indexing and Selecting Data
3. Introduction to Pandas
 Pandas is an open-source Python library that uses powerful data structures to provide
high-performance data manipulation and analysis.
 It provides a variety of data structures and operations for manipulating numerical data
and time series.
 This library is based on the NumPy library.
4. Installation of Pandas
 The first step in using pandas is to check whether it is already installed in your
Python environment.
 If it is not, install it with the pip (Pip Installs Packages) command:

pip install pandas

Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: pandas in c:\programdata\anaconda3\lib\site-packages (1.4.2)
Requirement already satisfied: pytz>=2020.1 in c:\programdata\anaconda3\lib\site-packages (from pandas) (2021.3)
Requirement already satisfied: python-dateutil>=2.8.1 in c:\programdata\anaconda3\lib\site-packages (from pandas) (2.8.2)
Requirement already satisfied: numpy>=1.18.5 in c:\programdata\anaconda3\lib\site-packages (from pandas) (1.21.5)
Requirement already satisfied: six>=1.5 in c:\programdata\anaconda3\lib\site-packages (from python-dateutil>=2.8.1->pandas) (1.16.0)
Note: you may need to restart the kernel to use updated packages.

 After installing pandas on your system, you'll need to import the library.
 By convention, the module is imported under the alias pd:

import pandas as pd
5. Introducing Pandas Objects

 Pandas objects can be thought of as enhanced versions of NumPy structured arrays in
which the rows and columns are identified with labels rather than simple integer
indices.
 There are three fundamental Pandas data structures:
 Series
 DataFrame
 Index

6. What is a Series?

 A Pandas Series is a labelled one-dimensional array that can hold any type of data
(integer, string, float, Python objects, and so on).
 A Series is comparable to a single column in an Excel spreadsheet.
 Using the pd.Series() constructor, we can easily convert a list, tuple, or dictionary
into a Series.

6.1. Creating a Series

import pandas as pd
import numpy as np
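# Creating an empty Series (an explicit dtype avoids a warning in some pandas versions).
ser = pd.Series(dtype="object")
print(ser)
# Creating a Series from a simple NumPy array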
data = np.array(['T', 'A', 'S', 'K'])
ser = pd.Series(data)
print(ser)
6.2. Creating a series from Lists:
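A minimal sketch of building a Series from a Python list, reusing the letters from the array example in 6.1. The custom index labels 'a' to 'd' are illustrative assumptions, not part of the original notes.

```python
import pandas as pd

# A plain Python list becomes a Series with a default integer index (0, 1, 2, ...).
letters = ['T', 'A', 'S', 'K']
ser = pd.Series(letters)
print(ser)

# An explicit index can be supplied; it must have the same length as the list.
labelled = pd.Series(letters, index=['a', 'b', 'c', 'd'])
print(labelled['c'])  # label-based lookup
```

Passing index= is optional; when it is omitted, pandas numbers the entries starting from 0.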
7. Pandas Index
 Pandas Index is an efficient tool for extracting particular rows and columns of data
from a DataFrame.
 Its job is to organise data and make it easily accessible.
 We can also define an index, similar to an address, through which we can access any
data in the Series or DataFrame.

7.1. Creating index

First, we load a CSV file that contains the data to be indexed.
# importing pandas package
import pandas as pd
data = pd.read_csv("airlines.csv")
data
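Since the contents of airlines.csv are not shown, the following sketch uses a small in-memory table instead to illustrate how a column can be turned into the index. The column names and values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the CSV data; names and numbers are made up.
flights = pd.DataFrame({
    'Airline': ['AirOne', 'SkyJet', 'BlueWing'],
    'Passengers': [120, 95, 150],
})

# Promote the 'Airline' column to the row index.
indexed = flights.set_index('Airline')
print(indexed.index)

# Rows can now be looked up by their label, much like an address.
print(indexed.loc['SkyJet', 'Passengers'])
```

set_index() returns a new DataFrame by default, so the original table is left unchanged.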
8. Pandas DataFrame
A DataFrame is a two-dimensional data structure with labelled rows and columns.
DataFrames are similar to spreadsheets in Excel or Calc, or to SQL tables.
A Pandas DataFrame consists of three main components: the data, the index, and the columns.

8.1. Creating a Pandas DataFrame

Creating a DataFrame from a list: A DataFrame can be created from a single list or a list of lists.
import pandas as pd
# list of strings
lst = ['Geeks', 'For', 'Geeks', 'is', 'portal', 'for', 'Geeks']
# Calling DataFrame constructor on list
df = pd.DataFrame(lst)
print(df)
Creating a DataFrame from a dict of ndarrays/lists: To generate a DataFrame from a dict of
ndarrays/lists, each ndarray or list must be the same length.
# Python code demonstrating how to create a
# DataFrame from a dict of ndarrays/lists.
# By default, the index is range(n), where n is the length of the lists.
import pandas as pd
# Initialise data as a dict of lists.
data = { 'Name': ['Tom', 'nick', 'krish', 'jack'],
'Age': [20, 21, 19, 18]}
# Create DataFrame
df = pd.DataFrame(data)
# Print the output.
print(df)
9. Reindexing
 Reindexing changes the row and column labels of a DataFrame.
 It means conforming the data to a given set of labels along a particular axis.
Reindexing enables us to carry out a variety of operations, including:
 Inserting missing value (NaN) markers at label locations where there was
previously no data for the label.
 Reordering existing data to correspond to a new set of labels.
 To reindex the DataFrame, use the reindex() function.
 Labels in the new index that have no matching records in the DataFrame are given
the value NaN by default.

import pandas as pd
# Create dataframe
info = pd.DataFrame({"P":[4, 7, 1, 8, 9],
"Q":[6, 8, 10, 15, 11],
"R":[17, 13, 12, 16, 14],
"S":[15, 19, 7, 21, 9]},
index =["Parker", "William", "Smith", "Terry", "Phill"])
# Print the DataFrame
print(info)
Now, we can use the dataframe.reindex() function to reindex the dataframe.
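A minimal sketch of reindex() on a shortened version of the DataFrame above. The label "Jones" and the column "T" are hypothetical additions, used only to show where NaN appears.

```python
import pandas as pd

info = pd.DataFrame({"P": [4, 7, 1, 8, 9],
                     "Q": [6, 8, 10, 15, 11]},
                    index=["Parker", "William", "Smith", "Terry", "Phill"])

# Reorder and subset the rows. "Jones" is not in the original index,
# so its row is filled with NaN.
rows = info.reindex(["Smith", "Parker", "Jones"])
print(rows)

# Columns are reindexed the same way through the columns argument.
# "T" does not exist in the original, so it becomes an all-NaN column.
cols = info.reindex(columns=["Q", "P", "T"])
print(cols)
```

reindex() returns a new DataFrame; the original info is not modified.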
10. Pandas Sort
 There are two kinds of sorting available in Pandas:
 By label
 By actual value
 By label - The sort_index() method sorts a DataFrame by its labels; the axis and
the sorting order can be passed as arguments. Row labels are sorted in ascending
order by default.
 By actual value - The sort_values() method sorts a DataFrame by the values of one
or more columns, named with the by argument.
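Both kinds of sorting can be sketched on a small table. The names and ages below are illustrative assumptions modelled on the earlier DataFrame example.

```python
import pandas as pd

df = pd.DataFrame({'Age': [20, 21, 19, 18]},
                  index=['Tom', 'Nick', 'Krish', 'Jack'])

# Sort by row label, ascending by default.
by_label = df.sort_index()
print(by_label)

# Sort by row label in descending order.
by_label_desc = df.sort_index(ascending=False)
print(by_label_desc)

# Sort by the actual values in the 'Age' column.
by_value = df.sort_values(by='Age')
print(by_value)
```

Passing axis=1 to sort_index() sorts the column labels instead of the row labels.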
11. Working with Text Data
 Pandas provides a set of string functions, available through the .str accessor, that
make working with string data simple.
 Most importantly, these functions skip missing/NaN values: a missing entry stays
missing in the result instead of raising an error.
 Each operation is applied element-wise to the whole Series.
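A few string operations can be sketched as follows. The sample names are assumptions for illustration, with one missing value included to show how NaN is handled.

```python
import pandas as pd
import numpy as np

names = pd.Series(['Tom', 'nick', np.nan, 'jack'])

# Convert every string to upper case; the missing entry stays missing.
upper = names.str.upper()
print(upper)

# Length of each string.
lengths = names.str.len()
print(lengths)

# Test whether each string contains the letter 'c'.
has_c = names.str.contains('c')
print(has_c)
```

Other commonly used methods on the same accessor include str.lower(), str.strip(), and str.replace().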

12. Statistical Functions

 Pandas reduces many complex statistical operations in Python to a single line of
code.
 Some of the most popular and practical statistical operations are covered here.

Pandas sum() method

import pandas as pd
# Dataset
data = {
'Maths' :[90, 85, 98, 80, 55, 78],
'Science': [92, 87, 59, 64, 87, 96],
'English': [95, 94, 84, 75, 67, 65]
}
# DataFrame
df = pd.DataFrame(data)
# Display the DataFrame
print("DataFrame = \n",df)
# Display the Sum of Marks in each column
print("\nSum = \n",df.sum())
print("\nCount of non-empty values = \n", df.count())
print("\nMaximum Marks = \n", df.max())
print("\nMinimum Marks = \n", df.min())
print("\nMedian = \n",df.median())

Pandas count() method

import pandas as pd
# Dataset
data = {
'Maths': [90, 85, 98, None, 55, 78],
'Science': [92, 87, 59, None, None, 96],
'English': [95, None, 84, 75, 67, None]
}
# DataFrame
df = pd.DataFrame(data)
# Display the DataFrame
print("DataFrame = \n", df)
# Display the Count of non-empty values in each column
print("\nCount of non-empty values = \n", df.count())
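The dataset with missing values also shows how other statistical functions treat NaN. The sketch below reuses two of the columns above; by default, sum() and mean() skip missing entries, and skipna=False changes that behaviour.

```python
import pandas as pd

marks = pd.DataFrame({
    'Maths': [90, 85, 98, None, 55, 78],
    'Science': [92, 87, 59, None, None, 96],
})

# Missing values are skipped by default.
print(marks.sum())    # Maths 406.0, Science 334.0
print(marks.mean())   # Maths 81.2, Science 83.5

# With skipna=False, any missing value makes the result NaN.
print(marks['Maths'].sum(skipna=False))
```

This is why count() is useful alongside mean(): it shows how many values each average is actually based on.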
13. Indexing and Selecting Data
 In Pandas, indexing means selecting specific rows and columns of data from a
DataFrame.
 This can mean selecting all the rows and some of the columns, some of the rows and
all the columns, or a subset of both rows and columns.
 Another term for indexing is subset selection.
 Pandas supports three types of multi-axes indexing: the indexing operator [], the
label-based .loc[], and the integer-position-based .iloc[].

13.1. Indexing a DataFrame using the indexing operator [] :

The indexing operator [] selects one or more columns of a DataFrame by label. Older versions
of Pandas also offered a fourth indexer, .ix, which could select both by integer location and
by label. Although it was adaptable, its lack of explicitness led to a lot of confusion, because
integers can also serve as labels for rows and columns. In most cases, .ix was label-based and
behaved exactly like the .loc indexer, but it fell back to integer positions (like .iloc) when
an integer was passed and the DataFrame's index was not integer-based. The .ix indexer was
deprecated in Pandas 0.20 and removed in Pandas 1.0, so .loc and .iloc should be used instead.
# importing pandas package
import pandas as pd
# making data frame from csv file
data = pd.read_csv("nba.csv", index_col ="Name")
# retrieving columns by indexing operator
first = data["Age"]
print(first)
13.2. Indexing a DataFrame using .loc[ ] :
# importing pandas package
import pandas as pd
# making data frame from csv file
data = pd.read_csv("nba.csv", index_col ="Name")
# retrieving row by loc method
first = data.loc["Avery Bradley"]
second = data.loc["R.J. Hunter"]

print(first, "\n\n\n", second)

13.3. Indexing a DataFrame using .iloc[ ]

import pandas as pd
# making data frame from csv file
data = pd.read_csv("nba.csv", index_col ="Name")
# retrieving a row by iloc method (positions are zero-based, so 3 is the fourth row)
row4 = data.iloc[3]
print(row4)
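The three indexers can be compared side by side without the nba.csv file. The sketch below uses a small hypothetical table; the player names, ages, and teams are made up for illustration.

```python
import pandas as pd

# Hypothetical rows standing in for nba.csv.
players = pd.DataFrame({
    'Name': ['Player A', 'Player B', 'Player C'],
    'Age': [25, 22, 27],
    'Team': ['X', 'Y', 'X'],
}).set_index('Name')

# [] selects a column by label and returns a Series.
ages = players['Age']

# .loc selects by label: a whole row, or a single cell.
row_b = players.loc['Player B']
cell = players.loc['Player C', 'Team']

# .iloc selects by zero-based integer position.
first = players.iloc[0]
block = players.iloc[0:2, 0:1]  # first two rows, first column

print(ages, row_b, cell, first, block, sep="\n\n")
```

Note that slices in .iloc exclude the end position, as in ordinary Python slicing, while label slices in .loc include both endpoints.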
