Pandas - Basics - Practice: Consider The Following Python Dictionary Data and Python List Labels

The document provides instructions and code snippets to practice working with pandas DataFrames. It defines sample data as a Python dictionary and list, and creates a DataFrame from this, setting the list as the index. It then demonstrates various methods to explore, select, filter and summarize the DataFrame, such as describing the data, selecting subsets of rows and columns, grouping and aggregating, and sorting.

Uploaded by

ABC

pandas_basics_practice

January 4, 2021

Consider the following Python dictionary data and Python list labels:
data = {'birds': ['Cranes', 'Cranes', 'plovers', 'spoonbills', 'spoonbills', 'Cranes', 'plovers', 'Cranes', 'spoonbills', 'spoonbills'],
        'age': [3.5, 4, 1.5, np.nan, 6, 3, 5.5, np.nan, 8, 4],
        'visits': [2, 4, 3, 4, 3, 4, 2, 2, 3, 2],
        'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
1. Create a DataFrame birds from this dictionary data which has the index labels.

[1]: import pandas as pd
import numpy as np

record={'Birds': ['Cranes', 'Cranes', 'plovers', 'spoonbills', 'spoonbills',
                  'Cranes', 'plovers', 'Cranes', 'spoonbills', 'spoonbills'],
        'Age': [3.5, 4, 1.5, np.nan, 6, 3, 5.5, np.nan, 8, 4],
        'Visits': [2, 4, 3, 4, 3, 4, 2, 2, 3, 2],
        'Priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']}

recordIndex=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
birdsDataFrame=pd.DataFrame(record,columns=['Birds','Age','Visits','Priority'],index=recordIndex)

birdsDataFrame

[1]: Birds Age Visits Priority


a Cranes 3.5 2 yes
b Cranes 4.0 4 yes
c plovers 1.5 3 no
d spoonbills NaN 4 yes
e spoonbills 6.0 3 no
f Cranes 3.0 4 no
g plovers 5.5 2 no
h Cranes NaN 2 yes
i spoonbills 8.0 3 no
j spoonbills 4.0 2 no
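
The task statement names its inputs `data` and `labels`, while the solution above renames them and capitalises the column names. A minimal sketch that keeps the task's own names, assuming pandas and NumPy are installed (the variable name `birds` comes from the task wording):

```python
import numpy as np
import pandas as pd

data = {
    'birds': ['Cranes', 'Cranes', 'plovers', 'spoonbills', 'spoonbills',
              'Cranes', 'plovers', 'Cranes', 'spoonbills', 'spoonbills'],
    'age': [3.5, 4, 1.5, np.nan, 6, 3, 5.5, np.nan, 8, 4],
    'visits': [2, 4, 3, 4, 3, 4, 2, 2, 3, 2],
    'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no'],
}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

# Columns follow the dictionary's insertion order, so `columns=` can be omitted.
birds = pd.DataFrame(data, index=labels)
print(birds.shape)  # (10, 4)
```

Passing `columns=` explicitly, as the solution does, is only needed to reorder or subset the dictionary keys.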

2. Display a summary of the basic information about birds DataFrame and its data.

[2]: print('====================================================')
# Print the full birds DataFrame
print(birdsDataFrame)
print('====================================================')
# Display the data type of each column
print(birdsDataFrame.dtypes)
print('====================================================')
# Display summary statistics for the numeric columns
print(birdsDataFrame.describe())
print('====================================================')

====================================================
Birds Age Visits Priority
a Cranes 3.5 2 yes
b Cranes 4.0 4 yes
c plovers 1.5 3 no
d spoonbills NaN 4 yes
e spoonbills 6.0 3 no
f Cranes 3.0 4 no
g plovers 5.5 2 no
h Cranes NaN 2 yes
i spoonbills 8.0 3 no
j spoonbills 4.0 2 no
====================================================
Birds object
Age float64
Visits int64
Priority object
dtype: object
====================================================
Age Visits
count 8.000000 10.000000
mean 4.437500 2.900000
std 2.007797 0.875595
min 1.500000 2.000000
25% 3.375000 2.000000
50% 4.000000 3.000000
75% 5.625000 3.750000
max 8.000000 4.000000
====================================================
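
For a compact structural summary (index range, non-null counts, dtypes), `DataFrame.info()` is one common choice. The sketch below captures its text in a buffer so it can be inspected; the exact layout of the report varies between pandas versions, so no expected output is shown. The names `buffer` and `summary` are illustrative.

```python
import io

import numpy as np
import pandas as pd

birdsDataFrame = pd.DataFrame(
    {'Birds': ['Cranes', 'Cranes', 'plovers', 'spoonbills', 'spoonbills',
               'Cranes', 'plovers', 'Cranes', 'spoonbills', 'spoonbills'],
     'Age': [3.5, 4, 1.5, np.nan, 6, 3, 5.5, np.nan, 8, 4],
     'Visits': [2, 4, 3, 4, 3, 4, 2, 2, 3, 2],
     'Priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']},
    index=list('abcdefghij'))

# info() writes to stdout by default; pass a buffer to capture the text instead.
buffer = io.StringIO()
birdsDataFrame.info(buf=buffer)
summary = buffer.getvalue()
print(summary)
```

Calling `birdsDataFrame.info()` with no arguments prints the same report directly.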
3. Print the first 2 rows of the birds dataframe

[3]: birdsDataFrame[0:2]

[3]: Birds Age Visits Priority


a Cranes 3.5 2 yes
b Cranes 4.0 4 yes
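
Slicing with `[0:2]` works, but `head()` and `iloc` state the intent more directly. A small sketch showing that the three forms select the same rows here (variable names are illustrative):

```python
import numpy as np
import pandas as pd

birdsDataFrame = pd.DataFrame(
    {'Birds': ['Cranes', 'Cranes', 'plovers', 'spoonbills', 'spoonbills',
               'Cranes', 'plovers', 'Cranes', 'spoonbills', 'spoonbills'],
     'Age': [3.5, 4, 1.5, np.nan, 6, 3, 5.5, np.nan, 8, 4],
     'Visits': [2, 4, 3, 4, 3, 4, 2, 2, 3, 2],
     'Priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']},
    index=list('abcdefghij'))

first_two_head = birdsDataFrame.head(2)    # first n rows
first_two_iloc = birdsDataFrame.iloc[:2]   # positional slice, end exclusive
first_two_slice = birdsDataFrame[0:2]      # the form used in the solution

print(first_two_head)
```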

4. Print all the rows with only ‘birds’ and ‘age’ columns from the dataframe

[4]: print('Option 1:')
print(birdsDataFrame[['Birds','Age']])
print('====================================================')
print('Option 2:')
print(birdsDataFrame.loc[:,'Birds':'Age'])

Option 1:
Birds Age
a Cranes 3.5
b Cranes 4.0
c plovers 1.5
d spoonbills NaN
e spoonbills 6.0
f Cranes 3.0
g plovers 5.5
h Cranes NaN
i spoonbills 8.0
j spoonbills 4.0
====================================================
Option 2:
Birds Age
a Cranes 3.5
b Cranes 4.0
c plovers 1.5
d spoonbills NaN
e spoonbills 6.0
f Cranes 3.0
g plovers 5.5
h Cranes NaN
i spoonbills 8.0
j spoonbills 4.0
5. Select rows [2, 3, 7] and columns [‘birds’, ‘age’, ‘visits’]

[5]: resultDataFrame=birdsDataFrame.iloc[[2,3,7]]
resultDataFrame[['Birds','Age','Visits']]

[5]: Birds Age Visits


c plovers 1.5 3
d spoonbills NaN 4
h Cranes NaN 2

6. Select the rows where the number of visits is less than 4

[6]: birdsDataFrame.loc[birdsDataFrame['Visits']<4]

[6]: Birds Age Visits Priority


a Cranes 3.5 2 yes
c plovers 1.5 3 no

e spoonbills 6.0 3 no
g plovers 5.5 2 no
h Cranes NaN 2 yes
i spoonbills 8.0 3 no
j spoonbills 4.0 2 no

7. Select the rows with columns [‘birds’, ‘visits’] where the age is missing, i.e. NaN

[7]: birdsDataFrame.loc[birdsDataFrame['Age'].isnull(),['Birds','Visits']]

[7]: Birds Visits


d spoonbills 4
h Cranes 2
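
`isna()` is an equivalent spelling of `isnull()`, and the same boolean mask can be summed to count the missing values. A short sketch (the names `missing_age`, `result` and `n_missing` are illustrative):

```python
import numpy as np
import pandas as pd

birdsDataFrame = pd.DataFrame(
    {'Birds': ['Cranes', 'Cranes', 'plovers', 'spoonbills', 'spoonbills',
               'Cranes', 'plovers', 'Cranes', 'spoonbills', 'spoonbills'],
     'Age': [3.5, 4, 1.5, np.nan, 6, 3, 5.5, np.nan, 8, 4],
     'Visits': [2, 4, 3, 4, 3, 4, 2, 2, 3, 2],
     'Priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']},
    index=list('abcdefghij'))

missing_age = birdsDataFrame['Age'].isna()          # True where Age is NaN
result = birdsDataFrame.loc[missing_age, ['Birds', 'Visits']]
n_missing = int(missing_age.sum())                  # True counts as 1

print(result)
print(n_missing)  # 2
```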

8. Select the rows where the bird is a Crane and the age is less than 4

[8]: birdsDataFrame.loc[(birdsDataFrame['Birds']=='Cranes') & (birdsDataFrame['Age']<4)]

[8]: Birds Age Visits Priority


a Cranes 3.5 2 yes
f Cranes 3.0 4 no
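
Row h is a Crane with a missing age, and it does not appear in the result because any comparison with NaN evaluates to False. The sketch below makes that explicit by selecting the unknown-age Cranes separately (variable names are illustrative):

```python
import numpy as np
import pandas as pd

birdsDataFrame = pd.DataFrame(
    {'Birds': ['Cranes', 'Cranes', 'plovers', 'spoonbills', 'spoonbills',
               'Cranes', 'plovers', 'Cranes', 'spoonbills', 'spoonbills'],
     'Age': [3.5, 4, 1.5, np.nan, 6, 3, 5.5, np.nan, 8, 4],
     'Visits': [2, 4, 3, 4, 3, 4, 2, 2, 3, 2],
     'Priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']},
    index=list('abcdefghij'))

is_crane = birdsDataFrame['Birds'] == 'Cranes'
is_young = birdsDataFrame['Age'] < 4        # NaN < 4 is False, so NaN rows drop out

young_cranes = birdsDataFrame[is_crane & is_young]
unknown_age_cranes = birdsDataFrame[is_crane & birdsDataFrame['Age'].isna()]

print(young_cranes.index.tolist())        # ['a', 'f']
print(unknown_age_cranes.index.tolist())  # ['h']
```

The parentheses around each comparison in the original one-liner are required because `&` binds more tightly than `==` and `<`.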

9. Select the rows where the age is between 2 and 4 (inclusive)

[9]: birdsDataFrame.loc[(birdsDataFrame['Age']>=2) & (birdsDataFrame['Age']<=4)]

[9]: Birds Age Visits Priority


a Cranes 3.5 2 yes
b Cranes 4.0 4 yes
f Cranes 3.0 4 no
j spoonbills 4.0 2 no
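
The pair of comparisons can be replaced by `Series.between`, which includes both endpoints by default. A minimal sketch (the name `aged_2_to_4` is illustrative):

```python
import numpy as np
import pandas as pd

birdsDataFrame = pd.DataFrame(
    {'Birds': ['Cranes', 'Cranes', 'plovers', 'spoonbills', 'spoonbills',
               'Cranes', 'plovers', 'Cranes', 'spoonbills', 'spoonbills'],
     'Age': [3.5, 4, 1.5, np.nan, 6, 3, 5.5, np.nan, 8, 4],
     'Visits': [2, 4, 3, 4, 3, 4, 2, 2, 3, 2],
     'Priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']},
    index=list('abcdefghij'))

# between(2, 4) is True for 2 <= Age <= 4; NaN ages yield False.
aged_2_to_4 = birdsDataFrame[birdsDataFrame['Age'].between(2, 4)]
print(aged_2_to_4.index.tolist())  # ['a', 'b', 'f', 'j']
```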

10. Find the total number of visits of the bird Cranes

[10]: cranesVisits=birdsDataFrame.loc[(birdsDataFrame['Birds']=='Cranes'),'Visits']
sumVisits=cranesVisits.sum()
print('Total number of visits for Cranes:',sumVisits)

Total number of visits for Cranes: 12
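
If totals are wanted for every species rather than only Cranes, a `groupby` computes them in one pass. This is a sketch of that variation, not part of the original exercise; `visits_per_bird` is an illustrative name.

```python
import numpy as np
import pandas as pd

birdsDataFrame = pd.DataFrame(
    {'Birds': ['Cranes', 'Cranes', 'plovers', 'spoonbills', 'spoonbills',
               'Cranes', 'plovers', 'Cranes', 'spoonbills', 'spoonbills'],
     'Age': [3.5, 4, 1.5, np.nan, 6, 3, 5.5, np.nan, 8, 4],
     'Visits': [2, 4, 3, 4, 3, 4, 2, 2, 3, 2],
     'Priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']},
    index=list('abcdefghij'))

# Sum the Visits column within each Birds group.
visits_per_bird = birdsDataFrame.groupby('Birds')['Visits'].sum()
print(visits_per_bird['Cranes'])  # 12
```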


11. Calculate the mean age for each type of bird in the dataframe.

[11]: resultDataFrame=birdsDataFrame.groupby('Birds')
resultDataFrame=resultDataFrame['Age'].mean()
print(resultDataFrame)

Birds
Cranes 3.5
plovers 3.5

spoonbills 6.0
Name: Age, dtype: float64
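
The group means above skip missing ages, so each mean is taken over fewer rows than the group contains. Reporting the non-null count alongside the mean makes that visible; the sketch below uses `agg` with the names `'mean'` and `'count'` (the variable `age_stats` is illustrative).

```python
import numpy as np
import pandas as pd

birdsDataFrame = pd.DataFrame(
    {'Birds': ['Cranes', 'Cranes', 'plovers', 'spoonbills', 'spoonbills',
               'Cranes', 'plovers', 'Cranes', 'spoonbills', 'spoonbills'],
     'Age': [3.5, 4, 1.5, np.nan, 6, 3, 5.5, np.nan, 8, 4],
     'Visits': [2, 4, 3, 4, 3, 4, 2, 2, 3, 2],
     'Priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']},
    index=list('abcdefghij'))

# 'count' reports non-null ages only, so Cranes shows 3 although there are 4 Cranes.
age_stats = birdsDataFrame.groupby('Birds')['Age'].agg(['mean', 'count'])
print(age_stats)
```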
12. Append a new row ‘k’ to dataframe with your choice of values for each column.
Then delete that row to return the original DataFrame.

[12]: # Use numeric values for Age and Visits so the columns keep their numeric dtypes
birdsDataFrame.loc['k']=['ABC',3.0,2,'yes']
print("\n------------ ADDING THE ROW ----------------\n")
print(birdsDataFrame)
birdsDataFrame.drop(index='k',inplace=True)
print("\n------------ AFTER DROPPING THE ROW ----------------\n")
print(birdsDataFrame)

------------ ADDING THE ROW ----------------

Birds Age Visits Priority
a Cranes 3.5 2 yes
b Cranes 4.0 4 yes
c plovers 1.5 3 no
d spoonbills NaN 4 yes
e spoonbills 6.0 3 no
f Cranes 3.0 4 no
g plovers 5.5 2 no
h Cranes NaN 2 yes
i spoonbills 8.0 3 no
j spoonbills 4.0 2 no
k ABC 3.0 2 yes

------------ AFTER DROPPING THE ROW ----------------

Birds Age Visits Priority
a Cranes 3.5 2 yes
b Cranes 4.0 4 yes
c plovers 1.5 3 no
d spoonbills NaN 4 yes
e spoonbills 6.0 3 no
f Cranes 3.0 4 no
g plovers 5.5 2 no
h Cranes NaN 2 yes
i spoonbills 8.0 3 no
j spoonbills 4.0 2 no
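
An alternative to in-place mutation is to build the new row as a one-row DataFrame and combine it with `pd.concat`, then check that dropping it really restores the original. This is a sketch under the assumption that the new row uses numeric values for Age and Visits; `original`, `new_row`, `extended` and `restored` are illustrative names.

```python
import numpy as np
import pandas as pd

birdsDataFrame = pd.DataFrame(
    {'Birds': ['Cranes', 'Cranes', 'plovers', 'spoonbills', 'spoonbills',
               'Cranes', 'plovers', 'Cranes', 'spoonbills', 'spoonbills'],
     'Age': [3.5, 4, 1.5, np.nan, 6, 3, 5.5, np.nan, 8, 4],
     'Visits': [2, 4, 3, 4, 3, 4, 2, 2, 3, 2],
     'Priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']},
    index=list('abcdefghij'))
original = birdsDataFrame.copy()

# One-row frame with matching column names and numeric Age/Visits values.
new_row = pd.DataFrame(
    {'Birds': ['ABC'], 'Age': [3.0], 'Visits': [2], 'Priority': ['yes']},
    index=['k'])

extended = pd.concat([birdsDataFrame, new_row])  # returns a new frame
restored = extended.drop(index='k')              # also returns a new frame

# equals() compares values and dtypes, treating NaN in the same place as equal.
print(restored.equals(original))  # True
```

Entering the numbers as strings (for example `'3'`) would turn Age and Visits into object columns, and dropping the row afterwards would not convert them back.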
13. Find the number of each type of birds in dataframe (Counts)

[13]: birdsDataFrame.groupby('Birds')['Birds'].count()

[13]: Birds
Cranes 4

plovers 2
spoonbills 4
Name: Birds, dtype: int64
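
`Series.value_counts()` gives the same counts without an explicit `groupby`. A minimal sketch (the name `counts` is illustrative); note that it orders the result by frequency rather than alphabetically.

```python
import numpy as np
import pandas as pd

birdsDataFrame = pd.DataFrame(
    {'Birds': ['Cranes', 'Cranes', 'plovers', 'spoonbills', 'spoonbills',
               'Cranes', 'plovers', 'Cranes', 'spoonbills', 'spoonbills'],
     'Age': [3.5, 4, 1.5, np.nan, 6, 3, 5.5, np.nan, 8, 4],
     'Visits': [2, 4, 3, 4, 3, 4, 2, 2, 3, 2],
     'Priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']},
    index=list('abcdefghij'))

# Count how often each distinct value occurs in the Birds column.
counts = birdsDataFrame['Birds'].value_counts()
print(counts)
```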

14. Sort dataframe (birds) first by the values in the ‘age’ column in descending order, then by
the values in the ‘visits’ column in ascending order.

[14]: # A single sort with two keys: Age descending, ties broken by Visits ascending
print('Sort by Age descending, then by Visits ascending:')
print(birdsDataFrame.sort_values(by=['Age','Visits'],ascending=[False,True]))

Sort by Age descending, then by Visits ascending:
Birds Age Visits Priority
i spoonbills 8.0 3 no
e spoonbills 6.0 3 no
g plovers 5.5 2 no
j spoonbills 4.0 2 no
b Cranes 4.0 4 yes
a Cranes 3.5 2 yes
f Cranes 3.0 4 no
c plovers 1.5 3 no
h Cranes NaN 2 yes
d spoonbills NaN 4 yes
15. Replace the priority column values: ‘yes’ should become 1 and ‘no’ should become 0

[15]: birdsDataFrame.replace({'Priority':{'yes':1,'no':0}})

[15]: Birds Age Visits Priority


a Cranes 3.5 2 1
b Cranes 4.0 4 1
c plovers 1.5 3 0
d spoonbills NaN 4 1
e spoonbills 6.0 3 0

f Cranes 3.0 4 0
g plovers 5.5 2 0
h Cranes NaN 2 1
i spoonbills 8.0 3 0
j spoonbills 4.0 2 0

16. In the ‘birds’ column, change the ‘Cranes’ entries to ‘trumpeters’.

[16]: birdsDataFrame.replace({'Birds':{'Cranes':'trumpeters'}})

[16]: Birds Age Visits Priority


a trumpeters 3.5 2 yes
b trumpeters 4.0 4 yes
c plovers 1.5 3 no
d spoonbills NaN 4 yes
e spoonbills 6.0 3 no
f trumpeters 3.0 4 no
g plovers 5.5 2 no
h trumpeters NaN 2 yes
i spoonbills 8.0 3 no
j spoonbills 4.0 2 no
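
As with the previous task, the call above returns a new frame rather than modifying `birdsDataFrame`. A sketch that stores the change, limited to the one column so that other columns cannot be affected (the name `renamed` is illustrative):

```python
import numpy as np
import pandas as pd

birdsDataFrame = pd.DataFrame(
    {'Birds': ['Cranes', 'Cranes', 'plovers', 'spoonbills', 'spoonbills',
               'Cranes', 'plovers', 'Cranes', 'spoonbills', 'spoonbills'],
     'Age': [3.5, 4, 1.5, np.nan, 6, 3, 5.5, np.nan, 8, 4],
     'Visits': [2, 4, 3, 4, 3, 4, 2, 2, 3, 2],
     'Priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']},
    index=list('abcdefghij'))

renamed = birdsDataFrame.copy()
# Series.replace swaps exact matches only; other values are left untouched.
renamed['Birds'] = renamed['Birds'].replace('Cranes', 'trumpeters')

print(renamed['Birds'].tolist())
```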
