
Intermediate Python ch3 Slides PDF

The slides cover comparison operators, Boolean operators, if/elif/else statements, and filtering Pandas DataFrames. The filtering section uses a brics DataFrame containing country data for Brazil, Russia, India, China and South Africa, and selects the countries with an area over 8 million km² in three steps.

Uploaded by

sujuma brahma

INTERMEDIATE PYTHON FOR DATA SCIENCE

Comparison Operators

Numpy Recap
In [1]: import numpy as np
In [2]: np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
In [3]: np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])
In [4]: bmi = np_weight / np_height ** 2

In [5]: bmi
Out[5]: array([ 21.852, 20.975, 21.75 , 24.747, 21.441])

In [6]: bmi > 23


Out[6]: array([False, False, False, True, False], dtype=bool)

In [7]: bmi[bmi > 23]


Out[7]: array([ 24.747])

Code from Intro to Python for Data Science, Chapter 4

Comparison operators: how Python values relate



Numeric Comparisons
In [8]: 2 < 3
Out[8]: True

In [9]: 2 == 3
Out[9]: False

In [10]: 2 <= 3
Out[10]: True

In [11]: 3 <= 3
Out[11]: True

In [12]: x = 2

In [13]: y = 3

In [14]: x < y
Out[14]: True

Other Comparisons
In [15]: "carl" < "chris"
Out[15]: True

In [16]: 3 < "chris"


TypeError: unorderable types: int() < str()

In [17]: 3 < 4.1


Out[17]: True

In [18]: bmi
Out[18]: array([ 21.852, 20.975, 21.75 , 24.747, 21.441])

In [19]: bmi > 23


Out[19]: array([False, False, False, True, False], dtype=bool)
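
The comparisons on this slide can be collected in one short script. This is only a sketch: the variable names are illustrative and not from the slides, and the TypeError message is not checked because its wording differs between Python versions.

```python
import numpy as np

# Strings are compared lexicographically, character by character.
strings_ok = "carl" < "chris"

# int and float can be compared directly.
mixed_numbers_ok = 3 < 4.1

# int and str cannot be ordered: Python 3 raises TypeError.
try:
    3 < "chris"
    raised = False
except TypeError:
    raised = True

# On a NumPy array the comparison is applied element by element.
bmi = np.array([21.852, 20.975, 21.75, 24.747, 21.441])
above_23 = bmi > 23

print(strings_ok, mixed_numbers_ok, raised)
print(above_23)
```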

Comparators
< strictly less than

<= less than or equal

> strictly greater than

>= greater than or equal

== equal

!= not equal
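
To see all six comparators side by side, a small sketch can map each symbol to its counterpart in the standard-library operator module and apply them to one pair of values. The helper name compare_all is hypothetical and only used for illustration.

```python
import operator

# Map each comparator from the table to the equivalent function
# in the standard-library operator module.
comparators = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}

def compare_all(a, b):
    """Return {symbol: result} for every comparator applied to (a, b)."""
    return {symbol: func(a, b) for symbol, func in comparators.items()}

results = compare_all(2, 3)
for symbol, value in results.items():
    print(f"2 {symbol} 3 -> {value}")
```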

Boolean Operators
● and
● or
● not

and
In [1]: True and True
Out[1]: True

In [2]: False and True


Out[2]: False

In [3]: True and False


Out[3]: False

In [4]: False and False


Out[4]: False

In [5]: x = 12

In [6]: x > 5 and x < 15   # True and True
Out[6]: True

or
In [7]: True or True
Out[7]: True

In [8]: False or True


Out[8]: True

In [9]: True or False


Out[9]: True

In [10]: False or False


Out[10]: False

In [11]: y = 5

In [12]: y < 7 or y > 13   # True or False
Out[12]: True

not
In [13]: not True
Out[13]: False

In [14]: not False


Out[14]: True
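
The individual results for and, or and not can be summarized as truth tables. The following sketch generates them instead of typing each case; the names rows and negations are illustrative and not from the slides.

```python
from itertools import product

# Enumerate every combination of two Boolean inputs.
rows = []
for a, b in product([True, False], repeat=2):
    rows.append((a, b, a and b, a or b))

print("a      b      a and b  a or b")
for a, b, both, either in rows:
    print(f"{a!s:<6} {b!s:<6} {both!s:<8} {either!s}")

# not takes a single operand and inverts it.
negations = {value: not value for value in (True, False)}
print(negations)
```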

Numpy
In [19]: bmi # calculation of bmi left out
Out[19]: array([ 21.852, 20.975, 21.75 , 24.747, 21.441])

In [20]: bmi > 21


Out[20]: array([ True, False, True, True, True], dtype=bool)

In [21]: bmi < 22


Out[21]: array([ True, True, True, False, True], dtype=bool)

In [23]: bmi > 21 and bmi < 22


ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
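
The error message points to a.any() and a.all(). Both reduce a Boolean array to a single truth value, which is what and, or and if expect. A minimal sketch, reusing the bmi values from the slides (the other variable names are illustrative):

```python
import numpy as np

bmi = np.array([21.852, 20.975, 21.75, 24.747, 21.441])
mask = bmi > 21  # element-wise comparison gives a Boolean array

# any(): True if at least one element is True.
some_above = mask.any()
# all(): True only if every element is True.
all_above = mask.all()

# Using the whole array as a condition raises the ValueError shown above.
try:
    if mask:
        pass
    ambiguous = False
except ValueError:
    ambiguous = True

print(some_above, all_above, ambiguous)
```

Neither method gives an element-wise result, so they do not replace an element-wise "and" when the goal is to build a mask for subsetting.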

Numpy: logical_and(), logical_or(), logical_not()
In [24]: np.logical_and(bmi > 21, bmi < 22)


Out[24]: array([ True, False, True, False, True], dtype=bool)

In [25]: bmi[np.logical_and(bmi > 21, bmi < 22)]


Out[25]: array([ 21.852, 21.75, 21.441])
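
The slide demonstrates only logical_and(). The sketch below applies logical_or() and logical_not() to the same bmi array. It also shows the & operator, which is not used in the slides but gives the same result as np.logical_and() on Boolean arrays; the thresholds 21 and 24 in the first mask are arbitrary example values.

```python
import numpy as np

bmi = np.array([21.852, 20.975, 21.75, 24.747, 21.441])

# Element-wise OR: below 21 or above 24.
outside = np.logical_or(bmi < 21, bmi > 24)

# Element-wise NOT: invert the "above 21" mask.
not_above_21 = np.logical_not(bmi > 21)

# For Boolean arrays, & gives the same result as np.logical_and.
# The parentheses are required because & binds tighter than > and <.
between = (bmi > 21) & (bmi < 22)

print(bmi[outside])
print(bmi[not_above_21])
print(np.array_equal(between, np.logical_and(bmi > 21, bmi < 22)))
```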

if, elif, else



Overview
● Comparison Operators
● <, >, >=, <=, ==, !=
● Boolean Operators
● and, or, not
● Conditional Statements
● if, else, elif

if

if condition :
    expression

control.py

z = 4
if z % 2 == 0 :    # True
    print("z is even")

Output:
z is even

if

if condition :
    expression

expression  # not part of if

if

control.py

z = 4
if z % 2 == 0 :
    print("checking " + str(z))
    print("z is even")

Output:
checking 4
z is even

if

control.py

z = 5
if z % 2 == 0 :    # False
    print("checking " + str(z))    # not executed
    print("z is even")             # not executed

Output:
(no output)

else

if condition :
    expression
else :
    expression

control.py

z = 5
if z % 2 == 0 :    # False
    print("z is even")
else :
    print("z is odd")

Output:
z is odd

elif

if condition :
    expression
elif condition :
    expression
else :
    expression

control.py

z = 3
if z % 2 == 0 :      # False
    print("z is divisible by 2")
elif z % 3 == 0 :    # True
    print("z is divisible by 3")
else :
    print("z is neither divisible by 2 nor by 3")

Output:
z is divisible by 3

elif

control.py

z = 6
if z % 2 == 0 :      # True
    print("z is divisible by 2")
elif z % 3 == 0 :    # never reached
    print("z is divisible by 3")
else :
    print("z is neither divisible by 2 nor by 3")

Output:
z is divisible by 2
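
To compare several values of z without editing control.py each time, the same if/elif/else chain can be wrapped in a function. This is a sketch only: the function name describe is hypothetical, and it returns the message instead of printing it so the result can be inspected.

```python
def describe(z):
    """Return the message the if/elif/else chain would print for z."""
    if z % 2 == 0:
        return "z is divisible by 2"
    elif z % 3 == 0:
        # Only checked when the first condition is False.
        return "z is divisible by 3"
    else:
        return "z is neither divisible by 2 nor by 3"

for z in (3, 6, 5):
    print(z, "->", describe(z))
```

For z = 6 both conditions would be True, but only the first matching branch runs, so the elif is never evaluated.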

Filtering Pandas DataFrame



brics
In [1]: import pandas as pd

In [2]: brics = pd.read_csv("path/to/brics.csv", index_col = 0)

In [3]: brics
Out[3]:
         country    capital    area  population
BR        Brazil   Brasilia   8.516      200.40
RU        Russia     Moscow  17.100      143.50
IN         India  New Delhi   3.286     1252.00
CH         China    Beijing   9.597     1357.00
SA  South Africa   Pretoria   1.221       52.98

Goal
● Select countries with area over 8 million km²
● 3 steps
    ● Select the area column
    ● Do comparison on area column
    ● Use result to select countries
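
The three steps can be sketched end to end as follows. Because the path to brics.csv is not given in the slides, this sketch assumes the table is built inline from the values shown in the brics output; the variable names area and huge are illustrative.

```python
import pandas as pd

# Build the brics table inline; the slides load it from a CSV file instead.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.100, 3.286, 9.597, 1.221],
        "population": [200.40, 143.50, 1252.00, 1357.00, 52.98],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

area = brics["area"]      # Step 1: select the area column (a Series)
is_huge = area > 8        # Step 2: compare, giving a Boolean Series
huge = brics[is_huge]     # Step 3: keep only the rows where the mask is True

print(huge)
```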

Step 1: Get column

In [4]: brics["area"]
Out[4]:
BR     8.516
RU    17.100
IN     3.286
CH     9.597
SA     1.221
Name: area, dtype: float64

Alternatives:
brics.loc[:,"area"]
brics.iloc[:,2]

Need Pandas Series
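
The note about needing a Series matters because single brackets return a Series while double brackets return a one-column DataFrame. The sketch below checks that the three selections on this slide give the same Series. It assumes the brics table is built inline from the values shown earlier, since the CSV path is not given; the variable names are illustrative.

```python
import pandas as pd

brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.100, 3.286, 9.597, 1.221],
        "population": [200.40, 143.50, 1252.00, 1357.00, 52.98],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

as_series = brics["area"]       # single brackets: Pandas Series
as_frame = brics[["area"]]      # double brackets: one-column DataFrame
via_loc = brics.loc[:, "area"]  # label-based alternative
via_iloc = brics.iloc[:, 2]     # position-based alternative (third column)

print(type(as_series).__name__, type(as_frame).__name__)
print(as_series.equals(via_loc), as_series.equals(via_iloc))
```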

Step 2: Compare

In [5]: brics["area"] > 8
Out[5]:
BR     True
RU     True
IN    False
CH     True
SA    False
Name: area, dtype: bool

In [6]: is_huge = brics["area"] > 8

Step 3: Subset DF

In [7]: is_huge
Out[7]:
BR     True
RU     True
IN    False
CH     True
SA    False
Name: area, dtype: bool

In [8]: brics[is_huge]
Out[8]:
    country   capital    area  population
BR   Brazil  Brasilia   8.516       200.4
RU   Russia    Moscow  17.100       143.5
CH    China   Beijing   9.597      1357.0

Summary

In [9]: is_huge = brics["area"] > 8

In [10]: brics[is_huge]
Out[10]:
    country   capital    area  population
BR   Brazil  Brasilia   8.516       200.4
RU   Russia    Moscow  17.100       143.5
CH    China   Beijing   9.597      1357.0

In [11]: brics[brics["area"] > 8]
Out[11]:
    country   capital    area  population
BR   Brazil  Brasilia   8.516       200.4
RU   Russia    Moscow  17.100       143.5
CH    China   Beijing   9.597      1357.0

Boolean operators

In [12]: import numpy as np

In [13]: np.logical_and(brics["area"] > 8, brics["area"] < 10)
Out[13]:
BR     True
RU    False
IN    False
CH     True
SA    False
Name: area, dtype: bool

In [14]: brics[np.logical_and(brics["area"] > 8, brics["area"] < 10)]
Out[14]:
    country   capital    area  population
BR   Brazil  Brasilia   8.516       200.4
CH    China   Beijing   9.597      1357.0
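
As a sketch of how the other logical functions carry over to DataFrames, the example below filters with np.logical_or() and compares np.logical_and() with the & operator. The & form is not used in the slides, and the thresholds 2 and 10 are arbitrary example values. The table is built inline with only the columns needed here, since the CSV path is not given.

```python
import numpy as np
import pandas as pd

brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "area": [8.516, 17.100, 3.286, 9.597, 1.221],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Same mask built two ways: NumPy function and the & operator.
mask_np = np.logical_and(brics["area"] > 8, brics["area"] < 10)
mask_op = (brics["area"] > 8) & (brics["area"] < 10)

# Rows with a very small or very large area.
extremes = brics[np.logical_or(brics["area"] < 2, brics["area"] > 10)]

print(list(mask_np) == list(mask_op))
print(extremes)
```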