Act 7.2

Uploaded by

huelvamicah
Copyright
© All Rights Reserved
Micah T. Huelva
BSIT 3-1
Activity #7 PanDaS
Importing pandas
import pandas as pd

Series
- a one-dimensional labeled array, created much like an array with np.array() in NumPy
series1 = pd.Series([9,3,1,7,8,5])
series1

0 9
1 3
2 1
3 7
4 8
5 5
dtype: int64

# getting the array representation


series1.values

array([9, 3, 1, 7, 8, 5], dtype=int64)

# getting the index object of the Series via its index attribute

series1.index

RangeIndex(start=0, stop=6, step=1)
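To make the two attributes above concrete, here is a minimal sketch that rebuilds the same Series and inspects its data and labels. The names `s`, `arr`, and `idx` are illustrative only, and `to_numpy()` is used here as an alternative to `.values`; it is not part of the original activity.

```python
import numpy as np
import pandas as pd

s = pd.Series([9, 3, 1, 7, 8, 5])

arr = s.to_numpy()    # NumPy array holding the data, like s.values
idx = list(s.index)   # default labels are simply the positions

print(type(arr).__name__)  # ndarray
print(idx)                 # [0, 1, 2, 3, 4, 5]
```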

Changing the index


series2 = pd.Series([9,3,1,7,8,5], index=['a','b','c','d','e','f'])
series2

series2.values
series2.index

Index(['a', 'b', 'c', 'd', 'e', 'f'], dtype='object')


Series Operations
# provide a Series where the values in series2 are multiplied by 3
series2 * 3

a 27
b 9
c 3
d 21
e 24
f 15
dtype: int64

Passing a Series as an argument to NumPy functions.


import numpy as np
np.square(series2)

a 81
b 9
c 1
d 49
e 64
f 25
dtype: int64
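The sketch below illustrates the two operations above on a fresh copy of the same data: arithmetic and NumPy functions work element by element, and the result is still a Series that keeps the original labels. The names `s`, `tripled`, and `squared` are chosen for this example.

```python
import numpy as np
import pandas as pd

s = pd.Series([9, 3, 1, 7, 8, 5], index=list("abcdef"))

tripled = s * 3          # element-wise multiplication
squared = np.square(s)   # a NumPy function applied to a Series returns a Series

print(tripled.tolist())      # [27, 9, 3, 21, 24, 15]
print(squared.tolist())      # [81, 9, 1, 49, 64, 25]
print(list(squared.index))   # ['a', 'b', 'c', 'd', 'e', 'f']
```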

Series Indexing and Slicing


# access value having index 'e'
series2["e"]

8

# access the part of series2 with values at index 'a','d','f'
series2[["a","d","f"]]

a 9
d 7
f 5
dtype: int64

# access the part of series2 where the values are less than 4
series2[series2 < 4]

b 3
c 1
dtype: int64
Slicing
# access the values from index 'c' to 'f' (the end label is included)
series2["c":"f"]

c 1
d 7
e 8
f 5
dtype: int64
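One detail worth seeing side by side: slicing by label includes the end label, while slicing by position (through `iloc`) excludes the end position. The following is a small sketch of that difference; `iloc` and the variable names are additions for illustration and do not appear in the original activity.

```python
import pandas as pd

s = pd.Series([9, 3, 1, 7, 8, 5], index=list("abcdef"))

by_label = s["c":"f"]       # label slice: 'f' is included
by_position = s.iloc[2:5]   # positional slice: position 5 is excluded

print(list(by_label.index))      # ['c', 'd', 'e', 'f']
print(list(by_position.index))   # ['c', 'd', 'e']
```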

Series Assignments
# assign value 0 to index 'a'
series2["a"] = 0
series2

a 0
b 3
c 1
d 7
e 8
f 5
dtype: int64

# assign value -3 to indexes 'd' through 'f'
series2["d":"f"] = -3
series2

a 0
b 3
c 1
d -3
e -3
f -3
dtype: int64

# assign value -1 to index 'c' and value -4 to index 'b'
series2[["c","b"]] = [-1,-4]
series2

a 0
b -4
c -1
d -3
e -3
f -3
dtype: int64
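The three assignments in this section can be run together as a quick check. The sketch below uses `.loc` for each step, which is an explicit label-based equivalent of the bracket syntax used above; the variable name `s` is illustrative.

```python
import pandas as pd

s = pd.Series([9, 3, 1, 7, 8, 5], index=list("abcdef"))

s.loc["a"] = 0                 # single label
s.loc["d":"f"] = -3            # one scalar broadcast over an inclusive label slice
s.loc[["c", "b"]] = [-1, -4]   # values follow the order of the label list

print(s.tolist())  # [0, -4, -1, -3, -3, -3]
```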
Using dictionaries to create Series
age_dict={"Ellen":27, "Charlie":18, "Ana":20, "Ben":24, "Dina":29}
age = pd.Series(age_dict)
age

Ellen 27
Charlie 18
Ana 20
Ben 24
Dina 29
dtype: int64

# access the value having index "Ellen"


age["Ellen"]

27

# access those having age less than 25


age[age < 25]

Charlie 18
Ana 20
Ben 24
dtype: int64

# access those values from index "Charlie" to "Ben"


age["Charlie":"Ben"]

Charlie 18
Ana 20
Ben 24
dtype: int64
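The filter `age[age < 25]` works in two steps: the comparison first produces a Series of True/False values, which is then used to pick the matching entries. A minimal sketch of those two steps, with `mask` and `under_25` as illustrative names:

```python
import pandas as pd

age = pd.Series({"Ellen": 27, "Charlie": 18, "Ana": 20, "Ben": 24, "Dina": 29})

mask = age < 25        # boolean Series with the same index as age
under_25 = age[mask]   # keeps only the entries where mask is True

print(mask.tolist())         # [False, True, True, True, False]
print(list(under_25.index))  # ['Charlie', 'Ana', 'Ben']
```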

The name attribute


age.name = "Age"
age.index.name = "First Names"
age

First Names
Ellen 27
Charlie 18
Ana 20
Ben 24
Dina 29
Name: Age, dtype: int64
DataFrame
some_dict = {'a':[0,1,2], 'b':[3,4,5]}
series3 = pd.DataFrame(some_dict)
series3

a b
0 0 3
1 1 4
2 2 5

Creating a DataFrame from multiple Series.


age

First Names
Ellen 27
Charlie 18
Ana 20
Ben 24
Dina 29
Name: Age, dtype: int64

province_dict = {"Ellen":"Tarlac", "Charlie":"Cebu", "Ana":"Pampanga",
                 "Ben": "Davao", "Dina": "Cebu"}
province = pd.Series(province_dict)
province

Ellen Tarlac
Charlie Cebu
Ana Pampanga
Ben Davao
Dina Cebu
dtype: object

people = pd.DataFrame({'age':age, 'province':province})


people

age province
Ellen 27 Tarlac
Charlie 18 Cebu
Ana 20 Pampanga
Ben 24 Davao
Dina 29 Cebu
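When a DataFrame is built from several Series, the rows are matched by index label rather than by position. The sketch below uses hypothetical, deliberately mismatched data (a different order, and no province for "Ana") to show that labels are aligned and gaps are filled with NaN.

```python
import pandas as pd

age = pd.Series({"Ellen": 27, "Charlie": 18, "Ana": 20})
# Hypothetical data: different order, and no entry for "Ana"
province = pd.Series({"Charlie": "Cebu", "Ellen": "Tarlac"})

people = pd.DataFrame({"age": age, "province": province})

print(people.loc["Ellen", "province"])          # Tarlac
print(pd.isna(people.loc["Ana", "province"]))   # True
```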

Selecting a column/row of a DataFrame


people['age']

Ellen 27
Charlie 18
Ana 20
Ben 24
Dina 29
Name: age, dtype: int64

people.province

Ellen Tarlac
Charlie Cebu
Ana Pampanga
Ben Davao
Dina Cebu
Name: province, dtype: object

Retrieving rows using the special loc attribute


people.loc['Ellen']

age 27
province Tarlac
Name: Ellen, dtype: object
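For comparison, `loc` selects by label, while its counterpart `iloc` selects by integer position. The sketch below rebuilds a smaller version of the table to show both, plus a single-cell lookup; `iloc` and the variable names are additions for illustration.

```python
import pandas as pd

people = pd.DataFrame(
    {"age": [27, 18, 20], "province": ["Tarlac", "Cebu", "Pampanga"]},
    index=["Ellen", "Charlie", "Ana"],
)

row_by_label = people.loc["Ellen"]        # row as a Series, selected by label
row_by_position = people.iloc[0]          # the same row, selected by position
one_cell = people.loc["Charlie", "age"]   # single value: row label, column name

print(row_by_label.equals(row_by_position))  # True
print(one_cell)                              # 18
```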

Adding a New Column


# create a new column 'debt' and set value to 0
people['debt'] = 0
people

age province debt
Ellen 27 Tarlac 0
Charlie 18 Cebu 0
Ana 20 Pampanga 0
Ben 24 Davao 0
Dina 29 Cebu 0

Adding a column from a list of values
people['is_married'] = [True, False, False, True, True]
people

age province debt is_married
Ellen 27 Tarlac 0 True
Charlie 18 Cebu 0 False
Ana 20 Pampanga 0 False
Ben 24 Davao 0 True
Dina 29 Cebu 0 True
Adding a New Row
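# keys that match existing columns are filled in; columns with no key
# (province, debt) become NaN, and 'occupation' is not an existing column,
# so it does not appear in the result
values = {'age':22, "is_married":False, "occupation":"cashier"}
people.loc['Regine'] = values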
people

age province debt is_married
Ellen 27 Tarlac 0.0 True
Charlie 18 Cebu 0.0 False
Ana 20 Pampanga 0.0 False
Ben 24 Davao 0.0 True
Dina 29 Cebu 0.0 True
Regine 22 NaN NaN False
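Another way to append a row, shown here only as a sketch, is to build a one-row DataFrame and join it with `pd.concat`. The smaller table and the new row below are hypothetical; as in the example above, a column left out of the new row ends up as NaN.

```python
import pandas as pd

people = pd.DataFrame(
    {"age": [27, 18], "province": ["Tarlac", "Cebu"]},
    index=["Ellen", "Charlie"],
)

# Hypothetical new row; 'province' is intentionally left out
new_row = pd.DataFrame({"age": [22]}, index=["Regine"])

combined = pd.concat([people, new_row])
print(combined)
```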

Deleting a column/row
# returns a new DataFrame without 'debt'; people itself is unchanged
people.drop(columns="debt")

age province is_married
Ellen 27 Tarlac True
Charlie 18 Cebu False
Ana 20 Pampanga False
Ben 24 Davao True
Dina 29 Cebu True
Regine 22 NaN False

people.drop(index=["Regine","Ellen"], inplace=True)
people

age province debt is_married
Charlie 18 Cebu 0.0 False
Ana 20 Pampanga 0.0 False
Ben 24 Davao 0.0 True
Dina 29 Cebu 0.0 True
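The two `drop` calls above behave differently: without `inplace=True` the original table is left alone and a new one is returned, which is why the 'debt' column is still present afterwards. A minimal sketch of that difference on a smaller, illustrative table:

```python
import pandas as pd

people = pd.DataFrame(
    {"age": [27, 18, 20], "debt": [0, 0, 0]},
    index=["Ellen", "Charlie", "Ana"],
)

without_debt = people.drop(columns="debt")   # new DataFrame; people keeps 'debt'
people.drop(index=["Ellen"], inplace=True)   # modifies people itself

print(list(without_debt.columns))  # ['age']
print(list(people.columns))        # ['age', 'debt']
print(list(people.index))          # ['Charlie', 'Ana']
```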

Sorting in a DataFrame
people.sort_values(by='province')

age province debt is_married
Charlie 18 Cebu 0.0 False
Dina 29 Cebu 0.0 True
Ben 24 Davao 0.0 True
Ana 20 Pampanga 0.0 False

# sort the rows by index label
people.sort_index(axis=0)

age province debt is_married
Ana 20 Pampanga 0.0 False
Ben 24 Davao 0.0 True
Charlie 18 Cebu 0.0 False
Dina 29 Cebu 0.0 True

# sort the columns by name in descending order
people.sort_index(axis=1, ascending=False)

province is_married debt age
Charlie Cebu False 0.0 18
Ana Pampanga False 0.0 20
Ben Davao True 0.0 24
Dina Cebu True 0.0 29
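`sort_values` can also take several columns at once, each with its own direction. The sketch below, which goes slightly beyond the original activity, sorts by province in ascending order and breaks ties by age in descending order, using the same four people.

```python
import pandas as pd

people = pd.DataFrame(
    {
        "age": [18, 20, 24, 29],
        "province": ["Cebu", "Pampanga", "Davao", "Cebu"],
    },
    index=["Charlie", "Ana", "Ben", "Dina"],
)

# province A-Z first; within the same province, oldest first
ordered = people.sort_values(by=["province", "age"], ascending=[True, False])

print(list(ordered.index))  # ['Dina', 'Charlie', 'Ben', 'Ana']
```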

Quick Descriptive Statistics


# provides quick statistical description
print(people.describe())

age debt
count 4.000000 4.0
mean 22.750000 0.0
std 4.856267 0.0
min 18.000000 0.0
25% 19.500000 0.0
50% 22.000000 0.0
75% 25.250000 0.0
max 29.000000 0.0
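The figures in the `age` column of this summary can be reproduced one at a time with the corresponding Series methods. The sketch below does so for the four remaining ages; the Series is rebuilt by hand here so that the example is self-contained.

```python
import pandas as pd

age = pd.Series([18, 20, 24, 29], index=["Charlie", "Ana", "Ben", "Dina"])

print(age.count())           # 4
print(age.mean())            # 22.75
print(round(age.std(), 6))   # 4.856267 (sample standard deviation)
print(age.quantile(0.25))    # 19.5
print(age.median())          # 22.0
print(age.quantile(0.75))    # 25.25
```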
