
Lecture - Ex - 03 - Jupyter Notebook

This document discusses combining features in a pandas dataframe. It demonstrates how to generate data, combine feature names into a new string, add two columns together, and multiply two features to create an interaction term. Multiplying is preferable to adding because it separates the feature space into a grid, making it easier to distinguish the impact of varying one feature on the interaction term in the data. Creating interaction terms will be practiced in this week's graded assignment.

Uploaded by

Timothy Suraj


Course 2 week 1 lecture notebook Exercise 03

Combine features
In this exercise, you will practice how to combine features in a pandas dataframe. This will help you in the
graded assignment at the end of the week.

In addition, you will explore why it makes more sense to multiply two features rather than add them in order to
create interaction terms.

First, you will generate some data to work with.

In [ ]:

# Import pandas
import pandas as pd

# Import a pre-defined function that generates data

from utils import load_data

In [ ]:

# Generate features and labels

X, y = load_data(100)

In [ ]:

X.head()

In [ ]:

feature_names = X.columns
feature_names

Combine strings
Even though you can read the feature names and type the name of a combined feature by hand, building the name
programmatically lets you apply the same code to any dataframe.

Use f-strings to combine two strings. There are other ways to do this, but Python's f-strings are quite useful.

In [ ]:

name1 = feature_names[0]
name2 = feature_names[1]

print(f"name1: {name1}")
print(f"name2: {name2}")

In [ ]:

# Combine the names of two features into a single string, separated by '_&_' for clarity
combined_names = f"{name1}_&_{name2}"
combined_names
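
To build a combined name for every pair of features rather than just the first two, one option is `itertools.combinations` from the standard library. This is a minimal sketch; the list of feature names below is hypothetical and may not match the columns returned by `load_data`.

```python
from itertools import combinations

# Hypothetical feature names, used only for illustration
feature_names = ["Age", "Systolic_BP", "Cholesterol"]

# Build one combined name per unordered pair of features
combined = [f"{a}_&_{b}" for a, b in combinations(feature_names, 2)]
print(combined)
```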

Add two columns

Add the values from two columns and put them into a new column.
You'll do something similar in this week's assignment.

In [ ]:

# Add the same two features whose names were combined above
X[combined_names] = X[name1] + X[name2]

X.head(2)

Why we multiply two features instead of adding

Why do you think it makes more sense to multiply two features together rather than adding them together?

Please take a look at two features, and compare what you get when you add them, versus when you multiply
them together.

In [ ]:

# Generate a small dataset with two features

df = pd.DataFrame({'v1': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'v2': [100, 200, 300, 100, 200, 300, 100, 200, 300]
                   })

# add the two features together

df['v1 + v2'] = df['v1'] + df['v2']

# multiply the two features together

df['v1 x v2'] = df['v1'] * df['v2']
df

It may not be immediately apparent from this table how adding and multiplying differ; each operation simply
produces a new column of numbers.

To view the data in a more helpful way, rearrange the data (pivot it) so that:

feature 1 is the row index

feature 2 is the column name

Then set the sum of the two features as the value.

Display the resulting data in a heatmap.

In [ ]:

# Import seaborn in order to use a heatmap plot

import seaborn as sns

In [ ]:

# Pivot the data so that v1 + v2 is the value

df_add = df.pivot(index='v1',
                  columns='v2',
                  values='v1 + v2'
                  )
print("v1 + v2\n")
display(df_add)
print()
sns.heatmap(df_add);

Notice that you cannot easily see the effect of varying feature 1 (which ranges from 1 to 3), because feature 2
is so much larger in magnitude (100 to 300). When the two features are added together, the sum is dominated by
feature 2.
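
One way to put numbers on this, as a rough sketch: compare how much the sum changes across the range of each feature. The names `spread_v1` and `spread_v2` are illustrative, and the small dataset is rebuilt so the snippet runs on its own.

```python
import pandas as pd

df = pd.DataFrame({'v1': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'v2': [100, 200, 300, 100, 200, 300, 100, 200, 300]})
df['v1 + v2'] = df['v1'] + df['v2']
df_add = df.pivot(index='v1', columns='v2', values='v1 + v2')

# Spread down each column: effect of varying v1 with v2 held fixed
spread_v1 = (df_add.max(axis=0) - df_add.min(axis=0)).max()

# Spread along each row: effect of varying v2 with v1 held fixed
spread_v2 = (df_add.max(axis=1) - df_add.min(axis=1)).max()

print(f"largest change from varying v1: {spread_v1}")
print(f"largest change from varying v2: {spread_v2}")
```

On a color scale that has to cover a change of 200, a change of 2 is barely visible.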

View the 'multiply' interaction

Now pivot the data so that:

feature 1 is the row index

feature 2 is the column name

The values are 'v1 x v2'.

Use a heatmap to visualize the table.

In [ ]:

df_mult = df.pivot(index='v1',
                   columns='v2',
                   values='v1 x v2'
                   )
print('v1 x v2')
display(df_mult)
print()
sns.heatmap(df_mult);

Notice how when you multiply the features, the heatmap looks more like a 'grid' shape instead of three vertical
bars.

This means that you can more clearly see the distinction as feature 1 varies from 1 to 2 to 3.
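
The same point can be checked numerically. In this sketch (the names `step_add` and `step_mult` are illustrative), `diff()` is taken down the rows of each pivot table to see how much each value changes when feature 1 increases by one step.

```python
import pandas as pd

df = pd.DataFrame({'v1': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'v2': [100, 200, 300, 100, 200, 300, 100, 200, 300]})
df['v1 + v2'] = df['v1'] + df['v2']
df['v1 x v2'] = df['v1'] * df['v2']

df_add = df.pivot(index='v1', columns='v2', values='v1 + v2')
df_mult = df.pivot(index='v1', columns='v2', values='v1 x v2')

# Change in each cell when v1 increases by 1
# (the first row has no predecessor, so it is dropped)
step_add = df_add.diff().dropna()
step_mult = df_mult.diff().dropna()

print(step_add)
print(step_mult)
```

With addition, a one-step change in feature 1 shifts the value by 1 regardless of feature 2. With multiplication, the shift is 100, 200, or 300 depending on feature 2, so the effect of one feature depends on the level of the other.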

Discussion
When you find the interaction between two features, you ideally hope to see how varying one feature makes an
impact on the interaction term. This is better achieved by multiplying the two features together rather than
adding them together.

Another way to think of this is that you want to separate the feature space into a "grid", which you can do by
multiplying the features together.
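
As a rough sketch of how this could be applied to a whole dataframe, the hypothetical helper below multiplies every pair of columns and names each product with the `_&_` pattern from the "Combine strings" step. `add_interactions` is an illustrative name, not part of pandas or of the course's `utils` module, and the sample data is made up.

```python
from itertools import combinations

import pandas as pd


def add_interactions(X):
    """Return a copy of X with one product column per pair of original columns."""
    X_int = X.copy()
    for a, b in combinations(X.columns, 2):
        X_int[f"{a}_&_{b}"] = X[a] * X[b]
    return X_int


# Made-up data for illustration only
X = pd.DataFrame({'Age': [50, 60],
                  'Systolic_BP': [120, 140],
                  'Cholesterol': [200, 180]})
X_int = add_interactions(X)
print(X_int.columns.tolist())
```

Working on a copy leaves the original dataframe unchanged, which makes it easier to compare models with and without the interaction terms.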

In this week's assignment, you will create interaction terms!

This is the end of this practice section.
Please continue on with the lecture videos!
