SQL To Pandas - Group Aggregations

SQL to pandas

Uploaded by

sagarvshinde

SQL to Pandas:

Group Aggregations

SQL

SELECT column1
, AVG(column2)
FROM table
GROUP BY column1;

Pandas

df.groupby('column1')['column2'].mean()

#SQL_to_Pandas with Shane Butler


Explanation
Grouping lets you aggregate and analyze data by category: rows that share a value
are collected together, and each collection is summarized separately. In SQL, the
GROUP BY clause groups rows by one or more columns, and aggregation functions are
then applied to each group. In Pandas, the .groupby() method does the same: it
groups data by one or more columns so that aggregation functions can be applied to
each group.

Imagine you are analyzing a dataset of companies. You might want to know the
average number of employees in each industry. Grouping the data by industry and
then calculating the average employee count for each group provides this insight.
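
To make the grouping step concrete, the sketch below walks through it by hand on a small made-up sample (the four rows and the names `sample` and `manual_means` are assumptions for illustration, not part of the original data). It splits the rows by industry, averages each piece, and collects the results, which is what a grouped mean does in one call.

```python
import pandas as pd

# Hypothetical four-row sample, just to make the steps visible.
sample = pd.DataFrame({
    "industry": ["Technology", "Technology", "Entertainment", "Entertainment"],
    "employee_count": [100, 300, 10, 30],
})

# Split: iterating a groupby yields one (key, sub-DataFrame) pair per industry.
# Apply: take the mean of employee_count within each sub-DataFrame.
# Combine: collect the per-group results into one dictionary.
manual_means = {}
for industry, group in sample.groupby("industry"):
    manual_means[industry] = float(group["employee_count"].mean())

print(manual_means)
```

In practice the loop is unnecessary; it only shows what the one-line grouped aggregation computes for each group.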

Key Concepts
.groupby() Method: In Pandas, the .groupby() method groups data by specified
columns and allows you to apply aggregation functions to these groups.
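
As a sketch of the "one or more columns" case, the example below groups by two columns and then applies several aggregations at once. It uses four rows taken from the companies data; the variable names and the output column names (`avg_employee_count`, `total_revenue`, `company_count`) are assumptions chosen for illustration.

```python
import pandas as pd

# Four rows from the companies data (Uber, Airbnb, Microsoft, Netflix).
sample = pd.DataFrame({
    "industry": ["Technology", "Technology", "Technology", "Entertainment"],
    "location": ["San Francisco", "San Francisco", "Redmond", "Los Gatos"],
    "employee_count": [22600, 6132, 163000, 9400],
    "revenue": [11427000, 3376000, 143015000, 25099999],
})

# Group by two columns, like GROUP BY industry, location in SQL.
by_industry_location = (
    sample.groupby(["industry", "location"])["employee_count"].mean()
)

# Several aggregations in one pass, like
# AVG(employee_count), SUM(revenue), COUNT(employee_count) in SQL.
summary = sample.groupby("industry").agg(
    avg_employee_count=("employee_count", "mean"),
    total_revenue=("revenue", "sum"),
    company_count=("employee_count", "count"),
)

print(by_industry_location)
print(summary)
```

Grouping by a list of columns produces a result indexed by every combination of values that occurs, and each keyword passed to `.agg()` becomes one output column.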

SQL Data

-- Create the companies table

CREATE TABLE companies (
    company_id INT,
    company_name VARCHAR(50),
    industry VARCHAR(50),
    location VARCHAR(50),
    employee_count INT,
    revenue INT,
    year_founded INT
);

-- Insert data into the companies table

INSERT INTO companies (company_id, company_name, industry, location, employee_count, revenue, year_founded) VALUES
(1, 'Apple', 'Technology', 'Cupertino', 147000, 274515000, 1976),
(2, 'Google', 'Technology', 'Mountain View', 135301, 182527000, 1998),
(3, 'Microsoft', 'Technology', 'Redmond', 163000, 143015000, 1975),
(4, 'Amazon', 'E-commerce', 'Seattle', 1335000, 386064000, 1994),
(5, 'Meta', 'Technology', 'Menlo Park', 58604, 85965000, 2004),
(6, 'Tesla', 'Automotive', 'Palo Alto', 70757, 31536000, 2003),
(7, 'Uber', 'Technology', 'San Francisco', 22600, 11427000, 2009),
(8, 'Airbnb', 'Technology', 'San Francisco', 6132, 3376000, 2008),
(9, 'Netflix', 'Entertainment', 'Los Gatos', 9400, 25099999, 1997),
(10, 'Spotify', 'Entertainment', 'Stockholm', 6617, 7800000, 2006);

Pandas Data
import pandas as pd

# Create a DataFrame (like a table in SQL)

data = {
    'company_id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'company_name': ['Apple', 'Google', 'Microsoft', 'Amazon', 'Meta', 'Tesla', 'Uber',
                     'Airbnb', 'Netflix', 'Spotify'],
    'industry': ['Technology', 'Technology', 'Technology', 'E-commerce', 'Technology',
                 'Automotive', 'Technology', 'Technology', 'Entertainment', 'Entertainment'],
    'location': ['Cupertino', 'Mountain View', 'Redmond', 'Seattle', 'Menlo Park', 'Palo Alto',
                 'San Francisco', 'San Francisco', 'Los Gatos', 'Stockholm'],
    'employee_count': [147000, 135301, 163000, 1335000, 58604, 70757, 22600, 6132, 9400, 6617],
    'revenue': [274515000, 182527000, 143015000, 386064000, 85965000, 31536000, 11427000,
                3376000, 25099999, 7800000],
    'year_founded': [1976, 1998, 1975, 1994, 2004, 2003, 2009, 2008, 1997, 2006]
}

df = pd.DataFrame(data)
SQL Input

SELECT industry
, AVG(employee_count) AS avg_employee_count
FROM companies
GROUP BY industry;

Pandas Input

# Calculate the avg employee count by industry (like GROUP BY in SQL)

grouped_df = df.groupby('industry')['employee_count'].mean()
print(grouped_df)

Pandas Output

industry
Automotive       7.075700e+04
E-commerce       1.335000e+06
Entertainment    8.008500e+03
Technology       8.877283e+04
Name: employee_count, dtype: float64
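
The Pandas result above is a Series indexed by industry, while the SQL query returns two columns, one of them aliased as avg_employee_count. The sketch below shows one way to produce the same two-column shape in Pandas and to cross-check it against the SQL query. It assumes an in-memory SQLite database (the original does not name a database engine) and adds an ORDER BY clause so that the row order is comparable; the names `pandas_result` and `sql_result` are illustrative.

```python
import sqlite3
import pandas as pd

# Only the two columns the query needs, copied from the companies data.
df = pd.DataFrame({
    "industry": ["Technology", "Technology", "Technology", "E-commerce", "Technology",
                 "Automotive", "Technology", "Technology", "Entertainment", "Entertainment"],
    "employee_count": [147000, 135301, 163000, 1335000, 58604, 70757, 22600, 6132, 9400, 6617],
})

# Pandas side: as_index=False keeps industry as a regular column, and the
# named aggregation plays the role of "AS avg_employee_count".
pandas_result = df.groupby("industry", as_index=False).agg(
    avg_employee_count=("employee_count", "mean")
)

# SQL side: load the same rows into an in-memory SQLite database (an
# assumption made for this sketch) and run the query from the SQL Input.
conn = sqlite3.connect(":memory:")
df.to_sql("companies", conn, index=False)
sql_result = pd.read_sql_query(
    """
    SELECT industry, AVG(employee_count) AS avg_employee_count
    FROM companies
    GROUP BY industry
    ORDER BY industry;
    """,
    conn,
)
conn.close()

print(pandas_result)
print(sql_result)
```

Pandas sorts the group keys by default, whereas SQL gives no ordering guarantee without ORDER BY, which is why the clause is added here before comparing the two results.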
