8 SQL Techniques to Perform Data Analysis for Analytics and Data Science

DATA SCIENCE, INTERMEDIATE, RETAIL, SQL, STRUCTURED DATA

Overview

SQL is a must-know language for anyone in analytics or data science

Here are 8 nifty SQL techniques for data analysis that every analytics and data science professional will
love working with

Introduction

SQL is a key cog in a data science professional’s armory. I’m speaking from experience – you simply cannot
expect to carve out a successful career in either analytics or data science if you haven’t yet picked up SQL.

And why is SQL so important?

As we move into a new decade, the rate at which we are producing and consuming data is skyrocketing by
the day. To make smart decisions based on data, organizations around the world are hiring data
professionals like business analysts and data scientists to mine and unearth insights from the vast
treasure trove of data.

And one of the most important tools required for this is – you guessed it – SQL!

Structured Query Language (SQL) has been around for decades. It is a language for managing and querying
the data held in relational databases, and it is widely used across large companies. A data analyst can
use SQL to access, read, manipulate, and analyze the data stored in a database and generate insights that
support informed decisions.

In this article, I will discuss 8 SQL techniques/queries that will prepare you for more advanced data
analysis problems. Do keep in mind that this article assumes a very basic knowledge of SQL.

I would suggest checking out the below courses if you’re new to SQL and/or business analytics:

Certified Business Analytics Program

Structured Query Language (SQL) for Data Science

Table of Contents

1. Let’s First Understand the Dataset

2. SQL Technique #1: Counting Rows and Items
3. SQL Technique #2: Aggregation Functions
4. SQL Technique #3: Extreme Value Identification
5. SQL Technique #4: Slicing Data
6. SQL Technique #5: Limiting Data
7. SQL Technique #6: Sorting Data
8. SQL Technique #7: Filtering Patterns
9. SQL Technique #8: Groupings, Rolling up Data and Filtering in Groups

Let’s First Understand the Dataset

What is the best way to learn data analysis? By performing it side by side on a dataset! For this purpose, I
have created a dummy dataset of a retail store. The customer data table is represented by
ConsumerDetails.

Our dataset consists of the following columns:

Name – The name of the consumer
Locality – The locality of the consumer
Total_amt_spend – The total amount of money spent by the consumer in the store
Industry – The industry to which the consumer belongs

Note: I will be using MySQL 5.7 for the rest of the article. You can download it from the MySQL 5.7
Downloads page.

SQL Technique #1 – Counting Rows and Items

Count Function

We will begin our analysis with the simplest query, i.e., counting the number of rows in our table. We will
do this by using the COUNT() function.

Great! Now we know that our table has 10 rows. Using this function on a small test dataset may seem
trivial, but it helps a lot when your rows run into the millions!
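The row count described above can be sketched as follows. This is a minimal illustration, not the article's original query: it assumes a made-up six-row ConsumerDetails table loaded into an in-memory SQLite database through Python's sqlite3 module, so the result is 6 rather than the article's 10. The SQL statement itself should carry over to MySQL unchanged.

```python
import sqlite3

# Made-up sample rows (an assumption; not the article's actual data).
rows = [
    ("Shantanu", "Shakti Nagar", 2500, "Aviation"),
    ("Natasha", "Shanti Vihar", 3000, "Defense"),
    ("Rohan", "Rajouri Garden", 1500, "Manufacturing"),
    ("Meera", "Shakti Nagar", 350, "Aviation"),
    ("Kabir", "Laxmi Nagar", 900, "Manufacturing"),
    ("Isha", "Model Town", 800, "Defense"),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ConsumerDetails "
    "(Name TEXT, Locality TEXT, Total_amt_spend INTEGER, Industry TEXT)"
)
conn.executemany("INSERT INTO ConsumerDetails VALUES (?, ?, ?, ?)", rows)

# COUNT(*) counts every row in the table.
row_count = conn.execute("SELECT COUNT(*) FROM ConsumerDetails").fetchone()[0]
print(row_count)
```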

Distinct Keyword

Data tables often contain duplicate values. To retrieve only the unique values, we use the DISTINCT
keyword (strictly speaking, DISTINCT is a keyword in the SELECT clause rather than a function).

In our dataset, how can we find the unique industries that customers belong to?

You guessed it right. We can do this by selecting the Industry column with DISTINCT.

You can even count the number of unique values by combining COUNT with DISTINCT, i.e.
COUNT(DISTINCT column).
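Here is a hedged sketch of both ideas, listing the unique industries and then counting them. The trimmed two-column table and its rows are hypothetical stand-ins for the article's dataset, and the queries run on SQLite via Python's sqlite3 module; the same SQL should work in MySQL.

```python
import sqlite3

# Hypothetical rows; only the columns needed for this example are included.
rows = [
    ("Shantanu", "Aviation"),
    ("Natasha", "Defense"),
    ("Rohan", "Manufacturing"),
    ("Meera", "Aviation"),
    ("Kabir", "Manufacturing"),
    ("Isha", "Defense"),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ConsumerDetails (Name TEXT, Industry TEXT)")
conn.executemany("INSERT INTO ConsumerDetails VALUES (?, ?)", rows)

# DISTINCT collapses repeated values into one row each.
industries = [
    r[0]
    for r in conn.execute(
        "SELECT DISTINCT Industry FROM ConsumerDetails ORDER BY Industry"
    )
]

# COUNT(DISTINCT ...) returns how many unique values there are.
n_industries = conn.execute(
    "SELECT COUNT(DISTINCT Industry) FROM ConsumerDetails"
).fetchone()[0]

print(industries, n_industries)
```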

SQL Technique #2 – Aggregation Functions

Aggregation functions are the base of any kind of data analysis. They provide us with an overview of the
dataset. Some of the functions we will be discussing are – SUM(), AVG(), and STDDEV().

Calculate sum

We use the SUM() function to calculate the sum of a numeric column in a table.

Let’s find the total amount spent across all customers:

In the above example, sum_all is the alias given to the column that holds the sum. The total amount of
money spent by consumers is Rs. 12,560.

Calculate the average

To calculate the average of the numeric columns, we use the AVG() function. Let’s find the average
expenditure by the consumers for our retail store:

The average amount spent by customers in the retail store is Rs. 1256.
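The two aggregates above can be tried out with the sketch below. It assumes a hypothetical six-row table (so the figures differ from the article's Rs. 12,560 and Rs. 1256) and uses Python's sqlite3 module as a stand-in for MySQL. The alias sum_all comes from the article; avg_all is a name introduced here for illustration.

```python
import sqlite3

# Hypothetical rows; not the article's actual data.
rows = [
    ("Shantanu", 2500),
    ("Natasha", 3000),
    ("Rohan", 1500),
    ("Meera", 350),
    ("Kabir", 900),
    ("Isha", 800),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ConsumerDetails (Name TEXT, Total_amt_spend INTEGER)")
conn.executemany("INSERT INTO ConsumerDetails VALUES (?, ?)", rows)

# AS gives each aggregate column a readable alias in the result set.
sum_all, avg_all = conn.execute(
    "SELECT SUM(Total_amt_spend) AS sum_all, AVG(Total_amt_spend) AS avg_all "
    "FROM ConsumerDetails"
).fetchone()

print(sum_all, round(avg_all, 2))
```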

Calculate standard deviation

If you have looked at the dataset and then at the average expenditure, you’ll have noticed that something
is missing. The average alone does not give the complete picture, so let’s find another important metric –
the standard deviation. The function is STDDEV(), which in MySQL returns the population standard deviation.

The standard deviation comes out to be 829.7. Against an average of Rs. 1256, this indicates a high
disparity between the expenditures of consumers!
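SQLite has no built-in STDDEV(), so the sketch below registers a small Python aggregate under that name to mimic MySQL's behaviour (population standard deviation). The class PopulationStdDev and the sample rows are assumptions made for this illustration; in MySQL itself only the SELECT statement would be needed.

```python
import math
import sqlite3
import statistics


class PopulationStdDev:
    """Hypothetical aggregate that mimics MySQL's STDDEV() (population form)."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Like SQL aggregates, ignore NULLs.
        if value is not None:
            self.values.append(value)

    def finalize(self):
        if not self.values:
            return None
        mean = sum(self.values) / len(self.values)
        variance = sum((v - mean) ** 2 for v in self.values) / len(self.values)
        return math.sqrt(variance)


# Hypothetical amounts; not the article's actual data.
amounts = [2500, 3000, 1500, 350, 900, 800]

conn = sqlite3.connect(":memory:")
conn.create_aggregate("STDDEV", 1, PopulationStdDev)
conn.execute("CREATE TABLE ConsumerDetails (Total_amt_spend INTEGER)")
conn.executemany("INSERT INTO ConsumerDetails VALUES (?)", [(a,) for a in amounts])

std_dev = conn.execute(
    "SELECT STDDEV(Total_amt_spend) FROM ConsumerDetails"
).fetchone()[0]

# Cross-check against the standard library's population standard deviation.
expected = statistics.pstdev(amounts)
print(round(std_dev, 2), round(expected, 2))
```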

SQL Technique #3 – Extreme Value Identification

The next type of analysis is to identify the extreme values which will help you understand the data better.

Max

The maximum numeric value can be identified by using the MAX() function. Let’s see how to apply it:

The maximum amount of money spent by a consumer in the retail store is Rs. 3000.

Min

Similar to the max function, we have the MIN() function to identify the minimum numeric value in a given
column:

The minimum amount of money spent by the retail store consumer is Rs. 350.
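Both extreme values can be fetched in one query, as in this minimal sketch. The rows are hypothetical (chosen so the extremes happen to match the article's Rs. 3000 and Rs. 350), the aliases max_spend and min_spend are illustrative names, and Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

# Hypothetical rows; not the article's actual data.
rows = [
    ("Shantanu", 2500),
    ("Natasha", 3000),
    ("Rohan", 1500),
    ("Meera", 350),
    ("Kabir", 900),
    ("Isha", 800),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ConsumerDetails (Name TEXT, Total_amt_spend INTEGER)")
conn.executemany("INSERT INTO ConsumerDetails VALUES (?, ?)", rows)

# MAX() and MIN() each scan the column and return a single value.
max_spend, min_spend = conn.execute(
    "SELECT MAX(Total_amt_spend) AS max_spend, MIN(Total_amt_spend) AS min_spend "
    "FROM ConsumerDetails"
).fetchone()

print(max_spend, min_spend)
```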

SQL Technique #4 – Slicing Data

Now, let us focus on one of the most important parts of the data analysis – slicing the data. This section
of the analysis is going to form the basis for advanced queries and help you retrieve data based on some
kind of condition.

Let’s say that the retail store wants to find the customers coming from either of two localities, Shakti
Nagar or Shanti Vihar. What will be the query for this?

Great, we have 3 customers! We have used the WHERE clause to keep only the rows where the consumer’s
locality is Shakti Nagar or Shanti Vihar. I didn’t chain two conditions with OR here. Instead, I have used
the IN operator, which allows us to specify multiple values in the WHERE clause.

Next, we need to find the customers who live in these localities (Shakti Nagar or Shanti Vihar) and spend
an amount greater than Rs. 2000.

In our dataset, only Shantanu and Natasha fulfill these conditions. As both conditions need to be fulfilled,
the AND operator is the right fit here. Let’s check out another example to slice our data.
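The two slicing queries described above might look like the sketch below. The table contents are hypothetical; Shantanu and Natasha are named in the article, but the other rows and all amounts are made up. The queries run on SQLite through Python's sqlite3 module, and the WHERE clauses should behave the same way in MySQL.

```python
import sqlite3

# Hypothetical rows; not the article's actual data.
rows = [
    ("Shantanu", "Shakti Nagar", 2500, "Aviation"),
    ("Natasha", "Shanti Vihar", 3000, "Defense"),
    ("Rohan", "Rajouri Garden", 1500, "Manufacturing"),
    ("Meera", "Shakti Nagar", 350, "Aviation"),
    ("Kabir", "Laxmi Nagar", 900, "Manufacturing"),
    ("Isha", "Model Town", 800, "Defense"),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ConsumerDetails "
    "(Name TEXT, Locality TEXT, Total_amt_spend INTEGER, Industry TEXT)"
)
conn.executemany("INSERT INTO ConsumerDetails VALUES (?, ?, ?, ?)", rows)

# IN matches any value in the list; it is shorthand for chained OR conditions.
in_localities = [
    r[0]
    for r in conn.execute(
        "SELECT Name FROM ConsumerDetails "
        "WHERE Locality IN ('Shakti Nagar', 'Shanti Vihar') "
        "ORDER BY Name"
    )
]

# AND requires both the locality filter and the spend filter to hold.
big_spenders = [
    r[0]
    for r in conn.execute(
        "SELECT Name FROM ConsumerDetails "
        "WHERE Locality IN ('Shakti Nagar', 'Shanti Vihar') "
        "AND Total_amt_spend > 2000 "
        "ORDER BY Name"
    )
]

print(in_localities, big_spenders)
```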

This time the retail store wants to retrieve all the consumers who are spending between Rs. 1000 and
Rs. 2000 so as to push out special marketing offers. What will be the query for this?

Another way to write the same statement would be:

Only Rohan meets this criterion!
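The article's two query screenshots are not available, so the pairing below is an assumption: a range filter is commonly written either with BETWEEN (which includes both endpoints) or with explicit >= and <= comparisons. The sketch runs both forms on hypothetical rows via Python's sqlite3 module and checks that they agree.

```python
import sqlite3

# Hypothetical rows; not the article's actual data.
rows = [
    ("Shantanu", 2500),
    ("Natasha", 3000),
    ("Rohan", 1500),
    ("Meera", 350),
    ("Kabir", 900),
    ("Isha", 800),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ConsumerDetails (Name TEXT, Total_amt_spend INTEGER)")
conn.executemany("INSERT INTO ConsumerDetails VALUES (?, ?)", rows)

# BETWEEN is inclusive on both ends.
with_between = [
    r[0]
    for r in conn.execute(
        "SELECT Name FROM ConsumerDetails "
        "WHERE Total_amt_spend BETWEEN 1000 AND 2000"
    )
]

# The same range expressed with two comparisons joined by AND.
with_comparisons = [
    r[0]
    for r in conn.execute(
        "SELECT Name FROM ConsumerDetails "
        "WHERE Total_amt_spend >= 1000 AND Total_amt_spend <= 2000"
    )
]

print(with_between, with_comparisons)
```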

Great! We are halfway through our journey. Let us build on the knowledge that we have gained so far.

SQL Technique #5 – Limiting Data

Limit

Let’s say we want to view a data table consisting of millions of records. We shouldn’t run a bare
SELECT * on it, as this would dump the complete table onto our screen, which is cumbersome and
computationally intensive. Instead, we can add the LIMIT clause:

The above SQL command shows the first 5 rows of the table.

OFFSET

What will you do if you just want to select only the fourth and fifth rows? We will make use of the OFFSET
clause. The OFFSET clause will skip the specified number of rows. Let’s see how it works:
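A minimal sketch of LIMIT and OFFSET follows, using hypothetical rows and Python's sqlite3 module in place of MySQL. One addition that is not in the article: the queries sort by Name, because without an ORDER BY the database does not guarantee which rows count as "first".

```python
import sqlite3

# Hypothetical rows; not the article's actual data.
rows = [
    ("Shantanu", 2500),
    ("Natasha", 3000),
    ("Rohan", 1500),
    ("Meera", 350),
    ("Kabir", 900),
    ("Isha", 800),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ConsumerDetails (Name TEXT, Total_amt_spend INTEGER)")
conn.executemany("INSERT INTO ConsumerDetails VALUES (?, ?)", rows)

# LIMIT 5 returns at most the first five rows of the sorted result.
first_five = [
    r[0]
    for r in conn.execute("SELECT Name FROM ConsumerDetails ORDER BY Name LIMIT 5")
]

# OFFSET 3 skips three rows, so LIMIT 2 then yields the fourth and fifth rows.
fourth_and_fifth = [
    r[0]
    for r in conn.execute(
        "SELECT Name FROM ConsumerDetails ORDER BY Name LIMIT 2 OFFSET 3"
    )
]

print(first_five, fourth_and_fifth)
```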

SQL Technique #6 – Sorting Data

Sorting data helps us put our data into perspective. We can perform the sorting process by using the
keyword – ORDER BY.

ORDER BY

The keyword can be used to sort the data into ascending or descending order. The ORDER BY keyword
sorts the data in ascending order by default.

Let us see an example where we sort the data according to the column Total_amt_spend in ascending
order:

Awesome! To order the dataset into descending order, we can follow the below command:
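Both sort directions can be sketched as below. The rows are hypothetical and the queries run on SQLite through Python's sqlite3 module; ORDER BY with ASC and DESC should behave the same way in MySQL.

```python
import sqlite3

# Hypothetical rows; not the article's actual data.
rows = [
    ("Shantanu", 2500),
    ("Natasha", 3000),
    ("Rohan", 1500),
    ("Meera", 350),
    ("Kabir", 900),
    ("Isha", 800),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ConsumerDetails (Name TEXT, Total_amt_spend INTEGER)")
conn.executemany("INSERT INTO ConsumerDetails VALUES (?, ?)", rows)

# Ascending is the default, so ASC could be omitted here.
ascending = [
    r[0]
    for r in conn.execute(
        "SELECT Name FROM ConsumerDetails ORDER BY Total_amt_spend ASC"
    )
]

# DESC reverses the order: highest spender first.
descending = [
    r[0]
    for r in conn.execute(
        "SELECT Name FROM ConsumerDetails ORDER BY Total_amt_spend DESC"
    )
]

print(ascending)
print(descending)
```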

SQL Technique #7 – Filtering Patterns

In the earlier sections, we learned how to filter the data based on one or multiple conditions. Here, we will
learn to filter rows whose column values match a specified pattern. To move forward with this, we will first
understand the LIKE operator and wildcard characters.

LIKE operator

The LIKE operator is used in a WHERE clause to search for a specified pattern in a column.

Wildcard Characters

Wildcard characters stand in for other characters in a string. They are used along with the LIKE operator.
The two most common wildcard characters are:

% – It represents zero or more characters

_ – It represents exactly one character

In our dummy retail dataset, let’s say we want all the localities that end with “Nagar”. Take a moment to
understand the problem statement and think about how we can solve this.

Let’s try to break down the problem. We require all the localities that end with “Nagar” and can have any
number of characters before this particular string. Therefore, we can place the “%” wildcard before
“Nagar”, giving the pattern “%Nagar”:

Awesome, we have 6 localities ending with this name. Notice that we are using the LIKE operator to
perform pattern matching.

Next, we will try to solve another pattern-based problem. We want the names of the consumers whose names
have “a” as the second character. Again, I suggest you take a moment to understand the problem and think
of the logic to solve it.

Let’s break down the problem. Here, the second character needs to be “a”. The first character can be
anything, so we substitute it with the wildcard “_”. After the second character, there can be any number
of characters, so we substitute those with the wildcard “%”. The final pattern is “_a%”.

We have 6 people satisfying this bizarre condition!
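The two patterns discussed in this section can be sketched as follows. The rows are hypothetical, so the match counts differ from the article's, and the queries run on SQLite via Python's sqlite3 module. The patterns themselves should work the same way in MySQL.

```python
import sqlite3

# Hypothetical rows; not the article's actual data.
rows = [
    ("Shantanu", "Shakti Nagar"),
    ("Natasha", "Shanti Vihar"),
    ("Rohan", "Rajouri Garden"),
    ("Meera", "Shakti Nagar"),
    ("Kabir", "Laxmi Nagar"),
    ("Isha", "Model Town"),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ConsumerDetails (Name TEXT, Locality TEXT)")
conn.executemany("INSERT INTO ConsumerDetails VALUES (?, ?)", rows)

# '%Nagar': any characters (including none), then the literal text 'Nagar'.
nagar_residents = [
    r[0]
    for r in conn.execute(
        "SELECT Name FROM ConsumerDetails "
        "WHERE Locality LIKE '%Nagar' ORDER BY Name"
    )
]

# '_a%': exactly one character, then 'a', then anything.
second_letter_a = [
    r[0]
    for r in conn.execute(
        "SELECT Name FROM ConsumerDetails WHERE Name LIKE '_a%' ORDER BY Name"
    )
]

print(nagar_residents, second_letter_a)
```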

SQL Technique #8 – Groupings, Rolling up Data and Filtering in Groups

We have finally arrived at one of the most powerful analysis tools in SQL – grouping data, which is
performed using the GROUP BY clause. Its most useful application is finding the distribution of
categorical variables. This is done by using GROUP BY along with aggregation functions like COUNT, SUM,
AVG, etc.

Let’s try to understand this better by taking up a problem statement. The retail store wants to find the
number of customers in each industry:

We notice that the count of customers belonging to the various industries is more or less the same. So, let
us move forward and find the total spending by customers grouped by the industry they belong to:

We can observe that the maximum amount of money is spent by the customers belonging to the
Manufacturing industry. This seems a bit easy, right? Let us take a step ahead and make it more
complicated.
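The two grouped queries above, customers per industry and total spend per industry, can be sketched like this. The rows are hypothetical, so the figures (and which industry spends the most) differ from the article's. The alias total_sum is the article's; num_customers is an illustrative name. Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

# Hypothetical rows; not the article's actual data.
rows = [
    ("Shantanu", 2500, "Aviation"),
    ("Natasha", 3000, "Defense"),
    ("Rohan", 1500, "Manufacturing"),
    ("Meera", 350, "Aviation"),
    ("Kabir", 900, "Manufacturing"),
    ("Isha", 800, "Defense"),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ConsumerDetails (Name TEXT, Total_amt_spend INTEGER, Industry TEXT)"
)
conn.executemany("INSERT INTO ConsumerDetails VALUES (?, ?, ?)", rows)

# One output row per industry, with the number of customers in it.
customers_per_industry = conn.execute(
    "SELECT Industry, COUNT(*) AS num_customers "
    "FROM ConsumerDetails GROUP BY Industry ORDER BY Industry"
).fetchall()

# Same grouping, but summing the spend within each industry.
spend_per_industry = conn.execute(
    "SELECT Industry, SUM(Total_amt_spend) AS total_sum "
    "FROM ConsumerDetails GROUP BY Industry ORDER BY Industry"
).fetchall()

print(customers_per_industry)
print(spend_per_industry)
```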

Now, the retailer wants to find the industries whose total_sum is greater than 2500. To solve this problem,
we will again group the data by industry and then use the HAVING clause.

HAVING

The HAVING clause is like the WHERE clause, but it filters groups after aggregation, whereas WHERE filters
individual rows before they are grouped. Remember, it always comes after the GROUP BY clause.

We have only 3 categories that satisfy the condition – Aviation, Defense, and Manufacturing. To make the
result clearer, I will also add the ORDER BY keyword to sort it:
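A hedged sketch of HAVING combined with ORDER BY follows. The rows are hypothetical, so only two industries pass the 2500 threshold here rather than the article's three. The HAVING condition repeats the SUM() expression instead of the alias, which keeps the query portable across databases. It runs on SQLite via Python's sqlite3 module.

```python
import sqlite3

# Hypothetical rows; not the article's actual data.
rows = [
    ("Shantanu", 2500, "Aviation"),
    ("Natasha", 3000, "Defense"),
    ("Rohan", 1500, "Manufacturing"),
    ("Meera", 350, "Aviation"),
    ("Kabir", 900, "Manufacturing"),
    ("Isha", 800, "Defense"),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ConsumerDetails (Name TEXT, Total_amt_spend INTEGER, Industry TEXT)"
)
conn.executemany("INSERT INTO ConsumerDetails VALUES (?, ?, ?)", rows)

# GROUP BY builds the groups, HAVING filters them, ORDER BY sorts what is left.
top_industries = conn.execute(
    "SELECT Industry, SUM(Total_amt_spend) AS total_sum "
    "FROM ConsumerDetails "
    "GROUP BY Industry "
    "HAVING SUM(Total_amt_spend) > 2500 "
    "ORDER BY total_sum DESC"
).fetchall()

print(top_industries)
```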

End Notes

I am really glad you made it this far. These are the building blocks for all data analysis queries in SQL,
and you can build more advanced queries on these fundamentals. In this article, I used MySQL 5.7 for the
examples.

I really hope that these SQL queries will help you in your day-to-day work when you are analyzing complex
data. Do you have any tips and tricks of your own for analyzing data in SQL? Let me know in the comments!

Article URL - https://www.analyticsvidhya.com/blog/2020/07/8-sql-techniques-data-analysis-analytics-data-science/

Ram Dewani
Product Growth Analyst at Analytics Vidhya. I’m always curious to deep dive into data, process it, and
polish it so as to create value. My interest lies in the field of marketing analytics.
