100 Days Data Engineers Interview Series
Day 10: Advanced SQL scenarios like window functions, CTEs, and handling performance issues.

Created By Devikrishna R
Window Functions
1. OVER() Clause
2. ROW_NUMBER()
3. RANK()
4. DENSE_RANK()
5. PARTITION BY
1. OVER() Clause
The OVER() clause is fundamental in SQL window functions.
It defines a window (or subset of rows) for each row in the result set.
This enables the calculation of aggregate values like totals or rankings without collapsing the rows like
traditional GROUP BY does.
•Syntax:
SELECT column_name,
       AGGREGATE_FUNCTION(column_name) OVER ([PARTITION BY column_name] [ORDER BY column_name]) AS alias
FROM table_name;
•Example (Running Total):
sql
SELECT employee_id, salary, SUM(salary) OVER (ORDER BY employee_id) AS running_total FROM employees;
This will create a running total of the salary for each employee, without collapsing the rows.
•Key Points:
•The OVER() clause can work with aggregate functions like SUM(), COUNT(), AVG(), etc.
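•The ORDER BY within OVER() defines how the rows in the window are ordered. With an aggregate such as SUM(), it turns the result into a running (cumulative) value.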
•If PARTITION BY is not used, the function operates over the entire result set.
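The running-total query above can be tried locally. This is a minimal sketch, assuming Python's built-in sqlite3 module backed by SQLite 3.25 or later (the first version with window functions); the three sample rows are made up for illustration.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?)", [(1, 100), (2, 200), (3, 300)])

# SUM(...) OVER (ORDER BY ...) keeps every row and adds a cumulative total.
running_totals = conn.execute(
    """
    SELECT employee_id, salary,
           SUM(salary) OVER (ORDER BY employee_id) AS running_total
    FROM employees
    ORDER BY employee_id
    """
).fetchall()

for row in running_totals:
    print(row)
```

Each input row appears once in the output, which is the difference from GROUP BY: the total grows row by row instead of collapsing to a single value.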
2. ROW_NUMBER()
•Assigns a unique number to each row in the result set, starting at 1.
•If two rows have the same value in the ordering column, they still get different row numbers. Which of the tied rows comes first is arbitrary unless the ORDER BY includes a tie-breaker column.
Example:
sql
SELECT employee_id, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num FROM employees;
This assigns a unique number to each employee, from the highest salary to the lowest.
3. RANK()
•Provides a ranking but skips ranks if there are ties.
Example:
sql
SELECT employee_id, salary, RANK() OVER (ORDER BY salary DESC) AS salary_rank FROM employees;
If two employees have the same salary, they will get the same rank, but the next rank will be skipped.
E.g., if two employees are ranked 1st, the next rank will be 3rd.
4. DENSE_RANK()
•Similar to RANK(), but it does not skip ranks after ties.
Example:
sql
SELECT employee_id, salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_dense_rank FROM employees;
If two employees are tied at rank 1, the next rank will be 2, not 3 as it would be with RANK().
5. PARTITION BY
The PARTITION BY clause allows you to break the result set into partitions (groups),
and the window function is applied separately to each partition.
•Example:
sql
SELECT department_id, employee_id, salary, SUM(salary) OVER (PARTITION BY
department_id ORDER BY employee_id) AS department_total FROM employees;
This query calculates a running total of salaries, but it resets for each department
because of the PARTITION BY department_id.
•Key Points:
•PARTITION BY is like a GROUP BY, but it keeps the original rows while applying aggregate functions within partitions.
•The ORDER BY clause inside OVER() defines how the function is applied within each partition.
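To see the effect of PARTITION BY with and without ORDER BY, here is a small sketch, again assuming Python's sqlite3 module with SQLite 3.25 or later and made-up sample rows. The `running_total` and `department_total` aliases are chosen for this illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (department_id INTEGER, employee_id INTEGER PRIMARY KEY, salary INTEGER)"
)
# Hypothetical data: two departments with two employees each.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(10, 1, 100), (10, 2, 200), (20, 3, 50), (20, 4, 70)],
)

partition_rows = conn.execute(
    """
    SELECT department_id, employee_id, salary,
           -- With ORDER BY: a running total that restarts in each department.
           SUM(salary) OVER (PARTITION BY department_id ORDER BY employee_id) AS running_total,
           -- Without ORDER BY: the full department total repeated on every row.
           SUM(salary) OVER (PARTITION BY department_id) AS department_total
    FROM employees
    ORDER BY department_id, employee_id
    """
).fetchall()

for row in partition_rows:
    print(row)
```

The running total restarts at the first row of department 20, while the department total is the same on every row of a partition.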
Usage Comparison:
•ROW_NUMBER() is best when you need a strict, unique ordering of rows, even if there are ties.
•RANK() is useful when ties should receive the same rank but you want to skip subsequent ranks.
•DENSE_RANK() is ideal if you want to rank tied rows equally without skipping subsequent ranks.
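The three functions can be compared side by side on data with a tie. This is a sketch under the same assumptions (Python's sqlite3 module, SQLite 3.25 or later, invented sample salaries). ROW_NUMBER() is given `employee_id` as a tie-breaker here so that its output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, salary INTEGER)")
# Employees 1 and 2 are tied on the highest salary.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, 500), (2, 500), (3, 400), (4, 300)],
)

ranking_rows = conn.execute(
    """
    SELECT employee_id, salary,
           ROW_NUMBER() OVER (ORDER BY salary DESC, employee_id) AS row_num,
           RANK()       OVER (ORDER BY salary DESC)              AS salary_rank,
           DENSE_RANK() OVER (ORDER BY salary DESC)              AS salary_dense_rank
    FROM employees
    ORDER BY salary DESC, employee_id
    """
).fetchall()

for row in ranking_rows:
    print(row)
```

For the tied pair, ROW_NUMBER() yields 1 and 2, RANK() yields 1 and 1 and then jumps to 3, and DENSE_RANK() yields 1 and 1 and continues with 2.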
SQL Query Optimization Tips I Regret Not Knowing Earlier

1. Indexing
Explanation: Indexes act like a roadmap that helps the database find data faster. Without an index, the database must scan the entire
table to locate the relevant rows.
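One way to observe this is to compare the query plan before and after creating an index. The sketch below assumes Python's sqlite3 module and SQLite's `EXPLAIN QUERY PLAN` statement; the `orders` table, the index name, and the `plan` helper are hypothetical, and other databases expose plans through different commands and wording.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, i) for i in range(1000)],
)

def plan(sql):
    """Return SQLite's plan description for a query (4th column of each plan row)."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT amount FROM orders WHERE customer_id = 42"

plan_before = plan(query)  # no index on customer_id yet: full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)   # the planner can now search the index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan text varies between SQLite versions, but the first plan reports a scan of the table and the second names the index.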

2. Query Refactoring
Explanation: Complex queries can often be split into simpler, more manageable parts. This makes them easier to optimize and debug.

3. Avoid SELECT *
Explanation: Selecting all columns (SELECT *) can retrieve more data than necessary, slowing down the query. Specifying only the
required columns reduces the workload on the database.

4. Efficient Joins
Explanation: The way tables are joined can significantly impact performance, especially with large datasets. The order of joins and the
type of join used matter.
5. Use WHERE Instead of HAVING
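Explanation: The “WHERE” clause filters rows before they are grouped, while “HAVING” filters groups after aggregation. Put conditions that
do not depend on an aggregate in “WHERE” so fewer rows reach the grouping step, and reserve “HAVING” for conditions on aggregates such as SUM or COUNT.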
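As an illustration, the two queries below return the same result, but the second discards unwanted rows before grouping. This is a sketch assuming Python's sqlite3 module; the `sales` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EU", 10), ("EU", 20), ("US", 5)],
)

# Groups every region first, then throws away the groups that are not needed.
having_rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region HAVING region = 'EU'"
).fetchall()

# Filters rows first, so only the 'EU' rows are grouped and summed.
where_rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales WHERE region = 'EU' GROUP BY region"
).fetchall()

# HAVING is still the right tool when the condition is on an aggregate.
big_regions = conn.execute(
    "SELECT region FROM sales GROUP BY region HAVING SUM(amount) > 10"
).fetchall()

print(having_rows, where_rows, big_regions)
```

Some optimizers push such a HAVING condition down automatically, so the gain depends on the database; writing the filter in WHERE makes the intent explicit either way.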

6. Limit Data Retrieval


Explanation: Fetching fewer rows or processing only a subset of the data can greatly improve query performance.

7. Use EXISTS Instead of IN


Explanation: “EXISTS” can be more efficient than “IN” when checking for the existence of rows in a subquery, especially when the
subquery returns a large result set.
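The two forms are interchangeable for a simple membership check, as this sketch shows. It assumes Python's sqlite3 module; the `customers` and `orders` tables are hypothetical sample data, and whether EXISTS is actually faster depends on the database and its optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")])
conn.executemany("INSERT INTO orders (customer_id) VALUES (?)", [(1,), (1,), (3,)])

# IN: compares each customer against the list of values the subquery produces.
in_rows = conn.execute(
    """
    SELECT c.name FROM customers c
    WHERE c.customer_id IN (SELECT o.customer_id FROM orders o)
    ORDER BY c.customer_id
    """
).fetchall()

# EXISTS: a correlated check that can stop at the first matching order.
exists_rows = conn.execute(
    """
    SELECT c.name FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id)
    ORDER BY c.customer_id
    """
).fetchall()

print(in_rows, exists_rows)
```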

8. Optimize Aggregations
Explanation: Aggregating data (SUM, COUNT, etc.) can be slow, especially on large tables. Indexing the columns used in aggregations
can speed up these operations.
9. Consider Query Execution Plans
Explanation: Execution plans show how the database intends to execute your query. Understanding this can help identify bottlenecks
like full table scans.

10. Avoid Using Functions on Indexed Columns


Explanation: Applying a function to an indexed column in the “WHERE” clause can prevent the index from being used, leading to slower
queries.
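A common fix is to rewrite the condition as a range on the bare column. The sketch below assumes Python's sqlite3 module and SQLite's `EXPLAIN QUERY PLAN`; the `events` table, index name, and `plan` helper are hypothetical, and dates are stored as ISO-8601 text so that string comparison matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_date TEXT)")
conn.executemany(
    "INSERT INTO events (created_date) VALUES (?)",
    [("2023-12-31",), ("2024-01-15",), ("2024-06-30",), ("2025-01-01",)],
)
conn.execute("CREATE INDEX idx_events_created ON events (created_date)")

def plan(sql):
    """Return SQLite's plan description for a query."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Function applied to the indexed column: every row must be evaluated.
wrapped = "SELECT id FROM events WHERE substr(created_date, 1, 4) = '2024'"
# Equivalent range on the bare column: the index can be searched directly.
ranged = (
    "SELECT id FROM events "
    "WHERE created_date >= '2024-01-01' AND created_date < '2025-01-01'"
)

plan_wrapped = plan(wrapped)
plan_ranged = plan(ranged)
ids_wrapped = sorted(conn.execute(wrapped).fetchall())
ids_ranged = sorted(conn.execute(ranged).fetchall())

print(plan_wrapped)
print(plan_ranged)
```

Both queries return the same rows, but only the range version is reported as an index search.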

11. Caching
Explanation: Query caching can store the results of expensive queries, so they don’t have to be recalculated each time.

12. Use Temporary Tables


Explanation: Storing intermediate results in temporary tables can make complex queries more efficient, especially when those results
are reused multiple times.

13. Parallel Execution


Explanation: Some databases support parallel execution, which splits the work across multiple CPU cores to process queries faster.

14. Optimize Data Types


Explanation: Using the most appropriate data types for your columns can save space and speed up queries.

15. Batch Processing


Explanation: When performing updates or deletes, processing them in batches instead of one at a time reduces transaction overhead.
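One possible batching pattern is to delete in fixed-size chunks, with one statement and one commit per chunk rather than per row. This is a sketch assuming Python's sqlite3 module; the `logs` table, the `is_stale` flag, and the batch size of 100 are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, is_stale INTEGER)")
# 1000 hypothetical rows, half of them flagged as stale.
conn.executemany("INSERT INTO logs (is_stale) VALUES (?)", [(i % 2,) for i in range(1000)])
conn.commit()

BATCH_SIZE = 100
batches = 0

while True:
    # Delete at most BATCH_SIZE stale rows in a single statement.
    cur = conn.execute(
        "DELETE FROM logs WHERE id IN "
        "(SELECT id FROM logs WHERE is_stale = 1 LIMIT ?)",
        (BATCH_SIZE,),
    )
    conn.commit()  # one short transaction per batch
    if cur.rowcount == 0:
        break  # nothing left to delete
    batches += 1

remaining = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
print(batches, remaining)
```

Keeping each transaction small also limits how long locks are held, at the cost of the whole operation no longer being atomic.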
2. Common Table Expressions (CTEs)
A Common Table Expression (CTE) is a temporary result set defined within a WITH clause that can be
referenced later in a SELECT, INSERT, UPDATE, or DELETE query. CTEs are ideal for breaking down complex queries into smaller,
more readable chunks, or for recursive queries such as traversing hierarchical data.
Recursive CTEs
Recursive CTEs are particularly useful for querying hierarchical or self-referencing data, such as organizational structures,
family trees, or any data where an item references another item in the same table.
How Recursive CTEs Work
A recursive CTE is structured in two parts:
1. Anchor Member: The base case of the recursion that starts the sequence.
2. Recursive Member: The part that refers back to the CTE to continue the recursion until a stopping condition is met.
WITH RECURSIVE cte_name AS (
-- Anchor member
SELECT column_list
FROM table
WHERE condition

UNION ALL

-- Recursive member
SELECT column_list
FROM table
JOIN cte_name ON table.column = cte_name.column
)
SELECT * FROM cte_name;

------------------------------
Example: Organizational Hierarchy
Imagine a table employees where each employee has a manager_id referring to their manager
(or is NULL if they have no manager).
sql
WITH RECURSIVE EmployeeHierarchy AS (
-- Anchor member: Get top-level employees (those with no manager)
SELECT employee_id, employee_name, manager_id
FROM employees
WHERE manager_id IS NULL

UNION ALL

-- Recursive member: Get employees reporting to those already found
SELECT e.employee_id, e.employee_name, e.manager_id
FROM employees e
INNER JOIN EmployeeHierarchy eh ON e.manager_id = eh.employee_id
)
SELECT * FROM EmployeeHierarchy;
•Explanation:
• The first part (anchor member) gets all employees with no manager.
• The recursive part then gets employees who report to those employees, continuing until all levels of
the hierarchy are retrieved.
Key Points:
•Recursive CTEs are effective for traversing tree-like or graph-like structures.
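•The recursion continues until the recursive member returns no new rows. If the data contains a cycle (for example, two employees recorded as each other's manager), a recursive CTE using UNION ALL never terminates, so guard against cycles when the data is not guaranteed to be a tree.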
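The hierarchy query can be run end to end on a small table. This is a sketch assuming Python's sqlite3 module (SQLite supports WITH RECURSIVE); the four employees are invented, and the `level` column is an addition for this illustration that tracks how deep each row sits in the tree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, employee_name TEXT, manager_id INTEGER)"
)
# Hypothetical org chart: Ava manages Ben and Cara; Ben manages Dan.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ava", None), (2, "Ben", 1), (3, "Cara", 1), (4, "Dan", 2)],
)

hierarchy = conn.execute(
    """
    WITH RECURSIVE EmployeeHierarchy AS (
        -- Anchor member: employees with no manager start at level 0.
        SELECT employee_id, employee_name, manager_id, 0 AS level
        FROM employees
        WHERE manager_id IS NULL

        UNION ALL

        -- Recursive member: direct reports of rows found so far, one level deeper.
        SELECT e.employee_id, e.employee_name, e.manager_id, eh.level + 1
        FROM employees e
        INNER JOIN EmployeeHierarchy eh ON e.manager_id = eh.employee_id
    )
    SELECT employee_id, employee_name, manager_id, level
    FROM EmployeeHierarchy
    ORDER BY level, employee_id
    """
).fetchall()

for row in hierarchy:
    print(row)
```

The recursion stops on its own once Dan is reached, because no employee has Dan as a manager and the recursive member returns no further rows.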
