
Chapter 3- Query Processing and Optimization

The document provides an overview of query processing and optimization in advanced database systems, covering essential concepts such as query languages, relational algebra, and file organization. It explains the steps involved in translating high-level SQL queries into low-level instructions for efficient data retrieval, including parsing, optimization, and execution. Additionally, it discusses the importance of indexing and file organization in enhancing database performance.

Uploaded by

natnaelabera96

HARAMAYA UNIVERSITY

COLLEGE OF COMPUTING AND INFORMATICS


DEPARTMENT OF SOFTWARE ENGINEERING

ADVANCED DATABASE SYSTEMS (SENG 3072)

CHAPTER THREE: QUERY PROCESSING AND OPTIMIZATION


2 CONTENTS

PART I: OVERVIEWS OF PREREQUISITE CONCEPTS

 Query and Query Language

 Relational Algebra and Calculus

 File Organization and Indexing

PART II: QUERY PROCESSING AND OPTIMIZATION

 Introduction

 Query Processing and Optimization

 Best Practice for SQL Optimization


3
QUERY

 A database query is a way to retrieve a specific subset of data from a database.

 Databases often comprise many tables, or collections of related data.

 Sometimes multiple tables house the various pieces of data you want to access.

 In these instances, queries can help you retrieve and compile information from the assorted

tables.

 By querying databases, businesses can analyze data to form helpful conclusions.

 For example, a data analyst might perform a query to find the average ages of a company's

customers.

 This information can help the company learn more about its customers and make informed

business decisions
4
QUERY

Querying can help you to perform:

 Filtering data according to specific criteria

 Summarizing data

 Executing calculations

 Automating data management tasks

 Answering data-related questions

 Combining data from various tables

 Deleting specific data from tables

 Adjusting data

 Updating databases

 Inserting new data into the database

A database query can be either a select query or an action query. A select query is a query for

retrieving data, while an action query requests additional actions to be performed on the data,

like deletion, insertion, and updating.


5
QUERY LANGUAGE

 A language used to store and retrieve data from a database is known as a query language.

 There are two types of query languages associated with the relational model.

Procedural Query Languages (How + What)

• Relational Algebra (theoretical foundation)

• Specifies both what data to retrieve and how to retrieve it

Example: π_empname(σ_salary>10000(EMPLOYEE))

Non-Procedural Query Languages (What only)

• Relational Calculus (theoretical foundation)

• SQL (practical implementation)

• Declares what data is needed without specifying retrieval steps

Example: SELECT empname FROM EMPLOYEE WHERE salary > 10000
6
QUERY LANGUAGE

Aspect | SQL (Non-Procedural) | Relational Algebra (Procedural)

User specifies | What data is needed | How to get the data

Level | High-level, declarative | Low-level, operational

Optimization responsibility | Handled by DBMS | Partially expressed in the query

Example | SELECT empname FROM EMPLOYEE WHERE salary > 10000; | π_empname(σ_salary > 10000(EMPLOYEE))
7
RELATIONAL ALGEBRA

 Relational algebra is a procedural query language that works on the relational model.

 The purpose of a query language is to retrieve data from a database or to perform operations such as insert, update, and delete on the data.

 Relational algebra is operational in nature, which makes it very useful for representing execution plans.

 A procedural query language states both what data is to be retrieved and how it is to be retrieved.

 Relational algebra and relational calculus provide the mathematical foundation for relational databases and SQL.


13
RELATIONAL ALGEBRA

Selection (σ)

Description: The SELECT operation, denoted by sigma (σ), selects the subset of tuples that satisfy a given selection condition.

Syntax: σ Condition (Table)

Example: σ DeptId=1 (Student)

Output:

UID (PK) | FullName | StudID | Batch | DeptId (FK)

1 | Dignity IsPeace | UG/0001/14 | V | 1

4 | Self Respect | UG/0004/14 | IV | 1


14
RELATIONAL ALGEBRA

Project (∏)

Description: The Project operator, denoted by ∏, selects the desired columns (attributes) from a table (relation) and removes duplicate tuples from the result. It corresponds to the column list of the SELECT clause in SQL.

Syntax: ∏ column1, column2, ...., columnN (Table)

Example: ∏ DeptName, DeptOffice (Department)

Output:

DeptName | DeptOffice

SENG | R-15

CS | R-13

IT | R-45

ISY | R-50
15
RELATIONAL ALGEBRA

Union (∪)

Description: The Union operator, denoted by ∪, returns all the rows (tuples) from two tables (relations) and automatically removes duplicate tuples. For a union to be valid, both operands must have the same number of columns in the same order, and the data types of corresponding columns must be compatible.

Syntax: Table1 ∪ Table2

Example: ∏ DeptId (Student) ∪ ∏ DeptId (Department)

Output:

DeptId

1

2

3

4
16
RELATIONAL ALGEBRA

Cartesian Product (X)

Description: The Cartesian Product is denoted by the X symbol. Given two relations R1 and R2, their Cartesian product (R1 X R2) combines each tuple of the first relation R1 with each tuple of the second relation R2.

Syntax: Table1 X Table2

Example: Course X Instructor

Output:

CourseCode | CourseTitle | Name | Department

SENG100 | Programming | Mo | SENG

SENG100 | Programming | Jo | CS

CS200 | Algorithm | Mo | SENG

CS200 | Algorithm | Jo | CS
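
The four operators above can be sketched in a few lines of Python. This is only an illustration, not how a DBMS implements them: relations are modelled as lists of dictionaries, the function names are made up for this sketch, and the student rows (in particular the row with UID 2) are assumed sample data.

```python
# Relations are modelled as lists of dicts; all names here are illustrative.

def select(relation, predicate):
    """σ: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """π: keep only the named attributes and drop duplicate tuples."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

def union(r1, r2):
    """∪: all tuples from both relations, duplicates removed."""
    seen, result = set(), []
    for row in r1 + r2:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result

def product(r1, r2):
    """X: pair every tuple of r1 with every tuple of r2.
    Assumes the two relations have no attribute names in common."""
    return [{**a, **b} for a in r1 for b in r2]

course = [{"CourseCode": "SENG100", "CourseTitle": "Programming"},
          {"CourseCode": "CS200", "CourseTitle": "Algorithm"}]
instructor = [{"Name": "Mo", "Department": "SENG"},
              {"Name": "Jo", "Department": "CS"}]
# Assumed sample rows; only UIDs 1 and 4 come from the Selection slide.
student = [{"UID": 1, "DeptId": 1}, {"UID": 2, "DeptId": 2}, {"UID": 4, "DeptId": 1}]

pairs = product(course, instructor)                      # Course X Instructor
dept1 = select(student, lambda row: row["DeptId"] == 1)  # σ DeptId=1 (Student)
dept_ids = project(student, ["DeptId"])                  # ∏ DeptId (Student)
```

Note that `project` removes duplicates explicitly, which mirrors the set semantics of relational algebra; SQL keeps duplicates unless DISTINCT is requested.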
17
RELATIONAL CALCULUS

 Relational calculus is a non-procedural query language.

 In a non-procedural query language, the user is not concerned with the details of how to obtain the end results.

 Relational calculus tells WHAT to do but never explains HOW to do it.

 Many calculus expressions involve the use of quantifiers.

 There are two types of quantifiers:

 Universal Quantifier: denoted by ∀ and read as "for all", it means that all tuples in a given set satisfy a given condition.

 Existential Quantifier: denoted by ∃ and read as "there exists", it means that in a given set of tuples there is at least one tuple whose values satisfy a given condition.
18
RELATIONAL CALCULUS

Mathematical Concepts of Quantifiers

 Quantifiers are words, expressions, or phrases that indicate the number of elements that a statement refers to.

 There are two quantifiers in mathematics: Existential and Universal

 Existential Quantifier [∃]: indicates that at least one element exists that satisfies a certain condition.

 Universal Quantifier [∀]: indicates that all of the elements of a given set satisfy the condition.
19
RELATIONAL CALCULUS

{ t | Employee(t) ∧ t.salary > 50000 }

• Return all tuples t from the Employee relation where the salary is over 50,000.

∃ d (Department(d) ∧ d.name = 'HR')

• There exists a tuple d in the Department relation such that d.name = 'HR'

∀ e (Employee(e) → e.salary > 0)

• For every tuple e in the Employee relation, e.salary > 0 must be true
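
As a rough analogy, the three calculus expressions above map onto a comprehension, `any`, and `all` in Python. The employee and department rows are assumed sample data for this sketch only.

```python
# Assumed sample data; attribute names follow the expressions above.
employee = [{"name": "A", "salary": 60000}, {"name": "B", "salary": 40000}]
department = [{"name": "HR"}, {"name": "IT"}]

# { t | Employee(t) ∧ t.salary > 50000 }
high_paid = [t for t in employee if t["salary"] > 50000]

# ∃ d (Department(d) ∧ d.name = 'HR')
hr_exists = any(d["name"] == "HR" for d in department)

# ∀ e (Employee(e) → e.salary > 0)
all_positive = all(e["salary"] > 0 for e in employee)
```

In each case the expression states which tuples qualify and leaves the order of evaluation to the language, which is the sense in which the calculus is non-procedural.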
20
FILE ORGANIZATION

 As we have already seen, a database consists of various elements such as tables, views, indexes, procedures, and functions.

 To a user, data appears in the form of tables or views, but these are only logical representations.

 In reality, the underlying data is stored on physical storage as files.

What is a File?

 A file is a named collection of related data stored on secondary storage devices such as magnetic disks, magnetic tapes, or optical drives.

 While users interact with structured data using SQL queries, behind the scenes this data is converted into binary format and stored across physical storage blocks.

 These blocks have fixed capacities and are used to map and store the actual data.
21
FILE ORGANIZATION

 Data cannot be stored on physical storage devices in the form in which users see it.

 It is converted to binary format.

 Each storage device has many data blocks, each capable of storing a certain amount of data.

 The data is mapped onto these blocks when it is stored.

 A user who wants to view or modify the data simply issues an SQL query and gets the result on the screen.

 Every such query should return its results as fast as possible.

 But how is this data fetched from physical storage?
22
FILE ORGANIZATION

 Does simply storing the data on storage devices give us good results when we run queries? No.

 How the data is stored, the access method, and the type of query all have a great effect on how quickly results are returned.

 Hence, organizing the data in the database, and therefore on storage, is an important topic.

 A database holds a large amount of data, grouped into related collections called tables.

 Each table has many related records.

 Users see these records in the form of tables on the screen.

 But the records are stored as files on storage.

 Usually one file contains all the records of a table.


23
FILE ORGANIZATION

 As we saw above, accessing the contents of the files (the records on physical storage) is not straightforward.

 The records are not stored as tables there, so SQL queries cannot operate on them directly.

 We need access methods.

 To access these files, we need to store the records in a certain order so that they are easy to fetch.

 This is similar to the index of a book or the catalogue of a library, which help us find the required topic or book respectively.
24
FILE ORGANIZATION

 Storing the records of a file in a certain order is called file organization.

 File organization can be defined as the logical relationships between the different records in the file.

 It is mainly concerned with finding and accessing any particular record.

 Put simply, file organization means storing records in a particular sequence for logical control.
25
FILE ORGANIZATION

The main objectives of file organization are:

 Optimal selection of records, i.e., records should be accessed as fast as possible.

 Any insert, update, or delete transaction on records should be easy and quick, and should not harm other records.

 No duplicate records should be introduced as a result of an insert, update, or delete.

 Records should be stored efficiently so that the cost of storage is minimal.


26
INDEXING

 Indexing is a way to optimize the performance of a database by minimizing the number of disk accesses required when a query is processed.

 It is a data structure technique used to quickly locate and access the data in a database.

 Indexes are created on one or more database columns, and each index entry has two parts.

 The first part is the Search Key, which contains a copy of the primary key or candidate key of the table. These values are stored in sorted order so that the corresponding data can be accessed quickly.

 The second part is the Data Reference or Pointer, which holds the address of the disk block where that particular key value can be found.
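
The search key and pointer pair can be sketched with a sorted list and binary search. This is a simplified, in-memory illustration under assumed data: the `blocks` dictionary stands in for disk blocks, and real systems typically use B-tree or hash structures rather than a flat sorted list.

```python
import bisect

# Hypothetical data file: block number -> records (key, name) stored in that block.
blocks = {
    0: [(101, "Abel"), (102, "Sara")],
    1: [(205, "Hana"), (310, "Musa")],
}

# Dense index: one (search_key, block_number) entry per record, sorted by key.
index = sorted((key, block_no)
               for block_no, records in blocks.items()
               for key, _ in records)
keys = [key for key, _ in index]

def lookup(search_key):
    """Binary-search the index, then read only the block it points to."""
    pos = bisect.bisect_left(keys, search_key)
    if pos == len(keys) or keys[pos] != search_key:
        return None  # key is not in the index
    block_no = index[pos][1]
    for key, name in blocks[block_no]:
        if key == search_key:
            return name
    return None
```

For example, `lookup(205)` inspects the index and then touches only block 1, instead of scanning every block in the file.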
28

PART II: QUERY PROCESSING AND OPTIMIZATION

 Introduction

 Query Processing and Optimization

 Best Practice for SQL Optimization


29
INTRODUCTION … (1)

 SQL simplifies interactions with databases, turning complex data operations into intuitive, structured commands.

 Example: A simple SELECT query can retrieve years of sales data in milliseconds.

 Behind the scenes, databases process millions of requests daily, each of which needs to be fast, accurate, and resource-efficient.

 Not all queries are created equal: Some crawl; others fly.

 Modern DBMSs (such as relational DBMSs) don't just execute queries; they optimize them.

 Like a GPS finding the fastest route, the system evaluates multiple paths and picks the one expected to deliver results fastest.
30
INTRODUCTION … (2)

 One type of DBMS is the RDBMS, where data is stored in the form of rows and columns (in other words, in tables) that are related to each other.

 Users are free to select, insert, update, and delete rows as long as they do not violate the constraints specified when the relational tables were defined.

 Let's say you want the list of all the employees who have a salary of more than 10,000.

SELECT EMPNAME FROM EMPLOYEE WHERE SALARY > 10000;


31
INTRODUCTION … (3)

 The SQL query "SELECT EMPNAME FROM EMPLOYEE WHERE SALARY > 10000;" is a high-level command used to retrieve data. However, the DBMS cannot execute such a high-level statement directly.

 SQL exists for the user's benefit: it lets users express queries in a form that is intuitive and close to human language.

 Despite this user-friendly syntax, the execution engine does not work on the SQL text itself.

 Instead, SQL queries are passed through a processing unit that translates them into a lower-level form based on Relational Algebra.

 This transformation is necessary because relational algebra, although less readable than SQL, provides the foundation on which the DBMS executes the query.

 As a result, users only need to write SQL queries; the system then optimizes and evaluates them through this underlying transformation.


32
QUERY PROCESSING AND OPTIMIZATION … (1)

 Query Processing refers to the series of steps involved in translating high-level queries into low-level

expressions that a database system can execute.

 It encompasses a range of activities, including parsing, optimization, and actual execution of the

query at the physical level of the file system.

 This process is essential for efficiently retrieving data from the database.

 Query processing relies on fundamental concepts such as relational algebra and file structures.

 It begins with the translation of high-level database language (like SQL) into intermediate

representations and eventually into low-level instructions that interact with the storage system.

 By studying query processing, we gain insight into how queries are interpreted, optimized for

performance, and executed to produce the desired results.


33
QUERY PROCESSING AND OPTIMIZATION … (2)

 Query processing involves two main phases: compile-time and runtime.

 In the compile-time phase, the query compiler translates the high-level query specification into an executable

form.

 This process, known as query compilation, includes several steps: lexical analysis, syntactic and semantic

analysis as well as query optimization, and code generation.

 The resulting code typically consists of a sequence of physical operators designed for the database engine.

 These operators handle core operations such as data access, joins, selections, projections, grouping, and

aggregation.

 In the runtime phase, the database engine interprets and executes the compiled program to retrieve and

return the final query result.

 This separation of compilation and execution allows for efficient processing and optimization of queries
34
QUERY PROCESSING AND OPTIMIZATION … (3)

 Query processing refers to the range of activities involved in extracting data from a database.

 The activities include translation of queries in high-level database languages into

expressions that can be used at the physical level of the file system, a variety of query-

optimizing transformations, and actual evaluation of queries.

 Query processing involves three basic steps:

 Parsing and Translation (Query Compilation)

 Query Optimization

 Evaluation/Execution (Code Generation)


36
PARSING AND TRANSLATION … (1)

 The first step in query processing is Parsing and Translation.

 When a query is submitted, it undergoes lexical, syntactic, and semantic analysis:

 Lexical Analysis: The query is broken down into tokens such as keywords, identifiers, and symbols. During this

process, white spaces and comments are removed, simplifying the input for further analysis.

 Syntactic Analysis: The query is checked for correct SQL syntax. The parser verifies whether the structure of the

query conforms to the grammar rules of SQL, ensuring that the command is well-formed.

 Semantic Analysis: Beyond syntax, this step validates the meaning of the query. It checks for logical correctness,

such as whether referenced tables and columns exist, whether data types match, and if operations are valid.

 Once these checks are successfully completed, the query is translated into intermediate representations such as

relational algebra expressions, expression trees, or query graphs.

 These representations help the database engine optimize and evaluate the query more efficiently in later stages.
37
PARSING AND TRANSLATION … (2)

Lexical Analysis: The query is broken into tokens:

• SELECT → Keyword

• EMPNAME → Identifier (column name)

• FROM → Keyword

• EMPLOYEE → Identifier (table name)

• WHERE → Keyword

• SALARY → Identifier (column name)

• > → Operator

• 10000 → Literal (numeric value)
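
A minimal tokenizer for this one query shape might look like the sketch below. It is an assumption-laden illustration: the keyword set, token names, and regular expression are chosen for this example and cover only identifiers, integer literals, and comparison operators, far less than a real SQL lexer.

```python
import re

# Illustrative keyword set; a real lexer knows the full SQL keyword list.
KEYWORDS = {"SELECT", "FROM", "WHERE"}
TOKEN_PATTERN = re.compile(
    r"\s*(?:(\d+)|([A-Za-z_][A-Za-z0-9_]*)|(>=|<=|<>|[<>=]))"
)

def tokenize(query):
    """Split a simple SELECT query into (kind, text) tokens, skipping whitespace."""
    tokens, pos = [], 0
    query = query.strip().rstrip(";").rstrip()
    while pos < len(query):
        match = TOKEN_PATTERN.match(query, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        number, word, operator = match.groups()
        if number:
            tokens.append(("LITERAL", number))
        elif word:
            kind = "KEYWORD" if word.upper() in KEYWORDS else "IDENTIFIER"
            tokens.append((kind, word))
        else:
            tokens.append(("OPERATOR", operator))
        pos = match.end()
    return tokens

tokens = tokenize("SELECT EMPNAME FROM EMPLOYEE WHERE SALARY > 10000;")
```

The resulting list has the same eight tokens, in the same order, as the breakdown above.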


38
PARSING AND TRANSLATION … (3)

Syntactic Analysis: The query is checked against the SQL grammar rules:

• Does the SELECT clause correctly specify the columns?

• Is the FROM clause correctly followed by a table name (EMPLOYEE)?

• Is the WHERE clause valid with a condition (SALARY > 10000)?


39
PARSING AND TRANSLATION … (4)

Semantic Analysis: The query is checked for correctness:

• Does the EMPLOYEE table exist in the database?

• Does the EMPNAME column exist within the EMPLOYEE table?

• Are there any logical issues with the query (e.g., column names and table relationships)?

• Do the SALARY column and the value 10000 have compatible data types?
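
These checks can be observed with Python's built-in sqlite3 module, used here only as a convenient stand-in for a DBMS. The table definition and the `check` helper are assumptions for this sketch; note also that SQLite is loosely typed, so it illustrates the table and column checks but not the data type check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema for the EMPLOYEE table used throughout the chapter.
conn.execute(
    "CREATE TABLE EMPLOYEE (EMPID INTEGER PRIMARY KEY, EMPNAME TEXT, SALARY REAL)"
)

def check(query):
    """Return None if the query compiles, otherwise the DBMS error message."""
    try:
        conn.execute("EXPLAIN " + query)  # compiles the statement without running it
        return None
    except sqlite3.OperationalError as error:
        return str(error)

ok = check("SELECT EMPNAME FROM EMPLOYEE WHERE SALARY > 10000")
missing_table = check("SELECT EMPNAME FROM STAFF WHERE SALARY > 10000")
missing_column = check("SELECT NAME FROM EMPLOYEE")
```

The valid query compiles, while the other two are rejected before any row is read because the catalog has no STAFF table and EMPLOYEE has no NAME column.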
40
PARSING AND TRANSLATION … (5)

 If the checks pass, the query is then translated into relational algebra (like the π operator for projection

and σ for selection), which is easier for the system to process.

 So, the next step is to translate the generated set of tokens into a relational algebra query.

 These are easy to handle for the optimizer in further processes


41
QUERY OPTIMIZATION … (1)

 Query optimization is the process of enhancing the efficiency of a query by selecting the most cost-

effective execution plan.

 The primary goal is to reduce the overall cost of query execution, which can include minimizing CPU

and memory usage as well as improving disk I/O performance (e.g., reducing the number of disk

reads).

 In relational algebra, optimization involves restructuring the query execution plan to make it faster and

more resource-efficient.

 This is achieved through various techniques like reordering operations, selection and projection

pushdown, and leveraging indexes.


42
QUERY OPTIMIZATION … (2)

 Typically, users do not need to write their queries in a way that makes them optimally efficient.

 Instead, the database system is expected to handle the task of generating an optimized query execution plan.

 This plan should minimize the cost of executing the query, ensuring efficient resource utilization.

 Once a query is submitted to the database server, it is first handled by the query parser, which performs the syntactic and semantic checks.

 After this, the query is passed to the query optimizer, which analyzes the possible execution strategies and selects the most efficient one.

 This optimization process happens automatically in the background, and users normally do not control it directly.
43
QUERY OPTIMIZATION … (3)

 A query is a request for information from a database.

 It can range from a simple request like, "Find the address of the person with Social Security Number 123," to a more

complex one such as, "Find the average salary of all employed married men in California between the ages of 30 and

39 who earn less than their spouses."

 The result of a query is generated by processing the rows in the database to yield the requested information.

 Given the complexity of most database structures, especially for more complex queries, the data needed to answer

the query may be retrieved in different ways, using various data structures and in different orders.

 Each method of accessing the data typically requires different processing times.

 Processing times for the same query can vary widely—from fractions of a second to several hours—depending on the

approach chosen.

 The purpose of query optimization, an automated process, is to identify the most efficient way to process a given query

in the shortest time possible.


44
QUERY OPTIMIZATION … (4)

 One aspect of query optimization occurs at the relational algebra level, where the system aims to find

an equivalent expression that is more efficient to execute than the original one.

 Another aspect involves selecting the optimal strategy for processing the query, such as choosing the

appropriate algorithm for executing operations, determining which indices to use, and other such

decisions.

 The difference in cost (in terms of execution time) between an efficient strategy and an inefficient one

can be substantial, often varying by several orders of magnitude.

 This makes it worthwhile for the system to invest considerable time in selecting the best strategy for

processing a query, even if the query is only executed once


45
QUERY OPTIMIZATION … (5)

 The ultimate goal of query optimization is to minimize the system resources required to fulfill a query,

thereby providing faster results to the user. There are several benefits to this:

 Improved User Experience: By delivering faster results, the application appears more responsive

to the user.

 Increased Throughput: With optimized queries, the system can handle more queries in the same

amount of time, improving overall throughput.

 Resource Efficiency: Query optimization reduces the strain on hardware resources, such as disk

drives, and contributes to lower power consumption, reduced memory usage, and overall system

efficiency.
46
QUERY OPTIMIZATION … (6)

 In the earlier example, the query can be translated into two candidate relational algebra expressions, which are submitted to the optimizer.

• Expression 1: σ_SALARY > 10000(π_EMPNAME(EMPLOYEE))

• Expression 2: π_EMPNAME(σ_SALARY > 10000(EMPLOYEE))

Expression 1: σ_SALARY > 10000(π_EMPNAME(EMPLOYEE))

• Projection (π_EMPNAME) is applied first, which selects only the EMPNAME column from the EMPLOYEE table.

• After that, Selection (σ_SALARY > 10000) is applied to the result, keeping only the rows where SALARY > 10000.

• Note that the projection has already dropped SALARY, so this ordering is valid only if the projection also retains SALARY, as in π_EMPNAME(σ_SALARY > 10000(π_EMPNAME, SALARY(EMPLOYEE))).

Expression 2: π_EMPNAME(σ_SALARY > 10000(EMPLOYEE))

• Selection (σ_SALARY > 10000) is applied first to the EMPLOYEE table, keeping only the rows where SALARY > 10000.

• Then, Projection (π_EMPNAME) is applied to the filtered result, keeping only the EMPNAME column.
47
QUERY OPTIMIZATION … (7)

 Expression 2 is more efficient because it discards unnecessary rows first (with the Selection) and then projects the result, minimizing the amount of data handled by subsequent operations.

 Expression 1 does the projection first, which reduces the number of columns but still carries every row of the original table forward. This wastes resources on rows that are discarded later, and the selection can be evaluated at all only if SALARY survives the projection.
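
The difference between the two orderings can be sketched with plain Python lists. The rows, the `select` and `project` helpers, and the DEPT column are assumptions made for this illustration.

```python
# Assumed sample rows for the EMPLOYEE relation.
employee = [
    {"EMPNAME": "Abel", "SALARY": 12000, "DEPT": "SENG"},
    {"EMPNAME": "Sara", "SALARY": 8000, "DEPT": "CS"},
    {"EMPNAME": "Hana", "SALARY": 15000, "DEPT": "IT"},
]

def select(rows, predicate):
    """σ: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]

def project(rows, attributes):
    """π: keep only the named attributes (duplicates are not removed here)."""
    return [{a: row[a] for a in attributes} for row in rows]

def high_salary(row):
    return row["SALARY"] > 10000

# Expression 2: select first, then project the smaller intermediate result.
result = project(select(employee, high_salary), ["EMPNAME"])

# Expression 1 as written: SALARY is no longer available after the projection.
try:
    select(project(employee, ["EMPNAME"]), high_salary)
    projection_first_ok = True
except KeyError:
    projection_first_ok = False
```

Here the selection-first form passes two rows to the projection instead of three, and the projection-first form fails because the SALARY attribute it needs has already been removed.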


48
QUERY OPTIMIZATION … (8)

EXECUTION PLAN

 An execution plan is a set of instructions that outlines the steps performed by the database engine during query execution. In Microsoft SQL Server this is called the SQL Server execution plan; more generally it is simply called the query plan.

 The query optimizer is responsible for generating the execution plan, with the primary goal of creating an optimal and cost-effective plan.

 The optimizer evaluates various possible execution strategies and selects the one that is expected to provide the best performance.

 During optimization, the optimizer considers multiple candidate execution plans for the same query.

 From these candidates, the plan with the lowest estimated cost (i.e., the least resource-intensive) is chosen.

 A plan cache is a memory area where execution plans are stored for future reuse.

 By caching execution plans, the database can avoid regenerating the plan for identical or similar queries, improving overall performance and reducing processing time.
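
Most systems let you inspect the chosen plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as one example; the schema and the index name IDX_SALARY are assumptions, the exact wording of the plan text differs between SQLite versions, and other DBMSs use their own commands for the same purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema for the EMPLOYEE table.
conn.execute(
    "CREATE TABLE EMPLOYEE (EMPID INTEGER PRIMARY KEY, EMPNAME TEXT, SALARY REAL)"
)
query = "SELECT EMPNAME FROM EMPLOYEE WHERE SALARY > 10000"

def plan(sql):
    """Return the human-readable steps of the plan SQLite picks for sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description

plan_without_index = plan(query)   # expected to be a full table scan
conn.execute("CREATE INDEX IDX_SALARY ON EMPLOYEE (SALARY)")
plan_with_index = plan(query)      # expected to search via IDX_SALARY
```

Printing the two lists shows the optimizer switching from scanning the whole table to searching the index once a suitable index exists, without any change to the query text.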


49
EXECUTION ENGINE … (1)

 The execution engine is responsible for performing the operations defined in the query execution plan by fetching data from the database and carrying out the necessary computations. Once a query evaluation plan is generated by the query optimizer, the execution engine interprets and executes that plan, ultimately returning the results of the query to the user.

 In essence, the query execution engine is the component that carries out the work specified by the query, which includes accessing the relevant data, performing operations (such as joins, aggregations, and filtering), and processing the results. It does not choose the strategy itself; it follows the steps outlined in the execution plan and retrieves the necessary data from the database.


50
EXECUTION ENGINE … (2)

 The engine manages the interaction with the database storage, retrieving rows, columns, or index

entries, and may use various techniques (like sequential scans, index scans, hash joins, or nested loops)

to access and manipulate the data efficiently. After processing the query, it returns the computed results

back to the user or application that submitted the query.

 Additionally, the execution engine plays a critical role in query performance by determining how

resources like memory, CPU, and disk I/O are utilized. Through careful management of these resources,

the execution engine aims to execute queries in the most efficient way possible, ensuring fast and

accurate results.
BEST PRACTICE FOR QUERY OPTIMIZATION … Developer Side (1)
51

 There are a number of best practices that developers can adopt to enhance query performance and help

the query optimizer operate more efficiently.

 By writing optimized queries, following sound database design principles, and being mindful of how data

is accessed and manipulated, developers can significantly reduce the workload of the query optimizer.

 This not only ensures that the database engine executes queries with optimal performance, but also helps

avoid unnecessary complexity in the query execution plans.

 By applying these best practices, developers can ensure that the database system performs at its best,

delivering faster response times and reducing resource consumption, all while minimizing the reliance on

the optimizer to correct inefficient query plans


BEST PRACTICE FOR QUERY OPTIMIZATION … Developer Side (2)
52

 Avoid Unnecessary Complexity: Try to keep queries simple and focused. Overly complex queries with

unnecessary joins, subqueries, or nested operations can confuse the optimizer and result in suboptimal plans.

 Limit the Use of Subqueries: If possible, avoid using subqueries, especially in the WHERE clause or FROM clause.

In many cases, subqueries can be rewritten as JOINs or common table expressions (CTEs), which might improve

performance.

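Write

SELECT E.* FROM EMPLOYEES E JOIN SALARY S ON E.EMPID = S.EMPID WHERE S.AMOUNT > 10000;

Instead of

SELECT * FROM EMPLOYEES WHERE EMPID IN (SELECT EMPID FROM SALARY WHERE AMOUNT > 10000);

Note: the two forms return the same rows only when SALARY holds at most one matching row per employee; otherwise the JOIN repeats an employee once per matching row.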
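
Before replacing a subquery with a JOIN, it is worth checking that both forms return the same rows. The sketch below runs both queries in SQLite through Python's sqlite3 module; the table contents are assumed sample data in which one employee has two salary rows above 10000.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed sample data: employee 1 has two SALARY rows over 10000.
conn.executescript("""
CREATE TABLE EMPLOYEES (EMPID INTEGER PRIMARY KEY, EMPNAME TEXT);
CREATE TABLE SALARY (EMPID INTEGER, AMOUNT REAL);
INSERT INTO EMPLOYEES VALUES (1, 'Abel'), (2, 'Sara');
INSERT INTO SALARY VALUES (1, 12000), (1, 15000), (2, 8000);
""")

join_rows = conn.execute(
    "SELECT E.* FROM EMPLOYEES E JOIN SALARY S ON E.EMPID = S.EMPID "
    "WHERE S.AMOUNT > 10000"
).fetchall()

in_rows = conn.execute(
    "SELECT * FROM EMPLOYEES WHERE EMPID IN "
    "(SELECT EMPID FROM SALARY WHERE AMOUNT > 10000)"
).fetchall()
```

With this data the JOIN returns the employee twice while the IN form returns the employee once, so the rewrite needs DISTINCT (or a uniqueness guarantee on SALARY.EMPID) to preserve the result.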
BEST PRACTICE FOR QUERY OPTIMIZATION … Developer Side (3)
53

• Use SELECT Carefully: Avoid SELECT * and only select the columns you need.

• This reduces the amount of data transferred and processed.

Write SELECT StudName, StudId FROM Student;

Instead of SELECT * FROM Student;

• Use WHERE Clauses to Filter Data: Always apply filters early with WHERE clauses to reduce the

number of rows the query has to process.

• Limit the Result Set: If you're only interested in the top N rows, use a LIMIT or TOP clause to restrict the result set. This avoids unnecessarily large result sets. TOP is SQL Server syntax; MySQL, PostgreSQL, and SQLite use LIMIT instead.

SELECT TOP 10 EMPNAME, SALARY FROM EMPLOYEES ORDER BY SALARY DESC;


BEST PRACTICE FOR QUERY OPTIMIZATION … Developer Side (4)
54

 Avoid Aggregations on Large Result Sets: Applying aggregations (SUM(), AVG(), COUNT(), etc.) to a

large number of rows can be expensive. Try to reduce the result set with a WHERE clause before

applying aggregation.

SELECT AVG(SALARY) FROM EMPLOYEES WHERE SALARY > 10000;

 Use Grouping Efficiently: Use GROUP BY clauses only when necessary. Only apply GROUP BY when

you need to group rows by a specific attribute and then perform an aggregation.

SELECT DEPARTMENT, AVG(SALARY)
FROM EMPLOYEES
GROUP BY DEPARTMENT;
BEST PRACTICE FOR QUERY OPTIMIZATION … Developer Side (5)
55

 Use WHERE to Filter Before Grouping: If you're filtering rows before applying the aggregation,

make sure to use the WHERE clause instead of the HAVING clause, as WHERE filters rows before the

grouping happens, which is more efficient.

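Write

SELECT DEPARTMENT, AVG(SALARY)
FROM EMPLOYEES
WHERE SALARY > 50000
GROUP BY DEPARTMENT;

Instead of

SELECT DEPARTMENT, AVG(SALARY)
FROM EMPLOYEES
GROUP BY DEPARTMENT
HAVING SALARY > 50000;

Note: many DBMSs reject the second form because SALARY is neither grouped nor aggregated. Reserve HAVING for conditions on aggregates, such as HAVING AVG(SALARY) > 50000.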
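
WHERE and HAVING answer different questions, which the sketch below makes concrete using SQLite through Python's sqlite3 module. The rows are assumed sample data, and the HAVING query uses an aggregate condition, AVG(SALARY) > 50000, rather than the bare column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed sample data for the EMPLOYEES table.
conn.executescript("""
CREATE TABLE EMPLOYEES (EMPNAME TEXT, DEPARTMENT TEXT, SALARY REAL);
INSERT INTO EMPLOYEES VALUES
    ('Abel', 'SENG', 60000), ('Sara', 'SENG', 40000),
    ('Hana', 'CS', 70000), ('Musa', 'IT', 30000);
""")

# WHERE removes individual rows before the groups are formed.
where_rows = conn.execute(
    "SELECT DEPARTMENT, AVG(SALARY) FROM EMPLOYEES "
    "WHERE SALARY > 50000 GROUP BY DEPARTMENT ORDER BY DEPARTMENT"
).fetchall()

# HAVING removes whole groups after the aggregate has been computed.
having_rows = conn.execute(
    "SELECT DEPARTMENT, AVG(SALARY) FROM EMPLOYEES "
    "GROUP BY DEPARTMENT HAVING AVG(SALARY) > 50000 ORDER BY DEPARTMENT"
).fetchall()
```

With this data the WHERE query averages only the high earners in each department, while the HAVING query averages everyone and then keeps the departments whose average exceeds 50000, so SENG appears in the first result but not in the second.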
BEST PRACTICE FOR QUERY OPTIMIZATION … Developer Side (6)
56

 Batch Updates and Deletes: If your queries involve UPDATE or DELETE operations that affect many

rows, try to batch them into smaller chunks to avoid long transactions and heavy locking.

UPDATE EMPLOYEES SET STATUS = 'ACTIVE' WHERE EMPID BETWEEN 1 AND 1000;
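
A batching loop around that statement might look like the following sketch, written against SQLite through Python's sqlite3 module. The table contents, the `update_in_batches` helper, and the batch size of 1000 are assumptions for illustration; a suitable batch size depends on the DBMS and workload.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (EMPID INTEGER PRIMARY KEY, STATUS TEXT)")
# Assumed sample data: 2500 employees with consecutive IDs.
conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, 'INACTIVE')",
    [(i,) for i in range(1, 2501)],
)
conn.commit()

def update_in_batches(connection, max_id, batch_size=1000):
    """Update rows in EMPID ranges, committing after each range."""
    batches = 0
    for start in range(1, max_id + 1, batch_size):
        end = min(start + batch_size - 1, max_id)
        connection.execute(
            "UPDATE EMPLOYEES SET STATUS = 'ACTIVE' WHERE EMPID BETWEEN ? AND ?",
            (start, end),
        )
        connection.commit()  # end the transaction so each batch is short-lived
        batches += 1
    return batches

batch_count = update_in_batches(conn, 2500)
remaining = conn.execute(
    "SELECT COUNT(*) FROM EMPLOYEES WHERE STATUS <> 'ACTIVE'"
).fetchone()[0]
```

Committing after every range keeps each transaction small, at the cost of the whole update no longer being atomic.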

 Avoid Leading Wildcards:

Write SELECT * FROM CUSTOMERS WHERE NAME LIKE 'john%';

Instead of SELECT * FROM CUSTOMERS WHERE NAME LIKE '%john';

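The second query forces the database to scan every row (a full table scan) to check the pattern, because an index on NAME is ordered by the leading characters and cannot be used when the pattern starts with a wildcard.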
BEST PRACTICE FOR QUERY OPTIMIZATION … Developer Side (7)
57

 INNER JOIN Instead of WHERE

Write SELECT e.EMPNAME, d.DEPTNAME

FROM EMPLOYEES e

INNER JOIN DEPARTMENTS d ON e.DEPTID = d.DEPTID;

Instead of SELECT e.EMPNAME, d.DEPTNAME

FROM EMPLOYEES e, DEPARTMENTS d

WHERE e.DEPTID = d.DEPTID;


TEACHING YOU IS GOOD LUCK
