Practical MySQL indexing guidelines

Stéphane Combaudon (stephane.combaudon@dailymotion.com)
Percona Live, London, UK
October 24th-25th, 2011

Agenda

• Introduction
• Bad indexes & performance drops
• Guidelines for efficient indexing
• Tools and methods to improve index usage

Introduction

Goals

• Having fun with indexes!!!
• Getting rid of trial-and-error approach
• Knowing performance penalty of bad indexes
• Being productive
  • Knowing simple rules to design indexes
  • Knowing tools that can help

Indexing basics

• Index: data structure to speed up SELECTs
  • Think of an index in a book
  • In MySQL, key = index
  • We'll consider that indexes are trees
• InnoDB's clustered index
  • Data is stored with the PK: PK lookups are fast
  • Secondary keys hold the PK values
  • Designing InnoDB's PKs with care is critical for perf.

Strengths

• An index can filter and/or sort values
• An index can contain all the fields needed for a query
  • No need to access data anymore
• A leftmost prefix can be used
  • Indexes on several columns are useful
  • Order of columns in composite keys is important

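The leftmost prefix rule can be sketched as follows. This is only an illustration: it assumes a table t with INT columns a and b (like the sample table of this talk) and a composite key named ab.

```sql
-- Assumption: table t has INT columns a and b; the key name ab is illustrative.
ALTER TABLE t ADD INDEX ab (a, b);

-- Can use ab for lookups: the conditions cover a leftmost prefix of (a, b).
SELECT * FROM t WHERE a = 10;
SELECT * FROM t WHERE a = 10 AND b = 20;

-- Cannot use ab to look up rows: b alone is not a leftmost prefix of (a, b).
SELECT * FROM t WHERE b = 20;
```

This is why the order of columns matters: key(a,b) and key(b,a) serve different sets of queries.
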
Limitations

• MySQL only uses 1 index per table per query
  • Ok, that's not 100% true (index merge can combine several indexes, e.g. for OR clauses)
  • Think of composite indexes when you can!!
• Can't index full TEXT fields
  • You must use a prefix
  • Same for BLOBs and long VARCHARs
• Maintaining an index has a cost
  • Read speed vs write speed

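A minimal sketch of a prefix index on a TEXT column. The table article, the key name body_prefix and the prefix length of 50 are hypothetical, chosen only for illustration.

```sql
-- Hypothetical table: all names are illustrative only.
CREATE TABLE article (
   id INT NOT NULL AUTO_INCREMENT,
   body TEXT NOT NULL,
   PRIMARY KEY (id)
) ENGINE=InnoDB;

-- A key on a TEXT column needs a prefix length:
-- here only the first 50 characters of body are indexed.
ALTER TABLE article ADD INDEX body_prefix (body(50));

-- One way to judge a prefix length: compare how selective the prefix is
-- with how selective the full column is (values close to each other are good).
SELECT COUNT(DISTINCT LEFT(body, 50)) / COUNT(*) AS prefix_selectivity,
       COUNT(DISTINCT body) / COUNT(*)           AS full_selectivity
FROM article;
```

A shorter prefix gives a smaller index but filters less precisely, so the right length depends on your data.
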
Sample table

CREATE TABLE t (
   id INT NOT NULL AUTO_INCREMENT,
   a INT NOT NULL DEFAULT 0,
   b INT NOT NULL DEFAULT 0,
   [more columns here]
   PRIMARY KEY(id)
)ENGINE=InnoDB;

• Populated with "many" rows
  • Means that queries against table are "slow"
  • Replace "many" and "slow" with your own values

Bad indexes & performance drops

Adding an index

• 3 main consequences:
  • Can speed up queries (good)
  • Increases the size of your dataset (bad)
  • Slows down writes (bad)
• How big is the write slow-down?
  • Let's run simple tests

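To see how much an index adds to the dataset, one option is to compare data and index sizes in information_schema before and after adding it. A sketch, assuming the table t lives in a schema named mydb:

```sql
-- Assumption: the schema is named mydb; adjust to your own.
-- For InnoDB, data_length is the clustered index (the rows themselves)
-- and index_length is the total size of the secondary indexes.
-- Both values are estimates.
SELECT table_name,
       data_length  / 1024 / 1024 AS data_mb,
       index_length / 1024 / 1024 AS index_mb
FROM information_schema.tables
WHERE table_schema = 'mydb'
  AND table_name = 't';
```
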
Write slow-downs, pictured

[Two bar charts: time to load data vs. number of indexes (2, 3 and 4 idx), one for an in-memory test and one for an on-disk test. Baseline is 100 for 1 key for both graphs.]

• For in-memory workloads, adding 2 keys makes perf. 2x worse
• For on-disk workloads, adding 2 keys makes perf. 40x worse!!

So what?

• Removing bad indexes is crucial for perf.
  • Especially for write-intensive workloads
  • Tools will help us
• What if your workload is read-intensive?
  • A few hot tables may handle most of the writes
  • These tables will be write-intensive

Identifying bad indexes

• Before removing bad indexes, identify them!
• What is a bad index?
  • Duplicate indexes: always bad
  • Redundant indexes: generally bad
  • Low-cardinality indexes: depends
  • Unused indexes: always bad

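The difference between a duplicate and a redundant index can be sketched on the sample table t. The key names below are hypothetical.

```sql
-- Hypothetical key names on the sample table t.
ALTER TABLE t
   ADD INDEX a (a),
   ADD INDEX a_again (a),   -- duplicate of key a: same column, same order
   ADD INDEX ab (a, b);     -- makes key a redundant: a is a leftmost prefix of (a, b)

-- The duplicate can always go. The redundant key can usually go too,
-- but check first that no query relies on the smaller index.
ALTER TABLE t DROP INDEX a_again, DROP INDEX a;
```
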
Guidelines for efficient indexes

Before we start...

• Indexing is not an exact science
  • But guessing is not the best way to design indexes
• A few simple rules will help 90% of the time
• Always check your assumptions
  • EXPLAIN does not tell you everything
  • Time your queries with different index combinations
  • SHOW PROFILES is often valuable
• Slow query log is a good place to start!
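
A short sketch of how these checks might look in a mysql session. The query is Q1 from the next slides, and the 1-second threshold is an arbitrary example value, not a recommendation from the talk.

```sql
-- Profiling is enabled per session.
SET profiling = 1;
SELECT * FROM t WHERE a = 10 AND b = 20;
SHOW PROFILES;               -- recent statements with their duration
SHOW PROFILE FOR QUERY 1;    -- per-stage timings, assuming the SELECT got Query_ID 1

-- Slow query log: log every statement running longer than 1 second.
-- The threshold is an example value; a global change to long_query_time
-- only applies to connections opened afterwards.
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 1;
```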

Rule #1: Filter

Q1: SELECT * FROM t WHERE a = 10 AND b = 20

• Without an index, always a full table scan

mysql> EXPLAIN SELECT * FROM t WHERE a = 10 AND b = 20\G
*********** 1. row ***********
           id: 1
  select_type: SIMPLE
        table: t
         type: ALL            <- ALL means table scan
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1000545        <- Estimated # of rows to read
        Extra: Using where    <- Post-filtering needed to discard the non-matching rows

Rule #1: Filter

• Idea: filter as much data as possible by focusing on the WHERE clause
• Candidates for Q1:
  • key(a), key(b), key(a,b), key(b,a)
• Condition is on both a and b with an AND
  • A composite index should be better
  • Let's test!

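One way to run such a test is to create all four candidates and force each one in turn. The key names a, b, ab and ba match the names shown in the EXPLAIN output of this talk. Using FORCE INDEX for the comparison is an assumption here, not necessarily how the original tests were run.

```sql
-- Create the four candidate keys on the sample table t.
ALTER TABLE t
   ADD INDEX a (a),
   ADD INDEX b (b),
   ADD INDEX ab (a, b),
   ADD INDEX ba (b, a);

-- FORCE INDEX makes the optimizer use one candidate at a time,
-- so each plan can be inspected and the query timed separately.
EXPLAIN SELECT * FROM t FORCE INDEX (a)  WHERE a = 10 AND b = 20\G
EXPLAIN SELECT * FROM t FORCE INDEX (b)  WHERE a = 10 AND b = 20\G
EXPLAIN SELECT * FROM t FORCE INDEX (ab) WHERE a = 10 AND b = 20\G
EXPLAIN SELECT * FROM t FORCE INDEX (ba) WHERE a = 10 AND b = 20\G
```
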
Rule #1: Filter

EXPLAIN SELECT * ... with each candidate key:

key    key_len    rows     Exec time
a      4          20       0.00s
b      4          67368    0.20s
ab     8          10       0.00s
ba     8          10       0.00s

• ab and ba: same perf. for this query
• Other queries will guide us to choose between them

Rule #2: Sort

Q2: SELECT * FROM t WHERE a = 10 ORDER BY b

• Remember: indexed values are sorted
• An index can avoid costly filesorts
  • Think of filesorts performed on on-disk temp tables
  • ORDER BY clause must be a leftmost prefix of the index
• Caveat: an index scan is fast in itself, but retrieving the rows in index order may be slow
  • Seq. scan on index but random access on table

Rule #2: Sort

• Let's try key(b) for Q2 vs full table scan

EXPLAIN SELECT * ...

              With key(b)     Full table scan
type:         index           ALL
key:          b               NULL
key_len:      4               NULL
rows:         1000638         1000638
Extra:        Using where     Using where; Using filesort
Exec time:    1.52s           0.37s

• EXPLAIN suggests key(b) is better, but it's wrong!

Rule #2: Sort

• An index is not always the best for sorting
• If possible, try to sort and filter
• Exception to the leftmost prefix rule:
  • Leading columns appearing in the WHERE clause as constants can fill the holes in the index
  • WHERE a = 10 ORDER BY b: key(a,b) can filter and sort
  • Not true with WHERE a > 10 ORDER BY b

Rule #2: Sort

• With key(a,b)

EXPLAIN SELECT * FROM t ...

              WHERE a = 10 ORDER BY b    WHERE a > 10 ORDER BY b
type:         ref                        ALL (*)
key:          ab                         NULL
key_len:      8                          NULL
rows:         20                         1000638
Extra:                                   Using where; Using filesort

(*) Could have been a range scan: depends on the distribution of the values

Rule #3: Cover

Q3: SELECT a,b FROM t WHERE a > 100;

• With key(a), you filter efficiently
• But with key(a,b)
  • You filter
  • The index holds all the columns you need
  • Means you don't need to access data
• key(a,b) is a covering index

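A covering index can be recognized in EXPLAIN: the Extra column contains "Using index". A sketch for Q3, assuming the composite key is named ab as elsewhere in this talk:

```sql
-- Assumption: key ab does not exist yet; skip this statement if it does.
ALTER TABLE t ADD INDEX ab (a, b);

-- Q3 only needs a and b, both stored in key ab:
-- Extra is expected to contain "Using index" (no access to the rows).
EXPLAIN SELECT a, b FROM t WHERE a > 100\G

-- SELECT * needs columns that are not in the index,
-- so the same key can filter but can no longer cover the query.
EXPLAIN SELECT * FROM t WHERE a > 100\G
```
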
Rule #3: Cover

• Back to InnoDB's clustered index
  • It is always covering
  • SELECT by PK is the fastest access with InnoDB
  • Take care of your PKs!!
• Remember full table scan + filesort vs index?
  • If the index used for sorting is also covering, it will outperform the table scan

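Because InnoDB secondary keys hold the PK values, a secondary index also covers queries that ask for the PK. A small sketch, assuming a key named a on column a of the sample table:

```sql
-- Assumption: key a does not exist yet; skip this statement if it does.
ALTER TABLE t ADD INDEX a (a);

-- Only id and a are requested. Every entry of key a also stores the PK (id),
-- so with InnoDB the Extra column is expected to contain "Using index".
EXPLAIN SELECT id, a FROM t WHERE a = 10\G
```
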
Rating an index

• An index can give you 3 benefits: filtering, sorting, covering
  • 1-star index: 1 property
  • 2-star index: 2 properties
  • 3-star index: 3 properties
• This is my own rating, other systems exist

Range queries and ORDER BY

Q4: SELECT * FROM t WHERE a > 10 AND b = 20 ORDER BY a

mysql> EXPLAIN SELECT * ...\G
********** 1. row **********
           [...]
         type: range
          key: a
         rows: 500319
        Extra: Using where
Exec time: 35.9s

• Key filters and sorts, but filtering is not efficient
• Getting data is very slow (random access + I/O-bound)

mysql> EXPLAIN SELECT * ...\G
********** 1. row **********
           [...]
         type: ref
possible_keys: a,b,ab,ba
          key: ba
         rows: 64814
        Extra: Using where; Using filesort
Exec time: 0.2s

• Key filters but doesn't sort
• Filtering is efficient, so getting data, post-filtering and post-sorting is not too slow

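The two plans can be compared side by side by forcing each key. This is a sketch, assuming keys named a and ba exist as in the EXPLAIN output above. The original test may have been run differently.

```sql
-- Plan 1: key a filters on the range and returns rows already sorted by a,
-- but a > 10 matches many rows, each fetched by random access.
SELECT * FROM t FORCE INDEX (a)  WHERE a > 10 AND b = 20 ORDER BY a;

-- Plan 2: key ba filters efficiently on b = 20;
-- the smaller result set is then sorted with a filesort.
SELECT * FROM t FORCE INDEX (ba) WHERE a > 10 AND b = 20 ORDER BY a;
```

Timing both statements on your own data is the only reliable way to pick the better plan.
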
Joins and ORDER BY

• For an index to be used for sorting, all columns in the ORDER BY clause must refer to the 1st table in the join order
• Forcing the join order with SELECT STRAIGHT_JOIN is sometimes useful
• Sometimes you can't fulfill this condition
  • This can be a reason to denormalize

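A sketch of STRAIGHT_JOIN with hypothetical tables: comment (comment_id, video_id, user_id) and users (user_id, name). Neither the users table nor the value 42 comes from the talk.

```sql
-- Hypothetical tables and columns, for illustration only.
-- STRAIGHT_JOIN makes MySQL read the tables in the order they are listed,
-- so comment is the 1st table and the ORDER BY column belongs to it.
SELECT STRAIGHT_JOIN c.comment_id, u.name
FROM comment c
JOIN users u ON u.user_id = c.user_id
WHERE c.video_id = 42
ORDER BY c.comment_id;
```

Whether forcing the order helps depends on the data, so compare EXPLAIN and timings with and without STRAIGHT_JOIN.
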
Tools and methods to improve index usage

Userstats v2

• You need Percona Server or MariaDB 5.2+

mysql> SELECT s.table_name,s.index_name,rows_read
       FROM information_schema.statistics s
       LEFT JOIN information_schema.index_statistics i  -- Table added by this feature
       ON (i.table_schema=s.table_schema
           AND i.table_name=s.table_name
           AND i.index_name=s.index_name)
       WHERE s.table_name='comment'
             AND s.table_schema='mydb'
             AND seq_in_index=1;                        -- Deals with composite indexes

Pros

• Very easy to use
  • Turn on the variable and forget
• Easy to write queries to discover unused indexes automatically

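A sketch of such a query, built on the one shown for userstats. The variable name userstat is an assumption (it is the name used by MariaDB and recent Percona Server releases, but older releases use a different name, so check your version), and mydb is an example schema.

```sql
-- Assumption: the feature is switched on with the userstat variable;
-- check the documentation of your server version for the exact name.
SET GLOBAL userstat = ON;

-- After a long enough sample period: indexes that appear in the schema
-- but have no row in index_statistics have not been read at all.
SELECT s.table_schema, s.table_name, s.index_name
FROM information_schema.statistics s
LEFT JOIN information_schema.index_statistics i
  ON (i.table_schema = s.table_schema
      AND i.table_name = s.table_name
      AND i.index_name = s.index_name)
WHERE s.table_schema = 'mydb'
  AND s.seq_in_index = 1       -- one row per index, even for composite indexes
  AND i.index_name IS NULL;
```
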
Cons

• Large sample period needed for accurate stats
• Not always obvious to say if index is useful
  • Look at created_language_idx in previous slide
• Has some CPU overhead

pt-duplicate-key-checker

• Anything wrong with the keys?

CREATE TABLE comment (
  comment_id int(10) ... AUTO_INCREMENT,
  video_id int(10) ...,
  user_id int(10) ...,
  language char(2) ...,
  [...]
  PRIMARY KEY (comment_id),
  KEY user_id (user_id),
  KEY video_comment_idx (video_id,language,comment_id)
) ENGINE=InnoDB;

• Tool is aware of InnoDB's clustered index!

$ pt-duplicate-key-checker u=root,h=localhost
[...]
# Key video_comment_idx ends with a prefix of the clustered index
# Key definitions:
# KEY video_comment_idx (video_id,language,comment_id)
# PRIMARY KEY (comment_id),
[...]
# To shorten this duplicate clustered index, execute:
ALTER TABLE mydb.comment DROP INDEX video_comment_idx, ADD INDEX video_comment_idx (video_id,language)

• The output includes the query to remove the redundant part of the index

pt-index-usage

• Helps answer questions not solved by userstats
  • Are there any queries with a changing exec plan?
  • Is an index necessary for a query?
• Reads a slow log file/general log file
• Can give you invaluable information on your index usage
  • See the man page for more

• Thanks for your attention!
• Any questions?
