Commit ee2863e

Merge branch 'main' into 'main'

Days 74-76 See merge request postgres-ai/postgresql-consulting/postgres-howtos!28

2 parents e732277 + 1bc0f7a commit ee2863e

4 files changed: +641 −0 lines changed

0074_how_to_flush_caches.md

Lines changed: 47 additions & 0 deletions

Originally from: [tweet](https://twitter.com/samokhvalov/status/1733652860435640705), [LinkedIn post]().

---

# How to flush caches (OS page cache and Postgres buffer pool)

> I post a new PostgreSQL "howto" article every day. Join me in this
> journey – [subscribe](https://twitter.com/samokhvalov/), provide feedback, share!

For experiments, it is important to take into account the state of the caches – the Postgres buffer pool (whose size is
controlled by `shared_buffers`) and the OS page cache. If we decide to start each experiment run with cold caches, we
need to flush them.

## Flushing the Postgres buffer pool

To flush the Postgres buffer pool, restart Postgres.

To analyze the current state of the buffer pool,
use [pg_buffercache](https://postgresql.org/docs/current/pgbuffercache.html).
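For example, a query along these lines shows the top relations by cached buffers (a sketch based on the
`pg_buffercache` extension; the size estimate assumes the default 8 KiB block size):

```sql
-- top 10 relations by number of buffers currently in the buffer pool
select
  c.relname,
  count(*) as buffers,
  pg_size_pretty(count(*) * 8192) as buffered  -- assumes block_size = 8192
from pg_buffercache b
join pg_class c
  on b.relfilenode = pg_relation_filenode(c.oid)
 and b.reldatabase in (
   0, (select oid from pg_database where datname = current_database())
 )
group by c.relname
order by buffers desc
limit 10;
```

Running this before and after a restart is a quick way to confirm that the buffer pool really starts cold.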
## Flushing the OS page cache

To flush the Linux page cache:

```bash
sync                               # write dirty pages to disk first
echo 3 > /proc/sys/vm/drop_caches  # as root; 3 = drop page cache, dentries, and inodes
```

To see the current state of RAM consumption (in MiB) on Linux:

```bash
free -m
```

On macOS, to flush the page cache:

```bash
sync
sudo purge
```

To see the current state of RAM on macOS:

```bash
vm_stat
```

0075_how_to_find_unused_indexes.md

Lines changed: 309 additions & 0 deletions
Originally from: [tweet](https://twitter.com/samokhvalov/status/1734044926755967163), [LinkedIn post]().

---

# How to find unused indexes

> I post a new PostgreSQL "howto" article every day. Join me in this
> journey – [subscribe](https://twitter.com/samokhvalov/), provide feedback, share!

## Why clean up unused indexes

Having unused indexes is bad because:

1. They occupy extra space on disk.

2. They slow down many write operations (`INSERT`s and non-HOT `UPDATE`s).

3. They "spam" the caches (OS page cache, Postgres buffer pool) because index blocks are loaded there during the write
   operations mentioned above.

4. For the same reasons, they "spam" the WAL (thus, replication and backup systems need to handle more data).

5. Every index slows down query planning
   (see [benchmarks](https://gitlab.com/postgres-ai/postgresql-consulting/tests-and-benchmarks/-/issues/41#note_1602598974)).

6. Finally, if there is no plan caching (prepared statements are not used), each extra index increases the chances of
   exceeding `FP_LOCK_SLOTS_PER_BACKEND=16` relations (both tables and indexes) involved in query processing, which,
   under high QPS, increases the chances of `LWLock:LockManager` contention (see
   [benchmarks](https://gitlab.com/postgres-ai/postgresql-consulting/tests-and-benchmarks/-/issues/41)).

Thus, having a routine procedure for periodic analysis and cleanup of unused indexes is highly recommended.
## General algorithm of unused index cleanup

1. Using one of the queries provided below (all of them deal with `pg_stat_user_indexes`, where index usage stats can
   be found), identify unused indexes.

2. Make sure the stats are old enough, inspecting the `stats_reset` timestamp in `pg_stat_database` for your database.
   For example, they should represent usage data for at least 1 month (depending on the application, this threshold
   can be higher or lower). If this condition is not met, postpone the analysis.

3. If replicas receive read-only traffic, analyze all of them as well, paying attention to their stats reset timestamps
   too.

4. If you have more than one production instance of the database (e.g., you develop a system that is installed in
   various places, each with its own database), analyze as many database instances as possible, using steps 1-3 above.

5. As a result, build a list of indexes that can be reliably called "unused" – we know that we didn't use them for a
   significant period of time, neither on the primary nor on replicas, on all production systems we can observe.

6. Drop each index in the list using `DROP INDEX CONCURRENTLY`.
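The final step might look like this (the index name is hypothetical; note that `DROP INDEX CONCURRENTLY` cannot run
inside a transaction block):

```sql
drop index concurrently if exists public.i_orders_created_at;
```

Before dropping, it is worth saving the output of `pg_get_indexdef(indexrelid)` for each index, so the index can be
recreated with `CREATE INDEX CONCURRENTLY` if the decision turns out to be wrong.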
## Queries to analyze unused indexes

A basic query is simple:

```sql
select *
from pg_stat_user_indexes
where idx_scan = 0;
```
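A slightly extended variant of the same idea (a sketch) also reports index sizes and skips unique and
constraint-backing indexes, which usually must stay even if never scanned:

```sql
select
  s.schemaname,
  s.relname as table_name,
  s.indexrelname as index_name,
  pg_size_pretty(pg_relation_size(s.indexrelid)) as index_size
from pg_stat_user_indexes s
join pg_index i on i.indexrelid = s.indexrelid
where s.idx_scan = 0
  and not i.indisunique
  and not exists (
    select 1 from pg_constraint c where c.conindid = s.indexrelid
  )
order by pg_relation_size(s.indexrelid) desc;
```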
To see how long ago the stats were last reset:

```sql
select
  stats_reset,
  age(now(), stats_reset)
from pg_stat_database
where datname = current_database();
```
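If a fresh observation window is needed, the stats can be reset – keeping in mind that this wipes all cumulative
statistics for the current database, which affects other monitoring as well:

```sql
select pg_stat_reset();
```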
For holistic analysis, you can use this query from
[postgres-checkup](https://gitlab.com/postgres-ai/postgres-checkup/-/blob/7f5c45b18b04c665d11f744486db047a1dbc680e/resources/checks/H002_unused_indexes.sh),
which also includes the analysis of rarely used indexes, provides DDL commands for index cleanup, and presents the
result as JSON for easier integration into observability and automation systems:

```sql
with const as (
  select 0 as min_relpages -- on very large DBs, increase it to, say, 100
), fk_indexes as (
  select
    n.nspname as schema_name,
    ci.relname as index_name,
    cr.relname as table_name,
    (confrelid::regclass)::text as fk_table_ref,
    array_to_string(indclass, ', ') as opclasses
  from pg_index i
  join pg_class ci on ci.oid = i.indexrelid and ci.relkind = 'i'
  join pg_class cr on cr.oid = i.indrelid and cr.relkind = 'r'
  join pg_namespace n on n.oid = ci.relnamespace
  join pg_constraint cn on cn.conrelid = cr.oid
  left join pg_stat_user_indexes si on si.indexrelid = i.indexrelid
  where
    contype = 'f'
    and i.indisunique is false
    and conkey is not null
    and ci.relpages > (select min_relpages from const)
    and si.idx_scan < 10
), table_scans as (
  select
    relid,
    tables.idx_scan + tables.seq_scan as all_scans,
    (tables.n_tup_ins + tables.n_tup_upd + tables.n_tup_del) as writes,
    pg_relation_size(relid) as table_size
  from pg_stat_user_tables as tables
  join pg_class c on c.oid = relid
  where c.relpages > (select min_relpages from const)
), all_writes as (
  select sum(writes) as total_writes
  from table_scans
), indexes as (
  select
    i.indrelid,
    i.indexrelid,
    n.nspname as schema_name,
    cr.relname as table_name,
    ci.relname as index_name,
    quote_ident(n.nspname) as formated_schema_name,
    coalesce(nullif(quote_ident(n.nspname), 'public') || '.', '') || quote_ident(ci.relname) as formated_index_name,
    quote_ident(cr.relname) as formated_table_name,
    coalesce(nullif(quote_ident(n.nspname), 'public') || '.', '') || quote_ident(cr.relname) as formated_relation_name,
    si.idx_scan,
    pg_relation_size(i.indexrelid) as index_bytes,
    ci.relpages,
    (case when a.amname = 'btree' then true else false end) as idx_is_btree,
    pg_get_indexdef(i.indexrelid) as index_def,
    array_to_string(i.indclass, ', ') as opclasses
  from pg_index i
  join pg_class ci on ci.oid = i.indexrelid and ci.relkind = 'i'
  join pg_class cr on cr.oid = i.indrelid and cr.relkind = 'r'
  join pg_namespace n on n.oid = ci.relnamespace
  join pg_am a on ci.relam = a.oid
  left join pg_stat_user_indexes si on si.indexrelid = i.indexrelid
  where
    i.indisunique = false
    and i.indisvalid = true
    and ci.relpages > (select min_relpages from const)
), index_ratios as (
  select
    i.indexrelid as index_id,
    i.schema_name,
    i.table_name,
    i.index_name,
    idx_scan,
    all_scans,
    round(
      case
        when all_scans = 0 then 0.0::numeric
        else idx_scan::numeric / all_scans * 100
      end,
      2
    ) as index_scan_pct,
    writes,
    round(
      case
        when writes = 0 then idx_scan::numeric
        else idx_scan::numeric / writes
      end,
      2
    ) as scans_per_write,
    index_bytes as index_size_bytes,
    table_size as table_size_bytes,
    i.relpages,
    idx_is_btree,
    index_def,
    formated_index_name,
    formated_schema_name,
    formated_table_name,
    formated_relation_name,
    i.opclasses,
    (
      select count(1)
      from fk_indexes fi
      where
        fi.fk_table_ref = i.table_name
        and fi.schema_name = i.schema_name
        and fi.opclasses like (i.opclasses || '%')
    ) > 0 as supports_fk
  from indexes i
  join table_scans ts on ts.relid = i.indrelid
), never_used_indexes as ( -- Never used indexes
  select
    'Never Used Indexes' as reason,
    ir.*
  from index_ratios ir
  where
    idx_scan = 0
    and idx_is_btree
  order by index_size_bytes desc
), never_used_indexes_num as (
  select
    row_number() over () num,
    nui.*
  from never_used_indexes nui
), never_used_indexes_total as (
  select
    sum(index_size_bytes) as index_size_bytes_sum,
    sum(table_size_bytes) as table_size_bytes_sum
  from never_used_indexes
), never_used_indexes_json as (
  select
    json_object_agg(coalesce(nuin.schema_name, 'public') || '.' || nuin.index_name, nuin) as json
  from never_used_indexes_num nuin
), rarely_used_indexes as ( -- Rarely used indexes
  select
    'Low Scans, High Writes' as reason,
    *,
    1 as grp
  from index_ratios
  where
    scans_per_write <= 1
    and index_scan_pct < 10
    and idx_scan > 0
    and writes > 100
    and idx_is_btree
  union all
  select
    'Seldom Used Large Indexes' as reason,
    *,
    2 as grp
  from index_ratios
  where
    index_scan_pct < 5
    and scans_per_write > 1
    and idx_scan > 0
    and idx_is_btree
    and index_size_bytes > 100000000
  union all
  select
    'High-Write Large Non-Btree' as reason,
    index_ratios.*,
    3 as grp
  from index_ratios, all_writes
  where
    (writes::numeric / (total_writes + 1)) > 0.02
    and not idx_is_btree
    and index_size_bytes > 100000000
  order by grp, index_size_bytes desc
), rarely_used_indexes_num as (
  select row_number() over () num, rui.*
  from rarely_used_indexes rui
), rarely_used_indexes_total as (
  select
    sum(index_size_bytes) as index_size_bytes_sum,
    sum(table_size_bytes) as table_size_bytes_sum
  from rarely_used_indexes
), rarely_used_indexes_json as (
  select
    json_object_agg(coalesce(ruin.schema_name, 'public') || '.' || ruin.index_name, ruin) as json
  from rarely_used_indexes_num ruin
), do_lines as (
  select
    format(
      'DROP INDEX CONCURRENTLY %s; -- %s, %s, table %s',
      formated_index_name,
      pg_size_pretty(index_size_bytes)::text,
      reason,
      formated_relation_name
    ) as line
  from never_used_indexes nui
  order by table_name, index_name
), undo_lines as (
  select
    replace(
      format('%s; -- table %s', index_def, formated_relation_name),
      'CREATE INDEX',
      'CREATE INDEX CONCURRENTLY'
    ) as line
  from never_used_indexes nui
  order by table_name, index_name
), database_stat as (
  select
    row_to_json(dbstat)
  from (
    select
      sd.stats_reset::timestamptz(0),
      age(
        date_trunc('minute', now()),
        date_trunc('minute', sd.stats_reset)
      ) as stats_age,
      ((extract(epoch from now()) - extract(epoch from sd.stats_reset)) / 86400)::int as days,
      (select pg_database_size(current_database())) as database_size_bytes
    from pg_stat_database sd
    where datname = current_database()
  ) dbstat
)
select
  jsonb_pretty(jsonb_build_object(
    'never_used_indexes',
    (select * from never_used_indexes_json),
    'never_used_indexes_total',
    (select row_to_json(nuit) from never_used_indexes_total as nuit),
    'rarely_used_indexes',
    (select * from rarely_used_indexes_json),
    'rarely_used_indexes_total',
    (select row_to_json(ruit) from rarely_used_indexes_total as ruit),
    'do',
    (select json_agg(dl.line) from do_lines as dl),
    'undo',
    (select json_agg(ul.line) from undo_lines as ul),
    'database_stat',
    (select * from database_stat),
    'min_index_size_bytes',
    (select min_relpages * current_setting('block_size')::numeric from const)
  ));
```
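One possible way to run this report (a sketch; the file name and database name are placeholders, and `psql` is assumed
to be available on the host):

```bash
# Save the query above as unused_indexes.sql, then produce the JSON report:
psql -X -At -d mydb -f unused_indexes.sql > unused_indexes_report.json
```

The `do` array in the resulting JSON contains ready-to-review `DROP INDEX CONCURRENTLY` statements, and `undo`
contains the matching `CREATE INDEX CONCURRENTLY` statements to revert a drop.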
