The Best of Bruce’s Postgres Slides

BRUCE MOMJIAN

This talk has the best slides from my 25+ Postgres presentations.
Creative Commons Attribution License http://momjian.us/presentations
Last updated: May, 2017

1 / 26
Postgres System Architecture

Components: Main, Libpq, Postmaster, Postgres, Postgres

Backend flow:
  Parse Statement
  Traffic Cop
    utility -> Utility Command (e.g. CREATE TABLE, COPY)
    Query (SELECT, INSERT, UPDATE, DELETE) -> Rewrite Query
  Rewrite Query
  Generate Paths -> Optimal Path
  Generate Plan -> Plan
  Execute Plan

Below the flow: Utilities, Catalog, Storage Managers, Access Methods, Nodes / Lists

Mastering PostgreSQL Administration


2 / 26
Shared Memory Creation

postmaster --fork()--> postgres, postgres

Each of the three processes is drawn with the same regions:

Program (Text)
Data
Shared Memory
Stack

Inside PostgreSQL Shared Memory


3 / 26
Shared Buffers and WAL

[Figure: ../administration/buffer_stack.eps]

PostgreSQL Performance Tuning


4 / 26
Backend Flowchart - Magnified

Parse Statement
Traffic Cop
  utility -> Utility Command (e.g. CREATE TABLE, COPY)
  Query (SELECT, INSERT, UPDATE, DELETE) -> Rewrite Query
Rewrite Query
Generate Paths -> Optimal Path
Generate Plan -> Plan
Execute Plan

PostgreSQL Internals Through Pictures


5 / 26
Query Processing

FindExec: found "/var/local/postgres/./bin/postmaster" using argv[0]


./bin/postmaster: BackendStartup: pid 3320 user postgres db test socket 5
./bin/postmaster child[3320]: starting with (postgres -d99 -F -d99 -v131072 -p test )
FindExec: found "/var/local/postgres/./bin/postgres" using argv[0]
DEBUG: connection: host=[local] user=postgres database=test
DEBUG: InitPostgres
DEBUG: StartTransactionCommand
DEBUG: query: SELECT firstname
FROM friend
WHERE age = 33;
DEBUG: parse tree: { QUERY :command 1 :utility <> :resultRelation 0 :into <> :isPortal false :isBinary false :isTemp false :hasAgg
s false :hasSubLinks false :rtable ({ RTE :relname friend :relid 26912 :subquery <> :alias <> :eref { ATTR :relname friend :attrs (
"firstname" "lastname" "city" "state" "age" )} :inh true :inFromCl true :checkForRead true :checkForWrite false :checkAsUse
r 0}) :jointree { FROMEXPR :fromlist ({ RANGETBLREF 1 }) :quals { EXPR :typeOid 16 :opType op :oper { OPER :opno 96 :opid 0 :opresu
lttype 16 } :args ({ VAR :varno 1 :varattno 5 :vartype 23 :vartypmod −1 :varlevelsup 0 :varnoold 1 :varoattno 5} { CONST :consttype
23 :constlen 4 :constbyval true :constisnull false :constvalue 4 [ 33 0 0 0 ] })}} :rowMarks () :targetList ({ TARGETENTRY :resdom
{ RESDOM :resno 1 :restype 1042 :restypmod 19 :resname firstname :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr {
VAR :varno 1 :varattno 1 :vartype 1042 :vartypmod 19 :varlevelsup 0 :varnoold 1 :varoattno 1}}) :groupClause <> :havingQual <> :dis
tinctClause <> :sortClause <> :limitOffset <> :limitCount <> :setOperations <> :resultRelations ()}
DEBUG: rewritten parse tree:
DEBUG: { QUERY :command 1 :utility <> :resultRelation 0 :into <> :isPortal false :isBinary false :isTemp false :hasAggs false :has
SubLinks false :rtable ({ RTE :relname friend :relid 26912 :subquery <> :alias <> :eref { ATTR :relname friend :attrs ( "firstname"
"lastname" "city" "state" "age" )} :inh true :inFromCl true :checkForRead true :checkForWrite false :checkAsUser 0}) :joint
ree { FROMEXPR :fromlist ({ RANGETBLREF 1 }) :quals { EXPR :typeOid 16 :opType op :oper { OPER :opno 96 :opid 0 :opresulttype 16 }
:args ({ VAR :varno 1 :varattno 5 :vartype 23 :vartypmod −1 :varlevelsup 0 :varnoold 1 :varoattno 5} { CONST :consttype 23 :constle
n 4 :constbyval true :constisnull false :constvalue 4 [ 33 0 0 0 ] })}} :rowMarks () :targetList ({ TARGETENTRY :resdom { RESDOM :r
esno 1 :restype 1042 :restypmod 19 :resname firstname :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1
:varattno 1 :vartype 1042 :vartypmod 19 :varlevelsup 0 :varnoold 1 :varoattno 1}}) :groupClause <> :havingQual <> :distinctClause
<> :sortClause <> :limitOffset <> :limitCount <> :setOperations <> :resultRelations ()}
DEBUG: plan: { SEQSCAN :startup_cost 0.00 :total_cost 22.50 :rows 10 :width 12 :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno
1 :restype 1042 :restypmod 19 :resname firstname :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :va
rattno 1 :vartype 1042 :vartypmod 19 :varlevelsup 0 :varnoold 1 :varoattno 1}}) :qpqual ({ EXPR :typeOid 16 :opType op :oper { OPE
R :opno 96 :opid 65 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 5 :vartype 23 :vartypmod −1 :varlevelsup 0 :varnoold 1 :varo
attno 5} { CONST :consttype 23 :constlen 4 :constbyval true :constisnull false :constvalue 4 [ 33 0 0 0 ] })}) :lefttree <> :rightt
ree <> :extprm () :locprm () :initplan <> :nprm 0 :scanrelid 1 }
DEBUG: ProcessQuery
DEBUG: CommitTransactionCommand
DEBUG: proc_exit(0)
DEBUG: shmem_exit(0)
DEBUG: exit(0)
./bin/postmaster: reaping dead processes...
./bin/postmaster: CleanupProc: pid 3320 exited with status 0

PostgreSQL Internals Through Pictures

6 / 26
EXPLAIN with Constants of Various Frequencies

l | count | lookup_letter
---+-------+-----------------------------------------------------------------------
p | 199 | Seq Scan on sample (cost=0.00..13.16 rows=199 width=2)
s | 9 | Seq Scan on sample (cost=0.00..13.16 rows=9 width=2)
c | 8 | Seq Scan on sample (cost=0.00..13.16 rows=8 width=2)
r | 7 | Seq Scan on sample (cost=0.00..13.16 rows=7 width=2)
t | 5 | Bitmap Heap Scan on sample (cost=4.29..12.76 rows=5 width=2)
f | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2)
v | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2)
d | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2)
a | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2)
_ | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2)
u | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2)
e | 2 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2)
i | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2)
k | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2)
(14 rows)

Explaining the Postgres Query Optimizer

7 / 26
Deadlocks

SELECT pg_sleep(0.500); SELECT * FROM lockview1;


pid | vxid | lock_type | lock_mode | granted | xid_lock | relname
-------+-------+---------------+------------------+---------+----------+------------
11306 | 2/61 | transactionid | ExclusiveLock | t | 710 |
11306 | 2/61 | relation | RowExclusiveLock | t | | i_lockdemo
11306 | 2/61 | relation | RowExclusiveLock | t | | lockdemo
11306 | 2/61 | tuple | ExclusiveLock | t | | lockdemo
11306 | 2/61 | transactionid | ShareLock | f | 711 |
11642 | 3/116 | transactionid | ExclusiveLock | t | 711 |
11642 | 3/116 | relation | RowExclusiveLock | t | | i_lockdemo
11642 | 3/116 | relation | RowExclusiveLock | t | | lockdemo
11642 | 3/116 | tuple | ExclusiveLock | t | | lockdemo
11642 | 3/116 | transactionid | ShareLock | f | 710 |

(10 rows)
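
In the lock view above, each backend holds an ExclusiveLock on its own transaction id and is waiting (granted = f) for a ShareLock on the other's. The following is a minimal Python sketch of how such rows could be turned into a wait-for graph and checked for a cycle. It is not how the Postgres deadlock detector is implemented; the tuple layout and the names build_wait_for and find_cycle are hypothetical.

```python
# Rows copied from the lock view above, reduced to the transactionid locks:
# (pid, lock_type, lock_mode, granted, xid_lock)
LOCK_ROWS = [
    (11306, "transactionid", "ExclusiveLock", True, 710),
    (11306, "transactionid", "ShareLock", False, 711),
    (11642, "transactionid", "ExclusiveLock", True, 711),
    (11642, "transactionid", "ShareLock", False, 710),
]


def build_wait_for(rows):
    """Map each waiting pid to the set of pids holding the xid it waits on."""
    holders = {}
    for pid, lock_type, _mode, granted, xid in rows:
        if lock_type == "transactionid" and granted:
            holders.setdefault(xid, set()).add(pid)
    waits = {}
    for pid, lock_type, _mode, granted, xid in rows:
        if lock_type == "transactionid" and not granted:
            blockers = holders.get(xid, set()) - {pid}
            waits.setdefault(pid, set()).update(blockers)
    return waits


def find_cycle(waits):
    """Return a list of pids that wait on each other in a loop, or None."""
    def visit(pid, path):
        if pid in path:
            # We came back to a pid already on the current path: a cycle.
            return path[path.index(pid):]
        for blocker in sorted(waits.get(pid, ())):
            cycle = visit(blocker, path + [pid])
            if cycle:
                return cycle
        return None

    for start in sorted(waits):
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


WAIT_FOR = build_wait_for(LOCK_ROWS)
CYCLE = find_cycle(WAIT_FOR)
print(WAIT_FOR, CYCLE)
```

With the rows shown, pid 11306 waits on 11642 and 11642 waits on 11306, so neither can proceed until one transaction is aborted.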

Unlocking the Postgres Lock Manager

8 / 26
MVCC Behavior

INSERT:               Cre 40, Exp (none)
DELETE:               Cre 40, Exp 47
UPDATE, old (delete): Cre 64, Exp 78
UPDATE, new (insert): Cre 78, Exp (none)

UPDATE is effectively a DELETE and an INSERT.
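
The Cre/Exp stamping on this slide can be sketched as a toy model in Python. This is an illustration only, not Postgres code: the RowVersion class and the insert, delete, and update helpers are hypothetical names, and the model ignores commit status and concurrency entirely.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowVersion:
    value: str
    cre: int                   # xid that created this row version
    exp: Optional[int] = None  # xid that expired it; None while still live


def insert(heap: List[RowVersion], xid: int, value: str) -> RowVersion:
    """INSERT appends a new version stamped with the creating xid."""
    version = RowVersion(value, cre=xid)
    heap.append(version)
    return version


def delete(version: RowVersion, xid: int) -> None:
    """DELETE does not remove the version; it only stamps the expiring xid."""
    version.exp = xid


def update(heap: List[RowVersion], version: RowVersion, xid: int,
           new_value: str) -> RowVersion:
    """UPDATE expires the old version and inserts a new one under the same xid."""
    delete(version, xid)
    return insert(heap, xid, new_value)


heap: List[RowVersion] = []
old_row = insert(heap, 64, "old")
new_row = update(heap, old_row, 78, "new")
print(heap)
```

After the update, the heap holds both versions: the old one with Cre 64 and Exp 78, and the new one with Cre 78 and no Exp, matching the UPDATE rows on the slide.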
MVCC Unmasked
9 / 26
MVCC Examples

Snapshot (Sequential Scan)
The highest-numbered committed transaction: 100
Open Transactions: 25, 50, 75
For simplicity, assume all other transactions are committed.

Create-Only
Cre 30, Exp (none): Visible
Cre 50, Exp (none): Invisible
Cre 110, Exp (none): Invisible

Create & Expire
Cre 30, Exp 80: Invisible
Cre 30, Exp 75: Visible
Cre 30, Exp 110: Visible

Internally, the creation xid is stored in the system column 'xmin', and the expiration xid in 'xmax'.
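
The six visibility outcomes above follow from one rule, sketched here in Python under the slide's own simplification that every transaction not listed as open is committed. The function names are hypothetical, and real Postgres visibility checks handle more cases (aborted transactions, a row's own transaction, and so on) than this sketch does.

```python
from typing import Optional, Set

HIGHEST_COMMITTED = 100        # highest-numbered committed transaction
OPEN_XIDS = {25, 50, 75}       # transactions still open at snapshot time


def committed_in_snapshot(xid: int, highest: int, open_xids: Set[int]) -> bool:
    """True if the snapshot sees xid as committed (slide's simplification)."""
    return xid <= highest and xid not in open_xids


def is_visible(cre: int, exp: Optional[int],
               highest: int = HIGHEST_COMMITTED,
               open_xids: Set[int] = frozenset(OPEN_XIDS)) -> bool:
    """A row version is visible if its creation is seen and its expiry is not."""
    if not committed_in_snapshot(cre, highest, open_xids):
        return False
    if exp is None:
        return True
    return not committed_in_snapshot(exp, highest, open_xids)


# (cre, exp) pairs in the order they appear on the slide.
CASES = [(30, None), (50, None), (110, None), (30, 80), (30, 75), (30, 110)]
RESULTS = [is_visible(cre, exp) for cre, exp in CASES]
print(RESULTS)
```

The results line up with the slide: visible, invisible, invisible for the create-only rows, then invisible, visible, visible for the create-and-expire rows.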

MVCC Unmasked
10 / 26
Heap Page Structure

8K page, front to back:

Page Header | Item | Item | Item | (free space) | Tuple | Tuple | Tuple | Special

Each Item points to a Tuple.
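
As a rough illustration of this layout, the Python sketch below computes where the item array ends and where tuple data begins for a page holding a few tuples. The constants are assumptions stated for the sketch: an 8192-byte page, a 24-byte page header, 4 bytes per item, and tuples padded to 8-byte boundaries. The function name page_layout and its return keys are hypothetical.

```python
PAGE_SIZE = 8192        # the "8K" page on the slide (assumed default block size)
PAGE_HEADER_SIZE = 24   # assumption: fixed page header size in bytes
ITEM_SIZE = 4           # assumption: each item (line pointer) takes 4 bytes
ALIGNMENT = 8           # assumption: tuples are padded to 8-byte boundaries


def align(n: int) -> int:
    """Round n up to the next multiple of ALIGNMENT."""
    return (n + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT


def page_layout(tuple_sizes, special_size=0):
    """Return byte offsets for a page holding tuples of the given sizes.

    Items grow forward from the header; tuples grow backward from the
    special area, so the free space sits between the two.
    """
    lower = PAGE_HEADER_SIZE + ITEM_SIZE * len(tuple_sizes)
    special_start = PAGE_SIZE - align(special_size)
    upper = special_start - sum(align(size) for size in tuple_sizes)
    if upper < lower:
        raise ValueError("tuples do not fit on one page")
    return {
        "lower": lower,            # end of the item array
        "upper": upper,            # start of tuple data
        "free": upper - lower,     # bytes left between items and tuples
        "special": special_start,  # start of the special area
    }


LAYOUT = page_layout([100, 100, 100])
print(LAYOUT)
```

Because items and tuples grow toward each other, adding a tuple shrinks the free space from both ends: one more item at the front and one more tuple at the back.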

PostgreSQL Internals Through Pictures

11 / 26
Pg_upgrade: Restore Schema In New Cluster
pg_dumpall --schema

[Figure: Old Cluster and New Cluster side by side. Each contains System Tables and Indexes (files 1 through 9, with pg_class labeled), User Tables and Indexes (files 10 through 27), and clog.]

Rapid Upgrades With Pg_Upgrade


12 / 26
Pg_upgrade: Copy User Heap/Index Files

[Figure: Old Cluster and New Cluster side by side. Each contains System Tables and Indexes (files 1 through 9, with pg_class labeled), User Tables and Indexes (files 10 through 27), and clog.]

Rapid Upgrades With Pg_Upgrade


13 / 26
Continuous Archiving

[Figure: timeline with marks at hours 02, 09, 11, and 13. A File System-Level Backup is followed by the Continuous Archive (WAL), drawn as three WAL files.]

The Magic of Hot Streaming Replication

14 / 26
Point-in-Time Recovery

[Figure: timeline with four marks during hour 17. A File System-Level Backup is followed by the Continuous Archive (WAL), drawn as three WAL files.]

The Magic of Hot Streaming Replication

15 / 26
Streaming Replication Setup

[Figure: timeline with marks at hours 02, 09, 11, and 13. A File System-Level Backup and three WAL files, labeled Standby Server.]

The Magic of Hot Streaming Replication

16 / 26
Streaming Replication in Operation

Primary (/pg_xlog) --Network--> Standby (/pg_xlog)

Primary: archive command -> WAL Archive Directory -> Standby: restore command

The Magic of Hot Streaming Replication


17 / 26
Read Scaling Using Pgpool & Streaming Replication

pgpool
  INSERT, UPDATE, DELETE to master host
  SELECT to any host

Master --streaming replication--> Slave, Slave

A full copy of the data exists on every node.


PostgreSQL Replication Solutions

18 / 26
Write Scaling Using FDW-Based Sharding

SQL Queries
  -> PG FDW
  -> SQL Queries with joins, sorts, aggregates
  -> Foreign Server, Foreign Server, Foreign Server

The Future of Postgres Sharding
19 / 26
Database Server Hardware Priorities

CPU

Memory

I/O

Database Hardware Selection Guidelines

20 / 26
Postgres’s Central Role

Postgres at the center, surrounded by:

Foreign Data Wrappers: Oracle, MongoDB, Twitter
Extensions: ISN, PostGIS, PL/R
Data Warehouse: Window Functions, Data Partitioning, Bitmap Scans
NoSQL: JSON, Easy DDL, Sharding

Making Postgres Central in Your Data Center


21 / 26
Use of the Contains Operator @>

\do @>
List of operators
Schema | Name | Left arg type | Right arg type | Result type | Description
------------+------+---------------+----------------+-------------+-------------
pg_catalog | @> | aclitem[] | aclitem | boolean | contains
pg_catalog | @> | anyarray | anyarray | boolean | contains
pg_catalog | @> | anyrange | anyelement | boolean | contains
pg_catalog | @> | anyrange | anyrange | boolean | contains
pg_catalog | @> | box | box | boolean | contains
pg_catalog | @> | box | point | boolean | contains
pg_catalog | @> | circle | circle | boolean | contains
pg_catalog | @> | circle | point | boolean | contains
pg_catalog | @> | jsonb | jsonb | boolean | contains
pg_catalog | @> | path | point | boolean | contains
pg_catalog | @> | polygon | point | boolean | contains
pg_catalog | @> | polygon | polygon | boolean | contains
pg_catalog | @> | tsquery | tsquery | boolean | contains

Non-Relational Postgres

22 / 26
Postgres System Tables
pg_database pg_trigger pg_aggregate pg_amproc
datlastsysoid tgrelid aggfnoid amopclaid
pg_conversion tgfoid aggtransfn amproc
conproc aggfinalfn
pg_language aggtranstype

pg_cast pg_proc pg_constraint pg_am


pg_rewrite castsource prolang contypid amgettuple
ev_class casttarget prorettype aminsert
castfunc pg_opclass ambeginscan
opcdeftype amrescan
amendscan
pg_index pg_class pg_type pg_operator ammarkpos
indexrelid reltype typrelid oprleft amrestrpos
indrelid relam typelem oprright ambuild
relfilenode typinput oprresult ambulkdelete
reltoastrelid typoutput oprcom amcostestimate
reltoastidxid typbasetype oprnegate
oprlsortop
oprrsortop
oprcode
pg_inherits pg_attribute pg_attrdef oprrest pg_amop
inhrelid attrelid adrelid oprjoin amopclaid
inhparent attnum adnum amopopr
atttypid
pg_statistic
starelid
staattnum
pg_depend pg_namespace staop pg_shadow pg_group pg_description

PostgreSQL Internals Through Pictures


http://www.postgresql.org/docs/current/static/catalogs.html
23 / 26
CTEs: Mixing Modification Commands

CREATE TEMPORARY TABLE old_orders (order_id INTEGER);

WITH source (order_id) AS (
        DELETE FROM orders WHERE name = 'my order' RETURNING order_id
), source2 AS (
        DELETE FROM items USING source WHERE source.order_id = items.order_id
)
INSERT INTO old_orders SELECT order_id FROM source;

Programming the SQL Way with Common Table Expressions

24 / 26
SSL 'VERIFY-CA' Is Secure From Spoofing

[Figure: a Database Client holding root.crt connects with SSL verify-ca. The connection to a Fake PostgreSQL Database Server is rejected (X) because of its invalid certificate (no CA signature). The PostgreSQL Database Server holds server.crt.]

Securing PostgreSQL From External Attack

25 / 26
Conclusion: Release Dates and Sizes After 2000

version | reldate | months | relnotes | lines | change | % change
----------+------------+--------+----------+---------+--------+----------
7.0 | 2000-05-08 | 11 | | 383270 | 51992 | 15
7.1 | 2001-04-13 | 11 | | 410500 | 27230 | 7
7.2 | 2002-02-04 | 10 | 250 | 394274 | -16226 | -3
7.3 | 2002-11-27 | 10 | 305 | 453282 | 59008 | 14
7.4 | 2003-11-17 | 12 | 263 | 508523 | 55241 | 12
8.0 | 2005-01-19 | 14 | 230 | 654437 | 145914 | 28
8.1 | 2005-11-08 | 10 | 174 | 630422 | -24015 | -3
8.2 | 2006-12-05 | 13 | 215 | 684646 | 54224 | 8
8.3 | 2008-02-04 | 14 | 223 | 762697 | 78051 | 11
8.4 | 2009-07-01 | 17 | 314 | 939098 | 176401 | 23
9.0 | 2010-09-20 | 15 | 237 | 999862 | 60764 | 6
9.1 | 2011-09-12 | 12 | 203 | 1069547 | 69685 | 6
9.2 | 2012-09-10 | 12 | 238 | 1148192 | 78645 | 7
9.3 | 2013-09-09 | 12 | 177 | 1195627 | 47435 | 4
9.4 | 2014-12-18 | 15 | 211 | 1261024 | 65397 | 5
9.5 | 2016-01-07 | 13 | 193 | 1340005 | 78981 | 6
9.6 | 2016-09-29 | 8 | 214 | 1380458 | 40453 | 3
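
The change and % change columns can be reproduced from the lines column alone. The Python sketch below does this for the first few releases; the truncation toward zero is an inference from the table's numbers (for example, 7.2 shows -3 where rounding would give -4), not something the slide states. The 7.0 row is used only as a baseline because the preceding release's line count is not in the table. The function name line_changes is hypothetical.

```python
# (version, lines) pairs copied from the first rows of the table above.
RELEASES = [
    ("7.0", 383270),
    ("7.1", 410500),
    ("7.2", 394274),
    ("7.3", 453282),
    ("7.4", 508523),
    ("8.0", 654437),
    ("8.1", 630422),
]


def line_changes(releases):
    """Return (version, change, percent_change) for each release after the first."""
    rows = []
    for (_, previous), (version, lines) in zip(releases, releases[1:]):
        change = lines - previous
        # int() truncates toward zero, which matches the table's values.
        percent = int(change * 100 / previous)
        rows.append((version, change, percent))
    return rows


CHANGES = line_changes(RELEASES)
for version, change, percent in CHANGES:
    print(version, change, percent)
```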

PostgreSQL: Past, Present, and Future

26 / 26
