
Commit 9e257a1

Add parallel pg_dump option.
New infrastructure is added which creates a set number of workers (threads on Windows, forked processes on Unix). Jobs are then handed out to these workers by the master process as needed. pg_restore is adjusted to use this new infrastructure in place of the old setup which created a new worker for each step on the fly.

Parallel dumps acquire a snapshot clone in order to stay consistent, if available.

The parallel option is selected by the -j / --jobs command line parameter of pg_dump.

Joachim Wieland, lightly editorialized by Andrew Dunstan.
1 parent 3b91fe1 commit 9e257a1

22 files changed, +2765 -819 lines
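The master/worker dispatch the commit message describes can be sketched in miniature. This is a hypothetical, heavily simplified illustration, not pg_dump's actual scheduler: it uses fixed round-robin hand-out over pipes instead of "next idle worker" dispatch, covers only the Unix fork() path (the commit also has a Windows thread path), and squares a job id as a stand-in for dumping one table.

```c
/*
 * Toy sketch of the new infrastructure's shape: a master forks a set
 * number of workers up front and hands jobs to them over pipes, rather
 * than forking a new worker for each step on the fly.
 */
#include <sys/wait.h>
#include <unistd.h>

long
run_pool(int nworkers, int njobs)
{
    int     towork[nworkers][2];    /* master -> worker: job ids */
    int     result[2];              /* workers -> master: results */
    long    sum = 0;
    int     done;

    if (pipe(result) != 0)
        return -1;

    for (int w = 0; w < nworkers; w++)
    {
        if (pipe(towork[w]) != 0)
            return -1;
        if (fork() == 0)
        {
            int     job;

            /* Close write ends we don't use, so EOFs propagate. */
            for (int j = 0; j <= w; j++)
                close(towork[j][1]);
            close(result[0]);
            /* Worker: process job ids until the master closes our pipe. */
            while (read(towork[w][0], &job, sizeof(job)) == sizeof(job))
            {
                done = job * job;   /* stand-in for "dump one table" */
                write(result[1], &done, sizeof(done));
            }
            _exit(0);
        }
        close(towork[w][0]);
    }
    close(result[1]);               /* master keeps only the read end */

    /* Master: hand out jobs, then close the pipes to signal "no more". */
    for (int job = 0; job < njobs; job++)
        write(towork[job % nworkers][1], &job, sizeof(job));
    for (int w = 0; w < nworkers; w++)
        close(towork[w][1]);

    /* Collect results until every worker has exited. */
    while (read(result[0], &done, sizeof(done)) == sizeof(done))
        sum += done;
    close(result[0]);
    while (wait(NULL) > 0)
        ;
    return sum;
}
```

The design point mirrored here is that the worker pool is created once and reused, and that end-of-work is signaled simply by closing the job pipes.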

doc/src/sgml/backup.sgml

Lines changed: 18 additions & 0 deletions
@@ -310,6 +310,24 @@ pg_restore -d <replaceable class="parameter">dbname</replaceable> <replaceable c
     with one of the other two approaches.
    </para>
 
+   <formalpara>
+    <title>Use <application>pg_dump</>'s parallel dump feature.</title>
+    <para>
+     To speed up the dump of a large database, you can use
+     <application>pg_dump</application>'s parallel mode. This will dump
+     multiple tables at the same time. You can control the degree of
+     parallelism with the <command>-j</command> parameter. Parallel dumps
+     are only supported for the "directory" archive format.
+
+<programlisting>
+pg_dump -j <replaceable class="parameter">num</replaceable> -F d -f <replaceable class="parameter">out.dir</replaceable> <replaceable class="parameter">dbname</replaceable>
+</programlisting>
+
+     You can use <command>pg_restore -j</command> to restore a dump in parallel.
+     This will work for any archive of either the "custom" or the "directory"
+     archive mode, whether or not it has been created with <command>pg_dump -j</command>.
+    </para>
+   </formalpara>
   </sect2>
  </sect1>

doc/src/sgml/perform.sgml

Lines changed: 9 additions & 0 deletions
@@ -1433,6 +1433,15 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
     base backup.
    </para>
   </listitem>
+  <listitem>
+   <para>
+    Experiment with the parallel dump and restore modes of both
+    <application>pg_dump</> and <application>pg_restore</> and find the
+    optimal number of concurrent jobs to use. Dumping and restoring in
+    parallel by means of the <option>-j</> option should give you
+    significantly higher performance than the serial mode.
+   </para>
+  </listitem>
   <listitem>
    <para>
     Consider whether the whole dump should be restored as a single

doc/src/sgml/ref/pg_dump.sgml

Lines changed: 84 additions & 5 deletions
@@ -73,10 +73,12 @@ PostgreSQL documentation
    transfer mechanism. <application>pg_dump</application> can be used to
    backup an entire database, then <application>pg_restore</application>
    can be used to examine the archive and/or select which parts of the
-   database are to be restored. The most flexible output file format is
-   the <quote>custom</quote> format (<option>-Fc</option>). It allows
-   for selection and reordering of all archived items, and is compressed
-   by default.
+   database are to be restored. The most flexible output file formats are
+   the <quote>custom</quote> format (<option>-Fc</option>) and the
+   <quote>directory</quote> format (<option>-Fd</option>). They allow
+   for selection and reordering of all archived items, support parallel
+   restoration, and are compressed by default. The <quote>directory</quote>
+   format is the only format that supports parallel dumps.
   </para>
 
   <para>
@@ -251,7 +253,8 @@ PostgreSQL documentation
        can read. A directory format archive can be manipulated with
        standard Unix tools; for example, files in an uncompressed archive
        can be compressed with the <application>gzip</application> tool.
-       This format is compressed by default.
+       This format is compressed by default and also supports parallel
+       dumps.
       </para>
      </listitem>
     </varlistentry>
@@ -285,6 +288,62 @@ PostgreSQL documentation
      </listitem>
     </varlistentry>
 
+    <varlistentry>
+     <term><option>-j <replaceable class="parameter">njobs</replaceable></></term>
+     <term><option>--jobs=<replaceable class="parameter">njobs</replaceable></></term>
+     <listitem>
+      <para>
+       Run the dump in parallel by dumping <replaceable class="parameter">njobs</replaceable>
+       tables simultaneously. This option reduces the time of the dump but it also
+       increases the load on the database server. You can only use this option with the
+       directory output format because this is the only output format where multiple processes
+       can write their data at the same time.
+      </para>
+      <para>
+       <application>pg_dump</> will open <replaceable class="parameter">njobs</replaceable>
+       + 1 connections to the database, so make sure your <xref linkend="guc-max-connections">
+       setting is high enough to accommodate all connections.
+      </para>
+      <para>
+       Requesting exclusive locks on database objects while running a parallel dump could
+       cause the dump to fail. The reason is that the <application>pg_dump</> master process
+       requests shared locks on the objects that the worker processes are going to dump later,
+       in order to make sure that nobody deletes them while the dump is running.
+       If another client then requests an exclusive lock on a table, that lock will not be
+       granted but will be queued waiting for the shared lock of the master process to be
+       released. Consequently any other access to the table will not be granted either and
+       will queue after the exclusive lock request. This includes the worker process trying
+       to dump the table. Without any precautions this would be a classic deadlock situation.
+       To detect this conflict, the <application>pg_dump</> worker process requests another
+       shared lock using the <literal>NOWAIT</> option. If the worker process is not granted
+       this shared lock, somebody else must have requested an exclusive lock in the meantime
+       and there is no way to continue with the dump, so <application>pg_dump</> has no choice
+       but to abort the dump.
+      </para>
+      <para>
+       For a consistent backup, the database server needs to support synchronized snapshots,
+       a feature that was introduced in <productname>PostgreSQL</productname> 9.2. With this
+       feature, database clients can ensure they see the same dataset even though they use
+       different connections. <command>pg_dump -j</command> uses multiple database
+       connections; it connects to the database once with the master process and
+       once again for each worker job. Without the synchronized snapshot feature, the
+       different worker jobs wouldn't be guaranteed to see the same data in each connection,
+       which could lead to an inconsistent backup.
+      </para>
+      <para>
+       If you want to run a parallel dump of a pre-9.2 server, you need to make sure that the
+       database content doesn't change between the time the master connects to the
+       database and the time the last worker job has connected to the database. The easiest
+       way to do this is to halt any data modifying processes (DDL and DML) accessing the
+       database before starting the backup. You also need to specify the
+       <option>--no-synchronized-snapshots</option> parameter when running
+       <command>pg_dump -j</command> against a pre-9.2 <productname>PostgreSQL</productname>
+       server.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry>
      <term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
      <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
@@ -690,6 +749,17 @@ PostgreSQL documentation
      </listitem>
     </varlistentry>
 
+    <varlistentry>
+     <term><option>--no-synchronized-snapshots</></term>
+     <listitem>
+      <para>
+       This option allows running <command>pg_dump -j</> against a pre-9.2
+       server; see the documentation of the <option>-j</option> parameter
+       for more details.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry>
      <term><option>--no-tablespaces</option></term>
      <listitem>
@@ -1082,6 +1152,15 @@ CREATE DATABASE foo WITH TEMPLATE template0;
 </screen>
   </para>
 
+  <para>
+   To dump a database into a directory-format archive in parallel with
+   5 worker jobs:
+
+<screen>
+<prompt>$</prompt> <userinput>pg_dump -Fd mydb -j 5 -f dumpdir</userinput>
+</screen>
+  </para>
+
   <para>
    To reload an archive file into a (freshly created) database named
   <literal>newdb</>:

src/bin/pg_dump/Makefile

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
 override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
 
 OBJS=	pg_backup_archiver.o pg_backup_db.o pg_backup_custom.o \
-	pg_backup_null.o pg_backup_tar.o \
+	pg_backup_null.o pg_backup_tar.o parallel.o \
 	pg_backup_directory.o dumputils.o compress_io.o $(WIN32RES)
 
 KEYWRDOBJS = keywords.o kwlookup.o

src/bin/pg_dump/compress_io.c

Lines changed: 10 additions & 0 deletions
@@ -54,6 +54,7 @@
 
 #include "compress_io.h"
 #include "dumputils.h"
+#include "parallel.h"
 
 /*----------------------
  * Compressor API
@@ -182,6 +183,9 @@ size_t
 WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
 				   const void *data, size_t dLen)
 {
+	/* Are we aborting? */
+	checkAborting(AH);
+
 	switch (cs->comprAlg)
 	{
 		case COMPR_ALG_LIBZ:
@@ -351,6 +355,9 @@ ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
 	/* no minimal chunk size for zlib */
 	while ((cnt = readF(AH, &buf, &buflen)))
 	{
+		/* Are we aborting? */
+		checkAborting(AH);
+
 		zp->next_in = (void *) buf;
 		zp->avail_in = cnt;
 
@@ -411,6 +418,9 @@ ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
 
 	while ((cnt = readF(AH, &buf, &buflen)))
 	{
+		/* Are we aborting? */
+		checkAborting(AH);
+
 		ahwrite(buf, 1, cnt, AH);
 	}
 
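The `checkAborting(AH)` calls inserted above poll for a pending abort so that long compression and copy loops stop promptly instead of running to completion. The polling pattern itself can be shown in a self-contained sketch; the names and the flag mechanism here are illustrative stand-ins, not pg_dump's (whose checkAborting() exits rather than returning a status).

```c
#include <signal.h>

/* Set asynchronously (e.g. by a signal handler, or by the master
 * telling a worker to quit) when the run should be abandoned. */
static volatile sig_atomic_t wantAbort = 0;

/* Illustrative stand-in for checkAborting(): report instead of exit. */
static int
check_aborting(void)
{
    return wantAbort != 0;
}

/*
 * A long-running loop that polls the flag once per chunk, like the
 * read/write loops in compress_io.c. The abort_after parameter lets
 * us simulate the flag being raised mid-run; pass 0 for "never".
 */
long
process_chunks(int nchunks, int abort_after)
{
    long    done = 0;

    for (int i = 0; i < nchunks; i++)
    {
        /* Are we aborting? */
        if (check_aborting())
            break;
        done++;                     /* stand-in for one compress/write step */
        if (abort_after > 0 && done == abort_after)
            wantAbort = 1;          /* simulate an abort request arriving */
    }
    return done;
}
```

Checking once per chunk keeps the overhead negligible while bounding the latency of an abort to a single chunk's worth of work.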
src/bin/pg_dump/dumputils.c

Lines changed: 73 additions & 13 deletions
@@ -38,6 +38,7 @@ static struct
 } on_exit_nicely_list[MAX_ON_EXIT_NICELY];
 
 static int	on_exit_nicely_index;
+void		(*on_exit_msg_func) (const char *modulename, const char *fmt, va_list ap) = vwrite_msg;
 
 #define supports_grant_options(version) ((version) >= 70400)
 
@@ -48,11 +49,21 @@ static bool parseAclItem(const char *item, const char *type,
 static char *copyAclUserName(PQExpBuffer output, char *input);
 static void AddAcl(PQExpBuffer aclbuf, const char *keyword,
 	   const char *subname);
+static PQExpBuffer getThreadLocalPQExpBuffer(void);
 
 #ifdef WIN32
+static void shutdown_parallel_dump_utils(int code, void *unused);
 static bool parallel_init_done = false;
 static DWORD tls_index;
 static DWORD mainThreadId;
+
+static void
+shutdown_parallel_dump_utils(int code, void *unused)
+{
+	/* Call the cleanup function only from the main thread */
+	if (mainThreadId == GetCurrentThreadId())
+		WSACleanup();
+}
 #endif
 
 void
@@ -61,23 +72,29 @@ init_parallel_dump_utils(void)
 #ifdef WIN32
 	if (!parallel_init_done)
 	{
+		WSADATA		wsaData;
+		int			err;
+
 		tls_index = TlsAlloc();
-		parallel_init_done = true;
 		mainThreadId = GetCurrentThreadId();
+		err = WSAStartup(MAKEWORD(2, 2), &wsaData);
+		if (err != 0)
+		{
+			fprintf(stderr, _("WSAStartup failed: %d\n"), err);
+			exit_nicely(1);
+		}
+		on_exit_nicely(shutdown_parallel_dump_utils, NULL);
+		parallel_init_done = true;
 	}
 #endif
 }
 
 /*
- * Quotes input string if it's not a legitimate SQL identifier as-is.
- *
- * Note that the returned string must be used before calling fmtId again,
- * since we re-use the same return buffer each time. Non-reentrant but
- * reduces memory leakage. (On Windows the memory leakage will be one buffer
- * per thread, which is at least better than one per call).
+ * Non-reentrant but reduces memory leakage. (On Windows the memory leakage
+ * will be one buffer per thread, which is at least better than one per call).
  */
-const char *
-fmtId(const char *rawid)
+static PQExpBuffer
+getThreadLocalPQExpBuffer(void)
 {
 	/*
 	 * The Tls code goes awry if we use a static var, so we provide for both
@@ -86,9 +103,6 @@ fmtId(const char *rawid)
 	static PQExpBuffer s_id_return = NULL;
 	PQExpBuffer id_return;
 
-	const char *cp;
-	bool		need_quotes = false;
-
 #ifdef WIN32
 	if (parallel_init_done)
 		id_return = (PQExpBuffer) TlsGetValue(tls_index);	/* 0 when not set */
@@ -118,6 +132,23 @@
 
 	}
 
+	return id_return;
+}
+
+/*
+ * Quotes input string if it's not a legitimate SQL identifier as-is.
+ *
+ * Note that the returned string must be used before calling fmtId again,
+ * since we re-use the same return buffer each time.
+ */
+const char *
+fmtId(const char *rawid)
+{
+	PQExpBuffer id_return = getThreadLocalPQExpBuffer();
+
+	const char *cp;
+	bool		need_quotes = false;
+
 	/*
 	 * These checks need to match the identifier production in scan.l. Don't
 	 * use islower() etc.
@@ -185,6 +216,35 @@ fmtId(const char *rawid)
 	return id_return->data;
 }
 
+/*
+ * fmtQualifiedId - convert a qualified name to the proper format for
+ * the source database.
+ *
+ * Like fmtId, use the result before calling again.
+ *
+ * Since we call fmtId and it also uses getThreadLocalPQExpBuffer() we cannot
+ * use it until we're finished with calling fmtId().
+ */
+const char *
+fmtQualifiedId(int remoteVersion, const char *schema, const char *id)
+{
+	PQExpBuffer id_return;
+	PQExpBuffer lcl_pqexp = createPQExpBuffer();
+
+	/* Suppress schema name if fetching from pre-7.3 DB */
+	if (remoteVersion >= 70300 && schema && *schema)
+	{
+		appendPQExpBuffer(lcl_pqexp, "%s.", fmtId(schema));
+	}
+	appendPQExpBuffer(lcl_pqexp, "%s", fmtId(id));
+
+	id_return = getThreadLocalPQExpBuffer();
+
+	appendPQExpBuffer(id_return, "%s", lcl_pqexp->data);
+	destroyPQExpBuffer(lcl_pqexp);
+
+	return id_return->data;
+}
 
 /*
  * Convert a string value to an SQL string literal and append it to
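The refactoring above moves fmtId()'s per-thread static buffer into getThreadLocalPQExpBuffer(), using TlsAlloc()/TlsGetValue() on Windows (where the parallel workers are threads; on Unix they are separate processes and need none of this). The same idea can be sketched with POSIX thread-specific data — a hypothetical analogue of the pattern, not the patch's code:

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t buf_key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;

static void
make_key(void)
{
    /* free() runs at thread exit, so the "leak" is one buffer per thread */
    pthread_key_create(&buf_key, free);
}

/*
 * Return this thread's private scratch buffer, allocating it on first
 * use. Repeated calls from one thread return the same storage, which is
 * why a result must be consumed before the next call -- the same
 * contract fmtId() documents.
 */
char *
get_thread_local_buffer(void)
{
    char   *buf;

    pthread_once(&key_once, make_key);
    buf = pthread_getspecific(buf_key);
    if (buf == NULL)
    {
        buf = malloc(256);
        pthread_setspecific(buf_key, buf);
    }
    return buf;
}
```

The trade-off is the one the patch's comment names: the function stays non-reentrant within a thread, but threads can no longer clobber each other's results, and the leakage is bounded at one buffer per thread.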
@@ -1315,7 +1375,7 @@ exit_horribly(const char *modulename, const char *fmt,...)
 	va_list		ap;
 
 	va_start(ap, fmt);
-	vwrite_msg(modulename, fmt, ap);
+	on_exit_msg_func(modulename, fmt, ap);
 	va_end(ap);
 
 	exit_nicely(1);
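The exit_horribly() change routes error output through the new on_exit_msg_func pointer (default vwrite_msg), so parallel code can later substitute its own handler without touching any call site. The hook pattern in an illustrative stand-alone form — the pointer-swap idea is the commit's; every other name here is made up:

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Default sink: print to stderr, loosely like vwrite_msg(). */
static void
default_msg(const char *modulename, const char *fmt, va_list ap)
{
    if (modulename)
        fprintf(stderr, "%s: ", modulename);
    vfprintf(stderr, fmt, ap);
}

/* Every error path goes through this pointer; parallel workers can
 * swap in their own handler without changing the call sites. */
static void (*msg_hook) (const char *modulename, const char *fmt, va_list ap) = default_msg;

static char captured[128];

/* Replacement handler that captures the message instead of printing,
 * standing in for "forward the message to the master process". */
static void
capture_msg(const char *modulename, const char *fmt, va_list ap)
{
    (void) modulename;
    vsnprintf(captured, sizeof(captured), fmt, ap);
}

/* Analogue of exit_horribly(), minus the actual exit. */
static void
report(const char *modulename, const char *fmt,...)
{
    va_list ap;

    va_start(ap, fmt);
    msg_hook(modulename, fmt, ap);
    va_end(ap);
}

/* Demonstrate swapping the hook, as a parallel worker would. */
const char *
demo_swap_hook(void)
{
    msg_hook = capture_msg;
    report("pg_dump", "worker %d failed", 7);
    return captured;
}
```

Keeping the indirection in one global pointer means a worker only has to assign it once at startup; no caller of the reporting function needs to know whether it is running in the master or a worker.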
