
Commit 40e2e5e

Introduce framework for parallelizing various pg_upgrade tasks.

A number of pg_upgrade steps require connecting to every database in the cluster and running the same query in each one. When there are many databases, these steps are particularly time-consuming, especially since they are performed sequentially, i.e., we connect to a database, run the query, and process the results before moving on to the next database.

This commit introduces a new framework that makes it easy to parallelize most of these once-in-each-database tasks by processing multiple databases concurrently. This framework manages a set of slots that follow a simple state machine, and it uses libpq's asynchronous APIs to establish the connections and run the queries. The --jobs option is used to determine the number of slots to use. To use this new task framework, callers simply need to provide the query and a callback function to process its results, and the framework takes care of the rest. A more complete description is provided at the top of the new task.c file.

None of the eligible once-in-each-database tasks are converted to use this new framework in this commit. That will be done via several follow-up commits.

Reviewed-by: Jeff Davis, Robert Haas, Daniel Gustafsson, Ilya Gladyshev, Corey Huinker
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
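
The slot mechanism described in the commit message can be illustrated with a small simulation. This is only a sketch: it does not use libpq, the state names (SLOT_FREE, SLOT_CONNECTING, SLOT_RUNNING) and the function simulate_slots are hypothetical, and the real state machine lives in task.c, which is not shown on this page. Each pass over the slots stands in for one round of polling the asynchronous connections and queries.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical state names; the real ones are defined in task.c. */
typedef enum { SLOT_FREE, SLOT_CONNECTING, SLOT_RUNNING } SlotState;

typedef struct
{
    SlotState state;
    int       db;       /* index of the database this slot is handling */
} Slot;

/*
 * Process ndbs databases using njobs slots.  Every pass advances each slot
 * by one state.  processed[db] is incremented when a database finishes.
 * Returns the number of passes needed, or -1 on bad input or allocation
 * failure.
 */
int
simulate_slots(int ndbs, int njobs, int *processed)
{
    Slot *slots;
    int   next_db = 0;
    int   done = 0;
    int   passes = 0;

    if (njobs < 1)
        return -1;

    slots = calloc(njobs, sizeof(Slot));    /* zeroed, so all SLOT_FREE */
    if (slots == NULL)
        return -1;

    while (done < ndbs)
    {
        for (int i = 0; i < njobs; i++)
        {
            Slot *s = &slots[i];

            switch (s->state)
            {
                case SLOT_FREE:
                    /* Pick up the next unprocessed database, if any. */
                    if (next_db < ndbs)
                    {
                        s->db = next_db++;
                        s->state = SLOT_CONNECTING;
                    }
                    break;
                case SLOT_CONNECTING:
                    /* Connection established: send the query. */
                    s->state = SLOT_RUNNING;
                    break;
                case SLOT_RUNNING:
                    /* Result arrived: process it and release the slot. */
                    processed[s->db]++;
                    done++;
                    s->state = SLOT_FREE;
                    break;
            }
        }
        passes++;
    }

    free(slots);
    return passes;
}
```

In this toy model, six databases take 18 passes with one slot and 6 passes with three slots, which is the kind of saving the framework aims for when --jobs is greater than one.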
1 parent d891c49 commit 40e2e5e

6 files changed: +474 -3 lines changed


doc/src/sgml/ref/pgupgrade.sgml

Lines changed: 3 additions & 3 deletions
@@ -118,7 +118,7 @@ PostgreSQL documentation
 <varlistentry>
 <term><option>-j <replaceable class="parameter">njobs</replaceable></option></term>
 <term><option>--jobs=<replaceable class="parameter">njobs</replaceable></option></term>
-<listitem><para>number of simultaneous processes or threads to use
+<listitem><para>number of simultaneous connections and processes/threads to use
 </para></listitem>
 </varlistentry>
 
@@ -587,8 +587,8 @@ NET STOP postgresql-&majorversion;
 
 <para>
 The <option>--jobs</option> option allows multiple CPU cores to be used
-for copying/linking of files and to dump and restore database schemas
-in parallel; a good place to start is the maximum of the number of
+for copying/linking of files, dumping and restoring database schemas
+in parallel, etc.; a good place to start is the maximum of the number of
 CPU cores and tablespaces. This option can dramatically reduce the
 time to upgrade a multi-database server running on a multiprocessor
 machine.

src/bin/pg_upgrade/Makefile

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@ OBJS = \
 	relfilenumber.o \
 	server.o \
 	tablespace.o \
+	task.o \
 	util.o \
 	version.o
 

src/bin/pg_upgrade/meson.build

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ pg_upgrade_sources = files(
   'relfilenumber.c',
   'server.c',
   'tablespace.c',
+  'task.c',
   'util.c',
   'version.c',
 )

src/bin/pg_upgrade/pg_upgrade.h

Lines changed: 21 additions & 0 deletions
@@ -494,3 +494,24 @@ void parallel_transfer_all_new_dbs(DbInfoArr *old_db_arr, DbInfoArr *new_db_arr
                                    char *old_pgdata, char *new_pgdata,
                                    char *old_tablespace);
 bool reap_child(bool wait_for_child);
+
+/* task.c */
+
+typedef void (*UpgradeTaskProcessCB) (DbInfo *dbinfo, PGresult *res, void *arg);
+
+/* struct definition is private to task.c */
+typedef struct UpgradeTask UpgradeTask;
+
+UpgradeTask *upgrade_task_create(void);
+void upgrade_task_add_step(UpgradeTask *task, const char *query,
+                           UpgradeTaskProcessCB process_cb, bool free_result,
+                           void *arg);
+void upgrade_task_run(const UpgradeTask *task, const ClusterInfo *cluster);
+void upgrade_task_free(UpgradeTask *task);
+
+/* convenient type for common private data needed by several tasks */
+typedef struct
+{
+    FILE *file;
+    char path[MAXPGPATH];
+} UpgradeTaskReport;
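
The declarations above suggest a create / add step / run / free calling pattern. The sketch below shows how a caller might use it. It rests on several assumptions: DbInfo, PGresult and ClusterInfo are replaced by tiny stand-in structs so the example compiles outside the PostgreSQL tree, the four upgrade_task_* functions are mocked with a sequential, single-step implementation that fabricates results, and CheckState, count_dbs_with_rows and run_check are hypothetical names. The real framework in task.c connects to each database through libpq and runs the queries concurrently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in types (assumptions); the real ones come from pg_upgrade.h and libpq. */
typedef struct { const char *db_name; } DbInfo;
typedef struct { int ntuples; } PGresult;
typedef struct { DbInfo *dbs; int ndbs; } ClusterInfo;

typedef void (*UpgradeTaskProcessCB) (DbInfo *dbinfo, PGresult *res, void *arg);

/* Mock task holding a single step; the real struct is private to task.c. */
typedef struct UpgradeTask
{
    const char *query;
    UpgradeTaskProcessCB process_cb;
    void *arg;
} UpgradeTask;

UpgradeTask *
upgrade_task_create(void)
{
    return calloc(1, sizeof(UpgradeTask));
}

void
upgrade_task_add_step(UpgradeTask *task, const char *query,
                      UpgradeTaskProcessCB process_cb, bool free_result,
                      void *arg)
{
    (void) free_result;         /* unused here: the mock owns no real results */
    task->query = query;
    task->process_cb = process_cb;
    task->arg = arg;
}

/* Mock runner: visits databases one by one and fabricates a result for each. */
void
upgrade_task_run(const UpgradeTask *task, const ClusterInfo *cluster)
{
    for (int i = 0; i < cluster->ndbs; i++)
    {
        PGresult res = { .ntuples = i % 2 };    /* fake row count */

        task->process_cb(&cluster->dbs[i], &res, task->arg);
    }
}

void
upgrade_task_free(UpgradeTask *task)
{
    free(task);
}

/* Caller side: private state shared with the callback through "arg". */
typedef struct { int dbs_with_rows; } CheckState;

void
count_dbs_with_rows(DbInfo *dbinfo, PGresult *res, void *arg)
{
    CheckState *state = arg;

    (void) dbinfo;
    if (res->ntuples > 0)       /* with libpq this would be PQntuples(res) */
        state->dbs_with_rows++;
}

/* Returns how many databases produced at least one row, or -1 on failure. */
int
run_check(const ClusterInfo *cluster)
{
    CheckState   state = {0};
    UpgradeTask *task = upgrade_task_create();

    if (task == NULL)
        return -1;
    upgrade_task_add_step(task, "SELECT 1 FROM pg_catalog.pg_class LIMIT 1",
                          count_dbs_with_rows, true, &state);
    upgrade_task_run(task, cluster);
    upgrade_task_free(task);
    return state.dbs_with_rows;
}
```

The void *arg parameter is what lets the callback accumulate results across databases without global variables; the UpgradeTaskReport struct in the header appears to be intended for that same role.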
