
Commit ed308d7

Add options to enable and disable checksums in pg_checksums
An offline cluster can now work with more modes in pg_checksums:
- --enable enables checksums in a cluster, updating all blocks with a
  correct checksum, and updating the control file at the end.
- --disable disables checksums in a cluster, updating only the control
  file.
- --check is an extra option able to verify checksums for a cluster, and
  the default used if no mode is specified.

When running --enable or --disable, the data folder gets fsync'd for
durability, and then it is followed by a control file update and flush
to keep the operation consistent should the tool be interrupted, killed
or the host unplugged. If no mode is specified in the options, then
--check is used for compatibility with older versions of pg_checksums
(named pg_verify_checksums in v11 where it was introduced).

Author: Michael Banck, Michael Paquier
Reviewed-by: Fabien Coelho, Magnus Hagander, Sergei Kornilov
Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de
1 parent 87914e7 commit ed308d7
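
The ordering described in the commit message (make the data durable first, update the control file last) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `write_durably` and `toy_switch_checksums` are hypothetical helpers, the "control file" is a single byte, and only one file is fsync'd, whereas the actual commit calls `fsync_pgdata()` on the whole data directory and `update_controlfile()` on `pg_control`.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes at the start of path and fsync the
 * file.  Returns 0 on success, -1 on failure.  The containing directory is
 * not fsync'd in this sketch.
 */
static int
write_durably(const char *path, const void *buf, size_t len)
{
	int			fd = open(path, O_WRONLY | O_CREAT, 0600);

	if (fd < 0)
		return -1;
	if (write(fd, buf, len) != (ssize_t) len || fsync(fd) != 0)
	{
		close(fd);
		return -1;
	}
	return close(fd);
}

/*
 * Hypothetical two-step switch: flush the data first, flip the one-byte
 * "control" flag last.  If the first step fails, the flag is never touched,
 * so the old state still describes the data.
 */
static int
toy_switch_checksums(const char *data_path, const void *data, size_t len,
					 const char *control_path, unsigned char new_version)
{
	if (write_durably(data_path, data, len) != 0)
		return -1;
	return write_durably(control_path, &new_version, 1);
}
```

Because the flag is written only after the data has been flushed, an interruption before the second step leaves the flag at its previous value, which is the behavior the commit message aims for.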

4 files changed: 285 additions, 46 deletions


doc/src/sgml/ref/pg_checksums.sgml

Lines changed: 74 additions & 5 deletions
@@ -16,7 +16,7 @@ PostgreSQL documentation
 
  <refnamediv>
   <refname>pg_checksums</refname>
-  <refpurpose>verify data checksums in a <productname>PostgreSQL</productname> database cluster</refpurpose>
+  <refpurpose>enable, disable or check data checksums in a <productname>PostgreSQL</productname> database cluster</refpurpose>
  </refnamediv>
 
  <refsynopsisdiv>
@@ -36,10 +36,19 @@ PostgreSQL documentation
  <refsect1 id="r1-app-pg_checksums-1">
   <title>Description</title>
   <para>
-   <application>pg_checksums</application> verifies data checksums in a
-   <productname>PostgreSQL</productname> cluster. The server must be shut
-   down cleanly before running <application>pg_checksums</application>.
-   The exit status is zero if there are no checksum errors, otherwise nonzero.
+   <application>pg_checksums</application> checks, enables or disables data
+   checksums in a <productname>PostgreSQL</productname> cluster. The server
+   must be shut down cleanly before running
+   <application>pg_checksums</application>. The exit status is zero if there
+   are no checksum errors when checking them, and nonzero if at least one
+   checksum failure is detected. If enabling or disabling checksums, the
+   exit status is nonzero if the operation failed.
+  </para>
+
+  <para>
+   While checking or enabling checksums needs to scan or write every file in
+   the cluster, disabling checksums will only update the file
+   <filename>pg_control</filename>.
   </para>
  </refsect1>
 
@@ -60,6 +69,37 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-c</option></term>
+      <term><option>--check</option></term>
+      <listitem>
+       <para>
+        Checks checksums. This is the default mode if nothing else is
+        specified.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-d</option></term>
+      <term><option>--disable</option></term>
+      <listitem>
+       <para>
+        Disables checksums.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-e</option></term>
+      <term><option>--enable</option></term>
+      <listitem>
+       <para>
+        Enables checksums.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-v</option></term>
       <term><option>--verbose</option></term>
@@ -119,4 +159,33 @@ PostgreSQL documentation
      </varlistentry>
     </variablelist>
  </refsect1>
+
+ <refsect1>
+  <title>Notes</title>
+  <para>
+   When disabling or enabling checksums in a replication setup of multiple
+   clusters, it is recommended to stop all the clusters before doing
+   the switch to all the clusters consistently. When using a replication
+   setup with tools which perform direct copies of relation file blocks
+   (for example <xref linkend="app-pgrewind"/>), enabling or disabling
+   checksums can lead to page corruptions in the shape of incorrect
+   checksums if the operation is not done consistently across all nodes.
+   Destroying all the standbys in the setup first, enabling or disabling
+   checksums on the primary and finally recreating the standbys from
+   scratch is also safe.
+  </para>
+  <para>
+   If <application>pg_checksums</application> is aborted or killed in
+   its operation while enabling or disabling checksums, the cluster
+   will have the same state with respect of checksums as before the
+   operation and <application>pg_checksums</application> needs to be
+   restarted.
+  </para>
+  <para>
+   When enabling checksums in a cluster, the operation can potentially
+   take a long time if the data directory is large. During this operation,
+   the cluster or other programs that write to the data directory must not
+   be started or else data loss may occur.
+  </para>
+ </refsect1>
 </refentry>
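
The mode options documented above (`--check` as the default, `--disable`, `--enable`) can be sketched as a small parser. This is a simplified illustration only: `ToyMode` and `toy_parse_mode` are hypothetical names, the sketch uses plain `strcmp` rather than the `getopt_long` loop the tool actually uses, and it rejects every other option (including the real `-D`, `-r` and `-v`) for brevity.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical mirror of the three documented modes. */
typedef enum
{
	TOY_MODE_CHECK,
	TOY_MODE_DISABLE,
	TOY_MODE_ENABLE
} ToyMode;

/*
 * Scan argv for a mode option.  Returns false on an option this sketch does
 * not know; non-option arguments (such as a data directory) are skipped.
 * If several mode options are given, the last one wins.
 */
static bool
toy_parse_mode(int argc, char *const argv[], ToyMode *mode)
{
	*mode = TOY_MODE_CHECK;		/* default when no mode is specified */

	for (int i = 1; i < argc; i++)
	{
		const char *arg = argv[i];

		if (strcmp(arg, "-c") == 0 || strcmp(arg, "--check") == 0)
			*mode = TOY_MODE_CHECK;
		else if (strcmp(arg, "-d") == 0 || strcmp(arg, "--disable") == 0)
			*mode = TOY_MODE_DISABLE;
		else if (strcmp(arg, "-e") == 0 || strcmp(arg, "--enable") == 0)
			*mode = TOY_MODE_ENABLE;
		else if (arg[0] == '-')
			return false;		/* unknown option in this sketch */
	}
	return true;
}
```

Calling `toy_parse_mode` with only a data directory argument leaves the mode at `TOY_MODE_CHECK`, matching the documented default.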

src/bin/pg_checksums/pg_checksums.c

Lines changed: 149 additions & 26 deletions
@@ -1,7 +1,8 @@
 /*-------------------------------------------------------------------------
  *
  * pg_checksums.c
- *	  Verifies page level checksums in an offline cluster.
+ *	  Checks, enables or disables page level checksums for an offline
+ *	  cluster
  *
  * Copyright (c) 2010-2019, PostgreSQL Global Development Group
  *
@@ -17,14 +18,15 @@
 #include <sys/stat.h>
 #include <unistd.h>
 
-#include "catalog/pg_control.h"
+#include "access/xlog_internal.h"
 #include "common/controldata_utils.h"
+#include "common/file_perm.h"
+#include "common/file_utils.h"
 #include "getopt_long.h"
 #include "pg_getopt.h"
 #include "storage/bufpage.h"
 #include "storage/checksum.h"
 #include "storage/checksum_impl.h"
-#include "storage/fd.h"
 
 
 static int64 files = 0;
@@ -35,16 +37,38 @@ static ControlFileData *ControlFile;
 static char *only_relfilenode = NULL;
 static bool verbose = false;
 
+typedef enum
+{
+	PG_MODE_CHECK,
+	PG_MODE_DISABLE,
+	PG_MODE_ENABLE
+} PgChecksumMode;
+
+/*
+ * Filename components.
+ *
+ * XXX: fd.h is not declared here as frontend side code is not able to
+ * interact with the backend-side definitions for the various fsync
+ * wrappers.
+ */
+#define PG_TEMP_FILES_DIR "pgsql_tmp"
+#define PG_TEMP_FILE_PREFIX "pgsql_tmp"
+
+static PgChecksumMode mode = PG_MODE_CHECK;
+
 static const char *progname;
 
 static void
 usage(void)
 {
-	printf(_("%s verifies data checksums in a PostgreSQL database cluster.\n\n"), progname);
+	printf(_("%s enables, disables or verifies data checksums in a PostgreSQL database cluster.\n\n"), progname);
 	printf(_("Usage:\n"));
 	printf(_(" %s [OPTION]... [DATADIR]\n"), progname);
 	printf(_("\nOptions:\n"));
 	printf(_(" [-D, --pgdata=]DATADIR data directory\n"));
+	printf(_(" -c, --check check data checksums (default)\n"));
+	printf(_(" -d, --disable disable data checksums\n"));
+	printf(_(" -e, --enable enable data checksums\n"));
 	printf(_(" -v, --verbose output verbose messages\n"));
 	printf(_(" -r RELFILENODE check only relation with specified relfilenode\n"));
 	printf(_(" -V, --version output version information, then exit\n"));
@@ -90,8 +114,14 @@ scan_file(const char *fn, BlockNumber segmentno)
 	PageHeader	header = (PageHeader) buf.data;
 	int			f;
 	BlockNumber blockno;
+	int			flags;
+
+	Assert(mode == PG_MODE_ENABLE ||
+		   mode == PG_MODE_CHECK);
+
+	flags = (mode == PG_MODE_ENABLE) ? O_RDWR : O_RDONLY;
+	f = open(fn, PG_BINARY | flags, 0);
 
-	f = open(fn, O_RDONLY | PG_BINARY, 0);
 	if (f < 0)
 	{
 		fprintf(stderr, _("%s: could not open file \"%s\": %s\n"),
@@ -121,18 +151,47 @@ scan_file(const char *fn, BlockNumber segmentno)
 			continue;
 
 		csum = pg_checksum_page(buf.data, blockno + segmentno * RELSEG_SIZE);
-		if (csum != header->pd_checksum)
+		if (mode == PG_MODE_CHECK)
 		{
-			if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
-				fprintf(stderr, _("%s: checksum verification failed in file \"%s\", block %u: calculated checksum %X but block contains %X\n"),
-						progname, fn, blockno, csum, header->pd_checksum);
-			badblocks++;
+			if (csum != header->pd_checksum)
+			{
+				if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+					fprintf(stderr, _("%s: checksum verification failed in file \"%s\", block %u: calculated checksum %X but block contains %X\n"),
+							progname, fn, blockno, csum, header->pd_checksum);
+				badblocks++;
+			}
+		}
+		else if (mode == PG_MODE_ENABLE)
+		{
+			/* Set checksum in page header */
+			header->pd_checksum = csum;
+
+			/* Seek back to beginning of block */
+			if (lseek(f, -BLCKSZ, SEEK_CUR) < 0)
+			{
+				fprintf(stderr, _("%s: seek failed for block %d in file \"%s\": %s\n"), progname, blockno, fn, strerror(errno));
+				exit(1);
+			}
+
+			/* Write block with checksum */
+			if (write(f, buf.data, BLCKSZ) != BLCKSZ)
+			{
+				fprintf(stderr, _("%s: could not update checksum of block %d in file \"%s\": %s\n"),
+						progname, blockno, fn, strerror(errno));
+				exit(1);
+			}
 		}
 	}
 
 	if (verbose)
-		fprintf(stderr,
-				_("%s: checksums verified in file \"%s\"\n"), progname, fn);
+	{
+		if (mode == PG_MODE_CHECK)
+			fprintf(stderr,
+					_("%s: checksums verified in file \"%s\"\n"), progname, fn);
+		if (mode == PG_MODE_ENABLE)
+			fprintf(stderr,
+					_("%s: checksums enabled in file \"%s\"\n"), progname, fn);
+	}
 
 	close(f);
 }
@@ -234,7 +293,10 @@ int
 main(int argc, char *argv[])
 {
 	static struct option long_options[] = {
+		{"check", no_argument, NULL, 'c'},
 		{"pgdata", required_argument, NULL, 'D'},
+		{"disable", no_argument, NULL, 'd'},
+		{"enable", no_argument, NULL, 'e'},
 		{"verbose", no_argument, NULL, 'v'},
 		{NULL, 0, NULL, 0}
 	};
@@ -262,10 +324,19 @@ main(int argc, char *argv[])
 		}
 	}
 
-	while ((c = getopt_long(argc, argv, "D:r:v", long_options, &option_index)) != -1)
+	while ((c = getopt_long(argc, argv, "cD:der:v", long_options, &option_index)) != -1)
 	{
 		switch (c)
 		{
+			case 'c':
+				mode = PG_MODE_CHECK;
+				break;
+			case 'd':
+				mode = PG_MODE_DISABLE;
+				break;
+			case 'e':
+				mode = PG_MODE_ENABLE;
+				break;
 			case 'v':
 				verbose = true;
 				break;
@@ -312,6 +383,15 @@ main(int argc, char *argv[])
 		exit(1);
 	}
 
+	/* Relfilenode checking only works in --check mode */
+	if (mode != PG_MODE_CHECK && only_relfilenode)
+	{
+		fprintf(stderr, _("%s: relfilenode option only possible with --check\n"), progname);
+		fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
+				progname);
+		exit(1);
+	}
+
 	/* Check if cluster is running */
 	ControlFile = get_controlfile(DataDir, progname, &crc_ok);
 	if (!crc_ok)
@@ -339,29 +419,72 @@ main(int argc, char *argv[])
 	if (ControlFile->state != DB_SHUTDOWNED &&
 		ControlFile->state != DB_SHUTDOWNED_IN_RECOVERY)
 	{
-		fprintf(stderr, _("%s: cluster must be shut down to verify checksums\n"), progname);
+		fprintf(stderr, _("%s: cluster must be shut down\n"), progname);
 		exit(1);
 	}
 
-	if (ControlFile->data_checksum_version == 0)
+	if (ControlFile->data_checksum_version == 0 &&
+		mode == PG_MODE_CHECK)
 	{
 		fprintf(stderr, _("%s: data checksums are not enabled in cluster\n"), progname);
 		exit(1);
 	}
+	if (ControlFile->data_checksum_version == 0 &&
+		mode == PG_MODE_DISABLE)
+	{
+		fprintf(stderr, _("%s: data checksums are already disabled in cluster.\n"), progname);
+		exit(1);
+	}
+	if (ControlFile->data_checksum_version > 0 &&
+		mode == PG_MODE_ENABLE)
+	{
+		fprintf(stderr, _("%s: data checksums are already enabled in cluster.\n"), progname);
+		exit(1);
+	}
+
+	/* Operate on all files if checking or enabling checksums */
+	if (mode == PG_MODE_CHECK || mode == PG_MODE_ENABLE)
+	{
+		scan_directory(DataDir, "global");
+		scan_directory(DataDir, "base");
+		scan_directory(DataDir, "pg_tblspc");
+
+		printf(_("Checksum operation completed\n"));
+		printf(_("Files scanned: %s\n"), psprintf(INT64_FORMAT, files));
+		printf(_("Blocks scanned: %s\n"), psprintf(INT64_FORMAT, blocks));
+		if (mode == PG_MODE_CHECK)
+		{
+			printf(_("Bad checksums: %s\n"), psprintf(INT64_FORMAT, badblocks));
+			printf(_("Data checksum version: %d\n"), ControlFile->data_checksum_version);
+
+			if (badblocks > 0)
+				exit(1);
+		}
+	}
+
+	/*
+	 * Finally make the data durable on disk if enabling or disabling
+	 * checksums. Flush first the data directory for safety, and then update
+	 * the control file to keep the switch consistent.
+	 */
+	if (mode == PG_MODE_ENABLE || mode == PG_MODE_DISABLE)
+	{
+		ControlFile->data_checksum_version =
+			(mode == PG_MODE_ENABLE) ? PG_DATA_CHECKSUM_VERSION : 0;
 
-	/* Scan all files */
-	scan_directory(DataDir, "global");
-	scan_directory(DataDir, "base");
-	scan_directory(DataDir, "pg_tblspc");
+		printf(_("Syncing data directory\n"));
+		fsync_pgdata(DataDir, progname, PG_VERSION_NUM);
 
-	printf(_("Checksum scan completed\n"));
-	printf(_("Data checksum version: %d\n"), ControlFile->data_checksum_version);
-	printf(_("Files scanned: %s\n"), psprintf(INT64_FORMAT, files));
-	printf(_("Blocks scanned: %s\n"), psprintf(INT64_FORMAT, blocks));
-	printf(_("Bad checksums: %s\n"), psprintf(INT64_FORMAT, badblocks));
+		printf(_("Updating control file\n"));
+		update_controlfile(DataDir, progname, ControlFile, true);
 
-	if (badblocks > 0)
-		return 1;
+		if (verbose)
+			printf(_("Data checksum version: %d\n"), ControlFile->data_checksum_version);
+		if (mode == PG_MODE_ENABLE)
+			printf(_("Checksums enabled in cluster\n"));
+		else
+			printf(_("Checksums disabled in cluster\n"));
+	}
 
 	return 0;
 }
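
The read, checksum, seek-back and rewrite loop that `scan_file()` gains in enable mode can be tried outside PostgreSQL with a toy version. This is a sketch under stated assumptions, not the tool's code: `TOY_BLCKSZ`, `toy_checksum` and `toy_scan_file` are hypothetical, the checksum is a simple stand-in for `pg_checksum_page()`, and the checksum is stored in the first two bytes of each block rather than in a real page header.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define TOY_BLCKSZ 512			/* hypothetical block size, not BLCKSZ */

/* Hypothetical stand-in for pg_checksum_page(); skips the 2 stored bytes. */
static uint16_t
toy_checksum(const unsigned char *page, uint32_t blkno)
{
	uint32_t	sum = blkno;

	for (size_t i = 2; i < TOY_BLCKSZ; i++)
		sum = sum * 31 + page[i];
	return (uint16_t) ((sum % 65535) + 1);
}

/*
 * Scan every block of path.  With enable = true, store the checksum in each
 * block and return the number of blocks rewritten; otherwise return the
 * number of blocks whose stored checksum does not match.  Returns -1 on
 * any I/O error or short read.
 */
static long
toy_scan_file(const char *path, bool enable)
{
	unsigned char page[TOY_BLCKSZ];
	long		count = 0;
	uint32_t	blkno = 0;
	int			fd = open(path, enable ? O_RDWR : O_RDONLY);

	if (fd < 0)
		return -1;

	for (;; blkno++)
	{
		ssize_t		r = read(fd, page, TOY_BLCKSZ);
		uint16_t	csum,
					stored;

		if (r == 0)
			break;				/* end of file */
		if (r != TOY_BLCKSZ)
			goto fail;

		csum = toy_checksum(page, blkno);
		memcpy(&stored, page, sizeof(stored));

		if (!enable)
		{
			if (stored != csum)
				count++;		/* bad block */
			continue;
		}

		/* Set checksum, seek back to the block start, write it in place */
		memcpy(page, &csum, sizeof(csum));
		if (lseek(fd, -(off_t) TOY_BLCKSZ, SEEK_CUR) < 0 ||
			write(fd, page, TOY_BLCKSZ) != TOY_BLCKSZ)
			goto fail;
		count++;
	}

	if (enable && fsync(fd) != 0)
		goto fail;
	return (close(fd) == 0) ? count : -1;

fail:
	close(fd);
	return -1;
}
```

Running `toy_scan_file(path, true)` and then `toy_scan_file(path, false)` on the same file should report zero mismatches, since the second pass recomputes exactly what the first pass stored.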
