Skip to content

Commit 92c1f91

Browse files
committed
Protect against SnapshotNow race conditions in pg_tablespace scans.
Use of SnapshotNow is known to expose us to race conditions if the tuple(s) being sought could be updated by concurrently-committing transactions. CREATE DATABASE and DROP DATABASE are particularly exposed because they do heavyweight filesystem operations during their scans of pg_tablespace, so that the scans run for a very long time compared to most. Furthermore, the potential consequences of a missed or twice-visited row are nastier than average: * createdb() could fail with a bogus "file already exists" error, or silently fail to copy one or more tablespace's worth of files into the new database. * remove_dbtablespaces() could miss one or more tablespaces, thus failing to free filesystem space for the dropped database. * check_db_file_conflict() could likewise miss a tablespace, leading to an OID conflict that could result in data loss either immediately or in future operations. (This seems of very low probability, though, since a duplicate database OID would be unlikely to start with.) Hence, it seems worth fixing these three places to use MVCC snapshots, even though this will someday be superseded by a generic solution to SnapshotNow race conditions. Back-patch to all active branches. Stephen Frost and Tom Lane
1 parent 801babf commit 92c1f91

File tree

1 file changed

+54
-3
lines changed

1 file changed

+54
-3
lines changed

src/backend/commands/dbcommands.c

Lines changed: 54 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -132,6 +132,7 @@ createdb(const CreatedbStmt *stmt)
132132
int notherbackends;
133133
int npreparedxacts;
134134
createdb_failure_params fparms;
135+
Snapshot snapshot;
135136

136137
/* Extract options from the statement node tree */
137138
foreach(option, stmt->options)
@@ -587,6 +588,29 @@ createdb(const CreatedbStmt *stmt)
587588
*/
588589
RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
589590

591+
/*
592+
* Take an MVCC snapshot to use while scanning through pg_tablespace. For
593+
* safety, register the snapshot (this prevents it from changing if
594+
* something else were to request a snapshot during the loop).
595+
*
596+
* Traversing pg_tablespace with an MVCC snapshot is necessary to provide
597+
* us with a consistent view of the tablespaces that exist. Using
598+
* SnapshotNow here would risk seeing the same tablespace multiple times,
599+
* or worse not seeing a tablespace at all, if its tuple is moved around
600+
* by a concurrent update (eg an ACL change).
601+
*
602+
* Inconsistency of this sort is inherent to all SnapshotNow scans, unless
603+
* some lock is held to prevent concurrent updates of the rows being
604+
* sought. There should be a generic fix for that, but in the meantime
605+
* it's worth fixing this case in particular because we are doing very
606+
* heavyweight operations within the scan, so that the elapsed time for
607+
* the scan is vastly longer than for most other catalog scans. That
608+
* means there's a much wider window for concurrent updates to cause
609+
* trouble here than anywhere else. XXX this code should be changed
610+
* whenever a generic fix is implemented.
611+
*/
612+
snapshot = RegisterSnapshot(GetLatestSnapshot());
613+
590614
/*
591615
* Once we start copying subdirectories, we need to be able to clean 'em
592616
* up if we fail. Use an ENSURE block to make sure this happens. (This
@@ -604,7 +628,7 @@ createdb(const CreatedbStmt *stmt)
604628
* each one to the new database.
605629
*/
606630
rel = heap_open(TableSpaceRelationId, AccessShareLock);
607-
scan = heap_beginscan(rel, SnapshotNow, 0, NULL);
631+
scan = heap_beginscan(rel, snapshot, 0, NULL);
608632
while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
609633
{
610634
Oid srctablespace = HeapTupleGetOid(tuple);
@@ -708,6 +732,9 @@ createdb(const CreatedbStmt *stmt)
708732
}
709733
PG_END_ENSURE_ERROR_CLEANUP(createdb_failure_callback,
710734
PointerGetDatum(&fparms));
735+
736+
/* Free our snapshot */
737+
UnregisterSnapshot(snapshot);
711738
}
712739

713740
/* Error cleanup callback for createdb */
@@ -1692,9 +1719,20 @@ remove_dbtablespaces(Oid db_id)
16921719
Relation rel;
16931720
HeapScanDesc scan;
16941721
HeapTuple tuple;
1722+
Snapshot snapshot;
1723+
1724+
/*
1725+
* As in createdb(), we'd better use an MVCC snapshot here, since this
1726+
* scan can run for a long time. Duplicate visits to tablespaces would be
1727+
* harmless, but missing a tablespace could result in permanently leaked
1728+
* files.
1729+
*
1730+
* XXX change this when a generic fix for SnapshotNow races is implemented
1731+
*/
1732+
snapshot = RegisterSnapshot(GetLatestSnapshot());
16951733

16961734
rel = heap_open(TableSpaceRelationId, AccessShareLock);
1697-
scan = heap_beginscan(rel, SnapshotNow, 0, NULL);
1735+
scan = heap_beginscan(rel, snapshot, 0, NULL);
16981736
while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
16991737
{
17001738
Oid dsttablespace = HeapTupleGetOid(tuple);
@@ -1740,6 +1778,7 @@ remove_dbtablespaces(Oid db_id)
17401778

17411779
heap_endscan(scan);
17421780
heap_close(rel, AccessShareLock);
1781+
UnregisterSnapshot(snapshot);
17431782
}
17441783

17451784
/*
@@ -1761,9 +1800,19 @@ check_db_file_conflict(Oid db_id)
17611800
Relation rel;
17621801
HeapScanDesc scan;
17631802
HeapTuple tuple;
1803+
Snapshot snapshot;
1804+
1805+
/*
1806+
* As in createdb(), we'd better use an MVCC snapshot here; missing a
1807+
* tablespace could result in falsely reporting the OID is unique, with
1808+
* disastrous future consequences per the comment above.
1809+
*
1810+
* XXX change this when a generic fix for SnapshotNow races is implemented
1811+
*/
1812+
snapshot = RegisterSnapshot(GetLatestSnapshot());
17641813

17651814
rel = heap_open(TableSpaceRelationId, AccessShareLock);
1766-
scan = heap_beginscan(rel, SnapshotNow, 0, NULL);
1815+
scan = heap_beginscan(rel, snapshot, 0, NULL);
17671816
while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
17681817
{
17691818
Oid dsttablespace = HeapTupleGetOid(tuple);
@@ -1789,6 +1838,8 @@ check_db_file_conflict(Oid db_id)
17891838

17901839
heap_endscan(scan);
17911840
heap_close(rel, AccessShareLock);
1841+
UnregisterSnapshot(snapshot);
1842+
17921843
return result;
17931844
}
17941845

0 commit comments

Comments
 (0)
pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy