Commit cb0cca1

Fix recovery of 2PC transaction during crash recovery
A crash in the middle of a checkpoint with some two-phase state data already flushed to disk by this checkpoint could cause a follow-up crash recovery to recover the same transaction twice, once from what has been found in pg_twophase/ at the beginning of recovery and a second time when replaying its corresponding record.

This would lead to FATAL failures in the startup process during recovery, where the same transaction would have a state recovered twice instead of once:

    LOG:  recovering prepared transaction 731 from shared memory
    LOG:  recovering prepared transaction 731 from shared memory
    FATAL:  lock ExclusiveLock on object 731/0/0 is already held

This issue is fixed by skipping the addition of any 2PC state coming from a record whose equivalent 2PC state file has already been loaded in TwoPhaseState at the beginning of recovery by restoreTwoPhaseData(), which is OK as long as the system has not reached a consistent state.

The timing needed to hit this broken recovery processing is very racy, and it is very unlikely to happen in practice. The thread that reported the issue demonstrated the bug using injection points to force a PANIC in the middle of a checkpoint.

Issue introduced in 728bd99, so backpatch all the way down.

Reported-by: "suyu.cmj" <mengjuan.cmj@alibaba-inc.com>
Author: "suyu.cmj" <mengjuan.cmj@alibaba-inc.com>
Author: Michael Paquier
Discussion: https://postgr.es/m/109e6994-b971-48cb-84f6-829646f18b4c.mengjuan.cmj@alibaba-inc.com
Backpatch-through: 11

1 parent 8fab4b3 commit cb0cca1

1 file changed

+33
-0
lines changed

src/backend/access/transam/twophase.c

Lines changed: 33 additions & 0 deletions
@@ -86,6 +86,7 @@
 #include "access/xlog.h"
 #include "access/xloginsert.h"
 #include "access/xlogreader.h"
+#include "access/xlogrecovery.h"
 #include "access/xlogutils.h"
 #include "catalog/pg_type.h"
 #include "catalog/storage.h"
@@ -2477,6 +2478,38 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn,
 	 * that it got added in the redo phase
 	 */
 
+	/*
+	 * In the event of a crash while a checkpoint was running, it may be
+	 * possible that some two-phase data found its way to disk while its
+	 * corresponding record needs to be replayed in the follow-up recovery.
+	 * As the 2PC data was on disk, it has already been restored at the
+	 * beginning of recovery with restoreTwoPhaseData(), so skip this record
+	 * to avoid duplicates in TwoPhaseState.  If a consistent state has been
+	 * reached, the record is added to TwoPhaseState and it should have no
+	 * corresponding file in pg_twophase.
+	 */
+	if (!XLogRecPtrIsInvalid(start_lsn))
+	{
+		char		path[MAXPGPATH];
+
+		TwoPhaseFilePath(path, hdr->xid);
+
+		if (access(path, F_OK) == 0)
+		{
+			ereport(reachedConsistency ? ERROR : WARNING,
+					(errmsg("could not recover two-phase state file for transaction %u",
+							hdr->xid),
+					 errdetail("Two-phase state file has been found in WAL record %X/%X, but this transaction has already been restored from disk.",
+							   LSN_FORMAT_ARGS(start_lsn))));
+			return;
+		}
+
+		if (errno != ENOENT)
+			ereport(ERROR,
+					(errcode_for_file_access(),
+					 errmsg("could not access file \"%s\": %m", path)));
+	}
+
 	/* Get a free gxact from the freelist */
 	if (TwoPhaseState->freeGXacts == NULL)
 		ereport(ERROR,
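
The added hunk relies on a common POSIX idiom: probe for a file with access(), treat ENOENT as "not there", and treat any other errno as a real failure. The following is a minimal standalone sketch of that idiom only. It is not PostgreSQL code: the ProbeResult enum and the probe_state_file() helper are hypothetical names introduced for illustration, and the sketch does not reproduce ereport(), reachedConsistency, or TwoPhaseFilePath().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not part of PostgreSQL. */
typedef enum
{
	PROBE_ABSENT,		/* no file on disk: state may be added from the record */
	PROBE_PRESENT,		/* file exists: state was already restored, so skip */
	PROBE_FAILED		/* access() failed for a reason other than ENOENT */
} ProbeResult;

/*
 * Classify the outcome of an existence check on "path".
 *
 * Mirrors the three-way split in the hunk above: success means the file
 * exists, ENOENT means it is simply missing, and anything else is an
 * unexpected file-access error that the caller should report.
 */
static ProbeResult
probe_state_file(const char *path)
{
	assert(path != NULL);

	if (access(path, F_OK) == 0)
		return PROBE_PRESENT;

	/* errno is only meaningful here because access() just failed */
	if (errno != ENOENT)
		return PROBE_FAILED;

	return PROBE_ABSENT;
}
```

In terms of the patch, PROBE_PRESENT corresponds to the branch that reports and returns early, PROBE_FAILED to the "could not access file" error, and PROBE_ABSENT to falling through to the existing code that takes a gxact from the freelist.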
