Commit 463a2eb

postmaster: Commonalize FatalError paths
This includes some behavioral changes:

- Previously PM_WAIT_XLOG_ARCHIVAL wasn't handled in HandleFatalError(), which doesn't seem quite right.

- Previously a fatal error in PM_WAIT_XLOG_SHUTDOWN led to jumping back to PM_WAIT_BACKENDS; now we go to PM_WAIT_DEAD_END. Jumping backwards doesn't seem quite right, and we didn't do so when checkpointer failed to fork during a shutdown.

- Previously a checkpointer fork failure didn't call SetQuitSignalReason(), which would lead to quickdie() reporting "terminating connection because of unexpected SIGQUIT signal", which seems even worse than the PMQUIT_FOR_CRASH message. If I saw that in the log I'd suspect somebody outside of postgres sent SIGQUITs.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
1 parent 8edd8c7 commit 463a2eb
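
The state transitions described in the commit message can be sketched as a small standalone model. This is only an illustration, not code from postmaster.c: the enum is a simplified stand-in that lists the states in the order they appear in the new switch statement, and the helper names `old_state_after_fatal_error` and `new_state_after_fatal_error` are hypothetical. The real HandleFatalError() also signals children and reconfigures the wait set, which is omitted here.

```c
/* Simplified stand-in for the postmaster state enum (assumption: same names as in the diff). */
typedef enum
{
    PM_INIT,
    PM_STARTUP,
    PM_RECOVERY,
    PM_HOT_STANDBY,
    PM_RUN,
    PM_STOP_BACKENDS,
    PM_WAIT_BACKENDS,
    PM_WAIT_XLOG_SHUTDOWN,
    PM_WAIT_XLOG_ARCHIVAL,
    PM_WAIT_DEAD_END,
    PM_NO_CHILDREN
} PMState;

/* Hypothetical helper: state chosen on a fatal error before this commit. */
static PMState
old_state_after_fatal_error(PMState s)
{
    switch (s)
    {
        case PM_RECOVERY:
        case PM_HOT_STANDBY:
        case PM_RUN:
        case PM_STOP_BACKENDS:
        case PM_WAIT_XLOG_SHUTDOWN:     /* jumped backwards */
            return PM_WAIT_BACKENDS;
        default:
            return s;                   /* PM_WAIT_XLOG_ARCHIVAL was not handled */
    }
}

/* Hypothetical helper: state chosen on a fatal error after this commit. */
static PMState
new_state_after_fatal_error(PMState s)
{
    switch (s)
    {
        case PM_INIT:
        case PM_STARTUP:
            /* the real code asserts these are unreachable; the model leaves them unchanged */
            return s;
        case PM_RECOVERY:
        case PM_HOT_STANDBY:
        case PM_RUN:
        case PM_STOP_BACKENDS:
            return PM_WAIT_BACKENDS;    /* wait for children to die */
        case PM_WAIT_BACKENDS:
            return s;                   /* there might be more backends to wait for */
        case PM_WAIT_XLOG_SHUTDOWN:
        case PM_WAIT_XLOG_ARCHIVAL:
            return PM_WAIT_DEAD_END;    /* already shutting down: go forward only */
        case PM_WAIT_DEAD_END:
        case PM_NO_CHILDREN:
            return s;
    }
    return s;
}
```

In this model the two functions differ only for PM_WAIT_XLOG_SHUTDOWN and PM_WAIT_XLOG_ARCHIVAL, which matches the first two behavioral changes listed above.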

src/backend/postmaster/postmaster.c

Lines changed: 58 additions & 16 deletions
@@ -2706,13 +2706,50 @@ HandleFatalError(QuitSignalReason reason, bool consider_sigabrt)
 
     FatalError = true;
 
-    /* We now transit into a state of waiting for children to die */
-    if (pmState == PM_RECOVERY ||
-        pmState == PM_HOT_STANDBY ||
-        pmState == PM_RUN ||
-        pmState == PM_STOP_BACKENDS ||
-        pmState == PM_WAIT_XLOG_SHUTDOWN)
-        UpdatePMState(PM_WAIT_BACKENDS);
+    /*
+     * Choose the appropriate new state to react to the fatal error. Unless we
+     * were already in the process of shutting down, we go through
+     * PM_WAIT_BACKEND. For errors during the shutdown sequence, we directly
+     * switch to PM_WAIT_DEAD_END.
+     */
+    switch (pmState)
+    {
+        case PM_INIT:
+            /* shouldn't have any children */
+            Assert(false);
+            break;
+        case PM_STARTUP:
+            /* should have been handled in process_pm_child_exit */
+            Assert(false);
+            break;
+
+            /* wait for children to die */
+        case PM_RECOVERY:
+        case PM_HOT_STANDBY:
+        case PM_RUN:
+        case PM_STOP_BACKENDS:
+            UpdatePMState(PM_WAIT_BACKENDS);
+            break;
+
+        case PM_WAIT_BACKENDS:
+            /* there might be more backends to wait for */
+            break;
+
+        case PM_WAIT_XLOG_SHUTDOWN:
+        case PM_WAIT_XLOG_ARCHIVAL:
+
+            /*
+             * NB: Similar code exists in PostmasterStateMachine()'s handling
+             * of FatalError in PM_STOP_BACKENDS/PM_WAIT_BACKENDS states.
+             */
+            ConfigurePostmasterWaitSet(false);
+            UpdatePMState(PM_WAIT_DEAD_END);
+            break;
+
+        case PM_WAIT_DEAD_END:
+        case PM_NO_CHILDREN:
+            break;
+    }
 
     /*
      * .. and if this doesn't happen quickly enough, now the clock is ticking
@@ -2942,15 +2979,18 @@ PostmasterStateMachine(void)
 {
     /*
      * Stop any dead-end children and stop creating new ones.
+     *
+     * NB: Similar code exists in HandleFatalErrors(), when the
+     * error happens in pmState > PM_WAIT_BACKENDS.
      */
     UpdatePMState(PM_WAIT_DEAD_END);
     ConfigurePostmasterWaitSet(false);
     SignalChildren(SIGQUIT, btmask(B_DEAD_END_BACKEND));
 
     /*
-     * We already SIGQUIT'd walsenders and the archiver, if any,
-     * when we started immediate shutdown or entered FatalError
-     * state.
+     * We already SIGQUIT'd auxiliary processes (other than
+     * logger), if any, when we started immediate shutdown or
+     * entered FatalError state.
      */
 }
 else
@@ -2981,13 +3021,15 @@ PostmasterStateMachine(void)
              * We don't consult send_abort_for_crash here, as it's
              * unlikely that dumping cores would illuminate the reason
              * for checkpointer fork failure.
+             *
+             * XXX: It may be worth to introduce a different PMQUIT
+             * value that signals that the cluster is in a bad state,
+             * without a process having crashed. But right now this
+             * path is very unlikely to be reached, so it isn't
+             * obviously worthwhile adding a distinct error message in
+             * quickdie().
              */
-            FatalError = true;
-            UpdatePMState(PM_WAIT_DEAD_END);
-            ConfigurePostmasterWaitSet(false);
-
-            /* Kill the walsenders and archiver too */
-            SignalChildren(SIGQUIT, btmask_all_except(B_LOGGER));
+            HandleFatalError(PMQUIT_FOR_CRASH, false);
         }
     }
 }
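
The last hunk replaces an open-coded fork-failure path with a call to the common handler. A toy model of why that matters for the quit reason, under stated assumptions: `quit_reason`, `handle_fatal_error`, `old_fork_failure_path` and `new_fork_failure_path` are hypothetical names, the enum is a two-value stand-in (PMQUIT_NOT_SENT is assumed here as the "nothing recorded" default), and the model assumes the common handler records the reason, as the commit message implies. Only the "unexpected SIGQUIT" message text is taken from the commit message; the other string is a placeholder.

```c
#include <stdbool.h>

/* Two-value stand-in for the real enum (assumption, not the full definition). */
typedef enum
{
    PMQUIT_NOT_SENT = 0,        /* assumed default: no reason recorded */
    PMQUIT_FOR_CRASH
} QuitSignalReason;

static QuitSignalReason quit_reason = PMQUIT_NOT_SENT;
static bool fatal_error = false;

/* Model of the common handler: records the reason, then flags the error. */
static void
handle_fatal_error(QuitSignalReason reason)
{
    quit_reason = reason;
    fatal_error = true;
}

/* Model of the old open-coded path: flags the error, never records a reason. */
static void
old_fork_failure_path(void)
{
    fatal_error = true;
}

/* Model of the new path: delegates to the common handler. */
static void
new_fork_failure_path(void)
{
    handle_fatal_error(PMQUIT_FOR_CRASH);
}

/* Model of what a child would report after receiving SIGQUIT. */
static const char *
child_quit_message(void)
{
    switch (quit_reason)
    {
        case PMQUIT_NOT_SENT:
            return "terminating connection because of unexpected SIGQUIT signal";
        case PMQUIT_FOR_CRASH:
            return "(placeholder for the crash-related message)";
    }
    return "";
}
```

With the old path the recorded reason stays at its default, so a child would report the misleading "unexpected SIGQUIT" message. Going through the shared handler avoids that, and it also means any future change to the handler applies to the fork-failure case automatically.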
