Description
PostgreSQL running in Docker on Kubernetes infrastructure, with periodic health and readiness checks, experiences regular restarts of the database.
Logs show "server process <PID> exited with exit code 2", followed by a restart and recovery of the database.
This happens with a variety of stable PostgreSQL versions (12, 14, 16) on the same system.
The database is deployed on Kubernetes, and we monitor liveness with an exec probe:
livenessProbe:
  exec:
    command:
      - pg_isready
      - -U
      - dmp-admin
      - -d
      - dmp-entity
  timeoutSeconds: 1
as well as a readiness probe:
readinessProbe:
  exec:
    command:
      - /bin/bash
      - -c
      - pg_isready -U dmp-admin -d dmp-entity && [ ! -f /var/lib/postgresql/backup/pgdump_backup.velero.sql ]
  timeoutSeconds: 1
Investigation shows that the PostgreSQL database itself is fine: none of its own server processes exits with code 2.
Instead, the regular health monitoring process, which includes a /bin/bash -c pg_isready .. call, causes the problem.
The root cause is described here: https://www.cybertec-postgresql.com/en/docker-sudden-death-for-postgresql/ (thanks to Laurenz Albe).
The postgres Docker setup runs the default process (postgres) as the root process of the container, with PID 1.
The main postgres process acts as a health monitor for all the worker processes it spawns.
A sudden exit of a worker process leads to a database restart, which remediates possible shared memory corruption.
However, orphaned processes that postgres did not start are also reparented to the root process (PID 1), which here is our main postgres entrypoint, so their exit looks to postgres like the exit of one of its own workers.
These processes are orphaned when the monitoring environment times out the probe.
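The reparenting described above can be observed with a small experiment. The following is only a sketch: the function name `orphan_new_parent` is made up for illustration, and it assumes a Linux system with `/proc` mounted. It starts a `sleep` whose parent shell exits immediately and then reports which process adopted it.

```shell
#!/bin/sh
# Hypothetical helper (not part of any image): start a short-lived `sleep`
# whose parent shell exits right away, then print the PID of the process
# that adopted it. Linux-only, because it reads /proc/<pid>/stat.
orphan_new_parent() {
    # The inner shell backgrounds `sleep`, prints its PID and exits,
    # leaving `sleep` without its original parent.
    orphan_pid=$(sh -c 'sleep 3 >/dev/null 2>&1 & echo $!')
    # Give the kernel a moment to finish reparenting the orphan.
    sleep 1
    # Field 4 of /proc/<pid>/stat is the parent PID.
    awk '{print $4}' "/proc/${orphan_pid}/stat"
}
```

Run inside a container that has no init process, `orphan_new_parent` would be expected to print the PID of the container's root process. On a desktop system the orphan may instead be adopted by a session manager that acts as a subreaper.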
pg_isready returns exit code 2 when it gets no response to its connection attempt.
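For reference when reading probe results, here is a minimal sketch that maps the documented pg_isready exit statuses to short descriptions. The function name `describe_pg_isready_status` is hypothetical and only used for illustration. The meanings follow the PostgreSQL documentation for pg_isready.

```shell
#!/bin/sh
# Hypothetical helper: translate a pg_isready exit status into text.
describe_pg_isready_status() {
    case "$1" in
        0) echo "accepting connections" ;;
        1) echo "rejecting connections" ;;   # for example during startup
        2) echo "no response" ;;             # the status seen in this issue
        3) echo "no attempt made" ;;         # for example invalid parameters
        *) echo "unknown status: $1" ;;
    esac
}
```

It could be used directly after a manual check, for example `pg_isready -U dmp-admin -d dmp-entity; describe_pg_isready_status $?`.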
The suggested remediation is in the referenced article: start the container using dumb-init.
The example patch we deploy as remediation installs the dumb-init package (using apt) and extends the entrypoint so that dumb-init runs as the main process (PID 1):
FROM postgres:12
RUN apt-get update && apt-get install -y --no-install-recommends dumb-init && rm -rf /var/lib/apt/lists/*
ENTRYPOINT ["/usr/bin/dumb-init", "docker-entrypoint.sh"]
CMD ["postgres"]
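To check which process actually runs as PID 1 after applying a patch like this, the process name can be read from `/proc`. This is a sketch under assumptions: the function name `pid1_command` is made up, and it relies on a Linux `/proc` filesystem being available inside the container.

```shell
#!/bin/sh
# Hypothetical helper: print the command name of the process running as PID 1.
pid1_command() {
    cat /proc/1/comm
}
```

Inside the patched container this should report dumb-init rather than postgres. From outside the pod, the same check could be run with something like `kubectl exec <pod> -- cat /proc/1/comm`, where `<pod>` is a placeholder for the actual pod name.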