If you’ve been digging into the rabbit hole of Docker Compose healthchecks and restart policies, or maybe got a bit confused by my previous posts, you’ve probably wondered: “Can I use restart policies to restart containers when they become unhealthy?”
tldr: Nope, not out of the box. But you can make it work with a bit of application-level logic (or some extra tooling).
restart policies
Currently there are 4 restart policies:
no - the default one. Don’t do anything.
on-failure[:max-retries] - Restart only on failure (a non-zero exit code). Won’t restart the container if the daemon restarts.
always - Always means always, however if you manually stop it, it stays stopped until you restart the daemon or manually start it again.
unless-stopped - Like always, but more stubborn: if you manually stop it, it stays stopped forever (even through daemon restarts).
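For reference, a restart policy lives under each service in your compose file. A minimal sketch (service and image names are hypothetical):

services:
  api:
    image: my-api:latest      # hypothetical image
    restart: unless-stopped   # pick whichever policy fits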
healthchecks
Right off the bat: if you add the following HEALTHCHECK instruction to your Dockerfile:
HEALTHCHECK --interval=30s --timeout=10s --retries=3 --start-period=30s \
CMD ["/bin/sh", "-c", "curl -f http://localhost:3001/health || exit 1"]
it won’t work the way you might hope. Don’t get confused: exit 1 only makes the HEALTHCHECK command (curl) fail, not the container’s PID 1. Docker just marks the container as unhealthy and leaves it running.
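You can verify that from the host (container name is hypothetical):

docker inspect --format '{{.State.Health.Status}}' my-api

It’ll report unhealthy while the process keeps running. So, now you’ve got a few options.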
solutions
option 1: Application-Level Exit Strategy
Make your health endpoint actually do something about being unhealthy:
app.get('/health', (req, res) => {
  if (!isHealthy()) {
    // isHealthy() stands in for whatever app-level check you already run
    process.exit(1); // this kills PID 1, so the container actually stops
  }
  res.status(200).send('OK');
});
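Keep in mind that process.exit(1) only gets you a stopped container; a restart policy is what brings it back. A minimal compose sketch (service name hypothetical):

services:
  api:
    build: .
    restart: on-failure   # the app exits with code 1, Docker starts a fresh container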
option 2: External Tools
Use autoheal (check my previous posts).
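If you skipped those: autoheal watches the Docker socket and restarts containers whose healthcheck reports unhealthy. A sketch using the popular willfarrell/autoheal image - the label and socket mount are per its README, so double-check there:

services:
  api:
    build: .
    labels:
      - autoheal=true   # opt this container in to auto-restarts
  autoheal:
    image: willfarrell/autoheal
    restart: always
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock   # needed so it can restart other containers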
option 3: Container Orchestration
Docker Swarm mode (the OG standalone Docker Swarm is dead - long story short, the Mirantis acquisition happened a few years back and Swarm mode now has LTS for another 5 years) and, of course, Kubernetes :)
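Swarm is the closer fit here, since the scheduler treats an unhealthy container as a failed task and replaces it without extra tooling. A minimal stack-file sketch (image name hypothetical), deployed with docker stack deploy:

services:
  api:
    image: my-api:latest
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:3001/health || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
    deploy:
      replicas: 2
      restart_policy:
        condition: on-failure   # failed (including unhealthy) tasks get rescheduled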
bottom line
Restart policies and healthchecks are nice and all, but they’re only half the battle. You still need proper container replacement logic in place.
Shameless plug alert: Still monitoring your stuff manually? Check out justanotheruptime.com and yeah that’s all.