Postgres Replica Lags

Problem

In rare cases, after a node failure or a network-related issue, the health-check output might look like the following:

In this example, rbpostgres-db is lagging behind. Sometimes, atom-db can lag behind as well.

Cause

The Postgres replica has not caught up with the leader instance.

Solution

Run the repair command as displayed in the health-check output.

Verify the Postgres replica status using the health-check command. If the Postgres replica still lags behind, perform the following steps:

From any one of the rbpostgres-db pods, run the patronictl list command to identify which instance has a problem.

Here, as also confirmed by the health-check output, rbpostgres-db-2 is lagging.
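The check in this step can also be scripted. The following is a minimal sketch, not the product's own tooling: it assumes the output comes from patronictl list -f json and that each row carries the keys "Member", "Role", "State", and "Lag in MB". The function name lagging_members and the sample rows are hypothetical and only illustrate the shape of the data.

```python
import json


def lagging_members(patronictl_json: str, max_lag_mb: int = 0) -> list[str]:
    """Return the names of replicas that are not streaming or lag too far.

    Assumes the JSON layout printed by `patronictl list -f json`
    (key names are an assumption; verify against your Patroni version).
    """
    lagging = []
    for row in json.loads(patronictl_json):
        if row.get("Role") == "Leader":
            continue  # the leader has no replication lag to report
        lag = row.get("Lag in MB")
        not_streaming = row.get("State") != "streaming"
        # Lag may be missing or reported as "unknown"; treat that as lagging.
        lag_too_high = not isinstance(lag, (int, float)) or lag > max_lag_mb
        if not_streaming or lag_too_high:
            lagging.append(row["Member"])
    return lagging


# Illustrative sample only; these values are made up, not real output.
sample = json.dumps([
    {"Member": "rbpostgres-db-0", "Role": "Leader", "State": "running"},
    {"Member": "rbpostgres-db-1", "Role": "Replica", "State": "streaming", "Lag in MB": 0},
    {"Member": "rbpostgres-db-2", "Role": "Replica", "State": "running", "Lag in MB": 512},
])
print(lagging_members(sample))
```

With the made-up sample above, only rbpostgres-db-2 is reported, which matches the situation described in this step.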

Log in to the leader instance.

Reinitialize the lagging pod and enter Y when prompted to proceed.
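As a rough sketch of what the reinitialize step might look like when run from outside the pod, the snippet below only builds and prints a command line; it does not execute anything. The pod name rbpostgres-db-0 as leader, the namespace my-namespace, and the Patroni cluster name rbpostgres-db are assumptions for illustration. Your deployment may also need a container flag for kubectl or a configuration file flag for patronictl.

```python
import shlex


def reinit_command(leader_pod: str, lagging_member: str,
                   cluster: str, namespace: str) -> list[str]:
    """Build a `kubectl exec` command that runs `patronictl reinit`.

    `-it` keeps the session interactive so the confirmation prompt
    (answer Y) can be seen and answered.
    """
    return [
        "kubectl", "exec", "-it", leader_pod, "-n", namespace, "--",
        "patronictl", "reinit", cluster, lagging_member,
    ]


# Hypothetical names; replace them with the values from your environment.
cmd = reinit_command(
    leader_pod="rbpostgres-db-0",
    lagging_member="rbpostgres-db-2",
    cluster="rbpostgres-db",
    namespace="my-namespace",
)
print(shlex.join(cmd))
```

Printing the command rather than running it leaves room to review the pod and cluster names before a destructive reinitialization.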

Wait a few minutes, then run the patronictl list command again. How long recovery takes depends on the size of the data in the database. Once the rbpostgres-db-2 pod catches up to the leader instance, it moves to the streaming state.
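The waiting step can be sketched as a small polling loop. This is an illustration under assumptions, not a supported tool: fetch_status is a hypothetical callable that returns the text of patronictl list -f json, and the "Member" and "State" key names are assumed to match your Patroni version. The timeout and interval defaults are arbitrary.

```python
import json
import time
from typing import Callable


def wait_until_streaming(fetch_status: Callable[[], str], member: str,
                         timeout_s: float = 600.0,
                         interval_s: float = 15.0) -> bool:
    """Poll until `member` reports the streaming state or the timeout passes.

    `fetch_status` must return the JSON text of `patronictl list -f json`.
    Returns True when the member is streaming, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        for row in json.loads(fetch_status()):
            if row.get("Member") == member and row.get("State") == "streaming":
                return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Stand-in for a real fetcher: the replica is first being rebuilt,
# then reports streaming. Values are illustrative only.
_states = iter(["creating replica", "streaming"])


def _fake_fetch() -> str:
    return json.dumps([{"Member": "rbpostgres-db-2", "State": next(_states)}])


recovered = wait_until_streaming(_fake_fetch, "rbpostgres-db-2",
                                 timeout_s=5, interval_s=0)
print(recovered)
```

In practice, fetch_status could wrap a subprocess call that runs patronictl list -f json inside one of the rbpostgres-db pods. For large databases, a longer timeout is likely needed.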
