[Pgcluster-general] Possible bug in lifecheck 1.7.0rc7

Martin Matuska martin at matuska.org
Mon Dec 3 10:50:51 UTC 2007


Version: PGCluster 1.7.0rc7
OS: FreeBSD 7.0-BETA4 amd64

Problem: 
If you restart (stop, then start) any of the pgreplicate servers, they are not registered properly afterwards.
For example, if you restart all pgreplicate servers, replication stops entirely and the replication servers
remain in the DATA_ERR state.
This is not acceptable for a high-availability (HA) setup.

Cause:
Lifecheck keeps trying to send packets on the old socket, which no longer has a counterpart and should already have been closed.

Solution (possible):
If sending on the old socket fails, close it and open a new one (see attached patch).
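
For readers without the attachment, the general idea can be sketched roughly as follows. This is not the attached patch and not PGCluster code: the function names open_peer_socket and send_with_reopen are hypothetical, and the sketch assumes a plain TCP connection to the peer.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef MSG_NOSIGNAL
#define MSG_NOSIGNAL 0   /* fallback for platforms without this flag */
#endif

/* Hypothetical helper: open a fresh TCP connection to the peer.
 * Returns the new descriptor, or -1 on failure. */
static int open_peer_socket(const struct sockaddr_in *peer)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (const struct sockaddr *)peer, sizeof(*peer)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: send buf on *sock. If that fails (or *sock is
 * not open), close the old descriptor, open a new one and retry once.
 * Returns the number of bytes sent, or -1 if the retry also fails. */
static ssize_t send_with_reopen(int *sock, const struct sockaddr_in *peer,
                                const void *buf, size_t len)
{
    ssize_t n = -1;

    if (*sock >= 0)
        n = send(*sock, buf, len, MSG_NOSIGNAL);
    if (n >= 0)
        return n;

    /* The old socket is unusable: drop it and reconnect. */
    if (*sock >= 0)
        close(*sock);
    *sock = open_peer_socket(peer);
    if (*sock < 0)
        return -1;

    return send(*sock, buf, len, MSG_NOSIGNAL);
}
```

Note that a send() on a TCP socket whose peer has gone away may still succeed once because the data is only buffered locally, so a real lifecheck would probably also need to treat a missing reply as a reason to reopen the socket.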


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: replicate.diff
Url: http://pgfoundry.org/pipermail/pgcluster-general/attachments/20071203/6d2ff1ca/attachment.ksh 

