[Pgcluster-general] pglb stops working when at least one postmaster dies, why?

Holger Lehmann Holger.Lehmann at catworkx.de
Wed Oct 24 10:00:35 UTC 2007


Hi all,

I have evaluated the current pgcluster release (1.7.0rc7) under Linux
(x86_64).
The setup looks like this:
machine 1: pgreplicate & pglb
machine 2: postmaster (configured as the first machine in pglb and
pgreplicate)
machine 3: postmaster

When I start a pgcbench -i repeatedly in a loop (20 times) everything
works fine. I can see connections to machine 2 and 3, so the pglb
process really does loadbalance the requests.
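
For anyone who wants to repeat the test, the loop can be sketched as a
small Python helper. This is only an illustration: the function name
run_repeatedly is made up here, and the command line is passed in by the
caller, so no particular pgcbench options are assumed beyond the -i
mentioned above.

```python
import subprocess
import sys


def run_repeatedly(cmd, runs=20):
    """Run `cmd` (a list of arguments) `runs` times; return all exit codes."""
    codes = []
    for i in range(runs):
        # subprocess.run waits for the command to finish before returning.
        result = subprocess.run(cmd)
        codes.append(result.returncode)
        print(f"run {i + 1}: exit code {result.returncode}")
    return codes


if __name__ == "__main__" and len(sys.argv) > 1:
    # The command to repeat is taken from the script's own arguments.
    exit_codes = run_repeatedly(sys.argv[1:])
    failed = sum(1 for code in exit_codes if code != 0)
    print(f"{failed} of {len(exit_codes)} runs failed")
```

Saved under a name of your choice, it is called with the benchmark
command as its arguments, i.e. pgcbench -i followed by whatever
connection options point it at pglb. Stopping a postmaster while it runs
shows which runs start failing.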

Now, if I shut down the postmaster on machine 2 (via pg_ctl stop) during
(!) a pgcbench run, something weird (bad) happens:
1. the current pgcbench run is interrupted (as expected); no transfer
happens to machine 3 (which might work eventually)
2. the next pgcbench run runs against machine 3 (ok, fine)
3. all subsequent pgcbench runs fail, with pglb telling me that the
cluster is down (which is not true, since only one postmaster "died")
4. pglb never recovers from this state; the only way to recover is to
stop pglb, comment out the "dead" postmaster machine and restart pglb

Since pglb is supposed to be derived work from pgpool (version 1), I
simply added an installation of pgpool to machine 1 and configured it to
loadbalance accordingly. Here are the astonishing results when the
postmaster on machine 2 is shut down during a pgcbench run:
1. the current pgcbench run is interrupted (as expected); no transfer
happens to machine 3 (which might work eventually)
2. the next pgcbench run runs against machine 3 (ok, fine)
3. all subsequent pgcbench runs run against machine 3 (fantastic)
4. pgpool does exactly what I expected from a loadbalancer

Furthermore, pgcluster is described as follows (see
http://pgcluster.projects.postgresql.org/feature.html for details):
"""
PGCluster has two functions.
- A load sharing function
 - The session load of a reference request is distributed. It is
effective at the Web application with which a reference request pours
in.
 - A replication object can be specified per table. When the tables
which receive an updating request and a reference request differ, the
PGCluster can distribute the table which receives an updating request
and can reproduce only the table which receives a reference request.
- A high availability function
 - When failure occurs in Cluster DB, a load balancer and a replication
server separate Failure DB from a system, and continue service using the
remaining DB. Since separation of Failure DB and continuation of service
are performed simultaneously, most service stop time is made to 0.
 - The Cluster DB which repair finished can be dynamically restored to a
system, without stopping service.
 - Data is automatically copied to DB restored or added from other DB.
The query which received during restoration is executed from the
replication server after restoration.
"""
Is the latter, "- When failure occurs in Cluster DB, a load balancer and
a replication server separate Failure DB from a system, and continue
service using the remaining DB. Since separation of Failure DB and
continuation of service are performed simultaneously, most service stop
time is made to 0." not true anymore, or is it simply a "bug" in the
current implementation?
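
To make the expected behaviour concrete, here is a minimal Python sketch
of "separate the failed DB and continue service using the remaining DB".
It is not taken from pglb or pgpool; the class Balancer, its pick method
and the fake_connect helper are hypothetical names used only for this
illustration, and a real balancer would also need health checks and a
way to re-attach a repaired node.

```python
class Balancer:
    """Round-robin balancer that drops a backend once a connect fails."""

    def __init__(self, backends):
        self.alive = list(backends)
        self.next_index = 0

    def pick(self, connect):
        """Return (backend, connection) for the next usable backend.

        `connect` is any callable that raises ConnectionError when the
        backend it is given cannot be reached.
        """
        while self.alive:
            backend = self.alive[self.next_index % len(self.alive)]
            try:
                conn = connect(backend)
            except ConnectionError:
                # Separate the failed DB from the system, then retry
                # with whatever is left instead of giving up.
                self.alive.remove(backend)
                continue
            self.next_index += 1
            return backend, conn
        # Only reached when every backend has failed.
        raise RuntimeError("cluster is down: no backend left")


# Simulated failure: machine2's postmaster has been stopped.
dead = {"machine2"}


def fake_connect(host):
    if host in dead:
        raise ConnectionError(host)
    return f"connection to {host}"


lb = Balancer(["machine2", "machine3"])
served_by = [lb.pick(fake_connect)[0] for _ in range(3)]
print(served_by)
```

In this sketch the "cluster is down" error is raised only when the list
of live backends is empty, which is the behaviour seen with pgpool in
the test above; pglb reports it while machine 3 is still reachable.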

Has anyone attempted this feat and succeeded ? Versions anyone ?

I am not afraid to recompile the source code with patches or bugfixes :-)
Thanks a lot in advance :-)

Regards,
Holger

PS: Please ignore the automatically attached footer ...

--
This e-mail and any attachments is confidential and solely intended for
the indicated addressee. If you are not the intended recipient or an
authorized person, please note, that any form of notice, disclosure,
reproduction or circulation of the contents of this mail is prohibited.
In this case, please immediately inform the sender of the e-mail an
destroy this e-mail. We use updated antivirus protection software. We do
not accept any responsibility for damages caused anyhow by viruses.

catWorkX GmbH: Registered office in Hamburg, HRB: 71494, VAT ID:
DE201625856, Managing directors: Dipl. Kfm. Andreas Girnuweit, Dipl.-Ing. Oliver
Groht, Dr. Wolfgang Tank

