[Pgcluster-general] PGcluster, load balancer problem
Lia Domide
lia.domide at codemart.ro
Wed Jan 16 14:43:14 UTC 2008
Hi everybody,
I am trying to set up a highly available DB solution using
PostgreSQL (8.2.5) and pgCluster (1.7.7rc7).
I use 2 Ubuntu 7.04 (x32) machines, currently running as virtual machines.
Pg1 (192.168.123.31)
Rep1
Lb1
Pg2 (193.168.123.29)
Rep2
Lb2
- I managed to get replication working;
- I checked the /etc/hosts file;
- When one node is recovering from failure (-R), any operations executed on
the other node are correctly replicated;
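
The hosts entries and the database port can be double-checked from each load balancer machine with a small script along these lines. This is only a sketch: the host names pg1 and pg2 and port 5432 come from the setup described here, while the function name check_node and the 2-second timeout are assumptions made up for illustration. It only tests name resolution and a plain TCP connect, not pgCluster itself.

```python
import socket


def check_node(host, port=5432, timeout=2.0):
    """Resolve host, then try a TCP connect.

    Returns (ip_or_None, reachable).
    """
    try:
        # Name lookup goes through the system resolver (/etc/hosts, DNS).
        ip = socket.gethostbyname(host)
    except OSError:
        return None, False
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True
    except OSError:
        # Covers "Connection refused" as well as timeouts.
        return ip, False


# Example use (run on each load balancer machine):
#   for name in ("pg1", "pg2"):
#       print(name, check_node(name))
```

If a name resolves to an unexpected address, or the port is not reachable from the load balancer's machine, that would be worth ruling out before looking at the load balancer logic.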
But the load balancers seem to behave incorrectly (or at least not the way I
expect them to).
- At first all DB nodes are initialized, as the pglb.sts file shows.
- Immediately afterwards, the X in "PGRscan_cluster:X ClusterDB can be used"
decreases by one (X - 1).
- I tried adding 3 DB nodes to the cluster and I have the same
problem: at the beginning "3 ClusterDB nodes can be used" and immediately
after that "2 ClusterDB..", even though all three DB nodes are running.
- In the 3-node scenario, when only the last DB node is up, the
cluster is unreachable, but when either of the first two DB nodes is alive, the
cluster is running.
- In the 2-node scenario, when the first DB node is down, the
cluster is unreachable, even if the second DB node is alive.
Does anyone know why the X in "PGRscan_cluster:X ClusterDB can be used" decreases,
and when the number X is updated?
Must a spare node always be kept for safety reasons (e.g. in a 3-node
cluster only 2 may be used)?
Below are some logs from the load balancers in the 2-node scenario:
On PG2, LB2 log: (PG1 DB node stopped):
2008-01-16 15:30:30 [13087] DEBUG:PGRset_status_on_cluster_tbl():host:pg1
port:5432 max:32 use:0 status1
2008-01-16 15:30:30 [13087] DEBUG:PGRset_status_on_cluster_tbl():host:pg2
port:5432 max:32 use:0 status1
2008-01-16 15:30:30 [13087] DEBUG:init_pglb():Child_Tbl size is[49536]
2008-01-16 15:31:07 [13087] DEBUG:PGRscan_cluster:2 ClusterDB can be used
2008-01-16 15:31:07 [13087] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->1
max->32 use_num->0
2008-01-16 15:31:07 [13087] DEBUG:PGRset_status_on_cluster_tbl():host:pg1
port:5432 max:32 use:1 status2
2008-01-16 15:31:07 [13116] DEBUG:PGRdo_child():I am 13116
2008-01-16 15:31:07 [13116] DEBUG:do_accept():I am 13116 accept fd 6
2008-01-16 15:31:07 [13116] DEBUG:read_startup_packet():Protocol Major: 3
Minor: 0 database: TEST user: postgres
2008-01-16 15:31:07 [13116] ERROR:connect_inet_domain_socket(): connect()
failed: Connection refused
2008-01-16 15:31:07 [13116] DEBUG:PGRset_status_on_cluster_tbl():host:pg1
port:5432 max:32 use:2 status98
2008-01-16 15:31:09 [13087] DEBUG:PGRscan_cluster:1 ClusterDB can be used
2008-01-16 15:31:09 [13087] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->98
max->32 use_num->1
2008-01-16 15:31:09 [13087] DEBUG:PGRscan_cluster:pg2 [5432],useFlag->1
max->32 use_num->0
2008-01-16 15:31:09 [13087] DEBUG:PGRset_status_on_cluster_tbl():host:pg2
port:5432 max:32 use:1 status2
2008-01-16 15:31:09 [13117] DEBUG:PGRdo_child():I am 13117
2008-01-16 15:31:09 [13117] DEBUG:do_accept():I am 13117 accept fd 6
2008-01-16 15:31:09 [13117] DEBUG:read_startup_packet():Protocol Major: 3
Minor: 0 database: TEST user: postgres
2008-01-16 15:31:09 [13117] DEBUG:create_cp():[pg2] [pg2] is same
2008-01-16 15:31:09 [13117] DEBUG:connect_unix_domain_socket():postmaster
Unix domain socket: /tmp/.s.PGSQL.5432
2008-01-16 15:31:09 [13117] DEBUG:connect_unix_domain_socket():connected to
postmaster Unix domain socket: /tmp/.s.PGSQL.5432 fd: 7
### HERE I created a JDBC connection from another host
2008-01-16 15:31:09 [13117] DEBUG:ReadyForQuery(): message length: 5
2008-01-16 15:31:09 [13117] DEBUG:ReadyForQuery(): transaction state: I
2008-01-16 15:31:09 [13117] DEBUG:ProcessFrontendResponse():read kind from
frontend X(58)
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:0 ClusterDB can be used
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->99
max->32 use_num->1
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:0 ClusterDB can be used
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->99
max->32 use_num->1
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:0 ClusterDB can be used
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->99
max->32 use_num->1
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:0 ClusterDB can be used
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->99
max->32 use_num->1
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:0 ClusterDB can be used
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->99
max->32 use_num->1
2008-01-16 15:33:32 [13087] ERROR:PGRload_balance():no cluster available
2008-01-16 15:33:32 [13087] ERROR:load_balance_main():load balance process
failed
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:0 ClusterDB can be used
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->99
max->32 use_num->1
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:0 ClusterDB can be used
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->99
max->32 use_num->1
2008-01-16 15:33:32 [13087] ERROR:PGRload_balance():no cluster available
2008-01-16 15:33:32 [13087] ERROR:load_balance_main():load balance process
failed
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:0 ClusterDB can be used
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->99
max->32 use_num->1
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:0 ClusterDB can be used
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->99
max->32 use_num->1
...................
2008-01-16 15:33:32 [13087] ERROR:PGRload_balance():no cluster available
2008-01-16 15:33:32 [13087] ERROR:load_balance_main():load balance process
failed
2008-01-16 15:33:32 [13087] ERROR:load_balance_main():no cluster available
2008-01-16 15:33:32 [13087] DEBUG:do_accept():I am 13087 accept fd 6
2008-01-16 15:33:32 [13087] DEBUG:read_startup_packet():Protocol Major: 3
Minor: 0 database: TEST user: postgres
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:0 ClusterDB can be used
2008-01-16 15:33:32 [13087] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->99
max->32 use_num->1
On PG1, LB1 log (both PG1 and PG2 DB services were previously started on
5432 port, with postgres user):
2008-01-16 16:13:27 [29688] DEBUG:PGRset_status_on_cluster_tbl():host:pg1
port:5432 max:42 use:0 status1
2008-01-16 16:13:27 [29688] DEBUG:PGRset_status_on_cluster_tbl():host:pg2
port:5432 max:42 use:0 status1
2008-01-16 16:13:27 [29688] DEBUG:init_pglb():Child_Tbl size is[65016]
2008-01-16 16:13:28 [29688] DEBUG:PGRscan_cluster:2 ClusterDB can be used
2008-01-16 16:13:28 [29688] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->1
max->42 use_num->0
2008-01-16 16:13:28 [29688] DEBUG:PGRset_status_on_cluster_tbl():host:pg1
port:5432 max:42 use:1 status2
2008-01-16 16:13:28 [29695] DEBUG:PGRdo_child():I am 29695
2008-01-16 16:13:28 [29695] DEBUG:do_accept():I am 29695 accept fd 6
2008-01-16 16:13:28 [29695] ERROR:pool_read: read failed (Connection reset
by peer)
2008-01-16 16:13:30 [29688] DEBUG:PGRscan_cluster:1 ClusterDB can be used
2008-01-16 16:13:30 [29688] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->2
max->42 use_num->0
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
2008-01-16 16:24:58 [31526] DEBUG:PGRdo_child():I am 31526
2008-01-16 16:24:58 [31526] DEBUG:do_accept():I am 31526 accept fd 6
2008-01-16 16:24:58 [31526] ERROR:pool_read: read failed (Connection reset
by peer)
2008-01-16 16:25:00 [29688] DEBUG:PGRscan_cluster:1 ClusterDB can be used
2008-01-16 16:25:00 [29688] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->2
max->42 use_num->0
2008-01-16 16:25:00 [31531] DEBUG:PGRdo_child():I am 31531
2008-01-16 16:25:00 [31531] DEBUG:do_accept():I am 31531 accept fd 6
2008-01-16 16:25:00 [31531] ERROR:pool_read: read failed (Connection reset
by peer)
2008-01-16 16:25:02 [29688] DEBUG:PGRscan_cluster:1 ClusterDB can be used
2008-01-16 16:25:02 [29688] DEBUG:PGRscan_cluster:pg1 [5432],useFlag->2
max->42 use_num->0
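
To see exactly when the usable count changes, the "PGRscan_cluster:X ClusterDB can be used" lines in logs like the ones above could be summarized with a short script. This is a rough sketch that assumes the line layout shown in these excerpts (timestamp, PID in brackets, then the message); the function name usable_count_changes is made up for illustration.

```python
import re

# Matches lines such as:
#   2008-01-16 15:31:07 [13087] DEBUG:PGRscan_cluster:2 ClusterDB can be used
COUNT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(\d+)\] "
    r"DEBUG:PGRscan_cluster:(\d+) ClusterDB can be used"
)


def usable_count_changes(lines):
    """Return (timestamp, pid, count) each time the reported count changes."""
    changes = []
    last = None
    for line in lines:
        m = COUNT_RE.match(line)
        if not m:
            continue  # skip per-host lines and everything else
        ts, pid, count = m.group(1), int(m.group(2)), int(m.group(3))
        if count != last:
            changes.append((ts, pid, count))
            last = count
    return changes


# Example use, with a hypothetical log file name:
#   with open("pglb.log") as fh:
#       for ts, pid, count in usable_count_changes(fh):
#           print(ts, pid, count)
```

Applied to the LB2 excerpt, this reports the count going from 2 (15:31:07) to 1 (15:31:09) to 0 (15:33:32), which makes it easier to line each drop up with the connection attempt logged just before it.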
Thanks in advance,
Lia Domide.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pglb.conf
Type: application/octet-stream
Size: 3064 bytes
Desc: not available
Url : http://pgfoundry.org/pipermail/pgcluster-general/attachments/20080116/60e24129/attachment-0002.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pgreplicate.conf
Type: application/octet-stream
Size: 4823 bytes
Desc: not available
Url : http://pgfoundry.org/pipermail/pgcluster-general/attachments/20080116/60e24129/attachment-0003.obj