[Pgcluster-general] PGCluster 1.7 seems to be stuck in deadlocks
Hari G
hari_g at hotmail.com
Thu Dec 27 21:58:50 UTC 2007
Hello All -
(This is cross-posted to the pgcluster-general and geoserver-users lists.)
For the last few days I have been trying my best to implement a PGCluster solution. The clients of this cluster are Java applications (mainly GeoServer) that also use PostGIS. This setup works fine with a single PostgreSQL server, but when I use the cluster everything goes wrong. Many database connections hang, and sometimes I see a lot of "idle in transaction" threads. I have two physical servers, each running a replicator, a load balancer and a cluster DB.
I am not doing any updates at this time, but when I run "select * from pg_locks;" I see 160 rows:
relation | 18577 | 19058 | | | | | | | 654004 | 12119 | AccessShareLock | t
relation | 18577 | 19020 | | | | | | | 653938 | 12001 | AccessShareLock | t
relation | 18577 | 18588 | | | | | | | 653958 | 12096 | AccessShareLock | t
relation | 18577 | 19020 | | | | | | | 653994 | 12113 | AccessShareLock | t
relation | 18577 | 18654 | | | | | | | 653980 | 12106 | AccessShareLock | t
relation | 18577 | 19020 | | | | | | | 654191 | 12481 | AccessShareLock | t
relation | 18577 | 19058 | | | | | | | 654180 | 12504 | AccessShareLock | t
relation | 18577 | 19020 | | | | | | | 654375 | 12900 | AccessShareLock | t
relation | 18577 | 18588 | | | | | | | 654191 | 12481 | AccessShareLock | t
relation | 18577 | 18668 | | | | | | | 654540 | 13330 | AccessShareLock | t
relation | 18577 | 18588 | | | | | | | 654375 | 12900 | AccessShareLock | t
relation | 18577 | 18588 | | | | | | | 653938 | 12001 | AccessShareLock | t
transactionid | | | | | 654367 | | | | 654367 | 12885 | ExclusiveLock | t
relation | 18577 | 18588 | | | | | | | 653994 | 12113 | AccessShareLock | t
transactionid | | | | | 654375 | | | | 654375 | 12900 | ExclusiveLock | t
relation | 18577 | 19020 | | | | | | | 653958 | 12096 | AccessShareLock | t
relation | 18577 | 19059 | | | | | | | 653980 | 12106 | AccessShareLock | t
relation | 18577 | 19062 | | | | | | | 654540 | 13330 | AccessShareLock | t
transactionid | | | | | 654362 | | | | 654362 | 12877 | ExclusiveLock | t
transactionid | | | | | 654189 | | | | 654189 | 12477 | ExclusiveLock | t
relation | 18577 | 19059 | | | | | | | 654543 | 13298 | AccessShareLock | t
relation | 18577 | 18588 | | | | | | | 654190 | 12479 | AccessShareLock | t
relation | 18577 | 19058 | | | | | | | 654002 | 12118 | AccessShareLock | t
transactionid | | | | | 653976 | | | | 653976 | 12104 | ExclusiveLock | t
transactionid | | | | | 654004 | | | | 654004 | 12119 | ExclusiveLock | t
relation | 18577 | 19020 | | | | | | | 653982 | 12107 | AccessShareLock | t
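A minimal sketch for summarizing pg_locks output like the rows above. It assumes the 13 fields per row are, in order, locktype, database, relation, page, tuple, transactionid, classid, objid, objsubid, transaction, pid, mode and granted (this ordering is an assumption inferred from the pasted rows, not confirmed here). The helper names parse_lock_rows and summarize are hypothetical. The idea is simply to count rows whose last field is "f", since those are lock requests that are still waiting; in the excerpt shown, every row ends in "t".

```python
from collections import Counter

# Assumed column order; inferred from the 13 fields per pasted row.
COLUMNS = ["locktype", "database", "relation", "page", "tuple",
           "transactionid", "classid", "objid", "objsubid",
           "transaction", "pid", "mode", "granted"]


def parse_lock_rows(text):
    """Parse '|'-separated pg_locks rows (as printed by psql) into dicts."""
    rows = []
    for line in text.strip().splitlines():
        fields = [f.strip() for f in line.split("|")]
        # Keep only data rows: 13 fields, last one a boolean t/f.
        if len(fields) != len(COLUMNS) or fields[-1] not in ("t", "f"):
            continue
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


def summarize(rows):
    """Return (rows still waiting for a lock, count of rows per lock mode)."""
    waiting = [r for r in rows if r["granted"] == "f"]
    by_mode = Counter(r["mode"] for r in rows)
    return waiting, by_mode


# Three rows copied from the output above.
SAMPLE = """
relation | 18577 | 19058 | | | | | | | 654004 | 12119 | AccessShareLock | t
transactionid | | | | | 654367 | | | | 654367 | 12885 | ExclusiveLock | t
relation | 18577 | 19020 | | | | | | | 653938 | 12001 | AccessShareLock | t
"""

if __name__ == "__main__":
    waiting, by_mode = summarize(parse_lock_rows(SAMPLE))
    print("waiting lock requests:", len(waiting))
    print("locks by mode:", dict(by_mode))
```

If the count of waiting requests is zero across all 160 rows, the hung connections are probably not blocked on these locks, and the "idle in transaction" sessions would be the next thing to look at.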
Does this look like a deadlock to you? I am also copying my conf files:
#------------------------------------------------------------
# file: cluster.conf (cluster_2)
#------------------------------------------------------------
# set Replication Server information
#------------------------------------------------------------
<Replicate_Server_Info>
    <Host_Name> rep_1 </Host_Name>
    <Port> 8001 </Port>
    <Recovery_Port> 8101 </Recovery_Port>
</Replicate_Server_Info>
<Replicate_Server_Info>
    <Host_Name> rep_2 </Host_Name>
    <Port> 8001 </Port>
    <Recovery_Port> 8101 </Recovery_Port>
</Replicate_Server_Info>
#------------------------------------------------------------
# set Cluster DB Server information
#------------------------------------------------------------
<Recovery_Port> 7001 </Recovery_Port>
<Rsync_Path> /usr/bin/rsync </Rsync_Path>
<Rsync_Option>ssh</Rsync_Option>
<Rsync_Compress> yes </Rsync_Compress>
<Pg_Dump_Path> /usr/local/pgsql/bin/pg_dump </Pg_Dump_Path>

#------------------------------------------------------------
# file: pglb.conf
#------------------------------------------------------------
# set cluster DB server information
#------------------------------------------------------------
<Cluster_Server_Info>
    <Host_Name> cluster_1 </Host_Name>
    <Port> 5433 </Port>
    <Max_Connect>100</Max_Connect>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name> cluster_2 </Host_Name>
    <Port> 5433 </Port>
    <Max_Connect>100</Max_Connect>
</Cluster_Server_Info>
#------------------------------------------------------------
# set Load Balance server information
#------------------------------------------------------------
<Backend_Socket_Dir> /tmp </Backend_Socket_Dir>
<Receive_Port> 5432 </Receive_Port>
<Recovery_Port> 6101 </Recovery_Port>
<Max_Cluster_Num> 128 </Max_Cluster_Num>
<Use_Connection_Pooling> no </Use_Connection_Pooling>
<LifeCheck_Timeout> 3s </LifeCheck_Timeout>
<LifeCheck_Interval> 15s </LifeCheck_Interval>
#------------------------------------------------------------
# A setup of a log files
#------------------------------------------------------------
<Log_File_Info>
    <File_Name> /var/log/postgresql/pglb.log </File_Name>
    <File_Size> 1M </File_Size>
    <Rotate> 3 </Rotate>
</Log_File_Info>

#------------------------------------------------------------
# file: pgreplicate.conf
#------------------------------------------------------------
# A setup of Cluster DB(s)
#------------------------------------------------------------
<Cluster_Server_Info>
    <Host_Name> cluster_1 </Host_Name>
    <Port> 5433 </Port>
    <Recovery_Port> 7001 </Recovery_Port>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name> cluster_2 </Host_Name>
    <Port> 5433 </Port>
    <Recovery_Port> 7001 </Recovery_Port>
</Cluster_Server_Info>
#------------------------------------------------------------
# A setup of a replication server
#------------------------------------------------------------
<Replication_Port>8001</Replication_Port>
<Recovery_Port>8101</Recovery_Port>
<RLOG_Port>8301</RLOG_Port>
<Response_Mode>normal</Response_Mode>
<Use_Replication_Log>no</Use_Replication_Log>
<Replication_Timeout>1min</Replication_Timeout>
<LifeCheck_Timeout>3s</LifeCheck_Timeout>
<LifeCheck_Interval>15s</LifeCheck_Interval>
<Status_Log_File> /tmp/pgreplicate.sts </Status_Log_File>
<Error_Log_File> /tmp/pgreplicate.log </Error_Log_File>
#------------------------------------------------------------
# A setup of a log files
#------------------------------------------------------------
<Log_File_Info>
    <File_Name> /var/log/postgresql/pgreplicate.log </File_Name>
    <File_Size> 1M </File_Size>
    <Rotate> 3 </Rotate>
</Log_File_Info>
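The configs above list the cluster DB servers in both pglb.conf and pgreplicate.conf. A rough sketch for checking that the two lists agree on host names and ports is given here; the helper name server_entries is hypothetical, the tag names are taken from the files above, and the regex parsing is only an approximation of the tag format, not PGCluster's own parser.

```python
import re


def server_entries(conf_text, block="Cluster_Server_Info"):
    """Return (host, port) pairs from each <block>...</block> section."""
    entries = []
    pattern = r"<{0}>(.*?)</{0}>".format(block)
    for body in re.findall(pattern, conf_text, flags=re.S):
        host = re.search(r"<Host_Name>\s*(\S+)\s*</Host_Name>", body)
        port = re.search(r"<Port>\s*(\d+)\s*</Port>", body)
        if host and port:
            entries.append((host.group(1), int(port.group(1))))
    return entries


# Excerpts of the two files, copied from the configs above.
PGLB_CONF = """
<Cluster_Server_Info> <Host_Name> cluster_1 </Host_Name> <Port> 5433 </Port>
<Max_Connect>100</Max_Connect> </Cluster_Server_Info>
<Cluster_Server_Info> <Host_Name> cluster_2 </Host_Name> <Port> 5433 </Port>
<Max_Connect>100</Max_Connect> </Cluster_Server_Info>
"""

PGREPLICATE_CONF = """
<Cluster_Server_Info> <Host_Name> cluster_1 </Host_Name> <Port> 5433 </Port>
<Recovery_Port> 7001 </Recovery_Port></Cluster_Server_Info>
<Cluster_Server_Info> <Host_Name> cluster_2 </Host_Name> <Port> 5433 </Port>
<Recovery_Port> 7001 </Recovery_Port></Cluster_Server_Info>
"""

if __name__ == "__main__":
    lb = set(server_entries(PGLB_CONF))
    rep = set(server_entries(PGREPLICATE_CONF))
    print("only in pglb.conf:", sorted(lb - rep))
    print("only in pgreplicate.conf:", sorted(rep - lb))
```

Both difference sets being empty only shows that the two files name the same servers; it says nothing about whether the hosts resolve or the ports are reachable.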
I also tried the following:
1. Removing the load balancer and connecting to the cluster directly.
2. Starting the load balancer and replication servers in debug mode.
In debug mode both the load balancer and the replication servers print a lot of messages, but I saw nothing alarming.
I am very new to Pgcluster and hence any help or pointers are greatly appreciated.
-- Hari Gangadharan