[Pgcluster-general] cluster hangs on dbcreate

Pshem Kowalczyk pshem.k at gmail.com
Tue Jun 19 21:38:17 UTC 2007


Hi,

I'm evaluating pgcluster for one of our projects. I configured a
simple setup - 1 loadbalancer (.115), 1 replication server (.116) and
two data nodes (.117 and .118). Unfortunately no matter what I do - it
hangs on the first command - create database. Initially I thought that
it's because I'm trying to to create the database from within
template1, but then I reinstalled the whole set and tried from command
line - effect was exactly the same - the system hangs.
If I interrupt that it creates a local db on one of the data nodes
which obviously is not the desired effect.
My configuration (almost exact copy of the sample one) is below. I
only had to change ssh to use version 2 and different set of keys
(rsync works fine)
btw - is there any more documentation available then the
INSTALL_PGCLUSTER file? The web page seems to be a bit outdated.
postgresql 8.2.4 with 1.7.0rc7 patch, build from sources on ubuntu
dapper. 32bit architecture.

kind regards
Pshem


pglb.conf:
<Cluster_Server_Info>
    <Host_Name>                 10.23.254.117         </Host_Name>
    <Port>                      5432                    </Port>
    <Max_Connect>               32                      </Max_Connect>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name>         10.23.254.118                 </Host_Name>
    <Port>                      5432                    </Port>
    <Max_Connect>               32                      </Max_Connect>
</Cluster_Server_Info>
<Host_Name>                     10.23.254.115                 </Host_Name>
<Backend_Socket_Dir>            /var/run/postgresql/
</Backend_Socket_Dir>
<Receive_Port>                  5432                            </Receive_Port>
<Recovery_Port>         6001                            </Recovery_Port>
<Max_Cluster_Num>               128
</Max_Cluster_Num>
<Use_Connection_Pooling>        no
</Use_Connection_Pooling>
<LifeCheck_Timeout>             3s
</LifeCheck_Timeout>
<LifeCheck_Interval>            15s
</LifeCheck_Interval>
<Log_File_Info>
        <File_Name>             /tmp/pglb.log   </File_Name>
        <File_Size>             1M              </File_Size>
        <Rotate>                3               </Rotate>
</Log_File_Info>

pgreplicate.conf:
<Cluster_Server_Info>
    <Host_Name>                 10.23.254.117         </Host_Name>
    <Port>                      5432                    </Port>
    <Max_Connect>               32                      </Max_Connect>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name>                 10.23.254.118         </Host_Name>
    <Port>                      5432                    </Port>
    <Max_Connect>               32                      </Max_Connect>
</Cluster_Server_Info>
<LoadBalance_Server_Info>
        <Host_Name>             10.23.254.115                 </Host_Name>
        <Recovery_Port>         6001                            </Recovery_Port>
</LoadBalance_Server_Info>
<Host_Name>                     10.23.254.116         </Host_Name>
<Replication_Port>              8001                    </Replication_Port>
<Recovery_Port>                 8101                    </Recovery_Port>
<RLOG_Port>                     8301                    </RLOG_Port>
<Response_Mode>                 normal                  </Response_Mode>
<Use_Replication_Log>           no                      </Use_Replication_Log>
<Replication_Timeout>           1min                    </Replication_Timeout>
<LifeCheck_Timeout>             3s                      </LifeCheck_Timeout>
<LifeCheck_Interval>            15s                     </LifeCheck_Interval>
<Log_File_Info>
        <File_Name>             /tmp/pgreplicate.log    </File_Name>
        <File_Size>             1M                      </File_Size>
        <Rotate>                3                       </Rotate>
</Log_File_Info>


cluster.conf (1)
<Replicate_Server_Info>
        <Host_Name>             10.23.254.116                 </Host_Name>
        <Port>                  8001                            </Port>
        <Recovery_Port>         8101                            </Recovery_Port>
</Replicate_Server_Info>
<Host_Name>                     10.23.254.117                 </Host_Name>
<Recovery_Port>                 7001                            </Recovery_Port>
<Rsync_Path>                    /usr/bin/rsync                  </Rsync_Path>
<Rsync_Option>                  ssh                             </Rsync_Option>
<Rsync_Compress>                yes
</Rsync_Compress>
<Pg_Dump_Path>                  /usr/bin/pg_dump                </Pg_Dump_Path>
<When_Stand_Alone>              read_only
</When_Stand_Alone>
<Replication_Timeout>           1 min
</Replication_Timeout>
<LifeCheck_Timeout>             3s
</LifeCheck_Timeout>
<LifeCheck_Interval>            11s
</LifeCheck_Interval>


cluster.conf (2)
<Replicate_Server_Info>
        <Host_Name>             10.23.254.116                 </Host_Name>
        <Port>                  8001                            </Port>
        <Recovery_Port>         8101                            </Recovery_Port>
</Replicate_Server_Info>
<Host_Name>                     10.23.254.118                 </Host_Name>
<Recovery_Port>                 7001                            </Recovery_Port>
<Rsync_Path>                    /usr/bin/rsync                  </Rsync_Path>
<Rsync_Option>                  ssh                             </Rsync_Option>
<Rsync_Compress>                yes
</Rsync_Compress>
<Pg_Dump_Path>                  /usr/bin/pg_dump                </Pg_Dump_Path>
<When_Stand_Alone>              read_only
</When_Stand_Alone>
<Replication_Timeout>           1 min
</Replication_Timeout>
<LifeCheck_Timeout>             3s
</LifeCheck_Timeout>
<LifeCheck_Interval>            11s
</LifeCheck_Interval>


and the debug:

postgres at datasql2:~$ pgreplicate -D /etc/postgresql/8.2/main/ -n -v -U postgres
2007-06-20 21:09:39 [18920] DEBUG:PGR_Get_Conf_Data ok
2007-06-20 21:09:39 [18920] DEBUG:LoadBalanceTbl allocate ok
2007-06-20 21:09:39 [18920] DEBUG:PGRget_Conf_Data():CascadeTbl shmget ok
2007-06-20 21:09:39 [18920] DEBUG:PGRget_Conf_Data():CascadeTbl shmat ok
2007-06-20 21:09:39 [18920] DEBUG:PGRget_Conf_Data():CascadeInf shmget ok
2007-06-20 21:09:39 [18920] DEBUG:PGRget_Conf_Data():CascadeInf shmat ok
2007-06-20 21:09:39 [18920] DEBUG:PGRget_Conf_Data():CommitLog shmget ok
2007-06-20 21:09:39 [18920] DEBUG:PGRget_Conf_Data():Commit_Log_Tbl shmat ok
2007-06-20 21:09:39 [18920] DEBUG:PGRget_Conf_Data():RLog Memory Allocation ok
2007-06-20 21:09:39 [18920] DEBUG:registering
(key,value)=(Host_Name,10.23.254.117)
2007-06-20 21:09:39 [18920] DEBUG:registering hostname 10.23.254.117
2007-06-20 21:09:39 [18920] DEBUG:resolved name is 10.23.254.117
2007-06-20 21:09:39 [18920] DEBUG:registering (key,value)=(Port,5432)
2007-06-20 21:09:39 [18920] DEBUG:registering (key,value)=(Max_Connect,32)
2007-06-20 21:09:39 [18920] DEBUG:registering
(key,value)=(Host_Name,10.23.254.118)
2007-06-20 21:09:39 [18920] DEBUG:registering hostname 10.23.254.118
2007-06-20 21:09:39 [18920] DEBUG:resolved name is 10.23.254.118
2007-06-20 21:09:39 [18920] DEBUG:registering (key,value)=(Port,5432)
2007-06-20 21:09:39 [18920] DEBUG:registering (key,value)=(Max_Connect,32)
2007-06-20 21:09:39 [18920] DEBUG:registering
(key,value)=(Host_Name,10.23.254.115)
2007-06-20 21:09:39 [18920] DEBUG:registering (key,value)=(Recovery_Port,6001)
2007-06-20 21:09:39 [18920] DEBUG:registering
(key,value)=(Host_Name,10.23.254.116)
2007-06-20 21:09:39 [18920] DEBUG:registering
(key,value)=(Replication_Port,8001)
2007-06-20 21:09:39 [18920] DEBUG:registering (key,value)=(Recovery_Port,8101)
2007-06-20 21:09:39 [18920] DEBUG:registering (key,value)=(RLOG_Port,8301)
2007-06-20 21:09:39 [18920] DEBUG:registering (key,value)=(Response_Mode,normal)
2007-06-20 21:09:39 [18920] DEBUG:registering
(key,value)=(Use_Replication_Log,no)
2007-06-20 21:09:39 [18920] DEBUG:registering
(key,value)=(Replication_Timeout,1min)
2007-06-20 21:09:39 [18920] DEBUG:registering (key,value)=(LifeCheck_Timeout,3s)
2007-06-20 21:09:39 [18920] DEBUG:registering
(key,value)=(LifeCheck_Interval,15s)
2007-06-20 21:09:39 [18920] DEBUG:registering
(key,value)=(File_Name,/tmp/pgreplicate.log)
2007-06-20 21:09:39 [18920] DEBUG:registering (key,value)=(File_Size,1M)
2007-06-20 21:09:39 [18920] DEBUG:registering (key,value)=(Rotate,3)
2007-06-20 21:09:39 [18920] DEBUG:PGRget_Conf_Data():HostTbl shmget ok
2007-06-20 21:09:39 [18920] DEBUG:PGRget_Conf_Data():HostTbl shmat ok
2007-06-20 21:09:39 [18921] DEBUG:PGRrecovery_main():PGRrecovery_main
bind port 8101
2007-06-20 21:09:39 [18920] DEBUG:replicate_main():entering replicate_main
2007-06-20 21:09:39 [18920] DEBUG:replicate_main() 8001 port bind OK
2007-06-20 21:09:39 [18920] DEBUG:cmdSts=N
2007-06-20 21:09:39 [18920] DEBUG:rlog=0
2007-06-20 21:09:39 [18920] DEBUG:port=0
2007-06-20 21:09:39 [18920] DEBUG:pid=0
2007-06-20 21:09:39 [18920] DEBUG:from_host=10.23.254.116
2007-06-20 21:09:39 [18920] DEBUG:dbName=template1
2007-06-20 21:09:39 [18920] DEBUG:userName=postgres
2007-06-20 21:09:39 [18920] DEBUG:recieve sec=0
2007-06-20 21:09:39 [18920] DEBUG:recieve usec=0
2007-06-20 21:09:39 [18920] DEBUG:query_size=65
2007-06-20 21:09:39 [18920] DEBUG:request_id=0
2007-06-20 21:09:39 [18920] DEBUG:replicate_id=0
2007-06-20 21:09:39 [18920] DEBUG:recovery_status=0
2007-06-20 21:09:39 [18920] DEBUG:query=SELECT
PGR_SYSTEM_COMMAND_FUNCTION(1,'10.23.254.116',8001,8101)
2007-06-20 21:09:39 [18920] DEBUG:PGRis_same_host():target host
2007-06-20 21:09:39 [18920] DEBUG:PGRis_same_host():target host
2007-06-20 21:09:39 [18920] DEBUG:start thread_send_cluster()
2007-06-20 21:09:39 [18920]
DEBUG:send_replicate_packet_to_server():PQexec send :SELECT
PGR_SYSTEM_COMMAND_FUNCTION(1,'10.23.254.116',8001,8101)
2007-06-20 21:09:39 [18920]
DEBUG:send_replicate_packet_to_server():PQexec returns :SYSTEM_COMMAND
2007-06-20 21:09:39 [18920] DEBUG:thread_send_cluster():return value
from send_replicate_packet_to_server() is 0
2007-06-20 21:09:39 [18920] DEBUG:thread_send_cluster():pthread_exit[0]
2007-06-20 21:09:39 [18920] DEBUG:start thread_send_cluster()
2007-06-20 21:09:39 [18920]
DEBUG:send_replicate_packet_to_server():PQexec send :SELECT
PGR_SYSTEM_COMMAND_FUNCTION(1,'10.23.254.116',8001,8101)
2007-06-20 21:09:39 [18920]
DEBUG:send_replicate_packet_to_server():PQexec returns :SYSTEM_COMMAND
2007-06-20 21:09:39 [18920] DEBUG:thread_send_cluster():return value
from send_replicate_packet_to_server() is 0
2007-06-20 21:09:39 [18920] DEBUG:thread_send_cluster():pthread_exit[1]
2007-06-20 21:10:02 [18929] DEBUG:PGRdo_replicate():query :: CREATE
DATABASE testdb1
2007-06-20 21:10:02 [18929] DEBUG:cmdSts=Q
2007-06-20 21:10:02 [18929] DEBUG:cmdType=O
2007-06-20 21:10:02 [18929] DEBUG:rlog=0
2007-06-20 21:10:02 [18929] DEBUG:port=5432
2007-06-20 21:10:02 [18929] DEBUG:pid=27147
2007-06-20 21:10:02 [18929] DEBUG:from_host=0.0.0.0
2007-06-20 21:10:02 [18929] DEBUG:dbName=postgres
2007-06-20 21:10:02 [18929] DEBUG:userName=postgres
2007-06-20 21:10:02 [18929] DEBUG:recieve sec=1182330602
2007-06-20 21:10:02 [18929] DEBUG:recieve usec=349271
2007-06-20 21:10:02 [18929] DEBUG:query_size=23
2007-06-20 21:10:02 [18929] DEBUG:request_id=1
2007-06-20 21:10:02 [18929] DEBUG:replicate_id=0
2007-06-20 21:10:02 [18929] DEBUG:recovery_status=0
2007-06-20 21:10:02 [18929] DEBUG:query=CREATE DATABASE testdb1
2007-06-20 21:10:02 [18929] DEBUG:sem_lock [1] req
2007-06-20 21:10:02 [18929] DEBUG:sem_lock [1] got it
2007-06-20 21:10:02 [18920] DEBUG:start thread_send_cluster()
2007-06-20 21:10:02 [18920]
DEBUG:send_replicate_packet_to_server():sync_command(SELECT
PGR_SYSTEM_COMMAND_FUNCTION(3,1182330602,349271,0,1,2) )
2007-06-20 21:10:02 [18920] DEBUG:start thread_send_cluster()
2007-06-20 21:10:02 [18920]
DEBUG:send_replicate_packet_to_server():sync_command(SELECT
PGR_SYSTEM_COMMAND_FUNCTION(3,1182330602,349271,0,1,2) )
2007-06-20 21:10:02 [18920]
DEBUG:send_replicate_packet_to_server():PQexec send :CREATE DATABASE
testdb1
2007-06-20 21:10:02 [18920]
DEBUG:send_replicate_packet_to_server():PQexec returns :CREATE
DATABASE
2007-06-20 21:10:02 [18920] DEBUG:thread_send_cluster():return value
from send_replicate_packet_to_server() is 0
2007-06-20 21:10:02 [18920] DEBUG:thread_send_cluster():pthread_exit[0]
2007-06-20 21:10:03 [18920]
DEBUG:send_replicate_packet_to_server():PQexec send :CREATE DATABASE
testdb1
2007-06-20 21:10:03 [18920]
DEBUG:send_replicate_packet_to_server():PQexec returns :CREATE
DATABASE
2007-06-20 21:10:03 [18920] DEBUG:thread_send_cluster():return value
from send_replicate_packet_to_server() is 0
2007-06-20 21:10:03 [18920] DEBUG:thread_send_cluster():pthread_exit[1]
2007-06-20 21:10:03 [18929] DEBUG:sem_unlock[1]

and then it hangs.

postgres at datasql1:~$  pglb -D /etc/postgresql/8.2/main/ -n -v
2007-06-20 21:09:49 [8072]
DEBUG:PGRset_status_on_cluster_tbl():host:10.23.254.117 port:5432
max:32 use:0 status1
2007-06-20 21:09:49 [8072]
DEBUG:PGRset_status_on_cluster_tbl():host:10.23.254.118 port:5432
max:32 use:0 status1
2007-06-20 21:09:49 [8072] DEBUG:init_pglb():Child_Tbl size is[49536]
2007-06-20 21:09:57 [8072] DEBUG:PGRscan_cluster:2 ClusterDB can be used
2007-06-20 21:09:57 [8072] DEBUG:PGRscan_cluster:10.23.254.117
[5432],useFlag->1 max->32 use_num->0

2007-06-20 21:09:57 [8072]
DEBUG:PGRset_status_on_cluster_tbl():host:10.23.254.117 port:5432
max:32 use:1 status2
2007-06-20 21:09:57 [8076] DEBUG:PGRdo_child():I am 8076
2007-06-20 21:09:57 [8076] DEBUG:do_accept():I am 8076 accept fd 6
2007-06-20 21:09:57 [8076] DEBUG:read_startup_packet():Protocol Major:
3 Minor: 0 database: postgres user: postgres
2007-06-20 21:09:57 [8076] DEBUG:ParameterStatus():name:
client_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.0: name:
client_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:ParameterStatus():name: DateStyle
value: ISO, DMY
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.0: name:
client_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.1: name:
DateStyle value: ISO, DMY
2007-06-20 21:09:57 [8076] DEBUG:ParameterStatus():name:
integer_datetimes value: on
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.0: name:
client_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.1: name:
DateStyle value: ISO, DMY
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.2: name:
integer_datetimes value: on
2007-06-20 21:09:57 [8076] DEBUG:ParameterStatus():name: is_superuser value: on
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.0: name:
client_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.1: name:
DateStyle value: ISO, DMY
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.2: name:
integer_datetimes value: on
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.3: name:
is_superuser value: on
2007-06-20 21:09:57 [8076] DEBUG:ParameterStatus():name:
server_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.0: name:
client_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.1: name:
DateStyle value: ISO, DMY
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.2: name:
integer_datetimes value: on
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.3: name:
is_superuser value: on
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.4: name:
server_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:ParameterStatus():name:
server_version value: 8.2.4
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.0: name:
client_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.1: name:
DateStyle value: ISO, DMY
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.2: name:
integer_datetimes value: on
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.3: name:
is_superuser value: on
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.4: name:
server_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.5: name:
server_version value: 8.2.4
2007-06-20 21:09:57 [8076] DEBUG:ParameterStatus():name:
session_authorization value: postgres
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.0: name:
client_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.1: name:
DateStyle value: ISO, DMY
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.2: name:
integer_datetimes value: on
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.3: name:
is_superuser value: on
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.4: name:
server_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.5: name:
server_version value: 8.2.4
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.6: name:
session_authorization value: postgres
2007-06-20 21:09:57 [8076] DEBUG:ParameterStatus():name:
standard_conforming_strings value: off
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.0: name:
client_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.1: name:
DateStyle value: ISO, DMY
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.2: name:
integer_datetimes value: on
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.3: name:
is_superuser value: on
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.4: name:
server_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.5: name:
server_version value: 8.2.4
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.6: name:
session_authorization value: postgres
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.7: name:
standard_conforming_strings value: off
2007-06-20 21:09:57 [8076] DEBUG:ParameterStatus():name: TimeZone value: NZ
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.0: name:
client_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.1: name:
DateStyle value: ISO, DMY
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.2: name:
integer_datetimes value: on
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.3: name:
is_superuser value: on
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.4: name:
server_encoding value: UTF8
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.5: name:
server_version value: 8.2.4
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.6: name:
session_authorization value: postgres
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.7: name:
standard_conforming_strings value: off
2007-06-20 21:09:57 [8076] DEBUG:pool_param_debug_print(): No.8: name:
TimeZone value: NZ
2007-06-20 21:09:57 [8076] DEBUG:pool_process_query():read kind from
backend pending data Z len: 5 po: 261
2007-06-20 21:09:57 [8076] DEBUG:ReadyForQuery(): message length: 5
2007-06-20 21:09:57 [8076] DEBUG:ReadyForQuery(): transaction state: I
2007-06-20 21:09:57 [8076] DEBUG:ProcessFrontendResponse():read kind
from frontend Q(51)
2007-06-20 21:09:57 [8076] DEBUG:Query(): CREATE DATABASE testdb1;


More information about the Pgcluster-general mailing list