[Pgcluster-general] cluster hangs on dbcreate

Pshem Kowalczyk pshem.k at gmail.com
Wed Jun 20 00:06:16 UTC 2007


OK, I believe it might be partly my fault, as I used IP addresses
instead of names. So I added entries to /etc/hosts and replaced
the IPs in the configuration files with names.
But it still hangs. I think it has something to do with the following line:

2007-06-20 23:24:34 [19761] DEBUG:from_host=0.0.0.0

in the replicator's log.

The hosts have only one physical interface each and they are all
connected to the same switch.

PostgreSQL listens on the node's own IP address on both data nodes
(listen_addresses in postgresql.conf). The whole network is trusted
in pg_hba.conf.

Any ideas what might be wrong?


hosts file:
10.23.254.115 loadbalancer
10.23.254.116 replicator
10.23.254.117 data1
10.23.254.118 data2
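
One way to confirm that every node resolves these names the same way is to run a small check on each host. The following is only a minimal sketch, assuming Python is available on the machines; the names EXPECTED and check_hosts are made up for illustration and are not part of PGCluster.

```python
import socket

# Expected name -> address pairs, copied from the hosts file above.
EXPECTED = {
    "loadbalancer": "10.23.254.115",
    "replicator": "10.23.254.116",
    "data1": "10.23.254.117",
    "data2": "10.23.254.118",
}


def check_hosts(expected, resolve=socket.gethostbyname):
    """Return {name: what_we_got} for every name that does not resolve
    to the expected address. An empty dict means everything matches."""
    problems = {}
    for name, want in expected.items():
        try:
            got = resolve(name)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            got = "error: %s" % exc
        if got != want:
            problems[name] = got
    return problems

# Usage on a node (hypothetical): print(check_hosts(EXPECTED))
```

An empty result on all four machines would suggest that name resolution itself is not what produces the 0.0.0.0 address.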


postgres@datasql2:~$ pgreplicate -D /etc/postgresql/8.2/main/ -l -n -v -U postgres
2007-06-20 23:12:46 [19660] DEBUG:PGR_Get_Conf_Data ok
2007-06-20 23:12:46 [19660] DEBUG:LoadBalanceTbl allocate ok
2007-06-20 23:12:46 [19660] DEBUG:PGRget_Conf_Data():CascadeTbl shmget ok
2007-06-20 23:12:46 [19660] DEBUG:PGRget_Conf_Data():CascadeTbl shmat ok
2007-06-20 23:12:46 [19660] DEBUG:PGRget_Conf_Data():CascadeInf shmget ok
2007-06-20 23:12:46 [19660] DEBUG:PGRget_Conf_Data():CascadeInf shmat ok
2007-06-20 23:12:46 [19660] DEBUG:PGRget_Conf_Data():CommitLog shmget ok
2007-06-20 23:12:46 [19660] DEBUG:PGRget_Conf_Data():Commit_Log_Tbl shmat ok
2007-06-20 23:12:46 [19660] DEBUG:PGRget_Conf_Data():RLog Memory Allocation ok
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(Host_Name,data1)
2007-06-20 23:12:46 [19660] DEBUG:registering hostname data1
2007-06-20 23:12:46 [19660] DEBUG:resolved name is 10.23.254.117
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(Port,5432)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(Max_Connect,32)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(Host_Name,data2)
2007-06-20 23:12:46 [19660] DEBUG:registering hostname data2
2007-06-20 23:12:46 [19660] DEBUG:resolved name is 10.23.254.118
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(Port,5432)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(Max_Connect,32)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(Host_Name,loadbalancer)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(Recovery_Port,6001)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(Host_Name,replicator)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(Replication_Port,8001)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(Recovery_Port,8101)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(RLOG_Port,8301)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(Response_Mode,normal)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(Use_Replication_Log,no)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(Replication_Timeout,1min)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(LifeCheck_Timeout,3s)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(LifeCheck_Interval,15s)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(File_Name,/tmp/pgreplicate.log)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(File_Size,1M)
2007-06-20 23:12:46 [19660] DEBUG:registering (key,value)=(Rotate,3)
2007-06-20 23:12:46 [19660] DEBUG:PGRget_Conf_Data():HostTbl shmget ok
2007-06-20 23:12:46 [19660] DEBUG:PGRget_Conf_Data():HostTbl shmat ok
2007-06-20 23:12:46 [19661] DEBUG:PGRrecovery_main():PGRrecovery_main bind port 8101
2007-06-20 23:12:46 [19660] DEBUG:replicate_main():entering replicate_main
2007-06-20 23:12:46 [19660] DEBUG:replicate_main() 8001 port bind OK
2007-06-20 23:12:46 [19660] DEBUG:cmdSts=N
2007-06-20 23:12:46 [19660] DEBUG:rlog=0
2007-06-20 23:12:46 [19660] DEBUG:port=0
2007-06-20 23:12:46 [19660] DEBUG:pid=0
2007-06-20 23:12:46 [19660] DEBUG:from_host=10.23.254.116
2007-06-20 23:12:46 [19660] DEBUG:dbName=template1
2007-06-20 23:12:46 [19660] DEBUG:userName=postgres
2007-06-20 23:12:46 [19660] DEBUG:recieve sec=0
2007-06-20 23:12:46 [19660] DEBUG:recieve usec=0
2007-06-20 23:12:46 [19660] DEBUG:query_size=65
2007-06-20 23:12:46 [19660] DEBUG:request_id=0
2007-06-20 23:12:46 [19660] DEBUG:replicate_id=0
2007-06-20 23:12:46 [19660] DEBUG:recovery_status=0
2007-06-20 23:12:46 [19660] DEBUG:query=SELECT PGR_SYSTEM_COMMAND_FUNCTION(1,'10.23.254.116',8001,8101)
2007-06-20 23:12:46 [19660] DEBUG:PGRis_same_host():target host
2007-06-20 23:12:46 [19660] DEBUG:PGRis_same_host():target host
2007-06-20 23:12:46 [19660] DEBUG:start thread_send_cluster()
2007-06-20 23:12:46 [19660] DEBUG:send_replicate_packet_to_server():PQexec send :SELECT PGR_SYSTEM_COMMAND_FUNCTION(1,'10.23.254.116',8001,8101)
2007-06-20 23:12:46 [19660] DEBUG:send_replicate_packet_to_server():PQexec returns :SYSTEM_COMMAND
2007-06-20 23:12:46 [19660] DEBUG:thread_send_cluster():return value from send_replicate_packet_to_server() is 0
2007-06-20 23:12:46 [19660] DEBUG:thread_send_cluster():pthread_exit[0]
2007-06-20 23:12:46 [19660] DEBUG:start thread_send_cluster()
2007-06-20 23:12:46 [19660] DEBUG:send_replicate_packet_to_server():PQexec send :SELECT PGR_SYSTEM_COMMAND_FUNCTION(1,'10.23.254.116',8001,8101)
2007-06-20 23:12:46 [19660] DEBUG:send_replicate_packet_to_server():PQexec returns :SYSTEM_COMMAND
2007-06-20 23:12:46 [19660] DEBUG:thread_send_cluster():return value from send_replicate_packet_to_server() is 0
2007-06-20 23:12:46 [19660] DEBUG:thread_send_cluster():pthread_exit[1]


2007-06-20 23:24:34 [19761] DEBUG:PGRdo_replicate():query :: CREATE DATABASE testdb8
2007-06-20 23:24:34 [19761] DEBUG:cmdSts=Q
2007-06-20 23:24:34 [19761] DEBUG:cmdType=O
2007-06-20 23:24:34 [19761] DEBUG:rlog=0
2007-06-20 23:24:34 [19761] DEBUG:port=5432
2007-06-20 23:24:34 [19761] DEBUG:pid=27900
2007-06-20 23:24:34 [19761] DEBUG:from_host=0.0.0.0
2007-06-20 23:24:34 [19761] DEBUG:dbName=template1
2007-06-20 23:24:34 [19761] DEBUG:userName=postgres
2007-06-20 23:24:34 [19761] DEBUG:recieve sec=1182338674
2007-06-20 23:24:34 [19761] DEBUG:recieve usec=198669
2007-06-20 23:24:34 [19761] DEBUG:query_size=23
2007-06-20 23:24:34 [19761] DEBUG:request_id=1
2007-06-20 23:24:34 [19761] DEBUG:replicate_id=0
2007-06-20 23:24:34 [19761] DEBUG:recovery_status=0
2007-06-20 23:24:34 [19761] DEBUG:query=CREATE DATABASE testdb8
2007-06-20 23:24:34 [19761] DEBUG:sem_lock [1] req
2007-06-20 23:24:34 [19761] DEBUG:sem_lock [1] got it
2007-06-20 23:24:34 [19660] DEBUG:start thread_send_cluster()
2007-06-20 23:24:34 [19660] DEBUG:send_replicate_packet_to_server():sync_command(SELECT PGR_SYSTEM_COMMAND_FUNCTION(3,1182338674,198669,0,1,2) )
2007-06-20 23:24:34 [19660] DEBUG:send_replicate_packet_to_server():PQexec send :CREATE DATABASE testdb8
2007-06-20 23:24:34 [19660] DEBUG:send_replicate_packet_to_server():PQexec returns :
2007-06-20 23:24:34 [19660] DEBUG:thread_send_cluster():return value from send_replicate_packet_to_server() is 0
2007-06-20 23:24:34 [19660] DEBUG:thread_send_cluster():pthread_exit[0]
2007-06-20 23:24:34 [19660] DEBUG:start thread_send_cluster()
2007-06-20 23:24:34 [19660] DEBUG:send_replicate_packet_to_server():sync_command(SELECT PGR_SYSTEM_COMMAND_FUNCTION(3,1182338674,198669,0,1,2) )
2007-06-20 23:24:34 [19660] DEBUG:send_replicate_packet_to_server():PQexec send :CREATE DATABASE testdb8
2007-06-20 23:24:34 [19660] DEBUG:send_replicate_packet_to_server():PQexec returns :CREATE DATABASE
2007-06-20 23:24:34 [19660] DEBUG:thread_send_cluster():return value from send_replicate_packet_to_server() is 0
2007-06-20 23:24:34 [19660] DEBUG:thread_send_cluster():pthread_exit[1]
2007-06-20 23:24:34 [19761] DEBUG:sem_unlock[1]
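
To see how often the replicator receives a request whose from_host is 0.0.0.0, the debug output can be scanned mechanically. This is just a rough sketch that assumes the output has been saved to a file and that every line keeps the "timestamp [pid] DEBUG:key=value" shape shown above; unset_from_hosts is a hypothetical helper name.

```python
import re

# Matches lines such as:
#   2007-06-20 23:24:34 [19761] DEBUG:from_host=0.0.0.0
FROM_HOST = re.compile(r"^(\S+ \S+) \[(\d+)\] DEBUG:from_host=(\S+)")


def unset_from_hosts(lines):
    """Return (timestamp, pid) for every from_host line that is 0.0.0.0."""
    hits = []
    for line in lines:
        match = FROM_HOST.match(line)
        if match and match.group(3) == "0.0.0.0":
            hits.append((match.group(1), int(match.group(2))))
    return hits

# Usage (the file name is an assumption):
#   with open("replicator-debug.txt") as f:
#       print(unset_from_hosts(f))
```

In the output above this would flag only the CREATE DATABASE request handled by pid 19761; the startup request from pid 19660 carries 10.23.254.116.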



pglb.conf
<Cluster_Server_Info>
    <Host_Name>                 data1                   </Host_Name>
    <Port>                      5432                    </Port>
    <Max_Connect>               32                      </Max_Connect>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name>                 data2                   </Host_Name>
    <Port>                      5432                    </Port>
    <Max_Connect>               32                      </Max_Connect>
</Cluster_Server_Info>
<Host_Name>                     loadbalancer                    </Host_Name>
<Backend_Socket_Dir>            /var/run/postgresql/            </Backend_Socket_Dir>
<Receive_Port>                  5432                            </Receive_Port>
<Recovery_Port>                 6001                            </Recovery_Port>
<Max_Cluster_Num>               128                             </Max_Cluster_Num>
<Use_Connection_Pooling>        no                              </Use_Connection_Pooling>
<LifeCheck_Timeout>             3s                              </LifeCheck_Timeout>
<LifeCheck_Interval>            15s                             </LifeCheck_Interval>
<Log_File_Info>
        <File_Name>             /tmp/pglb.log   </File_Name>
        <File_Size>             1M              </File_Size>
        <Rotate>                3               </Rotate>
</Log_File_Info>

pgreplicate.conf
<Cluster_Server_Info>
    <Host_Name>                 data1                   </Host_Name>
    <Port>                      5432                    </Port>
    <Max_Connect>               32                      </Max_Connect>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name>                 data2                   </Host_Name>
    <Port>                      5432                    </Port>
    <Max_Connect>               32                      </Max_Connect>
</Cluster_Server_Info>
<LoadBalance_Server_Info>
        <Host_Name>             loadbalancer                    </Host_Name>
        <Recovery_Port>         6001                            </Recovery_Port>
</LoadBalance_Server_Info>
<Host_Name>                     replicator              </Host_Name>
<Replication_Port>              8001                    </Replication_Port>
<Recovery_Port>                 8101                    </Recovery_Port>
<RLOG_Port>                     8301                    </RLOG_Port>
<Response_Mode>                 normal                  </Response_Mode>
<Use_Replication_Log>           no                      </Use_Replication_Log>
<Replication_Timeout>           1min                    </Replication_Timeout>
<LifeCheck_Timeout>             3s                      </LifeCheck_Timeout>
<LifeCheck_Interval>            15s                     </LifeCheck_Interval>
<Log_File_Info>
        <File_Name>             /tmp/pgreplicate.log    </File_Name>
        <File_Size>             1M                      </File_Size>
        <Rotate>                3                       </Rotate>
</Log_File_Info>

data1:
<Replicate_Server_Info>
        <Host_Name>             replicator                      </Host_Name>
        <Port>                  8001                            </Port>
        <Recovery_Port>         8101                            </Recovery_Port>
</Replicate_Server_Info>
<Host_Name>                     data1                           </Host_Name>
<Recovery_Port>                 7001                            </Recovery_Port>
<Rsync_Path>                    /usr/bin/rsync                  </Rsync_Path>
<Rsync_Option>                  ssh                             </Rsync_Option>
<Rsync_Compress>                yes                             </Rsync_Compress>
<Pg_Dump_Path>                  /usr/bin/pg_dump                </Pg_Dump_Path>
<When_Stand_Alone>              read_only                       </When_Stand_Alone>
<Replication_Timeout>           1 min                           </Replication_Timeout>
<LifeCheck_Timeout>             3s                              </LifeCheck_Timeout>
<LifeCheck_Interval>            11s                             </LifeCheck_Interval>

data2:
<Replicate_Server_Info>
        <Host_Name>             replicator                      </Host_Name>
        <Port>                  8001                            </Port>
        <Recovery_Port>         8101                            </Recovery_Port>
</Replicate_Server_Info>
<Host_Name>                     data2                           </Host_Name>
<Recovery_Port>                 7001                            </Recovery_Port>
<Rsync_Path>                    /usr/bin/rsync                  </Rsync_Path>
<Rsync_Option>                  ssh                             </Rsync_Option>
<Rsync_Compress>                yes                             </Rsync_Compress>
<Pg_Dump_Path>                  /usr/bin/pg_dump                </Pg_Dump_Path>
<When_Stand_Alone>              read_only                       </When_Stand_Alone>
<Replication_Timeout>           1 min                           </Replication_Timeout>
<LifeCheck_Timeout>             3s                              </LifeCheck_Timeout>
<LifeCheck_Interval>            11s                             </LifeCheck_Interval>
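
Since the IPs were replaced with names by hand in four files, it may be worth checking that every Host_Name used in them is one of the names in the hosts file. A minimal sketch under that assumption follows; host_names and unknown_hosts are illustrative helpers, and the regular expression only understands the simple "<Host_Name> value </Host_Name>" form used in the files above.

```python
import re

# One Host_Name element; the value may be padded with spaces or newlines.
HOST_TAG = re.compile(r"<Host_Name>\s*(\S+)\s*</Host_Name>")


def host_names(conf_text):
    """Return every Host_Name value in a config text, in file order."""
    return HOST_TAG.findall(conf_text)


def unknown_hosts(conf_text, known):
    """Return the Host_Name values that are not in the set of known names."""
    return [name for name in host_names(conf_text) if name not in known]

# Usage (the path is an assumption):
#   known = {"loadbalancer", "replicator", "data1", "data2"}
#   print(unknown_hosts(open("pgreplicate.conf").read(), known))
```

A leftover IP address or a misspelled name in any of the files would show up in the returned list.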

