[Pgcluster-general] Recovery after one node failure

Pavel Shaydo pshaydo at elverils.com
Fri Mar 21 12:52:30 UTC 2008


Hello all,

We have the following configuration:

Node1:            Node2:
ClusterDB1        ClusterDB2
Pgreplicate1      Pgreplicate2

Replication works: if we insert a record into a table on one node, it
appears on the other, and the same holds for deletion. If we take Node1
down, insert some records into a table on Node2, and then bring Node1
back up, Node1 gets synchronized with Node2. After that, a record
inserted into a table on Node1 appears on Node2, but not vice versa: a
record inserted on Node2 is not replicated to Node1.
After restarting pgreplicate on Node2, everything works again.

I have found a description of a similar problem on this list:
http://lists.pgfoundry.org/pipermail/pgcluster-general/2008-February/001852.html
I just want to ask whether a solution exists.

Here are the configuration files for NSP1:

##### cluster.conf #####

<Replicate_Server_Info>
        <Host_Name> NSP1 </Host_Name>
        <Port> 5006 </Port>
        <Recovery_Port> 5007 </Recovery_Port>
</Replicate_Server_Info>

<Replicate_Server_Info>
        <Host_Name> NSP2 </Host_Name>
        <Port> 5006 </Port>
        <Recovery_Port> 5007 </Recovery_Port>
</Replicate_Server_Info>

<Host_Name>NSP1</Host_Name>
<Recovery_Port> 5005 </Recovery_Port>
<Rsync_Path> /usr/bin/rsync </Rsync_Path>
<Rsync_Option> ssh -2 </Rsync_Option>
<Rsync_Compress> yes </Rsync_Compress>
<Pg_Dump_Path> /usr/bin/pg_dump </Pg_Dump_Path>
<When_Stand_Alone> read_write </When_Stand_Alone>
<Replication_Timeout> 1 min </Replication_Timeout>
<LifeCheck_Timeout> 3s </LifeCheck_Timeout>
<LifeCheck_Interval> 11s </LifeCheck_Interval>


##### pgreplicate.conf #####

<Cluster_Server_Info> 
    <Host_Name> NSP1 </Host_Name>
    <Port> 5004 </Port>                 
    <Recovery_Port> 5005 </Recovery_Port>               
</Cluster_Server_Info>  

<Cluster_Server_Info> 
    <Host_Name> NSP2 </Host_Name>
    <Port> 5004 </Port>                 
    <Recovery_Port> 5005 </Recovery_Port>               
</Cluster_Server_Info>  

<Host_Name> NSP1 </Host_Name>
<Replication_Port> 5006 </Replication_Port>     
<Recovery_Port> 5007 </Recovery_Port>           
<RLOG_Port> 5008 </RLOG_Port>                   
<Response_Mode> reliable </Response_Mode>       
<Use_Replication_Log> yes </Use_Replication_Log>        
<Replication_Timeout> 5s </Replication_Timeout>         
<LifeCheck_Timeout> 1s </LifeCheck_Timeout>     
<LifeCheck_Interval> 15s </LifeCheck_Interval>          

<Log_File_Info> 
        <File_Name> /var/log/postgres/pgreplicate.log </File_Name>
        <File_Size> 1M </File_Size>     
        <Rotate> 3 </Rotate>            
</Log_File_Info>  


NSP2 has an analogous configuration.
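
To rule out a simple port mismatch between the two files, the
cross-references can be checked with a small script. This is only a
sketch, not part of PGCluster: it assumes that the Port and Recovery_Port
inside Replicate_Server_Info in cluster.conf are meant to match
Replication_Port and Recovery_Port in pgreplicate.conf, and that the
Recovery_Port inside Cluster_Server_Info is meant to match the top-level
Recovery_Port in cluster.conf. The helper names (parse, value, entry_for,
check_ports) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the NSP1 files quoted above (port-related tags only).
CLUSTER_CONF = """
<Replicate_Server_Info>
    <Host_Name> NSP1 </Host_Name>
    <Port> 5006 </Port>
    <Recovery_Port> 5007 </Recovery_Port>
</Replicate_Server_Info>
<Replicate_Server_Info>
    <Host_Name> NSP2 </Host_Name>
    <Port> 5006 </Port>
    <Recovery_Port> 5007 </Recovery_Port>
</Replicate_Server_Info>
<Host_Name>NSP1</Host_Name>
<Recovery_Port> 5005 </Recovery_Port>
"""

PGREPLICATE_CONF = """
<Cluster_Server_Info>
    <Host_Name> NSP1 </Host_Name>
    <Port> 5004 </Port>
    <Recovery_Port> 5005 </Recovery_Port>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name> NSP2 </Host_Name>
    <Port> 5004 </Port>
    <Recovery_Port> 5005 </Recovery_Port>
</Cluster_Server_Info>
<Host_Name> NSP1 </Host_Name>
<Replication_Port> 5006 </Replication_Port>
<Recovery_Port> 5007 </Recovery_Port>
"""


def parse(text):
    # Neither file has a single root element, so wrap it in one before parsing.
    return ET.fromstring("<conf>" + text + "</conf>")


def value(node, tag):
    # Text of a direct child tag, with the surrounding spaces stripped.
    return node.find(tag).text.strip()


def entry_for(root, section, host):
    # First <section> block whose Host_Name equals host, or None.
    for entry in root.findall(section):
        if value(entry, "Host_Name") == host:
            return entry
    return None


def check_ports(cluster_text, replicate_text):
    cluster, replicate = parse(cluster_text), parse(replicate_text)
    problems = []

    # cluster.conf's view of the replication server vs. pgreplicate.conf.
    rep = entry_for(cluster, "Replicate_Server_Info",
                    value(replicate, "Host_Name"))
    if rep is None:
        problems.append("replication server not listed in cluster.conf")
    else:
        if value(rep, "Port") != value(replicate, "Replication_Port"):
            problems.append("Replication_Port mismatch")
        if value(rep, "Recovery_Port") != value(replicate, "Recovery_Port"):
            problems.append("replicator Recovery_Port mismatch")

    # pgreplicate.conf's view of the cluster DB vs. cluster.conf.
    db = entry_for(replicate, "Cluster_Server_Info",
                   value(cluster, "Host_Name"))
    if db is None:
        problems.append("cluster DB not listed in pgreplicate.conf")
    elif value(db, "Recovery_Port") != value(cluster, "Recovery_Port"):
        problems.append("cluster DB Recovery_Port mismatch")

    return problems


print(check_ports(CLUSTER_CONF, PGREPLICATE_CONF) or "ports are consistent")
```

With the values above the check finds nothing to report, so a plain
port mismatch on NSP1 does not seem to be the cause.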


-- 
Best regards,
Pavel Shaydo


