ucarp: updating the switch's ARP cache


I'm using ucarp over Linux bonding for high availability and automatic failover between two servers. Here are the commands I use on each server to start ucarp:

Server 1: ucarp -i bond0 -v 2 -p secret -a 10.110.0.243 -s 10.110.0.229 --upscript=/etc/vip-up.sh --downscript=/etc/vip-down.sh -b 1 -k 1 -r 2 -z

Server 2: ucarp -i bond0 -v 2 -p secret -a 10.110.0.243 -s 10.110.0.242 --upscript=/etc/vip-up.sh --downscript=/etc/vip-down.sh -b 1 -k 1 -r 2 -z

And the contents of the scripts:

vip-up.sh:

#!/bin/sh
# Called by ucarp when this host becomes master.
# $1 = interface name, $2 = virtual IP address
exec 2> /dev/null
/sbin/ip addr add "$2"/24 dev "$1"

vip-down.sh:

#!/bin/sh
# Called by ucarp when this host stops being master.
# $1 = interface name, $2 = virtual IP address
exec 2> /dev/null
/sbin/ip addr del "$2"/24 dev "$1"

Everything works well, and the servers switch over from one to the other correctly when the master becomes unavailable.

The problem appears when I unplug both servers from the switch for too long (approximately 30 minutes). While unplugged, they both think they are master, and when I plug them back in, the one with the lowest IP address tries to stay master by sending gratuitous ARPs. The other one switches to backup as expected, but I'm unable to reach the master through the virtual IP. If I unplug the master, the second server goes from backup to master and is reachable through the virtual IP.

My guess is that the switch "forgets" about my servers when they are disconnected for too long, and that after reconnecting them, a backup-to-master transition is needed to update the switch's ARP cache correctly, even though the gratuitous ARPs sent by the master should do the job. Note that restarting ucarp on the master does fix the problem, but I have to restart it every time the servers have been disconnected for too long.
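To test this guess without restarting ucarp, gratuitous ARPs for the virtual IP can be sent by hand from the master. The following is only a minimal sketch, assuming the iputils version of arping is installed (its -U option sends unsolicited ARP). The names send_garp and DRY_RUN are hypothetical and introduced just for illustration. By default the script only prints the command it would run:

```shell
#!/bin/sh
# Sketch: announce the virtual IP with gratuitous ARP (assumes iputils arping).
# send_garp and DRY_RUN are illustrative names, not part of ucarp.

send_garp() {
    # $1 = interface, $2 = virtual IP, $3 = number of announcements
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run (default): only show the command that would be executed
        echo "arping -U -c $3 -I $1 $2"
    else
        # Real run: needs root privileges
        arping -U -c "$3" -I "$1" "$2"
    fi
}

# Interface and virtual IP taken from the ucarp commands above
send_garp bond0 10.110.0.243 3
```

Running it as root on the master with DRY_RUN=0 actually sends the announcements. If the virtual IP becomes reachable afterwards, that would suggest the announcements ucarp sends after reconnection are either not reaching the network or not updating its caches.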

Any idea why this does not work as I expected, and how I could solve the problem?

Thanks.
