The router in the vault (L513-v-rftec-3) lost connectivity to the LCG core three times this afternoon, for a few seconds each time.
The problem may be due to a software bug.
The following services were affected:
CPU_SP08#S513-V-IP113-PORT
CPU_SU15#S513-V-IP131-PORT
PaRC_servers_at_10Gb_in_ST20_-_service_1#S513-V-IP190-PORT
CPU_SP07#S513-V-IP114-PORT
CPU_SU14#S513-V-IP132-PORT
PaRC_servers_at_10Gb_in_ST20_-_service_2#S513-V-IP191-PORT
CPU_SP06#S513-V-IP115-PORT
CPU_SU13#S513-V-IP133-PORT
CPU_SU27#S513-V-IP157-PORT
CPU_SP05#S513-V-IP116-PORT
CPU_SU12#S513-V-IP134-PORT
CPU_SU21#S513-V-IP152-PORT
CPU_SU28#S513-V-IP158-PORT
CPU_SQ08#S513-V-IP117-PORT
CPU_SU11#S513-V-IP135-PORT
CPU_SU29#S513-V-IP159-PORT
CPU_SQ07#S513-V-IP118-PORT
CPU_SU09#S513-V-IP136-PORT
CPU_SU23#S513-V-IP154-PORT
CPU_SU30#S513-V-IP160-PORT
CPU_SQ06#S513-V-IP119-PORT
CPU_SU08#S513-V-IP137-PORT
Small_disk_servers_in_SI33#S513-V-IP308-PORT
CPU_SQ05#S513-V-IP120-PORT
CPU_SU07#S513-V-IP138-PORT
Small_disk_servers_in_SI34#S513-V-IP309-PORT
Tapeservers_at_10Gb_in_T520#S513-V-IP412-PORT
CPU_SU06#S513-V-IP139-PORT
Small_disk_servers_in_SI35#S513-V-IP310-PORT
CPU_ST27#S513-V-IP147-PORT
CPU_SU05#S513-V-IP140-PORT
We are going to switch over to the secondary management module at 17:40, as the configured workaround did not work. A short interruption (a few seconds) is expected on all connected services.
Updates
The router has been stable for the last few hours. Incident solved.
The switchover to the secondary management module was performed. There was an interruption of 18 seconds for all services behind that router.
We'll monitor the router during the weekend. If the problem continues, we'll need to reboot the router, with a downtime of around 8 minutes.