Date & time of incident:
Wednesday, January 18, 2012 - 02:33
Post date:
Wednesday, January 18, 2012 - 02:45
Incident Description:
A network problem in building 513 affected a large number of important machines and services. The problem was solved at 03:30, and the situation is returning to normal following the service restoration. Apologies for any inconvenience.
IT-CS
Service Element Affected:
Multiple Services
Specific Service detail:
All services relying on IP21 network service.
Any other affected service(s):
Mainly AFS servers, subsequently impacting other services.
Virtual machines hosted on the hypervisors behind IP21 (critical area).
Impact:
Some applications linked to services are unavailable
Status:
Resolved
Resolution date:
Wed, Jan 18, 02:33
Posted by:
IT-CF
Unit responsible for resolution:
IT Department
Updates
Switch that caused the network incident will be replaced
The network incident was caused by the crash of a network switch at the computer centre. As a preventive measure, it has been decided to replace the switch urgently. See http://test-static-03.web.cern.ch/planned-intervention/switch-replacement-service-513-v-ip21/18-01-2012 for more details.
Situation back to normal.
Services and machines relying directly on network access gradually resumed their activity after the IT-CS intervention.
The AFS servers affected by this incident were fully back in production around 09:00, thus allowing other services to resume as well.
(user home directories, project volumes -CMS, LHCb, ATLAS, CAF, smaller projects-, ATLAS T0)
IT-CF
The network problem in building 513
The network problem in building 513 was solved at 03:30; the situation is beginning to return to normal.
Situation back to normal.
The network problem in building 513 has been solved; the situation is beginning to return to normal.