CERN Accelerating science

EOSATLAS failover

 
Date & time of incident: 
Friday, December 2, 2011 - 15:10
Incident Description: 

The namespace ran out of memory because too many 'find' commands over the full namespace were issued in parallel.

While the main headnode was dumping core, we failed over to the other headnode and the namespace rebooted.

The instance has been available since 15:36.

Service Element Affected: 
Storage Service for Projects & Experiments
Impact: 
Service is unavailable
Status: 
Resolved
Resolution date: 
Fri, Dec 2, 15:36
Posted by: 
IT-DSS
Unit responsible for resolution: 
IT Department