Date & time of incident:
Sunday, September 11, 2011 - 19:30
Post date:
Sunday, September 11, 2011 - 21:30
Incident Description:
Due to an unusually high number of connections from one of the applications, the LCGR database became partially unavailable: it reached its maximum number of processes, so new connections to any service could not be opened.
The problem began on Sunday 11/09 at about 7:30 PM and was resolved at 8:45 PM. Sessions that were already open were not affected until 8:30 PM, when we were forced to reboot instance number 4.
Service Element Affected:
DB & Application Platform Service for Projects & Experiments
Specific Service detail:
alice_dashboard
atlas_dashboard
atlas_dashboard_dm
atlas_dashboard_prod
cms_dashboard
lcg_dashboard
lcg_fts
lcg_fts_monitor
lcg_fts_t2
lcg_fts_t2_w
lcg_fts_w
lcg_gridmap
lcg_gridops
lcg_gridview2
lcg_lfc
lcg_ops
lcg_sam_pi
lcg_sam_portal
lcg_sam_pps
lcg_same
lcg_sitemon
lcg_voms
lcgr.cern.ch
lcgr_backup
lhcb_dashboard
Impact:
Service is degraded
Status:
Resolved
Resolution date:
Sun, Sep 11, 20:45
Posted by:
IT-DB
Unit responsible for resolution:
IT Department