Incident Discovery Time: 02:30pm on 04/12/2022
Time of Resolution: 10:30pm on 04/12/2022
Services Impacted: Server Infrastructure
Description of Impact
The MGHPCC (Holyoke Datacenter) overheated today, and some servers and network switches shut down. This primarily affected batch jobs on the SCC. The other affected areas are not production areas: the nas2/nas-ru2 mirror of the CRC production server and a development/test environment set up for the BU Works team.
Incident Description and Resolution
The SCC nodes were brought back online by 6pm. The rest of the servers were back online by 10:30pm, except for a test environment that needs further investigation.
Additional Information
The cause of this incident is still under investigation. If you continue to have issues, please contact the IT Help Center.
Previous Update
Incident Discovery Time: 02:30pm on 04/12/2022
Services Impacted: Server Infrastructure
Description of Impact
The Massachusetts Green High Performance Computing Center (MGHPCC) overheated briefly today and a few servers were shut down.
Current Status
SCC compute nodes are back online, and all clients on the SCC have been notified. Our data center operations team is en route to Holyoke to investigate the other servers.
Next Update: 07:30pm
Previous Update
Incident Discovery Time: 02:30pm on 04/12/2022
Services Impacted: Server Infrastructure
Description of Impact
The Massachusetts Green High Performance Computing Center (MGHPCC) overheated briefly today and a few servers were shut down.
Current Status
IS&T teams have fixed the cooling issue and are currently waiting for temperatures to drop enough to bring servers back online.
Additional Information
Batch computing jobs on the Shared Computing Cluster and test environments for the BU Works Basic team were affected. IS&T teams continue to analyze the scope and impact.
Next Update: 07:30pm
Previous Update
Incident Discovery Time: 02:30pm on 04/12/2022
Services Impacted: Shared Computing Cluster batch jobs, BU Works test environment
Description of Impact
The Massachusetts Green High Performance Computing Center (MGHPCC) has had an air conditioning issue. The room overheated to the point where some servers shut down. SCC login nodes and filesystem were not affected, but some batch jobs and BU Works services may be unavailable.
Current Status
IS&T teams are investigating the impact of this outage and working to get servers back online.
Next Update: 05:30pm