Event Date | Summary |
---|---|
Most nodes have been updated and are back to running jobs. |
Service | Incident status | Start Date | End Date |
---|---|---|---|
Graham | Closed |
Security update - affects logins and job queueing
In response to a critical security update, all nodes are being drained so that they can be rebooted into a new kernel. Login nodes have already been rebooted, so you may have noticed your SSH sessions disconnecting. Compute nodes are set to drain (they accept no new jobs), so your jobs may wait in the queue longer than usual. While pending, they will show the reason "(Nodes required for job are DOWN, DRAINED or reserved for jobs in higher priority partitions)" as a result. As soon as a compute node drains, it is rebooted and then accepts new jobs as normal.
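To check whether your own pending jobs are waiting on this drain rather than on something else, a minimal sketch along these lines may help. It assumes the Slurm `squeue` command is on your PATH and is asked for a pipe-separated format (`%i|%T|%r`: job ID, state, reason). The helper names `jobs_waiting_on_drain` and `query_my_jobs` are hypothetical, made up for illustration, and not part of any site tooling.

```python
import getpass
import subprocess

# Start of the reason text quoted in the notice above; matched as a substring
# so surrounding parentheses or trailing wording do not matter.
DRAIN_REASON = "Nodes required for job are DOWN, DRAINED"


def jobs_waiting_on_drain(squeue_output: str) -> list[str]:
    """Return IDs of pending jobs whose reason mentions down or drained nodes.

    Expects one job per line in the form "<jobid>|<state>|<reason>".
    Lines that do not have three fields are skipped.
    """
    waiting = []
    for line in squeue_output.splitlines():
        parts = line.strip().split("|", 2)
        if len(parts) != 3:
            continue
        job_id, state, reason = parts
        if state == "PENDING" and DRAIN_REASON in reason:
            waiting.append(job_id)
    return waiting


def query_my_jobs() -> str:
    """Run squeue for the current user (assumes Slurm is installed)."""
    result = subprocess.run(
        ["squeue", "-h", "-u", getpass.getuser(), "-o", "%i|%T|%r"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    for job_id in jobs_waiting_on_drain(query_my_jobs()):
        print(f"Job {job_id} is waiting for nodes to finish draining")
```

Jobs reported here should start on their own once the nodes they need have rebooted, so no resubmission should be necessary.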
Updated by Mark Hahn on