Monitoring - System administrators have identified a specific workload that was writing concurrently to the same file from different clients. This eventually caused the Quobyte clients to stop making progress on those files. Once that happens, Slurm is unable to fully clean up the jobs and puts the nodes into maintenance mode ("Kill task failed").

The impacted nodes are being drained so they can be rebooted.

The user's jobs have been killed, and the user has been contacted to modify their workload.
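
For users wondering how to avoid this pattern, one common approach is to have each task write to its own file and merge the results afterwards from a single process. The following is a minimal sketch only, not the affected user's workload or an official HPC@UCD recommendation. The helper names `per_task_output_path` and `merge_outputs` and the file-naming scheme are hypothetical; `SLURM_JOB_ID` and `SLURM_PROCID` are environment variables that Slurm sets for tasks launched with `srun`.

```python
import os
import socket
from pathlib import Path


def per_task_output_path(out_dir, stem="results", suffix=".txt"):
    """Build an output path that is unique to this task (hypothetical helper).

    SLURM_JOB_ID and SLURM_PROCID are set by Slurm for tasks started with srun.
    The fallback values only keep this sketch runnable outside a job.
    """
    job_id = os.environ.get("SLURM_JOB_ID", "nojob")
    task_id = os.environ.get("SLURM_PROCID", "0")
    host = socket.gethostname()
    return Path(out_dir) / f"{stem}.{job_id}.{host}.{task_id}{suffix}"


def merge_outputs(out_dir, merged_name, stem="results", suffix=".txt"):
    """Concatenate the per-task files into one file (hypothetical helper).

    Intended to run from a single process after all tasks have finished,
    so that only one writer ever touches the merged file.
    """
    out_dir = Path(out_dir)
    merged = out_dir / merged_name
    # Skip the merged file itself in case its name matches the pattern.
    parts = sorted(p for p in out_dir.glob(f"{stem}.*{suffix}") if p != merged)
    with open(merged, "w") as dst:
        for part in parts:
            dst.write(part.read_text())
    return merged
```

Each task would open only the path returned by `per_task_output_path`, so no two clients write to the same file. The merge step can run as a final single-task job step once the parallel step has completed.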

Jan 23, 2026 - 13:22 PST
Investigating - Several nodes are in status "Kill task failed" and are now in the draining state in anticipation of a reboot. System administrators are looking for a root cause.
Jan 23, 2026 - 08:31 PST
Update - Some Quobyte clients continue to have issues. HPC@UCD has escalated the case to the Quobyte engineering team. Please stay tuned.
Jan 23, 2026 - 13:08 PST
Investigating - It appears that a new workload is causing the Quobyte client to crash on Hive nodes. We are opening a case with the vendor.
Jan 15, 2026 - 15:53 PST
Hive Login Node: Operational (100.0% uptime over the past 90 days)
Compute Nodes: Degraded Performance (100.0% uptime over the past 90 days)
GPU Nodes: Degraded Performance (100.0% uptime over the past 90 days)
Hive Network: Operational (100.0% uptime over the past 90 days)
Storage: Partial Outage (96.53% uptime over the past 90 days)
Quobyte Parallel File System: Partial Outage (89.6% uptime over the past 90 days)
Hive Home Directories: Operational (100.0% uptime over the past 90 days)
Legacy Storage: Operational (100.0% uptime over the past 90 days)
Module System and Software: Operational (100.0% uptime over the past 90 days)
Hippo User Portal: Operational (100.0% uptime over the past 90 days)
Feb 15, 2026

No incidents reported today.

Feb 14, 2026

No incidents reported.

Feb 13, 2026

No incidents reported.

Feb 12, 2026

No incidents reported.

Feb 11, 2026

No incidents reported.

Feb 10, 2026

No incidents reported.

Feb 9, 2026

No incidents reported.

Feb 8, 2026

No incidents reported.

Feb 7, 2026

No incidents reported.

Feb 6, 2026

No incidents reported.

Feb 5, 2026

No incidents reported.

Feb 4, 2026

No incidents reported.

Feb 3, 2026

No incidents reported.

Feb 2, 2026

No incidents reported.

Feb 1, 2026

No incidents reported.