Some systems are experiencing issues

About This Site

This page provides a quick overview of the operational status of the Sepia lab. It is not intended to provide detailed testing-related metrics.

For more detailed testing information, see the Grafana dashboard.

Stickied Incidents

Tuesday 10th May 2022

Long Running Cluster Health: Long Running Cluster Outage

While adding some new hosts to the Sepia Long Running Cluster, the cluster got into a state where all of the MONs started locking up due to a lack of system resources. Josh, Neha, Dan, and David have been working to restore the cluster, service by service.

The following workloads are down:

  • teuthology runs
  • Ceph CI builds (Jenkins/shaman)
  • quay.ceph.io
  • telemetry.ceph.com / telemetry-public.ceph.com
  • chacra.ceph.com

Past Incidents

Wednesday 18th May 2022

No incidents reported

Tuesday 17th May 2022

No incidents reported

Monday 16th May 2022

No incidents reported

Sunday 15th May 2022

No incidents reported

Saturday 14th May 2022

No incidents reported

Friday 13th May 2022

No incidents reported

Thursday 12th May 2022

No incidents reported

Wednesday 11th May 2022

No incidents reported

Tuesday 10th May 2022

Long Running Cluster Health: Long Running Cluster Outage

While adding some new hosts to the Sepia Long Running Cluster, the cluster got into a state where all of the MONs started locking up due to a lack of system resources. Josh, Neha, Dan, and David have been working to restore the cluster, service by service.

The following workloads are down:

  • teuthology runs
  • Ceph CI builds (Jenkins/shaman)
  • quay.ceph.io
  • telemetry.ceph.com / telemetry-public.ceph.com
  • chacra.ceph.com

Monday 9th May 2022

No incidents reported