#OCPBUGS-30267 issue 6 weeks ago [IBMCloud] MonitorTests liveness/readiness probe error events repeat MODIFIED
Mar 12 18:52:24.937 - 58s E namespace/openshift-kube-apiserver alert/KubeAPIErrorBudgetBurn alertstate/firing severity/critical ALERTS
{alertname="KubeAPIErrorBudgetBurn", alertstate="firing", long="1h", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="critical", short="5m"}
#OCPBUGS-15430 issue 4 weeks ago KubeAPIDown alert rename and/or degraded status ASSIGNED
We have many guards making sure that there are always at least two instances of the kube-apiserver. If we ever reach a single kube-apiserver and it causes disruption for the clients, other alerts such as KubeAPIErrorBudgetBurn will fire.
KubeAPIDown is here to make sure that Prometheus and really any client can reach the kube-apiserver, which they can even when there is only one instance of kube-apiserver running. If they can't or that availability is disrupted, `KubeAPIErrorBudgetBurn` will fire.
Comment 23058588 by Marcel Härri at 2023-09-19T06:57:07.949+0000
periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-azure-upgrade (all) - 2 runs, 50% failed, 200% of failures match = 100% impact
#1790422949740679168 junit 5 hours ago
# [bz-kube-apiserver][invariant] alert/KubeAPIErrorBudgetBurn should not be at or above pending
KubeAPIErrorBudgetBurn was at or above pending for at least 12m22s on platformidentification.JobType{Release:"4.10", FromRelease:"4.9", Platform:"azure", Network:"sdn", Topology:"ha"} (maxAllowed=3s): pending for 12m22s, firing for 0s:
May 14 18:40:59.828 - 742s  I alert/KubeAPIErrorBudgetBurn ns/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"}
#1790422949740679168 junit 5 hours ago
# [sig-arch][bz-kube-apiserver][Late] Alerts alert/KubeAPIErrorBudgetBurn should not be at or above pending [Suite:openshift/conformance/parallel]
flake: KubeAPIErrorBudgetBurn was at or above pending for at least 13m59s on platformidentification.JobType{Release:"4.10", FromRelease:"4.9", Platform:"azure", Network:"sdn", Topology:"ha"} (maxAllowed=3s): pending for 13m59s, firing for 0s:
May 14 18:53:30.000 - 839s  I alert/KubeAPIErrorBudgetBurn ns/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"}
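The failure messages report the time spent pending as a Go-style duration (`12m22s`), while the interval lines below them report seconds (`742s`). A minimal sketch for cross-checking the two follows. The function names and the regular expression are illustrative assumptions based only on the lines shown here, not part of the test framework, and the pattern does not cover the `E namespace/... alert/...` line layout seen in the Jira excerpt.

```python
import re
from typing import Tuple

# Seconds per unit for the Go-style durations seen in the messages (e.g. "12m22s").
_UNITS = {"h": 3600, "m": 60, "s": 1}

# Assumed layout of an interval line: "<timestamp> - <N>s  I alert/<name> ..."
_INTERVAL_RE = re.compile(r"- (\d+)s\s+[IEW] alert/(\S+)")


def parse_go_duration(text: str) -> int:
    """Convert a duration such as '12m22s' or '0s' to whole seconds."""
    parts = re.findall(r"(\d+)([hms])", text)
    if not parts:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(int(value) * _UNITS[unit] for value, unit in parts)


def interval_seconds(line: str) -> Tuple[str, int]:
    """Return (alert name, interval length in seconds) from an interval line."""
    match = _INTERVAL_RE.search(line)
    if match is None:
        raise ValueError("not an alert interval line")
    return match.group(2), int(match.group(1))


if __name__ == "__main__":
    line = "May 14 18:40:59.828 - 742s  I alert/KubeAPIErrorBudgetBurn ns/openshift-kube-apiserver"
    name, seconds = interval_seconds(line)
    # The message for this interval says "pending for 12m22s".
    print(name, seconds == parse_go_duration("12m22s"))
```

For the three intervals listed in these results, the reported durations and the interval lengths agree: 12m22s is 742s, 13m59s is 839s, and 8m17s is 497s.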
#1788443174218240000 junit 5 days ago
[bz-kube-apiserver][invariant] alert/KubeAPIErrorBudgetBurn should not be at or above pending
[bz-storage][invariant] alert/KubePersistentVolumeErrors should not be at or above pending
#1788443174218240000 junit 5 days ago
# [bz-kube-apiserver][invariant] alert/KubeAPIErrorBudgetBurn should not be at or above pending
KubeAPIErrorBudgetBurn was at or above pending for at least 8m17s on platformidentification.JobType{Release:"4.10", FromRelease:"4.9", Platform:"azure", Network:"sdn", Topology:"ha"} (maxAllowed=3s): pending for 8m17s, firing for 0s:
May 09 07:53:50.557 - 497s  I alert/KubeAPIErrorBudgetBurn ns/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"}
#1788443174218240000 junit 5 days ago
# [sig-arch][bz-kube-apiserver][Late] Alerts alert/KubeAPIErrorBudgetBurn should not be at or above pending [Suite:openshift/conformance/parallel]
flake: KubeAPIErrorBudgetBurn was at or above pending for at least 12m35s on platformidentification.JobType{Release:"4.10", FromRelease:"4.9", Platform:"azure", Network:"sdn", Topology:"ha"} (maxAllowed=3s): pending for 12m35s, firing for 0s:

Found in 100.00% of runs (200.00% of failures) across 2 total runs and 1 jobs (50.00% failed) in 103ms
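
The "200.00% of failures" figure looks odd at first. It is consistent with the match percentage being the number of matching runs divided by the number of failed runs, which is an assumption inferred from the numbers above rather than taken from the search tool's source. Under that assumption, a run can match the search without counting as a failed run, for example when the matched test is reported as a flake. A small sketch of the arithmetic, with a hypothetical function name:

```python
def search_summary(total_runs: int, failed_runs: int, matching_runs: int) -> dict:
    """Reproduce the summary percentages under the assumed definitions."""

    def pct(numerator: int, denominator: int) -> float:
        # Avoid dividing by zero when there are no runs or no failures.
        return 100.0 * numerator / denominator if denominator else 0.0

    return {
        "failed_pct": pct(failed_runs, total_runs),            # share of runs that failed
        "failures_match_pct": pct(matching_runs, failed_runs), # matches relative to failures
        "impact_pct": pct(matching_runs, total_runs),          # matches relative to all runs
    }


if __name__ == "__main__":
    # 2 total runs, 1 failed, and both runs contain a matching line.
    print(search_summary(total_runs=2, failed_runs=1, matching_runs=2))
```

With 2 runs, 1 failure, and 2 matching runs, this gives 50% failed, 200% of failures matching, and 100% impact, which agrees with the summary line.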