Job:
#OCPBUGS-15430 issue 4 weeks ago KubeAPIDown alert rename and/or degraded status ASSIGNED
We have many guards making sure that there are always at least two instances of the kube-apiserver. If we ever reach a single kube-apiserver and it causes disruption for the clients, other alerts such as KubeAPIErrorBudgetBurn will fire.
KubeAPIDown is here to make sure that Prometheus and really any client can reach the kube-apiserver, which they can even when there is only one instance of kube-apiserver running. If they can't or that availability is disrupted, `KubeAPIErrorBudgetBurn` will fire.
Comment 23058588 by Marcel Härri at 2023-09-19T06:57:07.949+0000
#OCPBUGS-30267 issue 6 weeks ago [IBMCloud] MonitorTests liveness/readiness probe error events repeat MODIFIED
Mar 12 18:52:24.937 - 58s E namespace/openshift-kube-apiserver alert/KubeAPIErrorBudgetBurn alertstate/firing severity/critical ALERTS
{alertname="KubeAPIErrorBudgetBurn", alertstate="firing", long="1h", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="critical", short="5m"}
periodic-ci-openshift-release-master-ci-4.13-upgrade-from-stable-4.12-e2e-aws-ovn-upgrade (all) - 25 runs, 12% failed, 200% of failures match = 24% impact
#1790039507349803008 junit 10 hours ago
May 13 16:30:07.394 E ns/openshift-authentication-operator pod/authentication-operator-6dd89b64ff-th8zj node/ip-10-0-138-36.ec2.internal uid/bdfeacb5-50ae-4d55-9579-0c8499903c2c container/authentication-operator reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
May 13 16:30:15.851 E ns/openshift-insights pod/insights-operator-587dc598f7-v8b5f node/ip-10-0-138-36.ec2.internal uid/2a05e580-45a9-46ec-a99b-0f503a293417 container/insights-operator reason/ContainerExit code/2 cause/Error r conditional gatherer with version 1.0.1\nI0513 16:30:01.346021       1 conditional_gatherer.go:242] updating alerts cache for conditional gatherer\nI0513 16:30:01.348016       1 conditional_gatherer.go:278] alert "AlertmanagerReceiversNotConfigured" has state "firing"\nI0513 16:30:01.348030       1 conditional_gatherer.go:278] alert "InsightsRecommendationActive" has state "firing"\nI0513 16:30:01.348036       1 conditional_gatherer.go:278] alert "KubeAPIErrorBudgetBurn" has state "pending"\nI0513 16:30:01.348041       1 conditional_gatherer.go:278] alert "PodSecurityViolation" has state "firing"\nI0513 16:30:01.348046       1 conditional_gatherer.go:278] alert "Watchdog" has state "firing"\nI0513 16:30:01.348110       1 conditional_gatherer.go:288] updating version cache for conditional gatherer\nI0513 16:30:01.353173       1 conditional_gatherer.go:296] cluster version is '4.13.0-0.ci-2024-05-13-151327'\nI0513 16:30:01.353201       1 tasks_processing.go:45] number of workers: 1\nI0513 16:30:01.353208       1 tasks_processing.go:69] worker 0 listening for tasks.\nI0513 16:30:01.353212       1 tasks_processing.go:71] worker 0 working on conditional_gatherer_rules task.\nI0513 16:30:01.353275       1 recorder.go:70] Recording insights-operator/conditional-gatherer-rules with fingerprint=8dbbbde181184600277bd0c8401374b23c24c4f4b08634e52ed045ff5aa12179\nI0513 16:30:01.353288       1 gather.go:180] gatherer "conditional" function "conditional_gatherer_rules" took 870ns to process 1 records\nI0513 16:30:01.353296       1 tasks_processing.go:74] worker 0 stopped.\nI0513 16:30:01.353305       1 periodic.go:162] Periodic gather conditional completed in 172ms\nI0513 16:30:01.378595       1 recorder.go:70] Recording insights-operator/gathers with 
fingerprint=b1b5b0d330f7271fcd14d6736bc80a336bd651ab562d48de0c9ca795aef2d2d8\nI0513 16:30:01.378816       1 diskrecorder.go:70] Writing 176 records to /var/lib/insights-operator/insights-2024-05-13-163001.tar.gz\nI0513 16:30:01.391176       1 diskrecorder.go:51] Wrote 176 records to disk in 12ms\n
May 13 16:30:15.851 E ns/openshift-insights pod/insights-operator-587dc598f7-v8b5f node/ip-10-0-138-36.ec2.internal uid/2a05e580-45a9-46ec-a99b-0f503a293417 container/insights-operator reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
#1790039507349803008 junit 10 hours ago
# [bz-kube-apiserver][invariant] alert/KubeAPIErrorBudgetBurn should not be at or above pending
KubeAPIErrorBudgetBurn was at or above pending for at least 1h26m4s on platformidentification.JobType{Release:"4.13", FromRelease:"4.12", Platform:"aws", Architecture:"amd64", Network:"ovn", Topology:"ha"} (maxAllowed=0s): pending for 1h26m4s, firing for 0s:
May 13 15:58:28.758 - 80s   I alert/KubeAPIErrorBudgetBurn ns/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="6h", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="critical", short="30m"}
May 13 15:58:28.758 - 1042s I alert/KubeAPIErrorBudgetBurn ns/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="1d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="2h"}
May 13 15:58:28.758 - 4042s I alert/KubeAPIErrorBudgetBurn ns/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"}
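For run 1790039507349803008, the reported total of 1h26m4s is consistent with adding up the three pending intervals listed above (80s + 1042s + 4042s = 5164s). The following is a minimal sketch of that arithmetic, not the test suite's actual implementation; the regular expression and the helper names `total_seconds` and `format_duration` are assumptions made for illustration, and the sample lines have their `ALERTS{...}` label sets trimmed.

```python
import re

# Hypothetical pattern for interval lines such as
# "May 13 15:58:28.758 - 80s   I alert/KubeAPIErrorBudgetBurn ns/..."
INTERVAL_RE = re.compile(r"^\w{3} \d{2} [\d:.]+ - (\d+)s\s+\w alert/(\S+)")

def total_seconds(lines, alert):
    """Sum the interval durations (in seconds) of all lines for one alert."""
    total = 0
    for line in lines:
        m = INTERVAL_RE.match(line)
        if m and m.group(2) == alert:
            total += int(m.group(1))
    return total

def format_duration(seconds):
    """Render whole seconds in the 1h26m4s style used by the test output."""
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    if h:
        return f"{h}h{m}m{s}s"
    if m:
        return f"{m}m{s}s"
    return f"{s}s"

# Interval lines from the run above, label sets trimmed for brevity.
SAMPLE = [
    "May 13 15:58:28.758 - 80s   I alert/KubeAPIErrorBudgetBurn ns/openshift-kube-apiserver",
    "May 13 15:58:28.758 - 1042s I alert/KubeAPIErrorBudgetBurn ns/openshift-kube-apiserver",
    "May 13 15:58:28.758 - 4042s I alert/KubeAPIErrorBudgetBurn ns/openshift-kube-apiserver",
]

if __name__ == "__main__":
    secs = total_seconds(SAMPLE, "KubeAPIErrorBudgetBurn")
    print(secs, format_duration(secs))
```

Because the three intervals start at the same timestamp and differ only in their `long`/`short` windows and severity, the total counts overlapping wall-clock time more than once, which would explain how a run can report 1h26m4s of pending time.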
#1789948963709784064 junit 16 hours ago
# [bz-kube-apiserver][invariant] alert/KubeAPIErrorBudgetBurn should not be at or above pending
KubeAPIErrorBudgetBurn was at or above pending for at least 1m14s on platformidentification.JobType{Release:"4.13", FromRelease:"4.12", Platform:"aws", Architecture:"amd64", Network:"ovn", Topology:"ha"} (maxAllowed=0s): pending for 1m14s, firing for 0s:
May 13 09:57:25.041 - 74s   I alert/KubeAPIErrorBudgetBurn ns/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"}
#1788848856092381184 junit 3 days ago
# [bz-kube-apiserver][invariant] alert/KubeAPIErrorBudgetBurn should not be at or above pending
KubeAPIErrorBudgetBurn was at or above pending for at least 3m48s on platformidentification.JobType{Release:"4.13", FromRelease:"4.12", Platform:"aws", Architecture:"amd64", Network:"ovn", Topology:"ha"} (maxAllowed=0s): pending for 3m48s, firing for 0s:
May 10 09:22:15.931 - 228s  I alert/KubeAPIErrorBudgetBurn ns/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"}
#1787768250126307328 junit 6 days ago
# [bz-kube-apiserver][invariant] alert/KubeAPIErrorBudgetBurn should not be at or above pending
KubeAPIErrorBudgetBurn was at or above pending for at least 1m28s on platformidentification.JobType{Release:"4.13", FromRelease:"4.12", Platform:"aws", Architecture:"amd64", Network:"ovn", Topology:"ha"} (maxAllowed=0s): pending for 1m28s, firing for 0s:
May 07 09:35:32.788 - 88s   I alert/KubeAPIErrorBudgetBurn ns/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"}
#1787564749307777024 junit 7 days ago
# [bz-kube-apiserver][invariant] alert/KubeAPIErrorBudgetBurn should not be at or above pending
KubeAPIErrorBudgetBurn was at or above pending for at least 1m14s on platformidentification.JobType{Release:"4.13", FromRelease:"4.12", Platform:"aws", Architecture:"amd64", Network:"ovn", Topology:"ha"} (maxAllowed=0s): pending for 1m14s, firing for 0s:
May 06 20:06:40.680 - 74s   I alert/KubeAPIErrorBudgetBurn ns/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"}
#1785948339280285696 junit 11 days ago
# [bz-kube-apiserver][invariant] alert/KubeAPIErrorBudgetBurn should not be at or above pending
KubeAPIErrorBudgetBurn was at or above pending for at least 6m8s on platformidentification.JobType{Release:"4.13", FromRelease:"4.12", Platform:"aws", Architecture:"amd64", Network:"ovn", Topology:"ha"} (maxAllowed=0s): pending for 6m8s, firing for 0s:
May 02 08:59:55.144 - 368s  I alert/KubeAPIErrorBudgetBurn ns/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"}

Found in 24.00% of runs (200.00% of failures) across 25 total runs and 1 jobs (12.00% failed) in 124ms
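
The summary percentages can look contradictory, since matches exceed failures. A minimal sketch of how they relate, assuming "of failures" means matching runs divided by failed runs (the function name `impact` and its parameters are hypothetical):

```python
def impact(total_runs, failed_pct, match_pct_of_failures):
    """Return (failed runs, matching runs, impact as % of all runs)."""
    failed = total_runs * failed_pct / 100
    matching = failed * match_pct_of_failures / 100
    return failed, matching, 100 * matching / total_runs

if __name__ == "__main__":
    # Figures from the summary line: 25 runs, 12% failed, 200% of failures match.
    failed, matching, pct = impact(25, 12, 200)
    print(f"{failed:.0f} failed, {matching:.0f} matching, {pct:.2f}% impact")
```

Under that assumption, 3 of the 25 runs failed while 6 runs matched the search, so some matching runs passed overall. This agrees with the six distinct run IDs listed in the results.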