#OCPBUGS-15430 | issue | 4 weeks ago | KubeAPIDown alert rename and/or degraded status ASSIGNED |
We have many guards ensuring that at least two instances of the kube-apiserver are always running. If the cluster ever drops to a single kube-apiserver and that disrupts clients, other alerts such as `KubeAPIErrorBudgetBurn` will fire. KubeAPIDown exists to verify that Prometheus, and really any client, can reach the kube-apiserver, which remains possible even when only one instance is running. If clients cannot reach it, or that availability is disrupted, `KubeAPIErrorBudgetBurn` will fire. Comment 23058588 by Marcel Härri at 2023-09-19T06:57:07.949+0000 | |||
#OCPBUGS-30267 | issue | 6 weeks ago | [IBMCloud] MonitorTests liveness/readiness probe error events repeat MODIFIED |
Mar 12 18:52:24.937 - 58s E namespace/openshift-kube-apiserver alert/KubeAPIErrorBudgetBurn alertstate/firing severity/critical ALERTS {alertname="KubeAPIErrorBudgetBurn", alertstate="firing", long="1h", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="critical", short="5m"} | |||
periodic-ci-openshift-release-master-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-sdn-upgrade (all) - 25 runs, 32% failed, 38% of failures match = 12% impact | |||
#1788425070784286720 | junit | 6 days ago | |
promQL query returned unexpected results: ALERTS{alertname!~"Watchdog|AlertmanagerReceiversNotConfigured|PrometheusRemoteWriteDesiredShards|KubeJobFailed|Watchdog|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotRea
dy|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|etcdMembersDown|etcdMembersDown|etcdGRPCRequestsSlow|etcdGRPCRequestsSlow|etcdHighNumberOfFailedGRPCRequests|etcdHighNumberOfFailedGRPCRequests|etcdMemberCommunicationSlow|etcdMemberCommunicationSlow|etcdNoLeader|etcdNoLeader|etcdHighFsyncDurations|etcdHighFsyncDurations|etcdHighCommitDurations|etcdHighCommitDurations|etcdInsufficientMembers|etcdInsufficientMembers|etcdHighNumberOfLeaderChanges|etcdHighNumberOfLeaderChanges|KubeAPIErrorBudgetBurn|KubeAPIErrorBudgetBurn|KubeClientErrors|KubeClientErrors|KubePersistentVolumeErrors|KubePersistentVolumeErrors|MCDDrainError|MCDDrainError|KubeMemoryOvercommit|KubeMemoryOvercommit|MCDPivotError|MCDPivotError|PrometheusOperatorWatchErrors|PrometheusOperatorWatchErrors|OVNKubernetesResourceRetryFailure|OVNKubernetesResourceRetryFailure|RedhatOperatorsCatalogError|RedhatOperatorsCatalogError|VSphereOpenshiftNodeHealthFail|VSphereOpenshiftNodeHealthFail|SamplesImagestreamImportFailing|SamplesImagestreamImportFailing",alertstate="firing",severity!="info"} >= 1 [ | |||
#1788425070784286720 | junit | 6 days ago | |
<*errors.errorString | 0xc0024be910>{ s: "promQL query returned unexpected results:\nALERTS{alertname!~\"Watchdog|AlertmanagerReceiversNotConfigured|PrometheusRemoteWriteDesiredShards|KubeJobFailed|Watchdog|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|K
ubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|etcdMembersDown|etcdMembersDown|etcdGRPCRequestsSlow|etcdGRPCRequestsSlow|etcdHighNumberOfFailedGRPCRequests|etcdHighNumberOfFailedGRPCRequests|etcdMemberCommunicationSlow|etcdMemberCommunicationSlow|etcdNoLeader|etcdNoLeader|etcdHighFsyncDurations|etcdHighFsyncDurations|etcdHighCommitDurations|etcdHighCommitDurations|etcdInsufficientMembers|etcdInsufficientMembers|etcdHighNumberOfLeaderChanges|etcdHighNumberOfLeaderChanges|KubeAPIErrorBudgetBurn|KubeAPIErrorBudgetBurn|KubeClientErrors|KubeClientErrors|KubePersistentVolumeErrors|KubePersistentVolumeErrors|MCDDrainError|MCDDrainError|KubeMemoryOvercommit|KubeMemoryOvercommit|MCDPivotError|MCDPivotError|PrometheusOperatorWatchErrors|PrometheusOperatorWatchErrors|OVNKubernetesResourceRetryFailure|OVNKubernetesResourceRetryFailure|RedhatOperatorsCatalogError|RedhatOperatorsCatalogError|VSphereOpenshiftNodeHealthFail|VSphereOpenshiftNodeHealthFail|SamplesImagestreamImportFailing|SamplesImagestreamImportFailing\",alertstate=\"firing\",severity!=\"info\"} >= 1\n[\n {\n \"metric\": {\n \"__name__\": \"ALERTS\",\n \"alertname\": \"OperatorHubSourceError\",\n \"alertstate\": \"firing\",\n \"container\": \"catalog-operator\",\n \"endpoint\": \"https-metrics\",\n \"exported_namespace\": \"openshift-marketplace\",\n \"instance\": \"10.128.0.37:8443\",\n \"job\": \"catalog-operator-metrics\",\n \"name\": \"community-operators\",\n \"namespace\": \"openshift-operator-lifecycle-manager\",\n \"pod\": \"catalog-operator-85fb75899d-lf5q7\",\n \"prometheus\": \"openshift-monitoring/k8s\",\n \"service\": \"catalog-operator-metrics\",\n \"severity\": \"warning\"\n },\n \"value\": [\n 1715235117.026,\n \"1\"\n ]\n }\n]", }, | |||
#1788714640386035712 | junit | 5 days ago | |
<*errors.errorString | 0xc001d35190>{ s: "promQL query returned unexpected results:\nALERTS{alertname!~\"Watchdog|AlertmanagerReceiversNotConfigured|PrometheusRemoteWriteDesiredShards|KubeJobFailed|Watchdog|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|K
ubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|etcdMembersDown|etcdMembersDown|etcdGRPCRequestsSlow|etcdGRPCRequestsSlow|etcdHighNumberOfFailedGRPCRequests|etcdHighNumberOfFailedGRPCRequests|etcdMemberCommunicationSlow|etcdMemberCommunicationSlow|etcdNoLeader|etcdNoLeader|etcdHighFsyncDurations|etcdHighFsyncDurations|etcdHighCommitDurations|etcdHighCommitDurations|etcdInsufficientMembers|etcdInsufficientMembers|etcdHighNumberOfLeaderChanges|etcdHighNumberOfLeaderChanges|KubeAPIErrorBudgetBurn|KubeAPIErrorBudgetBurn|KubeClientErrors|KubeClientErrors|KubePersistentVolumeErrors|KubePersistentVolumeErrors|MCDDrainError|MCDDrainError|KubeMemoryOvercommit|KubeMemoryOvercommit|MCDPivotError|MCDPivotError|PrometheusOperatorWatchErrors|PrometheusOperatorWatchErrors|OVNKubernetesResourceRetryFailure|OVNKubernetesResourceRetryFailure|RedhatOperatorsCatalogError|RedhatOperatorsCatalogError|VSphereOpenshiftNodeHealthFail|VSphereOpenshiftNodeHealthFail|SamplesImagestreamImportFailing|SamplesImagestreamImportFailing\",alertstate=\"firing\",severity!=\"info\"} >= 1\n[\n {\n \"metric\": {\n \"__name__\": \"ALERTS\",\n \"alertname\": \"KubeAggregatedAPIErrors\",\n \"alertstate\": \"firing\",\n \"name\": \"v1.apps.openshift.io\",\n \"namespace\": \"default\",\n \"prometheus\": \"openshift-monitoring/k8s\",\n \"severity\": \"warning\"\n },\n \"value\": [\n 1715300447.202,\n \"1\"\n ]\n },\n {\n \"metric\": {\n \"__name__\": \"ALERTS\",\n \"alertname\": \"KubeAggregatedAPIErrors\",\n \"alertstate\": \"firing\",\n \"name\": \"v1.authorization.openshift.io\",\n \"namespace\": \"default\",\n \"prometheus\": \"openshift-monitoring/k8s\",\n \"severity\": \"warning\"\n },\n \"value\": [\n 1715300447.202,\n \"1\"\n ]\n },\n {\n \"metric\": {\n \"__name__\": \"ALERTS\",\n \"alertname\": \"KubeAggregatedAPIErrors\",\n \"alertstate\": \"firing\",\n 
\"name\": \"v1.build.openshift.io\",\n \"namespace\": \"default\",\n \"prometheus\": \"openshift-monitoring/k8s\",\n \"severity\": \"warning\"\n },\n \"value\": [\n 1715300447.202,\n \"1\"\n ]\n },\n {\n \"metric\": {\n \"__name__\": \"ALERTS\",\n \"alertname\": \"KubeAggregatedAPIErrors\",\n \"alertstate\": \"firing\",\n \"name\": \"v1.image.openshift.io\",\n \"namespace\": \"default\",\n \"prometheus\": \"openshift-monitoring/k8s\",\n \"severity\": \"warning\"\n },\n \"value\": [\n 1715300447.202,\n \"1\"\n ]\n },\n {\n \"metric\": {\n \"__name__\": \"ALERTS\",\n \"alertname\": \"KubeAggregatedAPIErrors\",\n \"alertstate\": \"firing\",\n \"name\": \"v1.oauth.openshift.io\",\n \"namespace\": \"default\",\n \"prometheus\": \"openshift-monitoring/k8s\",\n \"severity\": \"warning\"\n },\n \"value\": [\n 1715300447.202,\n \"1\"\n ]\n },\n {\n \"metric\": {\n \"__name__\": \"ALERTS\",\n \"alertname\": \"KubeAggregatedAPIErrors\",\n \"alertstate\": \"firing\",\n \"name\": \"v1.packages.operators.coreos.com\",\n \"namespace\": \"default\",\n \"prometheus\": \"openshift-monitoring/k8s\",\n \"severity\": \"warning\"\n },\n \"value\": [\n 1715300447.202,\n \"1\"\n ]\n },\n {\n \"metric\": {\n \"__name__\": \"ALERTS\",\n \"alertname\": \"KubeAggregatedAPIErrors\",\n \"alertstate\": \"firing\",\n \"name\": \"v1.project.openshift.io\",\n \"namespace\": \"default\",\n \"prometheus\": \"openshift-monitoring/k8s\",\n \"severity\": \"warning\"\n },\n \"value\": [\n 1715300447.202,\n \"1\"\n ]\n },\n {\n \"metric\": {\n \"__name__\": \"ALERTS\",\n \"alertname\": \"KubeAggregatedAPIErrors\",\n \"alertstate\": \"firing\",\n \"name\": \"v1.quota.openshift.io\",\n \"namespace\": \"default\",\n \"prometheus\": \"openshift-monitoring/k8s\",\n \"severity\": \"warning\"\n },\n \"value\": [\n 1715300447.202,\n \"1\"\n ]\n },\n {\n \"metric\": {\n \"__name__\": \"ALERTS\",\n \"alertname\": \"KubeAggregatedAPIErrors\",\n \"alertstate\": \"firing\",\n \"name\": \"v1.route.openshift.io\",\n 
\"namespace\": \"default\",\n \"prometheus\": \"openshift-monitoring/k8s\",\n \"severity\": \"warning\"\n },\n \"value\": [\n 1715300447.202,\n \"1\"\n ]\n },\n {\n \"metric\": {\n \"__name__\": \"ALERTS\",\n \"alertname\": \"KubeAggregatedAPIErrors\",\n \"alertstate\": \"firing\",\n \"name\": \"v1.security.openshift.io\",\n \"namespace\": \"default\",\n \"prometheus\": \"openshift-monitoring/k8s\",\n \"severity\": \"warning\"\n },\n \"value\": [\n 1715300447.202,\n \"1\"\n ]\n },\n {\n \"metric\": {\n \"__name__\": \"ALERTS\",\n \"alertname\": \"KubeAggregatedAPIErrors\",\n \"alertstate\": \"firing\",\n \"name\": \"v1.template.openshift.io\",\n \"namespace\": \"default\",\n \"prometheus\": \"openshift-monitoring/k8s\",\n \"severity\": \"warning\"\n },\n \"value\": [\n 1715300447.202,\n \"1\"\n ]\n },\n {\n \"metric\": {\n \"__name__\": \"ALERTS\",\n \"alertname\": \"KubeAggregatedAPIErrors\",\n \"alertstate\": \"firing\",\n \"name\": \"v1.user.openshift.io\",\n \"namespace\": \"default\",\n \"prometheus\": \"openshift-monitoring/k8s\",\n \"severity\": \"warning\"\n },\n \"value\": [\n 1715300447.202,\n \"1\"\n ]\n },\n {\n \"metric\": {\n \"__name__\": \"ALERTS\",\n \"alertname\": \"KubeAggregatedAPIErrors\",\n \"alertstate\": \"firing\",\n... | |||
#1788714640386035712 | junit | 5 days ago | |
# [bz-kube-apiserver][invariant] alert/KubeAPIErrorBudgetBurn should not be at or above pending KubeAPIErrorBudgetBurn was at or above pending for at least 50m18s on platformidentification.JobType{Release:"4.14", FromRelease:"4.13", Platform:"gcp", Architecture:"amd64", Network:"sdn", Topology:"ha"} (maxAllowed=0s): pending for 50m18s, firing for 0s: May 10 00:18:03.409 - 116s I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="1d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="2h"} May 10 00:18:03.409 - 1796s I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"} May 10 00:21:31.409 - 298s I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="1d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="2h"} May 10 00:49:25.409 - 808s I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"} | |||
#1786767396736864256 | junit | 11 days ago | |
# [bz-kube-apiserver][invariant] alert/KubeAPIErrorBudgetBurn should not be at or above pending KubeAPIErrorBudgetBurn was at or above pending for at least 3m26s on platformidentification.JobType{Release:"4.14", FromRelease:"4.13", Platform:"gcp", Architecture:"amd64", Network:"sdn", Topology:"ha"} (maxAllowed=0s): pending for 3m26s, firing for 0s: May 04 15:15:59.504 - 148s I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"} May 04 15:19:59.504 - 58s I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"} |
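The "pending for" totals in the invariant messages above appear to equal the sum of the listed interval durations (116s + 1796s + 298s + 808s = 3018s = 50m18s for the May 10 run, and 148s + 58s = 206s = 3m26s for the May 04 run). The following is a minimal sketch of that bookkeeping, not the test suite's actual implementation: the regular expression, the helper names `total_seconds_by_state` and `fmt`, and the trimmed label sets in the sample entries are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical pattern for one interval entry of the form
#   "<timestamp> - <N>s <level> alert/<name> ... alertstate="<state>" ..."
INTERVAL_RE = re.compile(
    r'- (?P<secs>\d+)s [A-Z] alert/(?P<alert>\w+) .*?alertstate="(?P<state>\w+)"'
)


def total_seconds_by_state(entries):
    """Sum interval durations in seconds per (alert name, alert state)."""
    totals = defaultdict(int)
    for entry in entries:
        match = INTERVAL_RE.search(entry)
        if match:
            totals[(match["alert"], match["state"])] += int(match["secs"])
    return dict(totals)


def fmt(seconds):
    """Render seconds in the "50m18s" style used by the invariant message."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m{secs}s"


# Entries copied from the May 10 run above; some labels trimmed for brevity.
ENTRIES_MAY_10 = [
    'May 10 00:18:03.409 - 116s I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver '
    'ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="1d", short="2h"}',
    'May 10 00:18:03.409 - 1796s I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver '
    'ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", short="6h"}',
    'May 10 00:21:31.409 - 298s I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver '
    'ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="1d", short="2h"}',
    'May 10 00:49:25.409 - 808s I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver '
    'ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", short="6h"}',
]

totals = total_seconds_by_state(ENTRIES_MAY_10)
print(fmt(totals[("KubeAPIErrorBudgetBurn", "pending")]))  # prints 50m18s
```

Because the message reports `maxAllowed=0s`, any non-zero total computed this way would exceed the allowance shown for this job type.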
Found in 12.00% of runs (37.50% of failures) across 25 total runs and 1 jobs (32.00% failed) in 121ms
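The summary percentages can be reproduced from the counts shown on this page. The sketch below assumes that "impact" means matching runs divided by total runs, and that the three distinct junit run IDs listed above are the matching runs; both assumptions are consistent with the figures shown but are not confirmed by the page itself.

```python
total_runs = 25
failed_runs = round(total_runs * 0.32)  # 32% of 25 runs failed -> 8 runs
matching_runs = 3  # distinct junit run IDs listed in the results above

share_of_failures = matching_runs / failed_runs  # matches among failed runs
impact = matching_runs / total_runs              # matches among all runs

print(f"{share_of_failures:.2%} of failures")  # prints 37.50% of failures
print(f"{impact:.2%} of runs")                 # prints 12.00% of runs
```

The job header rounds the first figure to 38%, while the footer reports it unrounded as 37.50%.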