Job:
#OCPBUGS-15430issue4 weeks agoKubeAPIDown alert rename and/or degraded status ASSIGNED
We have many guards making sure that there are always at least two instances of the kube-apiserver. If we ever reach a single kube-apiserver and it causes disruption for the clients, other alerts such as KubeAPIErrorBudgetBurn will fire.
KubeAPIDown is here to make sure that Prometheus and really any client can reach the kube-apiserver, which they can even when there is only one instance of kube-apiserver running. If they can't or that availability is disrupted, `KubeAPIErrorBudgetBurn` will fire.
Comment 23058588 by Marcel Härri at 2023-09-19T06:57:07.949+0000
#OCPBUGS-30267issue6 weeks ago[IBMCloud] MonitorTests liveness/readiness probe error events repeat MODIFIED
Mar 12 18:52:24.937 - 58s E namespace/openshift-kube-apiserver alert/KubeAPIErrorBudgetBurn alertstate/firing severity/critical ALERTS
{alertname="KubeAPIErrorBudgetBurn", alertstate="firing", long="1h", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="critical", short="5m"}
periodic-ci-openshift-release-master-ci-4.14-upgrade-from-stable-4.13-e2e-gcp-sdn-upgrade (all) - 25 runs, 32% failed, 38% of failures match = 12% impact
#1788425070784286720junit6 days ago
    promQL query returned unexpected results:
    ALERTS{alertname!~"Watchdog|AlertmanagerReceiversNotConfigured|PrometheusRemoteWriteDesiredShards|KubeJobFailed|Watchdog|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|etcdMembersDown|etcdMembersDown|etcdGRPCRequestsSlow|etcdGRPCRequestsSlow|etcdHighNumberOfFailedGRPCRequests|etcdHighNumberOfFailedGRPCRequests|etcdMemberCommunicationSlow|etcdMemberCommunicationSlow|etcdNoLeader|etcdNoLeader|etcdHighFsyncDurations|etcdHighFsyncDurations|etcdHighCommitDurations|etcdHighCommitDurations|etcdInsufficientMembers|etcdInsufficientMembers|etcdHighNumberOfLeaderChanges|etcdHighNumberOfLeaderChanges|KubeAPIErrorBudgetBurn|KubeAPIErrorBudgetBurn|KubeClientErrors|KubeClientErrors|KubePersistentVolumeErrors|KubePersistentVolumeErrors|MCDDrainError|MCDDrainError|KubeMemoryOvercommit|KubeMemoryOvercommit|MCDPivotError|MCDPivotError|PrometheusOperatorWatchErrors|PrometheusOperatorWatchErrors|OVNKubernetesResourceRetryFailure|OVNKubernetesResourceRetryFailure|RedhatOperatorsCatalogError|RedhatOperatorsCatalogError|VSphereOpenshiftNodeHealthFail|VSphereOpenshiftNodeHealthFail|SamplesImagestreamImportFailing|SamplesImagestreamImportFailing",alertstate="firing",severity!="info"} >= 1
    [
#1788425070784286720junit6 days ago
        <*errors.errorString | 0xc0024be910>{
            s: "promQL query returned unexpected results:\nALERTS{alertname!~\"Watchdog|AlertmanagerReceiversNotConfigured|PrometheusRemoteWriteDesiredShards|KubeJobFailed|Watchdog|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|etcdMembersDown|etcdMembersDown|etcdGRPCRequestsSlow|etcdGRPCRequestsSlow|etcdHighNumberOfFailedGRPCRequests|etcdHighNumberOfFailedGRPCRequests|etcdMemberCommunicationSlow|etcdMemberCommunicationSlow|etcdNoLeader|etcdNoLeader|etcdHighFsyncDurations|etcdHighFsyncDurations|etcdHighCommitDurations|etcdHighCommitDurations|etcdInsufficientMembers|etcdInsufficientMembers|etcdHighNumberOfLeaderChanges|etcdHighNumberOfLeaderChanges|KubeAPIErrorBudgetBurn|KubeAPIErrorBudgetBurn|KubeClientErrors|KubeClientErrors|KubePersistentVolumeErrors|KubePersistentVolumeErrors|MCDDrainError|MCDDrainError|KubeMemoryOvercommit|KubeMemoryOvercommit|MCDPivotError|MCDPivotError|PrometheusOperatorWatchErrors|PrometheusOperatorWatchErrors|OVNKubernetesResourceRetryFailure|OVNKubernetesResourceRetryFailure|RedhatOperatorsCatalogError|RedhatOperatorsCatalogError|VSphereOpenshiftNodeHealthFail|VSphereOpenshiftNodeHealthFail|SamplesImagestreamImportFailing|SamplesImagestreamImportFailing\",alertstate=\"firing\",severity!=\"info\"} >= 1\n[\n  {\n    \"metric\": {\n      \"__name__\": \"ALERTS\",\n      \"alertname\": \"OperatorHubSourceError\",\n      \"alertstate\": \"firing\",\n      \"container\": \"catalog-operator\",\n      \"endpoint\": \"https-metrics\",\n      \"exported_namespace\": \"openshift-marketplace\",\n      \"instance\": \"10.128.0.37:8443\",\n      \"job\": \"catalog-operator-metrics\",\n      \"name\": \"community-operators\",\n      \"namespace\": \"openshift-operator-lifecycle-manager\",\n      \"pod\": \"catalog-operator-85fb75899d-lf5q7\",\n      \"prometheus\": \"openshift-monitoring/k8s\",\n      \"service\": \"catalog-operator-metrics\",\n      \"severity\": \"warning\"\n    },\n    \"value\": [\n      1715235117.026,\n      \"1\"\n    ]\n  }\n]",
        },
#1788714640386035712junit5 days ago
        <*errors.errorString | 0xc001d35190>{
            s: "promQL query returned unexpected results:\nALERTS{alertname!~\"Watchdog|AlertmanagerReceiversNotConfigured|PrometheusRemoteWriteDesiredShards|KubeJobFailed|Watchdog|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|KubePodNotReady|etcdMembersDown|etcdMembersDown|etcdGRPCRequestsSlow|etcdGRPCRequestsSlow|etcdHighNumberOfFailedGRPCRequests|etcdHighNumberOfFailedGRPCRequests|etcdMemberCommunicationSlow|etcdMemberCommunicationSlow|etcdNoLeader|etcdNoLeader|etcdHighFsyncDurations|etcdHighFsyncDurations|etcdHighCommitDurations|etcdHighCommitDurations|etcdInsufficientMembers|etcdInsufficientMembers|etcdHighNumberOfLeaderChanges|etcdHighNumberOfLeaderChanges|KubeAPIErrorBudgetBurn|KubeAPIErrorBudgetBurn|KubeClientErrors|KubeClientErrors|KubePersistentVolumeErrors|KubePersistentVolumeErrors|MCDDrainError|MCDDrainError|KubeMemoryOvercommit|KubeMemoryOvercommit|MCDPivotError|MCDPivotError|PrometheusOperatorWatchErrors|PrometheusOperatorWatchErrors|OVNKubernetesResourceRetryFailure|OVNKubernetesResourceRetryFailure|RedhatOperatorsCatalogError|RedhatOperatorsCatalogError|VSphereOpenshiftNodeHealthFail|VSphereOpenshiftNodeHealthFail|SamplesImagestreamImportFailing|SamplesImagestreamImportFailing\",alertstate=\"firing\",severity!=\"info\"} >= 1\n[\n  {\n    \"metric\": {\n      \"__name__\": \"ALERTS\",\n      \"alertname\": \"KubeAggregatedAPIErrors\",\n      \"alertstate\": \"firing\",\n      \"name\": \"v1.apps.openshift.io\",\n      \"namespace\": \"default\",\n      \"prometheus\": \"openshift-monitoring/k8s\",\n      \"severity\": \"warning\"\n    },\n    \"value\": [\n      1715300447.202,\n      \"1\"\n    ]\n  },\n  {\n    \"metric\": {\n      \"__name__\": \"ALERTS\",\n      \"alertname\": \"KubeAggregatedAPIErrors\",\n      \"alertstate\": \"firing\",\n      \"name\": \"v1.authorization.openshift.io\",\n      \"namespace\": \"default\",\n      \"prometheus\": \"openshift-monitoring/k8s\",\n      \"severity\": \"warning\"\n    },\n    \"value\": [\n      1715300447.202,\n      \"1\"\n    ]\n  },\n  {\n    \"metric\": {\n      \"__name__\": \"ALERTS\",\n      \"alertname\": \"KubeAggregatedAPIErrors\",\n      \"alertstate\": \"firing\",\n      \"name\": \"v1.build.openshift.io\",\n      \"namespace\": \"default\",\n      \"prometheus\": \"openshift-monitoring/k8s\",\n      \"severity\": \"warning\"\n    },\n    \"value\": [\n      1715300447.202,\n      \"1\"\n    ]\n  },\n  {\n    \"metric\": {\n      \"__name__\": \"ALERTS\",\n      \"alertname\": \"KubeAggregatedAPIErrors\",\n      \"alertstate\": \"firing\",\n      \"name\": \"v1.image.openshift.io\",\n      \"namespace\": \"default\",\n      \"prometheus\": \"openshift-monitoring/k8s\",\n      \"severity\": \"warning\"\n    },\n    \"value\": [\n      1715300447.202,\n      \"1\"\n    ]\n  },\n  {\n    \"metric\": {\n      \"__name__\": \"ALERTS\",\n      \"alertname\": \"KubeAggregatedAPIErrors\",\n      \"alertstate\": \"firing\",\n      \"name\": \"v1.oauth.openshift.io\",\n      \"namespace\": \"default\",\n      \"prometheus\": \"openshift-monitoring/k8s\",\n      \"severity\": \"warning\"\n    },\n    \"value\": [\n      1715300447.202,\n      \"1\"\n    ]\n  },\n  {\n    \"metric\": {\n      \"__name__\": \"ALERTS\",\n      \"alertname\": \"KubeAggregatedAPIErrors\",\n      \"alertstate\": \"firing\",\n      \"name\": \"v1.packages.operators.coreos.com\",\n      \"namespace\": \"default\",\n      \"prometheus\": \"openshift-monitoring/k8s\",\n      \"severity\": \"warning\"\n    },\n    \"value\": [\n      1715300447.202,\n      \"1\"\n    ]\n  },\n  {\n    \"metric\": {\n      \"__name__\": \"ALERTS\",\n      \"alertname\": \"KubeAggregatedAPIErrors\",\n      \"alertstate\": \"firing\",\n      \"name\": \"v1.project.openshift.io\",\n      \"namespace\": \"default\",\n      \"prometheus\": \"openshift-monitoring/k8s\",\n      \"severity\": \"warning\"\n    },\n    \"value\": [\n      1715300447.202,\n      \"1\"\n    ]\n  },\n  {\n    \"metric\": {\n      \"__name__\": \"ALERTS\",\n      \"alertname\": \"KubeAggregatedAPIErrors\",\n      \"alertstate\": \"firing\",\n      \"name\": \"v1.quota.openshift.io\",\n      \"namespace\": \"default\",\n      \"prometheus\": \"openshift-monitoring/k8s\",\n      \"severity\": \"warning\"\n    },\n    \"value\": [\n      1715300447.202,\n      \"1\"\n    ]\n  },\n  {\n    \"metric\": {\n      \"__name__\": \"ALERTS\",\n      \"alertname\": \"KubeAggregatedAPIErrors\",\n      \"alertstate\": \"firing\",\n      \"name\": \"v1.route.openshift.io\",\n      \"namespace\": \"default\",\n      \"prometheus\": \"openshift-monitoring/k8s\",\n      \"severity\": \"warning\"\n    },\n    \"value\": [\n      1715300447.202,\n      \"1\"\n    ]\n  },\n  {\n    \"metric\": {\n      \"__name__\": \"ALERTS\",\n      \"alertname\": \"KubeAggregatedAPIErrors\",\n      \"alertstate\": \"firing\",\n      \"name\": \"v1.security.openshift.io\",\n      \"namespace\": \"default\",\n      \"prometheus\": \"openshift-monitoring/k8s\",\n      \"severity\": \"warning\"\n    },\n    \"value\": [\n      1715300447.202,\n      \"1\"\n    ]\n  },\n  {\n    \"metric\": {\n      \"__name__\": \"ALERTS\",\n      \"alertname\": \"KubeAggregatedAPIErrors\",\n      \"alertstate\": \"firing\",\n      \"name\": \"v1.template.openshift.io\",\n      \"namespace\": \"default\",\n      \"prometheus\": \"openshift-monitoring/k8s\",\n      \"severity\": \"warning\"\n    },\n    \"value\": [\n      1715300447.202,\n      \"1\"\n    ]\n  },\n  {\n    \"metric\": {\n      \"__name__\": \"ALERTS\",\n      \"alertname\": \"KubeAggregatedAPIErrors\",\n      \"alertstate\": \"firing\",\n      \"name\": \"v1.user.openshift.io\",\n      \"namespace\": \"default\",\n      \"prometheus\": \"openshift-monitoring/k8s\",\n      \"severity\": \"warning\"\n    },\n    \"value\": [\n      1715300447.202,\n      \"1\"\n    ]\n  },\n  {\n    \"metric\": {\n      \"__name__\": \"ALERTS\",\n      \"alertname\": \"KubeAggregatedAPIErrors\",\n      \"alertstate\": \"firing\",\n...
#1788714640386035712junit5 days ago
# [bz-kube-apiserver][invariant] alert/KubeAPIErrorBudgetBurn should not be at or above pending
KubeAPIErrorBudgetBurn was at or above pending for at least 50m18s on platformidentification.JobType{Release:"4.14", FromRelease:"4.13", Platform:"gcp", Architecture:"amd64", Network:"sdn", Topology:"ha"} (maxAllowed=0s): pending for 50m18s, firing for 0s:
May 10 00:18:03.409 - 116s  I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="1d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="2h"}
May 10 00:18:03.409 - 1796s I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"}
May 10 00:21:31.409 - 298s  I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="1d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="2h"}
May 10 00:49:25.409 - 808s  I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"}
#1786767396736864256junit11 days ago
# [bz-kube-apiserver][invariant] alert/KubeAPIErrorBudgetBurn should not be at or above pending
KubeAPIErrorBudgetBurn was at or above pending for at least 3m26s on platformidentification.JobType{Release:"4.14", FromRelease:"4.13", Platform:"gcp", Architecture:"amd64", Network:"sdn", Topology:"ha"} (maxAllowed=0s): pending for 3m26s, firing for 0s:
May 04 15:15:59.504 - 148s  I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"}
May 04 15:19:59.504 - 58s   I alert/KubeAPIErrorBudgetBurn namespace/openshift-kube-apiserver ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="pending", long="3d", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="warning", short="6h"}

Found in 12.00% of runs (37.50% of failures) across 25 total runs and 1 jobs (32.00% failed) in 121ms - clear search | chart view - source code located on github