Job:
Bug #1921157 (filed 23 months ago, ASSIGNED): [sig-api-machinery] Kubernetes APIs remain available for new connections
T2: At 06:45:58: systemd-shutdown was sending SIGTERM to remaining processes...
T3: At 06:45:58: kube-apiserver-ci-op-z52cbzhi-6d7cd-pz2jw-master-0: Received signal to terminate, becoming unready, but keeping serving (TerminationStart event)
T4: At 06:47:08: kube-apiserver-ci-op-z52cbzhi-6d7cd-pz2jw-master-0: The minimal shutdown duration of 1m10s finished (TerminationMinimalShutdownDurationFinished event)
T5: At 06:47:08: kube-apiserver-ci-op-z52cbzhi-6d7cd-pz2jw-master-0: Server has stopped listening (TerminationStoppedServing event)
T5 is the last event reported from that API server. At T5 the server may wait up to 60s for all in-flight requests to complete, after which it fires the TerminationGracefulTerminationFinished event.
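For reference, the same graceful-termination ordering shows up in the operator logs below (ShutdownInitiated, AfterShutdownDelayDuration, TerminationStoppedServing, InFlightRequestsDrained). The following is only a minimal Go sketch of that ordering, using plain net/http instead of the real k8s.io/apiserver machinery, with the 1m10s delay and 60s drain limit taken from the events above:

```go
// Minimal sketch (not the actual kube-apiserver code) of the T2-T5 sequence:
// on SIGTERM the server becomes unready but keeps serving for a minimal
// shutdown duration, then stops listening and waits a bounded time for
// in-flight requests before reporting graceful termination.
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
	"time"
)

func main() {
	var terminating atomic.Bool

	mux := http.NewServeMux()
	// Readiness flips to unready as soon as SIGTERM arrives (TerminationStart),
	// while regular handlers keep serving during the shutdown delay.
	mux.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		if terminating.Load() {
			http.Error(w, "shutting down", http.StatusInternalServerError)
			return
		}
		w.Write([]byte("ok"))
	})
	srv := &http.Server{Addr: ":8080", Handler: mux}

	done := make(chan struct{})
	go func() {
		defer close(done)
		sigCh := make(chan os.Signal, 1)
		signal.Notify(sigCh, syscall.SIGTERM)
		<-sigCh
		terminating.Store(true)
		log.Println("TerminationStart: becoming unready, but keeping serving")

		// Shutdown delay: keep accepting connections so load balancers can
		// notice the unready state (1m10s for the apiserver in this job).
		time.Sleep(70 * time.Second)
		log.Println("TerminationMinimalShutdownDurationFinished")

		// Stop listening, then allow up to 60s for in-flight requests to drain.
		log.Println("TerminationStoppedServing")
		ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
		defer cancel()
		if err := srv.Shutdown(ctx); err != nil {
			log.Printf("drain did not complete cleanly: %v", err)
		}
		log.Println("TerminationGracefulTerminationFinished")
	}()

	if err := srv.ListenAndServe(); err != http.ErrServerClosed {
		log.Fatal(err)
	}
	<-done // wait for the drain goroutine before exiting
}
```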
periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-aws-upgrade (all) - 24 runs, 50% failed, 25% of failures match = 13% impact
#1619916679238651904 junit 27 hours ago
Jan 30 05:47:48.118 E ns/openshift-apiserver-operator pod/openshift-apiserver-operator-5cf96ddf85-ww9cw node/ip-10-0-138-79.us-east-2.compute.internal container/openshift-apiserver-operator reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 30 05:47:55.226 E ns/openshift-ingress-operator pod/ingress-operator-665cf85bf-22g74 node/ip-10-0-138-79.us-east-2.compute.internal container/ingress-operator reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 30 05:47:58.211 E ns/openshift-insights pod/insights-operator-854449444c-7df2s node/ip-10-0-138-79.us-east-2.compute.internal container/insights-operator reason/ContainerExit code/2 cause/Error tplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="1.858361ms" userAgent="Prometheus/2.29.2" audit-ID="9b5156d7-7e65-4e0f-a982-f497b13cac3f" srcIP="10.131.0.22:50972" resp=200\nI0130 05:45:57.275987       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="4.516133ms" userAgent="Prometheus/2.29.2" audit-ID="1c7dc475-a060-48f6-b77a-a3d62d8fe5bc" srcIP="10.129.2.14:48240" resp=200\nI0130 05:46:26.294471       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="5.13084ms" userAgent="Prometheus/2.29.2" audit-ID="145fd174-e302-48c3-9edd-2a061928da7c" srcIP="10.131.0.22:50972" resp=200\nI0130 05:46:27.273334       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="1.818301ms" userAgent="Prometheus/2.29.2" audit-ID="dcbbf02e-3b22-4131-b405-0cc173c307d0" srcIP="10.129.2.14:48240" resp=200\nI0130 05:46:56.290449       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="1.80163ms" userAgent="Prometheus/2.29.2" audit-ID="d008b248-90e2-4ee9-9352-d77af5663410" srcIP="10.131.0.22:50972" resp=200\nI0130 05:46:57.283295       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="10.767366ms" userAgent="Prometheus/2.29.2" audit-ID="046f6c2a-71cd-41f0-ad10-92207f81d539" srcIP="10.129.2.14:48240" resp=200\nI0130 05:47:26.293807       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="4.526543ms" userAgent="Prometheus/2.29.2" audit-ID="c25d0734-d4c8-4f71-8dbd-ae18e2a43f95" srcIP="10.131.0.22:50972" resp=200\nI0130 05:47:27.274163       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="2.046994ms" userAgent="Prometheus/2.29.2" audit-ID="b77a90bd-e6bc-497c-9ef8-4ba61d37f29a" srcIP="10.129.2.14:48240" resp=200\nI0130 05:47:52.872539       1 status.go:354] The operator is healthy\nI0130 05:47:52.872594       1 status.go:441] No status update necessary, objects are identical\nI0130 05:47:56.290552       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="2.038334ms" userAgent="Prometheus/2.29.2" audit-ID="cb837a69-dc03-4325-b155-37b42231569c" srcIP="10.131.0.22:50972" resp=200\n
Jan 30 05:47:58.211 E ns/openshift-insights pod/insights-operator-854449444c-7df2s node/ip-10-0-138-79.us-east-2.compute.internal container/insights-operator reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 30 05:48:05.261 E ns/openshift-cluster-storage-operator pod/cluster-storage-operator-59456fcf98-2kpmv node/ip-10-0-138-79.us-east-2.compute.internal container/cluster-storage-operator reason/ContainerExit code/1 cause/Error server controller ...\nI0130 05:48:04.317323       1 base_controller.go:104] All ConfigObserver workers have been terminated\nI0130 05:48:04.317328       1 base_controller.go:114] Shutting down worker of CSIDriverStarter controller ...\nI0130 05:48:04.317344       1 base_controller.go:104] All CSIDriverStarter workers have been terminated\nI0130 05:48:04.317371       1 base_controller.go:167] Shutting down StaticResourceController ...\nI0130 05:48:04.317386       1 base_controller.go:167] Shutting down AWSEBSCSIDriverOperatorDeployment ...\nI0130 05:48:04.317390       1 base_controller.go:145] All AWSEBSCSIDriverOperatorDeployment post start hooks have been terminated\nI0130 05:48:04.317398       1 base_controller.go:114] Shutting down worker of AWSEBSCSIDriverOperator controller ...\nI0130 05:48:04.317402       1 base_controller.go:104] All AWSEBSCSIDriverOperator workers have been terminated\nI0130 05:48:04.317407       1 controller_manager.go:54] AWSEBSCSIDriverOperator controller terminated\nI0130 05:48:04.317413       1 base_controller.go:114] Shutting down worker of VSphereProblemDetectorStarter controller ...\nI0130 05:48:04.317418       1 base_controller.go:104] All VSphereProblemDetectorStarter workers have been terminated\nI0130 05:48:04.317424       1 base_controller.go:114] Shutting down worker of StaticResourceController controller ...\nI0130 05:48:04.317428       1 base_controller.go:104] All StaticResourceController workers have been terminated\nI0130 05:48:04.317432       1 controller_manager.go:54] StaticResourceController controller terminated\nI0130 05:48:04.317437       1 base_controller.go:114] Shutting down worker of AWSEBSCSIDriverOperatorDeployment controller ...\nI0130 05:48:04.317441       1 base_controller.go:104] All AWSEBSCSIDriverOperatorDeployment workers have been terminated\nI0130 05:48:04.317445       1 controller_manager.go:54] AWSEBSCSIDriverOperatorDeployment controller terminated\nW0130 05:48:04.317501       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\n
Jan 30 05:48:11.295 E ns/openshift-console-operator pod/console-operator-6bbd4fcc8c-njssk node/ip-10-0-138-79.us-east-2.compute.internal container/console-operator reason/ContainerExit code/1 cause/Error ersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationPreShutdownHooksFinished' All pre-shutdown hooks have been finished\nI0130 05:48:10.654240       1 genericapiserver.go:355] "[graceful-termination] shutdown event" name="ShutdownInitiated"\nI0130 05:48:10.654251       1 genericapiserver.go:709] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-console-operator", Name:"console-operator-6bbd4fcc8c-njssk", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationStart' Received signal to terminate, becoming unready, but keeping serving\nI0130 05:48:10.654269       1 genericapiserver.go:709] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-console-operator", Name:"console-operator-6bbd4fcc8c-njssk", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationMinimalShutdownDurationFinished' The minimal shutdown duration of 0s finished\nI0130 05:48:10.654283       1 genericapiserver.go:362] "[graceful-termination] shutdown event" name="AfterShutdownDelayDuration"\nI0130 05:48:10.654297       1 genericapiserver.go:709] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-console-operator", Name:"console-operator-6bbd4fcc8c-njssk", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationStoppedServing' Server has stopped listening\nI0130 05:48:10.654320       1 genericapiserver.go:387] "[graceful-termination] shutdown event" name="InFlightRequestsDrained"\nI0130 05:48:10.654547       1 base_controller.go:167] Shutting down ConsoleServiceController ...\nI0130 05:48:10.654594       1 base_controller.go:167] Shutting down UnsupportedConfigOverridesController ...\nI0130 05:48:10.654622       1 base_controller.go:167] Shutting down ConsoleDownloadsDeploymentSyncController ...\nI0130 05:48:10.654644       1 base_controller.go:167] Shutting down ConsoleCLIDownloadsController ...\nW0130 05:48:10.654654       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\n
Jan 30 05:48:18.403 E ns/openshift-controller-manager-operator pod/openshift-controller-manager-operator-65ddc5dd7b-s6psg node/ip-10-0-138-79.us-east-2.compute.internal container/openshift-controller-manager-operator reason/ContainerExit code/1 cause/Error ConfigMap (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0130 05:48:17.189879       1 base_controller.go:167] Shutting down StaticResourceController ...\nI0130 05:48:17.189883       1 reflector.go:225] Stopping reflector *v1.RoleBinding (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0130 05:48:17.189925       1 reflector.go:225] Stopping reflector *v1.Build (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0130 05:48:17.189957       1 base_controller.go:167] Shutting down ConfigObserver ...\nI0130 05:48:17.189932       1 base_controller.go:167] Shutting down ResourceSyncController ...\nI0130 05:48:17.189994       1 base_controller.go:167] Shutting down StatusSyncer_openshift-controller-manager ...\nI0130 05:48:17.190013       1 base_controller.go:145] All StatusSyncer_openshift-controller-manager post start hooks have been terminated\nI0130 05:48:17.189996       1 reflector.go:225] Stopping reflector *v1.Namespace (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0130 05:48:17.190034       1 reflector.go:225] Stopping reflector *v1.Secret (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0130 05:48:17.190069       1 reflector.go:225] Stopping reflector *v1.Image (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0130 05:48:17.190071       1 operator.go:115] Shutting down OpenShiftControllerManagerOperator\nI0130 05:48:17.190112       1 reflector.go:225] Stopping reflector *v1.Role (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0130 05:48:17.190156       1 reflector.go:225] Stopping reflector *v1.OpenShiftControllerManager (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0130 05:48:17.190198       1 reflector.go:225] Stopping reflector *v1.Service (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nW0130 05:48:17.190231       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\n
Jan 30 05:48:20.434 E ns/openshift-cluster-storage-operator pod/csi-snapshot-controller-operator-76f948cf74-zjksq node/ip-10-0-138-79.us-east-2.compute.internal container/csi-snapshot-controller-operator reason/ContainerExit code/1 cause/Error cing operator at 1.064686421s\nI0130 05:48:17.363063       1 operator.go:157] Starting syncing operator at 2023-01-30 05:48:17.363059762 +0000 UTC m=+3556.215927888\nI0130 05:48:18.255042       1 operator.go:159] Finished syncing operator at 891.973357ms\nI0130 05:48:18.255085       1 operator.go:157] Starting syncing operator at 2023-01-30 05:48:18.25508197 +0000 UTC m=+3557.107950096\nI0130 05:48:19.198799       1 operator.go:159] Finished syncing operator at 943.710588ms\nI0130 05:48:19.198835       1 operator.go:157] Starting syncing operator at 2023-01-30 05:48:19.198831238 +0000 UTC m=+3558.051699375\nI0130 05:48:19.282345       1 cmd.go:97] Received SIGTERM or SIGINT signal, shutting down controller.\nI0130 05:48:19.282407       1 genericapiserver.go:386] [graceful-termination] RunPreShutdownHooks has completed\nI0130 05:48:19.282437       1 genericapiserver.go:349] "[graceful-termination] shutdown event" name="ShutdownInitiated"\nI0130 05:48:19.282439       1 base_controller.go:167] Shutting down CSISnapshotWebhookController ...\nI0130 05:48:19.282449       1 genericapiserver.go:352] "[graceful-termination] shutdown event" name="AfterShutdownDelayDuration"\nI0130 05:48:19.282511       1 genericapiserver.go:376] "[graceful-termination] shutdown event" name="InFlightRequestsDrained"\nI0130 05:48:19.282534       1 base_controller.go:167] Shutting down StatusSyncer_csi-snapshot-controller ...\nI0130 05:48:19.282540       1 base_controller.go:145] All StatusSyncer_csi-snapshot-controller post start hooks have been terminated\nI0130 05:48:19.282550       1 base_controller.go:167] Shutting down StaticResourceController ...\nI0130 05:48:19.282558       1 base_controller.go:167] Shutting down LoggingSyncer ...\nI0130 05:48:19.282567       1 base_controller.go:167] Shutting down ManagementStateController ...\nW0130 05:48:19.282581       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\nI0130 05:48:19.282589       1 requestheader_controller.go:183] Shutting down RequestHeaderAuthRequestController\n
Jan 30 05:48:22.638 E ns/openshift-monitoring pod/cluster-monitoring-operator-894d44997-ck9kc node/ip-10-0-138-79.us-east-2.compute.internal container/kube-rbac-proxy reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 30 05:48:22.862 E ns/openshift-ingress-canary pod/ingress-canary-qt8zv node/ip-10-0-140-223.us-east-2.compute.internal container/serve-healthcheck-canary reason/ContainerExit code/2 cause/Error serving on 8888\nserving on 8080\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\n
Jan 30 05:48:30.887 E ns/openshift-monitoring pod/alertmanager-main-1 node/ip-10-0-201-170.us-east-2.compute.internal container/alertmanager-proxy reason/ContainerExit code/2 cause/Error 2023/01/30 05:00:40 provider.go:128: Defaulting client-id to system:serviceaccount:openshift-monitoring:alertmanager-main\n2023/01/30 05:00:40 provider.go:133: Defaulting client-secret to service account token /var/run/secrets/kubernetes.io/serviceaccount/token\n2023/01/30 05:00:40 provider.go:351: Delegation of authentication and authorization to OpenShift is enabled for bearer tokens and client certificates.\n2023/01/30 05:00:40 oauthproxy.go:203: mapping path "/" => upstream "http://localhost:9093/"\n2023/01/30 05:00:40 oauthproxy.go:230: OAuthProxy configured for  Client ID: system:serviceaccount:openshift-monitoring:alertmanager-main\n2023/01/30 05:00:40 oauthproxy.go:240: Cookie settings: name:_oauth_proxy secure(https):true httponly:true expiry:168h0m0s domain:<default> samesite: refresh:disabled\nI0130 05:00:40.747903       1 dynamic_serving_content.go:130] Starting serving::/etc/tls/private/tls.crt::/etc/tls/private/tls.key\n2023/01/30 05:00:40 http.go:107: HTTPS: listening on [::]:9095\n
#1619432009694711808 junit 2 days ago
Jan 28 21:48:15.168 E ns/openshift-cluster-storage-operator pod/csi-snapshot-webhook-987f7bc9c-cd8l5 node/ip-10-0-253-220.us-east-2.compute.internal container/webhook reason/ContainerExit code/2 cause/Error
Jan 28 21:48:16.794 E ns/openshift-cluster-storage-operator pod/csi-snapshot-controller-758f5b59c5-m6wf2 node/ip-10-0-130-198.us-east-2.compute.internal container/snapshot-controller reason/ContainerExit code/2 cause/Error
Jan 28 21:48:16.794 E ns/openshift-cluster-storage-operator pod/csi-snapshot-controller-758f5b59c5-m6wf2 node/ip-10-0-130-198.us-east-2.compute.internal container/snapshot-controller reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 28 21:48:17.604 E ns/openshift-monitoring pod/alertmanager-main-2 node/ip-10-0-162-202.us-east-2.compute.internal container/alertmanager-proxy reason/ContainerExit code/2 cause/Error 2023/01/28 20:55:26 provider.go:128: Defaulting client-id to system:serviceaccount:openshift-monitoring:alertmanager-main\n2023/01/28 20:55:26 provider.go:133: Defaulting client-secret to service account token /var/run/secrets/kubernetes.io/serviceaccount/token\n2023/01/28 20:55:26 provider.go:351: Delegation of authentication and authorization to OpenShift is enabled for bearer tokens and client certificates.\n2023/01/28 20:55:26 oauthproxy.go:203: mapping path "/" => upstream "http://localhost:9093/"\n2023/01/28 20:55:26 oauthproxy.go:230: OAuthProxy configured for  Client ID: system:serviceaccount:openshift-monitoring:alertmanager-main\n2023/01/28 20:55:26 oauthproxy.go:240: Cookie settings: name:_oauth_proxy secure(https):true httponly:true expiry:168h0m0s domain:<default> samesite: refresh:disabled\nI0128 20:55:26.353333       1 dynamic_serving_content.go:130] Starting serving::/etc/tls/private/tls.crt::/etc/tls/private/tls.key\n2023/01/28 20:55:26 http.go:107: HTTPS: listening on [::]:9095\n
Jan 28 21:48:17.604 E ns/openshift-monitoring pod/alertmanager-main-2 node/ip-10-0-162-202.us-east-2.compute.internal container/config-reloader reason/ContainerExit code/2 cause/Error level=info ts=2023-01-28T20:55:26.180578533Z caller=main.go:148 msg="Starting prometheus-config-reloader" version="(version=0.49.0, branch=rhaos-4.9-rhel-8, revision=d709566)"\nlevel=info ts=2023-01-28T20:55:26.180624734Z caller=main.go:149 build_context="(go=go1.16.12, user=root, date=20221205-20:41:17)"\nlevel=info ts=2023-01-28T20:55:26.180756326Z caller=main.go:183 msg="Starting web server for metrics" listen=localhost:8080\nlevel=info ts=2023-01-28T20:55:26.180882898Z caller=reloader.go:219 msg="started watching config file and directories for changes" cfg= out= dirs=/etc/alertmanager/config,/etc/alertmanager/secrets/alertmanager-main-tls,/etc/alertmanager/secrets/alertmanager-main-proxy,/etc/alertmanager/secrets/alertmanager-kube-rbac-proxy\nlevel=info ts=2023-01-28T20:55:27.315242305Z caller=reloader.go:355 msg="Reload triggered" cfg_in= cfg_out= watched_dirs="/etc/alertmanager/config, /etc/alertmanager/secrets/alertmanager-main-tls, /etc/alertmanager/secrets/alertmanager-main-proxy, /etc/alertmanager/secrets/alertmanager-kube-rbac-proxy"\n
Jan 28 21:48:17.794 E ns/openshift-console-operator pod/console-operator-6bbd4fcc8c-f4n28 node/ip-10-0-130-198.us-east-2.compute.internal container/console-operator reason/ContainerExit code/1 cause/Error  type: 'Normal' reason: 'TerminationStart' Received signal to terminate, becoming unready, but keeping serving\nI0128 21:48:16.198783       1 genericapiserver.go:709] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-console-operator", Name:"console-operator-6bbd4fcc8c-f4n28", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationMinimalShutdownDurationFinished' The minimal shutdown duration of 0s finished\nI0128 21:48:16.197785       1 base_controller.go:167] Shutting down LoggingSyncer ...\nI0128 21:48:16.197797       1 base_controller.go:167] Shutting down ResourceSyncController ...\nI0128 21:48:16.197806       1 base_controller.go:167] Shutting down ConsoleDownloadsDeploymentSyncController ...\nI0128 21:48:16.197816       1 base_controller.go:167] Shutting down StatusSyncer_console ...\nI0128 21:48:16.197823       1 base_controller.go:167] Shutting down DownloadsRouteController ...\nI0128 21:48:16.197829       1 base_controller.go:167] Shutting down ConsoleOperator ...\nI0128 21:48:16.197839       1 base_controller.go:167] Shutting down ConsoleServiceController ...\nI0128 21:48:16.197846       1 base_controller.go:167] Shutting down ConsoleServiceController ...\nI0128 21:48:16.197852       1 base_controller.go:167] Shutting down ConsoleRouteController ...\nI0128 21:48:16.197859       1 base_controller.go:167] Shutting down ConsoleCLIDownloadsController ...\nI0128 21:48:16.197867       1 base_controller.go:167] Shutting down ManagementStateController ...\nI0128 21:48:16.197874       1 base_controller.go:167] Shutting down UnsupportedConfigOverridesController ...\nI0128 21:48:16.197882       1 base_controller.go:167] Shutting down RemoveStaleConditionsController ...\nI0128 21:48:16.197889       1 base_controller.go:167] Shutting down HealthCheckController ...\nW0128 21:48:16.198003       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\nI0128 21:48:16.198019       1 base_controller.go:114] Shutting down worker of LoggingSyncer controller ...\n
Jan 28 21:48:17.794 E ns/openshift-console-operator pod/console-operator-6bbd4fcc8c-f4n28 node/ip-10-0-130-198.us-east-2.compute.internal container/console-operator reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 28 21:48:17.995 E ns/openshift-cluster-storage-operator pod/cluster-storage-operator-59456fcf98-zmcjb node/ip-10-0-253-220.us-east-2.compute.internal container/cluster-storage-operator reason/ContainerExit code/1 cause/Error 01860       1 controller.go:174] Existing StorageClass gp2 found, reconciling\nI0128 21:48:16.551026       1 cmd.go:97] Received SIGTERM or SIGINT signal, shutting down controller.\nI0128 21:48:16.551158       1 genericapiserver.go:386] [graceful-termination] RunPreShutdownHooks has completed\nI0128 21:48:16.551184       1 genericapiserver.go:349] "[graceful-termination] shutdown event" name="ShutdownInitiated"\nI0128 21:48:16.551365       1 base_controller.go:167] Shutting down SnapshotCRDController ...\nI0128 21:48:16.551378       1 base_controller.go:167] Shutting down CSIDriverStarter ...\nI0128 21:48:16.551386       1 base_controller.go:167] Shutting down AWSEBSCSIDriverOperator ...\nI0128 21:48:16.551390       1 base_controller.go:145] All AWSEBSCSIDriverOperator post start hooks have been terminated\nI0128 21:48:16.551399       1 base_controller.go:167] Shutting down ManagementStateController ...\nI0128 21:48:16.551433       1 base_controller.go:167] Shutting down DefaultStorageClassController ...\nI0128 21:48:16.551443       1 base_controller.go:167] Shutting down LoggingSyncer ...\nI0128 21:48:16.551454       1 base_controller.go:167] Shutting down StatusSyncer_storage ...\nI0128 21:48:16.551457       1 base_controller.go:145] All StatusSyncer_storage post start hooks have been terminated\nI0128 21:48:16.551464       1 base_controller.go:167] Shutting down ConfigObserver ...\nI0128 21:48:16.551473       1 base_controller.go:167] Shutting down VSphereProblemDetectorStarter ...\nI0128 21:48:16.551640       1 base_controller.go:114] Shutting down worker of SnapshotCRDController controller ...\nI0128 21:48:16.551647       1 base_controller.go:104] All SnapshotCRDController workers have been terminated\nI0128 21:48:16.551654       1 base_controller.go:114] Shutting down worker of CSIDriverStarter controller ...\nI0128 21:48:16.551657       1 base_controller.go:104] All CSIDriverStarter workers have been terminated\nW0128 21:48:16.551660       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\n
Jan 28 21:48:18.161 E ns/openshift-monitoring pod/node-exporter-cmtw9 node/ip-10-0-173-35.us-east-2.compute.internal container/node-exporter reason/ContainerExit code/143 cause/Error 8T20:41:00.162Z caller=node_exporter.go:113 collector=meminfo\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=netclass\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=netdev\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=netstat\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=nfs\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=nfsd\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=powersupplyclass\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=pressure\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=rapl\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=schedstat\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=sockstat\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=softnet\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=stat\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=textfile\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=thermal_zone\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=time\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=timex\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=udp_queues\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=uname\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=vmstat\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=xfs\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:113 collector=zfs\nlevel=info ts=2023-01-28T20:41:00.162Z caller=node_exporter.go:195 msg="Listening on" address=127.0.0.1:9100\nlevel=info ts=2023-01-28T20:41:00.162Z caller=tls_config.go:191 msg="TLS is disabled." http2=false\n
Jan 28 21:48:18.610 E ns/openshift-monitoring pod/telemeter-client-7499678659-ghh27 node/ip-10-0-162-202.us-east-2.compute.internal container/reload reason/ContainerExit code/2 cause/Error
Jan 28 21:48:18.610 E ns/openshift-monitoring pod/telemeter-client-7499678659-ghh27 node/ip-10-0-162-202.us-east-2.compute.internal container/telemeter-client reason/ContainerExit code/2 cause/Error
#1617762501636657152 junit 7 days ago
Jan 24 07:04:50.581 E ns/openshift-monitoring pod/telemeter-client-68bf5949bc-d6prh node/ip-10-0-199-158.ec2.internal container/reload reason/ContainerExit code/2 cause/Error
Jan 24 07:04:50.685 E clusterversion/version changed Failing to True: MultipleErrors: Multiple errors are preventing progress:\n* Cluster operator authentication is updating versions\n* Cluster operator cloud-credential is updating versions\n* Cluster operator cluster-autoscaler is updating versions\n* Cluster operator console is updating versions\n* Cluster operator csi-snapshot-controller is updating versions\n* Cluster operator image-registry is updating versions\n* Cluster operator ingress is updating versions\n* Cluster operator insights is updating versions\n* Cluster operator kube-storage-version-migrator is updating versions\n* Cluster operator machine-approver is updating versions\n* Cluster operator monitoring is updating versions\n* Cluster operator node-tuning is updating versions\n* Cluster operator openshift-apiserver is updating versions\n* Cluster operator openshift-controller-manager is updating versions\n* Cluster operator openshift-samples is updating versions\n* Cluster operator storage is updating versions
Jan 24 07:04:50.851 E ns/openshift-monitoring pod/prometheus-operator-6594997947-ds5gq node/ip-10-0-200-168.ec2.internal container/kube-rbac-proxy reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 24 07:04:52.830 E ns/openshift-monitoring pod/thanos-querier-68995ccb94-rxl98 node/ip-10-0-178-71.ec2.internal container/oauth-proxy reason/ContainerExit code/2 cause/Error 2023/01/24 06:19:20 provider.go:128: Defaulting client-id to system:serviceaccount:openshift-monitoring:thanos-querier\n2023/01/24 06:19:20 provider.go:133: Defaulting client-secret to service account token /var/run/secrets/kubernetes.io/serviceaccount/token\n2023/01/24 06:19:20 provider.go:351: Delegation of authentication and authorization to OpenShift is enabled for bearer tokens and client certificates.\n2023/01/24 06:19:20 oauthproxy.go:203: mapping path "/" => upstream "http://localhost:9090/"\n2023/01/24 06:19:20 oauthproxy.go:224: compiled skip-auth-regex => "^/-/(healthy|ready)$"\n2023/01/24 06:19:20 oauthproxy.go:230: OAuthProxy configured for  Client ID: system:serviceaccount:openshift-monitoring:thanos-querier\n2023/01/24 06:19:20 oauthproxy.go:240: Cookie settings: name:_oauth_proxy secure(https):true httponly:true expiry:168h0m0s domain:<default> samesite: refresh:disabled\n2023/01/24 06:19:20 main.go:156: using htpasswd file /etc/proxy/htpasswd/auth\nI0124 06:19:20.626844       1 dynamic_serving_content.go:130] Starting serving::/etc/tls/private/tls.crt::/etc/tls/private/tls.key\n2023/01/24 06:19:20 http.go:107: HTTPS: listening on [::]:9091\n
Jan 24 07:04:53.612 E ns/openshift-monitoring pod/thanos-querier-68995ccb94-t2b7v node/ip-10-0-199-158.ec2.internal container/oauth-proxy reason/ContainerExit code/2 cause/Error 2023/01/24 06:19:27 provider.go:128: Defaulting client-id to system:serviceaccount:openshift-monitoring:thanos-querier\n2023/01/24 06:19:27 provider.go:133: Defaulting client-secret to service account token /var/run/secrets/kubernetes.io/serviceaccount/token\n2023/01/24 06:19:27 provider.go:351: Delegation of authentication and authorization to OpenShift is enabled for bearer tokens and client certificates.\n2023/01/24 06:19:27 oauthproxy.go:203: mapping path "/" => upstream "http://localhost:9090/"\n2023/01/24 06:19:27 oauthproxy.go:224: compiled skip-auth-regex => "^/-/(healthy|ready)$"\n2023/01/24 06:19:27 oauthproxy.go:230: OAuthProxy configured for  Client ID: system:serviceaccount:openshift-monitoring:thanos-querier\n2023/01/24 06:19:27 oauthproxy.go:240: Cookie settings: name:_oauth_proxy secure(https):true httponly:true expiry:168h0m0s domain:<default> samesite: refresh:disabled\n2023/01/24 06:19:27 main.go:156: using htpasswd file /etc/proxy/htpasswd/auth\nI0124 06:19:27.067051       1 dynamic_serving_content.go:130] Starting serving::/etc/tls/private/tls.crt::/etc/tls/private/tls.key\n2023/01/24 06:19:27 http.go:107: HTTPS: listening on [::]:9091\nE0124 06:20:02.778900       1 reflector.go:127] github.com/openshift/oauth-proxy/providers/openshift/provider.go:347: Failed to watch *v1.ConfigMap: unknown (get configmaps)\n
Jan 24 07:04:55.547 E ns/openshift-console-operator pod/console-operator-6bbd4fcc8c-xlxj4 node/ip-10-0-163-71.ec2.internal container/console-operator reason/ContainerExit code/1 cause/Error  Shutting down worker of ConsoleRouteController controller ...\nI0124 07:04:54.268084       1 base_controller.go:114] Shutting down worker of RemoveStaleConditionsController controller ...\nI0124 07:04:54.268089       1 base_controller.go:114] Shutting down worker of ConsoleServiceController controller ...\nI0124 07:04:54.268094       1 base_controller.go:114] Shutting down worker of UnsupportedConfigOverridesController controller ...\nI0124 07:04:54.268099       1 base_controller.go:114] Shutting down worker of ManagementStateController controller ...\nI0124 07:04:54.287392       1 genericapiserver.go:709] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-console-operator", Name:"console-operator-6bbd4fcc8c-xlxj4", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationStart' Received signal to terminate, becoming unready, but keeping serving\nI0124 07:04:54.287438       1 genericapiserver.go:709] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-console-operator", Name:"console-operator-6bbd4fcc8c-xlxj4", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationMinimalShutdownDurationFinished' The minimal shutdown duration of 0s finished\nI0124 07:04:54.287455       1 genericapiserver.go:362] "[graceful-termination] shutdown event" name="AfterShutdownDelayDuration"\nI0124 07:04:54.287474       1 genericapiserver.go:709] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-console-operator", Name:"console-operator-6bbd4fcc8c-xlxj4", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationStoppedServing' Server has stopped listening\nI0124 07:04:54.287488       1 genericapiserver.go:387] "[graceful-termination] shutdown event" name="InFlightRequestsDrained"\nI0124 07:04:54.287501       1 base_controller.go:145] All StatusSyncer_console post start hooks have been terminated\nI0124 07:04:54.287509       1 base_controller.go:104] All ManagementStateController workers have been terminated\n
Jan 24 07:04:55.820 E ns/openshift-cluster-storage-operator pod/cluster-storage-operator-59456fcf98-qn8kz node/ip-10-0-136-92.ec2.internal container/cluster-storage-operator reason/ContainerExit code/1 cause/Error 7] Received SIGTERM or SIGINT signal, shutting down controller.\nI0124 07:04:53.058448       1 genericapiserver.go:349] "[graceful-termination] shutdown event" name="ShutdownInitiated"\nI0124 07:04:53.058461       1 genericapiserver.go:386] [graceful-termination] RunPreShutdownHooks has completed\nI0124 07:04:53.058471       1 base_controller.go:167] Shutting down SnapshotCRDController ...\nI0124 07:04:53.058482       1 base_controller.go:167] Shutting down DefaultStorageClassController ...\nI0124 07:04:53.058506       1 requestheader_controller.go:183] Shutting down RequestHeaderAuthRequestController\nI0124 07:04:53.058511       1 base_controller.go:167] Shutting down StatusSyncer_storage ...\nI0124 07:04:53.058515       1 base_controller.go:145] All StatusSyncer_storage post start hooks have been terminated\nI0124 07:04:53.058521       1 configmap_cafile_content.go:222] "Shutting down controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"\nI0124 07:04:53.058524       1 base_controller.go:167] Shutting down ConfigObserver ...\nI0124 07:04:53.058532       1 configmap_cafile_content.go:222] "Shutting down controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"\nI0124 07:04:53.058534       1 base_controller.go:167] Shutting down VSphereProblemDetectorStarter ...\nI0124 07:04:53.058544       1 base_controller.go:167] Shutting down CSIDriverStarter ...\nI0124 07:04:53.058551       1 base_controller.go:167] Shutting down AWSEBSCSIDriverOperator ...\nI0124 07:04:53.058555       1 base_controller.go:145] All AWSEBSCSIDriverOperator post start hooks have been terminated\nI0124 07:04:53.058563       1 base_controller.go:167] Shutting down ManagementStateController ...\nI0124 07:04:53.058592       1 base_controller.go:167] Shutting down LoggingSyncer ...\nW0124 07:04:53.058602       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\nI0124 07:04:53.058611       1 secure_serving.go:311] Stopped listening on [::]:8443\n
Jan 24 07:04:55.820 E ns/openshift-cluster-storage-operator pod/cluster-storage-operator-59456fcf98-qn8kz node/ip-10-0-136-92.ec2.internal container/cluster-storage-operator reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 24 07:05:02.848 E ns/openshift-ingress-canary pod/ingress-canary-2qdmx node/ip-10-0-178-71.ec2.internal container/serve-healthcheck-canary reason/ContainerExit code/2 cause/Error serving on 8888\nserving on 8080\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\n
Jan 24 07:05:03.742 E ns/openshift-monitoring pod/alertmanager-main-1 node/ip-10-0-199-158.ec2.internal container/alertmanager-proxy reason/ContainerExit code/2 cause/Error 2023/01/24 06:21:10 provider.go:128: Defaulting client-id to system:serviceaccount:openshift-monitoring:alertmanager-main\n2023/01/24 06:21:10 provider.go:133: Defaulting client-secret to service account token /var/run/secrets/kubernetes.io/serviceaccount/token\n2023/01/24 06:21:10 provider.go:351: Delegation of authentication and authorization to OpenShift is enabled for bearer tokens and client certificates.\n2023/01/24 06:21:10 oauthproxy.go:203: mapping path "/" => upstream "http://localhost:9093/"\n2023/01/24 06:21:10 oauthproxy.go:230: OAuthProxy configured for  Client ID: system:serviceaccount:openshift-monitoring:alertmanager-main\n2023/01/24 06:21:10 oauthproxy.go:240: Cookie settings: name:_oauth_proxy secure(https):true httponly:true expiry:168h0m0s domain:<default> samesite: refresh:disabled\nI0124 06:21:10.319291       1 dynamic_serving_content.go:130] Starting serving::/etc/tls/private/tls.crt::/etc/tls/private/tls.key\n2023/01/24 06:21:10 http.go:107: HTTPS: listening on [::]:9095\n
Jan 24 07:05:03.742 E ns/openshift-monitoring pod/alertmanager-main-1 node/ip-10-0-199-158.ec2.internal container/config-reloader reason/ContainerExit code/2 cause/Error level=info ts=2023-01-24T06:21:10.153775359Z caller=main.go:148 msg="Starting prometheus-config-reloader" version="(version=0.49.0, branch=rhaos-4.9-rhel-8, revision=d709566)"\nlevel=info ts=2023-01-24T06:21:10.153807209Z caller=main.go:149 build_context="(go=go1.16.12, user=root, date=20221205-20:41:17)"\nlevel=info ts=2023-01-24T06:21:10.153893421Z caller=main.go:183 msg="Starting web server for metrics" listen=localhost:8080\nlevel=info ts=2023-01-24T06:21:10.15439683Z caller=reloader.go:219 msg="started watching config file and directories for changes" cfg= out= dirs=/etc/alertmanager/config,/etc/alertmanager/secrets/alertmanager-main-tls,/etc/alertmanager/secrets/alertmanager-main-proxy,/etc/alertmanager/secrets/alertmanager-kube-rbac-proxy\nlevel=info ts=2023-01-24T06:21:12.176256477Z caller=reloader.go:355 msg="Reload triggered" cfg_in= cfg_out= watched_dirs="/etc/alertmanager/config, /etc/alertmanager/secrets/alertmanager-main-tls, /etc/alertmanager/secrets/alertmanager-main-proxy, /etc/alertmanager/secrets/alertmanager-kube-rbac-proxy"\n

Found in 12.50% of runs (25.00% of failures) across 24 total runs and 1 job (50.00% failed).