#1842002 bug 2 years ago KubePodCrashLooping kube-controller-manager cluster-policy-controller: 6443: connect: connection refused RELEASE_PENDING
$ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-4.5/2428/artifacts/e2e-gcp/events.json | jq -r '.items[] | select(.metadata.namespace == "openshift-kube-apiserver") | .firstTimestamp + " " + .lastTimestamp + " " + .message' | sort
...
2020-05-30T01:10:53Z 2020-05-30T01:10:53Z All pending requests processed
2020-05-30T01:10:53Z 2020-05-30T01:10:53Z Server has stopped listening
2020-05-30T01:10:53Z 2020-05-30T01:10:53Z The minimal shutdown duration of 1m10s finished
...
2020-05-30T01:11:58Z 2020-05-30T01:11:58Z Created container kube-apiserver-cert-regeneration-controller
2020-05-30T01:11:58Z 2020-05-30T01:11:58Z Created container kube-apiserver-cert-syncer
2020-05-30T01:11:58Z 2020-05-30T01:11:58Z Started container kube-apiserver
...
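To narrow the same query to just the graceful-termination lifecycle events, the command above can filter on the event reason; a minimal sketch, assuming each item in events.json carries its Kubernetes reason in .reason (the Termination* reasons quoted in the bugs below):
$ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-4.5/2428/artifacts/e2e-gcp/events.json \
    | jq -r '.items[]
        | select(.metadata.namespace == "openshift-kube-apiserver" and ((.reason // "") | startswith("Termination")))
        | (.firstTimestamp // "") + " " + .reason + " " + .message' \
    | sort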
#1934628 bug 17 months ago API server stopped reporting healthy during upgrade to 4.7.0 ASSIGNED
During that time the API server was restarted by the kubelet due to a failed liveness probe
14:18:00	openshift-kube-apiserver	kubelet	kube-apiserver-ip-10-0-159-123.ec2.internal	Killing	Container kube-apiserver failed liveness probe, will be restarted
14:19:17	openshift-kube-apiserver	apiserver	kube-apiserver-ip-10-0-159-123.ec2.internal	TerminationMinimalShutdownDurationFinished	The minimal shutdown duration of 1m10s finished
Moving to the etcd team to investigate why etcd was unavailable during that time.
Comment 15200626 by mfojtik@redhat.com at 2021-06-17T18:29:50Z
The LifecycleStale keyword was removed because the bug got commented on recently.
#1943804 bug 18 months ago API server on AWS takes disruption between 70s and 110s after pod begins termination via external LB RELEASE_PENDING
    "name": "kube-apiserver-ip-10-0-131-183.ec2.internal",
    "namespace": "openshift-kube-apiserver"
  },
  "kind": "Event",
  "lastTimestamp": null,
  "message": "The minimal shutdown duration of 1m10s finished",
  "metadata": {
    "creationTimestamp": "2021-03-29T12:18:04Z",
    "name": "kube-apiserver-ip-10-0-131-183.ec2.internal.1670cf61b0f72d2d",
    "namespace": "openshift-kube-apiserver",
    "resourceVersion": "89139",
#1921157 bug 23 months ago [sig-api-machinery] Kubernetes APIs remain available for new connections ASSIGNED
T2: At 06:45:58: systemd-shutdown was sending SIGTERM to remaining processes...
T3: At 06:45:58: kube-apiserver-ci-op-z52cbzhi-6d7cd-pz2jw-master-0: Received signal to terminate, becoming unready, but keeping serving (TerminationStart event)
T4: At 06:47:08 kube-apiserver-ci-op-z52cbzhi-6d7cd-pz2jw-master-0: The minimal shutdown duration of 1m10s finished (TerminationMinimalShutdownDurationFinished event)
T5: At 06:47:08 kube-apiserver-ci-op-z52cbzhi-6d7cd-pz2jw-master-0: Server has stopped listening (TerminationStoppedServing event)
T5 is the last event reported from that API server. At T5 the server may wait up to 60s for in-flight requests to complete, and then it fires the TerminationGracefulTerminationFinished event.
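To reconstruct that per-instance sequence for other runs, the same lifecycle events can be grepped out of the gathered monitor log; a minimal sketch, assuming the log has been downloaded locally as build-log.txt and uses the reason/... tokens quoted in the bugs below:
$ grep -E 'ns/openshift-kube-apiserver .*reason/(TerminationStart|TerminationMinimalShutdownDurationFinished|TerminationStoppedServing|TerminationGracefulTerminationFinished)' build-log.txt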
#1932097 bug 18 months ago Apiserver liveness probe is marking it as unhealthy during normal shutdown RELEASE_PENDING
Feb 23 20:18:04.212 - 1s    E kube-apiserver-new-connection kube-apiserver-new-connection is not responding to GET requests
Feb 23 20:18:05.318 I kube-apiserver-new-connection kube-apiserver-new-connection started responding to GET requests
Deeper detail from the node log shows that, right as we get this error, one of the instances finishes its graceful shutdown and stops listening.
Feb 23 20:18:02.505 I ns/openshift-kube-apiserver pod/kube-apiserver-ip-10-0-203-7.us-east-2.compute.internal node/ip-10-0-203-7 reason/TerminationMinimalShutdownDurationFinished The minimal shutdown duration of 1m10s finished
Feb 23 20:18:02.509 I ns/openshift-kube-apiserver pod/kube-apiserver-ip-10-0-203-7.us-east-2.compute.internal node/ip-10-0-203-7 reason/TerminationStoppedServing Server has stopped listening
Feb 23 20:18:03.148 I ns/openshift-console-operator deployment/console-operator reason/OperatorStatusChanged Status for clusteroperator/console changed: Degraded message changed from "CustomRouteSyncDegraded: the server is currently unable to handle the request (delete routes.route.openshift.io console-custom)\nSyncLoopRefreshDegraded: the server is currently unable to handle the request (get routes.route.openshift.io console)" to "SyncLoopRefreshDegraded: the server is currently unable to handle the request (get routes.route.openshift.io console)" (2 times)
Feb 23 20:18:03.880 E kube-apiserver-reused-connection kube-apiserver-reused-connection started failing: Get "https://api.ci-op-ivyvzgrr-0b477.origin-ci-int-aws.dev.rhcloud.com:6443/api/v1/namespaces/default": dial tcp 3.21.250.132:6443: connect: connection refused
This looks like the load balancer did not remove the kube-apiserver from rotation and kept sending it traffic, and the connections were not cleanly shut down - did something regress in how the apiserver handles connections during termination?
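One way to test that theory on a live rollout is to poll the terminating instance directly and confirm it reports not-ready (the signal the load balancer is supposed to drain on) well before it stops listening; a rough sketch, where <master-ip> is a placeholder for the master being rolled and /readyz is assumed to be reachable without credentials:
$ while sleep 1; do
    # expect /readyz to report not-ready shortly after TerminationStart, and 000 (refused) only after TerminationStoppedServing
    echo "$(date -u +%H:%M:%S) $(curl -sk -o /dev/null -w '%{http_code}' "https://<master-ip>:6443/readyz")"
  done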
#1995804 bug 15 months ago Rewrite carry "UPSTREAM: <carry>: create termination events" to lifecycleEvents RELEASE_PENDING
Use the new lifecycle event names for the events that we generate when an apiserver is gracefully terminating.
Comment 15454963 by kewang@redhat.com at 2021-09-03T09:36:37Z
$ w3m -dump -cols 200 'https://search.ci.openshift.org/?search=The+minimal+shutdown+duration&maxAge=168h&context=5&type=build-log&name=4%5C.9&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job' | grep -E 'kube-system node\/apiserver|openshift-kube-apiserver|openshift-apiserver' > test.log
$ grep 'The minimal shutdown duration of' test.log | head -2
Sep 03 05:22:37.000 I ns/openshift-kube-apiserver pod/kube-apiserver-ip-10-0-163-71.us-west-1.compute.internal node/ip-10-0-163-71 reason/AfterShutdownDelayDuration The minimal shutdown duration of 3m30s finished
Sep 03 05:22:37.000 I ns/openshift-kube-apiserver pod/kube-apiserver-ip-10-0-163-71.us-west-1.compute.internal node/ip-10-0-163-71 reason/AfterShutdownDelayDuration The minimal shutdown duration of 3m30s finished
$ grep 'Received signal to terminate' test.log | head -2
Sep 03 08:49:11.000 I ns/default namespace/kube-system node/apiserver-75cf4778cb-9zk42 reason/TerminationStart Received signal to terminate, becoming unready, but keeping serving
Sep 03 08:53:40.000 I ns/default namespace/kube-system node/apiserver-75cf4778cb-c8429 reason/TerminationStart Received signal to terminate, becoming unready, but keeping serving
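To see which lifecycle reasons dominate those matches, a quick tally over the same test.log:
$ grep -oE 'reason/[A-Za-z]+' test.log | sort | uniq -c | sort -rn | head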
#1955333 bug 11 months ago "Kubernetes APIs remain available for new connections" and similar failing on 4.8 Azure updates NEW
  2021-05-01T03:59:42Z 1 kube-apiserver-ip-10-0-189-59.ec2.internal Killing: Stopping container kube-apiserver-check-endpoints
  2021-05-01T03:59:42Z 1 kube-apiserver-ip-10-0-189-59.ec2.internal Killing: Stopping container kube-apiserver-insecure-readyz
  2021-05-01T03:59:43Z null kube-apiserver-ip-10-0-189-59.ec2.internal TerminationPreShutdownHooksFinished: All pre-shutdown hooks have been finished
  2021-05-01T03:59:43Z null kube-apiserver-ip-10-0-189-59.ec2.internal TerminationStart: Received signal to terminate, becoming unready, but keeping serving
  2021-05-01T03:59:49Z 1 cert-regeneration-controller-lock LeaderElection: ip-10-0-239-74_02f2b687-97f4-44c4-9516-e3fb364deb85 became leader
  2021-05-01T04:00:53Z null kube-apiserver-ip-10-0-189-59.ec2.internal TerminationMinimalShutdownDurationFinished: The minimal shutdown duration of 1m10s finished
  2021-05-01T04:00:53Z null kube-apiserver-ip-10-0-189-59.ec2.internal TerminationStoppedServing: Server has stopped listening
  2021-05-01T04:01:53Z null kube-apiserver-ip-10-0-189-59.ec2.internal TerminationGracefulTerminationFinished: All pending requests processed
  2021-05-01T04:01:55Z 1 kube-apiserver-ip-10-0-189-59.ec2.internal Pulling: Pulling image "registry.ci.openshift.org/ocp/4.8-2021-04-30-212732@sha256:e4c7be2f0e8b1e9ef1ad9161061449ec1bdc6953a58f6d456971ee945a8d3197"
  2021-05-01T04:02:05Z 1 kube-apiserver-ip-10-0-189-59.ec2.internal Created: Created container setup
  2021-05-01T04:02:05Z 1 kube-apiserver-ip-10-0-189-59.ec2.internal Pulled: Container image "registry.ci.openshift.org/ocp/4.8-2021-04-30-212732@sha256:e4c7be2f0e8b1e9ef1ad9161061449ec1bdc6953a58f6d456971ee945a8d3197" already present on machine
That really looks like the kube-apiserver rolling out a new version, and for some reason the graceful LB handoff we need to avoid connection issues is not happening. Unifying the two timelines:
* 03:59:43Z TerminationPreShutdownHooksFinished
* 03:59:43Z TerminationStart: Received signal to terminate, becoming unready, but keeping serving
* 04:00:53Z TerminationMinimalShutdownDurationFinished: The minimal shutdown duration of 1m10s finished
* 04:00:53Z TerminationStoppedServing: Server has stopped listening
* 04:00:58.307Z kube-apiserver-new-connection started failing... connection refused
* 04:00:59.314Z kube-apiserver-new-connection started responding to GET requests
* 04:01:03.307Z kube-apiserver-new-connection started failing... connection refused
* 04:01:04.313Z kube-apiserver-new-connection started responding to GET requests
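So the first refused new connection lands only about five seconds after TerminationStoppedServing, even though the instance had been advertising not-ready since TerminationStart at 03:59:43Z, i.e. the LB had the full 70s drain window and still sent traffic. A quick check of that delta (GNU date, timestamps from the timeline above):
$ echo "$(( $(date -u -d 2021-05-01T04:00:58Z +%s) - $(date -u -d 2021-05-01T04:00:53Z +%s) ))s"
5s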
#1979916 bug 18 months ago kube-apiserver constantly receiving signals to terminate after a fresh install, but still keeps serving ASSIGNED
kube-apiserver-master-0-2
Server has stopped listening
kube-apiserver-master-0-2
The minimal shutdown duration of 1m10s finished
redhat-operators-7p4nb
Stopping container registry-server
Successfully pulled image "registry.redhat.io/redhat/redhat-operator-index:v4.8" in 3.09180991s
periodic-ci-openshift-release-master-nightly-4.9-e2e-metal-ipi-upgrade (all) - 12 runs, 50% failed, 50% of failures match = 25% impact
#1615676727529312256 junit 12 days ago
Jan 18 13:10:19.351 E ns/openshift-insights pod/insights-operator-6965ff48d7-dt8hv node/master-1 container/insights-operator reason/ContainerExit code/2 cause/Error ] Refreshing configuration from cluster pull secret\nI0118 13:08:32.091779       1 configobserver.go:102] Found cloud.openshift.com token\nI0118 13:08:32.091817       1 configobserver.go:120] Refreshing configuration from cluster secret\nI0118 13:08:32.093947       1 configobserver.go:124] Support secret does not exist\nI0118 13:08:38.409587       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="3.607384ms" userAgent="Prometheus/2.29.2" audit-ID="6f09a869-b066-4afe-b34f-a6170fa474d3" srcIP="10.128.2.9:35172" resp=200\nI0118 13:08:48.109250       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="3.588203ms" userAgent="Prometheus/2.29.2" audit-ID="454b780a-29c4-4ee6-b6a9-436a91f3b0f1" srcIP="10.131.0.23:45776" resp=200\nI0118 13:09:08.413193       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="6.84102ms" userAgent="Prometheus/2.29.2" audit-ID="48780fdb-8520-43f7-805e-c3313da4e3ba" srcIP="10.128.2.9:35172" resp=200\nI0118 13:09:18.113548       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="7.530784ms" userAgent="Prometheus/2.29.2" audit-ID="844b6058-43b6-4ec8-b8ea-76cac1f575a1" srcIP="10.131.0.23:45776" resp=200\nI0118 13:09:38.410199       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="3.669205ms" userAgent="Prometheus/2.29.2" audit-ID="9db25390-5873-46df-8f46-eee7d916d274" srcIP="10.128.2.9:35172" resp=200\nI0118 13:09:48.109085       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="3.211499ms" userAgent="Prometheus/2.29.2" audit-ID="82800069-823a-426a-bb38-3c5cd3b57d81" srcIP="10.131.0.23:45776" resp=200\nI0118 13:09:56.976691       1 status.go:178] Failed to download Insights report\nI0118 13:09:56.976731       1 status.go:354] The operator is healthy\nI0118 13:09:56.976789       1 status.go:441] No status update necessary, objects are identical\nI0118 13:10:08.413046       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="6.577009ms" userAgent="Prometheus/2.29.2" audit-ID="173be9a0-7124-4094-998f-d78add0a5555" srcIP="10.128.2.9:35172" resp=200\n
Jan 18 13:10:19.351 E ns/openshift-insights pod/insights-operator-6965ff48d7-dt8hv node/master-1 container/insights-operator reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 18 13:10:19.374 E ns/openshift-controller-manager-operator pod/openshift-controller-manager-operator-6f4c656958-6z47s node/master-1 container/openshift-controller-manager-operator reason/ContainerExit code/1 cause/Error "Shutting down controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"\nI0118 13:10:18.084900       1 reflector.go:225] Stopping reflector *v1.Secret (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0118 13:10:18.084898       1 tlsconfig.go:255] "Shutting down DynamicServingCertificateController"\nI0118 13:10:18.084936       1 reflector.go:225] Stopping reflector *v1.Role (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0118 13:10:18.084941       1 dynamic_serving_content.go:144] "Shutting down controller" name="serving-cert::/var/run/secrets/serving-cert/tls.crt::/var/run/secrets/serving-cert/tls.key"\nI0118 13:10:18.084957       1 reflector.go:225] Stopping reflector *v1.ClusterRole (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0118 13:10:18.084989       1 reflector.go:225] Stopping reflector *v1.Secret (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0118 13:10:18.085009       1 operator.go:115] Shutting down OpenShiftControllerManagerOperator\nI0118 13:10:18.085012       1 secure_serving.go:301] Stopped listening on [::]:8443\nI0118 13:10:18.085016       1 reflector.go:225] Stopping reflector *v1.RoleBinding (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0118 13:10:18.085025       1 reflector.go:225] Stopping reflector *v1.Deployment (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0118 13:10:18.085033       1 reflector.go:225] Stopping reflector *v1.ConfigMap (12h0m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0118 13:10:18.085034       1 genericapiserver.go:363] "[graceful-termination] shutdown event" name="HTTPServerStoppedListening"\nI0118 13:10:18.084655       1 reflector.go:225] Stopping reflector *v1.ClusterRoleBinding (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nW0118 13:10:18.085049       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\n
Jan 18 13:10:19.374 E ns/openshift-controller-manager-operator pod/openshift-controller-manager-operator-6f4c656958-6z47s node/master-1 container/openshift-controller-manager-operator reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 18 13:10:20.365 E ns/openshift-authentication-operator pod/authentication-operator-6db67cd596-rzq8x node/master-1 container/authentication-operator reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 18 13:10:23.147 E ns/openshift-console-operator pod/console-operator-c4d9985dc-s67w5 node/master-0 container/console-operator reason/ContainerExit code/1 cause/Error  1 genericapiserver.go:709] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-console-operator", Name:"console-operator-c4d9985dc-s67w5", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationMinimalShutdownDurationFinished' The minimal shutdown duration of 0s finished\nI0118 13:10:22.204488       1 genericapiserver.go:362] "[graceful-termination] shutdown event" name="AfterShutdownDelayDuration"\nI0118 13:10:22.205187       1 base_controller.go:167] Shutting down LoggingSyncer ...\nI0118 13:10:22.205211       1 base_controller.go:167] Shutting down StatusSyncer_console ...\nI0118 13:10:22.205216       1 base_controller.go:145] All StatusSyncer_console post start hooks have been terminated\nI0118 13:10:22.205225       1 base_controller.go:167] Shutting down ManagementStateController ...\nI0118 13:10:22.205241       1 base_controller.go:167] Shutting down UnsupportedConfigOverridesController ...\nI0118 13:10:22.205250       1 base_controller.go:167] Shutting down ResourceSyncController ...\nI0118 13:10:22.205259       1 base_controller.go:167] Shutting down ConsoleCLIDownloadsController ...\nI0118 13:10:22.205269       1 base_controller.go:167] Shutting down HealthCheckController ...\nI0118 13:10:22.205276       1 base_controller.go:167] Shutting down ConsoleServiceController ...\nI0118 13:10:22.205284       1 base_controller.go:167] Shutting down ConsoleOperator ...\nI0118 13:10:22.205291       1 base_controller.go:167] Shutting down ConsoleDownloadsDeploymentSyncController ...\nI0118 13:10:22.205298       1 base_controller.go:167] Shutting down DownloadsRouteController ...\nI0118 13:10:22.205305       1 base_controller.go:167] Shutting down ConsoleRouteController ...\nI0118 13:10:22.205342       1 base_controller.go:167] Shutting down ConsoleServiceController ...\nI0118 13:10:22.205353       1 base_controller.go:167] Shutting down RemoveStaleConditionsController ...\nW0118 13:10:22.205578       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\n
Jan 18 13:10:24.510 E ns/openshift-monitoring pod/cluster-monitoring-operator-556878b54d-r7rqs node/master-1 container/kube-rbac-proxy reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 18 13:10:27.474 E ns/openshift-cluster-storage-operator pod/csi-snapshot-controller-operator-6d84c868bd-59mfc node/master-1 container/csi-snapshot-controller-operator reason/ContainerExit code/1 cause/Error ler.go:167] Shutting down LoggingSyncer ...\nI0118 13:10:26.202669       1 secure_serving.go:311] Stopped listening on [::]:8443\nI0118 13:10:26.202705       1 genericapiserver.go:363] "[graceful-termination] shutdown event" name="HTTPServerStoppedListening"\nI0118 13:10:26.202741       1 configmap_cafile_content.go:222] "Shutting down controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"\nI0118 13:10:26.202758       1 configmap_cafile_content.go:222] "Shutting down controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"\nI0118 13:10:26.202759       1 requestheader_controller.go:183] Shutting down RequestHeaderAuthRequestController\nI0118 13:10:26.202804       1 base_controller.go:114] Shutting down worker of StatusSyncer_csi-snapshot-controller controller ...\nI0118 13:10:26.202831       1 base_controller.go:104] All StatusSyncer_csi-snapshot-controller workers have been terminated\nI0118 13:10:26.202874       1 base_controller.go:114] Shutting down worker of StaticResourceController controller ...\nI0118 13:10:26.202882       1 base_controller.go:104] All StaticResourceController workers have been terminated\nI0118 13:10:26.202893       1 tlsconfig.go:255] "Shutting down DynamicServingCertificateController"\nI0118 13:10:26.202910       1 dynamic_serving_content.go:144] "Shutting down controller" name="serving-cert::/tmp/serving-cert-809261133/tls.crt::/tmp/serving-cert-809261133/tls.key"\nI0118 13:10:26.203029       1 base_controller.go:114] Shutting down worker of ManagementStateController controller ...\nI0118 13:10:26.203034       1 base_controller.go:104] All ManagementStateController workers have been terminated\nI0118 13:10:26.203038       1 base_controller.go:114] Shutting down worker of LoggingSyncer controller ...\nI0118 13:10:26.203047       1 base_controller.go:104] All LoggingSyncer workers have been terminated\nW0118 13:10:26.203061       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\n
Jan 18 13:10:27.574 E ns/openshift-cluster-storage-operator pod/cluster-storage-operator-547d494d86-67rbw node/master-1 container/cluster-storage-operator reason/ContainerExit code/1 cause/Error ing] validServingFor=[cluster-storage-operator-metrics.openshift-cluster-storage-operator.svc,cluster-storage-operator-metrics.openshift-cluster-storage-operator.svc.cluster.local] issuer=\"openshift-service-serving-signer@1674044227\" (2023-01-18 12:17:18 +0000 UTC to 2025-01-17 12:17:19 +0000 UTC (now=2023-01-18 12:20:41.495386289 +0000 UTC))"\nI0118 12:20:41.495529       1 named_certificates.go:53] "Loaded SNI cert" index=0 certName="self-signed loopback" certDetail="\"apiserver-loopback-client@1674044296\" [serving] validServingFor=[apiserver-loopback-client] issuer=\"apiserver-loopback-client-ca@1674044296\" (2023-01-18 11:18:16 +0000 UTC to 2024-01-18 11:18:16 +0000 UTC (now=2023-01-18 12:20:41.495515801 +0000 UTC))"\nI0118 13:10:26.179046       1 cmd.go:97] Received SIGTERM or SIGINT signal, shutting down controller.\nI0118 13:10:26.179120       1 genericapiserver.go:386] [graceful-termination] RunPreShutdownHooks has completed\nI0118 13:10:26.179153       1 genericapiserver.go:349] "[graceful-termination] shutdown event" name="ShutdownInitiated"\nI0118 13:10:26.179200       1 base_controller.go:167] Shutting down SnapshotCRDController ...\nI0118 13:10:26.179216       1 base_controller.go:167] Shutting down CSIDriverStarter ...\nI0118 13:10:26.179223       1 base_controller.go:167] Shutting down DefaultStorageClassController ...\nI0118 13:10:26.179237       1 base_controller.go:167] Shutting down ConfigObserver ...\nI0118 13:10:26.179247       1 base_controller.go:167] Shutting down StatusSyncer_storage ...\nI0118 13:10:26.179250       1 base_controller.go:145] All StatusSyncer_storage post start hooks have been terminated\nI0118 13:10:26.179257       1 base_controller.go:167] Shutting down ManagementStateController ...\nI0118 13:10:26.179263       1 base_controller.go:167] Shutting down LoggingSyncer ...\nI0118 13:10:26.179270       1 base_controller.go:167] Shutting down VSphereProblemDetectorStarter ...\nW0118 13:10:26.179379       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\n
Jan 18 13:10:27.574 E ns/openshift-cluster-storage-operator pod/cluster-storage-operator-547d494d86-67rbw node/master-1 container/cluster-storage-operator reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 18 13:10:34.897 E clusterversion/version changed Failing to True: MultipleErrors: Multiple errors are preventing progress:\n* Cluster operator authentication is updating versions\n* Cluster operator cloud-credential is updating versions\n* Cluster operator cluster-autoscaler is updating versions\n* Cluster operator console is updating versions\n* Cluster operator csi-snapshot-controller is updating versions\n* Cluster operator image-registry is updating versions\n* Cluster operator ingress is updating versions\n* Cluster operator insights is updating versions\n* Cluster operator machine-approver is updating versions\n* Cluster operator marketplace is updating versions\n* Cluster operator monitoring is updating versions\n* Cluster operator node-tuning is updating versions\n* Cluster operator openshift-controller-manager is updating versions\n* Cluster operator openshift-samples is updating versions\n* Cluster operator operator-lifecycle-manager is updating versions\n* Cluster operator storage is updating versions
#1620268972702175232 junit about an hour ago
Jan 31 06:14:50.997 E ns/openshift-insights pod/insights-operator-7958685c99-fgxml node/master-1 container/insights-operator reason/ContainerExit code/2 cause/Error 131.0.21:37478" resp=200\nI0131 06:12:51.451658       1 configobserver.go:77] Refreshing configuration from cluster pull secret\nI0131 06:12:51.454551       1 configobserver.go:102] Found cloud.openshift.com token\nI0131 06:12:51.454577       1 configobserver.go:120] Refreshing configuration from cluster secret\nI0131 06:12:51.456617       1 configobserver.go:124] Support secret does not exist\nI0131 06:13:05.845621       1 status.go:354] The operator is healthy\nI0131 06:13:05.845686       1 status.go:441] No status update necessary, objects are identical\nI0131 06:13:16.683919       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="7.214702ms" userAgent="Prometheus/2.29.2" audit-ID="758884b1-8f53-4447-b46f-d1b05723a65e" srcIP="10.131.0.20:50938" resp=200\nI0131 06:13:21.393139       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="4.715735ms" userAgent="Prometheus/2.29.2" audit-ID="f0fedea3-528b-46c5-89aa-8fe4563af8ce" srcIP="10.131.0.21:37478" resp=200\nI0131 06:13:46.680457       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="3.253377ms" userAgent="Prometheus/2.29.2" audit-ID="abd1b16c-5b78-4fd4-9cd9-e965f91fb25b" srcIP="10.131.0.20:50938" resp=200\nI0131 06:13:51.394886       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="7.196708ms" userAgent="Prometheus/2.29.2" audit-ID="3f0ab73f-73d2-4499-838e-74fdbef8073a" srcIP="10.131.0.21:37478" resp=200\nI0131 06:14:16.683099       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="6.178153ms" userAgent="Prometheus/2.29.2" audit-ID="96b3a922-1a00-4498-b91a-f01a363c21f2" srcIP="10.131.0.20:50938" resp=200\nI0131 06:14:21.392328       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="4.665729ms" userAgent="Prometheus/2.29.2" audit-ID="0d86b69a-2120-483c-9d2f-15c6c4275956" srcIP="10.131.0.21:37478" resp=200\nI0131 06:14:46.680534       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="2.902423ms" userAgent="Prometheus/2.29.2" audit-ID="8e4c4714-3bff-4b9d-8409-e0492abd7e06" srcIP="10.131.0.20:50938" resp=200\n
Jan 31 06:14:53.034 E ns/openshift-controller-manager-operator pod/openshift-controller-manager-operator-77c7c44dcd-jnw4d node/master-1 container/openshift-controller-manager-operator reason/ContainerExit code/1 cause/Error server.go:352] "[graceful-termination] shutdown event" name="AfterShutdownDelayDuration"\nI0131 06:14:52.206164       1 genericapiserver.go:376] "[graceful-termination] shutdown event" name="InFlightRequestsDrained"\nI0131 06:14:52.206256       1 base_controller.go:167] Shutting down UserCAObservationController ...\nI0131 06:14:52.206293       1 base_controller.go:167] Shutting down StaticResourceController ...\nI0131 06:14:52.206574       1 base_controller.go:167] Shutting down ConfigObserver ...\nI0131 06:14:52.206651       1 base_controller.go:167] Shutting down ResourceSyncController ...\nI0131 06:14:52.206876       1 reflector.go:225] Stopping reflector *v1.Build (10m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0131 06:14:52.207015       1 reflector.go:225] Stopping reflector *v1.ConfigMap (12h0m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0131 06:14:52.207086       1 reflector.go:225] Stopping reflector *v1.ConfigMap (12h0m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0131 06:14:52.207148       1 reflector.go:225] Stopping reflector *v1.ConfigMap (12h0m0s) from k8s.io/client-go@v0.22.0-rc.0/tools/cache/reflector.go:167\nI0131 06:14:52.207235       1 base_controller.go:114] Shutting down worker of ConfigObserver controller ...\nI0131 06:14:52.209956       1 base_controller.go:104] All ConfigObserver workers have been terminated\nI0131 06:14:52.207261       1 base_controller.go:167] Shutting down StatusSyncer_openshift-controller-manager ...\nI0131 06:14:52.209969       1 base_controller.go:145] All StatusSyncer_openshift-controller-manager post start hooks have been terminated\nI0131 06:14:52.207267       1 base_controller.go:114] Shutting down worker of StatusSyncer_openshift-controller-manager controller ...\nI0131 06:14:52.209974       1 base_controller.go:104] All StatusSyncer_openshift-controller-manager workers have been terminated\nW0131 06:14:52.207275       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\n
Jan 31 06:14:59.083 E ns/openshift-cluster-storage-operator pod/csi-snapshot-controller-operator-5dc5c8f588-kmmt5 node/master-1 container/csi-snapshot-controller-operator reason/ContainerExit code/1 cause/Error  UTC m=+4109.534987426\nI0131 06:14:58.158602       1 cmd.go:97] Received SIGTERM or SIGINT signal, shutting down controller.\nI0131 06:14:58.159226       1 genericapiserver.go:386] [graceful-termination] RunPreShutdownHooks has completed\nI0131 06:14:58.159412       1 genericapiserver.go:349] "[graceful-termination] shutdown event" name="ShutdownInitiated"\nI0131 06:14:58.159595       1 base_controller.go:167] Shutting down CSISnapshotWebhookController ...\nI0131 06:14:58.159795       1 base_controller.go:114] Shutting down worker of CSISnapshotWebhookController controller ...\nI0131 06:14:58.159838       1 base_controller.go:104] All CSISnapshotWebhookController workers have been terminated\nI0131 06:14:58.159858       1 base_controller.go:167] Shutting down StatusSyncer_csi-snapshot-controller ...\nI0131 06:14:58.159874       1 base_controller.go:145] All StatusSyncer_csi-snapshot-controller post start hooks have been terminated\nI0131 06:14:58.159883       1 base_controller.go:167] Shutting down ManagementStateController ...\nI0131 06:14:58.159890       1 base_controller.go:114] Shutting down worker of StatusSyncer_csi-snapshot-controller controller ...\nI0131 06:14:58.159894       1 base_controller.go:104] All StatusSyncer_csi-snapshot-controller workers have been terminated\nI0131 06:14:58.159969       1 base_controller.go:114] Shutting down worker of ManagementStateController controller ...\nI0131 06:14:58.159976       1 base_controller.go:104] All ManagementStateController workers have been terminated\nI0131 06:14:58.159978       1 base_controller.go:167] Shutting down LoggingSyncer ...\nI0131 06:14:58.159982       1 base_controller.go:114] Shutting down worker of LoggingSyncer controller ...\nI0131 06:14:58.159986       1 base_controller.go:104] All LoggingSyncer workers have been terminated\nI0131 06:14:58.159987       1 base_controller.go:114] Shutting down worker of StaticResourceController controller ...\nW0131 06:14:58.160010       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\n
Jan 31 06:15:00.125 E ns/openshift-cluster-storage-operator pod/cluster-storage-operator-59988f7bbd-9s4z7 node/master-1 container/cluster-storage-operator reason/ContainerExit code/1 cause/Error Starting #1 worker of DefaultStorageClassController controller ...\nI0131 05:10:52.539901       1 base_controller.go:73] Caches are synced for StatusSyncer_storage \nI0131 05:10:52.539908       1 base_controller.go:110] Starting #1 worker of StatusSyncer_storage controller ...\nI0131 05:10:52.539915       1 base_controller.go:73] Caches are synced for VSphereProblemDetectorStarter \nI0131 05:10:52.539917       1 base_controller.go:110] Starting #1 worker of VSphereProblemDetectorStarter controller ...\nI0131 05:10:52.539956       1 base_controller.go:73] Caches are synced for LoggingSyncer \nI0131 05:10:52.539988       1 base_controller.go:110] Starting #1 worker of LoggingSyncer controller ...\nI0131 05:10:52.639228       1 base_controller.go:73] Caches are synced for SnapshotCRDController \nI0131 05:10:52.639248       1 base_controller.go:110] Starting #1 worker of SnapshotCRDController controller ...\nI0131 06:14:58.159671       1 cmd.go:97] Received SIGTERM or SIGINT signal, shutting down controller.\nI0131 06:14:58.160181       1 genericapiserver.go:386] [graceful-termination] RunPreShutdownHooks has completed\nI0131 06:14:58.160212       1 genericapiserver.go:349] "[graceful-termination] shutdown event" name="ShutdownInitiated"\nI0131 06:14:58.160291       1 base_controller.go:167] Shutting down SnapshotCRDController ...\nI0131 06:14:58.160311       1 configmap_cafile_content.go:222] "Shutting down controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"\nI0131 06:14:58.160323       1 base_controller.go:114] Shutting down worker of SnapshotCRDController controller ...\nI0131 06:14:58.160326       1 requestheader_controller.go:183] Shutting down RequestHeaderAuthRequestController\nW0131 06:14:58.160331       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\nI0131 06:14:58.160331       1 base_controller.go:104] All SnapshotCRDController workers have been terminated\nI0131 06:14:58.160329       1 base_controller.go:167] Shutting down CSIDriverStarter ...\n
Jan 31 06:15:00.125 E ns/openshift-cluster-storage-operator pod/cluster-storage-operator-59988f7bbd-9s4z7 node/master-1 container/cluster-storage-operator reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 31 06:15:00.262 E ns/openshift-console-operator pod/console-operator-7cd47d569f-d6f7x node/master-0 container/console-operator reason/ContainerExit code/1 cause/Error or on the server ("unable to decode an event from the watch stream: stream error: stream ID 1213; INTERNAL_ERROR") has prevented the request from succeeding\nI0131 06:14:59.304575       1 cmd.go:97] Received SIGTERM or SIGINT signal, shutting down controller.\nI0131 06:14:59.304652       1 genericapiserver.go:398] [graceful-termination] RunPreShutdownHooks has completed\nI0131 06:14:59.304677       1 base_controller.go:167] Shutting down ConsoleServiceController ...\nI0131 06:14:59.304675       1 genericapiserver.go:709] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-console-operator", Name:"console-operator-7cd47d569f-d6f7x", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationPreShutdownHooksFinished' All pre-shutdown hooks have been finished\nI0131 06:14:59.304685       1 genericapiserver.go:355] "[graceful-termination] shutdown event" name="ShutdownInitiated"\nI0131 06:14:59.304702       1 base_controller.go:167] Shutting down ConsoleDownloadsDeploymentSyncController ...\nI0131 06:14:59.304709       1 base_controller.go:167] Shutting down ManagementStateController ...\nI0131 06:14:59.304703       1 genericapiserver.go:709] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-console-operator", Name:"console-operator-7cd47d569f-d6f7x", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationStart' Received signal to terminate, becoming unready, but keeping serving\nI0131 06:14:59.304732       1 base_controller.go:167] Shutting down RemoveStaleConditionsController ...\nW0131 06:14:59.304732       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\nI0131 06:14:59.304736       1 genericapiserver.go:709] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-console-operator", Name:"console-operator-7cd47d569f-d6f7x", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationMinimalShutdownDurationFinished' The minimal shutdown duration of 0s finished\n
Jan 31 06:15:02.454 E ns/openshift-ingress-canary pod/ingress-canary-bc98v node/worker-0 container/serve-healthcheck-canary reason/ContainerExit code/2 cause/Error serving on 8888\nserving on 8080\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\nServing canary healthcheck request\n
Jan 31 06:15:06.235 E ns/openshift-service-ca-operator pod/service-ca-operator-bf9d548c6-z26h7 node/master-1 container/service-ca-operator reason/ContainerExit code/1 cause/Error
Jan 31 06:15:08.250 E ns/openshift-service-ca pod/service-ca-69fd4dcb48-hcjdv node/master-1 container/service-ca-controller reason/ContainerExit code/1 cause/Error
Jan 31 06:15:11.004 E ns/openshift-monitoring pod/alertmanager-main-2 node/worker-2 container/alertmanager-proxy reason/ContainerExit code/2 cause/Error 2023/01/31 05:17:53 provider.go:128: Defaulting client-id to system:serviceaccount:openshift-monitoring:alertmanager-main\n2023/01/31 05:17:53 provider.go:133: Defaulting client-secret to service account token /var/run/secrets/kubernetes.io/serviceaccount/token\n2023/01/31 05:17:53 provider.go:351: Delegation of authentication and authorization to OpenShift is enabled for bearer tokens and client certificates.\n2023/01/31 05:17:53 oauthproxy.go:203: mapping path "/" => upstream "http://localhost:9093/"\n2023/01/31 05:17:53 oauthproxy.go:230: OAuthProxy configured for  Client ID: system:serviceaccount:openshift-monitoring:alertmanager-main\n2023/01/31 05:17:53 oauthproxy.go:240: Cookie settings: name:_oauth_proxy secure(https):true httponly:true expiry:168h0m0s domain:<default> samesite: refresh:disabled\n2023/01/31 05:17:53 http.go:107: HTTPS: listening on [::]:9095\nI0131 05:17:53.982434       1 dynamic_serving_content.go:130] Starting serving::/etc/tls/private/tls.crt::/etc/tls/private/tls.key\n
Jan 31 06:15:11.004 E ns/openshift-monitoring pod/alertmanager-main-2 node/worker-2 container/config-reloader reason/ContainerExit code/2 cause/Error level=info ts=2023-01-31T05:17:53.043482931Z caller=main.go:148 msg="Starting prometheus-config-reloader" version="(version=0.49.0, branch=rhaos-4.9-rhel-8, revision=d709566)"\nlevel=info ts=2023-01-31T05:17:53.043516986Z caller=main.go:149 build_context="(go=go1.16.12, user=root, date=20221205-20:41:17)"\nlevel=info ts=2023-01-31T05:17:53.043604053Z caller=main.go:183 msg="Starting web server for metrics" listen=localhost:8080\nlevel=info ts=2023-01-31T05:17:53.043957442Z caller=reloader.go:219 msg="started watching config file and directories for changes" cfg= out= dirs=/etc/alertmanager/config,/etc/alertmanager/secrets/alertmanager-main-tls,/etc/alertmanager/secrets/alertmanager-main-proxy,/etc/alertmanager/secrets/alertmanager-kube-rbac-proxy\nlevel=info ts=2023-01-31T05:17:54.490669131Z caller=reloader.go:355 msg="Reload triggered" cfg_in= cfg_out= watched_dirs="/etc/alertmanager/config, /etc/alertmanager/secrets/alertmanager-main-tls, /etc/alertmanager/secrets/alertmanager-main-proxy, /etc/alertmanager/secrets/alertmanager-kube-rbac-proxy"\n
#1617597111589670912 junit 7 days ago
Jan 23 20:17:13.501 E ns/openshift-controller-manager-operator pod/openshift-controller-manager-operator-6f4687459b-knrdx node/master-1 container/openshift-controller-manager-operator reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 23 20:17:14.511 E ns/openshift-insights pod/insights-operator-79c7d54757-68v4v node/master-1 container/insights-operator reason/ContainerExit code/2 cause/Error :15:16.207437       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="5.480622ms" userAgent="Prometheus/2.29.2" audit-ID="8fe14fce-5a91-441d-bdaf-2b6c2e3788a3" srcIP="10.131.0.23:47668" resp=200\nI0123 20:15:34.502378       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="6.684049ms" userAgent="Prometheus/2.29.2" audit-ID="5c1caa3c-97b5-43cc-8699-1b3d5f9a163c" srcIP="10.131.0.22:49428" resp=200\nI0123 20:15:46.204012       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="2.126393ms" userAgent="Prometheus/2.29.2" audit-ID="cc99394e-9de3-472f-9103-1d42d86cef4d" srcIP="10.131.0.23:47668" resp=200\nI0123 20:15:56.602824       1 status.go:354] The operator is healthy\nI0123 20:15:56.602879       1 status.go:441] No status update necessary, objects are identical\nI0123 20:16:02.381199       1 reflector.go:535] k8s.io/apiserver/pkg/authentication/request/headerrequest/requestheader_controller.go:172: Watch close - *v1.ConfigMap total 7 items received\nI0123 20:16:04.504713       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="8.371602ms" userAgent="Prometheus/2.29.2" audit-ID="5fb0df39-405d-4dfa-ad55-6b0e3c0018c0" srcIP="10.131.0.22:49428" resp=200\nI0123 20:16:16.207154       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="5.191183ms" userAgent="Prometheus/2.29.2" audit-ID="4314add9-0dd9-4daf-afc9-fab4efccfbb8" srcIP="10.131.0.23:47668" resp=200\nI0123 20:16:34.502427       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="5.953962ms" userAgent="Prometheus/2.29.2" audit-ID="b0d5be25-f116-4a2d-93b7-7759f6d7559f" srcIP="10.131.0.22:49428" resp=200\nI0123 20:16:46.204232       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="2.320718ms" userAgent="Prometheus/2.29.2" audit-ID="0d515458-9e54-4eee-8daf-e98acf609c67" srcIP="10.131.0.23:47668" resp=200\nI0123 20:17:04.505309       1 httplog.go:104] "HTTP" verb="GET" URI="/metrics" latency="8.7289ms" userAgent="Prometheus/2.29.2" audit-ID="b09a8b2e-2c4d-4497-a9f7-0efeb69e134c" srcIP="10.131.0.22:49428" resp=200\n
Jan 23 20:17:16.549 E ns/openshift-cluster-storage-operator pod/cluster-storage-operator-7fbc644d58-f4r4f node/master-1 container/cluster-storage-operator reason/ContainerExit code/1 cause/Error oller.go:167] Shutting down ManagementStateController ...\nI0123 20:17:15.444382       1 configmap_cafile_content.go:222] "Shutting down controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"\nI0123 20:17:15.444388       1 base_controller.go:167] Shutting down VSphereProblemDetectorStarter ...\nI0123 20:17:15.444396       1 configmap_cafile_content.go:222] "Shutting down controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"\nI0123 20:17:15.444400       1 base_controller.go:114] Shutting down worker of SnapshotCRDController controller ...\nI0123 20:17:15.444404       1 base_controller.go:167] Shutting down StatusSyncer_storage ...\nI0123 20:17:15.444405       1 base_controller.go:104] All SnapshotCRDController workers have been terminated\nI0123 20:17:15.444408       1 base_controller.go:145] All StatusSyncer_storage post start hooks have been terminated\nI0123 20:17:15.444413       1 base_controller.go:114] Shutting down worker of DefaultStorageClassController controller ...\nI0123 20:17:15.444416       1 base_controller.go:104] All DefaultStorageClassController workers have been terminated\nI0123 20:17:15.444420       1 base_controller.go:167] Shutting down LoggingSyncer ...\nI0123 20:17:15.444421       1 base_controller.go:114] Shutting down worker of ConfigObserver controller ...\nI0123 20:17:15.444425       1 base_controller.go:104] All ConfigObserver workers have been terminated\nI0123 20:17:15.444431       1 base_controller.go:114] Shutting down worker of CSIDriverStarter controller ...\nI0123 20:17:15.444435       1 base_controller.go:104] All CSIDriverStarter workers have been terminated\nI0123 20:17:15.444439       1 base_controller.go:114] Shutting down worker of ManagementStateController controller ...\nI0123 20:17:15.444442       1 base_controller.go:104] All ManagementStateController workers have been terminated\nW0123 20:17:15.444431       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\n
Jan 23 20:17:17.677 E ns/openshift-monitoring pod/cluster-monitoring-operator-55748f77fd-r7xmc node/master-1 container/kube-rbac-proxy reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 23 20:17:19.692 E ns/openshift-cluster-storage-operator pod/csi-snapshot-controller-operator-649ccd8b7b-7kk2n node/master-1 container/csi-snapshot-controller-operator reason/ContainerExit code/1 cause/Error I0123 20:17:12.099673       1 operator.go:157] Starting syncing operator at 2023-01-23 20:17:12.099670647 +0000 UTC m=+3013.343745257\nI0123 20:17:12.697227       1 operator.go:159] Finished syncing operator at 597.548196ms\nI0123 20:17:15.432831       1 operator.go:157] Starting syncing operator at 2023-01-23 20:17:15.432822127 +0000 UTC m=+3016.676896726\nI0123 20:17:15.460890       1 operator.go:159] Finished syncing operator at 28.061364ms\nI0123 20:17:15.461519       1 operator.go:157] Starting syncing operator at 2023-01-23 20:17:15.461514162 +0000 UTC m=+3016.705588825\nI0123 20:17:15.486578       1 operator.go:159] Finished syncing operator at 25.057892ms\nI0123 20:17:18.147494       1 operator.go:157] Starting syncing operator at 2023-01-23 20:17:18.147470059 +0000 UTC m=+3019.391544674\nI0123 20:17:18.165365       1 cmd.go:97] Received SIGTERM or SIGINT signal, shutting down controller.\nI0123 20:17:18.165551       1 base_controller.go:167] Shutting down CSISnapshotWebhookController ...\nI0123 20:17:18.165675       1 base_controller.go:167] Shutting down StaticResourceController ...\nI0123 20:17:18.165584       1 base_controller.go:167] Shutting down StatusSyncer_csi-snapshot-controller ...\nI0123 20:17:18.165685       1 base_controller.go:114] Shutting down worker of StaticResourceController controller ...\nI0123 20:17:18.165702       1 base_controller.go:145] All StatusSyncer_csi-snapshot-controller post start hooks have been terminated\nI0123 20:17:18.165604       1 base_controller.go:167] Shutting down ManagementStateController ...\nI0123 20:17:18.165605       1 genericapiserver.go:349] "[graceful-termination] shutdown event" name="ShutdownInitiated"\nI0123 20:17:18.165716       1 genericapiserver.go:352] "[graceful-termination] shutdown event" name="AfterShutdownDelayDuration"\nI0123 20:17:18.165743       1 genericapiserver.go:376] "[graceful-termination] shutdown event" name="InFlightRequestsDrained"\nW0123 20:17:18.165605       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\n
Jan 23 20:17:20.533 E ns/openshift-console-operator pod/console-operator-7bddd84bf-t6lsb node/master-2 container/console-operator reason/ContainerExit code/1 cause/Error eVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationStart' Received signal to terminate, becoming unready, but keeping serving\nI0123 20:17:19.049251       1 base_controller.go:167] Shutting down ConsoleOperator ...\nI0123 20:17:19.049253       1 genericapiserver.go:709] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-console-operator", Name:"console-operator-7bddd84bf-t6lsb", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationMinimalShutdownDurationFinished' The minimal shutdown duration of 0s finished\nI0123 20:17:19.049264       1 genericapiserver.go:362] "[graceful-termination] shutdown event" name="AfterShutdownDelayDuration"\nI0123 20:17:19.049267       1 base_controller.go:167] Shutting down DownloadsRouteController ...\nI0123 20:17:19.049277       1 base_controller.go:167] Shutting down ConsoleServiceController ...\nI0123 20:17:19.049281       1 genericapiserver.go:709] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-console-operator", Name:"console-operator-7bddd84bf-t6lsb", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationStoppedServing' Server has stopped listening\nI0123 20:17:19.049286       1 base_controller.go:167] Shutting down ConsoleRouteController ...\nI0123 20:17:19.049293       1 genericapiserver.go:387] "[graceful-termination] shutdown event" name="InFlightRequestsDrained"\nI0123 20:17:19.049310       1 base_controller.go:167] Shutting down ConsoleServiceController ...\nI0123 20:17:19.049321       1 base_controller.go:167] Shutting down UnsupportedConfigOverridesController ...\nI0123 20:17:19.049330       1 base_controller.go:167] Shutting down RemoveStaleConditionsController ...\nI0123 20:17:19.049338       1 base_controller.go:167] Shutting down HealthCheckController ...\nI0123 20:17:19.049348       1 base_controller.go:167] Shutting down ManagementStateController ...\nW0123 20:17:19.049514       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\n
Jan 23 20:17:24.643 E ns/openshift-service-ca-operator pod/service-ca-operator-b54f5d694-rj999 node/master-1 container/service-ca-operator reason/ContainerExit code/1 cause/Error
Jan 23 20:17:42.143 E clusterversion/version changed Failing to True: MultipleErrors: Multiple errors are preventing progress:\n* Cluster operator authentication is updating versions\n* Cluster operator cluster-autoscaler is updating versions\n* Cluster operator console is updating versions\n* Cluster operator csi-snapshot-controller is updating versions\n* Cluster operator image-registry is updating versions\n* Cluster operator ingress is updating versions\n* Cluster operator insights is updating versions\n* Cluster operator marketplace is updating versions\n* Cluster operator monitoring is updating versions\n* Cluster operator node-tuning is updating versions\n* Cluster operator openshift-controller-manager is updating versions\n* Cluster operator openshift-samples is updating versions\n* Cluster operator service-ca is updating versions\n* Cluster operator storage is updating versions
Jan 23 20:20:28.264 E ns/openshift-controller-manager pod/controller-manager-q99dl node/master-2 container/controller-manager reason/ContainerExit code/137 cause/Error I0123 19:48:38.303362       1 controller_manager.go:39] Starting controllers on 0.0.0.0:8443 (4.9.0-202212051626.p0.g79857a3.assembly.stream-79857a3)\nI0123 19:48:38.305248       1 controller_manager.go:50] DeploymentConfig controller using images from "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b9f4cc933f9bced10b1e8b7ebd0695e02f09eba30ac0a43c9cca51c04adc9589"\nI0123 19:48:38.305265       1 controller_manager.go:56] Build controller using images from "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6601b9ef96b38632311dfced9f4588402fed41a0112586f7dad45ef62474beb1"\nI0123 19:48:38.305359       1 standalone_apiserver.go:104] Started health checks at 0.0.0.0:8443\nI0123 19:48:38.305441       1 leaderelection.go:248] attempting to acquire leader lease openshift-controller-manager/openshift-master-controllers...\n
Jan 23 20:20:28.303 E ns/openshift-controller-manager pod/controller-manager-9xhmj node/master-1 container/controller-manager reason/ContainerExit code/137 cause/Error I0123 19:48:38.680560       1 controller_manager.go:39] Starting controllers on 0.0.0.0:8443 (4.9.0-202212051626.p0.g79857a3.assembly.stream-79857a3)\nI0123 19:48:38.681993       1 controller_manager.go:50] DeploymentConfig controller using images from "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b9f4cc933f9bced10b1e8b7ebd0695e02f09eba30ac0a43c9cca51c04adc9589"\nI0123 19:48:38.682044       1 controller_manager.go:56] Build controller using images from "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6601b9ef96b38632311dfced9f4588402fed41a0112586f7dad45ef62474beb1"\nI0123 19:48:38.682120       1 standalone_apiserver.go:104] Started health checks at 0.0.0.0:8443\nI0123 19:48:38.682243       1 leaderelection.go:248] attempting to acquire leader lease openshift-controller-manager/openshift-master-controllers...\n
Jan 23 20:20:28.763 E ns/openshift-controller-manager pod/controller-manager-8w9kz node/master-0 container/controller-manager reason/ContainerExit code/137 cause/Error 51:05.973395       1 templateinstance_finalizer.go:194] Starting TemplateInstanceFinalizer controller\nI0123 19:51:05.975380       1 shared_informer.go:247] Caches are synced for service account \nI0123 19:51:05.979428       1 factory.go:85] deploymentconfig controller caches are synced. Starting workers.\nI0123 19:51:05.984780       1 shared_informer.go:247] Caches are synced for DefaultRoleBindingController \nI0123 19:51:05.988797       1 buildconfig_controller.go:212] Starting buildconfig controller\nI0123 19:51:06.160348       1 deleted_dockercfg_secrets.go:75] caches synced\nI0123 19:51:06.160359       1 deleted_token_secrets.go:70] caches synced\nI0123 19:51:06.160379       1 docker_registry_service.go:156] caches synced\nI0123 19:51:06.160412       1 create_dockercfg_secrets.go:219] urls found\nI0123 19:51:06.160421       1 create_dockercfg_secrets.go:225] caches synced\nI0123 19:51:06.160470       1 docker_registry_service.go:298] Updating registry URLs from map[172.30.26.175:5000:{} image-registry.openshift-image-registry.svc.cluster.local:5000:{} image-registry.openshift-image-registry.svc:5000:{}] to map[172.30.26.175:5000:{} image-registry.openshift-image-registry.svc.cluster.local:5000:{} image-registry.openshift-image-registry.svc:5000:{}]\nI0123 19:51:06.166385       1 build_controller.go:475] Starting build controller\nI0123 19:51:06.166398       1 build_controller.go:477] OpenShift image registry hostname: image-registry.openshift-image-registry.svc:5000\nE0123 20:17:15.945873       1 imagestream_controller.go:136] Error syncing image stream "openshift/jenkins-agent-base": Operation cannot be fulfilled on imagestreamimports.image.openshift.io "jenkins-agent-base": the object has been modified; please apply your changes to the latest version and try again\nE0123 20:17:16.297544       1 imagestream_controller.go:136] Error syncing image stream "openshift/jenkins-agent-base": Operation cannot be fulfilled on imagestream.image.openshift.io "jenkins-agent-base": the image stream was updated from "39690" to "39827"\n
#1617597111589670912 junit 7 days ago
Jan 23 20:30:28.190 E ns/openshift-network-diagnostics pod/network-check-target-j6dp4 node/master-0 container/network-check-target-container reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 23 20:30:30.195 E ns/openshift-controller-manager pod/controller-manager-592jv node/master-0 container/controller-manager reason/TerminationStateCleared lastState.terminated was cleared on a pod (bug https://bugzilla.redhat.com/show_bug.cgi?id=1933760 or similar)
Jan 23 20:30:48.800 E ns/openshift-cluster-storage-operator pod/csi-snapshot-webhook-987f7bc9c-9jfgv node/master-2 container/webhook reason/ContainerExit code/2 cause/Error
Jan 23 20:30:50.833 E ns/openshift-cluster-storage-operator pod/cluster-storage-operator-5b756574d-x2c62 node/master-2 container/cluster-storage-operator reason/ContainerExit code/1 cause/Error [{"operator" "4.9.0-0.nightly-2023-01-23-185428"}]\nI0123 20:17:16.582283       1 base_controller.go:73] Caches are synced for SnapshotCRDController \nI0123 20:17:16.582314       1 base_controller.go:110] Starting #1 worker of SnapshotCRDController controller ...\nI0123 20:30:49.687304       1 cmd.go:97] Received SIGTERM or SIGINT signal, shutting down controller.\nI0123 20:30:49.687399       1 genericapiserver.go:386] [graceful-termination] RunPreShutdownHooks has completed\nI0123 20:30:49.687410       1 genericapiserver.go:349] "[graceful-termination] shutdown event" name="ShutdownInitiated"\nI0123 20:30:49.687587       1 base_controller.go:167] Shutting down SnapshotCRDController ...\nI0123 20:30:49.687625       1 base_controller.go:167] Shutting down StatusSyncer_storage ...\nI0123 20:30:49.687631       1 base_controller.go:145] All StatusSyncer_storage post start hooks have been terminated\nI0123 20:30:49.687642       1 base_controller.go:167] Shutting down DefaultStorageClassController ...\nI0123 20:30:49.687652       1 base_controller.go:167] Shutting down ConfigObserver ...\nI0123 20:30:49.687661       1 base_controller.go:167] Shutting down CSIDriverStarter ...\nI0123 20:30:49.687671       1 base_controller.go:167] Shutting down ManagementStateController ...\nI0123 20:30:49.687703       1 base_controller.go:167] Shutting down LoggingSyncer ...\nI0123 20:30:49.687716       1 base_controller.go:167] Shutting down VSphereProblemDetectorStarter ...\nI0123 20:30:49.687722       1 configmap_cafile_content.go:222] "Shutting down controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"\nI0123 20:30:49.687764       1 configmap_cafile_content.go:222] "Shutting down controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"\nI0123 20:30:49.687801       1 requestheader_controller.go:183] Shutting down RequestHeaderAuthRequestController\nW0123 20:30:49.687880       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\n
Jan 23 20:30:51.889 E ns/openshift-cluster-storage-operator pod/csi-snapshot-controller-758f5b59c5-zstf9 node/master-2 container/snapshot-controller reason/ContainerExit code/2 cause/Error
Jan 23 20:30:52.878 E ns/openshift-console-operator pod/console-operator-85775bb4dc-sf6rg node/master-2 container/console-operator reason/ContainerExit code/1 cause/Error y, but keeping serving\nI0123 20:30:52.084950       1 genericapiserver.go:709] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-console-operator", Name:"console-operator-85775bb4dc-sf6rg", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationMinimalShutdownDurationFinished' The minimal shutdown duration of 0s finished\nI0123 20:30:52.084973       1 genericapiserver.go:362] "[graceful-termination] shutdown event" name="AfterShutdownDelayDuration"\nI0123 20:30:52.084984       1 genericapiserver.go:709] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-console-operator", Name:"console-operator-85775bb4dc-sf6rg", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationStoppedServing' Server has stopped listening\nI0123 20:30:52.084998       1 genericapiserver.go:387] "[graceful-termination] shutdown event" name="InFlightRequestsDrained"\nI0123 20:30:52.085126       1 base_controller.go:167] Shutting down ConsoleServiceController ...\nI0123 20:30:52.085148       1 base_controller.go:167] Shutting down ConsoleCLIDownloadsController ...\nI0123 20:30:52.085157       1 base_controller.go:167] Shutting down ManagementStateController ...\nI0123 20:30:52.085165       1 base_controller.go:167] Shutting down ResourceSyncController ...\nI0123 20:30:52.085175       1 base_controller.go:167] Shutting down ConsoleDownloadsDeploymentSyncController ...\nI0123 20:30:52.085185       1 base_controller.go:167] Shutting down StatusSyncer_console ...\nI0123 20:30:52.085191       1 base_controller.go:145] All StatusSyncer_console post start hooks have been terminated\nI0123 20:30:52.085198       1 base_controller.go:167] Shutting down DownloadsRouteController ...\nW0123 20:30:52.085201       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\nI0123 20:30:52.085207       1 base_controller.go:167] Shutting down LoggingSyncer ...\nI0123 20:30:52.085215       1 base_controller.go:167] Shutting down ConsoleRouteController ...\n
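The storage-operator and console-operator exits above both walk the same graceful-termination sequence visible in their events: catch SIGTERM/SIGINT, keep serving for a configured minimal shutdown duration (0s for the console-operator here, 1m10s for kube-apiserver), stop listening, then drain in-flight work before exiting. A stripped-down plain-Go sketch of that sequence follows; the port, durations, and HTTP server are stand-ins, not the operators' actual wiring.

package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8443"}
	go func() {
		// ListenAndServe returns http.ErrServerClosed once Shutdown is called.
		if err := srv.ListenAndServe(); err != http.ErrServerClosed {
			log.Printf("server error: %v", err)
		}
	}()

	// Wait for SIGTERM or SIGINT, mirroring
	// "Received SIGTERM or SIGINT signal, shutting down controller."
	sig := make(chan os.Signal, 1)
	signal.Notify(sig, syscall.SIGTERM, syscall.SIGINT)
	<-sig
	log.Println("ShutdownInitiated")

	// Keep answering requests for a minimal shutdown duration so load
	// balancers and clients notice the termination before the listener
	// actually goes away (stand-in value; the real components configure this).
	time.Sleep(10 * time.Second)
	log.Println("TerminationMinimalShutdownDurationFinished")

	// Stop listening and drain in-flight requests with a bounded grace period.
	ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Printf("drain did not finish cleanly: %v", err)
	}
	log.Println("TerminationStoppedServing / InFlightRequestsDrained")
}

With that sequence in mind, the "graceful termination failed, controllers failed with error: stopped" line is the builder reporting that its controllers exited because they were told to stop, not an unexpected crash.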
Jan 23 20:30:56.014 E ns/openshift-machine-api pod/machine-api-operator-66d9666d6-sbnk2 node/master-2 container/machine-api-operator reason/ContainerExit code/2 cause/Error
Jan 23 20:30:56.170 E ns/openshift-kube-storage-version-migrator-operator pod/kube-storage-version-migrator-operator-598ddb8bf9-gcl4q node/master-2 container/kube-storage-version-migrator-operator reason/ContainerExit code/1 cause/Error 282] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-kube-storage-version-migrator-operator", Name:"kube-storage-version-migrator-operator", UID:"e12d2ea9-76d9-4bdf-8e2e-76ab52c4547c", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/kube-storage-version-migrator changed: Progressing changed from True to False ("All is well"),Available changed from False to True ("All is well")\nI0123 20:30:55.295136       1 cmd.go:97] Received SIGTERM or SIGINT signal, shutting down controller.\nI0123 20:30:55.295544       1 configmap_cafile_content.go:223] Shutting down client-ca::kube-system::extension-apiserver-authentication::client-ca-file\nI0123 20:30:55.295581       1 base_controller.go:167] Shutting down StaticResourceController ...\nI0123 20:30:55.295595       1 requestheader_controller.go:183] Shutting down RequestHeaderAuthRequestController\nI0123 20:30:55.295604       1 configmap_cafile_content.go:223] Shutting down client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file\nI0123 20:30:55.295608       1 base_controller.go:167] Shutting down StatusSyncer_kube-storage-version-migrator ...\nI0123 20:30:55.295612       1 base_controller.go:145] All StatusSyncer_kube-storage-version-migrator post start hooks have been terminated\nI0123 20:30:55.295613       1 tlsconfig.go:255] Shutting down DynamicServingCertificateController\nI0123 20:30:55.295623       1 base_controller.go:167] Shutting down StaticConditionsController ...\nI0123 20:30:55.295633       1 base_controller.go:167] Shutting down LoggingSyncer ...\nI0123 20:30:55.295640       1 base_controller.go:167] Shutting down RemoveStaleConditionsController ...\nI0123 20:30:55.295648       1 base_controller.go:167] Shutting down KubeStorageVersionMigrator ...\nW0123 20:30:55.295693       1 builder.go:101] graceful termination failed, controllers failed with error: stopped\nI0123 20:30:55.295698       1 secure_serving.go:241] Stopped listening on [::]:8443\n
Jan 23 20:30:57.160 E ns/openshift-machine-api pod/machine-api-controllers-5bcccd849c-8n2mk node/master-2 container/machineset-controller reason/ContainerExit code/1 cause/Error
Jan 23 20:30:57.246 E ns/openshift-machine-config-operator pod/machine-config-controller-64c44dcd64-5wkh4 node/master-2 container/machine-config-controller reason/ContainerExit code/2 cause/Error ndered-master-e28910f1384b714f77065e0de3dc66ba\nI0123 20:30:34.285714       1 node_controller.go:424] Pool master: node master-2: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-master-e28910f1384b714f77065e0de3dc66ba\nI0123 20:30:34.285762       1 event.go:282] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"", Name:"master", UID:"3bb27351-c7ab-45fd-ab6f-33ab82f95328", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"50405", FieldPath:""}): type: 'Normal' reason: 'AnnotationChange' Node master-2 now has machineconfiguration.openshift.io/desiredConfig=rendered-master-e28910f1384b714f77065e0de3dc66ba\nI0123 20:30:35.299716       1 node_controller.go:424] Pool master: node master-2: changed annotation machineconfiguration.openshift.io/state = Working\nI0123 20:30:35.299933       1 event.go:282] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"", Name:"master", UID:"3bb27351-c7ab-45fd-ab6f-33ab82f95328", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"50405", FieldPath:""}): type: 'Normal' reason: 'AnnotationChange' Node master-2 now has machineconfiguration.openshift.io/state=Working\nI0123 20:30:35.391987       1 node_controller.go:424] Pool master: node master-2: Reporting unready: node master-2 is reporting Unschedulable\nE0123 20:30:39.373978       1 render_controller.go:460] Error updating MachineConfigPool master: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "master": the object has been modified; please apply your changes to the latest version and try again\nI0123 20:30:39.374000       1 render_controller.go:377] Error syncing machineconfigpool master: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "master": the object has been modified; please apply your changes to the latest version and try again\nI0123 20:30:43.163927       1 node_controller.go:424] Pool worker: node worker-1: Reporting unready: node worker-1 is reporting OutOfDisk=Unknown\n
Jan 23 20:31:17.237 E ns/openshift-console pod/console-ccb6c4d6f-hx9sq node/master-2 container/console reason/ContainerExit code/2 cause/Error iled: 404 Not Found\nE0123 19:41:00.834390       1 auth.go:231] error contacting auth provider (retrying in 10s): discovery through endpoint https://kubernetes.default.svc/.well-known/oauth-authorization-server failed: 404 Not Found\nE0123 19:41:10.845537       1 auth.go:231] error contacting auth provider (retrying in 10s): discovery through endpoint https://kubernetes.default.svc/.well-known/oauth-authorization-server failed: 404 Not Found\nE0123 19:41:20.858242       1 auth.go:231] error contacting auth provider (retrying in 10s): discovery through endpoint https://kubernetes.default.svc/.well-known/oauth-authorization-server failed: 404 Not Found\nE0123 19:41:30.869151       1 auth.go:231] error contacting auth provider (retrying in 10s): discovery through endpoint https://kubernetes.default.svc/.well-known/oauth-authorization-server failed: 404 Not Found\nE0123 19:41:40.877757       1 auth.go:231] error contacting auth provider (retrying in 10s): discovery through endpoint https://kubernetes.default.svc/.well-known/oauth-authorization-server failed: 404 Not Found\nE0123 19:41:50.883695       1 auth.go:231] error contacting auth provider (retrying in 10s): discovery through endpoint https://kubernetes.default.svc/.well-known/oauth-authorization-server failed: 404 Not Found\nE0123 19:42:00.892803       1 auth.go:231] error contacting auth provider (retrying in 10s): discovery through endpoint https://kubernetes.default.svc/.well-known/oauth-authorization-server failed: 404 Not Found\nE0123 19:42:10.912222       1 auth.go:231] error contacting auth provider (retrying in 10s): discovery through endpoint https://kubernetes.default.svc/.well-known/oauth-authorization-server failed: 404 Not Found\nE0123 19:42:20.924523       1 auth.go:231] error contacting auth provider (retrying in 10s): discovery through endpoint https://kubernetes.default.svc/.well-known/oauth-authorization-server failed: 404 Not Found\nI0123 19:42:30.955420       1 main.go:660] Binding to [::]:8443...\nI0123 19:42:30.955443       1 main.go:662] using TLS\n
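The console pod above spends its lifetime retrying OAuth metadata discovery every 10s because https://kubernetes.default.svc/.well-known/oauth-authorization-server keeps returning 404, typically because the authentication stack has not published the metadata yet. A minimal sketch of that kind of fixed-interval discovery retry in plain Go is below; the interval, attempt count, client setup, and the skipped TLS verification are illustrative assumptions, not the console's actual code.

package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"time"
)

// waitForOAuthMetadata polls the well-known endpoint until it answers 200 OK,
// retrying on a fixed interval the way the console log above does
// ("retrying in 10s").
func waitForOAuthMetadata(url string, interval time.Duration, attempts int) error {
	// InsecureSkipVerify keeps the sketch self-contained; a real client
	// would trust the in-cluster CA instead.
	client := &http.Client{
		Timeout:   5 * time.Second,
		Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: true}},
	}
	for i := 0; i < attempts; i++ {
		resp, err := client.Get(url)
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return nil
			}
			err = fmt.Errorf("discovery failed: %s", resp.Status)
		}
		fmt.Printf("error contacting auth provider (retrying in %s): %v\n", interval, err)
		time.Sleep(interval)
	}
	return fmt.Errorf("oauth metadata never became available after %d attempts", attempts)
}

func main() {
	_ = waitForOAuthMetadata(
		"https://kubernetes.default.svc/.well-known/oauth-authorization-server",
		10*time.Second, 6)
}

Because the loop only ever logs and retries, the container keeps running until it is killed during the upgrade, which is why the exit shows up here as code/2 rather than as an auth failure.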

Found in 25.00% of runs (50.00% of failures) across 12 total runs and 1 job (50.00% failed)