Job:
#1967614 bug 18 months ago prometheus-k8s pods can't be scheduled due to volume node affinity conflict RELEASE_PENDING
#1956308 bug 16 months ago CMO fails to delete/recreate the deployment resource after '422 Unprocessable Entity' update response POST
Actual results:
The operator goes Degraded and Unavailable for a short period of time, the reason being '... Deployment failed: creating Deployment object failed after update failed: object is being deleted: deployments.apps "xxx" already exists'.
Jun 02 05:00:43.066 - 7244s E clusteroperator/monitoring condition/Available status/False reason/Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
Jun 02 05:00:43.066 - 7244s E clusteroperator/monitoring condition/Degraded status/True reason/Failed to rollout the stack. Error: running task Updating Prometheus Operator failed: reconciling Prometheus Operator Deployment failed: creating Deployment object failed after update failed: object is being deleted: deployments.apps "prometheus-operator" already exists
Jun 02 05:00:43.066 E clusteroperator/monitoring condition/Available status/False reason/UpdatingPrometheusOperatorFailed changed: Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
Jun 02 05:00:43.066 E clusteroperator/monitoring condition/Degraded status/True reason/UpdatingPrometheusOperatorFailed changed: Failed to rollout the stack. Error: running task Updating Prometheus Operator failed: reconciling Prometheus Operator Deployment failed: creating Deployment object failed after update failed: object is being deleted: deployments.apps "prometheus-operator" already exists
Jun 02 05:00:43.066 W clusteroperator/monitoring condition/Progressing status/False reason/UpdatingPrometheusOperatorFailed changed: Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
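The error message points at the reconciler's fallback path: when the Update is rejected with 422 Unprocessable Entity, the operator deletes the Deployment and immediately recreates it, but deletion is asynchronous (finalizers and garbage collection run in the background), so the Create lands while the old object still exists with a deletion timestamp. Below is a minimal client-go sketch of that pattern, including the wait-for-deletion step the failing versions are missing; it is illustrative only, not CMO's actual code, and the helper name is hypothetical.

    package reconcile

    import (
        "context"
        "fmt"
        "time"

        appsv1 "k8s.io/api/apps/v1"
        apierrors "k8s.io/apimachinery/pkg/api/errors"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/util/wait"
        "k8s.io/client-go/kubernetes"
    )

    // createOrUpdateDeployment is a hypothetical reconcile helper. On a 422
    // (Invalid) Update response it falls back to delete-and-recreate. Because
    // Delete only marks the object for deletion, recreating without waiting
    // races with the delete and fails with
    // "object is being deleted: ... already exists".
    func createOrUpdateDeployment(ctx context.Context, c kubernetes.Interface, dep *appsv1.Deployment) error {
        _, err := c.AppsV1().Deployments(dep.Namespace).Update(ctx, dep, metav1.UpdateOptions{})
        if err == nil {
            return nil
        }
        if !apierrors.IsInvalid(err) { // 422 Unprocessable Entity
            return fmt.Errorf("updating Deployment object failed: %w", err)
        }
        if err := c.AppsV1().Deployments(dep.Namespace).Delete(ctx, dep.Name, metav1.DeleteOptions{}); err != nil && !apierrors.IsNotFound(err) {
            return fmt.Errorf("deleting Deployment object failed: %w", err)
        }
        // The step the failing versions skip: poll until the old object is
        // actually gone before recreating it.
        if err := wait.PollUntilContextTimeout(ctx, time.Second, 2*time.Minute, true,
            func(ctx context.Context) (bool, error) {
                _, err := c.AppsV1().Deployments(dep.Namespace).Get(ctx, dep.Name, metav1.GetOptions{})
                return apierrors.IsNotFound(err), nil
            }); err != nil {
            return fmt.Errorf("waiting for Deployment deletion failed: %w", err)
        }
        dep.ResourceVersion = "" // a new object must not carry a resourceVersion
        if _, err := c.AppsV1().Deployments(dep.Namespace).Create(ctx, dep, metav1.CreateOptions{}); err != nil {
            return fmt.Errorf("creating Deployment object failed after update failed: %w", err)
        }
        return nil
    }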
Comment 15145928 by spasquie@redhat.com at 2021-06-03T09:06:36Z
Looking at [1], the initial error reported by CMO is indeed the same: "creating Deployment object failed after update failed: object is being deleted: deployments.apps "prometheus-operator" already exists". But the subsequent reconciliations fail for different reasons [2].
error:
running task Updating Prometheus Operator failed: reconciling Prometheus Operator Deployment failed: creating Deployment object failed after update failed: object is being deleted: deployments.apps "prometheus-operator" already exists
This shows up when upgrading from 4.7.19 to 4.8.0-fc.7; it is expected, since the fix is not in 4.8 yet.
#1982369 bug 16 months ago CMO fails to delete/recreate the deployment resource after '422 Unprocessable Entity' update response ASSIGNED
The error can still be seen:
Aug 02 13:37:41.192 - 71s   E clusteroperator/monitoring condition/Degraded status/True reason/Failed to rollout the stack. Error: running task Updating Prometheus Operator failed: reconciling Prometheus Operator Deployment failed: creating Deployment object failed after update failed: object is being deleted: deployments.apps "prometheus-operator" already exists
Aug 02 13:37:41.192 E clusteroperator/monitoring condition/Available status/False reason/UpdatingPrometheusOperatorFailed changed: Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
Aug 02 13:37:41.192 E clusteroperator/monitoring condition/Degraded status/True reason/UpdatingPrometheusOperatorFailed changed: Failed to rollout the stack. Error: running task Updating Prometheus Operator failed: reconciling Prometheus Operator Deployment failed: creating Deployment object failed after update failed: object is being deleted: deployments.apps "prometheus-operator" already exists
Aug 02 13:37:41.192 - 71s   E clusteroperator/monitoring condition/Available status/False reason/Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
Aug 02 13:37:41.192 - 71s   E clusteroperator/monitoring condition/Degraded status/True reason/Failed to rollout the stack. Error: running task Updating Prometheus Operator failed: reconciling Prometheus Operator Deployment failed: creating Deployment object failed after update failed: object is being deleted: deployments.apps "prometheus-operator" already exists
Aug 02 13:37:43.169 E ns/openshift-service-ca-operator pod/service-ca-operator-699fdbb947-4cv54 node/ip-10-0-222-211.ec2.internal container/service-ca-operator reason/ContainerExit code/1 cause/Error
Comment 15403497 by spasquie@redhat.com at 2021-08-19T16:02:29Z
I've searched for "creating Deployment object failed after update failed" in all jobs whose names contain "4.8" but not "4.7" (e.g. excluding 4.7 > 4.8 upgrade jobs) [1] and I've found nothing except for release-openshift-origin-installer-old-rhcos-e2e-aws-4.8. But this one is special because despite what the job name claims, it spins up a 4.7 cluster [2].
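For reference, a search like [1] can be reproduced against search.ci.openshift.org. A short sketch follows; the /search endpoint and the name/excludeName query parameters are assumptions based on the search UI's form fields, not a documented API, so verify them before relying on this.

    package main

    import (
        "fmt"
        "io"
        "net/http"
        "net/url"
    )

    // Queries the CI search service for the error string, restricted to job
    // names matching "4.8" and excluding those matching "4.7".
    func main() {
        q := url.Values{}
        q.Set("search", "creating Deployment object failed after update failed")
        q.Set("name", "4.8")        // only jobs whose name matches this regex (assumed parameter)
        q.Set("excludeName", "4.7") // drop 4.7 -> 4.8 upgrade jobs (assumed parameter)
        q.Set("maxAge", "336h")
        q.Set("type", "junit")

        resp, err := http.Get("https://search.ci.openshift.org/search?" + q.Encode())
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()

        body, err := io.ReadAll(resp.Body)
        if err != nil {
            panic(err)
        }
        fmt.Println(string(body))
    }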
periodic-ci-openshift-release-master-nightly-4.8-upgrade-from-stable-4.7-e2e-metal-ipi-upgrade (all) - 1 runs, 0% failed, 100% of runs match
#1620242054762205184 junit 6 hours ago
Jan 31 03:53:39.375 E clusteroperator/monitoring condition/Available status/False reason/UpdatingPrometheusOperatorFailed changed: Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
Jan 31 03:53:39.375 E clusteroperator/monitoring condition/Degraded status/True reason/UpdatingPrometheusOperatorFailed changed: Failed to rollout the stack. Error: running task Updating Prometheus Operator failed: reconciling Prometheus Operator Deployment failed: creating Deployment object failed after update failed: object is being deleted: deployments.apps "prometheus-operator" already exists
Jan 31 03:53:39.375 - 125s  E clusteroperator/monitoring condition/Available status/False reason/Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
Jan 31 03:53:39.375 - 125s  E clusteroperator/monitoring condition/Degraded status/True reason/Failed to rollout the stack. Error: running task Updating Prometheus Operator failed: reconciling Prometheus Operator Deployment failed: creating Deployment object failed after update failed: object is being deleted: deployments.apps "prometheus-operator" already exists
Jan 31 03:53:48.854 E ns/openshift-console-operator pod/console-operator-77cbd99dbb-sgjtf node/master-0 container/console-operator reason/ContainerExit code/1 cause/Error 206\nI0131 03:53:47.685107       1 tlsconfig.go:255] Shutting down DynamicServingCertificateController\nI0131 03:53:47.685140       1 reflector.go:225] Stopping reflector *v1.ClusterOperator (10m0s) from github.com/openshift/client-go/config/informers/externalversions/factory.go:101\nI0131 03:53:47.685153       1 controller.go:115] shutting down ConsoleResourceSyncDestinationController\nI0131 03:53:47.685160       1 controller.go:181] shutting down ConsoleServiceSyncController\nI0131 03:53:47.685183       1 controller.go:377] shutting down ConsoleRouteSyncController\nI0131 03:53:47.685196       1 base_controller.go:166] Shutting down ConsoleOperator ...\nI0131 03:53:47.685203       1 base_controller.go:166] Shutting down UnsupportedConfigOverridesController ...\nI0131 03:53:47.685210       1 base_controller.go:166] Shutting down ManagementStateController ...\nI0131 03:53:47.685229       1 reflector.go:225] Stopping reflector *v1.Console (10m0s) from github.com/openshift/client-go/operator/informers/externalversions/factory.go:101\nI0131 03:53:47.685249       1 reflector.go:225] Stopping reflector *v1.ConfigMap (10m0s) from k8s.io/client-go/informers/factory.go:134\nI0131 03:53:47.685259       1 reflector.go:225] Stopping reflector *v1.OAuthClient (10m0s) from github.com/openshift/client-go/oauth/informers/externalversions/factory.go:101\nI0131 03:53:47.685280       1 reflector.go:225] Stopping reflector *v1.Secret (10m0s) from k8s.io/client-go/informers/factory.go:134\nI0131 03:53:47.685306       1 reflector.go:225] Stopping reflector *v1.Service (10m0s) from k8s.io/client-go/informers/factory.go:134\nI0131 03:53:47.685320       1 reflector.go:225] Stopping reflector *v1.ConsoleCLIDownload (10m0s) from github.com/openshift/client-go/console/informers/externalversions/factory.go:101\nI0131 03:53:47.685344       1 reflector.go:225] Stopping reflector *v1.ConfigMap (10m0s) from k8s.io/client-go/informers/factory.go:134\nW0131 03:53:47.685345       1 builder.go:97] graceful termination failed, controllers failed with error: stopped\n
#1620242054762205184 junit 6 hours ago
Jan 31 03:53:39.375 - 125s  E clusteroperator/monitoring condition/Degraded status/True reason/Failed to rollout the stack. Error: running task Updating Prometheus Operator failed: reconciling Prometheus Operator Deployment failed: creating Deployment object failed after update failed: object is being deleted: deployments.apps "prometheus-operator" already exists
Jan 31 04:12:45.276 - 18s   E clusteroperator/monitoring condition/Degraded status/True reason/Failed to rollout the stack. Error: running task Updating configuration sharing failed: failed to retrieve Prometheus host: getting Route object failed: the server is currently unable to handle the request (get routes.route.openshift.io prometheus-k8s)
periodic-ci-openshift-release-master-nightly-4.8-upgrade-from-stable-4.7-e2e-aws-upgrade (all) - 1 runs, 0% failed, 100% of runs match
#1620242072055320576 junit 7 hours ago
Jan 31 03:13:41.791 E clusteroperator/monitoring condition/Available status/False reason/UpdatingPrometheusOperatorFailed changed: Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
Jan 31 03:13:41.791 E clusteroperator/monitoring condition/Degraded status/True reason/UpdatingPrometheusOperatorFailed changed: Failed to rollout the stack. Error: running task Updating Prometheus Operator failed: reconciling Prometheus Operator Deployment failed: creating Deployment object failed after update failed: object is being deleted: deployments.apps "prometheus-operator" already exists
Jan 31 03:13:41.791 - 167s  E clusteroperator/monitoring condition/Available status/False reason/Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
Jan 31 03:13:41.791 - 167s  E clusteroperator/monitoring condition/Degraded status/True reason/Failed to rollout the stack. Error: running task Updating Prometheus Operator failed: reconciling Prometheus Operator Deployment failed: creating Deployment object failed after update failed: object is being deleted: deployments.apps "prometheus-operator" already exists
Jan 31 03:13:43.331 E ns/openshift-cluster-storage-operator pod/csi-snapshot-controller-operator-5b9df66bc8-z5g6n node/ip-10-0-151-249.ec2.internal container/csi-snapshot-controller-operator reason/ContainerExit code/1 cause/Error  1 operator.go:149] Finished syncing operator at 598.210826ms\nI0131 03:13:34.372053       1 operator.go:147] Starting syncing operator at 2023-01-31 03:13:34.372043794 +0000 UTC m=+3356.083762880\nI0131 03:13:34.544180       1 operator.go:149] Finished syncing operator at 172.127766ms\nI0131 03:13:42.559367       1 operator.go:147] Starting syncing operator at 2023-01-31 03:13:42.559357067 +0000 UTC m=+3364.271076153\nI0131 03:13:42.644999       1 cmd.go:88] Received SIGTERM or SIGINT signal, shutting down controller.\nI0131 03:13:42.645185       1 reflector.go:225] Stopping reflector *v1.Deployment (25m44.187000609s) from k8s.io/client-go/informers/factory.go:134\nI0131 03:13:42.645235       1 reflector.go:225] Stopping reflector *v1.ClusterOperator (20m0s) from github.com/openshift/client-go/config/informers/externalversions/factory.go:101\nI0131 03:13:42.645266       1 reflector.go:225] Stopping reflector *v1.CustomResourceDefinition (37m10.16851861s) from k8s.io/apiextensions-apiserver/pkg/client/informers/externalversions/factory.go:117\nI0131 03:13:42.645296       1 reflector.go:225] Stopping reflector *v1.CSISnapshotController (20m0s) from github.com/openshift/client-go/operator/informers/externalversions/factory.go:101\nI0131 03:13:42.645317       1 reflector.go:225] Stopping reflector *v1.ValidatingWebhookConfiguration (25m44.187000609s) from k8s.io/client-go/informers/factory.go:134\nI0131 03:13:42.645352       1 base_controller.go:166] Shutting down ManagementStateController ...\nI0131 03:13:42.645365       1 base_controller.go:166] Shutting down CSISnapshotWebhookController ...\nI0131 03:13:42.645374       1 base_controller.go:166] Shutting down StatusSyncer_csi-snapshot-controller ...\nI0131 03:13:42.645380       1 base_controller.go:144] All StatusSyncer_csi-snapshot-controller post start hooks have been terminated\nI0131 03:13:42.645389       1 base_controller.go:166] Shutting down LoggingSyncer ...\nW0131 03:13:42.645443       1 builder.go:97] graceful termination failed, controllers failed with error: stopped\n
#1620242072055320576 junit 7 hours ago
Jan 31 03:13:41.791 - 167s  E clusteroperator/monitoring condition/Degraded status/True reason/Failed to rollout the stack. Error: running task Updating Prometheus Operator failed: reconciling Prometheus Operator Deployment failed: creating Deployment object failed after update failed: object is being deleted: deployments.apps "prometheus-operator" already exists

Found in 100.00% of runs (+Inf% of failures) across 2 total runs and 2 jobs (0.00% failed) in 141ms