[SURE-9138] rke2-control-plane-system: unable to lookup or create cluster certificates, external certificate not found: secrets "cluster-etcd" not found #774

Closed
kkaempf opened this issue Oct 8, 2024 · 6 comments
Labels: JIRA must shout, kind/bug

Comments

@kkaempf

kkaempf commented Oct 8, 2024

SURE-9138

Issue description:

The customer is seeing the following error:

E0916 12:43:13.582966 1 workload_cluster.go:118] "Collecting etcd key pair from remote" controller="rke2controlplane" controllerGroup="controlplane.cluster.x-k8s.io" controllerKind="RKE2ControlPlane" RKE2ControlPlane="rke2test/rke2test-master" namespace="rke2test" name="rke2test-master" reconcileID="b94f1a99-df21-4d73-aaa5-cb7e1a69a1a3"
E0916 12:43:13.603661 1 management_cluster.go:171] "unable to lookup or create cluster certificates" err="external certificate not found: secrets \"cluster-etcd\" not found" controller="rke2controlplane" controllerGroup="controlplane.cluster.x-k8s.io" controllerKind="RKE2ControlPlane" RKE2ControlPlane="rke2test/rke2test-master" namespace="rke2test" name="rke2test-master" reconcileID="b94f1a99-df21-4d73-aaa5-cb7e1a69a1a3" 

We told them that this should be a transient error during the provisioning phase [1], but they report that the error occurs not only during provisioning but also in stable clusters, even weeks after the cluster was deployed.
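To tell a transient occurrence from a persistent one, the controller log can be tallied per control plane and per day. The following is a minimal Python sketch, not part of CAPRKE2 or Turtles: it assumes the log lines have already been collected as text in the klog format quoted above, and the function name `count_missing_etcd_secret` is hypothetical.

```python
import re
from collections import Counter

# klog header as it appears in the lines quoted in this issue, e.g.
# E0916 12:43:13.603661 1 management_cluster.go:171] "message" key="value" ...
KLOG_LINE = re.compile(
    r'(?P<level>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) '
    r'(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+\d+ '
    r'(?P<source>[\w.]+:\d+)\] "(?P<msg>(?:[^"\\]|\\.)*)"'
)
# key="value" pairs; values may contain escaped quotes such as \"cluster-etcd\"
KEY_VALUE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def count_missing_etcd_secret(lines):
    """Count 'cluster-etcd not found' errors per (RKE2ControlPlane, MM-DD)."""
    counts = Counter()
    for line in lines:
        # search() rather than match(), so a leading container timestamp is tolerated
        head = KLOG_LINE.search(line)
        if head is None or head["level"] != "E":
            continue
        fields = dict(KEY_VALUE.findall(line[head.end():]))
        err = fields.get("err", "")
        if "cluster-etcd" in err and "not found" in err:
            plane = fields.get("RKE2ControlPlane", "?")
            counts[(plane, head["month"] + "-" + head["day"])] += 1
    return counts
```

If the same control plane keeps showing non-zero counts on days well after provisioning finished, the error is not transient for that cluster.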

@kkaempf kkaempf added kind/bug Something isn't working JIRA must shout labels Oct 8, 2024
@kkaempf kkaempf added this to the October release milestone Oct 8, 2024
@kkaempf
Author

kkaempf commented Oct 8, 2024

/cc @Danil-Grigorev - since you did the initial assessment when this bug was first reported via Slack 😉

@Danil-Grigorev
Contributor

@kkaempf As per the discussion in Slack, this is fixed in the CAPRKE2 0.7.1 release by the combination of rancher/cluster-api-provider-rke2#451 and rancher/cluster-api-provider-rke2#453. Users will need to upgrade to this version later, and we might also need to pin the version in the Turtles release.

@furkatgofurov7
Contributor

Turtles v0.13.0, which uses CAPRKE2 v0.8.0 under the hood (it includes the bug fixes that 0.7.1 provides), has been released. This issue could be closed now, or kept on hold until the release is tested and verified to fix their issue.

@furkatgofurov7 furkatgofurov7 moved this to To Test in CAPI / Turtles Oct 31, 2024
@cpinjani
Contributor

Able to reproduce the issue on turtles v0.10.0 (caprke2 v0.4.1) which was upgraded from turtles v0.7.0 (rke2 v0.2.6)

I1125 08:47:02.524386       1 management_cluster.go:158] "Local secret is not up-to-date, skipping etcd client creation" controller="rke2controlplane" controllerGroup="controlplane.cluster.x-k8s.io" controllerKind="RKE2ControlPlane" RKE2ControlPlane="default/cluster1-control-plane" namespace="default" name="cluster1-control-plane" reconcileID="b89b32ee-db71-4f4a-8f04-2c0b5a3dd7ae"
2024-11-25T08:47:02.524602706Z I1125 08:47:02.524428       1 workload_cluster.go:118] "Collecting etcd key pair from remote" controller="rke2controlplane" controllerGroup="controlplane.cluster.x-k8s.io" controllerKind="RKE2ControlPlane" RKE2ControlPlane="default/cluster1-control-plane" namespace="default" name="cluster1-control-plane" reconcileID="b89b32ee-db71-4f4a-8f04-2c0b5a3dd7ae"
E1125 08:47:02.528014       1 management_cluster.go:171] "unable to lookup or create cluster certificates" err="external certificate not found: secrets \"cluster-etcd\" not found" controller="rke2controlplane" controllerGroup="controlplane.cluster.x-k8s.io" controllerKind="RKE2ControlPlane" RKE2ControlPlane="default/cluster1-control-plane" namespace="default" name="cluster1-control-plane" reconcileID="b89b32ee-db71-4f4a-8f04-2c0b5a3dd7ae"

After upgrading to turtles v0.13.0 (caprke2 v0.8.0) the error is not present in the logs.
@furkatgofurov7 @Danil-Grigorev Please review the issue for closure.

@furkatgofurov7
Contributor

@cpinjani thanks for testing, it is fine to close it from my side.

@Danil-Grigorev
Contributor

So 0.13.0 fixes it, and the verification passed? I don’t see any issue with closing; that should be down to version 0.8.0, which has the fix.

@cpinjani cpinjani self-assigned this Nov 25, 2024
@github-project-automation github-project-automation bot moved this from To Test to Done in CAPI / Turtles Nov 25, 2024