*: Point at try.openshift.com for pull secrets #663
Conversation
The account.coreos.com reference was stale, and pull-secrets aren't libvirt-specific, so I've dropped them from the libvirt docs entirely.

From Clayton, the flow for getting a pull secret will be:

1. Log in to try.openshift.com.
2. Accept the terms.
3. Get a pull secret you can download or copy/paste back into a local file.

Podman doesn't really come into it. Currently the secret you get there looks like:

$ cat ~/.personal/pull-secret.json
{
  "auths": {
    "cloud.openshift.com": {"auth": "...", "email": "..."},
    "quay.io": {"auth": "...", "email": "..."}
  }
}

Besides pulling images, the secret may also be used to authenticate to other services (e.g. telemetry) on hosts that do not contain image registries, which is more reason to decouple this from Podman.
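As an aside, a quick way to see which registries a secret like that covers (a minimal sketch, assuming jq is installed and reusing the example path above):

$ jq -r '.auths | keys[]' ~/.personal/pull-secret.json
cloud.openshift.com
quay.io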
416176f to 449326d
/hold this needs to wait until we have removed the
The kube-addon operator was the last remaining component in that namespace, and it was just controlling a metrics server. Metrics aren't critical to cluster functions, and dropping kube-addon means we don't need the old pull secret anymore (although we will shortly need new pull secrets for pulling private release images [1]). Also drop the admin and user roles [2], although I'm less clear on their connection.

[1]: openshift#663
[2]: openshift#682 (comment)
/hold cancel
#682 landed, removing the blocker here.
@@ -25,7 +25,7 @@ func (a *pullSecret) Generate(asset.Parents) error {
 &survey.Question{
 	Prompt: &survey.Input{
 		Message: "Pull Secret",
-		Help: "The container registry pull secret for this cluster, as a single line of JSON (e.g. {\"auths\": {...}}).",
+		Help: "The container registry pull secret for this cluster, as a single line of JSON (e.g. {\"auths\": {...}}).\n\nYou can get this secret from https://try.openshift.com",
Users were complaining that https://github.com/openshift/installer/pull/663/files#diff-e88e63c9df89ea6d2969694596325266L44 was insufficient, and now this is more obscure. We should add or point to some docs on the process for getting the pull secret.
try.openshift.com is fairly tightly scoped to this. Not sure about the old account.coreos.com. I'll go through the process again and see if there's anything that seems non-obvious.
So here's what the try.openshift.com flow looks like:

[screenshot]
I don't know what text I could add to make that process easier, except that it would be nice if the secret was on a single line (I think @crawford has already asked for that) or if the web page suggested downloading the secret instead of copy/pasting it. Thoughts?
Well, the OpenShift version is certainly putting it more front-and-center ;). That should help. #677 and #691 were both suggesting a "register for a Tectonic plan" step. Is that still required for OpenShift? I'd expect try.openshift.com to be enough user input to get that information, but I'm not sure what the from-scratch flow looks like (presumably there's an accept-terms intermediate? Maybe more?). @smarterclayton, are there screenshots for a from-scratch registration somewhere?
... but I'm not sure what the from-scratch flow looks like (presumably there's an accept-terms intermediate? Maybe more?).
So I just went through this with a different email address and GitHub auth, and there are two intermediate pages. The first is a form for personal information and terms acceptance:

[screenshot]
The next is an email-confirmation page:

[screenshot]
Clicking on the confirmation link in the email took me to the page I posted above, with the token front and center. So I don't think the new flow has anything like the old flow's "register for a Tectonic plan" step. Is that enough to back up the text I have here now? If not, what additional text would you like to see?
We'll add more docs when people ask for them.
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, wking

The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/retest
e2e-aws included:
Digging in a bit, the nodes look healthy:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/663/pull-ci-openshift-installer-master-e2e-aws/1771/artifacts/e2e-aws/nodes.json | jq -r '.items[] | {name: .metadata.name, role: (.metadata.labels | keys[] | select(. | startswith("node-role.kubernetes.io/"))), ready: [(.status.conditions[] | select(.type == "Ready") | {status, lastTransitionTime})][0]}'
{
"name": "ip-10-0-143-167.ec2.internal",
"role": "node-role.kubernetes.io/worker",
"ready": {
"status": "True",
"lastTransitionTime": "2018-11-30T06:31:05Z"
}
}
{
"name": "ip-10-0-147-44.ec2.internal",
"role": "node-role.kubernetes.io/worker",
"ready": {
"status": "True",
"lastTransitionTime": "2018-11-30T06:31:03Z"
}
}
{
"name": "ip-10-0-15-50.ec2.internal",
"role": "node-role.kubernetes.io/master",
"ready": {
"status": "True",
"lastTransitionTime": "2018-11-30T06:25:53Z"
}
}
{
"name": "ip-10-0-175-43.ec2.internal",
"role": "node-role.kubernetes.io/worker",
"ready": {
"status": "True",
"lastTransitionTime": "2018-11-30T06:30:59Z"
}
}
{
"name": "ip-10-0-21-63.ec2.internal",
"role": "node-role.kubernetes.io/master",
"ready": {
"status": "True",
"lastTransitionTime": "2018-11-30T06:25:28Z"
}
}
{
"name": "ip-10-0-45-158.ec2.internal",
"role": "node-role.kubernetes.io/master",
"ready": {
"status": "True",
"lastTransitionTime": "2018-11-30T06:25:51Z"
}
}

Here's the first event, bootstrapping finished, and the last event:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/663/pull-ci-openshift-installer-master-e2e-aws/1771/artifacts/e2e-aws/events.json | jq -r '[.items[] | .firstTimestamp] | sort[0]'
2018-11-30T06:22:46Z
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/663/pull-ci-openshift-installer-master-e2e-aws/1771/artifacts/e2e-aws/events.json | jq -r '.items[] | select(.metadata.name == "bootstrap-complete") | .firstTimestamp'
2018-11-30T06:28:31Z
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/663/pull-ci-openshift-installer-master-e2e-aws/1771/artifacts/e2e-aws/events.json | jq -r '[.items[] | .lastTimestamp] | sort[-1]'
2018-11-30T06:48:23Z

Here are the non-normal events by increasing lastTimestamp:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/663/pull-ci-openshift-installer-master-e2e-aws/1771/artifacts/e2e-aws/events.json | jq -r '[.items[] | select(.type != "Normal")] | sort_by(.lastTimestamp)[] | .lastTimestamp + " " + .message'
2018-11-30T06:23:13Z Failed to create new replica set "cluster-version-operator-5588cf49bc": replicasets.apps "cluster-version-operator-5588cf49bc" is forbidden: cannot set blockOwnerDeletion in this case because cannot find RESTMapping for APIVersion apps/v1 Kind Deployment: no matches for kind "Deployment" in version "apps/v1"
2018-11-30T06:23:42Z Error creating: pods "cluster-network-operator-" is forbidden: error looking up service account openshift-cluster-network-operator/default: serviceaccount "default" not found
2018-11-30T06:24:44Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:44Z 0/3 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 2 node(s) were not ready.
2018-11-30T06:25:44Z 0/3 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 2 node(s) were not ready.
2018-11-30T06:25:44Z 0/3 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 2 node(s) were not ready.
2018-11-30T06:25:51Z 0/3 nodes are available: 1 node(s) were not ready, 2 node(s) had taints that the pod didn't tolerate.
2018-11-30T06:25:51Z 0/3 nodes are available: 1 node(s) were not ready, 2 node(s) had taints that the pod didn't tolerate.
2018-11-30T06:25:51Z 0/3 nodes are available: 1 node(s) were not ready, 2 node(s) had taints that the pod didn't tolerate.
2018-11-30T06:25:54Z 0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate.
2018-11-30T06:26:14Z MountVolume.SetUp failed for volume "config-volume" : configmaps "dns-default" not found
2018-11-30T06:26:27Z Failed to create revision 1: configmaps "kube-controller-manager-pod" not found
2018-11-30T06:26:32Z Failed to create revision 1: configmaps "kube-apiserver-pod" not found
2018-11-30T06:26:32Z Failed to create revision 1: configmaps "kube-controller-manager-pod" not found
2018-11-30T06:26:33Z Failed to create revision 1: configmaps "kube-apiserver-pod" not found
2018-11-30T06:26:33Z Failed to create revision 1: configmaps "kube-apiserver-pod" not found
2018-11-30T06:26:33Z Failed to create revision 1: configmaps "kube-controller-manager-pod" not found
2018-11-30T06:26:35Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:26:35Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:26:35Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:26:36Z Failed to create revision 1: configmaps "kube-controller-manager-pod" not found
2018-11-30T06:26:46Z Failed to create revision 1: secrets "serving-cert" not found
2018-11-30T06:27:01Z Failed to create revision 1: secrets "serving-cert" not found
2018-11-30T06:27:07Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:08Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:08Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:16Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:16Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:16Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:38Z Readiness probe failed: HTTP probe failed with statuscode: 403
2018-11-30T06:28:00Z Readiness probe failed: Get https://10.129.0.9:8443/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018-11-30T06:28:02Z Readiness probe failed: Get https://10.130.0.15:8443/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018-11-30T06:28:08Z Readiness probe failed: HTTP probe failed with statuscode: 500
2018-11-30T06:28:09Z Readiness probe failed: HTTP probe failed with statuscode: 500
2018-11-30T06:28:11Z Readiness probe failed: HTTP probe failed with statuscode: 500
2018-11-30T06:28:29Z Failed to create installer pod for revision 1 on node "ip-10-0-45-158.ec2.internal": Post https://172.30.0.1:443/api/v1/namespaces/openshift-kube-controller-manager/pods: dial tcp 172.30.0.1:443: connect: connection refused
2018-11-30T06:28:31Z cluster bootstrapping has completed
2018-11-30T06:28:36Z Unable to mount volumes for pod "apiserver-8952q_openshift-apiserver(e19d1cfb-f468-11e8-9d8b-125c0f33f2c8)": timeout expired waiting for volumes to attach or mount for pod "openshift-apiserver"/"apiserver-8952q". list of unmounted volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]. list of unattached volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]
2018-11-30T06:28:36Z Unable to mount volumes for pod "apiserver-d4tst_openshift-apiserver(e19e7a6a-f468-11e8-9d8b-125c0f33f2c8)": timeout expired waiting for volumes to attach or mount for pod "openshift-apiserver"/"apiserver-d4tst". list of unmounted volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]. list of unattached volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]
2018-11-30T06:28:36Z Unable to mount volumes for pod "apiserver-xhfw4_openshift-apiserver(e19e9bd7-f468-11e8-9d8b-125c0f33f2c8)": timeout expired waiting for volumes to attach or mount for pod "openshift-apiserver"/"apiserver-xhfw4". list of unmounted volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]. list of unattached volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]
2018-11-30T06:28:38Z Readiness probe failed: HTTP probe failed with statuscode: 403
2018-11-30T06:30:46Z network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni config uninitialized]
2018-11-30T06:30:58Z network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni config uninitialized]
2018-11-30T06:30:59Z network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni config uninitialized]
2018-11-30T06:31:41Z MountVolume.SetUp failed for volume "default-certificate" : secrets "router-certs-default" not found
2018-11-30T06:31:41Z MountVolume.SetUp failed for volume "default-certificate" : secrets "router-certs-default" not found
2018-11-30T06:31:41Z MountVolume.SetUp failed for volume "default-certificate" : secrets "router-certs-default" not found
2018-11-30T06:32:30Z Readiness probe failed: Get https://10.129.0.13:8443/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018-11-30T06:32:30Z MountVolume.SetUp failed for volume "secret-grafana-tls" : secrets "grafana-tls" not found
2018-11-30T06:32:35Z Readiness probe failed: HTTP probe failed with statuscode: 403
2018-11-30T06:33:00Z Readiness probe failed: Get https://10.128.0.16:8443/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018-11-30T06:33:09Z Readiness probe failed: HTTP probe failed with statuscode: 500
2018-11-30T06:33:46Z Readiness probe failed: Get https://10.130.0.20:8443/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018-11-30T06:37:42Z Error deleting EBS volume "vol-0117599289fd757d7" since volume is in "creating" state
2018-11-30T06:37:42Z Error deleting EBS volume "vol-04de6a4ab315162c4" since volume is in "creating" state
2018-11-30T06:37:42Z Error deleting EBS volume "vol-0443c140090950e02" since volume is in "creating" state
2018-11-30T06:37:42Z Error deleting EBS volume "vol-09fe6717b780bc627" since volume is in "creating" state
2018-11-30T06:37:42Z Error deleting EBS volume "vol-08d34a638c77690dc" since volume is in "creating" state
2018-11-30T06:37:42Z Error deleting EBS volume "vol-0eac6185a2ef7b842" since volume is in "creating" state
2018-11-30T06:37:45Z Error deleting EBS volume "vol-0a8974406639d7456" since volume is in "creating" state
2018-11-30T06:37:46Z Error deleting EBS volume "vol-0b762f7d8da6d2741" since volume is in "creating" state
2018-11-30T06:37:46Z Error deleting EBS volume "vol-022f667d8a414e96c" since volume is in "creating" state
2018-11-30T06:37:47Z Error deleting EBS volume "vol-0c399eb6be225efa5" since volume is in "creating" state
2018-11-30T06:37:47Z Error deleting EBS volume "vol-0e4b6600546d26835" since volume is in "creating" state You can see some canceled requests around the same time as the test-suite "could not find" errors. And not all that long before are the |
/retest
I'll kick this again after #769 lands, since that one actually helps reduce flakes.
/retest
/retest
@wking: The following test failed, say /retest to rerun them all:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
try.openshift.com used to be an HTML redirect to cloud.openshift.com, but now it's a page in its own right talking about what OpenShift 4 is. Folks who are trying to find a pull secret for the installer are already pretty interested, so they shouldn't have to dig too hard to get the JSON they need. This will also help avoid confusion like we saw for the CoreOS flow [1], where the pull secret was not immediately obvious to several users due to an undocumented "register for a Tectonic plan" intermediate [2,3]. Currently the JavaScript on /clusters/install is stripping the '#pull-secret' fragment. Hopefully that will get sorted out soon and this link will drop users right into the section with the pull-secret. [1]: openshift#663 (comment) [2]: openshift#677 [3]: openshift#691