*: Point at try.openshift.com for pull secrets #663


Merged

Conversation

@wking (Member) commented Nov 13, 2018

The account.coreos.com reference was stale, and pull-secrets aren't libvirt-specific, so I've dropped them from the libvirt docs entirely.

From @smarterclayton, the flow for getting a pull secret will be:

  1. Log in to try.openshift.com.
  2. Accept the terms.
  3. Get a pull secret you can download or copy/paste back into a local file.

Podman doesn't really come into it. Currently the secret you get there looks like:

$ cat pull-secret.json
{
  "auths": {
    "cloud.openshift.com": {"auth": "...", "email": "..."},
    "quay.io": {"auth": "...", "email": "..."}
  }
}

Besides pulling images, the secret may also be used to authenticate to other services (e.g. telemetry) on hosts that do not contain image registries, which is more reason to decouple this from Podman.
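
Since the installer prompt expects this secret as a single line of JSON, here is a minimal Go sketch (illustrative only, not part of this PR; the pull-secret.json file name is just an example) that checks the downloaded file parses, confirms it has an "auths" map, and re-emits it as one compact line for pasting:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io/ioutil"
	"os"
)

func main() {
	// File name is an example; point this at wherever you saved the download.
	raw, err := ioutil.ReadFile("pull-secret.json")
	if err != nil {
		fmt.Fprintln(os.Stderr, "read:", err)
		os.Exit(1)
	}

	// Sanity-check that the secret parses and has the expected "auths" map.
	var secret struct {
		Auths map[string]json.RawMessage `json:"auths"`
	}
	if err := json.Unmarshal(raw, &secret); err != nil {
		fmt.Fprintln(os.Stderr, "parse:", err)
		os.Exit(1)
	}
	if len(secret.Auths) == 0 {
		fmt.Fprintln(os.Stderr, `no entries under "auths"`)
		os.Exit(1)
	}

	// Re-emit the secret as a single compact line for the installer prompt.
	var compact bytes.Buffer
	if err := json.Compact(&compact, raw); err != nil {
		fmt.Fprintln(os.Stderr, "compact:", err)
		os.Exit(1)
	}
	fmt.Println(compact.String())
}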

@openshift-ci-robot added the size/S and approved labels on Nov 13, 2018.
@wking force-pushed the openshift-pull-secret branch from 416176f to 449326d on November 13, 2018 at 18:17.
@abhinavdahiya (Contributor):

/hold this needs to wait until we have removed the kube-addon-operator.

@abhinavdahiya added the do-not-merge/hold label on Nov 13, 2018.
wking added a commit to wking/openshift-installer that referenced this pull request Nov 15, 2018
The kube-addon operator was the last remaining component in that
namespace, and it was just controlling a metrics server.  Metrics
aren't critical to cluster functions, and dropping kube-addon means we
don't need the old pull secret anymore (although we will shortly need
new pull secrets for pulling private release images [1]).

Also drop the admin and user roles [2], although I'm less clear on
their connection.

[1]: openshift#663
[2]: openshift#682 (comment)
@wking (Member, Author) commented Nov 15, 2018

/hold cancel

#682 landed, removing the blocker here.

@openshift-ci-robot removed the do-not-merge/hold label on Nov 15, 2018.
@@ -25,7 +25,7 @@ func (a *pullSecret) Generate(asset.Parents) error {
 		&survey.Question{
 			Prompt: &survey.Input{
 				Message: "Pull Secret",
-				Help:    "The container registry pull secret for this cluster, as a single line of JSON (e.g. {\"auths\": {...}}).",
+				Help:    "The container registry pull secret for this cluster, as a single line of JSON (e.g. {\"auths\": {...}}).\n\nYou can get this secret from https://try.openshift.com",
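
For context, here is a minimal standalone sketch of how this prompt behaves, assuming the gopkg.in/AlecAivazis/survey.v1 import path; the real installer wires this through its asset framework and also validates the JSON, so this is only an approximation:

package main

import (
	"fmt"
	"os"

	survey "gopkg.in/AlecAivazis/survey.v1"
)

func main() {
	// Standalone approximation of the installer's pull-secret question; press
	// '?' at the prompt to display the Help text.
	var answers struct {
		PullSecret string `survey:"pullsecret"`
	}
	err := survey.Ask([]*survey.Question{{
		Name: "pullsecret",
		Prompt: &survey.Input{
			Message: "Pull Secret",
			Help:    "The container registry pull secret for this cluster, as a single line of JSON (e.g. {\"auths\": {...}}).\n\nYou can get this secret from https://try.openshift.com",
		},
	}}, &answers)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("captured %d bytes of pull secret\n", len(answers.PullSecret))
}
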
Contributor:

Users were complaining that https://github.com/openshift/installer/pull/663/files#diff-e88e63c9df89ea6d2969694596325266L44 was insufficient; now this is more obscure. We should add, or point to, some docs on the process for getting the pull secret.

@wking (Member, Author):

try.openshift.com is fairly tightly scoped to this. Not sure about the old account.coreos.com. I'll go through the process again and see if there's anything that seems non-obvious.

@wking (Member, Author):

So here's what the try.openshift.com flow looks like:

[screenshot: login]

[screenshot: secret]

I don't know what text I could add to make that process easier, except that it would be nice if the secret was on a single line (I think @crawford has already asked for that) or if the web page suggested downloading the secret instead of copy/pasting it. Thoughts?

Contributor:

Even the old process was 2 steps:
[screenshot from 2018-11-16 12-53-45]
[screenshot from 2018-11-16 12-54-11]

Users still kept asking us?

@wking (Member, Author):

Well, the OpenShift version is certainly putting it more front-and-center ;). That should help. #677 and #691 were both suggesting a "register for a Tectonic plan" step. Is that still required for OpenShift? I'd expect try.openshift.com to be enough user input to get that information, but I'm not sure what the from-scratch flow looks like (presumably there's an accept-terms intermediate? Maybe more?). @smarterclayton, are there screenshots for a from-scratch registration somewhere?

@wking (Member, Author):

... but I'm not sure what the from-scratch flow looks like (presumably there's an accept-terms intermediate? Maybe more?).

So I just went through this with a different email address and GitHub auth, and there are two intermediate pages. The first is a form for personal information and term-acceptance:

[screenshot: form]

The next is an email-confirmation page:

[screenshot: confirm]

Clicking on the confirmation link in the email took me to the page I posted above, with the token front and center. So I don't think the new flow has anything like the old flow's "register for a Tectonic plan" step. Is that enough to back up the text I have here now? If not, what additional text would you like to see?

@abhinavdahiya (Contributor):

We'll add more docs when people ask for them.
/lgtm

@openshift-ci-robot added the lgtm label on Nov 28, 2018.
@openshift-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [abhinavdahiya,wking]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wking (Member, Author) commented Nov 30, 2018

/retest

@wking (Member, Author) commented Nov 30, 2018

e2e-aws included:

E1130 06:33:41.217223     666 memcache.go:147] couldn't get resource list for authorization.openshift.io/v1: the server could not find the requested resource
E1130 06:33:41.393011     666 memcache.go:147] couldn't get resource list for project.openshift.io/v1: the server could not find the requested resource
E1130 06:33:41.420379     666 memcache.go:147] couldn't get resource list for quota.openshift.io/v1: the server could not find the requested resource

Digging in a bit, the nodes look healthy:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/663/pull-ci-openshift-installer-master-e2e-aws/1771/artifacts/e2e-aws/nodes.json | jq -r '.items[] | {name: .metadata.name, role: (.metadata.labels | keys[] | select(. | startswith("node-role.kubernetes.io/"))), ready: [(.status.conditions[] | select(.type == "Ready") | {status, lastTransitionTime})][0]}'
{
  "name": "ip-10-0-143-167.ec2.internal",
  "role": "node-role.kubernetes.io/worker",
  "ready": {
    "status": "True",
    "lastTransitionTime": "2018-11-30T06:31:05Z"
  }
}
{
  "name": "ip-10-0-147-44.ec2.internal",
  "role": "node-role.kubernetes.io/worker",
  "ready": {
    "status": "True",
    "lastTransitionTime": "2018-11-30T06:31:03Z"
  }
}
{
  "name": "ip-10-0-15-50.ec2.internal",
  "role": "node-role.kubernetes.io/master",
  "ready": {
    "status": "True",
    "lastTransitionTime": "2018-11-30T06:25:53Z"
  }
}
{
  "name": "ip-10-0-175-43.ec2.internal",
  "role": "node-role.kubernetes.io/worker",
  "ready": {
    "status": "True",
    "lastTransitionTime": "2018-11-30T06:30:59Z"
  }
}
{
  "name": "ip-10-0-21-63.ec2.internal",
  "role": "node-role.kubernetes.io/master",
  "ready": {
    "status": "True",
    "lastTransitionTime": "2018-11-30T06:25:28Z"
  }
}
{
  "name": "ip-10-0-45-158.ec2.internal",
  "role": "node-role.kubernetes.io/master",
  "ready": {
    "status": "True",
    "lastTransitionTime": "2018-11-30T06:25:51Z"
  }
}

Here's the first event, bootstrapping finished, and the last event:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/663/pull-ci-openshift-installer-master-e2e-aws/1771/artifacts/e2e-aws/events.json | jq -r '[.items[] | .firstTimestamp] | sort[0]'
2018-11-30T06:22:46Z
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/663/pull-ci-openshift-installer-master-e2e-aws/1771/artifacts/e2e-aws/events.json | jq -r '.items[] | select(.metadata.name == "bootstrap-complete") | .firstTimestamp'
2018-11-30T06:28:31Z
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/663/pull-ci-openshift-installer-master-e2e-aws/1771/artifacts/e2e-aws/events.json | jq -r '[.items[] | .lastTimestamp] | sort[-1]'
2018-11-30T06:48:23Z

Here are the non-normal events by increasing lastTimestamp:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/663/pull-ci-openshift-installer-master-e2e-aws/1771/artifacts/e2e-aws/events.json | jq -r '[.items[] | select(.type != "Normal")] | sort_by(.lastTimestamp)[] | .lastTimestamp + " " + .message'
2018-11-30T06:23:13Z Failed to create new replica set "cluster-version-operator-5588cf49bc": replicasets.apps "cluster-version-operator-5588cf49bc" is forbidden: cannot set blockOwnerDeletion in this case because cannot find RESTMapping for APIVersion apps/v1 Kind Deployment: no matches for kind "Deployment" in version "apps/v1"
2018-11-30T06:23:42Z Error creating: pods "cluster-network-operator-" is forbidden: error looking up service account openshift-cluster-network-operator/default: serviceaccount "default" not found
2018-11-30T06:24:44Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:44Z 0/3 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 2 node(s) were not ready.
2018-11-30T06:25:44Z 0/3 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 2 node(s) were not ready.
2018-11-30T06:25:44Z 0/3 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 2 node(s) were not ready.
2018-11-30T06:25:51Z 0/3 nodes are available: 1 node(s) were not ready, 2 node(s) had taints that the pod didn't tolerate.
2018-11-30T06:25:51Z 0/3 nodes are available: 1 node(s) were not ready, 2 node(s) had taints that the pod didn't tolerate.
2018-11-30T06:25:51Z 0/3 nodes are available: 1 node(s) were not ready, 2 node(s) had taints that the pod didn't tolerate.
2018-11-30T06:25:54Z 0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate.
2018-11-30T06:26:14Z MountVolume.SetUp failed for volume "config-volume" : configmaps "dns-default" not found
2018-11-30T06:26:27Z Failed to create revision 1: configmaps "kube-controller-manager-pod" not found
2018-11-30T06:26:32Z Failed to create revision 1: configmaps "kube-apiserver-pod" not found
2018-11-30T06:26:32Z Failed to create revision 1: configmaps "kube-controller-manager-pod" not found
2018-11-30T06:26:33Z Failed to create revision 1: configmaps "kube-apiserver-pod" not found
2018-11-30T06:26:33Z Failed to create revision 1: configmaps "kube-apiserver-pod" not found
2018-11-30T06:26:33Z Failed to create revision 1: configmaps "kube-controller-manager-pod" not found
2018-11-30T06:26:35Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:26:35Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:26:35Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:26:36Z Failed to create revision 1: configmaps "kube-controller-manager-pod" not found
2018-11-30T06:26:46Z Failed to create revision 1: secrets "serving-cert" not found
2018-11-30T06:27:01Z Failed to create revision 1: secrets "serving-cert" not found
2018-11-30T06:27:07Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:08Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:08Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:16Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:16Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:16Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:38Z Readiness probe failed: HTTP probe failed with statuscode: 403
2018-11-30T06:28:00Z Readiness probe failed: Get https://10.129.0.9:8443/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018-11-30T06:28:02Z Readiness probe failed: Get https://10.130.0.15:8443/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018-11-30T06:28:08Z Readiness probe failed: HTTP probe failed with statuscode: 500
2018-11-30T06:28:09Z Readiness probe failed: HTTP probe failed with statuscode: 500
2018-11-30T06:28:11Z Readiness probe failed: HTTP probe failed with statuscode: 500
2018-11-30T06:28:29Z Failed to create installer pod for revision 1 on node "ip-10-0-45-158.ec2.internal": Post https://172.30.0.1:443/api/v1/namespaces/openshift-kube-controller-manager/pods: dial tcp 172.30.0.1:443: connect: connection refused
2018-11-30T06:28:31Z cluster bootstrapping has completed
2018-11-30T06:28:36Z Unable to mount volumes for pod "apiserver-8952q_openshift-apiserver(e19d1cfb-f468-11e8-9d8b-125c0f33f2c8)": timeout expired waiting for volumes to attach or mount for pod "openshift-apiserver"/"apiserver-8952q". list of unmounted volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]. list of unattached volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]
2018-11-30T06:28:36Z Unable to mount volumes for pod "apiserver-d4tst_openshift-apiserver(e19e7a6a-f468-11e8-9d8b-125c0f33f2c8)": timeout expired waiting for volumes to attach or mount for pod "openshift-apiserver"/"apiserver-d4tst". list of unmounted volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]. list of unattached volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]
2018-11-30T06:28:36Z Unable to mount volumes for pod "apiserver-xhfw4_openshift-apiserver(e19e9bd7-f468-11e8-9d8b-125c0f33f2c8)": timeout expired waiting for volumes to attach or mount for pod "openshift-apiserver"/"apiserver-xhfw4". list of unmounted volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]. list of unattached volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]
2018-11-30T06:28:38Z Readiness probe failed: HTTP probe failed with statuscode: 403
2018-11-30T06:30:46Z network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni config uninitialized]
2018-11-30T06:30:58Z network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni config uninitialized]
2018-11-30T06:30:59Z network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni config uninitialized]
2018-11-30T06:31:41Z MountVolume.SetUp failed for volume "default-certificate" : secrets "router-certs-default" not found
2018-11-30T06:31:41Z MountVolume.SetUp failed for volume "default-certificate" : secrets "router-certs-default" not found
2018-11-30T06:31:41Z MountVolume.SetUp failed for volume "default-certificate" : secrets "router-certs-default" not found
2018-11-30T06:32:30Z Readiness probe failed: Get https://10.129.0.13:8443/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018-11-30T06:32:30Z MountVolume.SetUp failed for volume "secret-grafana-tls" : secrets "grafana-tls" not found
2018-11-30T06:32:35Z Readiness probe failed: HTTP probe failed with statuscode: 403
2018-11-30T06:33:00Z Readiness probe failed: Get https://10.128.0.16:8443/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018-11-30T06:33:09Z Readiness probe failed: HTTP probe failed with statuscode: 500
2018-11-30T06:33:46Z Readiness probe failed: Get https://10.130.0.20:8443/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018-11-30T06:37:42Z Error deleting EBS volume "vol-0117599289fd757d7" since volume is in "creating" state
2018-11-30T06:37:42Z Error deleting EBS volume "vol-04de6a4ab315162c4" since volume is in "creating" state
2018-11-30T06:37:42Z Error deleting EBS volume "vol-0443c140090950e02" since volume is in "creating" state
2018-11-30T06:37:42Z Error deleting EBS volume "vol-09fe6717b780bc627" since volume is in "creating" state
2018-11-30T06:37:42Z Error deleting EBS volume "vol-08d34a638c77690dc" since volume is in "creating" state
2018-11-30T06:37:42Z Error deleting EBS volume "vol-0eac6185a2ef7b842" since volume is in "creating" state
2018-11-30T06:37:45Z Error deleting EBS volume "vol-0a8974406639d7456" since volume is in "creating" state
2018-11-30T06:37:46Z Error deleting EBS volume "vol-0b762f7d8da6d2741" since volume is in "creating" state
2018-11-30T06:37:46Z Error deleting EBS volume "vol-022f667d8a414e96c" since volume is in "creating" state
2018-11-30T06:37:47Z Error deleting EBS volume "vol-0c399eb6be225efa5" since volume is in "creating" state
2018-11-30T06:37:47Z Error deleting EBS volume "vol-0e4b6600546d26835" since volume is in "creating" state

You can see some canceled requests around the same time as the test-suite "could not find" errors. And not long before those are the secrets "router-certs-default" not found errors (although that issue may have resolved itself by the time of the test errors; I'm not sure who populates that secret).

@abhinavdahiya (Contributor):

/retest

@wking (Member, Author) commented Dec 1, 2018

Release error:

Unable to connect to the server: net/http: TLS handshake timeout
2018/12/01 02:02:04 Container release in pod release-latest failed, exit code 1, reason Error

I'll kick this again after #769 lands, since that one actually helps reduce flakes.

@wking (Member, Author) commented Dec 1, 2018

/retest

@wking (Member, Author) commented Dec 1, 2018

/retest

@openshift-merge-robot merged commit 71956aa into openshift:master on Dec 1, 2018.
@openshift-ci-robot (Contributor):

@wking: The following test failed, say /retest to rerun them all:

Test name            Commit   Details  Rerun command
ci/prow/e2e-libvirt  449326d  link     /test e2e-libvirt

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@wking deleted the openshift-pull-secret branch on December 1, 2018 at 17:29.
wking added a commit to wking/openshift-installer that referenced this pull request Dec 13, 2018
try.openshift.com used to be an HTML redirect to cloud.openshift.com,
but now it's a page in its own right talking about what OpenShift 4
is.  Folks who are trying to find a pull secret for the installer are
already pretty interested, so they shouldn't have to dig too hard to
get the JSON they need.

This will also help avoid confusion like we saw for the CoreOS flow
[1], where the pull secret was not immediately obvious to several
users due to an undocumented "register for a Tectonic plan"
intermediate [2,3].

Currently the JavaScript on /clusters/install is stripping the
'#pull-secret' fragment.  Hopefully that will get sorted out soon and
this link will drop users right into the section with the pull-secret.

[1]: openshift#663 (comment)
[2]: openshift#677
[3]: openshift#691