*: Point at try.openshift.com for pull secrets #663


Merged

Conversation

@wking (Member) commented Nov 13, 2018

The account.coreos.com reference was stale, and pull-secrets aren't libvirt-specific, so I've dropped them from the libvirt docs entirely.

From @smarterclayton, the flow for getting a pull secret will be:

  1. Log in to try.openshift.com.
  2. Accept the terms.
  3. Get a pull secret you can download or copy/paste back into a local file.

Podman doesn't really come into it. Currently the secret you get there looks like:

$ cat pull-secret.json
{
  "auths": {
    "cloud.openshift.com": {"auth": "...", "email": "..."},
    "quay.io": {"auth": "...", "email": "..."}
  }
}

Besides pulling images, the secret may also be used to authenticate to other services (e.g. telemetry) on hosts that do not contain image registries, which is more reason to decouple this from Podman.
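
Since the installer prompt expects this secret as a single line of JSON, here is a minimal Go sketch (illustrative only, not part of this PR; the pull-secret.json file name is just an example) that checks the downloaded file parses, confirms it has an "auths" map, and re-emits it as one compact line for pasting:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io/ioutil"
	"os"
)

func main() {
	// File name is an example; point this at wherever you saved the download.
	raw, err := ioutil.ReadFile("pull-secret.json")
	if err != nil {
		fmt.Fprintln(os.Stderr, "read:", err)
		os.Exit(1)
	}

	// Sanity-check that the secret parses and has the expected "auths" map.
	var secret struct {
		Auths map[string]json.RawMessage `json:"auths"`
	}
	if err := json.Unmarshal(raw, &secret); err != nil {
		fmt.Fprintln(os.Stderr, "parse:", err)
		os.Exit(1)
	}
	if len(secret.Auths) == 0 {
		fmt.Fprintln(os.Stderr, `no entries under "auths"`)
		os.Exit(1)
	}

	// Re-emit the secret as a single compact line for the installer prompt.
	var compact bytes.Buffer
	if err := json.Compact(&compact, raw); err != nil {
		fmt.Fprintln(os.Stderr, "compact:", err)
		os.Exit(1)
	}
	fmt.Println(compact.String())
}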

@openshift-ci-robot added the size/S and approved labels on Nov 13, 2018.
@wking force-pushed the openshift-pull-secret branch from 416176f to 449326d on November 13, 2018 at 18:17.
@abhinavdahiya (Contributor):

/hold this needs to wait until we have removed the kube-addon-operator.

@abhinavdahiya added the do-not-merge/hold label on Nov 13, 2018.
wking added a commit to wking/openshift-installer that referenced this pull request Nov 15, 2018
The kube-addon operator was the last remaining component in that
namespace, and it was just controlling a metrics server.  Metrics
aren't critical to cluster functions, and dropping kube-addon means we
don't need the old pull secret anymore (although we will shortly need
new pull secrets for pulling private release images [1]).

Also drop the admin and user roles [2], although I'm less clear on
their connection.

[1]: openshift#663
[2]: openshift#682 (comment)
@wking (Member, Author) commented Nov 15, 2018

/hold cancel

#682 landed, removing the blocker here.

@openshift-ci-robot removed the do-not-merge/hold label on Nov 15, 2018.
@@ -25,7 +25,7 @@ func (a *pullSecret) Generate(asset.Parents) error {
 		&survey.Question{
 			Prompt: &survey.Input{
 				Message: "Pull Secret",
-				Help:    "The container registry pull secret for this cluster, as a single line of JSON (e.g. {\"auths\": {...}}).",
+				Help:    "The container registry pull secret for this cluster, as a single line of JSON (e.g. {\"auths\": {...}}).\n\nYou can get this secret from https://try.openshift.com",
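
For context, here is a minimal standalone sketch of how this prompt behaves, assuming the gopkg.in/AlecAivazis/survey.v1 import path; the real installer wires this through its asset framework and also validates the JSON, so this is only an approximation:

package main

import (
	"fmt"
	"os"

	survey "gopkg.in/AlecAivazis/survey.v1"
)

func main() {
	// Standalone approximation of the installer's pull-secret question; press
	// '?' at the prompt to display the Help text.
	var answers struct {
		PullSecret string `survey:"pullsecret"`
	}
	err := survey.Ask([]*survey.Question{{
		Name: "pullsecret",
		Prompt: &survey.Input{
			Message: "Pull Secret",
			Help:    "The container registry pull secret for this cluster, as a single line of JSON (e.g. {\"auths\": {...}}).\n\nYou can get this secret from https://try.openshift.com",
		},
	}}, &answers)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("captured %d bytes of pull secret\n", len(answers.PullSecret))
}
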
Contributor:

Users were complaining that https://github.com/openshift/installer/pull/663/files#diff-e88e63c9df89ea6d2969694596325266L44 was insufficient; now this is more obscure. We should add, or point to, some docs on the process for getting the pull secret.

@wking (Member, Author):

try.openshift.com is fairly tightly scoped to this. Not sure about the old account.coreos.com. I'll go through the process again and see if there's anything that seems non-obvious.

@wking (Member, Author):

So here's what the try.openshift.com flow looks like:

[screenshot: login]

[screenshot: secret]

I don't know what text I could add to make that process easier, except that it would be nice if the secret was on a single line (I think @crawford has already asked for that) or if the web page suggested downloading the secret instead of copy/pasting it. Thoughts?

Contributor:

Even the old process was 2 steps:
[screenshot from 2018-11-16 12-53-45]
[screenshot from 2018-11-16 12-54-11]

Users still kept asking us?

@wking (Member, Author):

Well, the OpenShift version is certainly putting it more front-and-center ;). That should help. #677 and #691 were both suggesting a "register for a Tectonic plan" step. Is that still required for OpenShift? I'd expect try.openshift.com to be enough user input to get that information, but I'm not sure what the from-scratch flow looks like (presumably there's an accept-terms intermediate? Maybe more?). @smarterclayton, are there screenshots for a from-scratch registration somewhere?

@wking (Member, Author):

... but I'm not sure what the from-scratch flow looks like (presumably there's an accept-terms intermediate? Maybe more?).

So I just went through this with a different email address and GitHub auth, and there are two intermediate pages. The first is a form for personal information and term-acceptance:

[screenshot: form]

The next is an email-confirmation page:

[screenshot: confirm]

Clicking on the confirmation link in the email took me to the page I posted above, with the token front and center. So I don't think the new flow has anything like the old flow's "register for a Tectonic plan" step. Is that enough to back up the text I have here now? If not, what additional text would you like to see?

@abhinavdahiya (Contributor):

We'll add more docs when people ask for them.
/lgtm

@openshift-ci-robot added the lgtm label on Nov 28, 2018.
@openshift-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [abhinavdahiya,wking]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wking (Member, Author) commented Nov 30, 2018

/retest

@wking (Member, Author) commented Nov 30, 2018

e2e-aws included:

E1130 06:33:41.217223     666 memcache.go:147] couldn't get resource list for authorization.openshift.io/v1: the server could not find the requested resource
E1130 06:33:41.393011     666 memcache.go:147] couldn't get resource list for project.openshift.io/v1: the server could not find the requested resource
E1130 06:33:41.420379     666 memcache.go:147] couldn't get resource list for quota.openshift.io/v1: the server could not find the requested resource

Digging in a bit, the nodes look healthy:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/663/pull-ci-openshift-installer-master-e2e-aws/1771/artifacts/e2e-aws/nodes.json | jq -r '.items[] | {name: .metadata.name, role: (.metadata.labels | keys[] | select(. | startswith("node-role.kubernetes.io/"))), ready: [(.status.conditions[] | select(.type == "Ready") | {status, lastTransitionTime})][0]}'
{
  "name": "ip-10-0-143-167.ec2.internal",
  "role": "node-role.kubernetes.io/worker",
  "ready": {
    "status": "True",
    "lastTransitionTime": "2018-11-30T06:31:05Z"
  }
}
{
  "name": "ip-10-0-147-44.ec2.internal",
  "role": "node-role.kubernetes.io/worker",
  "ready": {
    "status": "True",
    "lastTransitionTime": "2018-11-30T06:31:03Z"
  }
}
{
  "name": "ip-10-0-15-50.ec2.internal",
  "role": "node-role.kubernetes.io/master",
  "ready": {
    "status": "True",
    "lastTransitionTime": "2018-11-30T06:25:53Z"
  }
}
{
  "name": "ip-10-0-175-43.ec2.internal",
  "role": "node-role.kubernetes.io/worker",
  "ready": {
    "status": "True",
    "lastTransitionTime": "2018-11-30T06:30:59Z"
  }
}
{
  "name": "ip-10-0-21-63.ec2.internal",
  "role": "node-role.kubernetes.io/master",
  "ready": {
    "status": "True",
    "lastTransitionTime": "2018-11-30T06:25:28Z"
  }
}
{
  "name": "ip-10-0-45-158.ec2.internal",
  "role": "node-role.kubernetes.io/master",
  "ready": {
    "status": "True",
    "lastTransitionTime": "2018-11-30T06:25:51Z"
  }
}

Here's the first event, bootstrapping finished, and the last event:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/663/pull-ci-openshift-installer-master-e2e-aws/1771/artifacts/e2e-aws/events.json | jq -r '[.items[] | .firstTimestamp] | sort[0]'
2018-11-30T06:22:46Z
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/663/pull-ci-openshift-installer-master-e2e-aws/1771/artifacts/e2e-aws/events.json | jq -r '.items[] | select(.metadata.name == "bootstrap-complete") | .firstTimestamp'
2018-11-30T06:28:31Z
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/663/pull-ci-openshift-installer-master-e2e-aws/1771/artifacts/e2e-aws/events.json | jq -r '[.items[] | .lastTimestamp] | sort[-1]'
2018-11-30T06:48:23Z

Here are the non-normal events by increasing lastTimestamp:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/663/pull-ci-openshift-installer-master-e2e-aws/1771/artifacts/e2e-aws/events.json | jq -r '[.items[] | select(.type != "Normal")] | sort_by(.lastTimestamp)[] | .lastTimestamp + " " + .message'
2018-11-30T06:23:13Z Failed to create new replica set "cluster-version-operator-5588cf49bc": replicasets.apps "cluster-version-operator-5588cf49bc" is forbidden: cannot set blockOwnerDeletion in this case because cannot find RESTMapping for APIVersion apps/v1 Kind Deployment: no matches for kind "Deployment" in version "apps/v1"
2018-11-30T06:23:42Z Error creating: pods "cluster-network-operator-" is forbidden: error looking up service account openshift-cluster-network-operator/default: serviceaccount "default" not found
2018-11-30T06:24:44Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:21Z 0/3 nodes are available: 3 node(s) were not ready.
2018-11-30T06:25:44Z 0/3 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 2 node(s) were not ready.
2018-11-30T06:25:44Z 0/3 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 2 node(s) were not ready.
2018-11-30T06:25:44Z 0/3 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 2 node(s) were not ready.
2018-11-30T06:25:51Z 0/3 nodes are available: 1 node(s) were not ready, 2 node(s) had taints that the pod didn't tolerate.
2018-11-30T06:25:51Z 0/3 nodes are available: 1 node(s) were not ready, 2 node(s) had taints that the pod didn't tolerate.
2018-11-30T06:25:51Z 0/3 nodes are available: 1 node(s) were not ready, 2 node(s) had taints that the pod didn't tolerate.
2018-11-30T06:25:54Z 0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate.
2018-11-30T06:26:14Z MountVolume.SetUp failed for volume "config-volume" : configmaps "dns-default" not found
2018-11-30T06:26:27Z Failed to create revision 1: configmaps "kube-controller-manager-pod" not found
2018-11-30T06:26:32Z Failed to create revision 1: configmaps "kube-apiserver-pod" not found
2018-11-30T06:26:32Z Failed to create revision 1: configmaps "kube-controller-manager-pod" not found
2018-11-30T06:26:33Z Failed to create revision 1: configmaps "kube-apiserver-pod" not found
2018-11-30T06:26:33Z Failed to create revision 1: configmaps "kube-apiserver-pod" not found
2018-11-30T06:26:33Z Failed to create revision 1: configmaps "kube-controller-manager-pod" not found
2018-11-30T06:26:35Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:26:35Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:26:35Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:26:36Z Failed to create revision 1: configmaps "kube-controller-manager-pod" not found
2018-11-30T06:26:46Z Failed to create revision 1: secrets "serving-cert" not found
2018-11-30T06:27:01Z Failed to create revision 1: secrets "serving-cert" not found
2018-11-30T06:27:07Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:08Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:08Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:16Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:16Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:16Z MountVolume.SetUp failed for volume "serving-cert" : secrets "serving-cert" not found
2018-11-30T06:27:38Z Readiness probe failed: HTTP probe failed with statuscode: 403
2018-11-30T06:28:00Z Readiness probe failed: Get https://10.129.0.9:8443/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018-11-30T06:28:02Z Readiness probe failed: Get https://10.130.0.15:8443/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018-11-30T06:28:08Z Readiness probe failed: HTTP probe failed with statuscode: 500
2018-11-30T06:28:09Z Readiness probe failed: HTTP probe failed with statuscode: 500
2018-11-30T06:28:11Z Readiness probe failed: HTTP probe failed with statuscode: 500
2018-11-30T06:28:29Z Failed to create installer pod for revision 1 on node "ip-10-0-45-158.ec2.internal": Post https://172.30.0.1:443/api/v1/namespaces/openshift-kube-controller-manager/pods: dial tcp 172.30.0.1:443: connect: connection refused
2018-11-30T06:28:31Z cluster bootstrapping has completed
2018-11-30T06:28:36Z Unable to mount volumes for pod "apiserver-8952q_openshift-apiserver(e19d1cfb-f468-11e8-9d8b-125c0f33f2c8)": timeout expired waiting for volumes to attach or mount for pod "openshift-apiserver"/"apiserver-8952q". list of unmounted volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]. list of unattached volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]
2018-11-30T06:28:36Z Unable to mount volumes for pod "apiserver-d4tst_openshift-apiserver(e19e7a6a-f468-11e8-9d8b-125c0f33f2c8)": timeout expired waiting for volumes to attach or mount for pod "openshift-apiserver"/"apiserver-d4tst". list of unmounted volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]. list of unattached volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]
2018-11-30T06:28:36Z Unable to mount volumes for pod "apiserver-xhfw4_openshift-apiserver(e19e9bd7-f468-11e8-9d8b-125c0f33f2c8)": timeout expired waiting for volumes to attach or mount for pod "openshift-apiserver"/"apiserver-xhfw4". list of unmounted volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]. list of unattached volumes=[config client-ca etcd-serving-ca etcd-client serving-cert openshift-apiserver-sa-token-qzc6w]
2018-11-30T06:28:38Z Readiness probe failed: HTTP probe failed with statuscode: 403
2018-11-30T06:30:46Z network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni config uninitialized]
2018-11-30T06:30:58Z network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni config uninitialized]
2018-11-30T06:30:59Z network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni config uninitialized]
2018-11-30T06:31:41Z MountVolume.SetUp failed for volume "default-certificate" : secrets "router-certs-default" not found
2018-11-30T06:31:41Z MountVolume.SetUp failed for volume "default-certificate" : secrets "router-certs-default" not found
2018-11-30T06:31:41Z MountVolume.SetUp failed for volume "default-certificate" : secrets "router-certs-default" not found
2018-11-30T06:32:30Z Readiness probe failed: Get https://10.129.0.13:8443/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018-11-30T06:32:30Z MountVolume.SetUp failed for volume "secret-grafana-tls" : secrets "grafana-tls" not found
2018-11-30T06:32:35Z Readiness probe failed: HTTP probe failed with statuscode: 403
2018-11-30T06:33:00Z Readiness probe failed: Get https://10.128.0.16:8443/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018-11-30T06:33:09Z Readiness probe failed: HTTP probe failed with statuscode: 500
2018-11-30T06:33:46Z Readiness probe failed: Get https://10.130.0.20:8443/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2018-11-30T06:37:42Z Error deleting EBS volume "vol-0117599289fd757d7" since volume is in "creating" state
2018-11-30T06:37:42Z Error deleting EBS volume "vol-04de6a4ab315162c4" since volume is in "creating" state
2018-11-30T06:37:42Z Error deleting EBS volume "vol-0443c140090950e02" since volume is in "creating" state
2018-11-30T06:37:42Z Error deleting EBS volume "vol-09fe6717b780bc627" since volume is in "creating" state
2018-11-30T06:37:42Z Error deleting EBS volume "vol-08d34a638c77690dc" since volume is in "creating" state
2018-11-30T06:37:42Z Error deleting EBS volume "vol-0eac6185a2ef7b842" since volume is in "creating" state
2018-11-30T06:37:45Z Error deleting EBS volume "vol-0a8974406639d7456" since volume is in "creating" state
2018-11-30T06:37:46Z Error deleting EBS volume "vol-0b762f7d8da6d2741" since volume is in "creating" state
2018-11-30T06:37:46Z Error deleting EBS volume "vol-022f667d8a414e96c" since volume is in "creating" state
2018-11-30T06:37:47Z Error deleting EBS volume "vol-0c399eb6be225efa5" since volume is in "creating" state
2018-11-30T06:37:47Z Error deleting EBS volume "vol-0e4b6600546d26835" since volume is in "creating" state

You can see some canceled requests around the same time as the test-suite "could not find" errors. And not long before those are the secrets "router-certs-default" not found errors (although that issue may have resolved itself by the time of the test errors; I'm not sure who populates that secret).

@abhinavdahiya (Contributor):

/retest

@wking (Member, Author) commented Dec 1, 2018

Release error:

Unable to connect to the server: net/http: TLS handshake timeout
2018/12/01 02:02:04 Container release in pod release-latest failed, exit code 1, reason Error

I'll kick this again after #769 lands, since that one actually helps reduce flakes.

@wking (Member, Author) commented Dec 1, 2018

/retest

@wking (Member, Author) commented Dec 1, 2018

/retest

@openshift-merge-robot merged commit 71956aa into openshift:master on Dec 1, 2018.
@openshift-ci-robot (Contributor):

@wking: The following test failed, say /retest to rerun them all:

Test name            Commit   Details  Rerun command
ci/prow/e2e-libvirt  449326d  link     /test e2e-libvirt

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@wking deleted the openshift-pull-secret branch on December 1, 2018 at 17:29.
wking added a commit to wking/openshift-installer that referenced this pull request Dec 13, 2018
try.openshift.com used to be an HTML redirect to cloud.openshift.com,
but now it's a page in its own right talking about what OpenShift 4
is.  Folks who are trying to find a pull secret for the installer are
already pretty interested, so they shouldn't have to dig too hard to
get the JSON they need.

This will also help avoid confusion like we saw for the CoreOS flow
[1], where the pull secret was not immediately obvious to several
users due to an undocumented "register for a Tectonic plan"
intermediate [2,3].

Currently the JavaScript on /clusters/install is stripping the
'#pull-secret' fragment.  Hopefully that will get sorted out soon and
this link will drop users right into the section with the pull-secret.

[1]: openshift#663 (comment)
[2]: openshift#677
[3]: openshift#691