Skip to content

cmd/openshift-install: Add 'destroy bootstrap' command #493

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 19, 2018

Conversation

wking
Copy link
Member

@wking wking commented Oct 19, 2018

Using Terraform to remove all resources created by the bootstrap modules. For this to work, all platforms must define a bootstrap module (and they all currently do).

I've also created a LoadMetadata helper to share the "retrieve the metadata from the asset directory" logic between the destroy-cluster and destroy-bootstrap logic. The new helper lives in the cluster asset plackage close to the code that determines that file's location.

/assign @crawford

@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 19, 2018
@wking wking force-pushed the bootstrap-destroy branch 2 times, most recently from 151beee to 9b8cfda Compare October 19, 2018 17:22
@wking wking changed the title cmd/openshift-install: Add destroy-bootstrap command cmd/openshift-install: Add 'destroy bootstrap' command Oct 19, 2018
@wking
Copy link
Member Author

wking commented Oct 19, 2018

Based on discussion with @abhinavdahiya and @crawford, I've pushed 151beee -> 9b8cfda to make this openshift-install destroy bootstrap. I've also renamed the full-cluster destroy to openshift-install destroy cluster. The old openshift-install destroy-cluster is still there until we migrate CI, but it's deprecated.

@wking wking force-pushed the bootstrap-destroy branch from 9b8cfda to 640547a Compare October 19, 2018 17:28
@crawford
Copy link
Contributor

/approve

@wking wking force-pushed the bootstrap-destroy branch from 640547a to 69078c7 Compare October 19, 2018 21:16
@abhinavdahiya
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 19, 2018
wking added 2 commits October 19, 2018 16:11
So it's easier for consumers like the cluster asset to use the
conventional name.
Using Terraform to remove all resources created by the bootstrap
modules.  For this to work, all platforms must define a bootstrap
module (and they all currently do).

This command moves the previous destroy-cluster into a new 'destroy
cluster' subcommand, because grouping different destroy flavors into
sub-commands makes the base command easier to understand.  We expect
both destroy flavors to be long-running, because it's hard to write
generic logic for "is the cluster sufficiently live for us to remove
the bootstrap".  We don't want to hang forever if the cluster dies
before coming up, but there's no solid rules for how long to wait
before deciding that it's never going to come up.  When we start
destroying the bootstrap resources automatically in the future, will
pick reasonable timeouts, but will want to still provide callers with
the ability to manually remove the bootstrap resources if we happen to
fall out of that timeout on a cluster that does eventually come up.

I've also created a LoadMetadata helper to share the "retrieve the
metadata from the asset directory" logic between the destroy-cluster
and destroy-bootstrap logic.  The new helper lives in the cluster
asset plackage close to the code that determines that file's location.

I've pushed the Terraform module unpacking and 'terraform init' call
down into a helper used by the Apply and Destroy functions to make
life easier on the callers.

I've also fixed a path.Join -> filepath.Join typo in Apply, which
dates back to ff5a57b (pkg/terraform: Modify some helper functions
for the new binary layout, 2018-09-19, openshift#289).  These aren't network
paths ;).
@wking wking force-pushed the bootstrap-destroy branch from 69078c7 to 3f4fe57 Compare October 19, 2018 23:12
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Oct 19, 2018
@wking
Copy link
Member Author

wking commented Oct 19, 2018

Rebased around #497 with 69078c7 -> 3f4fe57.

@abhinavdahiya
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 19, 2018
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, crawford, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [abhinavdahiya,crawford,wking]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit e7d9f49 into openshift:master Oct 19, 2018
@wking
Copy link
Member Author

wking commented Oct 21, 2018

We may want to do something to make this more gentle. Maybe run some soft-shutdown script before firing the event. As it stands, running openshift-install delete bootstrap after the bootstrap-complete event comes through leaves you with the cluster wondering where the bootstrap node went:

$ KUBECONFIG=wking/auth/kubeconfig oc get nodes
NAME              STATUS     ROLES       AGE       VERSION
wking-bootstrap   NotReady   bootstrap   46m       v1.11.0+d4cacc0
wking-master-0    Ready      master      46m       v1.11.0+d4cacc0
$ KUBECONFIG=wking/auth/kubeconfig oc get node -o json wking-bootstrap | jq '.status.conditions[0]'
{
  "lastHeartbeatTime": "2018-10-21T04:37:20Z",
  "lastTransitionTime": "2018-10-21T04:42:24Z",
  "message": "Kubelet stopped posting node status.",
  "reason": "NodeStatusUnknown",
  "status": "Unknown",
  "type": "OutOfDisk"
}

What we want is to remove that node entirely. You can currently do that by hand:

$ KUBECONFIG=wking/auth/kubeconfig oc delete node wking-bootstrap
node "wking-bootstrap" deleted

but I'm not sure what the best way is to wire that into delete bootstrap.

@crawford
Copy link
Contributor

The long-term plan is to never introduce the bootstrap node into K8S. I have a local patch that tries to do this. I'll have to rebase and try again.

@wking wking deleted the bootstrap-destroy branch October 22, 2018 17:42
hardys pushed a commit to hardys/installer that referenced this pull request Oct 26, 2018
This code was recently reworked in:

openshift#493

Adding OpenStack here means the destroy bootstrap option will
work for OpenStack, which is a first step towards full destroy
support for the OpenStack platform.
wking added a commit to wking/openshift-release that referenced this pull request Oct 26, 2018
…uster'

Catching up with openshift/installer@3f4fe574 (cmd/openshift-install:
Add 'destroy bootstrap' command, openshift/installer#493).

Also change 'cluster' -> 'create cluster' to catch up with
openshift/installer@85c76c98 (cmd/openshift-install: Push creation
under 'openshift create', 2018-10-22, openshift/installer#513).
wking added a commit to wking/openshift-installer that referenced this pull request Nov 4, 2018
Since 3f4fe57 (cmd/openshift-install: Add 'destroy bootstrap'
command, 2018-10-18, openshift#493), the bootstrap module is targeted for
deletion by 'destroy bootstrap'.  Moving the S3 bucket into the module
means that the bucket, which we only use for hosting the bootstrap
Ignition configuration, will be removed along with the rest of the
bootstrap resources.
wking added a commit to wking/openshift-installer that referenced this pull request Nov 21, 2018
These have been deprecated since 3f4fe57 (cmd/openshift-install: Add
'destroy bootstrap' command, 2018-10-18, openshift#493) and 85c76c9
(cmd/openshift-install: Push creation under 'openshift create',
2018-10-22, openshift#513).

The destroy commands were deprecated before the 0.3.0 release, so
removing them now should be fine.

The create commands landed after 0.3.0, but it's been a long time
since they were deprecated.  Our released versions currently go stale
pretty fast, so there are unlikely to be any consumers of the old
create commands that this removal would break.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants