-
Notifications
You must be signed in to change notification settings - Fork 1.9k
gather/aws/console: Add a step to gather console logs on AWS #9743
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gather/aws/console: Add a step to gather console logs on AWS #9743
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wking: 1 invalid OWNERS file
In response to this:
History of this logic:
- Initially landed for nodes in e102a16 (#6189).
- Grew
--text
in 6ec5bf3 (#6536).- Grew machine handling in a469f53 (#6906).
- The
node-provider-IDs
creation was ported to steps in e2fd5c7 (#6708), but without any consumer for the collected file.The
aws-instance-ids.txt
injection allows install-time steps to register additional instances for later console collection (for a proxy instance, bootstrap instance, etc.).Approvers are:
- Myself and @vrutkovs, who have touched this logic in the past.
- @cgwalters, representing the RHCOS/machine-config space that needs console logs to debug RHCOS issues.
- @enxebre, representing the machine-API space that needs console logs to debug failed-to-boot issues.
Feel free to recommend additional approvers :).
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
dba286c
to
32006d7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wking: 1 invalid OWNERS file
In response to this:
History of this logic:
- Initially landed for nodes in e102a16 (#6189).
- Grew
--text
in 6ec5bf3 (#6536).- Grew machine handling in a469f53 (#6906).
- The
node-provider-IDs
creation was ported to steps in e2fd5c7 (#6708), but without any consumer for the collected file.The
aws-instance-ids.txt
injection allows install-time steps to register additional instances for later console collection (for a proxy instance, bootstrap instance, etc.).Approvers are:
- Myself and @vrutkovs, who have touched this logic in the past.
- @cgwalters, representing the RHCOS/machine-config space that needs console logs to debug RHCOS issues.
- @enxebre, representing the machine-API space that needs console logs to debug failed-to-boot issues.
Feel free to recommend additional approvers :).
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
32006d7
to
2a14db3
Compare
Thanks for doing this, I and many others would have found this useful in the past. But...I think coreos/ignition#585 would cover a lot of what this does in a much better way - most notably we could start to tie it into the machineAPI so we have a state between |
(Also that approach would work universally across other clouds, bare metal etc.) |
Sounds good to me, although there's a window where the machine is not alive enough to call itself in where polling for console logs would still be useful. And that ticket is from 2017? So regardless of whether we feel like my console polling is useful long-term, I think it's definitely useful in the short term. Or do you expect movement on the Ignition front in the next few months? |
Agreed.
Probably not but I'm more trying to coax/trick/convince you into hacking on Ignition 😄 |
2a14db3
to
571dcc0
Compare
Pushed 2a14db3a0b -> 571dcc0340 to address |
History of this logic: * Initially landed for nodes in e102a16 (ci-operator/templates/openshift/installer/cluster-launch-installer-e2e: Gather node console logs on AWS, 2019-12-02, openshift#6189). * Grew --text in 6ec5bf3 (installer artifacts: keep text version of instance output, 2020-01-02, openshift#6536). * Grew machine handling in a469f53 (ci-operator/templates/openshift/installer/cluster-launch-installer-e2e: Gather console logs from Machine providerID too, 2020-01-29, openshift#6906). * The node-provider-IDs creation was ported to steps in e2fd5c7 (step-registry: update deprovision step, 2020-01-30, openshift#6708), but without any consumer for the collected file. The aws-instance-ids.txt injection allows install-time steps to register additional instances for later console collection (for a proxy instance, bootstrap instance, etc.). Approvers are: * Myself and Vadim, who have touched this logic in the past. * Alberto, representing the machine-API space that needs console logs to debug failed-to-boot issues. * Colin, representing the RHCOS/machine-config space that needs console logs to debug RHCOS issues. TMPDIR documentation is based on POSIX [1]. [1]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03
571dcc0
to
390d6be
Compare
I've pushed 571dcc0340 -> 390d6be to fix. |
etcd-operator/e2e-aws looks good now. I think this PR is good to merge. /assign @cgwalters Since you've glanced at it already, can you |
@wking: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
(Does submitting jobs like this really try rehearsing all jobs whch depend on it? That's like ~20 clusters launched by this PR or so?) /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cgwalters, wking The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
I think there's a cap. But yeah, it's a high cap. |
@wking: Updated the following 2 configmaps:
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
env: | ||
- name: TMPDIR | ||
default: /tmp | ||
documentation: A pathname of a directory made available for programs that need a place to create temporary files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thought as #9640 (comment)
I think exposing TMPDIR as a user option is mis use of setting variables for the bash script.
Looks like 4.3- upi-installer lack |
Fixup in #9791. |
…urced IDs Fixing a typo we've had since the step was born in 390d6be (gather/aws/console: Add a step to gather console logs on AWS, 2020-06-17, openshift#9743).
Like the similar aws-console step, but for GCP, following [1]. I'm not sure why they require a --zone argument when 'instances list' seems to be able to figure this out, but whatever. Leaving it off doesn't seem to be an option: wking@cloudshell:~ (openshift-gce-devel-ci)$ gcloud --format json compute instances get-serial-port-output "${INSTANCE_ID}" </dev/null >logs.json ERROR: (gcloud.compute.instances.get-serial-port-output) Underspecified resource [ci-ln-2gtd6db-f76d1-fj4cl-master-0]. Specify the [--zone] flag. wking@cloudshell:~ (openshift-gce-devel-ci)$ gcloud --version | head -n1 Google Cloud SDK 315.0.0 OWNERS are like AWS, from 390d6be (gather/aws/console: Add a step to gather console logs on AWS, 2020-06-17, openshift#9743), and also the installer's GCP approvers [2]. [1]: https://cloud.google.com/compute/docs/reference/rest/v1/instances/getSerialPortOutput [2]: https://github.com/openshift/installer/blob/e17697302ff991ae80d6e9c0c10426db510881e2/OWNERS_ALIASES#L60-L62
History of this logic:
--text
in 6ec5bf3 (Set LC_ALL in artifacts so that aws cli would convert unicode correctly #6536).node-provider-IDs
creation was ported to steps in e2fd5c7 (step-registry: add configure and install IPI steps #6708), but without any consumer for the collected file.The
aws-instance-ids.txt
injection allows install-time steps to register additional instances for later console collection (for a proxy instance, bootstrap instance, etc.).Approvers are:
Feel free to recommend additional approvers :).