[RFE] Installer does not use existing SSH agent #1865

Closed
@bison

Version

$ ./current/openshift-install version
./current/openshift-install unreleased-master-1130-g311a8a1266268aa454e45049e85f2b3c186715c7-dirty
built from commit 311a8a1266268aa454e45049e85f2b3c186715c7
release image registry.svc.ci.openshift.org/ocp/release@sha256:e898c929fea0b02146b074ca848e012de243d41f90cf87fc2f5dca3455563ce2

Platform (aws|libvirt|openstack):

AWS

What happened?

The install failed, and the installer was not able to SSH to the machines to gather logs:

INFO Pulling debug logs from the bootstrap machine
ERROR failed to create SSH client: failed to initialize the SSH agent: [failed to parse SSH private key from "[REDACTED]": ssh: no key found, failed to parse SSH private key from "[REDACTED]": ssh: no key found, failed to parse SSH private key from "[REDACTED]": ssh: cannot decode encrypted private keys, failed to parse SSH private key from "[REDACTED]": ssh: no key found, failed to parse SSH private key from "[REDACTED]": ssh: no key found, failed to parse SSH private key from "[REDACTED]": ssh: no key found, failed to read "[REDACTED]: no such device or address]
FATAL waiting for Kubernetes API: context deadline exceeded

What you expected to happen?

The installer should have been able to gather the logs using my already running SSH agent. My private keys are not stored on disk. If the SSH connection could not be established, the installer should have printed the commands so I could try them manually.

How to reproduce it (as minimally and precisely as possible)?

Start with a machine with no SSH private keys available in ~/.ssh.

$ openshift-install create cluster

Anything else we need to know?

I think the support for gathering diagnostics via SSH introduced in #1822 could be a bit more robust in some cases. A few things:

  • It looks like defaultPrivateSSHKeys() tries to load every file in ${HOME}/.ssh and parse each one as an SSH private key. Maybe I'm reading it wrong, but it looks like if any file fails to parse (e.g. a config or known_hosts file), LoadPrivateSSHKeys() returns a non-nil error, which causes newAgent() to fail.

  • Some users don't have private keys on disk. It's common for private keys to live on a smart-card and be accessed via an already running SSH agent. It would be nice if the installer detected SSH_AUTH_SOCK and forwarded to the existing agent.

  • If all else fails, I think the installer should print the commands like it used to. That way users can try for themselves with their existing configs.

References

#1822
