Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

/webapi/ping calls upstream dependencies on every request #44121

Closed
taraspos opened this issue Jul 12, 2024 · 1 comment · Fixed by #53742
Closed

/webapi/ping calls upstream dependencies on every request #44121

taraspos opened this issue Jul 12, 2024 · 1 comment · Fixed by #53742
Assignees
Labels
bug c-cc Internal Customer Reference

Comments

@taraspos
Copy link
Contributor

Expected behavior

Calls to /webapi/ping endpoint do not depend on successful calls to third party dependencies

Current behavior

Successful response from /webapi/ping endpoint depends on successful response from configured SAML Provider.

As result, any issues or slowness on the SAML Provider's side are causing failures or long response time on /webapi/ping as well.

This is happening because ValidateSAMLConnector is called, which uses http.Get(sc.GetEntityDescriptorURL()). Because default default http client is used, it also has no client timeout set on the request, so it will wait for the response indefinitely.

// ValidateSAMLConnector validates the SAMLConnector and sets default values.
// If a remote to fetch roles is specified, roles will be validated to exist.
func ValidateSAMLConnector(sc types.SAMLConnector, rg RoleGetter) error {
if err := CheckAndSetDefaults(sc); err != nil {
return trace.Wrap(err)
}
if sc.GetEntityDescriptorURL() != "" {
resp, err := http.Get(sc.GetEntityDescriptorURL())
if err != nil {
return trace.WrapWithMessage(err, "unable to fetch entity descriptor from %v for SAML connector %v", sc.GetEntityDescriptorURL(), sc.GetName())
}
defer resp.Body.Close()

Bug details

  • Teleport version - 15.4.7

  • Recreation steps

    curl https://<teleport_hostname>:443/webapi/ping

    We noticed this issue, because our external healthcheck (which is configured test against /webapi/ping) was randomly failing with timeout from time to time. Debugging led to the conclusion that requests hang because of slow responses from the SAML provider.

  • Debug logs

    • Every request to /webapi/ping endpoint produces log entry like

      {
        "caller": "services/saml.go:66",
        "component": null,
        "level": "debug",
        "message": "[SAML] Successfully fetched entity descriptor from <redacted> for connector <connector_name>",
        "timestamp": "2024-07-11T11:08:37Z"
      }

      log.Debugf("[SAML] Successfully fetched entity descriptor from %v for connector %v", sc.GetEntityDescriptorURL(), sc.GetName())

@taraspos taraspos added the bug label Jul 12, 2024
@espadolini
Copy link
Contributor

espadolini commented Jul 12, 2024

I reckon that the simplest way to fix this would be to add a parameter to the proxy /webapi/ping endpoint that instructs the handler to only return info that comes from the auth server Ping call rather than the additional auth info - it'll still give us a confirmation that proxy-auth comms are up and working, but with much less variability or dependency on cluster configuration.

@nklaassen nklaassen self-assigned this Apr 5, 2025
nklaassen added a commit that referenced this issue Apr 5, 2025
`/webapi/ping` and `/web/config.js` will currently try to fetch SAML
metadata from an entity descriptor URL if one is configured. This can
break these endpoints if the IdP goes down or has network issues. To
make matters worse, we use the default HTTP client with no timeout on
the request.

This PR updates the ping and config.js to avoid fetching SAML metadata
over the network and adds a timeout to the request when we do fetch the
metadata.

Closes #44121
nklaassen added a commit that referenced this issue Apr 5, 2025
`/webapi/ping` and `/web/config.js` will currently try to fetch SAML
metadata from an entity descriptor URL if one is configured. This can
break these endpoints if the IdP goes down or has network issues. To
make matters worse, we use the default HTTP client with no timeout on
the request.

This PR updates the ping and config.js to avoid fetching SAML metadata
over the network and adds a timeout to the request when we do fetch the
metadata.

Closes #44121
github-merge-queue bot pushed a commit that referenced this issue Apr 9, 2025
`/webapi/ping` and `/web/config.js` will currently try to fetch SAML
metadata from an entity descriptor URL if one is configured. This can
break these endpoints if the IdP goes down or has network issues. To
make matters worse, we use the default HTTP client with no timeout on
the request.

This PR updates the ping and config.js to avoid fetching SAML metadata
over the network and adds a timeout to the request when we do fetch the
metadata.

Closes #44121
@milos-teleport milos-teleport added the c-cc Internal Customer Reference label Apr 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug c-cc Internal Customer Reference
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants