This repository was archived by the owner on Apr 17, 2025. It is now read-only.

Don't throw away errors from ShouldPropagate #82

Closed
adrianludwin opened this issue Sep 21, 2021 · 9 comments
Assignees
Labels
good first issue: Denotes an issue ready for a new contributor, according to the "help wanted" guidelines.
help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@adrianludwin
Contributor

adrianludwin commented Sep 21, 2021

See comment. We're discarding errors but we shouldn't be; they should be propagated upwards like any other error, or at least logged as an error. (Update Sep 25: we should log them; see the discussion below.)

/cc @joe2far

@adrianludwin
Contributor Author

/good-first-issue

@k8s-ci-robot
Contributor

@adrianludwin:
This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-good-first-issue command.

In response to this:

/good-first-issue

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot added the "good first issue" and "help wanted" labels on Sep 21, 2021
@yashvardhan-kukreja

/assign

@yashvardhan-kukreja

Hi @adrianludwin 👋🏼
When an error occurs, should I just log it and let the loop keep iterating, or should I break out of the loop right away by returning the error, so that it propagates up the call stack?

I prefer the latter, but as I'm new I might be missing something, so I'd love to hear your suggestion :)

@adrianludwin
Contributor Author

Ok I had a look and I can see why we're not propagating these up. The scenario where you could get an error is:

  • Someone puts a bad selector on an object. This should never happen, but let's say there was a bug and one got in.
    • This selector is basically now ignored
  • Someone tries to validate another change that might affect this object (or one of its propagated copies). We call ShouldPropagate to see if there's a conflict, but instead we get the error due to the actions of the earlier editor.

In this case, we should not propagate the error back upwards, as it would effectively block everyone else without giving them any way to resolve the issue. This seems suboptimal.

Instead, I'd vote for logging the error, e.g. something like this:

log.Error(err, "Invalid selector")

That way, if the admin is monitoring errors, they'll get a flood of them whenever we might be making the wrong choice, which is exactly what we want. The log object isn't currently passed down to the places that call ShouldPropagate, but this should be easy to solve.
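
To make that concrete, here's a rough sketch of what a call site could look like once a logger is passed down. The package name, wrapper name and the ShouldPropagate signature shown here are assumptions for illustration, not the actual HNC code:

package objects // hypothetical package name

import (
	"github.com/go-logr/logr"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/labels"
)

// shouldPropagateOrLog is a hypothetical wrapper around ShouldPropagate: it
// logs the error instead of returning it and falls back to "don't propagate",
// so one object's bad selector doesn't block everyone else's changes.
func shouldPropagateOrLog(log logr.Logger, inst *unstructured.Unstructured, nsLabels labels.Set) bool {
	// Assumed signature for ShouldPropagate: (bool, error).
	ok, err := ShouldPropagate(inst, nsLabels)
	if err != nil {
		log.Error(err, "Invalid selector; treating object as not propagated")
		return false
	}
	return ok
}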

The one downside here is that it looks like the name of the object that has the bad selector isn't going to be identified. We could add it individually to each of the places where we log the error, but IMO it's better to add this information to the error itself when it's first generated, i.e. in GetSelector, GetTreeSelector and GetNoneSelector. Each of these functions (which are the only ones that can cause an error in ShouldPropagate) knows the object that's causing the error, so IMO that's the right place to report it.

To take the simplest example of GetNoneSelector, I'd change

		return false,
			fmt.Errorf("invalid %s value: %w",
				api.AnnotationNoneSelector,
				err,
			)

to something like this:

		return false, fmt.Errorf("invalid %s value on %s/%s (%s): %w",
			api.AnnotationNoneSelector, inst.GetNamespace(), inst.GetName(), inst.GroupVersionKind(), err)

This will give a string like invalid propagate.hnc.x-k8s.io/none value on foo-ns/bar (Secret): <lower level error>. In GetTreeSelector the error can be returned from many different places, so you'll probably want to precompute the "%s/%s (%s)" identifier string once and reuse it at each return, but that's pretty easy to do; see the sketch below.
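
For GetTreeSelector, that would be roughly the following; the annotation constant, the return value and the surrounding code are illustrative assumptions rather than the exact HNC source:

// Near the top of GetTreeSelector: build the "namespace/name (GVK)" fragment
// once so that every error return can reuse it.
objID := fmt.Sprintf("%s/%s (%s)", inst.GetNamespace(), inst.GetName(), inst.GroupVersionKind())

// ...then at each existing error return, something like:
return nil, fmt.Errorf("invalid %s value on %s: %w", api.AnnotationTreeSelector, objID, err)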

I wouldn't worry about testing this too much, since we're just talking about improving an error message and logging errors that were previously thrown away. If you like, you could create a bad selector on an object by disabling all the webhooks, making a bad change, re-enabling the webhooks, and then checking the logs, but it's not strictly necessary.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Dec 25, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Jan 24, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
