Extend startup notification timeout as jobs continue to be loaded #220
base: master
Conversation
Nice! Do we have any practical way of testing this even manually?
K8s pods may also have a readiness timeout. (CBCI currently uses /login for this purpose.) I do not think there is any specific way for the container to signal that it is making progress but still needs more time. In principle you could handle that logic in the endpoint, by having it return 200 if either Jenkins has started (HudsonIsLoading gone) or onExtendTimeout in a custom Lifecycle has been called, but returning 500 if some timeout has elapsed. That would be awkward, since the custom endpoint would then need to know what the configured readiness timeout was (in case it is overridable), which Jenkins would need to read from a file projected via downwardAPI. Perhaps the effort would be better spent profiling & optimizing Item loading (deferring work to lazy initializers or background threads).
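For concreteness, a rough sketch of the endpoint logic described above might look like the following. This is not code from this PR or from Jenkins core: the class name, URL, projected file path, and the static field that a custom Lifecycle would update from onExtendTimeout are all assumptions. It also glosses over the fact that, while HudsonIsLoading is installed, an ordinary root action may not be reachable at all, which is part of why this would be awkward.

```java
import hudson.Extension;
import hudson.init.InitMilestone;
import hudson.model.UnprotectedRootAction;
import jenkins.model.Jenkins;

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.concurrent.atomic.AtomicLong;

import javax.servlet.http.HttpServletResponse;

import org.kohsuke.stapler.StaplerRequest;
import org.kohsuke.stapler.StaplerResponse;

@Extension
public class StartupReadinessAction implements UnprotectedRootAction {

    /** Hypothetical field a custom Lifecycle would bump whenever onExtendTimeout is called. */
    public static final AtomicLong lastExtendTimeout = new AtomicLong(System.currentTimeMillis());

    @Override public String getIconFileName() { return null; }
    @Override public String getDisplayName() { return null; }
    @Override public String getUrlName() { return "startup-readiness"; } // assumed URL

    public void doIndex(StaplerRequest req, StaplerResponse rsp) throws Exception {
        Jenkins j = Jenkins.getInstanceOrNull();
        boolean started = j != null && j.getInitLevel() == InitMilestone.COMPLETED;
        // The configured readiness timeout would have to be projected into the pod,
        // e.g. via downwardAPI; this path is purely an assumption.
        long readinessTimeoutMillis = Long.parseLong(
                Files.readString(Paths.get("/etc/podinfo/readiness-timeout-millis")).trim());
        boolean stillMakingProgress =
                System.currentTimeMillis() - lastExtendTimeout.get() < readinessTimeoutMillis;
        if (started || stillMakingProgress) {
            rsp.setStatus(HttpServletResponse.SC_OK);
        } else {
            rsp.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR, "startup appears stalled");
        }
    }
}
```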
if (jobEncountered.get() > jobEncounteredAtLastTick.get()) {
    Jenkins.get().getLifecycle().onExtendTimeout(TICK_INTERVAL, TimeUnit.MILLISECONDS);
}
jobEncounteredAtLastTick.set(jobEncountered.get());
There is probably some idiomatic way to do the compare and set atomically but I suppose we do not care much here since it is only a heuristic.
(Are these counters actually used from concurrent threads? Predates 869ef0c and I no longer recall!)
> There is probably some idiomatic way to do the compare and set atomically

If there is such an idiomatic way, I cannot think of it.

> Are these counters actually used from concurrent threads?

I am not familiar with this code, but reading it raised more questions than answers in my mind (i.e., it smelled wrong to me), so I decided not to think too hard about it.
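For what it is worth, one small way to tighten the check without changing its intent would be to swap the "last tick" value atomically with getAndSet, so the read and the update of that counter happen in one step. The following is only a sketch under assumed names and types (AtomicInteger counters inside a hypothetical wrapper class), not the actual code in this PR:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

import jenkins.model.Jenkins;

class JobLoadProgressTick { // hypothetical wrapper, not part of this PR

    private static final long TICK_INTERVAL = TimeUnit.SECONDS.toMillis(10); // assumed value

    private final AtomicInteger jobEncountered = new AtomicInteger();
    private final AtomicInteger jobEncounteredAtLastTick = new AtomicInteger();

    void tick() {
        int current = jobEncountered.get();
        // getAndSet reads the previous "last tick" value and stores the new one in a
        // single atomic step, so two concurrent ticks cannot both observe the same old value.
        int previous = jobEncounteredAtLastTick.getAndSet(current);
        if (current > previous) {
            Jenkins.get().getLifecycle().onExtendTimeout(TICK_INTERVAL, TimeUnit.MILLISECONDS);
        }
    }
}
```

Reads of jobEncountered itself can still race, but as noted above the whole thing is only a heuristic, so that is probably fine.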
In a previous life, I operated some Jenkins controllers with thousands of jobs backed by slow NFS storage, and some of them took over half an hour to load jobs, but I no longer have access to that environment.
I did actually spend some time researching how Kubernetes deals with readiness before designing the core API. I came to the same conclusion as you: a dedicated readiness API might be nice to have, but without a backchannel to the service manager (akin to the notification socket provided by systemd) …
Is there a specific reason for this to be closed? From what little I recall, this was useful and should still be considered.
I waited for over a month for this PR to be merged. Neither did I receive negative feedback, nor was the PR merged. I have moved on to newer priorities. I would encourage anyone who is interested in driving this forward to pick up where I left off.
Then I will take the liberty of reopening. There are plenty of valid PRs which are much older than this, and a shortage of maintainers. Closing PRs means that the information that a particular change was desirable in principle gets lost, perhaps forever.
I have unsubscribed from notifications to this PR, as I have other priorities which demand my focus.
A new Job property in https://github.com/jenkinsci/make-jenkins-slow-plugin/tree/master/src/main/java/io/jenkins/plugins/make_jenkins_slow, then a few jobs in various folders all with that property, I would guess.
Reflection-free alternative to #207. Untested PR to consume the new API introduced in jenkinsci/jenkins#6237 to keep pushing back the startup timeout as long as Jenkins is continuing to make progress in loading jobs. This should make it far less likely that people will hit the systemd service startup timeout when using systemd with Type=notify.
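To make the intended use of the new core API concrete, here is a minimal, self-contained sketch of a consumer that keeps asking the service manager for more time until startup completes. The scheduling, tick interval, and class name are assumptions; the PR itself ties each extension to job-loading progress (see the diff hunk quoted above) rather than extending unconditionally on every tick.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.TimeUnit;

import hudson.init.InitMilestone;
import jenkins.model.Jenkins;

public class StartupTimeoutExtender { // hypothetical class, not this PR's actual code

    // Assumed tick interval; the real value in the PR may differ.
    private static final long TICK_INTERVAL = TimeUnit.SECONDS.toMillis(15);

    public static void start() {
        Timer timer = new Timer("startup-timeout-extender", true); // daemon thread
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override public void run() {
                Jenkins j = Jenkins.getInstanceOrNull();
                if (j == null) {
                    return; // too early in startup; try again on the next tick
                }
                if (j.getInitLevel() == InitMilestone.COMPLETED) {
                    cancel(); // startup finished, stop extending the timeout
                    return;
                }
                // Forwarded by the Lifecycle implementation to the service manager,
                // e.g. over the systemd notification socket when using Type=notify.
                j.getLifecycle().onExtendTimeout(TICK_INTERVAL, TimeUnit.MILLISECONDS);
            }
        }, TICK_INTERVAL, TICK_INTERVAL);
    }
}
```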