manager: fix task scheduler infinite loop #3200
Open
+318
−1
- What I did
- How I did it
If the running tasks for a service are not well balanced across the placement-preference tree, the task scheduler could enter an infinite loop when scaling the service up. The scheduleNTasksOnSubtree loop terminates when either all tasks have been scheduled onto nodes, or the nodes in all subtrees are out of room to accept new tasks. The trouble is that the algorithm only considers a subtree to be out of room if an attempt was made to schedule tasks onto its nodes and not all of those tasks could be scheduled. Subtrees that already have more tasks running than the desired number for a balanced tree are skipped without any attempt to assign tasks, so they never get the chance to be marked as out of room. The scheduler therefore enters a tight infinite loop when there exists a node of the placement-preferences tree in which at least one subtree has more tasks running than desired, and all other subtrees are out of room for more tasks.
It would be incorrect to consider a subtree as out of room just because there are more tasks running than desired at a particular iteration of the scheduling loop. The desired number of tasks to assign changes as the scheduler iteratively schedules tasks and other subtrees run out of room, so it is possible for a subtree to become eligible in a future iteration.
Add a third condition to the task scheduler loop: the loop also exits when no subtrees are eligible for task scheduling, whether because they are out of room or because they have more tasks running than desired.
- How to test it
With a new regression test. TestMultiplePreferencesScaleUp times out without the scheduler change, but passes with it.
- Description for the changelog
Fix an issue where all new tasks in the Swarm could get stuck in the PENDING state forever after scaling up a service with placement preferences.