-
Notifications
You must be signed in to change notification settings - Fork 770
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adopting coschduling plugin #1724
Conversation
Pull Request Test Coverage Report for Build 3920653750
💛 - Coveralls |
4ae57d6
to
f9fa8ac
Compare
e5d38bc
to
2ae0b6a
Compare
/hold for kubeflow/common#203 |
This PR is ready for review. PTAL. |
57d1b55
to
f760c7e
Compare
@zw0610 @johnugeorge @terrytangyuan Please take a look. I would like to include PodGroup with the scheduler-plugin feature in this release. |
@tenzen-y What is the best way to add e2e test for this? Also, can you update docs in a separate PR? (In kubeflow/website?) |
Maybe, we could run e2e test by updating integration-tests.yaml like the following: ...
jobs:
integration-test:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
kubernetes-version: ["v1.23.12", "v1.24.6", "v1.25.2"]
+ gang-scheduler-name: ["none", "scheduler-plugins"]
steps:
- name: Checkout
...
- name: Deploy training operator
run: |
./scripts/gha/setup-training-operator.sh
env:
KIND_CLUSTER: training-operator-cluster
TRAINING_CI_IMAGE: kubeflowtraining/training-operator:test
+ GANG_SCHEDULER_NAME: ${{ matrix.gang-scheduler-name }}
...
Yes, sure. |
@tenzen-y You can use this release https://github.com/kubeflow/common/releases/tag/v0.4.6 Note: You need to take changes for HPA fixes from #1732 as the release contains HPA selector field change as well. |
@johnugeorge Thanks for letting me know. |
a4b8b31
to
a2a8dc6
Compare
@johnugeorge Completed. |
Do you plan to add CI tests in this PR or separate one ? |
@johnugeorge I will work on E2E on another PR. |
a2a8dc6
to
8f52a72
Compare
Signed-off-by: Yuki Iwai <[email protected]> Co-authored-by: Wang Zhang <[email protected]> Signed-off-by: Yuki Iwai <[email protected]>
Signed-off-by: Yuki Iwai <[email protected]>
Signed-off-by: Yuki Iwai <[email protected]>
…1beta1 PodGroup Signed-off-by: Yuki Iwai <[email protected]>
8f52a72
to
e379a7f
Compare
Signed-off-by: Yuki Iwai <[email protected]>
e379a7f
to
cc3359e
Compare
@johnugeorge Rebased. |
Great work @tenzen-y Thank you for adding this feature /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: johnugeorge, tenzen-y The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@johnugeorge I'll create a PR to add E2E test for scheduler-plugins by the end of Monday. |
/hold cancel |
What this PR does / why we need it:
Adopts updated kubeflow/common with PodGroupControl designed for more generic gang-schedulers.
Follow up on #1526.
Reviewers can check the coscheduling feature with the scheduler-plugins in the following steps:
Also, this PR includes HPA changes and upgrading PriorityClass apiVersion.
Which issue(s) this PR fixes (optional, in
Fixes #<issue number>, #<issue number>, ...
format, will close the issue(s) when PR gets merged):Fixes #1722 #1725
Checklist: