Description
We noticed a scenario with the Knative progressive serving rollout that we would like input on.
We have a cluster with 2 GPUs.
Initially, we apply an InferenceService (isvc) with replicas set to 1:
NAME READY STATUS RESTARTS AGE
predictor-00001-xyz 2/2 Running 0
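Later, we update the isvc to set replicas to 2.
With the progressive rollout, it got stuck at this point: traffic is being moved to the new revision, but one of the new revision's pods stays Pending, presumably because the old revision's pod still holds one of the two GPUs. The result is a stalemate: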
NAME READY STATUS RESTARTS AGE
predictor-00001-xyz 2/2 Running 0
predictor-00002-abc 0/2 Pending 0
predictor-00002-xdqq 2/2 Running 0
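
To illustrate why we think the rollout stalls, here is a minimal sketch of the GPU accounting. It assumes each predictor pod requests exactly 1 GPU (our reading of the pod list above, not something Knative reports), and `schedule` is a hypothetical helper, not a Knative or Kubernetes API.

```python
def schedule(gpu_capacity, running_old, desired_new, gpus_per_pod=1):
    """Return (running_new, pending_new) while the old pods keep their GPUs.

    Hypothetical helper for illustration only; assumes every pod
    requests `gpus_per_pod` GPUs and nothing else uses the GPUs.
    """
    # GPUs left over after the old revision's pods are accounted for.
    free = gpu_capacity - running_old * gpus_per_pod
    # Only as many new pods can start as fit into the free GPUs.
    running_new = min(desired_new, max(free, 0) // gpus_per_pod)
    pending_new = desired_new - running_new
    return running_new, pending_new


# Our case: 2 GPUs, 1 old pod still running, new revision wants 2 pods.
running_new, pending_new = schedule(gpu_capacity=2, running_old=1, desired_new=2)
print(running_new, pending_new)  # 1 1
```

Under these assumptions, one new pod runs and one stays Pending until the old pod is removed, which matches what we observe.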
The Knative progressive serving rollout design doc mentions this:
"To keep the total number of replicas for both the old and new revisions remain the same, we can reduce the number for the old revisions before increasing the number for the new revisions. However, this is against the principle of the Knative service to serve the workload with demand."
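Is there any way to get this behavior in the scenario above, i.e. to scale down the old revision before scaling up the new one, so that the rollout can complete within the 2 available GPUs?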
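
To make the question concrete, the ordering we have in mind could be sketched roughly as below. This is only an illustration under our own assumptions (1 GPU per pod, a fixed GPU capacity, and a hypothetical `rollout_steps` function); it is not how Knative implements the rollout. It adds a new pod whenever a GPU is free and otherwise removes an old pod first.

```python
def rollout_steps(gpu_capacity, old_replicas, new_replicas, gpus_per_pod=1):
    """Return the sequence of (old, new) replica counts during a rollout.

    Hypothetical sketch: grow the new revision only when a GPU is free,
    otherwise shrink the old revision first to release one.
    """
    old, new = old_replicas, 0
    steps = [(old, new)]
    while new < new_replicas or old > 0:
        free = gpu_capacity - (old + new) * gpus_per_pod
        if new < new_replicas and free >= gpus_per_pod:
            new += 1   # a GPU is free: start one more new pod
        elif old > 0:
            old -= 1   # no free GPU (or new is complete): retire an old pod
        else:
            raise RuntimeError("new revision does not fit in the cluster")
        steps.append((old, new))
    return steps


# Our case: 2 GPUs, old revision has 1 replica, new revision wants 2.
for old, new in rollout_steps(gpu_capacity=2, old_replicas=1, new_replicas=2):
    print(f"old={old} new={new}")
```

With these inputs the sketch goes through (1, 0), (1, 1), (0, 1), (0, 2), so the total never exceeds the 2 GPUs. The trade-off is the one the design doc points out: there is a step where serving capacity is not increased ahead of demand.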