Skip to content
This repository was archived by the owner on Sep 12, 2023. It is now read-only.
This repository was archived by the owner on Sep 12, 2023. It is now read-only.

PR 135 makes replicaType updates hard #154

Open
@Jeffwan

Description

@Jeffwan

In this PR, https://github.com/kubeflow/common/pull/135/files, it changes rtype to ReplicaType.

However, it brings some challenges in operator upgrade.

-		expectationPodsKey := expectation.GenExpectationPodsKey(jobKey, rtype)
+		expectationPodsKey := expectation.GenExpectationPodsKey(jobKey, apiv1.ReplicaType(rtype))

Using this line as an example, controller retrieve rtype from pod label. It's lower case at this moment.

https://github.com/kubeflow/tf-operator/blob/ff5aaf10bd8a665cbce72dd2cd1d2c0cd06f329e/pkg/apis/tensorflow/v1/types.go#L77

https://github.com/kubeflow/tf-operator/blob/ff5aaf10bd8a665cbce72dd2cd1d2c0cd06f329e/pkg/apis/pytorch/v1/pytorchjob_types.go#L62

apiv1.ReplicaType(rtype) is inaccurate and meaningless because we define Upper case role like Master in the past,
and this conversion give us a non-exist ReplicaType because it's lower case master.

we have some references like below.

https://github.com/kubeflow/tf-operator/blob/ff5aaf10bd8a665cbce72dd2cd1d2c0cd06f329e/pkg/controller.v1/pytorch/pytorch.go#L46

I feel like all the role should be lower case, we make sure we don't use ReplicaType for comparison. Instead, still use strings.toLower(string(rtype)) which means PR 135 still not helpful?

/cc @MartinForReal @gaocegege @kubeflow/wg-training-leads

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions