fix: training-integration uat by updating image (#68)

orfeas-k · web-flow · commit ad0922d6911a · 2024-06-17T13:58:31.000+03:00
* Update PyTorchJob image according to upstream E2E tests.
* Update registry from which PaddleJob image is pulled to follow upstream E2E tests.

Closes canonical/bundle-kubeflow#894, canonical/bundle-kubeflow#910
diff --git a/tests/notebooks/training/training-integration.ipynb b/tests/notebooks/training/training-integration.ipynb
@@ -6,9 +6,15 @@
    "source": [
     "# Test Training Operator Integration\n",
     "\n",
-    "This example notebook is loosely based on [this](https://github.com/kubeflow/training-operator/blob/master/sdk/python/examples/kubeflow-tfjob-sdk.ipynb) upstream example.\n",
+    "This example notebook is loosely based on the following upstream examples:\n",
+    "* [TFJob](https://github.com/kubeflow/training-operator/blob/964a6e836eedff11edfe79cc9f4e5b7c623cbe88/examples/tensorflow/image-classification/create-tfjob.ipynb)\n",
+    "* [PyTorchJob](https://github.com/kubeflow/training-operator/blob/964a6e836eedff11edfe79cc9f4e5b7c623cbe88/examples/pytorch/image-classification/create-pytorchjob.ipynb)\n",
+    "* [PaddleJob](https://github.com/kubeflow/training-operator/blob/964a6e836eedff11edfe79cc9f4e5b7c623cbe88/examples/paddlepaddle/simple-cpu.yaml)\n",
     "\n",
-    "- create training job of type: TFJob, PyTorchJob, and PaddleJob\n",
+    "Note that the above can get out of sync with the actual testing upstream does, so make sure to also check out [upstream E2E tests](https://github.com/kubeflow/training-operator/tree/964a6e836eedff11edfe79cc9f4e5b7c623cbe88/sdk/python/test/e2e) for updating the notebook.\n",
+    "\n",
+    "The workflow for each job (TFJob, PyTorchJob, and PaddleJob) is:\n",
+    "- create training job\n",
     "- monitor its execution\n",
     "- get training logs\n",
     "- delete job"
@@ -142,7 +148,7 @@
    "source": [
     "### Define a TFJob\n",
     "\n",
-    "Define a TFJob object before deploying it. This TFJob is similar to [this](https://github.com/kubeflow/training-operator/blob/master/sdk/python/examples/kubeflow-tfjob-sdk.ipynb) example."
+    "Define a TFJob object before deploying it."
    ]
   },
   {
@@ -411,7 +417,8 @@
    "source": [
     "PYTORCHJOB_NAME = \"pytorch-dist-mnist-gloo\"\n",
     "PYTORCHJOB_CONTAINER = \"pytorch\"\n",
-    "PYTORCHJOB_IMAGE = \"gcr.io/kubeflow-ci/pytorch-dist-mnist-test:v1.0\""
+    "PYTORCHJOB_IMAGE = \"kubeflow/pytorch-dist-mnist:v1-3a360ba\"\n",
+    "# The image above should be updated with each release with the latest available in the registry."
    ]
   },
   {
@@ -633,7 +640,7 @@
    "source": [
     "### Define a PaddleJob\n",
     "\n",
-    "Define a PaddleJob object before deploying it. This PaddleJob is loosely based on [this](https://github.com/kubeflow/training-operator/blob/11b7a115e6538caeab405344af98f0d5b42a4c96/examples/paddlepaddle/simple-cpu.yaml) example."
+    "Define a PaddleJob object before deploying it."
    ]
   },
   {
@@ -644,7 +651,7 @@
    "source": [
     "PADDLEJOB_NAME = \"paddle-simple-cpu\"\n",
     "PADDLEJOB_CONTAINER = \"paddle\"\n",
-    "PADDLEJOB_IMAGE = \"registry.baidubce.com/paddlepaddle/paddle:2.4.0rc0-cpu\""
+    "PADDLEJOB_IMAGE = \"docker.io/paddlepaddle/paddle:2.4.0rc0-cpu\""
    ]
   },
   {