
Update Pipeline SDK overview page #2167

Merged 20 commits on Sep 14, 2020
Changes from all commits
51 changes: 24 additions & 27 deletions content/en/docs/pipelines/sdk/sdk-overview.md
@@ -4,10 +4,6 @@ description = "Overview of using the SDK to build components and pipelines"
weight = 10

+++

{{% beta-status
feedbacklink="https://github.com/kubeflow/pipelines/issues" %}}
@@ -24,26 +20,15 @@ workflow and how the components interact with each other.
The Kubeflow Pipelines SDK includes the following packages:

* [`kfp.compiler`](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.compiler.html)
includes classes and methods for compiling your pipeline's Python DSL into a
workflow YAML spec. Methods in this package include, but are not limited
to, the following:

* `kfp.compiler.Compiler.compile` compiles your Python DSL code into a single
static configuration (in YAML format) that the Kubeflow Pipelines service
can process. The Kubeflow Pipelines service converts the static
configuration into a set of Kubernetes resources for execution.


* [`kfp.components`](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.components.html)
includes classes and methods for interacting with pipeline components.
Methods in this package include, but are not limited to, the following:
@@ -75,8 +60,6 @@ The Kubeflow Pipelines SDK includes the following packages:
Methods, classes, and modules in this package include, but are not limited to,
the following:

* `kfp.dsl.PipelineParam` represents a pipeline parameter that you can pass
from one pipeline component to another. See the guide to
[pipeline parameters](/docs/pipelines/sdk/parameters/).
@@ -107,6 +90,18 @@ The Kubeflow Pipelines SDK includes the following packages:
represents a volume used to pass data between pipeline steps. `ContainerOp`s
can mount a `PipelineVolume` either via the constructor's `pvolumes` argument
or the `add_pvolumes()` method.
* [`kfp.dsl.ParallelFor`](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html#kfp.dsl.ParallelFor)
represents a parallel for loop over a static or dynamic set of items in a pipeline.
Each iteration of the for loop is executed in parallel.

* [`kfp.dsl.ExitHandler`](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html#kfp.dsl.ExitHandler)
represents an exit handler that is invoked upon exiting a pipeline. A typical
usage of `ExitHandler` is garbage collection.

* [`kfp.dsl.Condition`](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html#kfp.dsl.Condition)
represents a group of ops that are executed only when a certain condition is met.
The condition must be determined at runtime, by incorporating at least one task
output or `PipelineParam` in the boolean expression.

* [`kfp.Client`](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.client.html)
contains the Python client libraries for the [Kubeflow Pipelines
@@ -117,23 +112,25 @@ The Kubeflow Pipelines SDK includes the following packages:
[experiment](/docs/pipelines/concepts/experiment/) and returns an
experiment object.
* `kfp.Client.run_pipeline` runs a pipeline and returns a run object.
* `kfp.Client.create_run_from_pipeline_func` compiles a pipeline function and submits it
for execution on Kubeflow Pipelines.
* `kfp.Client.create_run_from_pipeline_package` runs a local pipeline package on Kubeflow Pipelines.
* `kfp.Client.upload_pipeline` uploads a local file to create a new pipeline in Kubeflow Pipelines.
* `kfp.Client.upload_pipeline_version` uploads a local file to create a pipeline version. [Follow an example to learn more about creating a pipeline version](/docs/pipelines/tutorials/sdk-examples)

* [Kubeflow Pipelines extension modules](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.extensions.html)
include classes and functions for specific platforms on which you can use
Kubeflow Pipelines. Examples include utility functions for on-premises
environments, Google Cloud Platform (GCP), Amazon Web Services (AWS), and Microsoft Azure.

* [Kubeflow Pipelines diagnose_me modules](https://github.com/kubeflow/pipelines/tree/master/sdk/python/kfp/cli/diagnose_me) include classes and functions that help with environment diagnostic tasks.

* `kfp.cli.diagnose_me.dev_env` reports on diagnostic metadata from your development environment, such as your Python library versions.
* `kfp.cli.diagnose_me.kubernetes_cluster` reports on diagnostic data from your Kubernetes cluster, such as Kubernetes secrets.
* `kfp.cli.diagnose_me.gcp` reports on diagnostic data related to your GCP environment.

## Kubeflow Pipelines CLI tool
The Kubeflow Pipelines CLI tool enables you to use a subset of the Kubeflow Pipelines SDK directly from the command line. It provides the following commands:

* `kfp diagnose_me` runs environment diagnostics with the specified parameters.
* `--json` - Indicates that this command must return its results as JSON. Otherwise, results are returned in a human-readable format.
@@ -150,7 +147,7 @@ The KFP CLI tool enables you to use a subset of the Kubeflow Pipelines SDK direc
* `list` - Lists recent pipeline runs.
* `submit` - Submits a pipeline run.

* `kfp --endpoint <ENDPOINT>` - Specifies the endpoint that the Kubeflow Pipelines CLI should connect to.

## Installing the SDK
