Description
Describe the bug
The documentation for the NVIDIA GPU Operator on Red Hat OpenShift Container Platform includes a chapter titled "Installing the NVIDIA GPU Operator on OpenShift." Within this chapter, the section "Create the cluster policy using the CLI" contains two notable issues:
1. Version reference issue
The first command in this section references a specific version of the NVIDIA GPU Operator. However, it does not reflect the latest available version. At the time of writing, version 25.3.1 was current, but the documentation still refers to 22.9.0:
$ oc get csv -n nvidia-gpu-operator gpu-operator-certified.v22.9.0 -ojsonpath={.metadata.annotations.alm-examples} | jq .[0] > clusterpolicy.json
Ideally, the documentation should either:
- Be updated to reflect the latest version, or
- Use a variable to dynamically reference the current version.
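In addition, when using zsh, the jq filter .[0] must be put in quotes, because zsh otherwise treats the square brackets as a glob pattern and aborts with "no matches found".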
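One possible way to avoid hard-coding the version is sketched below. It assumes the CSV name always starts with gpu-operator-certified. and that only one such CSV is installed in the namespace. The function names find_gpu_operator_csv and export_cluster_policy and the NAMESPACE variable are made up for this illustration and do not come from the NVIDIA documentation.

```shell
#!/usr/bin/env bash
# Sketch: look up the installed GPU Operator CSV by name prefix
# instead of hard-coding a version such as v22.9.0.
# Assumption: the operator runs in the nvidia-gpu-operator namespace.
NAMESPACE="${NAMESPACE:-nvidia-gpu-operator}"

# Print the name of the first CSV whose name starts with
# "gpu-operator-certified." (hypothetical helper).
find_gpu_operator_csv() {
  oc get csv -n "$NAMESPACE" -o name \
    | sed 's|.*/||' \
    | grep '^gpu-operator-certified\.' \
    | head -n 1
}

# Write the first alm-examples entry of the given CSV to clusterpolicy.json
# (hypothetical helper). Both the jsonpath expression and the jq filter are
# quoted so that the command also works in zsh.
export_cluster_policy() {
  local csv="$1"
  oc get csv -n "$NAMESPACE" "$csv" \
    -o jsonpath='{.metadata.annotations.alm-examples}' \
    | jq '.[0]' > clusterpolicy.json
}
```

With a logged-in oc session, usage would look like: csv="$(find_gpu_operator_csv)" && export_cluster_policy "$csv"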
2. Incorrect ClusterPolicy example
The example ClusterPolicy returned by the cluster service version contains a structural error. Specifically, the key kernelModuleType is incorrectly placed under the upgradePolicy key. According to the ClusterPolicy custom resource definition, kernelModuleType should be nested under the driver key. This discrepancy can cause deployment issues or misinterpretation of the configuration.
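As a possible workaround until the example is fixed, the misplaced key could be moved with jq before applying the file. This is only a sketch: the paths .spec.driver.upgradePolicy and .spec.driver are assumptions derived from the description above, and the function name fix_kernel_module_type is made up for this illustration.

```shell
#!/usr/bin/env bash
# Sketch: move kernelModuleType from .spec.driver.upgradePolicy
# up to .spec.driver, leaving the file unchanged if the key is not misplaced.
# Assumption: upgradePolicy is nested under .spec.driver in the exported JSON.
fix_kernel_module_type() {
  jq 'if (.spec.driver.upgradePolicy // {}) | has("kernelModuleType")
      then .spec.driver.kernelModuleType = .spec.driver.upgradePolicy.kernelModuleType
         | del(.spec.driver.upgradePolicy.kernelModuleType)
      else .
      end' "$1"
}
```

For example, fix_kernel_module_type clusterpolicy.json > clusterpolicy-fixed.json writes a corrected copy and keeps the original for comparison.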
To Reproduce
Run the command shown above and inspect the resulting clusterpolicy.json.
Expected behavior
The example ClusterPolicy returned by the cluster service version contains no errors.
Environment (please provide the following information):
- GPU Operator Version: 25.3.1