Description
When following this guide, https://www.kubeflow.org/docs/components/tfserving_new/, I am unable to serve a model using `ks param set ${MODEL_COMPONENT} numGpus 1`. Doing so results in the error `0/1 nodes are available: 1 Insufficient nvidia.com/gpu.`, which presumably means that the `nvidia.com/gpu` plugin has not been deployed. I am at a loss as to exactly how this should be done. Documentation on the Nvidia website is quite scant, and the link provided in the guide for a GPU example (https://github.com/kubeflow/examples/blob/master/object_detection/tf_serving_gpu.md) offers no explanation whatsoever.
As a side note, if I leave out `ks param set ${MODEL_COMPONENT} numGpus 1` (or set numGpus to 0), it also doesn't work, resulting in:
Error: failed to start container "testmodel": Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:402: container init caused \"setenv: invalid argument\"": unknown
EDIT
The solution to this is as follows:
- When creating the cluster, a nodeGroup with a GPU instance type (p3.2xlarge in my case) must be created. eksctl will then automatically create the instances using the "EKS Optimized with GPU" AMI, as described here: https://docs.aws.amazon.com/eks/latest/userguide/gpu-ami.html
For example:
    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig
    metadata:
      name: mycluster
      region: us-east-1
      version: '1.12'
    availabilityZones: ["us-east-1a", "us-east-1b"]
    nodeGroups:
      - name: cpu-nodegroup
        instanceType: m5.2xlarge
        desiredCapacity: 1
        minSize: 0
        maxSize: 2
        volumeSize: 30
      - name: gpu-nodegroup
        instanceType: p3.2xlarge
        desiredCapacity: 1
        minSize: 0
        maxSize: 10
        volumeSize: 50
        availabilityZones: ["us-east-1a"]
        iam:
          withAddonPolicies:
            autoScaler: true
        labels:
          'k8s.amazonaws.com/accelerator': 'nvidia-tesla-v100'
- Thereafter, the NVIDIA device plugin daemonset (which advertises the `nvidia.com/gpu` resource on GPU nodes) must be deployed, as follows:
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.12/nvidia-device-plugin.yml
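
To confirm that the daemonset has taken effect, a check along the following lines may help. This is only a minimal sketch and is not taken from the guide: the function names `list_node_gpus` and `has_allocatable_gpu` are hypothetical helpers, and it assumes `kubectl` is already configured for the cluster. A node without a working device plugin shows `<none>` in the GPU column, which is the situation that produces the `Insufficient nvidia.com/gpu` scheduling error.

```shell
#!/bin/sh
# Sketch: check whether any node advertises an allocatable nvidia.com/gpu resource.
# Assumption: kubectl is configured to talk to the cluster created above.

# Print one "NAME GPU" line per node; GPU is <none> when the resource is not advertised.
list_node_gpus() {
  kubectl get nodes --no-headers \
    -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}

# Read "NAME GPU" lines on stdin; succeed only if some node reports a positive GPU count.
has_allocatable_gpu() {
  awk '$2 ~ /^[0-9]+$/ && $2 > 0 { found = 1 } END { exit (found ? 0 : 1) }'
}

# Usage (needs access to the cluster):
#   if list_node_gpus | has_allocatable_gpu; then
#     echo "at least one node can schedule GPU pods"
#   else
#     echo "no node advertises nvidia.com/gpu yet"
#   fi
```

If the check fails shortly after applying the daemonset, it may simply be that the plugin pod on the GPU node has not finished starting.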
I think the guide really needs to describe these requirements.