Commit 61de916

lluunn authored and k8s-ci-robot committed
Fix tfserving doc (#1025)
* fix tfserving doc * fix * fix
1 parent 0823f78 commit 61de916

File tree

1 file changed: +273 −73 lines changed


content/docs/components/serving/tfserving_new.md

## Serving a model

To deploy a model we create the following resources, as illustrated below:

- A deployment to deploy the model using TF Serving
- A Kubernetes service to create an endpoint for the deployment
- An Istio virtual service to route traffic to the model and expose it through the Istio gateway
- An Istio DestinationRule to split traffic between versions

```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: mnist
  name: mnist-service
  namespace: kubeflow
spec:
  ports:
  - name: grpc-tf-serving
    port: 9000
    targetPort: 9000
  - name: http-tf-serving
    port: 8500
    targetPort: 8500
  selector:
    app: mnist
  type: ClusterIP
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: mnist
  name: mnist-v1
  namespace: kubeflow
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "true"
      labels:
        app: mnist
        version: v1
    spec:
      containers:
      - args:
        - --port=9000
        - --rest_api_port=8500
        - --model_name=mnist
        - --model_base_path=YOUR_MODEL
        command:
        - /usr/bin/tensorflow_model_server
        image: tensorflow/serving:1.11.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          initialDelaySeconds: 30
          periodSeconds: 30
          tcpSocket:
            port: 9000
        name: mnist
        ports:
        - containerPort: 9000
        - containerPort: 8500
        resources:
          limits:
            cpu: "4"
            memory: 4Gi
          requests:
            cpu: "1"
            memory: 1Gi
        volumeMounts:
        - mountPath: /var/config/
          name: config-volume
      volumes:
      - configMap:
          name: mnist-v1-config
        name: config-volume
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
  name: mnist-service
  namespace: kubeflow
spec:
  host: mnist-service
  subsets:
  - labels:
      version: v1
    name: v1
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  labels:
  name: mnist-service
  namespace: kubeflow
spec:
  gateways:
  - kubeflow-gateway
  hosts:
  - '*'
  http:
  - match:
    - method:
        exact: POST
      uri:
        prefix: /tfserving/models/mnist
    rewrite:
      uri: /v1/models/mnist:predict
    route:
    - destination:
        host: mnist-service
        port:
          number: 8500
        subset: v1
      weight: 100
```

Referring to the above example, you can customize your deployment by changing the following configurations in the YAML file:

- In the Deployment resource, the `model_base_path` argument points to the model.
  Change the value to your own model.

- The example contains three configurations for Google Cloud Storage (GCS) access:
  volumes (secret `user-gcp-sa`), volumeMounts, and
  env (GOOGLE_APPLICATION_CREDENTIALS).
  If your model is not at GCS (e.g. using S3 from AWS), see the section below on
  how to set up access.

- GPU. If you want to use a GPU, add `nvidia.com/gpu: 1`
  to the container resources, and use a GPU image, for example
  `tensorflow/serving:1.11.1-gpu`:

  ```yaml
  resources:
    limits:
      cpu: "4"
      memory: 4Gi
      nvidia.com/gpu: 1
  ```

- The `VirtualService` and `DestinationRule` resources are for routing.
  With the example above, the model is accessible at `HOSTNAME/tfserving/models/mnist`
  (HOSTNAME is your Kubeflow deployment hostname). To change the path, edit the
  `http.match.uri` of the VirtualService.
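Once customized, the manifests above can be applied with `kubectl`. A minimal sketch, assuming the resources are saved in a hypothetical file named `mnist-serving.yaml`:

```shell
# Hypothetical filename containing the Service, Deployment,
# DestinationRule, and VirtualService shown above.
MANIFEST=mnist-serving.yaml

# Create (or update) the resources.
kubectl apply -f "${MANIFEST}"

# Verify that the deployment and service exist in the kubeflow namespace.
kubectl -n kubeflow get deploy,svc -l app=mnist
```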

### Pointing to the model

Depending on where the model file is located, set the correct parameters.

*Google cloud*

Change the deployment spec as follows:

```yaml
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "true"
      labels:
        app: mnist
        version: v1
    spec:
      containers:
      - args:
        - --port=9000
        - --rest_api_port=8500
        - --model_name=mnist
        - --model_base_path=gs://kubeflow-examples-data/mnist
        command:
        - /usr/bin/tensorflow_model_server
        env:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /secret/gcp-credentials/user-gcp-sa.json
        image: tensorflow/serving:1.11.1-gpu
        imagePullPolicy: IfNotPresent
        livenessProbe:
          initialDelaySeconds: 30
          periodSeconds: 30
          tcpSocket:
            port: 9000
        name: mnist
        ports:
        - containerPort: 9000
        - containerPort: 8500
        resources:
          limits:
            cpu: "4"
            memory: 4Gi
            nvidia.com/gpu: 1
          requests:
            cpu: "1"
            memory: 1Gi
        volumeMounts:
        - mountPath: /var/config/
          name: config-volume
        - mountPath: /secret/gcp-credentials
          name: gcp-credentials
      volumes:
      - configMap:
          name: mnist-v1-config
        name: config-volume
      - name: gcp-credentials
        secret:
          secretName: user-gcp-sa
```

The changes are:

- environment variable `GOOGLE_APPLICATION_CREDENTIALS`
- volume `gcp-credentials`
- volumeMount `gcp-credentials`

We need a service account that can access the model.
If you are using Kubeflow's click-to-deploy app, there should already be a secret, `user-gcp-sa`, in the cluster.
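You can check for that secret in the `kubeflow` namespace used above (a sketch, assuming `kubectl` access to the cluster):

```shell
# Namespace used by the manifests in this doc.
NAMESPACE=kubeflow

# Shows the user-gcp-sa secret if it is present in the cluster.
kubectl -n "${NAMESPACE}" get secret user-gcp-sa
```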
See [doc](https://cloud.google.com/docs/authentication/) for more detail.

*S3*

To use S3, first you need to create a secret that contains the access credentials. Use base64 to encode your credentials, and check the details in the Kubernetes guide to [creating a secret manually](https://kubernetes.io/docs/concepts/configuration/secret/#creating-a-secret-manually):

```yaml
apiVersion: v1
metadata:
  ...  # secret name, omitted in this excerpt
data:
  ...  # base64-encoded AWS credentials, omitted in this excerpt
kind: Secret
```
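For example, the values for the `data` field can be base64-encoded on the command line. The key below is the non-functional example credential from AWS documentation, used here only as a placeholder:

```shell
# Placeholder credential (AWS documentation example, not a real key).
AWS_ACCESS_KEY_ID='AKIAIOSFODNN7EXAMPLE'

# base64-encode it for use in the Secret's data field.
echo -n "${AWS_ACCESS_KEY_ID}" | base64
```

Decoding the output with `base64 -d` returns the original value, which is a quick way to check the encoding.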

Then use the following manifest as an example:

```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: s3
  name: s3
  namespace: kubeflow
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: null
      labels:
        app: s3
        version: v1
    spec:
      containers:
      - args:
        - --port=9000
        - --rest_api_port=8500
        - --model_name=s3
        - --model_base_path=s3://abc
        - --monitoring_config_file=/var/config/monitoring_config.txt
        command:
        - /usr/bin/tensorflow_model_server
        env:
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              key: AWS_ACCESS_KEY_ID
              name: secretname
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              key: AWS_SECRET_ACCESS_KEY
              name: secretname
        - name: AWS_REGION
          value: us-west-1
        - name: S3_USE_HTTPS
          value: "true"
        - name: S3_VERIFY_SSL
          value: "true"
        - name: S3_ENDPOINT
          value: s3.us-west-1.amazonaws.com
        image: tensorflow/serving:1.11.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          initialDelaySeconds: 30
          periodSeconds: 30
          tcpSocket:
            port: 9000
        name: s3
        ports:
        - containerPort: 9000
        - containerPort: 8500
        resources:
          limits:
            cpu: "4"
            memory: 4Gi
          requests:
            cpu: "1"
            memory: 1Gi
        volumeMounts:
        - mountPath: /var/config/
          name: config-volume
      volumes:
      - configMap:
          name: s3-config
        name: config-volume
```

### Sending prediction request directly

If the service type is LoadBalancer, it has its own accessible external IP address.
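A sketch of a direct prediction request against that external IP, using TF Serving's REST API on port 8500 (as configured above). The address and the `request.json` payload file are placeholders; the request body shape depends on your model's input signature:

```shell
# Placeholder address; substitute your service's external IP.
EXTERNAL_IP=203.0.113.10

# request.json should contain a TF Serving REST payload,
# e.g. {"instances": [...]} matching the model's input signature.
curl -s --connect-timeout 5 -X POST \
  "http://${EXTERNAL_IP}:8500/v1/models/mnist:predict" \
  -d @request.json
```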
