
Commit a1c7423

sujituk and giovanejr authored

NV NIM Blueprint: Digital human for customer services (#1019)
Digital human NIM blueprint with optional HTTPS LoadBalancer for the NIM endpoints.

Co-authored-by: Giovane Moura Jr <[email protected]>

1 parent f2d8ab2 commit a1c7423

File tree: 5 files changed (+1021 / -2 lines)

tutorials-and-examples/nvidia-nim/blueprints/README.md (+13 / -2)

Here you will find the NVIDIA NIM Blueprints that can be provisioned to run on GKE. These are good for proofs of concept.

1. [Digital Human for Customer Service](https://build.nvidia.com/nvidia/digital-humans-for-customer-service)
   - Audio2face-3D
   - Audio2face-2D
   - FastPitch-hifigan-tts
   - Llama3-8b-instruct
   - nv-embedqa-e5-v5
   - nv-rerankqa-mistral4b-v3
   - Parakeet-ctc-1.1b-asr

   You can follow the detailed steps [here](./digitalhuman/README.md).

2. [Generative Virtual Screening for Drug Discovery](https://build.nvidia.com/nvidia/generative-virtual-screening-for-drug-discovery) uses 3 NIMs.
   - AlphaFold2
   - MolMIM
   - DiffDock

   You can follow the detailed steps [here](./drugdiscovery/README.md).

tutorials-and-examples/nvidia-nim/blueprints/digitalhuman/README.md (new file, +356)

# Digital Human for Customer Service on GKE

Deploy the digital human blueprint, which is built from several NIMs, on GKE.

## Table of Contents

- [Digital Human for Customer Service on GKE](#digital-human-for-customer-service-on-gke)
  - [Table of Contents](#table-of-contents)
  - [Prerequisites](#prerequisites)
  - [Setup](#setup)
  - [Test](#test)
    - [nv-embedqa-e5-v5](#nv-embedqa-e5-v5)
    - [nv-rerankqa-mistral-4b-v3](#nv-rerankqa-mistral-4b-v3)
    - [llama3-8b-instruct](#llama3-8b-instruct)
    - [parakeet-ctc-1.1b-asr](#parakeet-ctc-11b-asr)
    - [fastpitch-hifigan-tts](#fastpitch-hifigan-tts)
    - [audio2face-2d](#audio2face-2d)
    - [audio2face-3d](#audio2face-3d)
  - [Tear down](#tear-down)

## Prerequisites

- **GCloud SDK:** Ensure you have the Google Cloud SDK installed and configured.
- **Project:** A Google Cloud project with billing enabled.
- **NGC API Key:** An API key from NVIDIA NGC. Read the prerequisites for obtaining this key [here](https://github.com/NVIDIA-AI-Blueprints/digital-human/blob/main/README.md#prerequisites).
- **kubectl:** The kubectl command-line tool installed and configured.
- **NVIDIA GPUs:** Any one of the following GPU configurations should work:
  - [NVIDIA L4 GPU (8)](https://cloud.google.com/compute/docs/gpus#l4-gpus)
  - [NVIDIA A100 80GB (1) GPU](https://cloud.google.com/compute/docs/gpus#a100-gpus)
  - [NVIDIA H100 80GB (1) GPU or higher](https://cloud.google.com/compute/docs/gpus#a3-series)

## Setup

1. **Environment setup**: You'll set up several environment variables to make the following steps easier and more flexible. These variables store important information like cluster names, machine types, and API keys. You need to update the variable values to match your needs and context.

   ```bash
   gcloud config set project "<GCP Project ID>"

   export CLUSTER_NAME="gke-nimbp-dighuman"
   export NP_NAME="gke-nimbp-dighuman-gpunp"

   export ZONE="us-west4-a"                 # e.g., us-west4-a
   export NP_CPU_MACHTYPE="e2-standard-2"   # e.g., e2-standard-2
   export NP_GPU_MACHTYPE="g2-standard-96"  # e.g., a2-ultragpu-1g

   export ACCELERATOR_TYPE="nvidia-l4"      # e.g., nvidia-a100-80gb
   export ACCELERATOR_COUNT="8"             # Or higher, as needed
   export NODE_POOL_NODES=1                 # Or higher, as needed

   export NGC_API_KEY="<NGC API Key>"
   ```

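   (Optional) Before creating the node pool in the next step, you can confirm that the accelerator type you chose is offered in your zone; a check along these lines should work:

   ```bash
   # List matching accelerator types in the chosen zone; an empty result means
   # you should pick a different ZONE or ACCELERATOR_TYPE.
   gcloud compute accelerator-types list \
     --filter="zone:${ZONE} AND name=${ACCELERATOR_TYPE}"
   ```
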
2. **GKE Cluster and Node pool creation**:

   ```bash
   gcloud container clusters create "${CLUSTER_NAME}" \
     --num-nodes="1" \
     --location="${ZONE}" \
     --machine-type="${NP_CPU_MACHTYPE}" \
     --addons=GcpFilestoreCsiDriver

   gcloud container node-pools create "${NP_NAME}" \
     --cluster="${CLUSTER_NAME}" \
     --location="${ZONE}" \
     --node-locations="${ZONE}" \
     --num-nodes="${NODE_POOL_NODES}" \
     --machine-type="${NP_GPU_MACHTYPE}" \
     --accelerator="type=${ACCELERATOR_TYPE},count=${ACCELERATOR_COUNT},gpu-driver-version=LATEST" \
     --placement-type="COMPACT" \
     --disk-type="pd-ssd" \
     --disk-size="300GB"
   ```

3. **Get Cluster Credentials:**

   ```bash
   gcloud container clusters get-credentials "${CLUSTER_NAME}" --location="${ZONE}"
   ```

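   You can verify that the GPU nodes registered with the cluster; GKE exposes the accelerator type through a standard node label:

   ```bash
   # GPU nodes should report the accelerator type (e.g., nvidia-l4) in the extra column.
   kubectl get nodes -L cloud.google.com/gke-accelerator
   ```
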
4. **Set kubectl Alias (Optional):**

   ```bash
   alias k=kubectl
   ```

5. **Create NGC API Key Secrets:** Create two secrets: one for pulling images from NVIDIA NGC, and one for pods that need the API key at startup.

   ```bash
   k create secret docker-registry secret-nvcr \
     --docker-username=\$oauthtoken \
     --docker-password="${NGC_API_KEY}" \
     --docker-server="nvcr.io"

   k create secret generic ngc-api-key \
     --from-literal=NGC_API_KEY="${NGC_API_KEY}"
   ```

6. **Deploy NIMs:**

   ```bash
   k apply -f digital-human-nimbp.yaml
   ```

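   Right after applying, you can list what the manifest created; the LoadBalancer Services used in the next step may show a `<pending>` EXTERNAL-IP for a few minutes while Google Cloud provisions the load balancers:

   ```bash
   # Pods move from ContainerCreating to Running as the NIM images are pulled.
   k get pods,svc
   ```
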
   The NIM deployment can take up to 15 minutes to complete. Check that the pods reach `Running` status: `k get pods` should list the pods below.

   | NAME | READY | STATUS | RESTARTS |
   |---|---|---|---|
   | `dighum-embedqa-e5v5-aa-aa` | 1/1 | Running | 0 |
   | `dighum-rerankqa-mistral4bv3-bb-bb` | 1/1 | Running | 0 |
   | `dighum-llama3-8b-cc-cc` | 1/1 | Running | 0 |
   | `dighum-audio2face-3d-dd-dd` | 1/1 | Running | 0 |
   | `dighum-fastpitch-tts-ee-ee` | 1/1 | Running | 0 |
   | `dighum-maxine-audio2face-2d-ff-ff` | 1/1 | Running | 0 |
   | `dighum-parakeet-asr-1-1b-gg-gg` | 1/1 | Running | 0 |

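   Instead of polling `k get pods` manually, you can block until everything in the namespace reports `Ready` (here with a 20-minute ceiling to cover image pulls and model downloads):

   ```bash
   # Waits for every pod in the current namespace to become Ready, or times out.
   k wait --for=condition=Ready pod --all --timeout=20m
   ```
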
7. **Access NIM endpoints:**

   ```bash
   SERVICES=$(k get svc | awk '{print $1}' | grep -v NAME | grep '^dighum')

   for service in $SERVICES; do
     # Get the pod name.
     POD=$(k get pods -o go-template --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}' | grep $(echo $service | sed 's/-lb//'))

     # Get the external IP.
     EXTERNAL_IP=$(k get svc $service -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

     echo "----------------------------------"
     echo "Testing service: $service at ${EXTERNAL_IP}"
     curl http://${EXTERNAL_IP}/v1/health/ready
     echo " "
     echo "----------------------------------"
   done
   ```

   [Click here if you need HTTPS endpoints](https.md)

## Test

Below are commands to test each of the endpoints.

- ### nv-embedqa-e5-v5

  Set `EXTERNAL_IP` from the output above for `dighum-embedqa-e5v5`.

  ```bash
  export EXTERNAL_IP=<IP>

  curl -X "POST" \
    "http://${EXTERNAL_IP}/v1/embeddings" \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
      "input": ["Hello world"],
      "model": "nvidia/nv-embedqa-e5-v5",
      "input_type": "query"
    }'
  ```

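  The response follows the OpenAI-style embeddings shape, with the vectors under `data[].embedding`. If `jq` is installed, a quick sanity check is to print the embedding length instead of the raw JSON:

  ```bash
  # Prints the dimensionality of the returned embedding vector.
  curl -s -X "POST" "http://${EXTERNAL_IP}/v1/embeddings" \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{"input": ["Hello world"], "model": "nvidia/nv-embedqa-e5-v5", "input_type": "query"}' \
    | jq '.data[0].embedding | length'
  ```
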
- ### nv-rerankqa-mistral-4b-v3

  Set `EXTERNAL_IP` from the output above for `dighum-rerankqa-mistral4bv3`.

  ```bash
  export EXTERNAL_IP=<IP>

  curl -X "POST" \
    "http://${EXTERNAL_IP}/v1/ranking" \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
      "model": "nvidia/nv-rerankqa-mistral-4b-v3",
      "query": {"text": "which way should i go?"},
      "passages": [
        {"text": "two roads diverged in a yellow wood, and sorry i could not travel both and be one traveler, long i stood and looked down one as far as i could to where it bent in the undergrowth;"}
      ],
      "truncate": "END"
    }'
  ```

- ### llama3-8b-instruct

  Set `EXTERNAL_IP` from the output above for `dighum-llama3-8b`.

  ```bash
  export EXTERNAL_IP=<IP>

  curl -X "POST" \
    "http://${EXTERNAL_IP}/v1/chat/completions" \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
      "model": "meta/llama3-8b-instruct",
      "messages": [{"role":"user", "content":"Write a limerick about the wonders of GPU computing."}],
      "max_tokens": 64
    }'
  ```

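  The chat endpoint is OpenAI-compatible, so any OpenAI-style client can be pointed at `http://${EXTERNAL_IP}/v1`. From the shell, `jq` can pull just the generated text out of the response:

  ```bash
  # Same request as above, but print only the model's reply.
  curl -s -X "POST" "http://${EXTERNAL_IP}/v1/chat/completions" \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
      "model": "meta/llama3-8b-instruct",
      "messages": [{"role":"user", "content":"Write a limerick about the wonders of GPU computing."}],
      "max_tokens": 64
    }' | jq -r '.choices[0].message.content'
  ```
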
- ### parakeet-ctc-1.1b-asr

  - Install the Riva Python client package

    ```bash
    python3 -m venv venv
    source venv/bin/activate
    pip install nvidia-riva-client
    ```

  - Download the Riva sample clients

    ```bash
    git clone https://github.com/nvidia-riva/python-clients.git
    ```

  - Run Speech-to-Text inference in streaming mode. Riva ASR supports mono, 16-bit audio in WAV, OPUS and FLAC formats. The `./output.wav` used here is the file synthesized in the [fastpitch-hifigan-tts](#fastpitch-hifigan-tts) step below; any supported audio file can be substituted.

    ```bash
    k port-forward $(k get pod --selector="app=dighum-parakeet-asr-1-1b" --output jsonpath='{.items[0].metadata.name}') 50051:50051 &

    python3 python-clients/scripts/asr/transcribe_file.py --server 0.0.0.0:50051 --input-file ./output.wav --language-code en-US

    deactivate
    ```

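  - (Optional) If the audio you want to transcribe is not already mono, 16-bit WAV, OPUS or FLAC, `ffmpeg` (assumed to be installed separately) can convert it before running the transcription above; `input.mp3` is a placeholder for your source file:

    ```bash
    # Convert an arbitrary audio file to mono, 16-bit PCM, 16 kHz WAV for Riva ASR.
    ffmpeg -i input.mp3 -ac 1 -ar 16000 -c:a pcm_s16le input.wav
    ```
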
  For more details on getting started with this NIM, visit the [Riva ASR NIM docs](https://docs.nvidia.com/nim/riva/asr/latest/overview.html).

- ### fastpitch-hifigan-tts

  - Install the Riva Python client package

    ```bash
    python3 -m venv venv
    source venv/bin/activate
    pip install nvidia-riva-client
    ```

  - Download the Riva sample clients

    ```bash
    git clone https://github.com/nvidia-riva/python-clients.git
    ```

  - Use `kubectl` to port forward to the TTS pod

    ```bash
    k port-forward $(k get pod --selector="app=dighum-fastpitch-tts" --output jsonpath='{.items[0].metadata.name}') 50051:50051 &
    ```

  - Run Text-to-Speech inference. The synthesized speech is written to a WAV file.

    ```bash
    python3 python-clients/scripts/tts/talk.py --server 0.0.0.0:50051 --text "Hello, this is a speech synthesizer." --language-code en-US --output output.wav

    deactivate
    ```

  Running the above command creates the synthesized audio file named `output.wav`.

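  (Optional) If `ffprobe` is available, you can confirm the file was written and inspect its duration and sample rate:

  ```bash
  # Print container/stream details of the synthesized file.
  ffprobe -hide_banner output.wav
  ```
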
- ### audio2face-2d

  - Set up a virtual environment

    ```bash
    python3 -m venv venv
    source venv/bin/activate
    ```

  - Download the Audio2Face-2D client code and install its dependencies

    ```bash
    git clone https://github.com/NVIDIA-Maxine/nim-clients.git
    cd nim-clients/audio2face-2d/
    pip install -r python/requirements.txt
    ```

  - Compile the protos

    ```bash
    cd protos/linux/python
    chmod +x compile_protos.sh
    ./compile_protos.sh
    ```

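  - (Optional) If you are not using a LoadBalancer `EXTERNAL_IP` as the `--target` in the next step, a port-forward to the Audio2Face-2D pod should also work. The label selector below is an assumption based on the pod name listed earlier (`dighum-maxine-audio2face-2d-...`), and `<grpc-port>` stands for whichever gRPC port `digital-human-nimbp.yaml` exposes for this NIM; adjust both to match your manifest.

    ```bash
    # Forward the Audio2Face-2D gRPC port locally, then use --target 127.0.0.1:<grpc-port>.
    k port-forward $(k get pod --selector="app=dighum-maxine-audio2face-2d" --output jsonpath='{.items[0].metadata.name}') <grpc-port>:<grpc-port> &
    ```
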
  - Run test inference

    ```bash
    cd python/scripts

    python audio2face-2d.py --target <server_ip:port> \
      --audio-input <input audio file path> \
      --portrait-input <input portrait image file path> \
      --output <output file path and the file name> \
      --head-rotation-animation-filepath <rotation animation filepath> \
      --head-translation-animation-filepath <translation animation filepath> \
      --ssl-mode <ssl mode value> \
      --ssl-key <ssl key file path> \
      --ssl-cert <ssl cert filepath> \
      --ssl-root-cert <ssl root cert filepath>
    ```

  Refer to the [audio2face-2d](https://docs.nvidia.com/nim/maxine/audio2face-2d/latest/basic-inference.html#running-inference-via-node-js-script) NIM documentation for how to set these values.

- ### audio2face-3d

  - Set up a virtual environment

    ```bash
    python3 -m venv venv
    source venv/bin/activate
    ```

  - Download the Audio2Face-3D client code and install its dependencies

    ```bash
    git clone https://github.com/NVIDIA/Audio2Face-3D-Samples.git
    cd Audio2Face-3D-Samples/scripts/audio2face_3d_microservices_interaction_app

    pip3 install ../../proto/sample_wheel/nvidia_ace-1.2.0-py3-none-any.whl
    pip3 install -r requirements.txt
    ```

  - Perform a health check

    ```bash
    python3 a2f_3d.py health_check --url 0.0.0.0:52000
    ```

  - Run a test inference

    ```bash
    python3 a2f_3d.py run_inference ../../example_audio/Claire_neutral.wav config/config_claire.yml \
      -u 0.0.0.0:52000
    ```

  Refer to the [audio2face-3d](https://docs.nvidia.com/ace/audio2face-3d-microservice/latest/text/getting-started/getting-started.html#running-inference) NIM documentation for more information.

## Tear down

**Tear down the environment**

**NOTE:** All the deployed NIMs and the cluster itself will be deleted.

```bash
k delete -f digital-human-nimbp.yaml
k delete secret secret-nvcr
k delete secret ngc-api-key
gcloud container clusters delete "${CLUSTER_NAME}" \
  --location="${ZONE}" --quiet
```
