From 67e642915cccdab881e1bb01acfe41c6bd97f71b Mon Sep 17 00:00:00 2001
From: dzier
Date: Tue, 8 Aug 2023 12:11:30 -0700
Subject: [PATCH 1/4] Update docs with NVAIE messaging

---
 README.md              | 17 ++++++++++-------
 docs/index.md          | 13 +++++++++++--
 docs/user_guide/faq.md | 41 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 62 insertions(+), 9 deletions(-)

diff --git a/README.md b/README.md
index 03bb690384..f8d002af92 100644
--- a/README.md
+++ b/README.md
@@ -38,13 +38,16 @@ and corresponds to the 23.07 container release on
 ----
 
 Triton Inference Server is an open source inference serving software that
-streamlines AI inferencing. Triton enables teams to deploy any AI model from
-multiple deep learning and machine learning frameworks, including TensorRT,
-TensorFlow, PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more. Triton
-supports inference across cloud, data center,edge and embedded devices on NVIDIA
-GPUs, x86 and ARM CPU, or AWS Inferentia. Triton delivers optimized performance
-for many query types, including real time, batched, ensembles and audio/video
-streaming.
+streamlines AI inferencing. Triton enables teams to deploy any AI model from
+multiple deep learning and machine learning frameworks, including TensorRT,
+TensorFlow, PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more. Triton
+Inference Server supports inference across cloud, data center,edge and embedded
+devices on NVIDIA GPUs, x86 and ARM CPU, or AWS Inferentia. Triton Inference
+Server delivers optimized performance for many query types, including real time,
+batched, ensembles and audio/video streaming. Triton Inference Server is part of
+[NVIDIA AI Enterprise](https://www.nvidia.com/en-us/data-center/products/ai-enterprise/),
+an software platform that accelerates the data science pipeline and streamlines
+the development and deployment of production AI.
 
 Major features include:

diff --git a/docs/index.md b/docs/index.md
index 7ae2b22173..ac4bab6f0d 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -58,9 +58,18 @@ Triton Inference Server is an open source inference serving software that stream
-# Triton
+# Triton Inference Server
 
-Triton enables teams to deploy any AI model from multiple deep learning and machine learning frameworks, including TensorRT, TensorFlow, PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more. Triton supports inference across cloud, data center,edge and embedded devices on NVIDIA GPUs, x86 and ARM CPU, or AWS Inferentia. Triton delivers optimized performance for many query types, including real time, batched, ensembles and audio/video streaming.
+Triton Inference Server enables teams to deploy any AI model from multiple deep
+learning and machine learning frameworks, including TensorRT, TensorFlow,
+PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more. Triton supports inference
+across cloud, data center,edge and embedded devices on NVIDIA GPUs, x86 and ARM
+CPU, or AWS Inferentia. Triton Inference Server delivers optimized performance
+for many query types, including real time, batched, ensembles and audio/video
+streaming. Triton Inference Server is part of
+[NVIDIA AI Enterprise](https://www.nvidia.com/en-us/data-center/products/ai-enterprise/),
+an software platform that accelerates the data science pipeline and streamlines
+the development and deployment of production AI.
 
 Major features include:

diff --git a/docs/user_guide/faq.md b/docs/user_guide/faq.md
index 518f2cc161..ece195d7ad 100644
--- a/docs/user_guide/faq.md
+++ b/docs/user_guide/faq.md
@@ -162,3 +162,44 @@
 looking at the gdb trace for the segfault.
 When opening a GitHub issue for the segfault with Triton, please include the
 backtrace to better help us resolve the problem.
+
+## What are the benefits of using [Triton Inference Server](https://developer.nvidia.com/triton-inference-server) as part of the [NVIDIA AI Enterprise Software Suite](https://www.nvidia.com/en-us/data-center/products/ai-enterprise/)?
+
+NVIDIA AI Enterprise enables enterprises to implement full AI workflows by
+delivering an end-to-end AI platform. It provides four key benefits:
+
+### Enterprise-Grade Support, Security & API Stability:
+
+Business-critical AI projects stay on track with NVIDIA Enterprise Support,
+available globally to assist IT teams with deploying and managing the
+lifecycle of AI applications and developer teams with building AI
+applications. Support includes maintenance updates, dependable SLAs, and
+response times. Regular security reviews and priority notifications mitigate
+the potential risk of unmanaged open source software and ensure compliance
+with corporate standards. Finally, long-term support and regression testing
+ensure API stability between releases.
+
+### Speed time to production with AI Workflows & Pretrained Models:
+To reduce the complexity of developing common AI applications, NVIDIA AI
+Enterprise includes
+[AI workflows](https://www.nvidia.com/en-us/launchpad/ai/workflows/), which are
+reference applications for specific business outcomes such as Intelligent
+Virtual Assistants and Digital Fingerprinting for real-time cybersecurity threat
+detection. AI workflow reference applications may include
+[AI frameworks](https://docs.nvidia.com/deeplearning/frameworks/index.html) and
+[pretrained models](https://developer.nvidia.com/ai-models),
+[Helm Charts](https://catalog.ngc.nvidia.com/helm-charts),
+[Jupyter Notebooks](https://developer.nvidia.com/run-jupyter-notebooks), and
+[documentation](https://docs.nvidia.com/ai-enterprise/index.html#overview).
+
+### Performance for Efficiency and Cost Savings:
+Using accelerated compute for AI workloads such as data processing with the
+[NVIDIA RAPIDS Accelerator](https://developer.nvidia.com/rapids) for Apache
+Spark and inference with Triton Inference Server delivers better performance,
+which also improves efficiency and reduces operating and infrastructure costs,
+including savings from reduced time and energy consumption.
+
+### Optimized and Certified to Deploy Everywhere:
+Optimized and certified for the cloud, the data center, and the edge to ensure
+reliable performance whether you run your AI in the public cloud, in
+virtualized data centers, or on DGX systems.
\ No newline at end of file

From 764f2986f9a56b4a622c11415ff0392aada6f960 Mon Sep 17 00:00:00 2001
From: dzier
Date: Tue, 8 Aug 2023 12:55:37 -0700
Subject: [PATCH 2/4] fix newline

---
 docs/user_guide/faq.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/user_guide/faq.md b/docs/user_guide/faq.md
index ece195d7ad..c272fd25a3 100644
--- a/docs/user_guide/faq.md
+++ b/docs/user_guide/faq.md
@@ -202,4 +202,4 @@ including savings from reduced time and energy consumption.
 ### Optimized and Certified to Deploy Everywhere:
 Optimized and certified for the cloud, the data center, and the edge to ensure
 reliable performance whether you run your AI in the public cloud, in
-virtualized data centers, or on DGX systems.
\ No newline at end of file
+virtualized data centers, or on DGX systems.

From 37cad1fa02a0b9e7ebd5faf9ad86fe0ce5c52139 Mon Sep 17 00:00:00 2001
From: dzier
Date: Tue, 8 Aug 2023 16:00:51 -0700
Subject: [PATCH 3/4] fixes

---
 README.md     | 2 +-
 docs/index.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index f8d002af92..b25705df39 100644
--- a/README.md
+++ b/README.md
@@ -46,7 +46,7 @@ devices on NVIDIA GPUs, x86 and ARM CPU, or AWS Inferentia. Triton Inference
 Server delivers optimized performance for many query types, including real time,
 batched, ensembles and audio/video streaming. Triton Inference Server is part of
 [NVIDIA AI Enterprise](https://www.nvidia.com/en-us/data-center/products/ai-enterprise/),
-an software platform that accelerates the data science pipeline and streamlines
+a software platform that accelerates the data science pipeline and streamlines
 the development and deployment of production AI.
 
 Major features include:

diff --git a/docs/index.md b/docs/index.md
index ac4bab6f0d..2895b7501e 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -68,7 +68,7 @@ CPU, or AWS Inferentia. Triton Inference Server delivers optimized performance
 for many query types, including real time, batched, ensembles and audio/video
 streaming. Triton Inference Server is part of
 [NVIDIA AI Enterprise](https://www.nvidia.com/en-us/data-center/products/ai-enterprise/),
-an software platform that accelerates the data science pipeline and streamlines
+a software platform that accelerates the data science pipeline and streamlines
 the development and deployment of production AI.
 
 Major features include:

From f3e3c4511a03ed0a961dade7fd62546a6101f363 Mon Sep 17 00:00:00 2001
From: dzier
Date: Wed, 9 Aug 2023 08:40:40 -0700
Subject: [PATCH 4/4] another fix

---
 README.md     | 2 +-
 docs/index.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index b25705df39..229f4f4103 100644
--- a/README.md
+++ b/README.md
@@ -41,7 +41,7 @@ Triton Inference Server is an open source inference serving software that
 streamlines AI inferencing. Triton enables teams to deploy any AI model from
 multiple deep learning and machine learning frameworks, including TensorRT,
 TensorFlow, PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more. Triton
-Inference Server supports inference across cloud, data center,edge and embedded
+Inference Server supports inference across cloud, data center, edge and embedded
 devices on NVIDIA GPUs, x86 and ARM CPU, or AWS Inferentia. Triton Inference
 Server delivers optimized performance for many query types, including real time,
 batched, ensembles and audio/video streaming. Triton Inference Server is part of

diff --git a/docs/index.md b/docs/index.md
index 2895b7501e..62bdb27d43 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -63,7 +63,7 @@ Triton Inference Server is an open source inference serving software that stream
 Triton Inference Server enables teams to deploy any AI model from multiple deep
 learning and machine learning frameworks, including TensorRT, TensorFlow,
 PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more. Triton supports inference
-across cloud, data center,edge and embedded devices on NVIDIA GPUs, x86 and ARM
+across cloud, data center, edge and embedded devices on NVIDIA GPUs, x86 and ARM
 CPU, or AWS Inferentia. Triton Inference Server delivers optimized performance
 for many query types, including real time, batched, ensembles and audio/video
 streaming. Triton Inference Server is part of
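
The docs these patches touch describe Triton serving inference for real time, batched, ensemble, and streaming queries. As a companion illustration, here is a minimal sketch of sending one request to a running server from Python with the `tritonclient` library. It is illustrative only: the endpoint (localhost:8000), the model name ("my_model"), the tensor names ("INPUT0", "OUTPUT0"), and the shape and dtype are assumptions for the example, not values taken from these patches.

```python
# Minimal sketch: send one inference request to a running Triton server.
# Assumes the HTTP endpoint is at localhost:8000 and a model "my_model"
# with input "INPUT0" (FP32, shape [1, 16]) and output "OUTPUT0" is already
# loaded in the server's model repository; all of these names and shapes
# are illustrative placeholders.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Build the request tensor to match the model's declared input.
data = np.random.rand(1, 16).astype(np.float32)
infer_input = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
infer_input.set_data_from_numpy(data)

# Run inference and read the named output back as a NumPy array.
response = client.infer(model_name="my_model", inputs=[infer_input])
print(response.as_numpy("OUTPUT0"))
```

The same request pattern carries over to the gRPC endpoint (port 8001 by default) via `tritonclient.grpc`, with the model repository layout and config unchanged.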