doc: Add TRT-LLM backend to the doc (#102)

krishung5 · web-flow · commit ac03a5d97403 · 2024-08-29T14:03:41.000-07:00
* Add TRT-LLM backend to the doc

* Add TRT-LLM backend to platform support matrix

* Switch the order of vLLM and TRT-LLM
diff --git a/README.md b/README.md
@@ -115,6 +115,14 @@ random forest models. The
 [fil_backend](https://github.com/triton-inference-server/fil_backend) repo
 contains the documentation and source for the backend.
 
+**TensorRT-LLM**: The TensorRT-LLM backend allows you to serve
+[TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) models with Triton Server.
+Check out the
+[Triton TRT-LLM user guide](https://github.com/triton-inference-server/server/blob/main/docs/getting_started/trtllm_user_guide.md)
+for more information. The
+[tensorrtllm_backend](https://github.com/triton-inference-server/tensorrtllm_backend)
+repo contains the documentation and source for the backend.
+
 **vLLM**: The vLLM backend is designed to run
 [supported models](https://vllm.readthedocs.io/en/latest/models/supported_models.html)
 on a [vLLM engine](https://github.com/vllm-project/vllm/blob/main/vllm/engine/async_llm_engine.py).
diff --git a/docs/backend_platform_support_matrix.md b/docs/backend_platform_support_matrix.md
@@ -1,5 +1,5 @@
 <!--
-# Copyright 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright 2022-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
@@ -53,6 +53,7 @@ each backend on different platforms.
 | Python[^1]   |  :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU  |  :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU  |
 | DALI         |  :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU  | :heavy_check_mark: GPU[^2] <br/> :heavy_check_mark: CPU[^2] |
 | FIL          |  :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU  |  Unsupported  |
+| TensorRT-LLM |  :heavy_check_mark: GPU <br/> :x: CPU  |  :heavy_check_mark: GPU <br/> :x: CPU  |
 | vLLM         |  :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU  |  Unsupported  |