
Commit ac03a5d

doc: Add TRT-LLM backend to the doc (#102)
* Add TRT-LLM backend to the doc
* Add TRT-LLM backend to platform support matrix
* Switch the order of vLLM and TRT-LLM
1 parent 30fa78a commit ac03a5d

2 files changed: 10 additions, 1 deletion

README.md (+8)

@@ -115,6 +115,14 @@ random forest models. The
 [fil_backend](https://github.com/triton-inference-server/fil_backend) repo
 contains the documentation and source for the backend.

+**TensorRT-LLM**: The TensorRT-LLM backend allows you to serve
+[TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) models with Triton Server.
+Check out the
+[Triton TRT-LLM user guide](https://github.com/triton-inference-server/server/blob/main/docs/getting_started/trtllm_user_guide.md)
+for more information. The
+[tensorrtllm_backend](https://github.com/triton-inference-server/tensorrtllm_backend)
+repo contains the documentation and source for the backend.
+
 **vLLM**: The vLLM backend is designed to run
 [supported models](https://vllm.readthedocs.io/en/latest/models/supported_models.html)
 on a [vLLM engine](https://github.com/vllm-project/vllm/blob/main/vllm/engine/async_llm_engine.py).

docs/backend_platform_support_matrix.md (+2 -1)

@@ -1,5 +1,5 @@
 <!--
-# Copyright 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright 2022-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
@@ -53,6 +53,7 @@ each backend on different platforms.
 | Python[^1] | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU |
 | DALI | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | :heavy_check_mark: GPU[^2] <br/> :heavy_check_mark: CPU[^2] |
 | FIL | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | Unsupported |
+| TensorRT-LLM | :heavy_check_mark: GPU <br/> :x: CPU | :heavy_check_mark: GPU <br/> :x: CPU |
 | vLLM | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | Unsupported |

