2 files changed: +10 -1 lines changed

@@ -115,6 +115,14 @@ random forest models. The
 [fil_backend](https://github.com/triton-inference-server/fil_backend) repo
 contains the documentation and source for the backend.
 
+**TensorRT-LLM**: The TensorRT-LLM backend allows you to serve
+[TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) models with Triton Server.
+Check out the
+[Triton TRT-LLM user guide](https://github.com/triton-inference-server/server/blob/main/docs/getting_started/trtllm_user_guide.md)
+for more information. The
+[tensorrtllm_backend](https://github.com/triton-inference-server/tensorrtllm_backend)
+repo contains the documentation and source for the backend.
+
 **vLLM**: The vLLM backend is designed to run
 [supported models](https://vllm.readthedocs.io/en/latest/models/supported_models.html)
 on a [vLLM engine](https://github.com/vllm-project/vllm/blob/main/vllm/engine/async_llm_engine.py).

 <!--
-# Copyright 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright 2022-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
@@ -53,6 +53,7 @@ each backend on different platforms.
 | Python[^1] | :heavy_check_mark: GPU <br /> :heavy_check_mark: CPU | :heavy_check_mark: GPU <br /> :heavy_check_mark: CPU |
 | DALI | :heavy_check_mark: GPU <br /> :heavy_check_mark: CPU | :heavy_check_mark: GPU[^2] <br /> :heavy_check_mark: CPU[^2] |
 | FIL | :heavy_check_mark: GPU <br /> :heavy_check_mark: CPU | Unsupported |
+| TensorRT-LLM | :heavy_check_mark: GPU <br /> :x: CPU | :heavy_check_mark: GPU <br /> :x: CPU |
 | vLLM | :heavy_check_mark: GPU <br /> :heavy_check_mark: CPU | Unsupported |
 
 