
Commit af93c4f

qinxuye and codingl2k1 authored
DOC: add doc about virtual env & update models in README (#3287)
Co-authored-by: codingl2k1 <[email protected]>
1 parent 9f5891e commit af93c4f

21 files changed: +941, -101 lines changed

README.md

+8, -8

@@ -47,14 +47,14 @@ potential of cutting-edge AI models.
 - Support SGLang backend: [#1161](https://github.com/xorbitsai/inference/pull/1161)
 - Support LoRA for LLM and image models: [#1080](https://github.com/xorbitsai/inference/pull/1080)
 ### New Models
-- Built-in support for [Gemma-3-it](https://blog.google/technology/developers/gemma-3/): [#3077](https://github.com/xorbitsai/inference/pull/3077)
-- Built-in support for [QwQ-32B](https://qwenlm.github.io/blog/qwq-32b/): [#3005](https://github.com/xorbitsai/inference/pull/3005)
-- Built-in support for [DeepSeek V3 and R1](https://github.com/deepseek-ai/DeepSeek-R1): [#2864](https://github.com/xorbitsai/inference/pull/2864)
-- Built-in support for [InternVL2.5](https://internvl.github.io/blog/2024-12-05-InternVL-2.5/): [#2776](https://github.com/xorbitsai/inference/pull/2776)
-- Built-in support for [DeepSeek-R1-Distill-Llama](https://github.com/deepseek-ai/DeepSeek-R1?tab=readme-ov-file#deepseek-r1-distill-models): [#2811](https://github.com/xorbitsai/inference/pull/2811)
-- Built-in support for [DeepSeek-R1-Distill-Qwen](https://github.com/deepseek-ai/DeepSeek-R1?tab=readme-ov-file#deepseek-r1-distill-models): [#2781](https://github.com/xorbitsai/inference/pull/2781)
-- Built-in support for [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M): [#2790](https://github.com/xorbitsai/inference/pull/2790)
-- Built-in support for [qwen2.5-vl](https://github.com/QwenLM/Qwen2.5-VL): [#2788](https://github.com/xorbitsai/inference/pull/2788)
+- Built-in support for [Qwen2.5-Omni](https://github.com/QwenLM/Qwen2.5-Omni): [#3279](https://github.com/xorbitsai/inference/pull/3279)
+- Built-in support for [Skywork-OR1](https://github.com/SkyworkAI/Skywork-OR1): [#3274](https://github.com/xorbitsai/inference/pull/3274)
+- Built-in support for [GLM-4-0414](https://github.com/THUDM/GLM-4): [#3251](https://github.com/xorbitsai/inference/pull/3251)
+- Built-in support for [SeaLLMs-v3](https://github.com/DAMO-NLP-SG/DAMO-SeaLLMs): [#3248](https://github.com/xorbitsai/inference/pull/3248)
+- Built-in support for [paraformer-zh](https://huggingface.co/funasr/paraformer-zh): [#3236](https://github.com/xorbitsai/inference/pull/3236)
+- Built-in support for [InternVL3](https://internvl.github.io/blog/2025-04-11-InternVL-3.0/): [#3235](https://github.com/xorbitsai/inference/pull/3235)
+- Built-in support for [MegaTTS3](https://github.com/bytedance/MegaTTS3): [#3224](https://github.com/xorbitsai/inference/pull/3224)
+- Built-in support for [Deepseek-VL2](https://github.com/deepseek-ai/DeepSeek-VL2): [#3179](https://github.com/xorbitsai/inference/pull/3179)
 ### Integrations
 - [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): an LLMOps platform that enables developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.
 - [FastGPT](https://github.com/labring/FastGPT): a knowledge-based platform built on the LLM, offers out-of-the-box data processing and model invocation capabilities, allows for workflow orchestration through Flow visualization.

README_zh_CN.md

+8, -8

@@ -43,14 +43,14 @@ Xorbits Inference(Xinference)是一个性能强大且功能全面的分布
 - 支持 SGLang 后端: [#1161](https://github.com/xorbitsai/inference/pull/1161)
 - 支持LLM和图像模型的LoRA: [#1080](https://github.com/xorbitsai/inference/pull/1080)
 ### 新模型
-- 内置 [Gemma-3-it](https://blog.google/technology/developers/gemma-3/): [#3077](https://github.com/xorbitsai/inference/pull/3077)
-- 内置 [QwQ-32B](https://qwenlm.github.io/zh/blog/qwq-32b/): [#3005](https://github.com/xorbitsai/inference/pull/3005)
-- 内置 [DeepSeek V3 and R1](https://github.com/deepseek-ai/DeepSeek-R1): [#2864](https://github.com/xorbitsai/inference/pull/2864)
-- 内置 [InternVL2.5](https://internvl.github.io/blog/2024-12-05-InternVL-2.5/): [#2776](https://github.com/xorbitsai/inference/pull/2776)
-- 内置 [DeepSeek-R1-Distill-Llama](https://github.com/deepseek-ai/DeepSeek-R1?tab=readme-ov-file#deepseek-r1-distill-models): [#2811](https://github.com/xorbitsai/inference/pull/2811)
-- 内置 [DeepSeek-R1-Distill-Qwen](https://github.com/deepseek-ai/DeepSeek-R1?tab=readme-ov-file#deepseek-r1-distill-models): [#2781](https://github.com/xorbitsai/inference/pull/2781)
-- 内置 [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M): [#2790](https://github.com/xorbitsai/inference/pull/2790)
-- 内置 [qwen2.5-vl](https://github.com/QwenLM/Qwen2.5-VL): [#2788](https://github.com/xorbitsai/inference/pull/2788)
+- 内置 [Qwen2.5-Omni](https://github.com/QwenLM/Qwen2.5-Omni): [#3279](https://github.com/xorbitsai/inference/pull/3279)
+- 内置 [Skywork-OR1](https://github.com/SkyworkAI/Skywork-OR1): [#3274](https://github.com/xorbitsai/inference/pull/3274)
+- 内置 [GLM-4-0414](https://github.com/THUDM/GLM-4): [#3251](https://github.com/xorbitsai/inference/pull/3251)
+- 内置 [SeaLLMs-v3](https://github.com/DAMO-NLP-SG/DAMO-SeaLLMs): [#3248](https://github.com/xorbitsai/inference/pull/3248)
+- 内置 [paraformer-zh](https://huggingface.co/funasr/paraformer-zh): [#3236](https://github.com/xorbitsai/inference/pull/3236)
+- 内置 [InternVL3](https://internvl.github.io/blog/2025-04-11-InternVL-3.0/): [#3235](https://github.com/xorbitsai/inference/pull/3235)
+- 内置 [MegaTTS3](https://github.com/bytedance/MegaTTS3): [#3224](https://github.com/xorbitsai/inference/pull/3224)
+- 内置 [Deepseek-VL2](https://github.com/deepseek-ai/DeepSeek-VL2): [#3179](https://github.com/xorbitsai/inference/pull/3179)
 ### 集成
 - [FastGPT](https://doc.fastai.site/docs/development/custom-models/xinference/):一个基于 LLM 大模型的开源 AI 知识库构建平台。提供了开箱即用的数据处理、模型调用、RAG 检索、可视化 AI 工作流编排等能力,帮助您轻松实现复杂的问答场景。
 - [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): 一个涵盖了大型语言模型开发、部署、维护和优化的 LLMOps 平台。

doc/source/getting_started/environments.rst

+2

@@ -14,6 +14,8 @@ XINFERENCE_MODEL_SRC
 Modelhub used for downloading models. Default is "huggingface", or you
 can set "modelscope" as downloading source.
 
+.. _environments_xinference_home:
+
 XINFERENCE_HOME
 ~~~~~~~~~~~~~~~~
 By default, Xinference uses ``<HOME>/.xinference`` as home path to store
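The two settings documented in this file are plain environment variables read at server startup, so they compose on one command line. A minimal sketch: the ModelScope source comes straight from the text above, while the home directory path is purely illustrative, not a documented default:

```shell
# Download models from ModelScope instead of the default Hugging Face hub,
# and store model files under a custom home directory (illustrative path).
XINFERENCE_MODEL_SRC=modelscope \
XINFERENCE_HOME=/data/xinference \
xinference-local
```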

doc/source/getting_started/installation.rst

+8, -9

@@ -89,17 +89,12 @@ and will be the sole backend for llama.cpp in the future.
 
 .. note::
 
-   ``llama-cpp-python`` is the default option for llama.cpp backend.
-   To enable xllamacpp, add environment variable ``USE_XLLAMACPP=1``.
-
-   e.g. Starting local Xinference via
-
-   ``USE_XLLAMACPP=1 xinference-local``
+   ``xllamacpp`` is the default option for llama.cpp backend since v1.5.0.
+   To enable ``llama-cpp-python``, add environment variable ``USE_XLLAMACPP=0``.
 
 .. warning::
 
-   For upcoming Xinference v1.5.0,
-   ``xllamacpp`` will become default option for llama.cpp, and ``llama-cpp-python`` will be deprecated.
+   Since Xinference v1.5.0, ``llama-cpp-python`` will be deprecated.
    For Xinference v1.6.0, ``llama-cpp-python`` will be removed.
 
 Initial setup::
@@ -112,10 +107,14 @@ Installation instructions for ``xllamacpp``:
 
    pip install -U xllamacpp
 
-- Cuda::
+- CUDA::
 
    pip install xllamacpp --force-reinstall --index-url https://xorbitsai.github.io/xllamacpp/whl/cu124
 
+- HIP::
+
+   pip install xllamacpp --force-reinstall --index-url https://xorbitsai.github.io/xllamacpp/whl/rocm-6.0.2
+
 Hardware-Specific installations for ``llama-cpp-python``:
 
 - Apple Silicon::
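The rewritten note above inverts the old example: ``xllamacpp`` is now the default, so the environment variable is only needed to opt back into the deprecated backend. A minimal sketch while ``llama-cpp-python`` still ships:

```shell
# Since v1.5.0, xllamacpp is the default llama.cpp backend; set
# USE_XLLAMACPP=0 to fall back to llama-cpp-python
# (deprecated in v1.5.0, removed in v1.6.0).
USE_XLLAMACPP=0 xinference-local
```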

doc/source/locale/zh_CN/LC_MESSAGES/getting_started/installation.po

+55, -35

@@ -7,7 +7,7 @@ msgid ""
 msgstr ""
 "Project-Id-Version: Xinference \n"
 "Report-Msgid-Bugs-To: \n"
-"POT-Creation-Date: 2025-03-19 12:51+0800\n"
+"POT-Creation-Date: 2025-04-19 00:40+0200\n"
 "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
 "Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
 "Language: zh_CN\n"
@@ -16,7 +16,7 @@ msgstr ""
 "MIME-Version: 1.0\n"
 "Content-Type: text/plain; charset=utf-8\n"
 "Content-Transfer-Encoding: 8bit\n"
-"Generated-By: Babel 2.14.0\n"
+"Generated-By: Babel 2.16.0\n"
 
 #: ../../source/getting_started/installation.rst:5
 msgid "Installation"
@@ -160,7 +160,7 @@ msgstr ""
 #: ../../source/getting_started/installation.rst:50
 msgid ""
 "``qwen2.5``, ``qwen2.5-coder``, ``qwen2.5-instruct``, ``qwen2.5-coder-"
-"instruct``"
+"instruct``, ``qwen2.5-instruct-1m``"
 msgstr ""
 
 #: ../../source/getting_started/installation.rst:51
@@ -212,86 +212,89 @@ msgid "``marco-o1``"
 msgstr ""
 
 #: ../../source/getting_started/installation.rst:63
-msgid "``gemma-it``, ``gemma-2-it``"
+msgid "``fin-r1``"
 msgstr ""
 
 #: ../../source/getting_started/installation.rst:64
-msgid "``orion-chat``, ``orion-chat-rag``"
+msgid "``gemma-it``, ``gemma-2-it``, ``gemma-3-1b-it``"
 msgstr ""
 
 #: ../../source/getting_started/installation.rst:65
-msgid "``c4ai-command-r-v01``"
+msgid "``orion-chat``, ``orion-chat-rag``"
 msgstr ""
 
 #: ../../source/getting_started/installation.rst:66
-msgid "``minicpm3-4b``"
+msgid "``c4ai-command-r-v01``"
 msgstr ""
 
 #: ../../source/getting_started/installation.rst:67
-msgid "``internlm3-instruct``"
+msgid "``minicpm3-4b``"
 msgstr ""
 
 #: ../../source/getting_started/installation.rst:68
+msgid "``internlm3-instruct``"
+msgstr ""
+
+#: ../../source/getting_started/installation.rst:69
 msgid "``moonlight-16b-a3b-instruct``"
 msgstr ""
 
-#: ../../source/getting_started/installation.rst:71
+#: ../../source/getting_started/installation.rst:72
 msgid "To install Xinference and vLLM::"
 msgstr "安装 xinference 和 vLLM:"
 
-#: ../../source/getting_started/installation.rst:84
+#: ../../source/getting_started/installation.rst:85
 msgid "Llama.cpp Backend"
 msgstr "Llama.cpp 引擎"
 
-#: ../../source/getting_started/installation.rst:85
+#: ../../source/getting_started/installation.rst:86
 msgid ""
 "Xinference supports models in ``gguf`` format via ``xllamacpp`` or "
 "``llama-cpp-python``. `xllamacpp "
 "<https://github.com/xorbitsai/xllamacpp>`_ is developed by Xinference "
 "team, and will be the sole backend for llama.cpp in the future."
-msgstr "Xinference 通过 xllamacpp 或 llama-cpp-python 支持 gguf 格式的模型。"
-"`xllamacpp <https://github.com/xorbitsai/xllamacpp>`_ 由 Xinference 团队开发,"
-"并将在未来成为 llama.cpp 的唯一后端。"
+msgstr ""
+"Xinference 通过 xllamacpp 或 llama-cpp-python 支持 gguf 格式的模型。`"
+"xllamacpp <https://github.com/xorbitsai/xllamacpp>`_ 由 Xinference 团队"
+"开发,并将在未来成为 llama.cpp 的唯一后端。"
 
-#: ../../source/getting_started/installation.rst:91
+#: ../../source/getting_started/installation.rst:92
 msgid ""
-"``llama-cpp-python`` is the default option for llama.cpp backend. To "
-"enable xllamacpp, add environment variable ``USE_XLLAMACPP=1``."
-msgstr "``llama-cpp-python`` 是 llama.cpp 后端的默认选项。"
-"要启用 xllamacpp,请添加环境变量 USE_XLLAMACPP=1。"
-
-#: ../../source/getting_started/installation.rst:94
-msgid "e.g. Starting local Xinference via"
-msgstr "例如,通过以下方式启动本地 Xinference"
-
-#: ../../source/getting_started/installation.rst:96
-msgid "``USE_XLLAMACPP=1 xinference-local``"
+"``xllamacpp`` is the default option for llama.cpp backend since v1.5.0. "
+"To enable ``llama-cpp-python``, add environment variable "
+"``USE_XLLAMACPP=0``."
 msgstr ""
+"自 v1.5.0 起,``xllamacpp`` 成为 llama.cpp 后端的默认选项。如需启用 ``"
+"llama-cpp-python``,请设置环境变量 ``USE_XLLAMACPP=0``。"
 
-#: ../../source/getting_started/installation.rst:100
+#: ../../source/getting_started/installation.rst:97
 msgid ""
-"For upcoming Xinference v1.5.0, ``xllamacpp`` will become default option "
-"for llama.cpp, and ``llama-cpp-python`` will be deprecated. For "
+"Since Xinference v1.5.0, ``llama-cpp-python`` will be deprecated. For "
 "Xinference v1.6.0, ``llama-cpp-python`` will be removed."
-msgstr "在即将发布的 Xinference v1.5.0 中,``xllamacpp`` 将成为 llama.cpp 的默认选项,"
-"而 ``llama-cpp-python`` 将被弃用。在 Xinference v1.6.0 中,``llama-cpp-python`` 将被移除。"
+msgstr ""
+"自 Xinference v1.5.0 起,``llama-cpp-python`` 将被弃用;在 Xinference "
+"v1.6.0 中,该后端将被移除。"
 
-#: ../../source/getting_started/installation.rst:104
+#: ../../source/getting_started/installation.rst:100
 #: ../../source/getting_started/installation.rst:137
 #: ../../source/getting_started/installation.rst:150
 msgid "Initial setup::"
 msgstr "初始步骤:"
 
-#: ../../source/getting_started/installation.rst:108
+#: ../../source/getting_started/installation.rst:104
 msgid "Installation instructions for ``xllamacpp``:"
 msgstr "``xllamacpp`` 的安装说明:"
 
-#: ../../source/getting_started/installation.rst:110
+#: ../../source/getting_started/installation.rst:106
 msgid "CPU or Mac Metal::"
 msgstr "CPU 或 Mac Metal:"
 
+#: ../../source/getting_started/installation.rst:110
+msgid "CUDA::"
+msgstr ""
+
 #: ../../source/getting_started/installation.rst:114
-msgid "Cuda::"
+msgid "HIP::"
 msgstr ""
 
 #: ../../source/getting_started/installation.rst:118
@@ -364,3 +367,20 @@ msgstr ""
 #~ "建议根据当前使用的硬件手动安装依赖,从而"
 #~ "获得最佳的加速效果。"
 
+#~ msgid ""
+#~ "``qwen2.5``, ``qwen2.5-coder``, ``qwen2.5-instruct``, "
+#~ "``qwen2.5-coder-instruct``"
+#~ msgstr ""
+
+#~ msgid "``gemma-it``, ``gemma-2-it``"
+#~ msgstr ""
+
+#~ msgid "e.g. Starting local Xinference via"
+#~ msgstr "例如,通过以下方式启动本地 Xinference"
+
+#~ msgid "``USE_XLLAMACPP=1 xinference-local``"
+#~ msgstr ""
+
+#~ msgid "Cuda::"
+#~ msgstr ""
+

doc/source/locale/zh_CN/LC_MESSAGES/models/index.po

+25, -17

@@ -8,7 +8,7 @@ msgid ""
 msgstr ""
 "Project-Id-Version: Xinference \n"
 "Report-Msgid-Bugs-To: \n"
-"POT-Creation-Date: 2025-01-26 11:51+0800\n"
+"POT-Creation-Date: 2025-04-19 00:37+0800\n"
 "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
 "Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
 "Language: zh_CN\n"
@@ -140,70 +140,78 @@ msgid ""
 msgstr "当你不再需要当前正在运行的模型时,以下列方式释放其占用的资源:"
 
 #: ../../source/models/index.rst:161
+msgid ""
+"For models that are no longer maintained and depend on outdated libraries"
+" (such as ``transformers``), we recommend enabling the :ref:`Model "
+"Virtual Environment <model_virtual_env>` feature to ensure they can run "
+"properly in a compatible environment."
+msgstr "对于不再维护且依赖旧版库(如 ``transformers`` )的模型,建议启用 :ref:`模型虚拟空间 <model_virtual_env>` 功能,以确保它们能在兼容的环境中正常运行。"
+
+#: ../../source/models/index.rst:167
 msgid "Model Usage"
 msgstr "模型使用"
 
-#: ../../source/models/index.rst:166
+#: ../../source/models/index.rst:172
 msgid "Chat & Generate"
 msgstr "聊天 & 生成"
 
-#: ../../source/models/index.rst:170
+#: ../../source/models/index.rst:176
 msgid "Learn how to chat with LLMs in Xinference."
 msgstr "学习如何在 Xinference 中与 LLM聊天。"
 
-#: ../../source/models/index.rst:172
+#: ../../source/models/index.rst:178
 msgid "Tools"
 msgstr "工具"
 
-#: ../../source/models/index.rst:176
+#: ../../source/models/index.rst:182
 msgid "Learn how to connect LLM with external tools."
 msgstr "学习如何将 LLM 与外部工具连接起来。"
 
-#: ../../source/models/index.rst:181
+#: ../../source/models/index.rst:187
 msgid "Embeddings"
 msgstr "嵌入"
 
-#: ../../source/models/index.rst:185
+#: ../../source/models/index.rst:191
 msgid "Learn how to create text embeddings in Xinference."
 msgstr "学习如何在 Xinference 中创建文本嵌入。"
 
-#: ../../source/models/index.rst:187
+#: ../../source/models/index.rst:193
 msgid "Rerank"
 msgstr "重排序"
 
-#: ../../source/models/index.rst:191
+#: ../../source/models/index.rst:197
 msgid "Learn how to use rerank models in Xinference."
 msgstr "学习如何在 Xinference 中使用重排序模型。"
 
-#: ../../source/models/index.rst:196
+#: ../../source/models/index.rst:202
 msgid "Images"
 msgstr "图像"
 
-#: ../../source/models/index.rst:200
+#: ../../source/models/index.rst:206
 msgid "Learn how to generate images with Xinference."
 msgstr "学习如何使用Xinference生成图像。"
 
-#: ../../source/models/index.rst:202
+#: ../../source/models/index.rst:208
 msgid "Multimodal"
 msgstr "多模态"
 
-#: ../../source/models/index.rst:206
+#: ../../source/models/index.rst:212
 msgid "Learn how to process images and audio with LLMs."
 msgstr "学习如何使用 LLM 处理图像和音频。"
 
-#: ../../source/models/index.rst:211
+#: ../../source/models/index.rst:217
 msgid "Audio"
 msgstr "音频"
 
-#: ../../source/models/index.rst:215
+#: ../../source/models/index.rst:221
 msgid "Learn how to turn audio into text or text into audio with Xinference."
 msgstr "学习如何使用 Xinference 将音频转换为文本或将文本转换为音频。"
 
-#: ../../source/models/index.rst:217
+#: ../../source/models/index.rst:223
 msgid "Video"
 msgstr "视频"
 
-#: ../../source/models/index.rst:221
+#: ../../source/models/index.rst:227
 msgid "Learn how to generate video with Xinference."
 msgstr "学习如何使用Xinference生成视频。"
