
FEAT: support Deepseek-R1-0528 #3539


Merged: 6 commits, May 29, 2025
2 changes: 1 addition & 1 deletion README.md
@@ -47,14 +47,14 @@ potential of cutting-edge AI models.
- Support SGLang backend: [#1161](https://github.com/xorbitsai/inference/pull/1161)
- Support LoRA for LLM and image models: [#1080](https://github.com/xorbitsai/inference/pull/1080)
### New Models
- Built-in support for [Deepseek-R1-0528](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528): [#3539](https://github.com/xorbitsai/inference/pull/3539)
- Built-in support for [Qwen3](https://qwenlm.github.io/blog/qwen3/): [#3347](https://github.com/xorbitsai/inference/pull/3347)
- Built-in support for [Qwen2.5-Omni](https://github.com/QwenLM/Qwen2.5-Omni): [#3279](https://github.com/xorbitsai/inference/pull/3279)
- Built-in support for [Skywork-OR1](https://github.com/SkyworkAI/Skywork-OR1): [#3274](https://github.com/xorbitsai/inference/pull/3274)
- Built-in support for [GLM-4-0414](https://github.com/THUDM/GLM-4): [#3251](https://github.com/xorbitsai/inference/pull/3251)
- Built-in support for [SeaLLMs-v3](https://github.com/DAMO-NLP-SG/DAMO-SeaLLMs): [#3248](https://github.com/xorbitsai/inference/pull/3248)
- Built-in support for [paraformer-zh](https://huggingface.co/funasr/paraformer-zh): [#3236](https://github.com/xorbitsai/inference/pull/3236)
- Built-in support for [InternVL3](https://internvl.github.io/blog/2025-04-11-InternVL-3.0/): [#3235](https://github.com/xorbitsai/inference/pull/3235)
- Built-in support for [MegaTTS3](https://github.com/bytedance/MegaTTS3): [#3224](https://github.com/xorbitsai/inference/pull/3224)
### Integrations
- [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): an LLMOps platform that enables developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.
- [FastGPT](https://github.com/labring/FastGPT): a knowledge-based platform built on LLMs that offers out-of-the-box data processing and model invocation capabilities and allows for workflow orchestration through Flow visualization.
2 changes: 1 addition & 1 deletion README_zh_CN.md
@@ -43,14 +43,14 @@ Xorbits Inference (Xinference) is a powerful and feature-rich distributed
- Support SGLang backend: [#1161](https://github.com/xorbitsai/inference/pull/1161)
- Support LoRA for LLM and image models: [#1080](https://github.com/xorbitsai/inference/pull/1080)
### New Models
- Built-in support for [Deepseek-R1-0528](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528): [#3539](https://github.com/xorbitsai/inference/pull/3539)
- Built-in support for [Qwen3](https://qwenlm.github.io/blog/qwen3/): [#3347](https://github.com/xorbitsai/inference/pull/3347)
- Built-in support for [Qwen2.5-Omni](https://github.com/QwenLM/Qwen2.5-Omni): [#3279](https://github.com/xorbitsai/inference/pull/3279)
- Built-in support for [Skywork-OR1](https://github.com/SkyworkAI/Skywork-OR1): [#3274](https://github.com/xorbitsai/inference/pull/3274)
- Built-in support for [GLM-4-0414](https://github.com/THUDM/GLM-4): [#3251](https://github.com/xorbitsai/inference/pull/3251)
- Built-in support for [SeaLLMs-v3](https://github.com/DAMO-NLP-SG/DAMO-SeaLLMs): [#3248](https://github.com/xorbitsai/inference/pull/3248)
- Built-in support for [paraformer-zh](https://huggingface.co/funasr/paraformer-zh): [#3236](https://github.com/xorbitsai/inference/pull/3236)
- Built-in support for [InternVL3](https://internvl.github.io/blog/2025-04-11-InternVL-3.0/): [#3235](https://github.com/xorbitsai/inference/pull/3235)
- Built-in support for [MegaTTS3](https://github.com/bytedance/MegaTTS3): [#3224](https://github.com/xorbitsai/inference/pull/3224)
### Integrations
- [FastGPT](https://doc.fastai.site/docs/development/custom-models/xinference/): an open-source AI knowledge-base platform built on large language models. It provides out-of-the-box data processing, model invocation, RAG retrieval, and visual AI workflow orchestration, helping you easily build complex question-answering scenarios.
- [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): an LLMOps platform covering the development, deployment, maintenance, and optimization of large language models.
7 changes: 5 additions & 2 deletions doc/source/getting_started/installation.rst
@@ -60,7 +60,7 @@ Currently, supported models include:
- ``codestral-v0.1``
- ``Yi``, ``Yi-1.5``, ``Yi-chat``, ``Yi-1.5-chat``, ``Yi-1.5-chat-16k``
- ``code-llama``, ``code-llama-python``, ``code-llama-instruct``
- ``deepseek``, ``deepseek-coder``, ``deepseek-chat``, ``deepseek-coder-instruct``, ``deepseek-r1-distill-qwen``, ``deepseek-v2-chat``, ``deepseek-v2-chat-0628``, ``deepseek-v2.5``, ``deepseek-v3``, ``deepseek-r1``, ``deepseek-r1-distill-llama``
- ``deepseek``, ``deepseek-coder``, ``deepseek-chat``, ``deepseek-coder-instruct``, ``deepseek-r1-distill-qwen``, ``deepseek-v2-chat``, ``deepseek-v2-chat-0628``, ``deepseek-v2.5``, ``deepseek-v3``, ``deepseek-v3-0324``, ``deepseek-r1``, ``deepseek-r1-0528``, ``deepseek-prover-v2``, ``deepseek-r1-distill-llama``
- ``yi-coder``, ``yi-coder-chat``
- ``codeqwen1.5``, ``codeqwen1.5-chat``
- ``qwen2.5``, ``qwen2.5-coder``, ``qwen2.5-instruct``, ``qwen2.5-coder-instruct``, ``qwen2.5-instruct-1m``
@@ -74,11 +74,14 @@ Currently, supported models include:
- ``codegeex4``
- ``qwen1.5-chat``, ``qwen1.5-moe-chat``
- ``qwen2-instruct``, ``qwen2-moe-instruct``
- ``XiYanSQL-QwenCoder-2504``
- ``QwQ-32B-Preview``, ``QwQ-32B``
- ``marco-o1``
- ``fin-r1``
- ``seallms-v3``
- ``skywork-or1-preview``
- ``skywork-or1-preview``, ``skywork-or1``
- ``HuatuoGPT-o1-Qwen2.5``, ``HuatuoGPT-o1-LLaMA-3.1``
- ``DianJin-R1``
- ``gemma-it``, ``gemma-2-it``, ``gemma-3-1b-it``
- ``orion-chat``, ``orion-chat-rag``
- ``c4ai-command-r-v01``
@@ -21,7 +21,7 @@ msgstr ""

#: ../../source/models/model_abilities/audio.rst:5
msgid "Audio"
msgstr ""
msgstr "音频"

#: ../../source/models/model_abilities/audio.rst:7
msgid "Learn how to turn audio into text or text into audio with Xinference."
@@ -358,7 +358,7 @@ msgstr "基本使用,加载模型 ``CosyVoice-300M-SFT``。"
msgid ""
"Please note that the latest CosyVoice 2.0 requires `use_flow_cache=True` "
"for stream generation."
msgstr ""
msgstr "请注意,最新版本的 CosyVoice 2.0 在进行流式生成时需要设置 `use_flow_cache=True`。"

#: ../../source/models/model_abilities/audio.rst:422
msgid ""
6 changes: 6 additions & 0 deletions doc/source/models/builtin/audio/index.rst
@@ -55,6 +55,12 @@ The following is a list of built-in audio models in Xinference:

paraformer-zh

paraformer-zh-hotword

paraformer-zh-long

paraformer-zh-spk

sensevoicesmall

whisper-base
19 changes: 19 additions & 0 deletions doc/source/models/builtin/audio/paraformer-zh-hotword.rst
@@ -0,0 +1,19 @@
.. _models_builtin_paraformer-zh-hotword:

=====================
paraformer-zh-hotword
=====================

- **Model Name:** paraformer-zh-hotword
- **Model Family:** funasr
- **Abilities:** ['audio2text']
- **Multilingual:** False

Specifications
^^^^^^^^^^^^^^

- **Model ID:** JunHowie/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404

Execute the following command to launch the model::

xinference launch --model-name paraformer-zh-hotword --model-type audio
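For context beyond the diff, here is a minimal usage sketch (not part of this PR) of calling one of the new paraformer models from the Xinference Python client. It assumes a local server at the default endpoint ``http://127.0.0.1:9997``, that the client's audio handle exposes a ``transcriptions()`` method returning an OpenAI-style result with a ``text`` field, and a hypothetical input file ``meeting.wav``; the same pattern should apply to ``paraformer-zh-long`` and ``paraformer-zh-spk``, only the model name changes::

    from xinference.client import Client

    client = Client("http://127.0.0.1:9997")

    # Launch the audio model programmatically (equivalent to the CLI command above).
    model_uid = client.launch_model(
        model_name="paraformer-zh-hotword",
        model_type="audio",
    )
    model = client.get_model(model_uid)

    # Audio is sent as raw bytes; the returned dict is assumed to follow the
    # OpenAI transcription schema with a "text" field.
    with open("meeting.wav", "rb") as f:
        result = model.transcriptions(f.read())

    print(result["text"])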
19 changes: 19 additions & 0 deletions doc/source/models/builtin/audio/paraformer-zh-long.rst
@@ -0,0 +1,19 @@
.. _models_builtin_paraformer-zh-long:

==================
paraformer-zh-long
==================

- **Model Name:** paraformer-zh-long
- **Model Family:** funasr
- **Abilities:** ['audio2text']
- **Multilingual:** False

Specifications
^^^^^^^^^^^^^^

- **Model ID:** JunHowie/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch

Execute the following command to launch the model::

xinference launch --model-name paraformer-zh-long --model-type audio
19 changes: 19 additions & 0 deletions doc/source/models/builtin/audio/paraformer-zh-spk.rst
@@ -0,0 +1,19 @@
.. _models_builtin_paraformer-zh-spk:

=================
paraformer-zh-spk
=================

- **Model Name:** paraformer-zh-spk
- **Model Family:** funasr
- **Abilities:** ['audio2text']
- **Multilingual:** False

Specifications
^^^^^^^^^^^^^^

- **Model ID:** JunHowie/speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn

Execute the following command to launch the model::

xinference launch --model-name paraformer-zh-spk --model-type audio
31 changes: 0 additions & 31 deletions doc/source/models/builtin/llm/cogvlm2-video-llama3-chat.rst

This file was deleted.

47 changes: 0 additions & 47 deletions doc/source/models/builtin/llm/cogvlm2.rst

This file was deleted.

63 changes: 63 additions & 0 deletions doc/source/models/builtin/llm/deepseek-prover-v2.rst
@@ -0,0 +1,63 @@
.. _models_llm_deepseek-prover-v2:

========================================
deepseek-prover-v2
========================================

- **Context Length:** 163840
- **Model Name:** deepseek-prover-v2
- **Languages:** en, zh
- **Abilities:** chat, reasoning
- **Description:** We introduce DeepSeek-Prover-V2, an open-source large language model designed for formal theorem proving in Lean 4, with initialization data collected through a recursive theorem proving pipeline powered by DeepSeek-V3. The cold-start training procedure begins by prompting DeepSeek-V3 to decompose complex problems into a series of subgoals. The proofs of resolved subgoals are synthesized into a chain-of-thought process, combined with DeepSeek-V3's step-by-step reasoning, to create an initial cold start for reinforcement learning. This process enables us to integrate both informal and formal mathematical reasoning into a unified model.

Specifications
^^^^^^^^^^^^^^


Model Spec 1 (pytorch, 671 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 671
- **Quantizations:** none
- **Engines**: vLLM, Transformers
- **Model ID:** deepseek-ai/DeepSeek-Prover-V2-671B
- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-Prover-V2-671B>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-Prover-V2-671B>`__

Execute the following command to launch the model; remember to replace ``${engine}`` and
``${quantization}`` with your chosen engine and quantization method from the options listed above::

xinference launch --model-engine ${engine} --model-name deepseek-prover-v2 --size-in-billions 671 --model-format pytorch --quantization ${quantization}


Model Spec 2 (pytorch, 7 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 7
- **Quantizations:** none
- **Engines**: vLLM, Transformers
- **Model ID:** deepseek-ai/DeepSeek-Prover-V2-7B
- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-Prover-V2-7B>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-Prover-V2-7B>`__

Execute the following command to launch the model; remember to replace ``${engine}`` and
``${quantization}`` with your chosen engine and quantization method from the options listed above::

xinference launch --model-engine ${engine} --model-name deepseek-prover-v2 --size-in-billions 7 --model-format pytorch --quantization ${quantization}


Model Spec 3 (mlx, 7 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** mlx
- **Model Size (in billions):** 7
- **Quantizations:** 4bit
- **Engines**:
- **Model ID:** mlx-community/DeepSeek-Prover-V2-7B-4bit
- **Model Hubs**: `Hugging Face <https://huggingface.co/mlx-community/DeepSeek-Prover-V2-7B-4bit>`__, `ModelScope <https://modelscope.cn/models/mlx-community/DeepSeek-Prover-V2-7B-4bit>`__

Execute the following command to launch the model; remember to replace ``${engine}`` and
``${quantization}`` with your chosen engine and quantization method from the options listed above::

xinference launch --model-engine ${engine} --model-name deepseek-prover-v2 --size-in-billions 7 --model-format mlx --quantization ${quantization}
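As a supplement (not part of this PR), a minimal sketch of launching the 7 billion pytorch spec programmatically instead of via the CLI. The keyword arguments are assumed to mirror the CLI flags above as exposed by the Xinference Python client and may differ slightly across versions::

    from xinference.client import Client

    client = Client("http://127.0.0.1:9997")

    # Mirrors: xinference launch --model-engine transformers --model-name deepseek-prover-v2
    #          --size-in-billions 7 --model-format pytorch --quantization none
    model_uid = client.launch_model(
        model_name="deepseek-prover-v2",
        model_engine="transformers",   # or "vllm", per the Engines listed above
        model_size_in_billions=7,
        model_format="pytorch",
        quantization="none",
    )
    print(f"deepseek-prover-v2 launched, uid: {model_uid}")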

31 changes: 31 additions & 0 deletions doc/source/models/builtin/llm/deepseek-r1-0528.rst
@@ -0,0 +1,31 @@
.. _models_llm_deepseek-r1-0528:

========================================
deepseek-r1-0528
========================================

- **Context Length:** 163840
- **Model Name:** deepseek-r1-0528
- **Languages:** en, zh
- **Abilities:** chat, reasoning
- **Description:** DeepSeek-R1 incorporates cold-start data before RL and achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks.

Specifications
^^^^^^^^^^^^^^


Model Spec 1 (pytorch, 671 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 671
- **Quantizations:** none
- **Engines**: vLLM, Transformers
- **Model ID:** deepseek-ai/DeepSeek-R1-0528
- **Model Hubs**: `Hugging Face <https://huggingface.co/deepseek-ai/DeepSeek-R1-0528>`__, `ModelScope <https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-0528>`__

Execute the following command to launch the model; remember to replace ``${engine}`` and
``${quantization}`` with your chosen engine and quantization method from the options listed above::

xinference launch --model-engine ${engine} --model-name deepseek-r1-0528 --size-in-billions 671 --model-format pytorch --quantization ${quantization}
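For illustration only (not part of this PR), a minimal sketch of querying the launched model through Xinference's OpenAI-compatible endpoint. The endpoint URL, the placeholder API key, and using the model name as the model id are assumptions about a default local deployment::

    from openai import OpenAI

    client = OpenAI(base_url="http://127.0.0.1:9997/v1", api_key="not-used")

    response = client.chat.completions.create(
        model="deepseek-r1-0528",
        messages=[
            {"role": "user", "content": "Why is the sky blue? Answer briefly."}
        ],
        max_tokens=1024,
    )
    # Reasoning models may interleave their chain of thought with the final
    # answer; here we simply print whatever the server returns.
    print(response.choices[0].message.content)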

47 changes: 0 additions & 47 deletions doc/source/models/builtin/llm/deepseek-v2.rst

This file was deleted.
