Commit 78b60d9

cfahlgren1 and lewtun authored
add qwen 3 chat template deep dive (#2834)
* add qwen 3 chat template post
* improve on section title
* nit
* add quotes around think pair
* add code snippet, update thumbnail, add chat template image
* Update qwen-3-chat-template-deep-dive.md

Co-authored-by: lewtun <[email protected]>

* rework last paragraph a bit
* Update qwen-3-chat-template-deep-dive.md
* Update qwen-3-chat-template-deep-dive.md
* Update qwen-3-chat-template-deep-dive.md
* Update qwen-3-chat-template-deep-dive.md

---------

Co-authored-by: lewtun <[email protected]>
1 parent b7e426b commit 78b60d9

3 files changed, +161 -0 lines changed

_blog.yml

+10
@@ -5921,6 +5921,16 @@
     - hub
     - inference
 
+- local: qwen-3-chat-template-deep-dive
+  title: "The 4 Things Qwen-3's Chat Template Teaches Us"
+  author: cfahlgren1
+  thumbnail: /blog/assets/qwen-3-chat-template-deep-dive/thumbnail.png
+  date: April 30, 2025
+  tags:
+    - qwen
+    - chat-template
+    - llms
+
 - local: llama-guard-4
   title: "Welcoming Llama Guard 4 on Hugging Face Hub"
   author: merve
qwen-3-chat-template-deep-dive.md

+151
@@ -0,0 +1,151 @@
---
title: "The 4 Things Qwen-3’s Chat Template Teaches Us"
thumbnail: /blog/assets/qwen-3-chat-template-deep-dive/thumbnail.png
authors:
- user: cfahlgren1
---

# The 4 Things Qwen-3’s Chat Template Teaches Us

_**What a boring Jinja snippet tells us about the new Qwen-3 model.**_

The new Qwen-3 model by [Qwen](https://huggingface.co/qwen) ships with a much more sophisticated chat template than its predecessors, Qwen-2.5 and QwQ. Comparing the Jinja templates side by side reveals some interesting details about the new model.

<h2 style="text-align: center; margin-bottom: 0.5rem; font-style: italic;">Chat Templates</h2>
<ul style="display: flex; justify-content: center; list-style: none; padding: 0; margin: 0;">
  <li style="margin-right: 1rem;"><a href="https://huggingface.co/Qwen/Qwen3-235B-A22B?chat_template=default">Qwen-3 Chat Template</a></li>
  <li style="margin-right: 1rem;"><a href="https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct?chat_template=default">Qwen-2.5 Chat Template</a></li>
  <li><a href="https://huggingface.co/Qwen/QwQ-32B?chat_template=default">Qwen-QwQ Chat Template</a></li>
</ul>

## What is a Chat Template?

A [chat template](https://huggingface.co/docs/transformers/main/en/chat_templating) defines how conversations between users and models are structured and formatted. The template acts as a translator, converting a human-readable conversation:

```js
[
  { role: "user", content: "Hi there!" },
  { role: "assistant", content: "Hi there, how can I help you today?" },
  { role: "user", content: "I'm looking for a new pair of shoes." },
]
```

into a model-friendly format:

```xml
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Hi there, how can I help you today?<|im_end|>
<|im_start|>user
I'm looking for a new pair of shoes.<|im_end|>
<|im_start|>assistant
<think>

</think>
```
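
To make the translation step concrete, here is a minimal sketch in plain JavaScript of what a ChatML-style template does with a message list. The function name `renderChatML` and its `addGenerationPrompt` option are hypothetical, chosen for illustration; the real template is written in Jinja and also handles tools, system prompts, and reasoning blocks.

```js
// Illustrative only: a hand-rolled stand-in for the Jinja template.
// Wraps each message in <|im_start|>role ... <|im_end|> markers.
function renderChatML(messages, { addGenerationPrompt = true } = {}) {
  let prompt = "";
  for (const { role, content } of messages) {
    prompt += `<|im_start|>${role}\n${content}<|im_end|>\n`;
  }
  // Open an assistant turn so the model knows it should respond next.
  if (addGenerationPrompt) {
    prompt += "<|im_start|>assistant\n";
  }
  return prompt;
}

const exampleConversation = [
  { role: "user", content: "Hi there!" },
  { role: "assistant", content: "Hi there, how can I help you today?" },
  { role: "user", content: "I'm looking for a new pair of shoes." },
];

console.log(renderChatML(exampleConversation));
```

This sketch stops at the opening assistant turn; the empty `<think>` pair shown in the rendered output above is added separately by the template's reasoning logic.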

You can easily view the chat template for a given model on the Hugging Face model page.

![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/qwen-3-chat-template/qwen-3-chat-template.png)
<p style="text-align:center; font-style:italic; font-size:medium;">
  Chat Template for <a href="https://huggingface.co/Qwen/Qwen3-235B-A22B?chat_template=default" target="_blank"> Qwen/Qwen3-235B-A22B </a>
</p>

Let's dive into the Qwen-3 chat template and see what we can learn!

## 1. Reasoning doesn't have to be forced

_**and you can make it optional via a simple prefill...**_

Qwen-3 is unique in its ability to toggle reasoning via the `enable_thinking` flag. When the flag is set to false, the template inserts an empty `<think></think>` pair, telling the model to skip step‑by‑step thoughts. Earlier models baked the `<think>` tag into every generation, forcing chain‑of‑thought whether you wanted it or not.

```jinja
{# Qwen-3 #}
{%- if enable_thinking is defined and enable_thinking is false %}
    {{- '<think>\n\n</think>\n\n' }}
{%- endif %}
```

QwQ, for example, forces reasoning in every conversation.

```jinja
{# QwQ #}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n<think>\n' }}
{%- endif %}
```

If `enable_thinking` is true (or left unset), nothing is prefilled and the model is able to decide whether to think or not.
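
The two Jinja snippets can be compared side by side with a small sketch in plain JavaScript. The function names `qwen3GenerationPrompt` and `qwqGenerationPrompt` are hypothetical, and the sketch assumes Qwen-3 opens the assistant turn with the same `<|im_start|>assistant` marker before applying the prefill; it mirrors only the branches shown above, not the full templates.

```js
// Illustrative only: mirrors the Qwen-3 branch shown above.
// The empty <think> pair is prefilled only when enableThinking is
// explicitly false; true or undefined leaves the choice to the model.
function qwen3GenerationPrompt(enableThinking) {
  let prompt = "<|im_start|>assistant\n";
  if (enableThinking === false) {
    prompt += "<think>\n\n</think>\n\n";
  }
  return prompt;
}

// Illustrative only: mirrors the QwQ branch, which always opens a <think> tag.
function qwqGenerationPrompt() {
  return "<|im_start|>assistant\n<think>\n";
}

console.log(JSON.stringify(qwen3GenerationPrompt(false)));
console.log(JSON.stringify(qwen3GenerationPrompt(true)));
console.log(JSON.stringify(qwqGenerationPrompt()));
```

The design point is that an already closed `<think>` block acts as a prefill: the model sees its reasoning as finished and moves straight to the answer.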

You can test out the template with the following code:

```js
import { Template } from "@huggingface/jinja";
import { downloadFile } from "@huggingface/hub";

const HF_TOKEN = process.env.HF_TOKEN;

// Fetch the tokenizer config, which contains the chat template
const file = await downloadFile({
  repo: "Qwen/Qwen3-235B-A22B",
  path: "tokenizer_config.json",
  accessToken: HF_TOKEN,
});
if (!file) throw new Error("tokenizer_config.json not found");
const config = JSON.parse(await file.text());

// Example conversation to render
const messages = [{ role: "user", content: "Hi there!" }];

const template = new Template(config.chat_template);
const result = template.render({
  messages,
  add_generation_prompt: true,
  enable_thinking: false,
  bos_token: config.bos_token,
  eos_token: config.eos_token,
});
console.log(result);
```

## 2. Context Management Should be Dynamic

_Qwen-3 utilizes a rolling checkpoint system, intelligently preserving or pruning reasoning blocks to maintain relevant context. Older models discarded reasoning prematurely to save tokens._

Qwen-3 introduces a "**_rolling checkpoint_**" by traversing the message list in reverse to find the latest user turn that wasn’t a tool response. Assistant replies after that index keep their full `<think>` blocks; the reasoning in every earlier reply is stripped out.

**Why this matters**:
- Keeps the active plan visible during a multi‑step tool call.
- Supports nested tool workflows without losing context.
- Saves tokens by pruning thoughts the model no longer needs.
- Prevents "stale" reasoning from bleeding into new tasks.
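
The rolling checkpoint can be sketched in plain JavaScript as follows. This is an approximation rather than the template's actual code: the helper names (`isToolResult`, `lastQueryIndex`, `pruneReasoning`) are hypothetical, and the sketch assumes tool results arrive either as `tool`-role messages or as user messages wrapped in `<tool_response>` tags.

```js
// Hypothetical helper: treats role "tool" messages, or user messages wrapped
// in <tool_response> tags, as tool results rather than real user queries.
function isToolResult(message) {
  if (message.role === "tool") return true;
  const c = message.content;
  return (
    message.role === "user" &&
    typeof c === "string" &&
    c.startsWith("<tool_response>") &&
    c.endsWith("</tool_response>")
  );
}

// Walk the list in reverse to find the latest genuine user turn.
function lastQueryIndex(messages) {
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "user" && !isToolResult(messages[i])) return i;
  }
  return -1;
}

// Keep <think> blocks only in assistant replies after the checkpoint;
// strip them from every earlier assistant reply. Returns a new array.
function pruneReasoning(messages) {
  const checkpoint = lastQueryIndex(messages);
  return messages.map((message, i) => {
    if (message.role !== "assistant" || i > checkpoint) return message;
    if (typeof message.content !== "string") return message;
    const content = message.content.replace(/<think>[\s\S]*?<\/think>\s*/g, "");
    return { ...message, content };
  });
}

const conversation = [
  { role: "user", content: "What's the weather in Paris?" },
  { role: "assistant", content: "<think>old plan</think>\n\nIt is sunny." },
  { role: "user", content: "And in Rome?" },
  { role: "assistant", content: "<think>I need the weather tool.</think>\n\nCalling the tool." },
  { role: "user", content: "<tool_response>{\"temp\": 24}</tool_response>" },
];

console.log(pruneReasoning(conversation));
```

In this example the checkpoint lands on "And in Rome?", so the reasoning in the first assistant reply is dropped while the in-progress tool-calling thought is kept.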

### Example

Here's an example of chain-of-thought preservation through tool calls with Qwen-3 and QwQ.
![image/png](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/qwen-3-chat-template/qwen-chat-output.png)
<p style="text-align:center; font-style:italic; font-size:medium;">
  Check out <a href="https://www.npmjs.com/package/@huggingface/jinja">@huggingface/jinja</a> for testing out the chat templates
</p>

## 3. Tool Arguments Need Better Serialization

Previously, every `tool_call.arguments` field was piped through ` | tojson`, even if it was already a JSON‑encoded string, which risked double‑escaping. Qwen‑3 checks the type first and only serializes when necessary.

```jinja
{# Qwen3 #}
{%- if tool_call.arguments is string %}
    {{- tool_call.arguments }}
{%- else %}
    {{- tool_call.arguments | tojson }}
{%- endif %}
```
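
The double-escaping problem is easy to reproduce outside Jinja. In the sketch below, `JSON.stringify` stands in for the `tojson` filter (an approximation; the two may differ in whitespace), and `serializeArguments` is a hypothetical name for the type check shown above.

```js
// Illustrative only: JSON.stringify plays the role of Jinja's tojson filter.
// Strings are assumed to be JSON already and pass through unchanged.
function serializeArguments(args) {
  return typeof args === "string" ? args : JSON.stringify(args);
}

const alreadyEncoded = '{"city": "Paris"}';
const asObject = { city: "Paris" };

// Naive approach: serializing an already encoded string wraps it in quotes
// and escapes the inner quotes.
console.log(JSON.stringify(alreadyEncoded));

// Type-checked approach: both inputs end up as a single layer of JSON.
console.log(serializeArguments(alreadyEncoded));
console.log(serializeArguments(asObject));
```

With the naive approach, a consumer has to parse the result twice to get the object back, which is the failure mode the type check avoids.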

## 4. There's No Need for a Default System Prompt

Like many models, the Qwen‑2.5 series has a default system prompt:

> You are Qwen, created by Alibaba Cloud. You are a helpful assistant.

This is pretty common, as it helps models respond to user questions like "Who are you?".

Qwen-3 and QwQ ship without this default system prompt. Despite this, the model can still accurately identify its creator if you ask it.
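
The difference can be sketched as a fallback that only kicks in when the conversation has no system message of its own. The helper `withDefaultSystem` is hypothetical and written in plain JavaScript for illustration; it is not taken from either template.

```js
// The Qwen-2.5 default quoted above.
const QWEN25_DEFAULT_SYSTEM =
  "You are Qwen, created by Alibaba Cloud. You are a helpful assistant.";

// Hypothetical helper: prepend a default system message only when the
// conversation does not start with one and a default is configured.
function withDefaultSystem(messages, defaultPrompt) {
  const hasSystem = messages.length > 0 && messages[0].role === "system";
  if (hasSystem || !defaultPrompt) return messages;
  return [{ role: "system", content: defaultPrompt }, ...messages];
}

const question = [{ role: "user", content: "Who are you?" }];

// Qwen-2.5 style: the default is injected.
console.log(withDefaultSystem(question, QWEN25_DEFAULT_SYSTEM));

// Qwen-3 / QwQ style: no default, so the messages are left untouched.
console.log(withDefaultSystem(question, undefined));
```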

## Conclusion

Qwen-3 shows us that through the `chat_template` we can provide better flexibility, smarter context handling, and improved tool interaction. These changes not only improve capabilities but also make agentic workflows more reliable and efficient.
