Commit 0ad9cc0

committed
add qwen 3 chat template post
1 parent 5ec837f commit 0ad9cc0

3 files changed

+98
-1
lines changed

_blog.yml

+9-1
@@ -5921,4 +5921,12 @@
59215921
- hub
59225922
- inference
59235923

5924-
5924+
- local: qwen-3-chat-template-deep-dive
5925+
title: "The 4 Things Qwen-3's Chat Template Teaches Us"
5926+
author: cfahlgren1
5927+
thumbnail: /blog/assets/qwen-3-chat-template-deep-dive/thumbnail.png
5928+
date: April 29, 2025
5929+
tags:
5930+
- qwen
5931+
- chat-template
5932+
- llms

qwen-3-chat-template-deep-dive.md

+89
@@ -0,0 +1,89 @@
1+
---
2+
title: "The 4 Things Qwen-3’s Chat Template Teaches Us"
3+
thumbnail: /blog/assets/qwen-3-chat-template-deep-dive/thumbnail.jpg
4+
authors:
5+
- user: cfahlgren1
6+
---
7+
8+
# The 4 Things Qwen-3’s Chat Template Teaches Us
9+
10+
_**What a boring Jinja snippet tells us about the new Qwen-3 model.**_
11+
12+
The new Qwen-3 model by [Qwen](https://huggingface.co/qwen) ships with a much more sophisticated chat template than its predecessors, Qwen-2.5 and QwQ. Comparing the Jinja templates side by side reveals several design decisions behind the new model.
13+
14+
<h2 style="text-align: center; margin-bottom: 0.5rem; font-style: italic;">Chat Templates</h2>
15+
<ul style="display: flex; justify-content: center; list-style: none; padding: 0; margin: 0;">
16+
<li style="margin-right: 1rem;"><a href="https://huggingface.co/Qwen/Qwen3-235B-A22B?chat_template=default">Qwen-3 Chat Template</a></li>
17+
<li style="margin-right: 1rem;"><a href="https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct?chat_template=default">Qwen-2.5 Chat Template</a></li>
18+
<li><a href="https://huggingface.co/Qwen/QwQ-32B?chat_template=default">Qwen-QwQ Chat Template</a></li>
19+
</ul>
20+
21+
## 1. Reasoning doesn't have to be forced
22+
23+
_**and you can do it via a simple prefill...**_
24+
25+
Qwen-3 is unique in its ability to toggle reasoning via the `enable_thinking` flag. When it is set to false, the template inserts an empty `<think></think>` pair, telling the model to skip step‑by‑step thoughts. Earlier reasoning models such as QwQ baked the `<think>` tag into every generation, forcing chain‑of‑thought whether you wanted it or not.
26+
27+
```jinja
{# Qwen-3 #}
{%- if enable_thinking is defined and enable_thinking is false %}
    {{- '<think>\n\n</think>\n\n' }}
{%- endif %}
```
33+
34+
QwQ, for example, forces reasoning in every conversation by opening a `<think>` tag in the generation prompt.
35+
36+
```jinja
{# QwQ #}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n<think>\n' }}
{%- endif %}
```
42+
43+
If `enable_thinking` is true or left unset, the template adds nothing extra, and the model decides for itself whether to think.
44+
_More often than not it will think._
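
To make the difference concrete, here is a minimal Python sketch of the two generation prompts. It only imitates the template branches shown above; the function names are hypothetical, and the `<|im_start|>assistant\n` prefix on the Qwen-3 side is assumed to match the one in the QwQ snippet.

```python
def qwen3_generation_prompt(enable_thinking=None):
    """Imitate the Qwen-3 branch above (illustrative, not the real template)."""
    prompt = "<|im_start|>assistant\n"  # assumed prefix, see lead-in
    # Mirrors: enable_thinking is defined and enable_thinking is false
    if enable_thinking is not None and enable_thinking is False:
        # Prefill an empty reasoning block so the model skips thinking.
        prompt += "<think>\n\n</think>\n\n"
    return prompt


def qwq_generation_prompt():
    """Imitate the QwQ branch above: the think tag is always opened."""
    return "<|im_start|>assistant\n<think>\n"


if __name__ == "__main__":
    print(repr(qwen3_generation_prompt(enable_thinking=False)))
    print(repr(qwen3_generation_prompt(enable_thinking=True)))
    print(repr(qwq_generation_prompt()))
```

Note that leaving the flag unset behaves the same as setting it to true, because the template only reacts to an explicit false.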
45+
46+
## 2. Context Management Should be Dynamic
47+
48+
_Qwen-3 utilizes a rolling checkpoint system, intelligently preserving or pruning reasoning blocks to maintain relevant context. Older models discarded reasoning prematurely to save tokens._
49+
50+
Qwen-3 introduces a "**_rolling checkpoint_**" by traversing the message list in reverse to find the latest user turn that wasn’t a tool echo. For any assistant replies after that index it keeps the full `<think>` blocks; everything earlier is stripped out.
51+
52+
**Why this matters**:
53+
- Keeps the active plan visible during a multi‑step tool call.
54+
- Supports nested tool workflows without losing context.
55+
- Saves tokens by pruning thoughts the model no longer needs.
56+
- Prevents "stale" reasoning from bleeding into new tasks.
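
The rolling checkpoint described above can be sketched in plain Python. This is an approximation under stated assumptions, not the template itself: the helper names are hypothetical, and it assumes tool results arrive as user turns wrapped in `<tool_response>` tags and that reasoning lives inline in the assistant's `content`.

```python
import re

# Matches a reasoning block plus any trailing whitespace.
THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def is_tool_echo(message):
    """Assumption: a tool result is a user turn wrapped in <tool_response> tags."""
    content = message["content"].strip()
    return content.startswith("<tool_response>") and content.endswith("</tool_response>")


def last_query_index(messages):
    """Walk the list in reverse to find the latest real user turn."""
    for i in range(len(messages) - 1, -1, -1):
        message = messages[i]
        if message["role"] == "user" and not is_tool_echo(message):
            return i
    return -1  # no real user turn found; nothing gets pruned


def prune_reasoning(messages):
    """Strip <think> blocks from assistant turns before the checkpoint."""
    checkpoint = last_query_index(messages)
    pruned = []
    for i, message in enumerate(messages):
        if message["role"] == "assistant" and i < checkpoint:
            message = {**message, "content": THINK_RE.sub("", message["content"])}
        pruned.append(message)
    return pruned


history = [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "<think>easy</think>\n\n4"},
    {"role": "user", "content": "What is the weather in Paris?"},
    {"role": "assistant", "content": "<think>need a tool</think>\n\ncalling get_weather"},
    {"role": "user", "content": "<tool_response>sunny</tool_response>"},
]
pruned = prune_reasoning(history)
```

In this example the checkpoint is the weather question, so the first answer loses its reasoning while the in-flight tool call keeps its plan.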
57+
58+
### Example
59+
60+
Here's an example of chain-of-thought preservation through tool calls with Qwen-3 and QwQ.
61+
![image/png](https://cdn-uploads.huggingface.co/production/uploads/648a374f00f7a3374ee64b99/7OWKkRuO9Qc2L48LYjxVf.png)
62+
<p style="text-align:center; font-style:italic; font-size:medium;">
63+
Check out <a href="https://www.npmjs.com/package/@huggingface/jinja">@huggingface/jinja</a> for testing out the chat templates
64+
</p>
65+
66+
## 3. Tool Arguments Need Better Serialization
67+
68+
Before, every `tool_call.arguments` field was piped through ` | tojson`, even if it was already a JSON‑encoded string, which risks double‑escaping. Qwen‑3 checks the type first and only serializes when necessary.
69+
70+
```jinja
{# Qwen3 #}
{%- if tool_call.arguments is string %}
    {{- tool_call.arguments }}
{%- else %}
    {{- tool_call.arguments | tojson }}
{%- endif %}
```
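
The double-escaping problem is easy to reproduce in Python. In this sketch `json.dumps` stands in for the template's `tojson` filter (an assumption; the filter's exact output may differ in details), and `render_arguments` is a hypothetical helper that mirrors the type check above.

```python
import json


def render_arguments(arguments):
    """Serialize only when needed, mirroring the Qwen-3 type check."""
    if isinstance(arguments, str):
        return arguments  # already a JSON string, pass it through untouched
    return json.dumps(arguments)


already_encoded = '{"city": "Paris"}'

# What an unconditional serializer does to a string that is already JSON:
double_encoded = json.dumps(already_encoded)

if __name__ == "__main__":
    print(double_encoded)                       # a quoted, backslash-escaped string
    print(render_arguments(already_encoded))    # unchanged
    print(render_arguments({"city": "Paris"}))  # serialized once
```

Decoding the double-encoded value once yields a string rather than an object, which is exactly the failure a tool-call parser would hit.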
78+
79+
## 4. Default Prompts Should be Optional
80+
81+
Qwen‑2.5 automatically inserted a default Alibaba system prompt:
82+
83+
> You are Qwen, created by Alibaba Cloud. You are a helpful assistant.
84+
85+
Qwen-3 (like QwQ) omits any default system prompt, giving developers full control over the model's persona.
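
As a rough illustration of the two behaviors, the sketch below prepends a default system message only when one is configured and the conversation does not already start with one. The helper `with_default_system` is hypothetical and simplifies what the templates actually do.

```python
QWEN25_DEFAULT = "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."


def with_default_system(messages, default=None):
    """Prepend `default` as a system turn unless the caller supplied their own."""
    if messages and messages[0]["role"] == "system":
        return list(messages)  # caller's system prompt wins
    if default is None:
        return list(messages)  # Qwen-3 / QwQ style: nothing is injected
    return [{"role": "system", "content": default}] + list(messages)


chat = [{"role": "user", "content": "Hi"}]
qwen25_style = with_default_system(chat, default=QWEN25_DEFAULT)
qwen3_style = with_default_system(chat)
```

With no default configured, the conversation is passed through as the developer wrote it.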
86+
87+
## Conclusion
88+
89+
Qwen-3 shows us that the `chat_template` alone can provide better flexibility, smarter context handling, and improved tool interaction. These changes not only improve capabilities, but also make agentic workflows more reliable and efficient.
