**Describe the bug**

When using the streaming API with `stream: true` and `stream_options.include_usage: false`, the streamed responses still include a `"usage"` field in each data chunk. This is inconsistent with the documented or expected behavior: setting `include_usage: false` should suppress the `"usage"` field in the streamed response chunks.
**To Reproduce**

Use the `/v1/completions` endpoint with the following payload (or the equivalent in `go-openai`):

```json
{
  "model": "gemma-2",
  "stream": true,
  "stream_options": {
    "include_usage": false
  },
  "prompt": "Write something interesting."
}
```
Observe that each `data:` chunk still includes:

```json
"usage": {
  "prompt_tokens": 0,
  "completion_tokens": 0,
  "total_tokens": 0,
  "prompt_tokens_details": null,
  "completion_tokens_details": null
}
```

Also tested with `stream_options.include_usage: true`; both settings produce `"usage"` in the stream chunks.
**Expected behavior**

If `stream_options.include_usage` is set to `false`, the streamed data chunks should omit the `"usage"` field entirely, or return it as `null`. Including usage despite explicitly disabling it violates the expected contract of the API.
**Screenshots/Logs**

Example log:

```
data: {"id":"cmpl-a16ec15821d444e8a22372900e99cc7e","object":"text_completion","created":1750151236,"model":"gemma-2","choices":[{"text":"writing","index":0,"finish_reason":""}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
```
**Environment (please complete the following information):**

- `go-openai` version: v1.40.1
- Go version: 1.21
- OpenAI API version: v1
- OS: macOS
**Additional context**

This bug is confusing for consumers trying to reduce streamed payload sizes or suppress per-chunk token accounting. It appears to be either a server-side issue or a client-side mishandling of response parsing.