Commit f50e962
feat(api): add token logprobs to chat completions (#980)
1 parent 215476a · commit f50e962

14 files changed: +255 -61 lines

api.md
+1

@@ -38,6 +38,7 @@ from openai.types.chat import (
     ChatCompletionNamedToolChoice,
     ChatCompletionRole,
     ChatCompletionSystemMessageParam,
+    ChatCompletionTokenLogprob,
     ChatCompletionTool,
     ChatCompletionToolChoiceOption,
     ChatCompletionToolMessageParam,

src/openai/resources/chat/completions.py
+104 -18

Large diffs are not rendered by default.
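
The chat completions diff itself is collapsed here, but the new request parameters and response fields are visible in the type changes further down. A minimal, hedged sketch of how they might be used from this SDK; the model name, prompt, and client setup are illustrative assumptions, not part of the commit:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    completion = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",  # placeholder model
        messages=[{"role": "user", "content": "Say hello"}],
        logprobs=True,    # new: return a logprob for each output token
        top_logprobs=3,   # new: also return up to 3 alternatives per token position
    )

    choice = completion.choices[0]
    if choice.logprobs is not None and choice.logprobs.content is not None:
        for token_logprob in choice.logprobs.content:
            print(token_logprob.token, token_logprob.logprob)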

src/openai/resources/completions.py
+36 -30

@@ -119,14 +119,15 @@ def create(
       As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
       from being generated.

-      logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
-          chosen tokens. For example, if `logprobs` is 5, the API will return a list of
-          the 5 most likely tokens. The API will always return the `logprob` of the
-          sampled token, so there may be up to `logprobs+1` elements in the response.
+      logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
+          well the chosen tokens. For example, if `logprobs` is 5, the API will return a
+          list of the 5 most likely tokens. The API will always return the `logprob` of
+          the sampled token, so there may be up to `logprobs+1` elements in the response.

       The maximum value for `logprobs` is 5.

-      max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
+      max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
+          completion.

       The token count of your prompt plus `max_tokens` cannot exceed the model's
       context length.
@@ -288,14 +289,15 @@ def create(
       As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
       from being generated.

-      logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
-          chosen tokens. For example, if `logprobs` is 5, the API will return a list of
-          the 5 most likely tokens. The API will always return the `logprob` of the
-          sampled token, so there may be up to `logprobs+1` elements in the response.
+      logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
+          well the chosen tokens. For example, if `logprobs` is 5, the API will return a
+          list of the 5 most likely tokens. The API will always return the `logprob` of
+          the sampled token, so there may be up to `logprobs+1` elements in the response.

       The maximum value for `logprobs` is 5.

-      max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
+      max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
+          completion.

       The token count of your prompt plus `max_tokens` cannot exceed the model's
       context length.
@@ -450,14 +452,15 @@ def create(
       As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
       from being generated.

-      logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
-          chosen tokens. For example, if `logprobs` is 5, the API will return a list of
-          the 5 most likely tokens. The API will always return the `logprob` of the
-          sampled token, so there may be up to `logprobs+1` elements in the response.
+      logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
+          well the chosen tokens. For example, if `logprobs` is 5, the API will return a
+          list of the 5 most likely tokens. The API will always return the `logprob` of
+          the sampled token, so there may be up to `logprobs+1` elements in the response.

       The maximum value for `logprobs` is 5.

-      max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
+      max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
+          completion.

       The token count of your prompt plus `max_tokens` cannot exceed the model's
       context length.
@@ -687,14 +690,15 @@ async def create(
       As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
       from being generated.

-      logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
-          chosen tokens. For example, if `logprobs` is 5, the API will return a list of
-          the 5 most likely tokens. The API will always return the `logprob` of the
-          sampled token, so there may be up to `logprobs+1` elements in the response.
+      logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
+          well the chosen tokens. For example, if `logprobs` is 5, the API will return a
+          list of the 5 most likely tokens. The API will always return the `logprob` of
+          the sampled token, so there may be up to `logprobs+1` elements in the response.

       The maximum value for `logprobs` is 5.

-      max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
+      max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
+          completion.

       The token count of your prompt plus `max_tokens` cannot exceed the model's
       context length.
@@ -856,14 +860,15 @@ async def create(
       As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
       from being generated.

-      logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
-          chosen tokens. For example, if `logprobs` is 5, the API will return a list of
-          the 5 most likely tokens. The API will always return the `logprob` of the
-          sampled token, so there may be up to `logprobs+1` elements in the response.
+      logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
+          well the chosen tokens. For example, if `logprobs` is 5, the API will return a
+          list of the 5 most likely tokens. The API will always return the `logprob` of
+          the sampled token, so there may be up to `logprobs+1` elements in the response.

       The maximum value for `logprobs` is 5.

-      max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
+      max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
+          completion.

       The token count of your prompt plus `max_tokens` cannot exceed the model's
       context length.
@@ -1018,14 +1023,15 @@ async def create(
       As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
       from being generated.

-      logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
-          chosen tokens. For example, if `logprobs` is 5, the API will return a list of
-          the 5 most likely tokens. The API will always return the `logprob` of the
-          sampled token, so there may be up to `logprobs+1` elements in the response.
+      logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
+          well the chosen tokens. For example, if `logprobs` is 5, the API will return a
+          list of the 5 most likely tokens. The API will always return the `logprob` of
+          the sampled token, so there may be up to `logprobs+1` elements in the response.

       The maximum value for `logprobs` is 5.

-      max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
+      max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
+          completion.

       The token count of your prompt plus `max_tokens` cannot exceed the model's
       context length.
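
For contrast with the new boolean flag on chat completions, the legacy completions endpoint documented above keeps `logprobs` as an integer (maximum 5). A hedged usage sketch, with the model and prompt as placeholders:

    from openai import OpenAI

    client = OpenAI()

    completion = client.completions.create(
        model="gpt-3.5-turbo-instruct",  # placeholder model
        prompt="Say hello",
        max_tokens=16,
        logprobs=5,  # integer: return the 5 most likely tokens at each position
    )
    print(completion.choices[0].text)
    print(completion.choices[0].logprobs)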

src/openai/resources/files.py
+4 -2

@@ -51,7 +51,8 @@ def create(
       The size of all the
       files uploaded by one organization can be up to 100 GB.

-      The size of individual files can be a maximum of 512 MB. See the
+      The size of individual files can be a maximum of 512 MB or 2 million tokens for
+      Assistants. See the
       [Assistants Tools guide](https://platform.openai.com/docs/assistants/tools) to
       learn more about the types of files supported. The Fine-tuning API only supports
       `.jsonl` files.
@@ -314,7 +315,8 @@ async def create(
       The size of all the
       files uploaded by one organization can be up to 100 GB.

-      The size of individual files can be a maximum of 512 MB. See the
+      The size of individual files can be a maximum of 512 MB or 2 million tokens for
+      Assistants. See the
       [Assistants Tools guide](https://platform.openai.com/docs/assistants/tools) to
       learn more about the types of files supported. The Fine-tuning API only supports
       `.jsonl` files.
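
A small usage sketch of the Files API whose size limits the docstring above describes; the file path and purpose value are illustrative assumptions:

    from openai import OpenAI

    client = OpenAI()

    with open("knowledge.jsonl", "rb") as fh:  # hypothetical file, capped at 512 MB / 2M tokens for Assistants
        uploaded = client.files.create(file=fh, purpose="assistants")
    print(uploaded.id)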

src/openai/types/beta/threads/runs/message_creation_step_details.py
+1 -1

@@ -16,4 +16,4 @@ class MessageCreationStepDetails(BaseModel):
     message_creation: MessageCreation

     type: Literal["message_creation"]
-    """Always `message_creation``."""
+    """Always `message_creation`."""

src/openai/types/beta/threads/runs/run_step.py
+1 -1

@@ -66,7 +66,7 @@ class RunStep(BaseModel):
     """

     object: Literal["thread.run.step"]
-    """The object type, which is always `thread.run.step``."""
+    """The object type, which is always `thread.run.step`."""

     run_id: str
     """

src/openai/types/chat/__init__.py
+3

@@ -13,6 +13,9 @@
 from .chat_completion_message_param import (
     ChatCompletionMessageParam as ChatCompletionMessageParam,
 )
+from .chat_completion_token_logprob import (
+    ChatCompletionTokenLogprob as ChatCompletionTokenLogprob,
+)
 from .chat_completion_message_tool_call import (
     ChatCompletionMessageToolCall as ChatCompletionMessageToolCall,
 )

src/openai/types/chat/chat_completion.py
+10 -1

@@ -6,8 +6,14 @@
 from ..._models import BaseModel
 from ..completion_usage import CompletionUsage
 from .chat_completion_message import ChatCompletionMessage
+from .chat_completion_token_logprob import ChatCompletionTokenLogprob

-__all__ = ["ChatCompletion", "Choice"]
+__all__ = ["ChatCompletion", "Choice", "ChoiceLogprobs"]
+
+
+class ChoiceLogprobs(BaseModel):
+    content: Optional[List[ChatCompletionTokenLogprob]]
+    """A list of message content tokens with log probability information."""


 class Choice(BaseModel):
@@ -24,6 +30,9 @@ class Choice(BaseModel):
     index: int
     """The index of the choice in the list of choices."""

+    logprobs: Optional[ChoiceLogprobs]
+    """Log probability information for the choice."""
+
     message: ChatCompletionMessage
     """A chat completion message generated by the model."""
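
Reading the new field on a non-streaming response might look like the following sketch, assuming the request was made with `logprobs=True` and a `top_logprobs` value as in the earlier example:

    logprobs = completion.choices[0].logprobs
    if logprobs is not None and logprobs.content is not None:
        for position in logprobs.content:
            alternatives = {alt.token: alt.logprob for alt in position.top_logprobs}
            print(f"{position.token!r}: {position.logprob:.3f} alternatives={alternatives}")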

src/openai/types/chat/chat_completion_chunk.py
+10

@@ -4,6 +4,7 @@
 from typing_extensions import Literal

 from ..._models import BaseModel
+from .chat_completion_token_logprob import ChatCompletionTokenLogprob

 __all__ = [
     "ChatCompletionChunk",
@@ -12,6 +13,7 @@
     "ChoiceDeltaFunctionCall",
     "ChoiceDeltaToolCall",
     "ChoiceDeltaToolCallFunction",
+    "ChoiceLogprobs",
 ]

@@ -70,6 +72,11 @@ class ChoiceDelta(BaseModel):
     tool_calls: Optional[List[ChoiceDeltaToolCall]] = None


+class ChoiceLogprobs(BaseModel):
+    content: Optional[List[ChatCompletionTokenLogprob]]
+    """A list of message content tokens with log probability information."""
+
+
 class Choice(BaseModel):
     delta: ChoiceDelta
     """A chat completion delta generated by streamed model responses."""
@@ -87,6 +94,9 @@ class Choice(BaseModel):
     index: int
     """The index of the choice in the list of choices."""

+    logprobs: Optional[ChoiceLogprobs] = None
+    """Log probability information for the choice."""
+

 class ChatCompletionChunk(BaseModel):
     id: str
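
When streaming, each chunk's choice carries the same optional `logprobs` shape. A hedged sketch, reusing the `client` from the first example with a placeholder model and prompt:

    stream = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",  # placeholder model
        messages=[{"role": "user", "content": "Say hello"}],
        logprobs=True,
        stream=True,
    )
    for chunk in stream:
        for choice in chunk.choices:
            if choice.logprobs is not None and choice.logprobs.content is not None:
                for token_logprob in choice.logprobs.content:
                    print(token_logprob.token, token_logprob.logprob)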

src/openai/types/chat/chat_completion_function_message_param.py
+2 -1

@@ -2,13 +2,14 @@

 from __future__ import annotations

+from typing import Optional
 from typing_extensions import Literal, Required, TypedDict

 __all__ = ["ChatCompletionFunctionMessageParam"]


 class ChatCompletionFunctionMessageParam(TypedDict, total=False):
-    content: Required[str]
+    content: Required[Optional[str]]
     """The contents of the function message."""

     name: Required[str]

src/openai/types/chat/chat_completion_token_logprob.py (new file)
+47

@@ -0,0 +1,47 @@
+# File generated from our OpenAPI spec by Stainless.
+
+from typing import List, Optional
+
+from ..._models import BaseModel
+
+__all__ = ["ChatCompletionTokenLogprob", "TopLogprob"]
+
+
+class TopLogprob(BaseModel):
+    token: str
+    """The token."""
+
+    bytes: Optional[List[int]]
+    """A list of integers representing the UTF-8 bytes representation of the token.
+
+    Useful in instances where characters are represented by multiple tokens and
+    their byte representations must be combined to generate the correct text
+    representation. Can be `null` if there is no bytes representation for the token.
+    """
+
+    logprob: float
+    """The log probability of this token."""
+
+
+class ChatCompletionTokenLogprob(BaseModel):
+    token: str
+    """The token."""
+
+    bytes: Optional[List[int]]
+    """A list of integers representing the UTF-8 bytes representation of the token.
+
+    Useful in instances where characters are represented by multiple tokens and
+    their byte representations must be combined to generate the correct text
+    representation. Can be `null` if there is no bytes representation for the token.
+    """
+
+    logprob: float
+    """The log probability of this token."""
+
+    top_logprobs: List[TopLogprob]
+    """List of the most likely tokens and their log probability, at this token
+    position.
+
+    In rare cases, there may be fewer than the number of requested `top_logprobs`
+    returned.
+    """

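As the docstrings above note, a character can span several tokens, so its text may need to be recovered by concatenating the per-token `bytes` lists before decoding. An illustrative helper, not part of the commit:

    from typing import List, Optional

    from openai.types.chat import ChatCompletionTokenLogprob


    def rebuild_text(tokens: List[ChatCompletionTokenLogprob]) -> Optional[str]:
        """Concatenate each token's UTF-8 bytes and decode them back into text."""
        raw: List[int] = []
        for token in tokens:
            if token.bytes is None:  # some tokens have no byte representation
                return None
            raw.extend(token.bytes)
        return bytes(raw).decode("utf-8")
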
src/openai/types/chat/completion_create_params.py
+21 -2

@@ -78,7 +78,7 @@ class CompletionCreateParamsBase(TypedDict, total=False):
     particular function via `{"name": "my_function"}` forces the model to call that
     function.

-    `none` is the default when no functions are present. `auto`` is the default if
+    `none` is the default when no functions are present. `auto` is the default if
     functions are present.
     """

@@ -99,8 +99,18 @@ class CompletionCreateParamsBase(TypedDict, total=False):
     or exclusive selection of the relevant token.
     """

+    logprobs: Optional[bool]
+    """Whether to return log probabilities of the output tokens or not.
+
+    If true, returns the log probabilities of each output token returned in the
+    `content` of `message`. This option is currently not available on the
+    `gpt-4-vision-preview` model.
+    """
+
     max_tokens: Optional[int]
-    """The maximum number of [tokens](/tokenizer) to generate in the chat completion.
+    """
+    The maximum number of [tokens](/tokenizer) that can be generated in the chat
+    completion.

     The total length of input tokens and generated tokens is limited by the model's
     context length.
@@ -127,6 +137,8 @@ class CompletionCreateParamsBase(TypedDict, total=False):
     response_format: ResponseFormat
     """An object specifying the format that the model must output.

+    Compatible with `gpt-4-1106-preview` and `gpt-3.5-turbo-1106`.
+
     Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the
     message the model generates is valid JSON.
@@ -180,6 +192,13 @@ class CompletionCreateParamsBase(TypedDict, total=False):
     functions the model may generate JSON inputs for.
     """

+    top_logprobs: Optional[int]
+    """
+    An integer between 0 and 5 specifying the number of most likely tokens to return
+    at each token position, each with an associated log probability. `logprobs` must
+    be set to `true` if this parameter is used.
+    """
+
     top_p: Optional[float]
     """
     An alternative to sampling with temperature, called nucleus sampling, where the
An alternative to sampling with temperature, called nucleus sampling, where the

src/openai/types/completion_create_params.py
+7 -5

@@ -88,16 +88,18 @@ class CompletionCreateParamsBase(TypedDict, total=False):

     logprobs: Optional[int]
     """
-    Include the log probabilities on the `logprobs` most likely tokens, as well the
-    chosen tokens. For example, if `logprobs` is 5, the API will return a list of
-    the 5 most likely tokens. The API will always return the `logprob` of the
-    sampled token, so there may be up to `logprobs+1` elements in the response.
+    Include the log probabilities on the `logprobs` most likely output tokens, as
+    well the chosen tokens. For example, if `logprobs` is 5, the API will return a
+    list of the 5 most likely tokens. The API will always return the `logprob` of
+    the sampled token, so there may be up to `logprobs+1` elements in the response.

     The maximum value for `logprobs` is 5.
     """

     max_tokens: Optional[int]
-    """The maximum number of [tokens](/tokenizer) to generate in the completion.
+    """
+    The maximum number of [tokens](/tokenizer) that can be generated in the
+    completion.

     The token count of your prompt plus `max_tokens` cannot exceed the model's
     context length.
