feat: Add responses API #373


Merged
merged 2 commits on Jun 2, 2025

Conversation

samvrlewis
Contributor

@samvrlewis samvrlewis commented May 21, 2025

Adds support for the OpenAI responses API.

Doesn't have support for streaming yet.

Due to types being reexported from the root of the crate, there are a few types with existing names (but different shapes) that I've exported with a `Responses` prefix to avoid ambiguous types, for example `ResponsesFilePath` and `ResponsesRole`. Happy for feedback if there's a different way that would be preferred. (edit: realised later it would probably be cleaner just to not export all the types at the root, as there's too much duplication.)

@twitchax

Cool.

I tried this against an existing project, and I think something might be wrong with the text() option. I get:

Error while handling: invalid_request_error: Missing required parameter: 'text.format.name'. (param: text.format.name) (code: missing_required_parameter)

I am guessing that in chat.rs,

pub enum ResponseFormat {
    /// The type of response format being defined: `text`
    Text,
    /// The type of response format being defined: `json_object`
    JsonObject,
    /// The type of response format being defined: `json_schema`
    JsonSchema {
        json_schema: ResponseFormatJsonSchema,
    },
}

might need to be...

pub enum ResponseFormat {
    /// The type of response format being defined: `text`
    Text,
    /// The type of response format being defined: `json_object`
    JsonObject,
    /// The type of response format being defined: `json_schema`
    JsonSchema(ResponseFormatJsonSchema),
}

So that name, etc. is at the "top-level" of text.format.

@samvrlewis
Contributor Author

Thanks for testing it @twitchax and good catch, I think you're right. I updated it to use a ResponseFormat just for this API rather than reusing the chat one. The example uses a jsonschema response format now and works for me.

fwiw, I hand generated most of this as I couldn't find any generators that produced nice output. So with how complex the API is, I do worry there might be other subtle cases like this. I suppose if there are more issues like this they can be fixed as encountered though.

@twitchax

@samvrlewis, yeah, I agree. Not sure how to fully exercise it, honestly, lol.

@twitchax

@samvrlewis, your changes fixed this issue with parsing the ResponseFormat, and it works for me. Not that my review matters, but LGTM. 👍

@twitchax

twitchax commented May 23, 2025

Actually, getting another issue, which just seems like a serialization problem. When I switch to o3-mini, I get...

ERROR Error while handling: failed to deserialize api response: missing field `status` at line 18 column 4

Looks like there is an extra output type called reasoning that does not have a status field.

2025-05-23T08:25:55.025 app[6830d40a650698] sea [info] {
2025-05-23T08:25:55.025 app[6830d40a650698] sea [info] "id": "rs_683031120ddc8191b416bd3ad56cb6fd052c3d28f22abf09",
2025-05-23T08:25:55.025 app[6830d40a650698] sea [info] "type": "reasoning",
2025-05-23T08:25:55.025 app[6830d40a650698] sea [info] "summary": []
2025-05-23T08:25:55.025 app[6830d40a650698] sea [info] },

@samvrlewis
Contributor Author

Huh, yeah, there is no status.

  "output": [
    {
      "id": "rs_68305c0068848191ab9fa768452d6b090e0209c29f190939",
      "type": "reasoning",
      "summary": []
    },
    {
      "id": "msg_68305c00cac481918b57e1b255671ace0e0209c29f190939",
      "type": "message",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "annotations": [],
          "text": "{\"title\": \"Historic Climate Accord Reached: Nations Unite to Combat Global Warming\", \"website\": \"https://www.globalnews.com\"}"
        }
      ],
      "role": "assistant"
    }
  ],

Unless I'm misunderstanding them, the docs seem to say that it should have one though...

[screenshot of the API docs for the output item's `status` field]

Maybe "populated when items are returned via API" means "not in response to a request"? 🤷‍♂️

In any case, I pushed an update to make the field optional so it doesn't break deserialization. Thanks for the ongoing testing @twitchax!

@twitchax

Yeah, I am guessing it is only present when summary has items?

@twitchax

It looks like ToolDefinition may also need an mcp option, or, at least, that's what it looks like in the MCP docs.

https://platform.openai.com/docs/guides/tools-remote-mcp

@samvrlewis
Contributor Author

It looks like ToolDefinition may also need an mcp option, or, at least, that's what it looks like in the MCP docs.

https://platform.openai.com/docs/guides/tools-remote-mcp

Ah yeah, looks like OpenAI added a bunch of new tools last week: https://platform.openai.com/docs/changelog !

Have updated the PR to include them, and updated the example to use the same MCP example from the above link.

Adds support for the OpenAI responses API
@samvrlewis
Contributor Author

Thought about it some more and I think for the responses types it probably makes less sense to export them all at the root of the types crate, as there's so many duplicated (but subtly different) types and naming becomes confusing. Have instead made everything available through types::responses which I think makes sense, as it seems like OpenAI intends for this API to be a superset of a lot of the previously available APIs.

@twitchax

LGTM after that change. Had to bump a few types, but all my tests still pass, so... 👍?

Lol.

Haven't tried the MCP stuff, but I bet it works. May try it in a few weeks. If you're interested, I'm currently using this for https://github.com/twitchax/triage-bot, which is maybe 70% there? Mostly works, but needs some tweaking.

@samvrlewis
Contributor Author

LGTM after that change. Had to bump a few types, but all my tests still pass, so... 👍?

woohoo, thanks for trying! 🎉

Haven't tried the MCP stuff, but I bet it works.

It seems to work for at least the simple example I have in the code now that uses https://mcp.deepwiki.com/mcp. Hopefully for more complicated cases it works too.

If you're interested, I'm currently using this for https://github.com/twitchax/triage-bot, which is maybe 70% there? Mostly works, but needs some tweaking.

Looks cool! How well does SurrealDB work for retrieving context on demand in that? At work we have a somewhat similar service that tries to associate incidents with recent pull requests but it doesn't work very well, as it's really hard to give it enough context to let it figure things out. Your approach of doing the initial triaging seems a lot more promising. How is it working for you?

@twitchax

Nice.

I haven't used it in a real context yet, but I like SurrealDB. I'd argue that any DB with a full-text search would work fairly well?

I think there is more opportunity to "agentize" some of these things, or get more context into the hands of the LLM. My approach right now uses a two-phase system. First phase (using 4.1) throws a directive, some small context, and the user's query at two separate agents. One of them does a web search, and the second one just determines "search terms" for messages (which are then FTSed on the past messages). Second phase takes the system directive, custom channel directive, any stored channel context, the web search results, the message search results, and the user's query to the assistant agent (using o3). That seems to work fairly well; however, I could see a scenario in which a "loopback" remote MCP might be better. Essentially, allow the LLM to call back to a server housing all of the context / past messages, and give it access to read queries over that data so it can "traverse" / search it however it would like.

Even if my bot-server has access to the same data, I like the idea of just having a separate process run an MCP so I don't have to deal with all of the back-and-forth function calls in the bot code. Looks like Rust is getting some love, so an MCP server may be pretty painless to just drop in.

Definitely some promising results, but I'm trying to push the envelope on what is possible.

In your case, I think that's exactly what remote MCP is for. Instead of gathering a bunch of context that will likely eat up tokens, give o3 the initial context about the incident, and then the remote MCP endpoint for GitHub. As far as I understand, o3 will then decide what sort of calls to make, and it will sort of "throw away" what it doesn't need if it "thinks" it went down the wrong direction.

@twitchax

@samvrlewis, do you know the best way to respond with type function_call_output?

https://platform.openai.com/docs/guides/function-calling?api-mode=responses

Maybe a new enum option needs to be added to Input? It also looks like you're supposed to append the API's tool call message itself, and I'm not sure that's possible either?

Maybe, due to the explosion of options, Input just needs a Custom state that just lets you shove a raw serde_json::Value in there? Just thinking out loud.

There's a lot of possible input items in the responses APIs. Ideally
it'd be nice to have strict types, but for now we can use a custom user
defined json value.
@samvrlewis
Contributor Author

Maybe, due to the explosion of options, Input just needs a Custom state that just lets you shove a raw serde_json::Value in there? Just thinking out loud.

Ooh, yeah, there are a lot of input items that aren't there right now. 😬 Would be nice to have these all strictly typed but for now I've done as you suggested and added a Custom variant.

Added another example that uses the get_weather example, seems to work!

Thanks for the prompts on MCP, btw! Definitely something to explore a bit further when I have some time. I do worry about how well the model would work with a codebase of significant complexity though, if it's needing to navigate around file by file. I haven't had much success using coding tools like cursor against big repos, I usually need to give them a tighter context to get any good output. Though to be fair I'm usually focused on generating code, maybe just reading/understanding code would be easier?

@twitchax

+1, mileage varies a ton for the "agent mode" stuff like Cursor and Copilot Agent. I have the same experience with them as you. Limiting to tests or small refactors appears to work well. Everything else tends to fold, and produce bad architecture.

For my purposes, MCP is going to be a big deal, so I've been poking at it a little more than others might.

Thanks for entertaining my random observations while I poke! Happy to help out at some point, but not certain how you and other maintainers feel about stop-gaps like Custom enum placeholders.

Owner

@64bit 64bit left a comment


Thank you so much @samvrlewis and @twitchax for your contributions!

Thank you for doing all the heavy lifting by hand typing types and testing. Appreciate all the hard work!

Your design choice to nest types inside types::responses is a good one!

@64bit 64bit merged commit c2f3a6c into 64bit:main Jun 2, 2025
ifsheldon pushed a commit to ifsheldon/async-openai-wasm that referenced this pull request Jun 2, 2025
* feat: Add responses API

Adds support for the OpenAI responses API

* feat: Add custom input item

There's a lot of possible input items in the responses APIs. Ideally
it'd be nice to have strict types, but for now we can use a custom user
defined json value.

(cherry picked from commit c2f3a6c)
gilljon added a commit to gilljon/async-openai that referenced this pull request Aug 8, 2025
* fix: readme example link (64bit#347)

Co-authored-by: hzlinyiyu <[email protected]>

* feat: Gemini openai compatibility (64bit#353)

* fix: change id and created fields to Option types in response structs (makes loose deserialization which give advantage to gemini openai compatibility)

* fix: change created field to Option type in ImagesResponse struct for better deserialization

* feat: add example for Gemini OpenAI compatibility with async_openai integration

* fix: rollbacked type changes in async-openai, added more examples using byot features

* Backoff when OpenAI returns 5xx (64bit#354)

* chore: Release

* Implement vector store search, retrieve file content operations (64bit#360)

* Implement vector search api

* Make ids in ListVectorStoreFilesResponse optional, as they can come back null when there are no files

* Implement vector file content api

* Add Default derive to RankingOptions, make CompountFilter.type non-optional

* Made comparison type non-optional

* Make compound filter a Vec of VectorStoreSearchFilter

* Implement from conversions for filters

* Add vector store retrieval example

* Update example readme

* Add attributes to create vector store

* Update examples/vector-store-retrieval/src/main.rs

* Update examples/vector-store-retrieval/src/main.rs

---------

Co-authored-by: Himanshu Neema <[email protected]>

* [Completions API] Add web search options (64bit#370)

* [Completions API] Add web search options

* Update async-openai/src/types/chat.rs

* Update async-openai/src/types/chat.rs

* Update async-openai/src/types/chat.rs

* Update async-openai/src/types/chat.rs

* Update async-openai/src/types/chat.rs

* Update async-openai/src/types/chat.rs

* Update async-openai/src/types/chat.rs

* Update async-openai/src/types/chat.rs

* Update async-openai/src/types/chat.rs

* Update examples/completions-web-search/src/main.rs

* Update examples/completions-web-search/src/main.rs

---------

Co-authored-by: Himanshu Neema <[email protected]>

* Add instructions option to speech request (64bit#374)

* Add instructions field to speech request

* Update async-openai/src/types/audio.rs

* Update openapi.yaml

---------

Co-authored-by: Himanshu Neema <[email protected]>

* feat: Add responses API (64bit#373)

* feat: Add responses API

Adds support for the OpenAI responses API

* feat: Add custom input item

There's a lot of possible input items in the responses APIs. Ideally
it'd be nice to have strict types, but for now we can use a custom user
defined json value.

* chore: update readme; format code (64bit#377)

* add Responses in feature list

* cargo fmt

* chore: Release

* fix web search options; skip serializing if none (64bit#379)

* added copyright material links, Resolves 64bit#346 (64bit#380)

* add completed state (64bit#384)

* feat: adds Default to CompletionUsage (64bit#387)

* add flex service tier to chat completions (64bit#385)

* chore: Release

* Enable dyn dispatch by dyn Config objects (64bit#383)

* enable dynamic dispatch

* update README with dyn dispatch example

* add doc for dyn dispatch

* Update test

Co-authored-by: Himanshu Neema <[email protected]>

* Update Config bound

Co-authored-by: Himanshu Neema <[email protected]>

* remove Rc impl

Co-authored-by: Himanshu Neema <[email protected]>

* Fix typo

Co-authored-by: Himanshu Neema <[email protected]>

* Fix typo

Co-authored-by: Himanshu Neema <[email protected]>

* Update doc

Co-authored-by: Himanshu Neema <[email protected]>

* Update README

Co-authored-by: Himanshu Neema <[email protected]>

---------

Co-authored-by: Himanshu Neema <[email protected]>

* Add missing voice Ballad to enum (64bit#388)

* Add missing voice Ballad to enum

* Update openapi.yaml

* Update openapi.yaml

---------

Co-authored-by: Himanshu Neema <[email protected]>

* feat: enhance realtime response types and audio transcription options (64bit#391)

* feat: enhance realtime response types and audio transcription options

- Added `Cancelled` variant to `ResponseStatusDetail` enum for better handling of cancelled responses.
- Introduced `LogProb` struct to capture log probability information for transcribed tokens.
- Updated `ConversationItemInputAudioTranscriptionCompletedEvent` and `ConversationItemInputAudioTranscriptionDeltaEvent` to include optional `logprobs` for per-token log probability data.
- Enhanced `AudioTranscription` struct with optional fields for `language`, `model`, and `prompt` to improve transcription accuracy and customization.
- Added new `SemanticVAD` option in the `TurnDetection` enum to control model response eagerness.
- Expanded `RealtimeVoice` enum with additional voice options for more variety in audio responses.

* feat: update audio format enum values for consistency

- Changed enum variants for `AudioFormat` to use underscores instead of hyphens in their serialized names.
- Updated `G711ULAW` from `g711-ulaw` to `g711_ulaw` and `G711ALAW` from `g711-alaw` to `g711_alaw` for improved clarity and adherence to naming conventions.

* feat: add auto-response options to VAD configurations

---------

Co-authored-by: Chris Raethke <[email protected]>

* feat: change Prompt integer variants from u16 to u32 for future compatibility (64bit#392)

* task: Add serialize impl for ApiError (64bit#393)

* task: Add serialize impl for ApiError

- Adds the `serde::Serialize` derive macro to the `ApiError` type so
  that this error can be passed along the wire to clients for proxies

* Update async-openai/Cargo.toml

* Update async-openai/Cargo.toml

---------

Co-authored-by: Himanshu Neema <[email protected]>

* refactor: adding missing fields from Responses API (64bit#394)

* remove .mime_str(application/octet-stream) (64bit#395)

* chore: Release

---------

Co-authored-by: Yiyu Lin <[email protected]>
Co-authored-by: hzlinyiyu <[email protected]>
Co-authored-by: DarshanVanol <[email protected]>
Co-authored-by: Tinco Andringa <[email protected]>
Co-authored-by: Himanshu Neema <[email protected]>
Co-authored-by: Christopher Fraser <[email protected]>
Co-authored-by: Adam Benali <[email protected]>
Co-authored-by: Eric Christiansen <[email protected]>
Co-authored-by: Sam Lewis <[email protected]>
Co-authored-by: Spencer Bartholomew <[email protected]>
Co-authored-by: Jens Walter <[email protected]>
Co-authored-by: Paul Hendricks <[email protected]>
Co-authored-by: ifsheldon <[email protected]>
Co-authored-by: Jeff Registre <[email protected]>
Co-authored-by: Chris Raethke <[email protected]>
Co-authored-by: Chris Raethke <[email protected]>
Co-authored-by: Paul Hendricks <[email protected]>
Co-authored-by: Thomas Harmon <[email protected]>