
Add support for SageMaker Inference Components in sagemaker chat #10603


Open
wants to merge 31 commits into main

Conversation

bobbywlindsey (Contributor)

Title

Add support for SageMaker Inference Components in sagemaker chat

Relevant issues

Fixes #9909

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement; see details)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible; it only solves 1 specific problem

Type

🆕 New Feature
🐛 Bug Fix

Changes

If model_id is passed as a parameter to completion(model="sagemaker_chat/*", ...) calls:

  • Include an additional request header that enables Inference Components for SageMaker endpoints using the Messages API, using model_id as the Inference Component name (see the sketch below)
  • Remove the model key and its value from the request body
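
For illustration, here is a minimal sketch of the transformation those two bullets describe. This is not the PR's actual code: the helper name and signature are hypothetical, and the header name X-Amzn-SageMaker-Inference-Component is assumed based on the SageMaker InvokeEndpoint API's header for Inference Component routing.

```python
from typing import Optional, Tuple


def route_to_inference_component(
    headers: dict,
    request_body: dict,
    model_id: Optional[str] = None,
) -> Tuple[dict, dict]:
    """Hypothetical helper mirroring the change described above.

    When model_id is provided, route the request to a SageMaker
    Inference Component via a request header, and drop the 'model'
    key from the body so the component selection is not contradicted
    by the payload.
    """
    if model_id is not None:
        # Assumed header name for Inference Component routing.
        headers["X-Amzn-SageMaker-Inference-Component"] = model_id
        # Per the change above, remove the 'model' key from the body.
        request_body.pop("model", None)
    return headers, request_body
```

From the caller's side, the feature would be exercised roughly like this (the endpoint and component names are placeholders):

```python
import litellm

response = litellm.completion(
    model="sagemaker_chat/my-endpoint",   # placeholder endpoint name
    model_id="my-inference-component",    # placeholder Inference Component
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```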

vercel bot commented May 6, 2025

The latest updates on your projects:

litellm: ✅ Ready (updated Jun 11, 2025, 8:42pm UTC)

@CLAassistant commented May 6, 2025

CLA assistant check: all committers have signed the CLA.

@krrishdholakia (Contributor)

@bobbywlindsey sagemaker_chat no longer uses the openai_like/ flow; it's been migrated to our common base_llm_http_handler.py.

Can you please update your PR to reflect the change?

You should just need to make your mods here:

Called if Sagemaker endpoint supports HF Messages API.

@bobbywlindsey (Contributor, Author)

@krrishdholakia Nice refactor! I put my changes where you suggested and it seems to work nicely 👍🏻 Could you review again? Thanks!

@krrishdholakia (Contributor)

Hey @bobbywlindsey, could you please add some unit tests inside tests/litellm?

@bobbywlindsey (Contributor, Author)

Hey @ishaan-jaff, any chance you could take a look at this PR? Thanks!

@dgallitelli

@krrishdholakia I see you're the reviewer for this PR. All the checks have passed and there are no conflicts with the base branch. Can we merge? Thank you!

Development

Successfully merging this pull request may close these issues:

[Bug]: sagemaker_chat provider does not correctly pass the model_id for the SageMaker Inference Component (#9909)