-
Notifications
You must be signed in to change notification settings - Fork 238
Allow extra request inputs #552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall! Thanks for working on this. I just left a few questions and suggestions.
Co-authored-by: Hyunjae Woo <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLM input changes look good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🚀
This will allow users to pass in additional inputs beyond those in the model. This is useful for adding support to different backends/endpoints that have different arguments they support without hardcoding each one into GenAi-Perf.
This has been tested for Triton+TRT-LLM, Triton+vLLM, vLLM+OpenAI (both endpoints). If we wanted to drop support for
--streaming
, we could so since that would now be supported via--extra-inputs
(e.g.--extra-inputs stream:True
).When a non-existent input name is given, no error is raised and GenAi-Perf works as if it was not provided. I created a follow-up ticket (TMA-1799) to investigate if PA is not raising an error when input data for non-existent inputs is provided in an input data JSON. That ticket aims to make sure an error is raised in PA and GenAi-Perf for those cases.
Unit testing passes:
