async generate_content is very slow #557
Comments
Compare the runtimes before and after the call to
Note that
Hi @yinghsienwu, thanks for the response. Are there no plans to add support for gRPC AsyncIO like the old SDK? This seems like a pretty big regression?
Same problem here. It's an extremely slow interface. We're getting 100% CPU usage with only a handful of concurrent requests, which is practically unusable for our purposes.
I compared the following clients for async generateContent requests (100, 500, and 1000 concurrent requests):
1. Vertex SDK (google-cloud-aiplatform) with the default grpc_asyncio transport
2. Vertex SDK (google-cloud-aiplatform) with the rest_asyncio transport
3. google-genai SDK v1.9 (rest, httpx)
4. an aiohttp prototype
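For context, a rough sketch of how cases #1 and #3 might be timed is shown below. This is not the exact benchmark behind the numbers above; the project, location, model name, and prompt are placeholders, and it only illustrates the pattern of firing N concurrent async generateContent calls and timing them.

```python
# Sketch only: times N concurrent async generateContent calls against
# case #1 (Vertex SDK, default grpc_asyncio transport) and case #3
# (google-genai SDK). Project, location, model name, and prompt are placeholders.
import asyncio
import time

import vertexai
from vertexai.generative_models import GenerativeModel  # google-cloud-aiplatform
from google import genai                                # google-genai


async def time_requests(n: int, make_request) -> float:
    """Fire n concurrent requests and return the wall-clock time in seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(make_request() for _ in range(n)))
    return time.perf_counter() - start


async def main() -> None:
    prompt = "Say hello."
    vertexai.init(project="my-project", location="us-central1")  # placeholders
    vertex_model = GenerativeModel("gemini-1.5-flash")            # placeholder model

    genai_client = genai.Client(
        vertexai=True, project="my-project", location="us-central1"
    )

    for n in (100, 500, 1000):
        t_vertex = await time_requests(
            n, lambda: vertex_model.generate_content_async(prompt)
        )
        t_genai = await time_requests(
            n,
            lambda: genai_client.aio.models.generate_content(
                model="gemini-1.5-flash", contents=prompt
            ),
        )
        print(f"n={n}: vertex grpc_asyncio={t_vertex:.1f}s, google-genai={t_genai:.1f}s")


if __name__ == "__main__":
    asyncio.run(main())
```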
We'll try to put it into our roadmap.
Thanks for confirming @yinghsienwu! Do you think this situation might be improved by the time the old SDK reaches end of life on Aug 31st?
I think it's likely to be available in Q2 2025. I'll attach a PR here.
The async performance of the new SDK still seems to be much worse than the old SDK (with transport='grpc_asyncio'). We can do 1000 text classifications in ~5s with the old client, but this consistently takes over 30s with the new client.
Is this a known issue? Or are there certain settings that must be configured with the new client? (e.g. we found the transport option was very important with the previous client.)
Environment details
Steps to reproduce
Compare the runtime of google.genai.client.aio.models.generate_content (new SDK) with the runtime of google.generativeai's generate_content_async (old SDK).
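A minimal sketch of that comparison, assuming credentials are configured via the environment (e.g. GOOGLE_API_KEY), might look as follows; the model name, prompt, and request count are placeholders rather than the actual classification workload.

```python
# Sketch only: compares the old google.generativeai async path (with the
# transport option mentioned above) against the new google-genai async path.
# Model name, prompt, and request count are placeholders.
import asyncio
import time

import google.generativeai as legacy_genai  # old SDK (google-generativeai)
from google import genai                    # new SDK (google-genai)


async def time_old_sdk(n: int, prompt: str) -> float:
    # transport option as reported above; only the async client is used here
    legacy_genai.configure(transport="grpc_asyncio")
    model = legacy_genai.GenerativeModel("gemini-1.5-flash")  # placeholder model
    start = time.perf_counter()
    await asyncio.gather(*(model.generate_content_async(prompt) for _ in range(n)))
    return time.perf_counter() - start


async def time_new_sdk(n: int, prompt: str) -> float:
    client = genai.Client()  # assumes API key / project config from the environment
    start = time.perf_counter()
    await asyncio.gather(
        *(
            client.aio.models.generate_content(
                model="gemini-1.5-flash", contents=prompt  # placeholder model
            )
            for _ in range(n)
        )
    )
    return time.perf_counter() - start


async def main() -> None:
    n, prompt = 1000, "Classify the sentiment of: 'great service'"
    print(f"old SDK: {await time_old_sdk(n, prompt):.1f}s for {n} requests")
    print(f"new SDK: {await time_new_sdk(n, prompt):.1f}s for {n} requests")


if __name__ == "__main__":
    asyncio.run(main())
```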