async generate_content is very slow #557

Open

ascillitoe opened this issue Mar 22, 2025 · 7 comments
Labels

priority: p2 (Moderately-important priority. Fix may not be included in next release.)
type: feature request ('Nice-to-have' improvement, new feature or different behavior or design.)

Comments

@ascillitoe

The async performance of the new SDK still seems to be much worse than the old SDK (with transport='grpc_asyncio').

We can do 1000 text classifications in ~5s with the old client, but this consistently takes over 30s with the new client.

Is this a known issue? Or are there settings that need to be configured with the new client? (e.g. we found the transport option was very important with the previous client).

Environment details

  • Programming language: Python
  • OS: Ubuntu 22.04.5 LTS
  • Language runtime version: 3.10.12
  • Package version: 1.7.0

Steps to reproduce

  1. Run N basic text completions async with the new google.genai.client.aio.models.generate_content
  2. Compare with the old google.generativeai generate_content_async (a minimal benchmark sketch follows below)
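
For reference, a minimal benchmark sketch along these lines (model name, prompt, and API-key handling are illustrative, not taken from the original report):

```python
import asyncio
import os
import time

import google.generativeai as legacy_genai  # old SDK (google-generativeai)
from google import genai                    # new SDK (google-genai)

API_KEY = os.environ["GOOGLE_API_KEY"]
N = 1000                                    # number of concurrent requests
MODEL = "gemini-2.0-flash"                  # illustrative model name
PROMPT = "Classify the sentiment of: 'I love this product.'"

async def run_new_sdk() -> float:
    """Fire N concurrent requests through google-genai's async surface."""
    client = genai.Client(api_key=API_KEY)
    start = time.perf_counter()
    await asyncio.gather(
        *(client.aio.models.generate_content(model=MODEL, contents=PROMPT) for _ in range(N))
    )
    return time.perf_counter() - start

async def run_old_sdk() -> float:
    """Same workload through the legacy SDK with the gRPC asyncio transport."""
    legacy_genai.configure(api_key=API_KEY, transport="grpc_asyncio")
    model = legacy_genai.GenerativeModel(MODEL)
    start = time.perf_counter()
    await asyncio.gather(*(model.generate_content_async(PROMPT) for _ in range(N)))
    return time.perf_counter() - start

async def main():
    print(f"new SDK: {await run_new_sdk():.1f}s for {N} requests")
    print(f"old SDK: {await run_old_sdk():.1f}s for {N} requests")

asyncio.run(main())
```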
@ascillitoe ascillitoe added the labels priority: p2 (Moderately-important priority. Fix may not be included in next release.) and type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.) on Mar 22, 2025
@andrew-stelmach

Compare the runtimes before and after the call to generate_content. LLM response speeds are stochastic, since every provider changes how many GPUs are being used, etc.
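
One way to separate provider-side variance from client-side overhead is to record per-request latencies alongside the total wall time. A rough sketch, assuming the new SDK and a placeholder model name:

```python
import asyncio
import os
import statistics
import time

from google import genai

client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])
MODEL = "gemini-2.0-flash"  # illustrative

async def timed_call(prompt: str) -> float:
    """Time a single generate_content call so per-request latency can be inspected."""
    start = time.perf_counter()
    await client.aio.models.generate_content(model=MODEL, contents=prompt)
    return time.perf_counter() - start

async def main(n: int = 100):
    start = time.perf_counter()
    latencies = await asyncio.gather(*(timed_call("ping") for _ in range(n)))
    total = time.perf_counter() - start
    # If the median per-request latency stays flat while the total wall time grows
    # much faster than expected, the overhead is in the client, not the model.
    print(f"total {total:.1f}s, median per-request {statistics.median(latencies):.2f}s")

asyncio.run(main())
```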

@yinghsienwu
Contributor

Note that google.genai uses the REST transport. In general, gRPC is faster than REST. We'll benchmark the runtime performance and see how to improve it. Thanks for raising this.

@ascillitoe
Author

Hi @yinghsienwu, thanks for the response. Are there no plans to add support for gRPC AsyncIO like the old SDK? This seems like a pretty big regression?

@yinghsienwu yinghsienwu added the label type: feature request ('Nice-to-have' improvement, new feature or different behavior or design.) and removed the label type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.) on Apr 1, 2025
@hugbubby

Same problem here. It's an extremely slow interface. We're getting 100% cpu usage with only a handful of concurrent requests, which is practically unusable for our purposes.

@yinghsienwu
Contributor

yinghsienwu commented Apr 15, 2025

I compared the Vertex SDK (google-cloud-aiplatform) with (#1) its default grpc_asyncio transport and (#2) its rest_asyncio transport against (#3) the google-genai SDK v1.9 (REST over httpx) and (#4) an aiohttp prototype, sending 100, 500, and 1000 async generateContent requests.

  1. In the Vertex SDK, the rest_asyncio and grpc_asyncio transports perform similarly (within the standard deviation), so gRPC itself should not be the key to better runtime performance.
  2. The current google-genai SDK (httpx) runtime is ~6x the grpc_asyncio runtime when sending 1000 async requests (the same as the observation above, async generate_content is very slow #557 (comment)).
  3. To improve runtime performance, using aiohttp in the google-genai SDK's AsyncClient implementation may achieve performance similar to the Vertex SDK's rest_asyncio (a rough sketch of such a prototype follows below).

We'll try to put it into our roadmap.
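
For illustration only, this is roughly the kind of aiohttp-based prototype described in point 4. It is not the SDK's implementation; it assumes the public Gemini REST endpoint and an API key in GOOGLE_API_KEY, with an illustrative model name:

```python
import asyncio
import os

import aiohttp  # the transport proposed above, not yet wired into google-genai here

API_KEY = os.environ["GOOGLE_API_KEY"]
MODEL = "gemini-2.0-flash"  # illustrative
URL = f"https://generativelanguage.googleapis.com/v1beta/models/{MODEL}:generateContent"

async def generate(session: aiohttp.ClientSession, prompt: str) -> dict:
    """POST a single generateContent request over a shared, pooled connection."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    async with session.post(URL, json=body, headers={"x-goog-api-key": API_KEY}) as resp:
        resp.raise_for_status()
        return await resp.json()

async def main(n: int = 100):
    # A single shared ClientSession keeps connections pooled, which is typically
    # where most of the gap against a naive per-request HTTP client comes from.
    async with aiohttp.ClientSession() as session:
        results = await asyncio.gather(*(generate(session, "ping") for _ in range(n)))
    print(len(results), "responses")

asyncio.run(main())
```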

@ascillitoe
Author

Thanks for confirming @yinghsienwu! Do you think this situation might be improved by the time the old SDK reaches end of life on Aug 31st?

@yinghsienwu
Contributor

I think it's likely to be available in Q2 2025. I'll attach a PR here.
