Is your feature request related to a problem? Please describe.
We recently had a case of huge responses from one particular subgraph causing the Router to OOM.
To be exact, the subgraph responded with megabytes of `"errors": []`.
While there are lots of settings to apply traffic shaping or to limit requests from clients, I found no way to configure any limit on subgraph response sizes that would serve as a circuit breaker for such cases.
In essence, this feature request is just another aspect (like `max_depth` and `max_height`) by which the resources of individual requests can be limited.
Describe the solution you'd like
I'd like to be able to set a limit on the size of an individual subgraph response that the router will parse and compile into the client response, in order to cap the maximum memory required per original client request.
This would not necessarily have to be a limit per individual subgraph response; a configurable maximum per request the router processes would also work, so that a few requests cannot fill all of the memory.
Certainly there has to be a log message indicating that requests were dropped / rejected because they exceeded the allowed memory, as there is for all other request limits.
It might also make sense to indicate to the client that their response is larger than the router allows, maybe using HTTP 413 (Content Too Large): https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/413
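To make this concrete, a configuration for such limits could look something like the sketch below. None of the subgraph/response keys exist in the router today; their names and values are made up for illustration, and only `max_depth` / `max_height` refer to existing options.

```yaml
# router.yaml -- hypothetical sketch, not an existing feature
limits:
  # existing operation limits, shown for context (values illustrative)
  max_depth: 100
  max_height: 200
  # hypothetical: reject any single subgraph response larger than this
  subgraph_max_response_bytes: 5000000
  # hypothetical: cap the total bytes buffered from all subgraph responses
  # while assembling one client response
  max_response_bytes_per_request: 20000000
```

On hitting either limit the router would abort the request, log the violation like it does for the other limits, and could answer the client with a 413.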
Describe alternatives you've considered
An alternative would be some sort of overload protection that kicks in when the router's memory usage approaches a certain threshold, similar to the overload manager built into Envoy: https://www.envoyproxy.io/docs/envoy/latest/configuration/operations/overload_manager/overload_manager
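For illustration, an Envoy overload manager that stops accepting new requests when heap usage gets close to a fixed maximum looks roughly like this (simplified from the linked docs; exact fields may differ):

```yaml
# envoy.yaml (excerpt) -- simplified overload manager example
overload_manager:
  refresh_interval: 0.25s
  resource_monitors:
    - name: "envoy.resource_monitors.fixed_heap"
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.resource_monitors.fixed_heap.v3.FixedHeapConfig
        max_heap_size_bytes: 2147483648   # 2 GiB
  actions:
    # stop accepting new requests once heap usage crosses 95% of the maximum
    - name: "envoy.overload_actions.stop_accepting_requests"
      triggers:
        - name: "envoy.resource_monitors.fixed_heap"
          threshold:
            value: 0.95
```

Something analogous in the router could start shedding work once memory crosses a threshold, instead of (or in addition to) a hard per-response limit.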
While this might be beneficial in any case, since lots of small in-flight requests might also cause the router to go OOM, it tackles a different problem than the one we had: a single malfunctioning subgraph was the troublemaker, not the number of concurrent requests per se.
Being unable to limit the memory footprint of handling a single request (out of potentially many concurrent ones) makes it hard to determine the memory the router requires at "full throttle", i.e. with all connections / threads / workers busy. With a per-request cap this becomes a simple upper bound, roughly baseline plus (concurrent requests × per-request limit).
Additional context
There are some related issues and feature requests I found