CGROUP aware resource monitor on memory #38718
Comments
See also https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/bootstrap/v3/bootstrap.proto#config-bootstrap-v3-memoryallocatormanager for a way to configure releasing some memory back to the OS. A cgroup-aware resource monitor sounds like a great enhancement!
I did happen to test that as well; it works nicely. After an idle period the memory gets released.
What I don't understand, though, is how to make a judgement call on the value to set for
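For reference, the knob being discussed lives under the bootstrap `memory_allocator_manager` message linked above. A hedged sketch of what such a configuration might look like (field names taken from the linked bootstrap proto docs; the values are purely illustrative, not recommendations):

```yaml
# Sketch only: asks tcmalloc to release up to the configured number of
# bytes of free memory back to the OS on each interval.
memory_allocator_manager:
  bytes_to_release: 104857600   # illustrative: 100 MiB per release cycle
  memory_release_interval: 10s  # illustrative interval
```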
Opened a PR to add the cgroup memory resource monitor. I'd appreciate it if you could take a look and provide feedback. Thanks. CC @wujiaqi @KBaichoo @botengyao
Title: Add a CGROUP aware resource monitor for memory
Description:
I'm opening this issue to have a preliminary discussion on how to implement this. Someone on my team can do the implementation once we get agreement.
We run an Istio Ingress Gateway today and have the overload manager configured to load-shed on memory utilization thresholds. This is to prevent OOMKills of our pods, especially during high-load events. However, the fixed_heap resource monitor that exists today only reports the memory that tcmalloc believes is allocated. OOMKills are based on what the OS sees, not on what tcmalloc thinks, so it is important to have a monitor that reflects the OS view. It is often the case that fixed_heap is substantially lower than what is reported in cgroups.
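For context, our current setup is along these lines. This is a minimal sketch of a fixed_heap-based overload manager configuration, not our exact config; the heap size, action, and threshold values are illustrative:

```yaml
# Sketch only: shed load when tcmalloc-reported heap usage crosses a
# threshold. Values below are illustrative.
overload_manager:
  refresh_interval: 0.25s
  resource_monitors:
    - name: "envoy.resource_monitors.fixed_heap"
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.resource_monitors.fixed_heap.v3.FixedHeapConfig
        max_heap_size_bytes: 2147483648  # illustrative: 2 GiB
  actions:
    - name: "envoy.overload_actions.stop_accepting_requests"
      triggers:
        - name: "envoy.resource_monitors.fixed_heap"
          threshold:
            value: 0.95
```

The problem described above is that the `fixed_heap` pressure this computes can sit well below what the kernel (and the OOM killer) actually sees.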
Below is an experiment I conducted to demonstrate the discrepancy:

- During load: [Docker stats screenshot] vs. [Envoy metric screenshot]
- After load: [Docker stats screenshot] vs. [Envoy metric screenshot]

As you can see, heap pressure is much lower than the OS-reported memory consumption.
I am proposing to add a new resource monitor for memory based on cgroups rather than tcmalloc stats. Since systems are currently in transition (some on cgroups v1, some on cgroups v2, and some in hybrid mode), it would be worth abstracting this detail away in the configuration to just "cgroups enabled". During object construction we can detect whether the system is on cgroups v1 or v2, for example by checking the filesystem for the presence of the hierarchies:

- If the following files are present, the system is on cgroups v2:
  - `/sys/fs/cgroup/memory.max`
  - `/sys/fs/cgroup/memory.current`
- Else, if the following directory exists, the system is on cgroups v1:
  - `/sys/fs/cgroup/memory`

We will pick the highest available cgroups implementation on the system during construction.
Appreciate the feedback, thanks.
cc @ramaraochavali