25.2.0 memory leak #1073
@jonanon20 do you use the /reload API?
Yes, that's right, we use the /reload API.
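For reference, here is a minimal sketch of calling that reload endpoint over HTTP, assuming Web3Signer's REST API is reachable on localhost:9000 (its default port); the class name is hypothetical, and the host/port should be adjusted to match the actual deployment:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ReloadKeys {
    public static void main(String[] args) throws Exception {
        // Assumption: Web3Signer's HTTP API listens on localhost:9000 (the default).
        // Adjust the host/port to match the Kubernetes service in your deployment.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9000/reload"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("Reload returned HTTP " + response.statusCode());
    }
}
```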
@jonanon20 Can you provide some more details for us to debug, such as: …
Thanks, Uzi
Hello, we deploy to Kubernetes using the Docker image. We are running the pod on an m5.2xlarge AWS instance (8 vCPU, 32 GiB). There is a pod anti-affinity rule so multiple Web3Signers cannot be scheduled onto the same node. There is no memory pressure on the node itself, only on the pod after a couple of days. We still get JVM OOM errors even after tripling our requests/limits; it just takes a few more days to fail. We launch the container with the arg …
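One way to check what heap ceiling the JVM actually derives from `-XX:MaxRAMPercentage` inside the container is to print the resolved max heap at startup. A minimal sketch (hypothetical class name; run it with the same JVM flags and memory limits as the signer container):

```java
public class MaxHeapCheck {
    public static void main(String[] args) {
        // Runtime.maxMemory() reports the maximum heap the JVM will attempt to use,
        // i.e. what -XX:MaxRAMPercentage resolves to against the container memory limit.
        double maxHeapGiB = Runtime.getRuntime().maxMemory() / (1024.0 * 1024.0 * 1024.0);
        System.out.printf("Resolved max heap: %.2f GiB%n", maxHeapGiB);
    }
}
```

Comparing this value against the pod's memory limit helps separate growth of the Java heap from off-heap/native growth when chasing the OOM.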
@jonanon20 If possible, can you DM me on Discord (…)
Hi @usmansaleem, I have sent a couple of messages to you over Discord; I wonder if you have had the chance to see them? We're still getting OOM issues with this.
Since the upgrade from 24.12.0 to 25.2.0 we have noticed a significant increase in memory usage that has caused several JVM OOM errors. Our Web3Signers are responsible for tens of thousands of keys, and this issue appeared after onboarding a new series of ~200 keys, leading to the OOM crash and missed attestations.
You can see from the graph that with version 24.12.0 we didn't have this memory issue; as soon as we deployed 25.2.0 on 19/02, memory usage spiked considerably. As a temporary measure we have set `-XX:MaxRAMPercentage=50` and given our nodes more memory. We generally see memory usage plateau, but the plateau only lasts a day or two before it climbs again. This gradual climb continues until we are alerted and have to restart. Originally we were at `-XX:MaxRAMPercentage=25` without any issues or memory spikes; memory usage for Web3Signer up until 25.2.0 has always been consistently low.

When we onboarded the 200 new keys and had the missed attestations, we noticed considerable JVM garbage collection time, as shown on the graph below across two of our three signers (`jvm_gc_collection_seconds`).

Despite the long GC times we didn't see any reduction in memory for two of our signers during this period, but this could be the result of memory pressure at the time.
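For cross-checking the `jvm_gc_collection_seconds` graphs, the same cumulative GC figures can be read directly from the JVM's management beans. A minimal sketch (hypothetical class name):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcTimeProbe {
    public static void main(String[] args) {
        // Each bean covers one collector; getCollectionTime() is the cumulative
        // elapsed time (milliseconds) spent in that collector, which is roughly
        // what jvm_gc_collection_seconds exposes, converted to seconds.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %.1f s total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime() / 1000.0);
        }
    }
}
```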
We have noticed a worrying increase in memory and a gradual climb over several days. Where before 25.2.0 we were seeing usage of ~1 GB with no issues, we now gradually climb beyond 6 GB after two or three days.