Description
Description of the problem / feature request:
We upgraded Bazel from 0.9.0 to 0.10.0, and our workers started to fail with out-of-memory errors.
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Unfortunately, I do not have a simple reproduction recipe. We just need to build our project (bazel test //...), which has about 10K targets, many of them template-heavy C++ files.
Build machines: 32 cores, 64 GB of RAM, no swap, default memory-related settings.
We use the following configuration: --ram_utilization_factor 50, no -j option.
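As a rough back-of-the-envelope sketch of what this configuration implies: it assumes --ram_utilization_factor is the percentage of system RAM that Bazel tries to use for its subprocesses, and it assumes (hypothetically) one concurrent action per core, since the effective number of parallel jobs without -j is not stated in this report. The helper name ram_budget_gb is made up for illustration.

```python
def ram_budget_gb(total_ram_gb: float, utilization_factor_pct: float) -> float:
    """Return the amount of RAM (GB) that the given percentage of total memory represents."""
    return total_ram_gb * utilization_factor_pct / 100.0


TOTAL_RAM_GB = 64  # from the report: 64 GB of RAM, no swap
CORES = 32         # from the report: 32 cores
FACTOR_PCT = 50    # value passed to --ram_utilization_factor

budget = ram_budget_gb(TOTAL_RAM_GB, FACTOR_PCT)

# Assumption: one concurrent action per core. The real parallelism
# without -j may differ and is not given in the report.
per_action = budget / CORES

print(f"budget: {budget:.0f} GB, per action: {per_action:.1f} GB")
```

Under these assumptions the budget works out to 32 GB, or about 1 GB per concurrent action, which a single template-heavy C++ compile could plausibly exceed.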
What operating system are you running Bazel on?
Ubuntu 16.04.3 LTS
Linux *** 4.4.0-1041-aws #50-Ubuntu SMP Wed Nov 15 22:18:17 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
What's the output of bazel info release?
Build label: 0.10.0- (@Non-Git)
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Sat Aug 11 16:17:21 +50074 (1518034925841)
Build timestamp: 1518034925841
Build timestamp as int: 1518034925841
If bazel info release returns "development version" or "(@Non-Git)", tell us how you built Bazel.
Downloaded the source from the 0.10.0 release page.
What's the output of git remote get-url origin ; git rev-parse master ; git rev-parse HEAD?
This refers to a private Git repo that I unfortunately cannot share.
Have you found anything relevant by searching the web?
There was an earlier discussion at https://groups.google.com/forum/#!searchin/bazel-discuss/josh$20pieper%7Csort:date/bazel-discuss/ujUkOus9g68/anihpWogDQAJ
I also searched bazel-discuss and the bug tracker:
#3886 seems related, but there are no cgroups involved.
#3645 / #2946 describe a similar situation, but we do not care about Bazel hangs; the OOM killer usually kills some other important process first.
Any other information, logs, or outputs that you want to share?
Happy to run any possible diagnostics.