Description
Please provide the following information. The more we know about your system and use case, the more easily and likely we can help.
Description of the problem / feature request / question:
Bazel, by default, looks at available RAM on the system to set local_resources
defaults, so as to best-use the resources of the machine.
Unfortunately, inside a Docker container or other cgroup environment, the system-wide memory statistics (/proc/meminfo, the output of free, etc.) reflect the memory usage of the host, not the container.
Bazel should make a best-effort attempt to find the effective cgroup memory controller limits, and use those.
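To make the request concrete, here is a minimal Python sketch of the kind of best-effort detection I have in mind. It is not Bazel's implementation, and all function names are made up for illustration. It assumes the limit is exposed at the usual mount points: /sys/fs/cgroup/memory.max for cgroup v2 and /sys/fs/cgroup/memory/memory.limit_in_bytes for cgroup v1. A container with a non-standard mount would need to look the path up via /proc/self/cgroup instead.

```python
from pathlib import Path

# Assumed default mount points: cgroup v2 first, then cgroup v1.
CGROUP_LIMIT_FILES = (
    "/sys/fs/cgroup/memory.max",
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",
)


def parse_meminfo_total(meminfo_text):
    """Return MemTotal from the contents of /proc/meminfo, in bytes."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) * 1024  # the value is reported in kB
    raise ValueError("MemTotal not found")


def parse_cgroup_limit(limit_text):
    """Return the cgroup limit in bytes, or None if unlimited or unparseable."""
    value = limit_text.strip()
    if value == "max":  # cgroup v2 spelling of "no limit"
        return None
    try:
        return int(value)
    except ValueError:
        return None


def effective_memory_bytes(host_total, cgroup_limit):
    """Pick the smaller of the host total and the cgroup limit.

    cgroup v1 reports "no limit" as a huge number, which min() handles.
    """
    if cgroup_limit is None or cgroup_limit <= 0:
        return host_total
    return min(host_total, cgroup_limit)


def detect_memory_bytes():
    """Best-effort detection: fall back to the host total if no limit is found."""
    host_total = parse_meminfo_total(Path("/proc/meminfo").read_text())
    limit = None
    for name in CGROUP_LIMIT_FILES:
        try:
            limit = parse_cgroup_limit(Path(name).read_text())
        except OSError:
            continue  # file missing or unreadable; try the next candidate
        break
    return effective_memory_bytes(host_total, limit)
```

On a host that reports 60G but is cgroup-limited to 4G, effective_memory_bytes would return the 4G figure, which is the value the local_resources default should be derived from.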
If possible, provide a minimal example to reproduce the problem:
I ran into this in a CircleCI build. All builds would fail with
Server terminated abruptly (error code: 14, error message: '', log file: '/root/.cache/bazel/_bazel_root/f85b6fb5740e6e8c7efea142eec4b6e8/server/jvm.out')
until I added build --local_resources=4096,4,1.0 to my .bazelrc.
Circle's build containers report 60G of RAM, but are cgroup-limited to 4G, so building any large application on Circle ought to reproduce the issue.
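For anyone who needs the workaround in the meantime, the .bazelrc line can be generated from a known memory limit. The helper below is a hypothetical sketch, not part of Bazel. It assumes the three fields are RAM in megabytes, CPU cores, and an I/O factor, which matches the 4096,4,1.0 value that worked for a 4G container.

```python
def local_resources_flag(ram_bytes, cpus, io=1.0):
    """Format a .bazelrc line in the three-field form used in this report.

    Assumption: the first field is RAM in MB, the second is CPU cores,
    and the third is an I/O factor.
    """
    ram_mb = ram_bytes // (1024 * 1024)
    return f"build --local_resources={ram_mb},{cpus},{io}"


# Example: a 4G cgroup limit with 4 cores.
line = local_resources_flag(4 * 1024**3, 4)
```

With a 4G limit and 4 cores this produces the same line I added by hand.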
Environment info
- Operating System: Linux; tested on Ubuntu 16.04
- Bazel version (output of bazel info release): release 0.5.4
- If bazel info release returns "development version" or "(@Non-Git)", please tell us what source tree you compiled Bazel from; git commit hash is appreciated (git rev-parse HEAD):
Have you found anything relevant by searching the web?
(e.g. StackOverflow answers, GitHub issues, email threads on the bazel-discuss Google group)
There are a number of reports online of people puzzling over Bazel OOM-ing. It's hard to know how many of them trace back to this issue, but almost certainly some of them do, since container environments are increasingly popular these days.
- Bazel fails during analysis phase - "Server terminated abruptly" #3020
- Compiling Error with empty error message tensorflow/tensorflow#9940
- Remote Worker failures on bazel build #3251
Anything else, information or logs or outputs that would be helpful?
(If they are large, please upload as attachment or provide link).
https://fabiokung.com/2014/03/13/memory-inside-linux-containers/ has some notes on how to detect memory availability inside containers.