"terminate called after throwing an instance of std::bad_alloc" error when filling a bunch of large arrays #27715

Closed
josephrocca opened this issue May 15, 2019 · 10 comments
Labels
memory Issues and PRs related to the memory management or memory footprint.

@josephrocca

josephrocca commented May 15, 2019

Minimal example code:

// index.js
// create an array of arrays and fill the inner arrays with lots of random numbers
let metaArr = new Array(150).fill(0).map(() => []);
while(1) {
  for(let i = 0; i < metaArr.length; i++) {
    const arr = metaArr[i];
    arr.push(Math.random());
    if(arr.length > 10e6) arr.shift();
  }
}

Started with this command:

node --max-old-space-size=999999 --max-semi-space-size=999999 index.js

Result: After about a minute or two, on a machine with 64GB of memory, the memory usage climbs to a bit over 30GB (according to htop), and then it crashes with this message:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

Notes:

  1. When I run this same code in Chrome 75.0.3770.27 (with --js-flags="--max-old-space-size=999999 --max-semi-space-size=999999") it crashes in less than a minute when its memory usage reaches ~18GB (quite consistently).
  2. If I lower the length of metaArr to 50, memory usage stays at around 17GB and it runs indefinitely. This is the case for both node.js and Chrome.
  3. I was also getting "FATAL ERROR: invalid table size Allocation failed - JavaScript heap out of memory" in node for other configurations of array sizes and array items (e.g. filling with objects rather than numbers), but that's a separate issue that I still need to look into.
  4. This issue may or may not be related.

I have cross-posted this to crbug V8, so please just close this issue if it's more relevant there (I figured it might be since it's happening in Chrome too).

Edit: In case anyone wants to see the full error for point number 3, above, here it is:


<--- Last few GCs --->

[10933:0x39bd5e0]    78676 ms: Scavenge 10073.1 (12128.5) -> 9608.0 (17199.0) MB, 4258.9 / 0.0 ms  (average mu = 0.954, current mu = 0.954) allocation failure 


<--- JS stacktrace --->

==== JS stack trace =========================================

    0: ExitFrame [pc: 0x1a86142]
Security context: 0x0e3d5e99a299 <JSObject>
    1: /* anonymous */ [0x2bf5a55803a1] [/root/markov/node-memory.js:~1] [pc=0x16f61940398b](this=0x2bf5a55804d1 <Object map = 0x132888400459>,0x2bf5a55804d1 <Object map = 0x132888400459>,0x2bf5a5580491 <JSFunction require (sfi = 0x28e5123d7e29)>,0x2bf5a5580401 <Module map = 0x13288843c3b9>,0x28e5123d0f89 <String[#27]: /root/markov/node-memory.js>,0x2bf5a55803e1 ...

FATAL ERROR: invalid table size Allocation failed - JavaScript heap out of memory
 1: 0x98c680 node::Abort() [node]
 2: 0x98d5e6 node::OnFatalError(char const*, char const*) [node]
 3: 0xb077ce v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 4: 0xb07b49 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 5: 0xf12c05  [node]
 6: 0x1045d2a v8::internal::HashTable<v8::internal::StringTable, v8::internal::StringTableShape>::New(v8::internal::Isolate*, int, v8::internal::PretenureFlag, v8::internal::MinimumCapacity) [node]
 7: 0x104625d v8::internal::HashTable<v8::internal::StringTable, v8::internal::StringTableShape>::EnsureCapacity(v8::internal::Isolate*, v8::internal::Handle<v8::internal::StringTable>, int, v8::internal::PretenureFlag) [node]
 8: 0x104647b v8::internal::StringTable::LookupKey(v8::internal::Isolate*, v8::internal::StringTableKey*) [node]
 9: 0x1046575 v8::internal::StringTable::LookupString(v8::internal::Isolate*, v8::internal::Handle<v8::internal::String>) [node]
10: 0xb06d7f v8::internal::LookupIterator::PropertyOrElement(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::JSReceiver>, v8::internal::LookupIterator::Configuration) [node]
11: 0x11da4b8 v8::internal::Runtime_DefineDataPropertyInLiteral(int, unsigned long*, v8::internal::Isolate*) [node]
12: 0x1a86142  [node]
Aborted (core dumped)

That error can be produced in node.js and Chrome with the following code:

let metaArr = new Array(50).fill(0).map(() => []);
while(1) {
  for(let i = 0; i < metaArr.length; i++) {
    const arr = metaArr[i];
    arr.push({[Math.random()]:Math.random()});
    if(arr.length > 10e6) arr.shift();
  }
}

Note that metaArr is only of length 50 here. In this case, memory tops out at somewhere around 18GB in both node.js and Chrome before crashing.

Also, in case it's relevant: The memory usage that htop reports matches that of Chrome's Task Manager.

@bnoordhuis
Member

bnoordhuis commented May 16, 2019

I'd say neither error is unexpected. Your test exhausts available memory. At some point the OS refuses to grant node additional memory.

Sometimes that point is reached in C++ land, when calling malloc() or new; that's when you get the std::bad_alloc exception.

Sometimes it happens in JS land, and then you get the FATAL ERROR.

edit: or is your bug report more about the fact that the size of the heap is out of proportion to the amount of data your script generates?

150 * 10e6 * 8 bytes is about 11 GB, but using Array#shift() on really big arrays like that probably creates some pathological behavior.
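
If all you need is a sliding window of recent numbers, a circular buffer over a typed array avoids shift() entirely. A minimal sketch (an alternative approach, not from the original repro; the Ring name and sizes are illustrative):

// Fixed-capacity circular buffer; push() overwrites the oldest entry
// once full, so no O(n) Array#shift() is ever needed.
class Ring {
  constructor(capacity) {
    this.buf = new Float64Array(capacity);
    this.start = 0;   // index of the oldest element
    this.length = 0;  // number of live elements
  }
  push(v) {
    const end = (this.start + this.length) % this.buf.length;
    this.buf[end] = v;
    if (this.length < this.buf.length) this.length++;
    else this.start = (this.start + 1) % this.buf.length;
  }
}

// Usage mirroring the repro: 150 windows of 10 million numbers each.
const metaArr = Array.from({length: 150}, () => new Ring(10e6));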

@bnoordhuis added the performance and memory labels and then removed the performance label on May 16, 2019
@josephrocca
Author

josephrocca commented May 16, 2019

C++ land ==> std::bad_alloc
JS land ==> FATAL ERROR

Good to know! Cheers.

Your test exhausts available memory. At some point the OS refuses to grant node additional memory.

But why would that happen if my machine has 64GB, and node is only using 30GB or 40GB?

Here's what v8.getHeapStatistics() shows, logging once every million push/shift operations (a sketch of that instrumentation follows the numbers below):

{"total_heap_size":5009408,"total_heap_size_executable":1048576,"total_physical_size":3505080,"total_available_size":4347106487888,"used_heap_size":2465712,"heap_size_limit":4347109834752,"malloced_memory":8192,"peak_malloced_memory":410288,"does_zap_garbage":0}
{"total_heap_size":7390973952,"total_heap_size_executable":1097728,"total_physical_size":4101643936,"total_available_size":4343573261576,"used_heap_size":3405742904,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1089056,"does_zap_garbage":0}
{"total_heap_size":14683615232,"total_heap_size_executable":1097728,"total_physical_size":8075706960,"total_available_size":4335554129936,"used_heap_size":7759856664,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}
{"total_heap_size":19371397120,"total_heap_size_executable":1097728,"total_physical_size":15006340424,"total_available_size":4335639993632,"used_heap_size":10934534264,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}
{"total_heap_size":19371397120,"total_heap_size_executable":1097728,"total_physical_size":15006340424,"total_available_size":4333239954600,"used_heap_size":13334573296,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}
{"total_heap_size":27889881088,"total_heap_size_executable":1097728,"total_physical_size":14940001616,"total_available_size":4330940438088,"used_heap_size":12142676552,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}
{"total_heap_size":27889881088,"total_heap_size_executable":1097728,"total_physical_size":14940001616,"total_available_size":4328542072368,"used_heap_size":14542716616,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}
{"total_heap_size":39774834688,"total_heap_size_executable":1097728,"total_physical_size":28838777496,"total_available_size":4316185166792,"used_heap_size":23545365808,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}
{"total_heap_size":39774834688,"total_heap_size_executable":1097728,"total_physical_size":31238779816,"total_available_size":4313785127824,"used_heap_size":25945404776,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}
{"total_heap_size":39774834688,"total_heap_size_executable":1097728,"total_physical_size":33638782288,"total_available_size":4311385088720,"used_heap_size":28345443880,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}

Here's what free shows before the crash in sync with the above logs:

total        used        free      shared  buff/cache   available
65965528     2552700    39350276       30456    24062552    62659872
65965528     3654188    38248756       30456    24062584    61558408
65965528     6812716    35090228       30456    24062584    58399880
65965528    11099264    30803672       30456    24062592    54113340
65965528    17527580    24375356       30456    24062592    47685024
65965528    17527704    24375212       30456    24062612    47684884
65965528    17770500    24132416       30456    24062612    47442088
65965528    18076664    23826236       30456    24062628    47135908
65965528    31618912    10283988       30456    24062628    33593660
65965528    33976268     7926608       30456    24062652    31236336

So the machine apparently doesn't want to give up buff/cache to the node process once the free memory runs out?

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 257579
max locked memory       (kbytes, -l) 16384
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 257579
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

I also tried the same code on a 192GB machine and got basically identical results:

root@more-ram:~# node --max-old-space-size=999999 --max-semi-space-size=999999 node-memory.js
{"total_heap_size":5009408,"total_heap_size_executable":1048576,"total_physical_size":3504128,"total_available_size":4347106488816,"used_heap_size":2464712,"heap_size_limit":4347109834752,"malloced_memory":8192,"peak_malloced_memory":410288,"does_zap_garbage":0}
{"total_heap_size":7395614720,"total_heap_size_executable":1097728,"total_physical_size":4106214304,"total_available_size":4343558186944,"used_heap_size":3410379928,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1089056,"does_zap_garbage":0}
{"total_heap_size":14693527552,"total_heap_size_executable":1097728,"total_physical_size":8086072832,"total_available_size":4335528567872,"used_heap_size":7770290344,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}
{"total_heap_size":19389657088,"total_heap_size_executable":1097728,"total_physical_size":15024529952,"total_available_size":4335586600048,"used_heap_size":10952793440,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}
{"total_heap_size":19389657088,"total_heap_size_executable":1097728,"total_physical_size":15024529952,"total_available_size":4333186561016,"used_heap_size":13352832472,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}
{"total_heap_size":27925098496,"total_heap_size_executable":1097728,"total_physical_size":14975148584,"total_available_size":4330905218896,"used_heap_size":12177890712,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}
{"total_heap_size":27925098496,"total_heap_size_executable":1097728,"total_physical_size":14975148584,"total_available_size":4328506855024,"used_heap_size":14577930776,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}
{"total_heap_size":39810052096,"total_heap_size_executable":1097728,"total_physical_size":28873924464,"total_available_size":4316149949448,"used_heap_size":23580579968,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}
{"total_heap_size":39810052096,"total_heap_size_executable":1097728,"total_physical_size":31273926784,"total_available_size":4313749910480,"used_heap_size":25980618936,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}
{"total_heap_size":39810052096,"total_heap_size_executable":1097728,"total_physical_size":33673929256,"total_available_size":4311349871376,"used_heap_size":28380658040,"heap_size_limit":4347109834752,"malloced_memory":90312,"peak_malloced_memory":1847216,"does_zap_garbage":0}
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

And here's the free output, synchronised with the above logs:

total        used        free      shared  buff/cache   available
198086048     1517216   193713908         732     2854924   194987384
198086048     4592604   190638520         732     2854924   191911996
198086048     8908432   186322584         732     2855032   187596164
198086048    15406212   179824804         732     2855032   181098384
198086048    15678752   179552224         732     2855072   180825828
198086048    15866020   179364948         732     2855080   180638552
198086048    29468992   165761976         732     2855080   167035580
198086048    31803036   163427932         732     2855080   164701544
198086048    34167212   161063756         732     2855080   162337368

In this case the free column doesn't get anywhere close to zero, so that seems to contradict the idea that the OS is for some reason reluctant to give up buff/cache space.

Interestingly, in one test on the 64GB machine it got to 30GB and then just stalled - no crash, no error messages, and no further log messages indicating that it was still pushing/shifting. Could there be some sort of memory fragmentation stuff causing all this? That's wild speculation because I am a mere web developer with very little low-level/hardware experience.

@josephrocca
Author

josephrocca commented May 17, 2019

Update: This code runs fine (indefinitely) on Firefox and never consumes more than about 15GB of my 64GB machine's memory.

Edit: The arr.push(Math.random()) version (which causes the bad_alloc in V8) works fine in Firefox, but interestingly, the arr.push({[Math.random()]:Math.random()}) version (which causes the above-mentioned JavaScript-land "FATAL ERROR" in V8) also causes a memory-related error in Firefox. It gets to about 10GB (no matter the machine's actual memory capacity) and throws Error: out of memory.

@devsnek
Member

devsnek commented May 17, 2019

also had this issue on a recent test run: https://ci.nodejs.org/job/node-test-commit-smartos/nodes=smartos18-64/25899/console

09:51:01 not ok 2020 parallel/test-tls-securepair-leak
09:51:01   ---
09:51:01   duration_ms: 1.723
09:51:01   severity: crashed
09:51:01   exitcode: -6
09:51:01   stack: |-
09:51:01     terminate called after throwing an instance of 'std::bad_alloc'
09:51:01     terminate called recursively
09:51:01   ...

@josephrocca
Author

josephrocca commented May 17, 2019

Here's a very simple script which causes both Chrome and standalone V8 to crash (logs here) at 18GB, no matter how much more memory is actually available:

const arrs = [];
let n = 0;
while(1) {
  arrs.push(new Array(1e3).fill(0));
  if(++n % 1e5 === 0) console.log(n);
}

Firefox/SpiderMonkey fills all 64GB of my machine's memory with no problems.

@bnoordhuis
Member

That can probably be explained by the fact that arrays have a hard size limit (512 MB on 32-bit platforms, 1 GB on 64-bit).

Size == length * some constant factor, where the factor depends on the underlying representation of the array.

It maxes out around 110 million elements in practice, assuming you don't run out of heap space before that.
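
That matches the arithmetic: 1 GB of backing store at 8 bytes per double allows at most ~134 million elements, and growth/representation overhead plausibly accounts for the practical ceiling of ~110 million. A quick way to probe it (a sketch; the exact length at which the process aborts depends on the V8 version and build):

// Fill a single array until V8 gives up; on a 64-bit build the process
// typically aborts somewhere past ~100 million elements.
const a = [];
while(1) {
  a.push(0);
  if(a.length % 1e7 === 0) console.log(a.length);
}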

@josephrocca
Author

josephrocca commented May 21, 2019

Thank you, @bnoordhuis! The snippet from my earlier comment runs fine if I change new Array(1e3) to new Array(1e6), just as your explanation predicts - V8 happily fills 100GB+ of memory.

For others hitting this thread via a search engine in the future: the takeaway is that V8 imposes a hard size limit on a single Array (about 1 GB of backing store on 64-bit builds, roughly 110 million elements in practice), regardless of how much memory the machine has.

Hope V8 can catch up to SpiderMonkey in this regard at some point. It looks like we should have support for huge TypedArrays in V8 soon: https://bugs.chromium.org/p/v8/issues/detail?id=4153 But the plain-Array limits look like they might be with us for some time yet.
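
In the meantime, a single typed array already goes much further than a plain Array for flat numeric data, since it allocates one contiguous backing store. A sketch (the 2e8 size is illustrative and still subject to the engine's ArrayBuffer cap):

// 200 million doubles (~1.6 GB backing store) in one typed array,
// well past the ~110 million ceiling of a plain Array.
const big = new Float64Array(2e8);
for (let i = 0; i < big.length; i++) big[i] = Math.random();
console.log(big.length, big[big.length - 1]);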

@bnoordhuis
Member

I think this issue has been answered/resolved as much as is possible so I'll take the liberty of closing it out. Let me know if I should reopen.

@josephrocca
Author

This issue doesn't need to be re-opened, since the bug is upstream in V8, not in node.js specifically, but in case others come across it in the future: the memory problems are back. I don't know what happened, but the following code snippet crashes V8 after consuming only 4GB of RAM:

const arrs = [];
let n = 0;
while(1) {
  arrs.push(new Array(1e6).fill(0));
  if(++n % 10 === 0) console.log(arrs.length);
}

I said in my last comment (above) that V8 happily fills 100GB of RAM with this snippet, but while trying to resolve a different error (not the bad_alloc one) I recently tested the snippet again and found that it crashes with CALL_AND_RETRY_LAST after consuming only 4GB.

I have no idea what's going on! Hopefully that crbug report leads somewhere. If anyone knows what's going on here, I'd love to hear from you. I'll report back here if I learn anything.

@jbaylina

Just added this answer on Stack Overflow, which might be relevant to this thread.

The problem might be that you reach the maximum number of memory mappings per process (vm.max_map_count on Linux).
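
A quick way to check that theory on Linux (a sketch; the procfs paths are standard, but reading them this way is an assumption about the environment):

// Compare the kernel's per-process memory-map limit with how many
// mappings the current process already holds (Linux only).
const fs = require('fs');
const limit = fs.readFileSync('/proc/sys/vm/max_map_count', 'utf8').trim();
const maps = fs.readFileSync('/proc/self/maps', 'utf8').trim().split('\n').length;
console.log(`vm.max_map_count = ${limit}, current mappings = ${maps}`);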
