Description
Please describe the bug
We have a CI job that runs a number of rspec jobs. Since upgrading to Nokogiri v1.14.0, we have noticed an uptick in seg faults. In this example, the seg fault appears to have happened at the end of the test run (https://gitlab.com/gitlab-org/gitlab/-/jobs/3697213765):
<snip>
Finished in 32 minutes 18 seconds (files took 1 minute 22.6 seconds to load)
330 examples, 0 failures, 3 pending
Randomized with seed 40485
[TEST PROF INFO] Time spent in factories: 03:25.902 (10.43% of total time)
Failed to write to log, write log/workhorse-test.log: file already closed
[BUG] Segmentation fault at 0x0000000000000440
ruby 3.0.5p211 (2022-11-24 revision ba5cf0f7c5) [x86_64-linux]
-- Machine register context ------------------------------------------------
RIP: 0x00007f5f4f202b03 RBP: 0x00007f5ec252b410 RSP: 0x00007f5ec252b3f0
RAX: 0x0000000000000000 RBX: 0x00007f5ed6053358 RCX: 0x0000000000000001
RDX: 0x00007f5f45cb19e0 RDI: 0x00007f5f1593a6a0 RSI: 0x0000000000000000
R8: 0x00007f5ec252bca8 R9: 0x00007f5eb126f000 R10: 0x00007f5ea93fdca8
R11: 0x00007f5ec1229ca8 R12: 0x0000000000000000 R13: 0x00007f5f1593a6a0
R14: 0x0000000000000004 R15: 0x00007f5ec252cc18 EFL: 0x0000000000010206
-- C level backtrace information -------------------------------------------
/usr/local/lib/libruby.so.3.0(rb_print_backtrace+0x11) [0x7f5f4f3df59e] vm_dump.c:758
/usr/local/lib/libruby.so.3.0(rb_vm_bugreport) vm_dump.c:998
/usr/local/lib/libruby.so.3.0(rb_bug_for_fatal_signal+0xf8) [0x7f5f4f1dfb38] error.c:787
/usr/local/lib/libruby.so.3.0(sigsegv+0x55) [0x7f5f4f332645] signal.c:963
/lib/x86_64-linux-gnu/libpthread.so.0(__restore_rt+0x0) [0x7f5f4f0db140]
/usr/local/lib/libruby.so.3.0(ruby_sized_xfree+0xa) [0x7f5f4f202b03] gc.c:10929
/usr/local/lib/libruby.so.3.0(ruby_sized_xfree) gc.c:10926
/builds/gitlab-org/gitlab/vendor/ruby/3.0.0/gems/nokogiri-1.14.1-x86_64-linux/lib/nokogiri/3.0/nokogiri.so(xmlResetError+0x26) [0x7f5f45c48ad6]
/builds/gitlab-org/gitlab/vendor/ruby/3.0.0/gems/nokogiri-1.14.1-x86_64-linux/lib/nokogiri/3.0/nokogiri.so(0x7f5f45cb19f0) [0x7f5f45cb19f0]
/lib/x86_64-linux-gnu/libpthread.so.0(0x7cd1) [0x7f5f4f0cfcd1]
/lib/x86_64-linux-gnu/libpthread.so.0(0x7eba) [0x7f5f4f0cfeba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f5f4eb68a2f]
<snip>
The backtrace suggests nokogiri or libxml2 is calling xmlResetError(): https://github.com/GNOME/libxml2/blob/f507d167f1755b7eaea09fb1a44d29aab828b6d1/error.c#L873-L891.
The Ruby interpreter (v3.0.5) is patched, so its line numbers are shifted relative to upstream, but gc.c:10929 corresponds to the objspace_xfree call in https://github.com/ruby/ruby/blob/ba5cf0f7c52d4d35cc6a173c89eda98ceffa2dcf/gc.c#L10909:
void
ruby_sized_xfree(void *x, size_t size)
{
    if (x) {
        objspace_xfree(&rb_objspace, x, size);
    }
}
This might relate to the changes in #2480. I have to wonder if an error is being allocated with malloc instead of ruby_xmalloc.
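To check whether other CI seg faults share this signature, the C-level backtrace can be scanned for the frame that calls ruby_sized_xfree. The following is a minimal Ruby sketch, not an existing tool: the FRAME pattern and the helpers parse_frames and xfree_caller are hypothetical names, and the pattern has only been matched against the frame format shown in the log above.

```ruby
# Hypothetical triage helper (not part of Ruby or Nokogiri).
# Matches C-level backtrace frames such as:
#   /path/to/lib.so(symbol+0x26) [0x7f5f45c48ad6] file.c:123
#   /path/to/lib.so(0x7cd1) [0x7f5f4f0cfcd1]
FRAME = %r{\A(?<lib>/\S+?)\((?<sym>[A-Za-z_]\w*)?(?:\+?0x\h+)?\)}

# Returns one hash per recognised frame, in the order they appear in the log.
def parse_frames(log)
  log.each_line.filter_map do |line|
    m = FRAME.match(line.strip)
    next unless m

    { lib: File.basename(m[:lib]), symbol: m[:sym] }
  end
end

# Returns the frame listed directly after the last ruby_sized_xfree frame,
# i.e. its caller, or nil if ruby_sized_xfree does not appear.
def xfree_caller(frames)
  idx = frames.rindex { |f| f[:symbol] == "ruby_sized_xfree" }
  idx && frames[idx + 1]
end

sample = <<~LOG
  /usr/local/lib/libruby.so.3.0(ruby_sized_xfree+0xa) [0x7f5f4f202b03] gc.c:10929
  /usr/local/lib/libruby.so.3.0(ruby_sized_xfree) gc.c:10926
  /builds/gitlab-org/gitlab/vendor/ruby/3.0.0/gems/nokogiri-1.14.1-x86_64-linux/lib/nokogiri/3.0/nokogiri.so(xmlResetError+0x26) [0x7f5f45c48ad6]
  /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f5f4eb68a2f]
LOG

frame = xfree_caller(parse_frames(sample))
puts "#{frame[:lib]} #{frame[:symbol]}" # prints "nokogiri.so xmlResetError"
```

Running the same helpers over saved job logs would show whether each crash reaches ruby_sized_xfree through xmlResetError or through some other caller.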
Help us reproduce what you're seeing
We're not yet sure how to reproduce the seg fault. We're discussing the issue in https://gitlab.com/gitlab-org/gitlab/-/issues/390313.
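One possible starting point for a reproduction, offered only as a sketch: the backtrace ends in libpthread and clone frames, which suggests the crash happened on a thread other than the main one, so the script below parses malformed XML in short-lived threads so that each thread records a libxml2 parse error before it exits. It has not been confirmed to trigger the seg fault. The helper name stress_parse_errors, its parameters, and the sample document are hypothetical; the only Nokogiri calls used are Nokogiri::XML and Document#errors.

```ruby
# Hypothetical stress helper (not from the original report, and not confirmed
# to reproduce the crash). Each thread parses a malformed document
# `iterations` times and then exits. `parser` is injectable so the threading
# loop can be exercised without Nokogiri; by default it loads Nokogiri and
# returns the number of parse errors recorded for the document.
def stress_parse_errors(thread_count: 8, iterations: 100, parser: nil)
  parser ||= begin
    require "nokogiri"
    ->(xml) { Nokogiri::XML(xml).errors.size }
  end

  malformed = "<root><unclosed></root>"

  workers = Array.new(thread_count) do
    Thread.new do
      # Sum the per-parse error counts seen by this thread.
      iterations.times.sum { parser.call(malformed) }
    end
  end

  # Thread#value joins each thread and returns its block's result.
  workers.sum(&:value)
end
```

Calling stress_parse_errors with no arguments uses Nokogiri itself. Since the original crash appeared at the end of the test run, it may be worth letting the process exit immediately after the call and checking the exit status over many runs.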
Expected behavior
No seg faults.
Environment
- Ruby 3.0.5
- Nokogiri v1.14.1 (x86_64-linux)
# nokogiri -v
# Nokogiri (1.14.1)
---
warnings: []
nokogiri:
  version: 1.14.1
  cppflags:
  - "-I/usr/local/lib/ruby/gems/3.0.0/gems/nokogiri-1.14.1-x86_64-linux/ext/nokogiri"
  - "-I/usr/local/lib/ruby/gems/3.0.0/gems/nokogiri-1.14.1-x86_64-linux/ext/nokogiri/include"
  - "-I/usr/local/lib/ruby/gems/3.0.0/gems/nokogiri-1.14.1-x86_64-linux/ext/nokogiri/include/libxml2"
  ldflags: []
ruby:
  version: 3.0.5
  platform: x86_64-linux
  gem_platform: x86_64-linux
  description: ruby 3.0.5p211 (2022-11-24 revision ba5cf0f7c5) [x86_64-linux]
  engine: ruby
libxml:
  source: packaged
  precompiled: true
  patches:
  - 0001-Remove-script-macro-support.patch
  - 0002-Update-entities-to-remove-handling-of-ssi.patch
  - 0003-libxml2.la-is-in-top_builddir.patch
  - '0009-allow-wildcard-namespaces.patch'
  libxml2_path: "/usr/local/lib/ruby/gems/3.0.0/gems/nokogiri-1.14.1-x86_64-linux/ext/nokogiri"
  memory_management: ruby
  iconv_enabled: true
  compiled: 2.10.3
  loaded: 2.10.3
libxslt:
  source: packaged
  precompiled: true
  patches:
  - 0001-update-automake-files-for-arm64.patch
  datetime_enabled: true
  compiled: 1.1.37
  loaded: 1.1.37
other_libraries:
  zlib: 1.2.13
  libgumbo: 1.0.0-nokogiri