Bug 11102 - jit segmentation fault (Mono 3.0.7 and before)
Summary: jit segmentation fault (Mono 3.0.7 and before)
Status: RESOLVED FIXED
Alias: None
Product: Runtime
Classification: Mono
Component: JIT
Version: unspecified
Hardware: PC Linux
Importance: --- normal
Target Milestone: ---
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2013-03-13 05:38 UTC by Victor
Modified: 2013-04-10 05:06 UTC

Tags:
Is this bug a regression?: ---
Last known good build:


Attachments
mono-sgen binary (3.04 MB, application/octet-stream)
2013-03-13 12:08 UTC, Victor
result of thread apply all backtrace (19.10 KB, text/plain)
2013-03-14 12:42 UTC, Victor
coredump (1.58 MB, application/x-gzip)
2013-04-03 08:22 UTC, Victor
gdb : thread apply all backtrace (with debug) (7.89 KB, text/plain)
2013-04-03 08:29 UTC, Victor


Description Victor 2013-03-13 05:38:37 UTC
Mono versions affected: 2.10.8, 3.0.5, 3.0.6

Process /usr/bin/mono was killed by signal 11 (SIGSEGV).

Red Hat Enterprise Linux Server release 6.3 (Santiago) Linux 2.6.32-279.el6.x86_64

Attachment: core files (26 MB)

gdb stack trace sample:

Program terminated with signal 11, Segmentation fault.
#0  jit_info_table_index (table=0x0, addr=0x50abf4 "D\213O\bE\205\311~GD\211\311\061\300\353\n\017\037@") at domain.c:332
332             int left = 0, right = table->num_chunks;
Mono support loaded.
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6.x86_64 libgcc-4.4.6-4.el6.x86_64
(gdb) bt
#0  jit_info_table_index (table=0x0, addr=0x50abf4 "D\213O\bE\205\311~GD\211\311\061\300\353\n\017\037@") at domain.c:332
#1  0x000000000050e30c in mono_jit_info_table_find (domain=0x7f88f0004b30, addr=
    0x50abf4 "D\213O\bE\205\311~GD\211\311\061\300\353\n\017\037@") at domain.c:393
#2  0x000000000048c2ea in mini_jit_info_table_find (domain=0x7f89088dbcc0, addr=
    0x50abf4 "D\213O\bE\205\311~GD\211\311\061\300\353\n\017\037@", out_domain=0x0) at mini-exceptions.c:1050
#3  0x00000000004e2f00 in mono_arch_handle_altstack_exception (sigctx=<value optimized out>,
    fault_addr=<value optimized out>, stack_ovf=0) at exceptions-amd64.c:946
#4  0x0000000000415029 in mono_sigsegv_signal_handler (_dummy=11, info=0x7f8903615f70, context=0x7f8903615e40)
    at mini.c:5909
#5  <signal handler called>
#6  jit_info_table_index (table=0x0, addr=
    0x3447a77a48 "H\211X\020H\211\350H\211Z\030H\203\310\001H\211,+H\211C\bH\201\375\377\377") at domain.c:332
#7  0x000000000050e30c in mono_jit_info_table_find (domain=0x7f88f0004b30, addr=
    0x3447a77a48 "H\211X\020H\211\350H\211Z\030H\203\310\001H\211,+H\211C\bH\201\375\377\377") at domain.c:393
#8  0x000000000048c2ea in mini_jit_info_table_find (domain=0x7f89088dbcc0, addr=
    0x3447a77a48 "H\211X\020H\211\350H\211Z\030H\203\310\001H\211,+H\211C\bH\201\375\377\377", out_domain=0x7f8902f79418)
    at mini-exceptions.c:1050
#9  0x000000000048c9f7 in mono_find_jit_info_ext (domain=0x7f89088dbcc0, jit_tls=<value optimized out>,
    prev_ji=<value optimized out>, ctx=0x7f8902f79510, new_ctx=0x7f8902f794a0, trace=0x0, lmf=0x7f8902f79498,
    save_locations=0x0, frame=0x7f8902f79580) at mini-exceptions.c:351
#10 0x000000000048dbce in mono_walk_stack (func=0x48b2c0 <find_last_handler_block>, domain=0x7f89088dbcc0,
    start_ctx=<value optimized out>, unwind_options=<value optimized out>, thread=<value optimized out>, lmf=
    0x7f88f0000c20, user_data=0x7f8902f79630) at mini-exceptions.c:721
#11 0x000000000048dd3c in mono_install_handler_block_guard (thread=<value optimized out>, ctx=0x7f8902f796e0)
    at mini-exceptions.c:2429
#12 0x00000000004e89c8 in sigusr1_signal_handler (_dummy=<value optimized out>, info=<value optimized out>, context=
    0x7f8902f79780) at mini-posix.c:249
#13 <signal handler called>
#14 0x0000003447a77a48 in _int_free () from /lib64/libc.so.6
#15 0x0000000000511eff in ref_stack_destroy (thread=0x7f8908028d68) at threads.c:3336
#16 thread_cleanup (thread=0x7f8908028d68) at threads.c:607
#17 0x0000000000514173 in start_wrapper_internal (data=0x13857c0) at threads.c:800
#18 start_wrapper (data=0x13857c0) at threads.c:832
#19 0x00000000005ae524 in thread_start_routine (args=0x120e358) at wthreads.c:287
#20 0x00000000005e24de in GC_start_routine (arg=<value optimized out>) at pthread_support.c:1468
#21 0x0000003448207851 in start_thread () from /lib64/libpthread.so.0
#22 0x0000003447ae767d in clone () from /lib64/libc.so.6
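
Frame #0 shows `table=0x0` being dereferenced by `right = table->num_chunks` at domain.c:332. The following is a minimal sketch of that failure mode and of a defensive guard. The types and the function `demo_table_index` are hypothetical stand-ins, not Mono's real structures, and the guard is only an illustration, not the fix that was later committed for this bug.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT Mono's real data structures. */
typedef struct {
    uintptr_t last_code_end;   /* highest code address covered by this chunk */
} demo_chunk;

typedef struct {
    int num_chunks;
    demo_chunk *chunks;        /* sorted by last_code_end, ascending */
} demo_table;

/* Binary search for the first chunk whose last_code_end is >= addr.
 * Returns that index (num_chunks if addr lies beyond every chunk),
 * or -1 when there is no table at all.
 * Without the NULL check, reading table->num_chunks faults on a NULL
 * table, which is the pattern visible in frame #0 of the backtrace. */
int demo_table_index(const demo_table *table, uintptr_t addr)
{
    if (table == NULL)
        return -1;
    assert(table->num_chunks >= 0);

    int left = 0, right = table->num_chunks;
    while (left < right) {
        int pos = left + (right - left) / 2;
        if (table->chunks[pos].last_code_end < addr)
            left = pos + 1;
        else
            right = pos;
    }
    return left;
}
```

A caller would treat -1 as "no JIT info available" instead of crashing. Whether returning early is appropriate in the real runtime depends on why the table is NULL in the first place, which this sketch does not address.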
Comment 2 Mark Probst 2013-03-13 11:33:03 UTC
That filetea link doesn't do anything for me.  Could you please attach a test case as a file here?
Comment 3 Victor 2013-03-13 11:59:28 UTC
A new link to download the core files:
http://www.datafilehost.com/download-2e51cf98.html

I cannot provide a test case because the reported bug happens randomly and is due to bad memory management in domain.c.
Comment 4 Mark Probst 2013-03-13 12:01:30 UTC
How often does this happen?  Even if you have a test case that only triggers once in a hundred times, that might be useful.
Comment 5 Mark Probst 2013-03-13 12:03:47 UTC
If you can only provide us with core dumps, please also attach the mono/mono-sgen executable that matches the core dumps.
Comment 6 Victor 2013-03-13 12:07:27 UTC
It happens randomly, 2 or 3 times per day, and the program is called 10,000 times per day.

I am going to attach the mono-sgen binary
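
To put the reported frequency in perspective: 2 or 3 crashes in 10,000 runs is a per-run crash probability of roughly 0.025%. The sketch below estimates how many runs a reproduction harness would need before it is likely to hit the crash at least once. It assumes that runs are independent and uses the midpoint of 2.5 crashes per 10,000 runs; both are assumptions for illustration, and `demo_runs_needed` is a hypothetical helper.

```c
#include <assert.h>

/* Smallest number of independent runs n such that the chance of seeing at
 * least one crash, 1 - (1 - p)^n, reaches `confidence`.
 * Returns -1 for arguments outside the open interval (0, 1).
 * A plain loop is used so the example does not depend on libm. */
long demo_runs_needed(double p, double confidence)
{
    if (p <= 0.0 || p >= 1.0 || confidence <= 0.0 || confidence >= 1.0)
        return -1;

    double miss = 1.0;   /* probability that no run so far has crashed */
    long n = 0;
    while (1.0 - miss < confidence) {
        miss *= 1.0 - p;
        n++;
    }
    assert(n > 0);
    return n;
}
```

With p = 2.5 / 10000 and a 95% target this comes out at roughly 12,000 runs, a little more than one day of production traffic. A test case that fails once in a hundred runs would need only about 300.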
Comment 7 Victor 2013-03-13 12:08:58 UTC
Created attachment 3599
mono-sgen binary

[adm_v.agostino@LXLYOPFD11 ~]$ mono-sgen -V
Mono JIT compiler version 2.10.8 (tarball Fri Jan 25 16:51:32 CET 2013)
Copyright (C) 2002-2011 Novell, Inc, Xamarin, Inc and Contributors. www.mono-project.com
        TLS:           __thread
        SIGSEGV:       altstack
        Notifications: epoll
        Architecture:  amd64
        Disabled:      none
        Misc:          softdebug
        LLVM:          supported, not enabled.
        GC:            sgen
Comment 8 Mark Probst 2013-03-13 13:32:56 UTC
Victor,

Unfortunately, the executable is not very useful - it doesn't contain debug information, and I'm not even sure it matches the core files.

Would you mind compiling a recent mono from git, with "CFLAGS=-O0 -g" and using that?  That would be very helpful.

Sorry for the inconvenience.

Mark
Comment 9 Mark Probst 2013-03-13 19:54:22 UTC
Does this bug happen when your program is shutting down?
Comment 10 Victor 2013-03-14 04:11:48 UTC
It happens when the program is starting: I don't see any logs, and there is a log message at the beginning of the Main() method.

It does match the core files.

If you look at the stack trace, you always see:
a segfault at domain.c line 332,
called from mono_jit_info_table_find() at line 393.

In 2.10.8 the segfault was different, which means someone (schani, a month ago) tried to solve this problem, but an issue still exists.

I can't use the latest Mono version in production right now, but I will try to get at least one core dump with master from git compiled with CFLAGS=-O0 -g. Thanks for your reply.
Comment 11 Mark Probst 2013-03-14 12:11:12 UTC
Thanks, a dump with a binary with debug info would be very much appreciated.

Should you catch it in gdb again, it would also help if you could do a

  thread apply all backtrace

Thanks!
Comment 12 Mark Probst 2013-03-14 12:37:19 UTC
Oh, and if you could walk to the frame of mono_jit_info_table_find() and do a

  p *domain

that would also help.
Comment 13 Victor 2013-03-14 12:39:53 UTC
(gdb) p *domain
Cannot access memory at address 0x0

I am attaching a text file with the result of thread apply all backtrace.
Comment 14 Mark Probst 2013-03-14 12:41:33 UTC
In your original backtrace there was a non-null domain:

#2  0x000000000048c2ea in mini_jit_info_table_find (domain=0x7f89088dbcc0,
Comment 15 Victor 2013-03-14 12:42:06 UTC
Created attachment 3620
result of thread apply all backtrace
Comment 16 Victor 2013-03-14 12:46:19 UTC
I've collected several backtraces, all showing the exact same segmentation fault.
Comment 17 Mark Probst 2013-03-14 12:53:14 UTC
A few remarks:

Those backtraces are from a Boehm mono, not an SGen one.  (The binary "mono" is Boehm, "mono-sgen" is SGen).  So the core did not in fact match the binary you sent.  But that's not the problem, of course :-)

There's also something missing in the backtrace text file, or there's duplication.  There's two backtraces for thread 1, and the first one goes

#8  0x000000000048c2ea in mini_jit_info_table_find (domain=0x7f89088dbcc0, addr=
    0x3447a77a48 "H\211X\020H\211\350H\211Z\030H\203\310\001H\211,+H\211C\bH\201\375\377\377", out_domain=0x7f8902f79418)

#4  0x0000003448207851 in start_thread () from /lib64/libpthread.so.0
#5  0x0000003447ae767d in clone () from /lib64/libc.so.6

Also, it seems that your program is, in fact, exiting.  See thread 19:

#4  0x000000000052cc67 in ves_icall_System_Environment_Exit (result=0) at icall.c:6557

My best guess is that this is a shutdown bug.

Patiently waiting for a -O0 -g build :-)
Comment 18 Victor 2013-04-02 10:18:22 UTC
I will publish results this week; my tests will be based on Mono 3.0.7.
Comment 19 Victor 2013-04-03 08:22:53 UTC
Created attachment 3731
coredump

Mono JIT compiler version 3.0.7 ((no/514fcd7 Wed Apr  3 13:46:44 CEST 2013)
Copyright (C) 2002-2012 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
        TLS:           __thread
        SIGSEGV:       altstack
        Notifications: epoll
        Architecture:  amd64
        Disabled:      none
        Misc:          softdebug
        LLVM:          supported, not enabled.
        GC:            Included Boehm (with typed GC and Parallel Mark)
Comment 20 Victor 2013-04-03 08:25:22 UTC
I've attached the coredump.

Mono 3.0.7 has been compiled with the -g option.

Here is the extract of the mono-core file used to create the RPM files:

[...]
export PATH=/opt/novell/llvm-mono/bin:$PATH
%endif
%configure --target= --build= \
  --with-ikvm=no \
  --with-sgen=%{sgen} \
  --with-moonlight=no
%if %llvm == yes
  --enable-loadedllvm \
  --disable-system-aot \
%endif
%ifnarch %ix86 x86_64
  --disable-system-aot \
%endif


make -j1
[...]

Same error as described in my original post: segmentation fault.
Comment 21 Victor 2013-04-03 08:29:49 UTC
Created attachment 3732
gdb : thread apply all backtrace (with debug)
Comment 22 Zoltan Varga 2013-04-03 08:40:20 UTC
It seems like a thread receives an abort signal while it is cleaning up, leading to the crash.
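
The backtrace is consistent with this: SIGUSR1 arrives while thread_cleanup() is inside _int_free(), and the handler then walks the stack. One general way to keep a signal handler from interrupting a cleanup path is to block the signal for the duration of the cleanup. The sketch below illustrates that technique with POSIX calls only. The `demo_*` names are hypothetical, and this is not a description of what the Mono fix does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t demo_abort_seen = 0;

/* Stand-in for an abort-request handler: it only records that it ran. */
static void demo_abort_handler(int signo)
{
    (void)signo;
    demo_abort_seen = 1;
}

int demo_abort_was_seen(void)
{
    return demo_abort_seen;
}

/* Installs the handler for SIGUSR1; returns 0 on success. */
int demo_install_handler(void)
{
    struct sigaction sa;
    sa.sa_handler = demo_abort_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Runs `cleanup` with SIGUSR1 blocked in the calling thread.
 * Returns 1 if the handler ran during cleanup (not expected), 0 if it
 * did not, and -1 if the signal mask could not be changed. */
int demo_cleanup_with_signal_blocked(void (*cleanup)(void))
{
    sigset_t block, old;

    assert(cleanup != NULL);
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (pthread_sigmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    demo_abort_seen = 0;
    cleanup();                         /* SIGUSR1 stays pending in here */
    int seen_during_cleanup = demo_abort_seen;

    /* Restoring the mask lets a pending SIGUSR1 be delivered now,
     * after the cleanup has finished. */
    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return seen_during_cleanup;
}

/* Example cleanup that simulates an abort request arriving mid-cleanup. */
void demo_cleanup_that_gets_signalled(void)
{
    raise(SIGUSR1);
}
```

Calling `demo_install_handler()` and then `demo_cleanup_with_signal_blocked(demo_cleanup_that_gets_signalled)` runs the cleanup to completion before the handler fires. The trade-off is that the abort request is delayed rather than dropped, so the handler must still tolerate running on a thread whose state has already been torn down.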
Comment 23 Victor 2013-04-03 09:34:18 UTC
I agree.

SIGUSR1,
then doing stuff,
then SIGSEGV for a reason I don't know.

My exit routine looks like this:

public static void ExitSelonMode()
{
    if (Config.Mode) {
        MainClass.log.Error("xxxxx");
        System.Environment.Exit(75);
    } else {
        MainClass.log.Error("yyyyy");
        System.Environment.Exit(0);
    }
}

Config.Mode is a boolean.
log is a log4net object which sends log messages to the rsyslog daemon.
Comment 24 Zoltan Varga 2013-04-09 11:35:10 UTC
Should be fixed by:
https://github.com/mono/mono/commit/5bae54197af25962f17993fe20a4a423c8d1d8e7

Could you try that patch out?
Comment 25 Victor 2013-04-09 11:39:31 UTC
Hi,

I will try this patch ASAP and let you know.

Regards,

Victor
Comment 26 Victor 2013-04-10 05:06:27 UTC
I am glad to announce that the bug is fixed in the 3.0.9 release!

Thanks again for your work.

Regards,
Victor d'Agostino