Bug 34323 - Segfault during mono_jit_init / GC stack push system. GC_approx_sp return value is bad. This may be a compilation problem
Status: VERIFIED FIXED
Alias: None
Product: Runtime
Classification: Mono
Component: GC
Version: unspecified
Hardware: PC Linux
Importance: Normal normal
Target Milestone: 4.6.0 (C8)
Assignee: Bugzilla
Reported: 2015-09-27 01:46 UTC by Thomas
Modified: 2016-08-30 10:10 UTC

Description Thomas 2015-09-27 01:46:35 UTC
Say I have the shortest Mono-embedding program ever:
------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
 
#include <glib.h>
#include <mono/jit/jit.h>
#include <mono/metadata/assembly.h>
 
int main(int argc, char *argv[])
{      
    MonoDomain *domain = NULL ;
    domain = mono_jit_init( "date" );
    return EXIT_SUCCESS;
}
----------------------------------------------------------
compiled with:
$ gcc main.c $(pkg-config --cflags --libs mono-2 glib-2.0) -Wall
(I included glib because it appears in the example on the website; I don't think it is needed.)
 
When I run it, I get a segfault during mono_jit_init:
 
 
Native stacktrace:
 
        /usr/lib/libmonoboehm-2.0.so.1(+0xd33ca) [0x7fbe77a053ca]
        /usr/lib/libmonoboehm-2.0.so.1(+0x488e0) [0x7fbe7797a8e0]
        /usr/lib/libpthread.so.0(+0x10d60) [0x7fbe7701bd60]
        /usr/lib/libmonoboehm-2.0.so.1(+0x243a40) [0x7fbe77b75a40]
        /usr/lib/libmonoboehm-2.0.so.1(+0x24bef8) [0x7fbe77b7def8]
        /usr/lib/libmonoboehm-2.0.so.1(+0x244f3f) [0x7fbe77b76f3f]
        /usr/lib/libmonoboehm-2.0.so.1(+0x244308) [0x7fbe77b76308]
        /usr/lib/libmonoboehm-2.0.so.1(+0x23af68) [0x7fbe77b6cf68]
        /usr/lib/libmonoboehm-2.0.so.1(+0x23b851) [0x7fbe77b6d851]
        /usr/lib/libmonoboehm-2.0.so.1(+0x245e6a) [0x7fbe77b77e6a]
        /usr/lib/libmonoboehm-2.0.so.1(+0x245f8e) [0x7fbe77b77f8e]
        /usr/lib/libmonoboehm-2.0.so.1(+0x1fc0a1) [0x7fbe77b2e0a1]
        /usr/lib/libmonoboehm-2.0.so.1(+0x1d079c) [0x7fbe77b0279c]
        /usr/lib/libmonoboehm-2.0.so.1(+0x49be6) [0x7fbe7797bbe6]
        ./a.out() [0x400747]
        /usr/lib/libc.so.6(__libc_start_main+0xf0) [0x7fbe76979610]
        ./a.out() [0x400659]
 
Debug info from gdb:
 
ptrace: Operation not permitted.
No threads.
 
=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================
 
Aborted (core dumped)
 
 
 
 
 
Here is what I get when I link against a freshly built libmonoboehm-2.0.so.1:
 
 
./a.out: ./libmonoboehm-2.0.so.1: no version information available (required by ./a.out)
 
Native stacktrace:
 
        ./libmonoboehm-2.0.so.1(+0xd8c20) [0x7f55cc902c20]
        ./libmonoboehm-2.0.so.1(+0x52eac) [0x7f55cc87ceac]
        /usr/lib/libpthread.so.0(+0x10d60) [0x7f55cbf13d60]
        ./libmonoboehm-2.0.so.1(GC_push_all_eager+0x5e) [0x7f55cca72534]
        ./libmonoboehm-2.0.so.1(GC_push_current_stack+0x2a) [0x7f55cca73a41]
        ./libmonoboehm-2.0.so.1(GC_with_callee_saves_pushed+0x2b) [0x7f55cca7c4af]
        ./libmonoboehm-2.0.so.1(GC_generic_push_regs+0x27) [0x7f55cca7c4f2]
        ./libmonoboehm-2.0.so.1(GC_push_roots+0x10e) [0x7f55cca73b8d]
        ./libmonoboehm-2.0.so.1(GC_mark_some+0x240) [0x7f55cca702c5]
        ./libmonoboehm-2.0.so.1(GC_stopped_mark+0x1e0) [0x7f55cca67882]
        ./libmonoboehm-2.0.so.1(GC_try_to_collect_inner+0x1b4) [0x7f55cca673fc]
        ./libmonoboehm-2.0.so.1(GC_init_inner+0x3fd) [0x7f55cca74627]
        ./libmonoboehm-2.0.so.1(GC_init+0x21) [0x7f55cca7418e]
        ./libmonoboehm-2.0.so.1(+0x1face5) [0x7f55cca24ce5]
        ./libmonoboehm-2.0.so.1(+0x1cde3c) [0x7f55cc9f7e3c]
        ./libmonoboehm-2.0.so.1(+0x5421e) [0x7f55cc87e21e]
        ./a.out() [0x400747]
        /usr/lib/libc.so.6(__libc_start_main+0xf0) [0x7f55cb871610]
        ./a.out() [0x400659]
 
Debug info from gdb:
 
[New LWP 8488]
[New LWP 8487]
[New LWP 8486]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
0x00007f55cbf139bb in waitpid () from /usr/lib/libpthread.so.0
  Id   Target Id         Frame
  4    Thread 0x7f55cb5e0700 (LWP 8486) "a.out" 0x00007f55cbf1007f in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0
  3    Thread 0x7f55caddf700 (LWP 8487) "a.out" 0x00007f55cbf1007f in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0
  2    Thread 0x7f55ca5de700 (LWP 8488) "a.out" 0x00007f55cbf1007f in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0
* 1    Thread 0x7f55ccfa4740 (LWP 8485) "a.out" 0x00007f55cbf139bb in waitpid () from /usr/lib/libpthread.so.0
 
Thread 4 (Thread 0x7f55cb5e0700 (LWP 8486)):
#0  0x00007f55cbf1007f in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0
#1  0x00007f55cca7ba5a in GC_wait_marker () at pthread_support.c:1897
#2  0x00007f55cca71a2a in GC_help_marker (my_mark_no=0) at mark.c:1116
#3  0x00007f55cca7a529 in GC_mark_thread (id=0x0) at pthread_support.c:555
#4  0x00007f55cbf0a4a4 in start_thread () from /usr/lib/libpthread.so.0
#5  0x00007f55cb93a13d in clone () from /usr/lib/libc.so.6
 
Thread 3 (Thread 0x7f55caddf700 (LWP 8487)):
#0  0x00007f55cbf1007f in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0
#1  0x00007f55cca7ba5a in GC_wait_marker () at pthread_support.c:1897
#2  0x00007f55cca71a2a in GC_help_marker (my_mark_no=0) at mark.c:1116
#3  0x00007f55cca7a529 in GC_mark_thread (id=0x1) at pthread_support.c:555
#4  0x00007f55cbf0a4a4 in start_thread () from /usr/lib/libpthread.so.0
#5  0x00007f55cb93a13d in clone () from /usr/lib/libc.so.6
 
Thread 2 (Thread 0x7f55ca5de700 (LWP 8488)):
#0  0x00007f55cbf1007f in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0
#1  0x00007f55cca7ba5a in GC_wait_marker () at pthread_support.c:1897
#2  0x00007f55cca71a2a in GC_help_marker (my_mark_no=0) at mark.c:1116
#3  0x00007f55cca7a529 in GC_mark_thread (id=0x2) at pthread_support.c:555
#4  0x00007f55cbf0a4a4 in start_thread () from /usr/lib/libpthread.so.0
#5  0x00007f55cb93a13d in clone () from /usr/lib/libc.so.6
 
Thread 1 (Thread 0x7f55ccfa4740 (LWP 8485)):
#0  0x00007f55cbf139bb in waitpid () from /usr/lib/libpthread.so.0
#1  0x00007f55cc902cec in mono_handle_native_sigsegv (signal=signal@entry=11, ctx=ctx@entry=0x7fff5b260ec0, info=info@entry=0x7fff5b260ff0) at mini-exceptions.c:2234
#2  0x00007f55cc87ceac in mono_sigsegv_signal_handler (_dummy=11, _info=0x7fff5b260ff0, context=0x7fff5b260ec0) at mini-runtime.c:2419
#3  <signal handler called>
#4  GC_push_all_eager (bottom=0x0, top=0x7fff5b2615bc "") at mark.c:1468
#5  0x00007f55cca73a41 in GC_push_current_stack (cold_gc_frame=0x7fff5b2615bc "") at mark_rts.c:498
#6  0x00007f55cca7c4af in GC_with_callee_saves_pushed (fn=0x7f55cca73a17 <GC_push_current_stack>, arg=0x7fff5b2615bc "") at mach_dep.c:476
#7  0x00007f55cca7c4f2 in GC_generic_push_regs (cold_gc_frame=0x7fff5b2615bc "") at mach_dep.c:487
#8  0x00007f55cca73b8d in GC_push_roots (all=1, cold_gc_frame=0x7fff5b2615bc "") at mark_rts.c:638
#9  0x00007f55cca702c5 in GC_mark_some (cold_gc_frame=0x7fff5b2615bc "") at mark.c:326
#10 0x00007f55cca67882 in GC_stopped_mark (stop_func=0x7f55cca66d04 <GC_never_stop_func>) at alloc.c:543
#11 0x00007f55cca673fc in GC_try_to_collect_inner (stop_func=0x7f55cca66d04 <GC_never_stop_func>) at alloc.c:382
#12 0x00007f55cca74627 in GC_init_inner () at misc.c:818
#13 0x00007f55cca7418e in GC_init () at misc.c:528
#14 0x00007f55cca24ce5 in mono_gc_base_init () at boehm-gc.c:191
#15 0x00007f55cc9f7e3c in mono_init_internal (filename=filename@entry=0x4007e4 "date", exe_filename=exe_filename@entry=0x4007e4 "date", runtime_version=runtime_version@entry=0x0) at domain.c:521
#16 0x00007f55cc9f9217 in mono_init_from_assembly (domain_name=domain_name@entry=0x4007e4 "date", filename=filename@entry=0x4007e4 "date") at domain.c:906
#17 0x00007f55cc87e21e in mini_init (filename=0x4007e4 "date", runtime_version=0x0) at mini-runtime.c:3068
#18 0x0000000000400747 in main (argc=1, argv=0x7fff5b261948) at main.c:11
 
=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================
 
Aborted (core dumped)
 
________________________________________________________________________________________________
WITH GDB:
 
(gdb) r
Starting program: /home/tomtix/a.out
/home/tomtix/a.out: ./libmonoboehm-2.0.so.1: no version information available (required by /home/tomtix/a.out)
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7ffff65fd700 (LWP 8508)]
[New Thread 0x7ffff5dfc700 (LWP 8509)]
[New Thread 0x7ffff55fb700 (LWP 8510)]
 
Program received signal SIGSEGV, Segmentation fault.
GC_push_all_eager (bottom=0x0, top=0x7fffffffe6fc "") at mark.c:1468
1468            q = *p;
(gdb) bt
#0  GC_push_all_eager (bottom=0x0, top=0x7fffffffe6fc "") at mark.c:1468
#1  0x00007ffff7a90a41 in GC_push_current_stack (cold_gc_frame=0x7fffffffe6fc "") at mark_rts.c:498
#2  0x00007ffff7a994af in GC_with_callee_saves_pushed (fn=0x7ffff7a90a17 <GC_push_current_stack>,
    arg=0x7fffffffe6fc "") at mach_dep.c:476
#3  0x00007ffff7a994f2 in GC_generic_push_regs (cold_gc_frame=0x7fffffffe6fc "") at mach_dep.c:487
#4  0x00007ffff7a90b8d in GC_push_roots (all=1, cold_gc_frame=0x7fffffffe6fc "") at mark_rts.c:638
#5  0x00007ffff7a8d2c5 in GC_mark_some (cold_gc_frame=0x7fffffffe6fc "") at mark.c:326
#6  0x00007ffff7a84882 in GC_stopped_mark (stop_func=0x7ffff7a83d04 <GC_never_stop_func>) at alloc.c:543
#7  0x00007ffff7a843fc in GC_try_to_collect_inner (stop_func=0x7ffff7a83d04 <GC_never_stop_func>)
    at alloc.c:382
#8  0x00007ffff7a91627 in GC_init_inner () at misc.c:818
#9  0x00007ffff7a9118e in GC_init () at misc.c:528
#10 0x00007ffff7a41ce5 in mono_gc_base_init () at boehm-gc.c:191
#11 0x00007ffff7a14e3c in mono_init_internal (filename=filename@entry=0x4007e4 "date",
    exe_filename=exe_filename@entry=0x4007e4 "date", runtime_version=runtime_version@entry=0x0)
    at domain.c:521
#12 0x00007ffff7a16217 in mono_init_from_assembly (domain_name=domain_name@entry=0x4007e4 "date",
    filename=filename@entry=0x4007e4 "date") at domain.c:906
#13 0x00007ffff789b21e in mini_init (filename=0x4007e4 "date", runtime_version=0x0) at mini-runtime.c:3068
#14 0x0000000000400747 in main (argc=1, argv=0x7fffffffea88) at main.c:11
(gdb) print p
$1 = (word *) 0x0

I investigated further:

In GC_push_current_stack (libgc/mark_rts.c), the first argument passed to GC_push_all_eager (libgc/mark.c) is the return value of GC_approx_sp (libgc/mark_rts.c). As the name says, this is supposed to be an approximation of the stack pointer, but it always seems to be zero.

I looked at the function GC_approx_sp. It returns the address of a dummy automatic variable, which seems legitimate for what it is supposed to do. It is surrounded by a lot of pragma / warning directives.
I'm not sure what they do, but my rough guess is that they disable the gcc warning about returning the address of a local variable.


Finally, when I disassembled the function, here is what I got:

00000000002496fa <GC_approx_sp>:
  2496fa:	55                   	push   %rbp
  2496fb:	48 89 e5             	mov    %rsp,%rbp
  2496fe:	48 c7 45 f8 2a 00 00 	movq   $0x2a,-0x8(%rbp)
  249705:	00 
  249706:	b8 00 00 00 00       	mov    $0x0,%eax
  24970b:	5d                   	pop    %rbp
  24970c:	c3                   	retq  


This literally ALWAYS RETURNS ZERO
(unless there is some relocation mechanism I'm not aware of, but in practice it returns zero).
It does not do what the C source says, which is why I think this may be a compilation problem.

The C source (with all the pragmas removed):
ptr_t GC_approx_sp()
{
    VOLATILE word dummy;
    dummy = 42;
    return((ptr_t)(&dummy));
}


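With this hack (which is obviously not a portable solution), I got rid of the segfault and succeeded in embedding Mono:

ptr_t GC_approx_sp()
{
    VOLATILE word dummy;
    dummy = 42;
    /* x86-64 only: copy the stack pointer into the return register, then
       undo the compiler-generated "push %rbp" prologue and return by hand. */
    __asm__ volatile (
    	"mov %rsp, %rax\n"
    	"pop %rbp\n"
    	"ret\n"
    );
   //return((ptr_t)(&dummy));
}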
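
A less fragile way to get the same effect on GCC or Clang might be to ask the compiler for the frame address instead of hand-writing the epilogue. The following is a minimal sketch under that assumption, not the change made in Mono or libgc: `approx_sp` and `is_near_stack_top` are hypothetical names, and the 64 KiB tolerance is an arbitrary value picked for illustration. `__builtin_frame_address(0)` and `__attribute__((noinline))` are existing GCC/Clang extensions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical replacement for the hack above (GCC/Clang only).
 * noinline keeps this function's frame separate from its caller's,
 * so the returned address belongs to a frame at or below the caller. */
__attribute__((noinline))
void *approx_sp(void)
{
    /* Compiler builtin: address of the current function's frame. */
    return __builtin_frame_address(0);
}

/* Hypothetical sanity check: returns 1 if addr lies within 64 KiB of
 * the approximated stack pointer, 0 otherwise. */
int is_near_stack_top(const void *addr)
{
    uintptr_t sp = (uintptr_t)approx_sp();
    uintptr_t a = (uintptr_t)addr;
    uintptr_t distance = a > sp ? a - sp : sp - a;

    return distance < 65536;
}
```

Calling `is_near_stack_top(&some_local)` from a shallow call chain is a quick way to confirm that the approximation is a real stack address rather than zero.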
Comment 1 Zoltan Varga 2015-10-06 11:12:22 UTC
You can try compiling with

gcc main.c $(pkg-config --cflags --libs monosgen-2) -Wall

This will use the newer sgen garbage collector which doesn't suffer from this problem.
Comment 2 James Laird-Wah 2016-02-08 20:28:43 UTC
Are you using GCC 5.x? I had the same problem and figured out it was due to new compiler behaviour. There is a fix here: https://github.com/mono/mono/pull/2574
Thanks for your work in tracking it down.

I note that the sgen approach didn't work for me: it dies with an assertion failure somewhere else. It may be the same general issue, but I can't see where it's coming from.
Comment 3 Mirco Bauer 2016-06-11 15:34:57 UTC
I ran into the same problem with mkbundle, see #41731
Comment 4 Alex Rønne Petersen 2016-08-29 08:53:28 UTC
This appears to be fixed by the PR linked earlier.
Comment 5 Shruti 2016-08-30 10:10:21 UTC
Reproduce Status:
---------------------------------
I tried to reproduce this issue with the C6 (MonoFramework-MDK-4.2.0.144.macos10.xamarin.x86_c109f9ca03d38e608cbd85cb2fdf8cfaf55bb97c) build but was not able to reproduce it. I also looked in Wrench for the exact builds given in the bug description but did not find them.

Environment Info: https://gist.github.com/sachins360/1b9de71404d211c7ad994a57a1d1f664
Terminal Output: https://gist.github.com/sachins360/8a5f43156de063546ef824f455db4c06


Verifying Status:
---------------------------------

I checked this issue with the latest Cycle 8 builds, following the steps given in comment 0, and observed that the issue is fixed.

Screencast: http://www.screencast.com/t/tAUVbS81Xf
Environment Info: https://gist.github.com/sachins360/a21feb42493710a61b035a29095f579d


Thanks!