Bug 40239 - SIGSEGV 27408 repro
Summary: SIGSEGV 27408 repro
Status: VERIFIED FIXED
Alias: None
Product: Android
Classification: Xamarin
Component: Mono runtime / AOT Compiler
Version: unspecified
Hardware: PC Mac OS
Importance: --- normal
Target Milestone: 6.1 (C7)
Assignee: Rodrigo Kumpera
 
Reported: 2016-04-08 15:19 UTC by Jonathan Pryor
Modified: 2016-04-24 18:17 UTC (History)


Description Jonathan Pryor 2016-04-08 15:19:08 UTC
Xamarin.Android 6.1.0 (Cycle 7) uses Mono 4.4.0, and the repro from Bug #27408 is now causing a SIGSEGV.

Repro:

> curl -o Test.zip 'https://bugzilla.xamarin.com/attachment.cgi?id=10026'
> unzip Test.zip
> cd Test
> xbuild /t:Install
> xbuild /t:_Run
# -or-
> xbuild /t:_Gdb
...

Let the app run for ~10 minutes, and the app will crash. I have three separate gdb stack traces:

> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 21736]
> 0xb37232f8 in ?? () from /Volumes/Seagate4TB/work/bxc-27408/Test/gdb-symbols/libmonosgen-32bit-2.0.so
> (gdb) bt
> #0  0xb37232f8 in ?? () from /Volumes/Seagate4TB/work/bxc-27408/Test/gdb-symbols/libmonosgen-32bit-2.0.so
> #1  <signal handler called>
> #2  0xb6c866c0 in __memcpy_base () from /Volumes/Seagate4TB/work/bxc-27408/Test/gdb-symbols/libc.so
> #3  0xb6cbb884 in je_arena_ralloc () from /Volumes/Seagate4TB/work/bxc-27408/Test/gdb-symbols/libc.so
> #4  0xb6cc6a10 in je_realloc () from /Volumes/Seagate4TB/work/bxc-27408/Test/gdb-symbols/libc.so
> #5  0xb38cbaa0 in ?? () from /Volumes/Seagate4TB/work/bxc-27408/Test/gdb-symbols/libmonosgen-32bit-2.0.so
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)

> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 23475]
> 0xb37232f8 in mono_create_jump_trampoline (Cannot access memory at address 0x0
> domain=0xb6cedeac <usual>, method=0x1a, add_sync_wrapper=-1667502368)
>     at /Users/builder/data/lanes/1196/65564e92/source/mono/mono/mini/mini-trampolines.c:1436
> 1436	/Users/builder/data/lanes/1196/65564e92/source/mono/mono/mini/mini-trampolines.c: No such file or directory.
> (gdb) bt
> #0  0xb37232f8 in mono_create_jump_trampoline (domain=0xb6cedeac <usual>, method=0x1a, add_sync_wrapper=-1667502368)
>     at /Users/builder/data/lanes/1196/65564e92/source/mono/mono/mini/mini-trampolines.c:1436
> #1  0xa97b1300 in ?? ()
> Cannot access memory at address 0x0

> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 25276]
> 0xb37232f8 in mono_create_jump_trampoline (domain=0x9a4fefdc, domain@entry=<error reading variable: Cannot access memory at address 0xfffffff2>, method=0x12, 
>     method@entry=<error reading variable: Cannot access memory at address 0xfffffff2>, add_sync_wrapper=-1706039560, 
>     add_sync_wrapper@entry=<error reading variable: Cannot access memory at address 0xfffffff2>)
>     at /Users/builder/data/lanes/1196/65564e92/source/mono/mono/mini/mini-trampolines.c:1436
> 1436	/Users/builder/data/lanes/1196/65564e92/source/mono/mono/mini/mini-trampolines.c: No such file or directory.
> (gdb) bt
> #0  0xb37232f8 in mono_create_jump_trampoline (domain=0x9a4fefdc, domain@entry=<error reading variable: Cannot access memory at address 0xfffffff2>, method=0x12, 
>     method@entry=<error reading variable: Cannot access memory at address 0xfffffff2>, add_sync_wrapper=-1706039560, 
>     add_sync_wrapper@entry=<error reading variable: Cannot access memory at address 0xfffffff2>)
>     at /Users/builder/data/lanes/1196/65564e92/source/mono/mono/mini/mini-trampolines.c:1436
> Cannot access memory at address 0xfffffff2
Comment 1 Jonathan Pryor 2016-04-08 15:24:50 UTC
Ideally, once the SIGSEGV crash is fixed we can use the Bug #27408 repro to help track down Bug #40136...

We can be that lucky, right?
Comment 2 Jonathan Pryor 2016-04-08 19:24:07 UTC
Update: PeterC isn't able to repro this on Cycle7, but he *is* able to repro this with monodroid/master 392c71cdc, which uses Mono 4.4, on a Nexus 5.
Comment 3 Zoltan Varga 2016-04-12 00:41:10 UTC
I can reproduce a crash with this monodroid/master build:
https://wrench.internalx.com/Wrench/ViewLane.aspx?lane_id=1196&host_id=163&revision_id=746758

It takes a lot of time to happen.
Comment 4 Zoltan Varga 2016-04-12 03:09:11 UTC
Some findings:
- The crash seems to happen on Thread 4, which is one of the threads started by the app.
- The stacktrace is the following:
Thread 4 (Thread 30401):
Cannot access memory at address 0x1
#0  mono_create_jump_trampoline (domain=0x1, method=0x9b4ff154, add_sync_wrapper=-1689260648)
    at /Users/builder/data/lanes/1196/6550f72a/source/mono/mono/mini/mini-trampolines.c:1436
#1  0x9c820ac0 in ?? ()
Cannot access memory at address 0x1
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

The crash happens because some code seems to jump into the middle of this function:
=> 0xa95172f0 <mono_create_jump_trampoline+308>:	ldr	r0, [sp, #12]
   0xa95172f4 <mono_create_jump_trampoline+312>:	ldr	r1, [sp, #8]
   0xa95172f8 <mono_create_jump_trampoline+316>:	str	r1, [r0, #16]
   0xa95172fc <mono_create_jump_trampoline+320>:	ldr	r0, [sp, #12]

which means the arguments etc. are bogus, so the crash happens at the store at +316. lr is also bogus, so it's hard to determine what code jumps here.
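
As a sanity check on the addresses above, here is a minimal Python sketch of the arithmetic. It derives the start of mono_create_jump_trampoline from the address/offset pairs in the disassembly, and compares the page offset of the faulting store at +316 with the crash PC 0xb37232f8 from the description. The helper names are hypothetical, and the 4 KiB page size and page-aligned library mapping are assumptions. Matching page offsets are only consistent with the same instruction being hit; the two traces come from different builds, so this is not proof.

```python
# Hypothetical helpers; only the addresses and offsets come from the traces above.

def function_start(instruction_address, offset):
    """Start address of a function, given an instruction address and its <func+offset>."""
    return instruction_address - offset

def page_offset(address, page_size=4096):
    """Offset of an address within its page (assumes 4 KiB pages)."""
    return address % page_size

# (address, offset) pairs from the disassembly in this comment.
DISASM = [
    (0xA95172F0, 308),
    (0xA95172F4, 312),
    (0xA95172F8, 316),
    (0xA95172FC, 320),
]

# Every pair should agree on a single function start.
starts = {function_start(addr, off) for addr, off in DISASM}

FAULTING_STORE = 0xA95172F8             # str r1, [r0, #16] at +316
CRASH_PC_FROM_DESCRIPTION = 0xB37232F8  # frame #0 in the original gdb traces

same_page_offset = (
    page_offset(FAULTING_STORE) == page_offset(CRASH_PC_FROM_DESCRIPTION)
)

if __name__ == "__main__":
    print("function start(s):", [hex(s) for s in sorted(starts)])
    print("same page offset:", same_page_offset)
```

If the library were loaded at a different base in each run, the low 12 bits of the PC would still line up for the same instruction, which is what the comparison looks for.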

There always seems to be a thread doing a gc:

Thread 5 (Thread 30402):
#0  0xb6cd45e4 in syscall () from /Users/vargaz/Projects/40239/Test/gdb-symbols/libc.so
#1  0xb6cd956c in sem_wait () from /Users/vargaz/Projects/40239/Test/gdb-symbols/libc.so
#2  0xa963d740 in mono_os_sem_wait (flags=MONO_SEM_FLAGS_NONE, sem=<optimized out>) at /Users/builder/data/lanes/1196/6550f72a/source/mono/mono/utils/mono-os-semaphore.h:163
#3  sgen_wait_for_suspend_ack (count=2) at /Users/builder/data/lanes/1196/6550f72a/source/mono/mono/metadata/sgen-os-posix.c:188
#4  0xa963d8dc in sgen_thread_handshake (suspend=<optimized out>) at /Users/builder/data/lanes/1196/6550f72a/source/mono/mono/metadata/sgen-os-posix.c:223
#5  0xa9649ec8 in sgen_client_stop_world (generation=0) at /Users/builder/data/lanes/1196/6550f72a/source/mono/mono/metadata/sgen-stw.c:233
#6  0xa965c3d0 in sgen_stop_world (generation=0) at /Users/builder/data/lanes/1196/6550f72a/source/mono/mono/sgen/sgen-gc.c:3198
#7  0xa965bc74 in sgen_perform_collection (requested_size=4096, generation_to_collect=0, reason=0xa96f0a03 "Nursery full", wait_to_finish=0)
    at /Users/builder/data/lanes/1196/6550f72a/source/mono/mono/sgen/sgen-gc.c:2218
#8  0xa9650258 in sgen_alloc_obj_nolock (vtable=0xb36b7070, size=32) at /Users/builder/data/lanes/1196/6550f72a/source/mono/mono/sgen/sgen-alloc.c:291
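
To check the "a thread is always doing a gc" observation across several crash dumps, something like the following Python sketch could scan saved gdb output for threads whose backtrace contains a given function. The function name threads_in_function and the regular expressions are hypothetical and assume the "Thread N (Thread M):" and "#k  [0x... in] func (" line shapes seen in the traces above; the sample text is abbreviated from this comment.

```python
import re

# Assumed line shapes, based on the gdb output pasted in this bug.
THREAD_HEADER = re.compile(r"^Thread (\d+) \(Thread (\d+)\):")
FRAME = re.compile(r"^#\d+\s+(?:0x[0-9a-f]+ in )?(\S+) \(")

def threads_in_function(gdb_output, function_name):
    """Return gdb thread numbers whose backtrace contains function_name."""
    hits = []
    current = None
    for line in gdb_output.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = int(header.group(1))
            continue
        frame = FRAME.match(line)
        if frame and current is not None and frame.group(1) == function_name:
            if current not in hits:
                hits.append(current)
    return hits

# Abbreviated from the backtraces in this comment.
SAMPLE = """\
Thread 4 (Thread 30401):
#0  mono_create_jump_trampoline (domain=0x1, method=0x9b4ff154, add_sync_wrapper=-1689260648)
#1  0x9c820ac0 in ?? ()
Thread 5 (Thread 30402):
#0  0xb6cd45e4 in syscall () from libc.so
#5  0xa9649ec8 in sgen_client_stop_world (generation=0) at sgen-stw.c:233
#6  0xa965c3d0 in sgen_stop_world (generation=0) at sgen-gc.c:3198
"""

if __name__ == "__main__":
    print("GC threads:", threads_in_function(SAMPLE, "sgen_stop_world"))
    print("crash threads:", threads_in_function(SAMPLE, "mono_create_jump_trampoline"))
```

Run over the output of `thread apply all bt` from each crash, this would show whether a stop-the-world is in progress every time the SIGSEGV fires.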
Comment 5 Zoltan Varga 2016-04-12 03:14:46 UTC
mono e3b4f547ec75ceb4113f07ef15428737c304deea could be related; it's not in mono 4.4.
Comment 6 Jonathan Pryor 2016-04-12 18:53:09 UTC
> it's not in mono 4.4.

That part confuses me, as monodroid/master is using mono-4.4.0-branch/1025cb85, not some mono/master commit.

Whatever is causing the problem is *something* in mono-4.4.0-branch.
Comment 7 Zoltan Varga 2016-04-12 21:25:01 UTC
It's in 4.4; it was a mistake.

We tracked it down; it's a problem with that patch.
Comment 9 Zoltan Varga 2016-04-13 20:30:48 UTC
Fixed in mono-extensions 1f8fb48dac9c349ba150e8c55f52e3150f29ca08.
Comment 10 Peter Collins 2016-04-24 18:17:27 UTC
I was no longer able to reproduce this after letting the same test case as before run for a little over 30 minutes using monodroid/master/3e9342611b8d5ba19b256f4fe543abbed29ef79c.