Bug 16432 - Mono GC deadlock 2
Summary: Mono GC deadlock 2
Status: RESOLVED FIXED
Alias: None
Product: Runtime
Classification: Mono
Component: GC
Version: 3.2.x
Hardware: PC Linux
Importance: --- normal
Target Milestone: ---
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2013-11-24 23:41 UTC by Sergey Zhukov
Modified: 2013-11-26 01:36 UTC

Tags:
Is this bug a regression?: ---
Last known good build:

Description Sergey Zhukov 2013-11-24 23:41:12 UTC
Another Mono GC deadlock:

Use case: 
Thread #1 tries to allocate unmanaged memory using Marshal.AllocHGlobal. The implementation of Marshal.AllocHGlobal uses malloc(), and malloc() locks ar_ptr before the allocation is performed.
Thread #2 tries to allocate managed memory for a Char[] array. It locks the GC, then signals SIGPWR and starts garbage collecting. Thread #1 catches the signal while it is inside the malloc() lock. Then thread #2 calls qsort(). The qsort() implementation uses malloc(), and that malloc() deadlocks on the mutex held by thread #1.

I think AllocHGlobal should also lock the GC before allocating unmanaged memory, or the runtime should somehow not suspend the current thread on SIGPWR until it has returned from malloc().
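
The lock interaction described above can be reproduced in miniature without actually hanging. The following is only a sketch under stated assumptions: `arena_lock` is a hypothetical stand-in for the malloc() arena mutex, a condition variable stands in for the SIGPWR suspend, and `collector_would_deadlock()` is an illustrative name, not a Mono or glibc function. It uses pthread_mutex_trylock() so that the "collector" can report that it would have blocked instead of blocking.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the malloc() arena lock (not the real glibc lock). */
static pthread_mutex_t arena_lock = PTHREAD_MUTEX_INITIALIZER;

/* Bookkeeping used to "suspend" and "resume" the simulated thread #1. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t state_cond = PTHREAD_COND_INITIALIZER;
static int holder_in_lock = 0;
static int world_resumed = 0;

/* Plays thread #1: takes the arena lock, then is "suspended" while holding it. */
static void *mutator(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&arena_lock);

    pthread_mutex_lock(&state_lock);
    holder_in_lock = 1;
    pthread_cond_broadcast(&state_cond);
    while (!world_resumed)              /* stands in for sigsuspend() */
        pthread_cond_wait(&state_cond, &state_lock);
    pthread_mutex_unlock(&state_lock);

    pthread_mutex_unlock(&arena_lock);
    return NULL;
}

/* Plays thread #2: returns 1 if a malloc() made during the stop would block. */
int collector_would_deadlock(void)
{
    pthread_t t;
    int rc, blocked;

    holder_in_lock = 0;
    world_resumed = 0;
    rc = pthread_create(&t, NULL, mutator, NULL);
    assert(rc == 0);

    /* Wait until the mutator is "suspended" with the arena lock held. */
    pthread_mutex_lock(&state_lock);
    while (!holder_in_lock)
        pthread_cond_wait(&state_cond, &state_lock);
    pthread_mutex_unlock(&state_lock);

    /* A real malloc() would block here forever; trylock only reports it. */
    rc = pthread_mutex_trylock(&arena_lock);
    blocked = (rc == EBUSY);
    if (rc == 0)
        pthread_mutex_unlock(&arena_lock);

    /* Restart the world so the sketch terminates cleanly. */
    pthread_mutex_lock(&state_lock);
    world_resumed = 1;
    pthread_cond_broadcast(&state_cond);
    pthread_mutex_unlock(&state_lock);
    pthread_join(t, NULL);
    return blocked;
}
```

In the real scenario the collector never reaches the "restart the world" step, because it is stuck inside malloc() waiting for a lock whose owner it has itself suspended.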

OS: Ubuntu linux 12.04 32 bit
Mono Runtime Engine version 3.2.7 (master/2c7864a Sun Nov 24 10:05:40 NOVT 2013)

Backtrace.

Thread 1
#0  0xb772b424 in __kernel_vsyscall ()
#1  0xb753a622 in do_sigsuspend (set=0x83b63a0)
    at ../sysdeps/unix/sysv/linux/sigsuspend.c:63
#2  __GI___sigsuspend (set=0x83b63a0)
    at ../sysdeps/unix/sysv/linux/sigsuspend.c:78
#3  0x08217886 in suspend_thread (context=0xb3a8110c, info=0xb2e52e18)
    at sgen-os-posix.c:113
#4  suspend_handler (sig=30, siginfo=0xb3a8108c, context=0xb3a8110c)
    at sgen-os-posix.c:131
#5  <signal handler called>
#6  0xb772b424 in __kernel_vsyscall ()
#7  0xb7609821 in __lll_lock_wait_private ()
    at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:95
#8  0xb7587b0e in _L_lock_10452 () at malloc.c:5242
#9  0xb7585de3 in __GI___libc_malloc (bytes=131072) at malloc.c:2921
#10 0x08282361 in monoeg_try_malloc (x=131072) at gmem.c:93
#11 0x08199431 in ves_icall_System_Runtime_InteropServices_Marshal_AllocHGlobal
    (size=<optimized out>) at marshal.c:12080
#12 0xb287bfd0 in ?? ()
#13 0xb287bf8c in ?? ()
#14 0xb23de308 in ?? ()
#15 0xb23de1b0 in ?? ()
#16 0xb23ddb74 in ?? ()
#17 0xb23dd74c in ?? ()
#18 0xb2819d0c in ?? ()

 
Thread 2
#0  0xb772b424 in __kernel_vsyscall ()
#1  0xb7609821 in __lll_lock_wait_private ()
    at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:95
#2  0xb7587b0e in _L_lock_10452 () at malloc.c:5242
#3  0xb7585de3 in __GI___libc_malloc (bytes=5020) at malloc.c:2921
#4  0xb753e2e2 in __GI_qsort_r (b=0xb1c1d560, n=1255, s=4, cmp=
    0x8223660 <compare_pointers>, arg=0x0) at msort.c:223
#5  0xb753e58f in __GI_qsort (b=0xb1c1d560, n=1255, s=4, cmp=
    0x8223660 <compare_pointers>) at msort.c:308
#6  0x08223d8c in major_have_computer_minor_collection_allowance ()
    at sgen-marksweep.c:1917
#7  0x08243f7d in sgen_memgov_try_calculate_minor_collection_allowance (
    overwrite=<optimized out>) at sgen-memory-governor.c:150
#8  sgen_memgov_try_calculate_minor_collection_allowance (
    overwrite=<optimized out>) at sgen-memory-governor.c:86
#9  0x0824418d in sgen_memgov_major_collection_end ()
    at sgen-memory-governor.c:196
#10 0x0821e392 in major_finish_collection (reason=0x834b7f5 "Minor allowance", 
    old_next_pin_slot=2305, scan_mod_union=<optimized out>) at sgen-gc.c:3258
#11 0x0821e773 in major_do_collection (reason=0x834b7f5 "Minor allowance")
    at sgen-gc.c:3297
#12 major_do_collection (reason=0x834b7f5 "Minor allowance") at sgen-gc.c:3279
#13 0x082225b6 in sgen_perform_collection (requested_size=4096, 
    generation_to_collect=1, reason=0x834b7f5 "Minor allowance", 
    wait_to_finish=0) at sgen-gc.c:3480
#14 0x08222ac4 in sgen_ensure_free_space (size=4096) at sgen-gc.c:3414
#15 0x082381c6 in mono_gc_alloc_obj_nolock (vtable=vtable("System.Char[]"), 
    size=<optimized out>) at sgen-alloc.c:288
#16 0x082384f4 in mono_gc_alloc_vector (vtable=vtable("System.Char[]"), size=
    2064, max_length=1024) at sgen-alloc.c:491
#17 0xb725bddb in ?? ()
#18 0xb701beec in ?? ()
#19 0xb2f3461c in ?? ()
#20 0xb2f344e8 in ?? ()
#21 0xb2f34454 in ?? ()
#22 0xb2f3471c in ?? ()
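
Note that in Thread 2 the malloc(bytes=5020) request matches qsort's n=1255 elements of s=4 bytes, so the allocation is qsort's temporary buffer. One general way to keep a collector from touching malloc() while the world is stopped is to sort in place. The following is a minimal sketch of that idea using heapsort; `sort_pointers_in_place` is a hypothetical helper, and nothing in this report says that this is how the issue was actually fixed.

```c
#include <stddef.h>
#include <stdint.h>

/* Restore the max-heap property for the subtree rooted at `root`. */
static void sift_down(void **a, size_t root, size_t n)
{
    for (;;) {
        size_t child = 2 * root + 1;
        if (child >= n)
            return;
        /* Pick the larger of the two children, comparing by address. */
        if (child + 1 < n && (uintptr_t)a[child] < (uintptr_t)a[child + 1])
            child++;
        if ((uintptr_t)a[root] >= (uintptr_t)a[child])
            return;
        void *tmp = a[root];
        a[root] = a[child];
        a[child] = tmp;
        root = child;
    }
}

/* Hypothetical helper: sorts pointers by address, ascending, with no heap
 * allocation, so it never needs the malloc() lock. */
void sort_pointers_in_place(void **a, size_t n)
{
    if (n < 2)
        return;
    /* Build the heap bottom-up, starting from the last parent node. */
    for (size_t i = n / 2; i-- > 0; )
        sift_down(a, i, n);
    /* Repeatedly move the current maximum to the end of the array. */
    for (size_t end = n - 1; end > 0; end--) {
        void *tmp = a[0];
        a[0] = a[end];
        a[end] = tmp;
        sift_down(a, 0, end);
    }
}
```

Heapsort is chosen here only because it needs constant extra space; any in-place sort would serve the same purpose.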
Comment 1 Sergey Zhukov 2013-11-25 01:19:56 UTC
Some additional info about callers of Marshal:AllocHGlobal

(gdb) p mini_jit_info_table_find($domain,0xb287bf8c,$td)->d.method
$9 = "System.Runtime.InteropServices.Marshal:AllocHGlobal ()"
(gdb) p mini_jit_info_table_find($domain,0xb23de308,$td)->d.method
$10 = "BlockManager:EnsureCapacity ()"
(gdb) p mini_jit_info_table_find($domain,0xb23de1b0,$td)->d.method
$11 = "BlockManager:Write ()"
(gdb) p mini_jit_info_table_find($domain,0xb23ddb74,$td)->d.method
$12 = "ByteBucket:Write ()"
(gdb) p mini_jit_info_table_find($domain,0xb23dd74c,$td)->d.method
$13 = "System.Web.HttpResponseStream:AppendBuffer ()"
(gdb) p mini_jit_info_table_find($domain,0xb2819d0c,$td)->d.method
$14 = "System.Web.HttpResponseStream:Write ()"
(gdb) p mini_jit_info_table_find($domain,0xb2f1e13d,$td)->d.method
$15 = "System.IO.StreamWriter:FlushBytes ()"
(gdb) p mini_jit_info_table_find($domain,0xb2f1e0c8,$td)->d.method
$16 = "System.IO.StreamWriter:FlushCore ()"
(gdb) p mini_jit_info_table_find($domain,0xb2f1e094,$td)->d.method
$17 = "System.IO.StreamWriter:Flush ()"
(gdb) p mini_jit_info_table_find($domain,0xb2818cfc,$td)->d.method
$18 = "ServiceStack.Text.JsonSerializer:SerializeToStream ()"
Comment 2 Mark Probst 2013-11-25 08:50:12 UTC
I'm on it.
Comment 3 Sergey Zhukov 2013-11-25 09:23:24 UTC
Also, please note that this is not just an AllocHGlobal issue: this deadlock might occur on any call to malloc() or g_free().
For example, most of the functions in marshal.c use g_free(), and mono_string_to_utf8() from object.c calls g_utf16_to_utf8() from giconv.c, which uses malloc(), and so on.
Comment 4 Mark Probst 2013-11-25 09:46:58 UTC
Fixed in 8a4e1ca74e654095b342619d06dcc9b73e402946.
Comment 5 Sergey Zhukov 2013-11-26 01:36:12 UTC
Works for me