Bug 17983 - Mono GC deadlock 3
Summary: Mono GC deadlock 3
Status: RESOLVED NOT_REPRODUCIBLE
Alias: None
Product: Runtime
Classification: Mono
Component: GC ()
Version: 3.2.x
Hardware: PC Linux
Importance: --- normal
Target Milestone: ---
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2014-02-24 23:52 UTC by Sergey Zhukov
Modified: 2017-07-08 03:27 UTC
5 users

Tags:
Is this bug a regression?: ---
Last known good build:


Attachments
Stacktraces (210.45 KB, text/plain)
2014-03-10 05:15 UTC, Sergey Zhukov


Description Sergey Zhukov 2014-02-24 23:52:35 UTC
There is another deadlock in the mono GC. When I run the latest monodevelop I get a deadlock. Two threads block each other. Thread 7 took GC_LOCK while trying to allocate a new object, then sent signals to the other threads to interrupt them and is waiting for the ack. All threads are stopped except thread 13, which is not stopped because it is blocked on GC_LOCK in mono_gc_set_skip_thread. The same issue occurs in the tp_epoll_wait() function.

Thread 7 (Thread 0x4f000b40 (LWP 12660)):
#0  0x40022424 in __kernel_vsyscall ()
#1  0x4008b1f6 in nanosleep () at ../sysdeps/unix/syscall-template.S:82
#2  0x0828b3ae in monoeg_g_usleep (microseconds=21420) at gdate-unix.c:53
#3  0x082464d9 in restart_threads_until_none_in_managed_allocator ()
    at sgen-stw.c:153
#4  sgen_stop_world (generation=0) at sgen-stw.c:216
#5  0x0822333a in sgen_perform_collection (requested_size=4096, 
    generation_to_collect=0, reason=0x834cb8f "Nursery full", wait_to_finish=0)
    at sgen-gc.c:3465
#6  0x08223904 in sgen_ensure_free_space (size=4096) at sgen-gc.c:3435
#7  0x08239c56 in mono_gc_alloc_obj_nolock (vtable=
    vtable("IKVM.Reflection.Reader.ParameterInfoImpl"), size=<optimized out>)
    at sgen-alloc.c:288
#8  0x08239d3a in mono_gc_alloc_obj (vtable=
    vtable("IKVM.Reflection.Reader.ParameterInfoImpl"), size=24)
    at sgen-alloc.c:469
#9  0x4049610d in ?? ()
#10 0x404918ec in ?? ()
#11 0x44b8f4a8 in ?? ()
#12 0x44b93674 in ?? ()
#13 0x44b9364f in ?? ()
#14 0x44b92fb0 in ?? ()
#15 0x44b92cec in ?? ()
#16 0x44b8dfb0 in ?? ()
#17 0x44b8ba28 in ?? ()
#18 0x44b82c14 in ?? ()
#19 0x44b714c0 in ?? ()
#20 0x44b70992 in ?? ()
#21 0x44b6f24c in ?? ()
#22 0x44b32efc in ?? ()
#23 0x416f1523 in System.Threading.Tasks.TaskActionInvoker/ActionInvoke:Invoke
    (this=..., owner=<optimized out>, state=<optimized out>, 
    context=<optimized out>)
    at /home/sergey/Projects/mono/mono/mcs/class/corlib/System.Threading.Tasks/TaskActionInvoker.cs:141
#24 0x416ed79e in System.Threading.Tasks.Task:InnerInvoke (this=...)
    at /home/sergey/Projects/mono/mono/mcs/class/corlib/System.Threading.Tasks/Task.cs:1084
#25 0x416ecf55 in System.Threading.Tasks.Task:ThreadStart (this=...)
    at /home/sergey/Projects/mono/mono/mcs/class/corlib/System.Threading.Tasks/Task.cs:836
#26 0x416ed4de in System.Threading.Tasks.Task:Execute (this=...)
    at /home/sergey/Projects/mono/mono/mcs/class/corlib/System.Threading.Tasks/Task.cs:1013

[Switching to thread 13 (Thread 0x4bd64b40 (LWP 12629))]
#0  0x40022424 in __kernel_vsyscall ()
#1  0x4008a5a2 in __lll_lock_wait ()
    at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2  0x40085ead in _L_lock_686 () from /lib/i386-linux-gnu/libpthread.so.0
#3  0x40085cf3 in __pthread_mutex_lock (mutex=0x83c1244)
    at pthread_mutex_lock.c:61
#4  0x082244c2 in mono_gc_set_skip_thread (skip=0) at sgen-gc.c:5618
#5  0x081d633f in check_for_interruption_critical () at threadpool.c:1425
#6  0x081d8e55 in async_invoke_thread (data=0x0) at threadpool.c:1570
#7  0x081d39e9 in start_wrapper_internal (data=0x9f8c2c8) at threads.c:643
#8  start_wrapper (data=0x9f8c2c8) at threads.c:688
#9  0x0827e135 in inner_start_thread (arg=0x446ffd0c)
    at mono-threads-posix.c:94
#10 0x40083d4c in start_thread (arg=0x4bd64b40) at pthread_create.c:308
#11 0x40187bae in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
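
The lock pattern in the two traces above can be illustrated with a small pthread sketch. This is not Mono's code: gc_lock, worker_needs_gc_lock and worker_sees_lock_busy are hypothetical names, and pthread_join merely stands in for "wait for the ack". The worker uses pthread_mutex_trylock so that the sketch reports the contention instead of hanging. With a blocking pthread_mutex_lock in the worker and a collector that waits for the worker before unlocking, neither thread could make progress.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the runtime's global GC lock; not Mono's code. */
static pthread_mutex_t gc_lock = PTHREAD_MUTEX_INITIALIZER;

/* Models thread 13: it needs the GC lock before it can make progress.
   trylock is used so the sketch reports contention instead of hanging. */
static void *worker_needs_gc_lock(void *arg)
{
    bool *lock_was_busy = arg;
    int rc = pthread_mutex_trylock(&gc_lock);

    assert(rc == 0 || rc == EBUSY);
    *lock_was_busy = (rc == EBUSY);
    if (rc == 0)
        pthread_mutex_unlock(&gc_lock);
    return NULL;
}

/* Models thread 7: optionally takes the GC lock, then waits for the worker.
   Returns true if the worker found the lock held by the collector. */
bool worker_sees_lock_busy(bool collector_holds_lock)
{
    bool lock_was_busy = false;
    pthread_t worker;

    if (collector_holds_lock)
        pthread_mutex_lock(&gc_lock);

    int rc = pthread_create(&worker, NULL, worker_needs_gc_lock, &lock_was_busy);
    if (rc == 0)
        pthread_join(worker, NULL);   /* stands in for "wait for the ack" */

    if (collector_holds_lock)
        pthread_mutex_unlock(&gc_lock);
    return rc == 0 && lock_was_busy;
}
```

Calling worker_sees_lock_busy(true) models the situation in the traces (the worker cannot get the lock while the collector holds it and waits), while worker_sees_lock_busy(false) shows the uncontended case.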


Mono JIT compiler version 3.4.0 (master/ba5aecb Tue Feb 25 09:55:52 NOVT 2014)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
	TLS:           __thread
	SIGSEGV:       altstack
	Notifications: epoll
	Architecture:  x86
	Disabled:      none
	Misc:          softdebug 
	LLVM:          supported, not enabled.
	GC:            sgen
Comment 1 Mark Probst 2014-02-26 17:08:36 UTC
I cannot reproduce this bug.  Can you provide a test case?
Comment 2 Sergey Zhukov 2014-02-26 20:04:38 UTC
When I tried to find the cause of the issue, monodevelop overwrote the ~/.recently-used file, and after that I can't reproduce the bug either.

From what I could find, the issue is somehow related to the File.Open/File.Read functions. I have a project with a large number of .cs files open in the IDE editor (more than 100). When I open the solution, monodevelop tries to read the .recently-used file with the FileShare.None option for every .cs file being parsed in parallel. At some point File.Open throws an exception (sharing violation) and monodevelop hangs with mono in GC. The exception was raised only with the previous .recently-used file, which unfortunately I did not save before experimenting. Maybe the file was big or something else, I don't know exactly, but on every hang I saw that one of the threads was in the Read method.

This is the method in monodevelop which throws the exception and produces the hang. If I reproduce the bug again I will add more info.

RecentFileStorage.cs
------------------------------
		//FIXME: should we P/Invoke lockf on POSIX or is Mono's FileShare.None sufficient?
		static FileStream AcquireFileExclusive (string filePath)
		{
			const int MAX_WAIT_TIME = 1000;
			const int RETRY_WAIT = 50;
			
			int remainingTries = MAX_WAIT_TIME / RETRY_WAIT;
			while (true) {
				try {
					Directory.CreateDirectory (Path.GetDirectoryName (filePath));
					return File.Open (filePath, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None);
				} catch (Exception ex) {
					//FIXME: will it work on Mono if we check that it's an access conflict, i.e. HResult is 0x80070020?
					if (ex is IOException && remainingTries > 0) {
						Thread.Sleep (RETRY_WAIT);
						remainingTries--;
						continue;
					}
					throw;
				}
			}
		} 
------------------------------
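
For comparison, here is a rough POSIX analog of the retry loop above, written in C. It is only a sketch under assumptions: acquire_file_exclusive is a hypothetical helper, and flock() takes an advisory lock, which is not necessarily how Mono implements FileShare.None. It shows the same bounded-retry shape: try to take the file exclusively, sleep and retry while it is contended, and give up once the wait budget is spent.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper, not MonoDevelop or Mono code: open `path` and take an
   exclusive advisory lock, retrying every retry_ms until max_wait_ms is spent.
   Returns the locked descriptor, or -1 on error or if the lock stayed held. */
int acquire_file_exclusive(const char *path, int max_wait_ms, int retry_ms)
{
    assert(retry_ms > 0);
    int remaining_tries = max_wait_ms / retry_ms;

    for (;;) {
        int fd = open(path, O_RDWR | O_CREAT, 0600);
        if (fd < 0)
            return -1;                       /* could not open at all */

        if (flock(fd, LOCK_EX | LOCK_NB) == 0)
            return fd;                       /* exclusive lock acquired */

        int saved_errno = errno;
        close(fd);
        if (saved_errno != EWOULDBLOCK || remaining_tries <= 0)
            return -1;                       /* real error, or out of retries */

        /* Lock is held elsewhere: wait retry_ms, then try again. */
        struct timespec delay = { retry_ms / 1000, (retry_ms % 1000) * 1000000L };
        nanosleep(&delay, NULL);
        remaining_tries--;
    }
}
```

With many threads calling such a helper on the same path at once, most callers spend their time in the retry branch, which is the kind of contention described above for the .recently-used file.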
Comment 3 Sergey Zhukov 2014-03-10 05:15:09 UTC
Created attachment 6270
Stacktraces

Here are more stack traces for the same issue. This time I got a "double free or corruption (fasttop)" error.

Mono JIT compiler version 3.4.0 (master/243899a Wed Mar  5 13:31:30 NOVT 2014)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
	TLS:           __thread
	SIGSEGV:       altstack
	Notifications: epoll
	Architecture:  x86
	Disabled:      none
	Misc:          softdebug 
	LLVM:          supported, not enabled.
	GC:            sgen
Comment 4 Sergey Zhukov 2014-05-26 14:04:15 UTC
Some additional info about how to reproduce this bug.

Install monodevelop, start it, and open a C# project. Then start monodevelop using the "make run" command from the MD build directory and open a project with a lot of open *.cs files. In this case monodevelop has a good chance of hitting the deadlock in mono (I get it about once in every 5-10 runs).

Before the deadlock, monodevelop 5.1 writes exceptions to the console about exclusive locking of the .recently-used file.

I tried to isolate the issue in a separate test case, but for now I can get the deadlock only with monodevelop.
Comment 5 SN 2015-06-01 10:41:51 UTC
I am getting a similar deadlock in mono 3.2.8 (linux) but in a very different project - nothing to do with monodevelop. Unfortunately I don't have a good test case yet to attach. Without a short snippet of code that reproduces the issue it is likely to stay open for a long time and eventually be closed.
Comment 6 SN 2015-06-01 18:11:39 UTC
Furthermore, on thread 1 in my deadlocked process the stacktrace matches the trace above fairly well. However, in 1 of the threads that owns a lock that 2 other threads are waiting on, the stack trace matches bug 15759.
Here is a partial stacktrace.

#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f6d52f51657 in _L_lock_909 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f6d52f51480 in __GI___pthread_mutex_lock (mutex=0x995540 <gc_mutex>)
    at ../nptl/pthread_mutex_lock.c:79
#3  0x00000000005f4b0b in mono_gc_alloc_pinned_obj (vtable=0x2085da8, size=32)
    at sgen-alloc.c:585
#4  0x00000000005b5501 in mono_object_new_pinned (domain=domain@entry=0x1fd0730, 
    klass=<optimized out>) at object.c:4397
#5  0x00000000005c1a2f in mono_type_get_object (domain=domain@entry=0x1fd0730, 
    type=0x24d8368) at reflection.c:6527
#6  0x00000000005c514d in mono_param_get_objects_internal (domain=0x1fd0730, 
    method=0x7f6d3c05b938, refclass=0x2456b68) at reflection.c:6821
#7  0x00000000406d8593 in ?? ()
#8  0x00007f6d2c0c3850 in ?? ()
.......


I wonder if bug 15759 is really fixed in 3.2.8?
Comment 7 Sergey Zhukov 2015-06-01 23:00:40 UTC
#17983 is not fixed even in mono 3.12.1; monodevelop hangs when opening a project while it tries to rewrite ~/.recently-used and a lot of files are open for editing inside monodevelop. I tried to investigate this and found that monodevelop starts a task for each open file, and each task tries to rewrite this file. The tasks throw "sharing violation" exceptions, and sometimes this produces the deadlock. Unfortunately I could not create a small reproducible test case that would help fix the error.
Comment 8 Ludovic Henry 2017-07-08 03:27:47 UTC
Can you still reproduce with latest mono? If that's the case, please reopen and provide a repro case. Thank you.