Bug 27348 - Deadlock in WebConnectionGroup.Close() / WebConnection.Close()
Status: RESOLVED FIXED
Alias: None
Product: Class Libraries
Classification: Mono
Component: System
Version: unspecified
Hardware: PC Linux
Importance: --- normal
Target Milestone: Untriaged
Assignee: Martin Baulig
Reported: 2015-02-24 08:13 UTC by HellBrick
Modified: 2015-06-08 04:36 UTC

Description HellBrick 2015-02-24 08:13:59 UTC
I've encountered what seems to be a deadlock when downloading a few things from one host in parallel. Here are the relevant parts of the stack traces:

(1)

"Threadpool worker" tid=0x0x7f279ddc7700 this=0x0x7f272a34e3d0 thread handle 0x574 state : interrupted state owns ()
  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) System.Threading.Monitor.try_enter_with_atomic_var (object,int,bool&) <IL 0x0000f, 0xffffffff>
  at System.Threading.Monitor.TryEnter (object,int,bool&) <IL 0x00047, 0x00057>
  at System.Threading.Monitor.Enter (object,bool&) <IL 0x00003, 0x00023>
  at (wrapper unknown) System.Threading.Monitor.FastMonitorEnterV4 (object,bool&) <IL 0x00059, 0xffffffff>
  at System.Net.WebConnection.Close (bool) <IL 0x00007, 0x00053>
  at System.Net.WebConnectionGroup.Close () <IL 0x0004e, 0x000ab>
  at System.Net.ServicePoint.CloseConnectionGroup (string) <IL 0x0001b, 0x00063>
  at System.Net.Http.HttpClientHandler.Dispose (bool) <IL 0x00027, 0x00063>

(2)

"Threadpool worker" tid=0x0x7f279fd9f700 this=0x0x7f2723ff5350 thread handle 0x536 state : interrupted state owns ()
  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) System.Threading.Monitor.try_enter_with_atomic_var (object,int,bool&) <IL 0x0000f, 0xffffffff>
  at System.Threading.Monitor.TryEnter (object,int,bool&) <IL 0x00047, 0x00057>
  at System.Threading.Monitor.Enter (object,bool&) <IL 0x00003, 0x00023>
  at (wrapper unknown) System.Threading.Monitor.FastMonitorEnterV4 (object,bool&) <IL 0x00059, 0xffffffff>
  at System.Net.WebConnectionGroup/ConnectionState.SetIdle () <IL 0x0000c, 0x0003b>
  at System.Net.WebConnection.Close (bool) <IL 0x000f0, 0x00276>
  at System.Net.WebConnectionStream.Close () <IL 0x000c7, 0x001e3>
  at System.IO.Stream.Dispose () <IL 0x00001, 0x00013>

My guess is:

- two threads use one WebConnection concurrently;
- thread (1) calls WebConnectionGroup.Close(), which takes a lock on a ServicePoint instance (https://github.com/mono/mono/blob/master/mcs/class/System/System.Net/WebConnectionGroup.cs#L71), then calls WebConnection.Close() (https://github.com/mono/mono/blob/master/mcs/class/System/System.Net/WebConnectionGroup.cs#L80), which tries to lock the WebConnection itself (https://github.com/mono/mono/blob/master/mcs/class/System/System.Net/WebConnection.cs#L1176);
- meanwhile, thread (2) calls WebConnection.Close(), which locks the WebConnection (https://github.com/mono/mono/blob/master/mcs/class/System/System.Net/WebConnection.cs#L1176), then calls state.SetIdle() (https://github.com/mono/mono/blob/master/mcs/class/System/System.Net/WebConnection.cs#L1203), which in turn tries to lock the ServicePoint instance (https://github.com/mono/mono/blob/master/mcs/class/System/System.Net/WebConnectionGroup.cs#L264);
- the two threads take the same two locks in opposite order, which is a classic deadlock.
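
The lock ordering guessed at in the list above can be reproduced in miniature. What follows is only a sketch in Python, not Mono code: the lock names "service_point" and "connection", the worker names, and the run_pair helper are all hypothetical stand-ins for the two monitors seen in the stack traces. Timeouts replace the indefinite Monitor.Enter wait so that the script terminates instead of hanging.

```python
import threading


def run_pair(order_a, order_b, timeout=0.2):
    """Run two workers that each take two locks in the given order.

    Returns {worker_name: True if that worker obtained its second lock}.
    All names here are illustrative; nothing is taken from Mono itself.
    """
    locks = {"service_point": threading.Lock(), "connection": threading.Lock()}
    # Only synchronize when the workers start from different locks; with the
    # same first lock, one worker simply waits for the other to finish.
    sync = order_a[0] != order_b[0]
    barrier = threading.Barrier(2)
    results = {}

    def worker(name, order):
        first, second = locks[order[0]], locks[order[1]]
        with first:
            if sync:
                barrier.wait()  # both workers now hold their first lock
            got_second = second.acquire(timeout=timeout)
            if got_second:
                second.release()
            if sync:
                barrier.wait()  # keep the first lock until both attempts end
            results[name] = got_second

    threads = [
        threading.Thread(target=worker, args=("group_close", order_a)),
        threading.Thread(target=worker, args=("stream_close", order_b)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Opposite order, as in traces (1) and (2): neither gets its second lock.
    print(run_pair(("service_point", "connection"),
                   ("connection", "service_point")))
    # Same order in both workers: both complete.
    print(run_pair(("service_point", "connection"),
                   ("service_point", "connection")))
```

With the opposite orders, each worker times out while waiting for the lock the other one holds; with a real untimed lock that wait would never end. When both workers use the same order, the second worker just waits its turn.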

I hit this on Mono 3.10 on Linux, but judging by the code in the master branch, the same deadlock still looks possible there.

Comment 1 Martin Baulig 2015-06-08 04:36:51 UTC
This should already have been fixed in
https://github.com/mono/mono/commit/893bacaf9a6f6544cb9a3e65799d7b5bcf7c163a

If there's still a race between the WebConnection and ServicePoint, then we might need to change this to use a common lock.
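
For illustration, the "common lock" idea could look roughly like the following. This is a hedged Python sketch, not the change made in the commit above and not Mono's API: the ServicePoint, Connection, and ConnectionGroup classes and the close_concurrently helper are hypothetical. It assumes a re-entrant lock (threading.RLock), which behaves like a .NET monitor in that the owning thread may enter it again.

```python
import threading


class ServicePoint:
    """Hypothetical owner of the single lock shared by group and connections."""

    def __init__(self):
        self.lock = threading.RLock()  # re-entrant for the owning thread


class Connection:
    def __init__(self, service_point):
        self.sp = service_point
        self.closed = False
        self.idle = False

    def set_idle(self):
        with self.sp.lock:
            self.idle = True

    def close(self):
        # Same lock as the group uses, so there is no second lock to invert.
        with self.sp.lock:
            self.closed = True
            self.set_idle()


class ConnectionGroup:
    def __init__(self, service_point, connections):
        self.sp = service_point
        self.connections = connections

    def close(self):
        with self.sp.lock:
            for connection in self.connections:
                connection.close()


def close_concurrently():
    """Close one connection via the group and directly, from two threads."""
    sp = ServicePoint()
    conn = Connection(sp)
    group = ConnectionGroup(sp, [conn])
    threads = [
        threading.Thread(target=group.close),
        threading.Thread(target=conn.close),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)
    finished = all(not t.is_alive() for t in threads)
    return finished and conn.closed and conn.idle


if __name__ == "__main__":
    print(close_concurrently())
```

Because every path takes the one shared lock, there is no second lock whose order could differ between threads. The trade-off is coarser locking: unrelated connections on the same service point would serialize on it.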