Bug 2619 - ThreadPool keeps objects alive while waiting for more work
Summary: ThreadPool keeps objects alive while waiting for more work
Status: RESOLVED FIXED
Alias: None
Product: Runtime
Classification: Mono
Component: General ()
Version: unspecified
Hardware: PC Mac OS
: --- normal
Target Milestone: ---
Assignee: Bugzilla
URL:
Depends on:
Blocks: 386
  Show dependency tree
 
Reported: 2011-12-22 18:13 UTC by Rolf Bjarne Kvinge [MSFT]
Modified: 2012-01-17 11:43 UTC (History)
9 users (show)

Tags:
Is this bug a regression?: ---
Last known good build:


Attachments
client.zip (1.44 KB, application/zip)
2011-12-22 18:13 UTC, Rolf Bjarne Kvinge [MSFT]
Details
Patch that helps (8.71 KB, patch)
2011-12-22 18:16 UTC, Rolf Bjarne Kvinge [MSFT]
Details
HeapShot screenshot (84.93 KB, image/png)
2011-12-22 18:27 UTC, Rolf Bjarne Kvinge [MSFT]
Details
epoll changes to 'Patch that helps' (13.50 KB, patch)
2012-01-03 16:57 UTC, Gonzalo Paniagua Javier
Details
Patch that helps, v2 (13.71 KB, application/octet-stream)
2012-01-04 16:47 UTC, Rolf Bjarne Kvinge [MSFT]
Details
Disable the GC for threadpool threads while waiting for more work (7.58 KB, patch)
2012-01-13 22:06 UTC, Rolf Bjarne Kvinge [MSFT]
Details


Notice (2018-05-24): bugzilla.xamarin.com is now in read-only mode.

Please join us on Visual Studio Developer Community and in the Xamarin and Mono organizations on GitHub to continue tracking issues. Bugzilla will remain available for reference in read-only mode. We will continue to work on open Bugzilla bugs, copy them to the new locations as needed for follow-up, and add the new items under Related Links.

Our sincere thanks to everyone who has contributed on this bug tracker over the years. Thanks also for your understanding as we make these adjustments and improvements for the future.


Please create a new report on GitHub or Developer Community with your current version information, steps to reproduce, and relevant error messages or log files if you are hitting an issue that looks similar to this resolved bug and you do not yet see a matching new report.

Related Links:
Status:
RESOLVED FIXED

Description Rolf Bjarne Kvinge [MSFT] 2011-12-22 18:13:14 UTC
Created attachment 1084 [details]
client.zip

Each threadpool thread can keep objects alive until more work is done by that thread. This is because the GC finds references to the objects on the stack (even if the code nulls out the reference there may be other temporaries on the stack created by gcc).

To reproduce: Get http://bugzilla.xamarin.com/attachment.cgi?id=181, and run the server project (TestTouchService) on a windows machine. Then get the client from here, and run it. Write the number of MB of data to get from the windows machine (25 seems to cause a decent spike in mem usage). When done, type GC a couple of times to run the GC. Note how the memory usage is still enormous compared to before downloading anything.
Comment 1 Rolf Bjarne Kvinge [MSFT] 2011-12-22 18:16:28 UTC
Created attachment 1085 [details]
Patch that helps

One solution (illustrated by the attached patch), is to refactor async_invoke_thread so that the objects are used in a separate method, so that when work is done and the thread is waiting for more work, the stack doesn't contain references to the objects anymore.

This patch improves memory usage (but doesn't quite fix it, that is likely another bug somewhere else though).
Comment 2 Rolf Bjarne Kvinge [MSFT] 2011-12-22 18:27:52 UTC
Created attachment 1086 [details]
HeapShot screenshot

Memory Usage:
- Right after startup: 8 MB
- Right after getting 25MB of data: 279 MB
- After running the GC a couple of times: 164 MB.

See attached screenshot from HeapShot, it shows inverse references for a StringBuilder that is kept alive by two AsyncResult and SocketAsyncResult instances (the <Misc Root> nodes are the stack referencing those objects).
Comment 3 Rolf Bjarne Kvinge [MSFT] 2011-12-22 18:42:06 UTC
With the patch from comment #1, the memory usage goes down to 71 MB after
running the GC.

Unfortunately the patch has other consequences (I can only run 1 request,
subsequent requests just sits there, doing nothing). This might be the symptom
of a more severe bug (depending on the threadpool threads to keep objects alive
on the stack... which will fail miserably with certain threadpool loads).
Comment 4 Dynami Le Savard 2011-12-23 13:22:26 UTC
Lately we had weird memory problems when loading stuff asynchronously. 

A series of operation that was taking a long time was moved from synchronous loading to asynchronous using BackgroundWorkers/NSThreads. When we moved to asynchronous we started receiving memory warnings and eventually crashes (when deploying on the device), whereas we don't have such problems when the operations are done synchronously on the main thread.

Could our issues be caused by the problem described in this case ?
Comment 5 Gonzalo Paniagua Javier 2012-01-03 12:59:40 UTC
Dynami, I do not know the problems you experienced are related to this bug, but I doubt it. Looks more like you did not register the threads (NSAutoReleasePool...).

Rolf, I guess you already know that running this server on Mono easily crashes (out of memory) due to XmlWriter leaking StringBuilders like a sieve...
Comment 6 Rolf Bjarne Kvinge [MSFT] 2012-01-03 13:08:10 UTC
Gonzalo, I actually never ran the server on Mono, I ran it on Windows :) In any case that sounds like another bug that should be fixed...
Comment 7 Gonzalo Paniagua Javier 2012-01-03 13:48:50 UTC
Rolf, I doubt the threadpool has anything to do with this problem. Your patch might fail because FILE is not thread-safe. We have never been good dealing with
large arrays and sys.xml has always been allocation-happy (huge byte array, then huge string...). Using large non-xml payloads  with xsp works just fine and does NOT leak even after thousands of request have been made.
Comment 8 Rolf Bjarne Kvinge [MSFT] 2012-01-03 14:13:59 UTC
Gonzalo, I don't think you quite understood the problem. The issue is not leaking a bit in thousands of requests, the issue is leaking the last request (per threadpool thread). Doing another request will free the previous one (if it happens on the same threadpool thread).

The fact that non-xml payloads doesn't show the problem doesn't mean it isn't there - you can add counters to (Socket)AsyncResult's ctor and dtor and see that not all are freed after a request has finished. No need for any payload at all.

The FILE related parts of the patch are just debug-code, and they're obviously not meant to be part of the solution.
Comment 9 Gonzalo Paniagua Javier 2012-01-03 16:57:08 UTC
Created attachment 1120 [details]
epoll changes to 'Patch that helps'

Revert back the threadpool_init() line to its original version if needed.
Comment 10 Rolf Bjarne Kvinge [MSFT] 2012-01-04 16:47:24 UTC
Created attachment 1126 [details]
Patch that helps, v2

Second version of the threadpool patch, now it makes a difference on linux x86 too.

Unfortunately the regression still exists, but it's less reproducible.
Comment 11 Tim Dawson 2012-01-07 15:44:33 UTC
Our iOS app is gobbling memory slowly but surely until eventually it's killed by the OS. This is basically the same code that already runs fine on other .net platforms and at this stage we suspect the Mono/MonoTouch platform, and this bug sounds suspicious as we do a lot of work on background threads.

Does this affect both Boehm and Sgen?
Comment 12 Rolf Bjarne Kvinge [MSFT] 2012-01-09 07:22:05 UTC
Tim: this particular bug wouldn't cause memory usage to grow slowly and steadily, it will keep exactly 1 object per thread-pool thread alive (and all the objects that one object references) - and only while that thread is waiting for more work. In other words it will leak if you do something on the threadpool once, but then stop using the threadpool, if you continuously keep using the threadpool you shouldn't notice anything.

If you think it is a bug in Mono/MonoTouch, don't hesitate to open another bug report or contact support where we can track the issue.
Comment 13 Tim Dawson 2012-01-09 07:59:18 UTC
We don't use ThreadPool directly (though we do use threads extensively) however I understand that WebClient uses it, and we use WebClient a lot. Once data has been downloaded asynchronously with WebClient it stays in memory forever, which is a serious bug as it renders our software susceptible to be killed by the OS after initial setup which involves much downloading of data.

I have, however, created a separate bug for the WebClient issue. Someone commented on it that it might be this issue though.
Comment 14 Rolf Bjarne Kvinge [MSFT] 2012-01-13 22:06:30 UTC
Created attachment 1191 [details]
Disable the GC for threadpool threads while waiting for more work

Attached is another approach that works better: it disables the GC for threadpool threads while those threads are waiting for work.

With this patch no large objects survive for long in the test case provided previously.
Comment 15 Rolf Bjarne Kvinge [MSFT] 2012-01-13 22:09:00 UTC
Paolo, does the patch in the previous comment look ok?
Comment 16 Paolo Molaro 2012-01-16 09:11:07 UTC
(In reply to comment #15)
> Paolo, does the patch in the previous comment look ok?

Yes, please commit, thanks!
Comment 17 Rolf Bjarne Kvinge [MSFT] 2012-01-17 11:43:42 UTC
Thanks, committed to master (2d739f3, 8928c1b), 2.10 (b371946, 5092bb10) and mobile-master (851bbc06, eeb38948).