Bug 21193 - large object heap issue?
Status: RESOLVED FIXED
Alias: None
Product: Runtime
Classification: Mono
Component: GC ()
Version: unspecified
Hardware: PC Mac OS
Importance: --- normal
Target Milestone: ---
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2014-07-09 06:55 UTC by Iain
Modified: 2014-08-12 10:58 UTC
CC List: 6 users

Tags:
Is this bug a regression?: ---
Last known good build:


Attachments
repro solution (13.60 KB, application/zip)
2014-07-09 06:59 UTC, Iain



Description Iain 2014-07-09 06:55:30 UTC
Hi folks

First a bit of background before the code sample...

I encountered issues a while back when developing a cross-platform app using Mono, initially during the iOS development phase.  From what I recall (it was a while back), the issue manifested itself as a continually growing application heap, eventually resulting in the iOS runtime terminating the application.  Examination of the heap using the mono profiler didn't show any leaks.  I eventually tracked the issue down to a TCP server we were creating and destroying whenever the app was suspended/resumed.  More specifically, I tracked it down to a large byte array that we were using as a buffer (we were passing offsets within this array to individual TCP sessions to use for reading/writing).  Repeatedly reconstructing that byte array every time the app was resumed led to a growing heap size, almost as though it was not getting freed properly by the garbage collector.

My fix at the time was to construct the byte array up front and preserve it between suspend/resume calls.  This worked for iOS and seemed to be the end of the problem until I attempted to get the app running on Android, whereupon I again encountered weird runtime issues.  The app initially worked until a colleague happened to move the construction of the TCP server into a Task, as opposed to up-front construction on the main thread during initialisation of the application object.  The symptoms were that the app then appeared to freeze randomly soon after startup: threads just seemed to die, the debugger wouldn't connect, and the application basically became unresponsive and un-debuggable.  Breaking the large byte array into multiple small arrays, one per TCP session, fixed the issue and the application worked fine.

So I thought I would attempt to distill the issue into a reproducible example for you.  Not certain if this is the exact same runtime issue as the code is hugely different, but my idea was that the problem was related to moving or constructing large byte arrays across threads.   The code sample below certainly displays bad behaviour, whether it's the same issue as my main application or not.  

The task threads appear to stop running shortly after the application starts, but the click handler on the button is still working ok.  Removing the call to bytes[rand.Next(kNumArrays)] = new byte[kArraySize]; in the task threads or reducing the size of the arrays appears to allow the app to run indefinitely which is why I wondered if the problem was related to the large object heap.  Forgive my limited knowledge of the mono runtime - I may be way off the mark with my speculation here :)


Anyway, here's the code sample - a modified out-of-the-box Android activity created in the latest alpha of Xamarin Studio/Mono for Android.  Full solution attached as a zip file.  Try it with and without line 57 of the attached source (the bytes[rand.Next(kNumArrays)] = new byte[kArraySize]; assignment inside the task loop) and play around with kArraySize to see the effect on the task threads - which appear to die in their tracks when the arrays are large...

Regards
Iain McLeod


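using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Android.App;
using Android.OS;
using Android.Widget;

[Activity(Label = "StressTestLargeObjectHeap", MainLauncher = true, Icon = "@drawable/icon")]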
    public class MainActivity : Activity
    {
        int count = 1;

        protected override void OnCreate(Bundle bundle)
        {
            base.OnCreate(bundle);

            // Set our view from the "main" layout resource
            SetContentView(Resource.Layout.Main);

            // Get our button from the layout resource,
            // and attach an event to it
            Button button = FindViewById<Button>(Resource.Id.myButton);
            
            button.Click += delegate
            {
                button.Text = string.Format("{0} clicks!", count++);
            };

            // modified below

            int kArraySize = 100 * 256 * 1024 * 2;
            int kNumArrays = 10;           
            List<byte[]> bytes = new List<byte[]>();
            Random rand = new Random();

            for (int i = 0; i < kNumArrays; i++)
            {
                bytes.Add(new byte[kArraySize]);
            }

            for (int i = 0; i < kNumArrays; i++)
            {
                Task.Factory.StartNew(() =>
                {
                    while (true)
                    {
                        lock(bytes)
                        {
                            Console.WriteLine("here");
                            bytes[rand.Next(kNumArrays)] = new byte[kArraySize];
                        }
                        Thread.Sleep(100);
                    }
                });
            }
        }
    }
Comment 1 Iain 2014-07-09 06:59:35 UTC
Created attachment 7300 [details]
repro solution
Comment 2 Iain 2014-07-09 10:54:18 UTC
Oh, I should have mentioned: I'm running this on a Nexus 7 (1st gen) running KitKat 4.4.4.

Cheers
Iain
Comment 3 Iain 2014-07-09 11:21:53 UTC
Further to this, a bit of trial and error on my part with the number of allowed TCP connections in our app found that the cutoff between a stable app and a 100% reproducible crash on startup was 30 connections.

Our buffer in the TCP server was defined as num_connections * 256 * 1024 + num_connections * 16 * 1024.

Plugging in the numbers here, 29 connections gives 7888K for the buffer, whereas 30 connections gives 8160K.
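
For anyone wanting to check the arithmetic, here is a minimal sketch. Only the formula comes from the buffer definition above; the names BufferMath and BufferSizeKiB are made up for illustration and are not taken from our app.

```csharp
using System;

static class BufferMath
{
    // Hypothetical helper: total buffer size in K (1024-byte units) for a
    // given connection count, using the formula quoted above
    // (256K plus 16K per connection).
    public static int BufferSizeKiB(int numConnections)
    {
        int bytes = numConnections * 256 * 1024 + numConnections * 16 * 1024;
        return bytes / 1024;
    }

    static void Main()
    {
        Console.WriteLine(BufferSizeKiB(29)); // 7888
        Console.WriteLine(BufferSizeKiB(30)); // 8160
    }
}
```

Both figures are in K, i.e. roughly 7.7 MB and 8.0 MB in total.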

This post:
http://mono.1490590.n4.nabble.com/Large-object-heap-size-threshold-td4661717.html
suggests SGEN_MAX_SMALL_OBJ_SIZE is 8000, which would fit with it being a large object space issue.

Cheers
Iain
Comment 4 Paolo Molaro 2014-08-12 10:58:24 UTC
Thanks for the report.
The MAX_SMALL_OBJ_SIZE is 8000 _bytes_, while your arrays are about 8 megabytes.
The issue here is most likely that the arrays happened to be allocated in an area where a false pointer was pinning them. Mono still needs to scan some memory areas, like the stack, conservatively, while the heap is checked for live pointers in a fully precise way. When you have a few such arrays, especially on 32-bit systems, the probability of this happening increases a lot.
The solution you adopted is the correct one: small arrays have less chance of being pinned, and are less of a problem if they are.
The difference you noted between sizes slightly smaller and slightly bigger than 8 MB may arise because the kernel could place the arrays differently in memory, so that one placement hits the issue and the other does not.
Running a test with current mono on a 64-bit system for a while doesn't exhibit the problem. For 32-bit systems with relatively small amounts of memory, like the current mobile devices, the best option is the one you already adopted.
Feel free to reopen if you hit a related issue with smaller arrays.
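
As a rough illustration of the per-session approach described above, the sketch below allocates one modest array per session instead of a single shared multi-megabyte array addressed by offset. It is an assumption-laden sketch, not the reporter's actual server code: the class name PerSessionBuffers and its members are hypothetical, and the per-session size simply reuses the 256K + 16K per-connection figure from comment 3.

```csharp
using System;

// Hypothetical sketch: each session owns its own buffer, so a single
// falsely pinned array costs bytesPerSession rather than the whole pool.
sealed class PerSessionBuffers
{
    private readonly byte[][] buffers;

    public PerSessionBuffers(int sessionCount, int bytesPerSession)
    {
        buffers = new byte[sessionCount][];
        for (int i = 0; i < sessionCount; i++)
        {
            buffers[i] = new byte[bytesPerSession];
        }
    }

    public int SessionCount { get { return buffers.Length; } }

    // Returns the buffer owned by the given session index.
    public byte[] For(int session) { return buffers[session]; }

    static void Main()
    {
        // 30 sessions at 256K + 16K each (figure assumed from comment 3).
        var pool = new PerSessionBuffers(30, 256 * 1024 + 16 * 1024);
        Console.WriteLine(pool.SessionCount);   // 30
        Console.WriteLine(pool.For(0).Length);  // 278528
    }
}
```

Creating the pool once and keeping it for the lifetime of the app, rather than rebuilding it on every suspend/resume, also matches the earlier fix described for iOS.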