Notice (2018-05-24): bugzilla.xamarin.com is now in
Please join us on
Visual Studio Developer Community and in the
Mono organizations on
GitHub to continue tracking issues. Bugzilla will remain
available for reference in read-only mode. We will continue to work
on open Bugzilla bugs, copy them to the new locations
as needed for follow-up, and add the new items under Related
Our sincere thanks to everyone who has contributed on this bug
tracker over the years. Thanks also for your understanding as we
make these adjustments and improvements for the future.
Please create a new report on
GitHub or Developer Community with
your current version information, steps to reproduce, and relevant error
messages or log files if you are hitting an issue that looks similar to
this resolved bug and you do not yet see a matching new report.
See the discussion here:
I have a full sample from the customer, who would like to keep it under wraps. Poke me for access.
Possibly related / same as https://bugzilla.xamarin.com/show_bug.cgi?id=19015
I looked at the user test case, and multiple things seemed pretty obvious to me:
- first there would seem to be a bug in the Parallel.For imported from microsoft referencesource: without a ParallelOption with MaxDegreeOfParallelism passed, it is going to use a ludicrous amount of threads.
- second, the user code contains a bunch of threading misusage which is going to lead to scalability issue after just a few threads
I am going to work on the Parallel.For issue, but unfortunately I cannot work on the user code error.
Also, I reassigned the bug to me, as I think Rodrigo is already getting enough notifications as is :)
According to the documentation (https://msdn.microsoft.com/en-us/library/system.threading.tasks.paralleloptions.maxdegreeofparallelism(v=vs.110).aspx), the number of parallel tasks is not bounded in the default case, so the fact that it is going to launch an unlimited/huge number of tasks (and thus of threads) in parallel, is the expected behavior.
To control the number of parallel tasks executing in parallel, simply pass a ParallelOptions with MaxDegreeOfParallelism set. The user already have it in the code (https://github.com/philstopford/pointless/blob/master/MainWindowController.cs#L269), he just doesn't pass it to Parallel.For (https://github.com/philstopford/pointless/blob/master/MainWindowController.cs#L292).
So in the end, "this is not a bug, this is a feature" ...
I'm curious. This is my first foray into MonoMac code, but the original code ran fine and without any apparent ill effects on the Windows side. Even with the ParallelOptions based to Parallel.For, I see no change in behavior on Windows or the issues on the Mac side.
What threading misusage is apparent in the code (just vague hints would be sufficient)?
*passed to Parallel.For
It's also unclear to me why 3.12.1 and 4.x behave so differently.
The main change between mono 3.12 and 4.x is that we started using Microsoft ReferenceSource starting at 4.0. And the Mono version (before 4.0) of Parallel.For was a bit more respectful of the user machine by limiting by default the maximum number of parallel tasks to the number of CPU on the user machine. The implementation of Microsoft does not really care about that, and simply launch as many tasks in parallel in possible, letting the OS scheduler and the ThreadPool take care of the rest.
The fact that it works better on Windows is more or less an issue with the OS scheduler, and a hint of luck with the ThreadPool. And if you are comparing to .NET on windows, then .NET uses the OS ThreadPool, while we use the runtime one on Mono, and the OS one has way more insights into the resource usage of the process.
My main advice would be to pass a ParallelOption with MaxDegreeOfParallelism set to N * the number of core, with N a acceptable number for your workload (2 for CPU heavy workload, to 20-50 for IO heavy workload)
Well, as noted, I added the po 'ParallelOptions' to the Parallel.For and it seemed to make no difference in the overall performance on either Windows or OS X, hence my confusion (for the large project supplied and also my basic testbed to try and understand). Were you able to get full CPU usage with 'po' in the small test case?
e.g. Parallel.For(0, numberOfCases, po, (i, loopState) =>
I only ever see ~140% CPU usage and the timer for updating fires very infrequently in both cases. This has me rather puzzled, and is also seen with the terminal (headless) behavior, where GUI aspects are quite simplified
I tried various locking strategies as well, without seeing any full usage. That's why I'm curious about the apparent threading misusage :)
The not full CPU usage comes from you algorithm which is not parallel-computation friendly: most of the time is spent in waiting for the lock at https://github.com/philstopford/pointless/blob/master/MainWindowController.cs#L299
And increasing the number of threads will only lead to more contention on this lock. By only seeing the small test case in this repository, you would certainly get better performance by having a single threaded version, and no lock.
Your lock contention comes from the fact that you have this shared `previewShape` field that is accessed by all the threads.
Another concurrency pattern that you should have a look into is the System.Threading.Barrier .
Well, this small case was intended to explore the issues I was seeing in the full example (that Chris has and can be poked for), where the computation time is significantly longer. This report was really in the context of that larger example, which runs pretty nicely (headless) under 3.12.1 and poorly under 4.x.
The same basic code is used in the MonoMac version (in that full example) and there the GUI performance during a run seems to be horrible.
More work on my side has yielded these numbers for a headless version :
OSX_Headless46 (Mono 4.3.0)
1 m, 03 s : 65000
2 m, 31 s : 130000
OSX_Headless46 (Mono 4.2.1)
1 m, 03 s : 65000
2 m, 31 s : 130000
OSX_Headless46 (Mono 3.12.1)
1 m, 31 s : 65000
3 m, 21 s : 130000
However, compared to Windows, something is still seriously amiss - the Mono best-case is still 2x slower than the Windows version.
0 m, 30 s : 65000