Created attachment 6534 [details]
A synthetic benchmark
The cold-start of an application takes a significant amount of time on lower-end devices, like the iPhone 4 and iPad 2.
The profiling sessions show that the time is spent in generic_trampoline_delegate, a method that will build the trampolines for generic method calls.
I’ve attached a synthetic benchmark that reproduces the issue. It contains thousands of generic methods, used in a way similar to how our application uses them.
On an iPad 2, the first call of the chain takes about 1150ms, whereas on the second call, it takes about 4ms (four). The caching mechanism of the mini runtime is working properly, as we can see that the time drops significantly.
However, during the first calls, the time taken to resolve the methods is significant, and seems to increase linearly with the number of generic types present in the application domain.
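To make the cold versus warm distinction concrete, here is a minimal Python sketch of a lazily populated call-target cache. It is only an analogy, not Mono's implementation: `LazyResolver`, `resolve_cost_s` and the 200-method chain are hypothetical names and values chosen for illustration, and the sleep stands in for whatever one-time work the runtime does per generic method.

```python
import time


class LazyResolver:
    """Toy stand-in for a runtime that resolves a call target on first use."""

    def __init__(self, resolve_cost_s=0.001):
        self.resolve_cost_s = resolve_cost_s  # hypothetical per-method cost
        self.cache = {}
        self.resolutions = 0

    def call(self, method_id):
        target = self.cache.get(method_id)
        if target is None:
            # Cold path: pay the one-time resolution cost, then cache the result.
            time.sleep(self.resolve_cost_s)
            self.resolutions += 1
            target = lambda: method_id
            self.cache[method_id] = target
        # Warm path: the cached target is invoked directly.
        return target()


def run_chain(resolver, count):
    """Call `count` distinct methods once each and return the elapsed seconds."""
    start = time.perf_counter()
    for method_id in range(count):
        resolver.call(method_id)
    return time.perf_counter() - start


resolver = LazyResolver()
cold = run_chain(resolver, 200)  # every call hits the cold path
warm = run_chain(resolver, 200)  # every call hits the cache
print(f"cold: {cold * 1000:.1f} ms, warm: {warm * 1000:.1f} ms")
```

In this model the cold run grows linearly with the number of distinct methods, while the warm run stays nearly constant, which matches the shape of the measurements above.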
As a tentative performance improvement, I tried parallelizing, but it does not seem to have any impact: when calling the same code on two threads, the first call takes 2330ms, whereas on the second call both threads take 4ms (four).
Note that the ratio between cold and warm time is *very* different on the latest Apple devices, like the 5S, where the cold duration drops by a factor of 4.
This has a great impact on the perceived performance of the app for the consumer, even though when the app is warmed up, the performance is great.
Thanks for the testcase.
Checked in a fix to mono master 078dc0321d53f9e161957656550fd10cc41db618/mono-3.4.0 0081c27e0d6473a83cc856abf67c4a42dc21b53d.
It improves the first run of the benchmark from 1.1s to 0.4s for me.
Thanks Zoltan, that's quite an improvement :)
Would you know if that also improves the performance in multi-thread scenarios?
It probably does.
I'm asking because of this:
Where there is contention when resolving the generic methods. The work being done inside the lock is pretty significant...
This patch introduced a regression, Mono no longer bootstraps, see:
The changes were reverted from master/3.4.0 for now.
@Jerome: Will look at reducing the work done inside the lock.
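For readers unfamiliar with the idea of reducing work done inside a lock, here is a generic Python sketch of one common pattern: look up under the lock, build the expensive entry outside it, then publish under the lock. This is not what the Mono runtime or the eventual fix does; `ResolutionCache` and `build_entry` are hypothetical names used only for illustration.

```python
import threading


class ResolutionCache:
    """Cache that holds its lock only for lookups and publication."""

    def __init__(self, build):
        self._build = build          # expensive function, assumed side-effect free
        self._lock = threading.Lock()
        self._entries = {}

    def get(self, key):
        # Short critical section: just a dictionary lookup.
        with self._lock:
            entry = self._entries.get(key)
        if entry is not None:
            return entry
        # Expensive work happens outside the lock, so other threads can proceed.
        candidate = self._build(key)
        # Short critical section: publish, keeping the first entry if we lost a race.
        with self._lock:
            return self._entries.setdefault(key, candidate)


def build_entry(key):
    # Hypothetical stand-in for the costly per-method resolution step.
    return ("resolved", key)


cache = ResolutionCache(build_entry)
results = []


def worker():
    results.append([cache.get(key) for key in range(100)])


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

same = all(run[k] is results[0][k] for run in results for k in range(100))
print("all threads share the published entries:", same)
```

The trade-off is that two threads may occasionally build the same entry and one result is discarded, so the build step has to be safe to repeat.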
Committed a corrected fix to mono master ea490c5486af6e1ce6ce8b1a117f1d99cf988df0. It will be in a future mt version after some testing.
The corresponding change on the 3.4.0 branch is 28145e01f42317e685ad1020a47ba746f164c28b.
Using the same PoC, the run time is down from 1150ms to 268ms, same hardware.
Great improvement Zoltan, thanks!
Note that the behavior for multi-thread is vastly better, but still slower than the single-cpu test. (2330ms down to 380ms)
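The timings quoted in this thread can be compared with a few lines of Python. The numbers are the ones reported in the comments (same iPad 2 hardware); the dictionary keys are labels made up here for readability.

```python
# Cold-start timings in milliseconds, as reported in this thread.
timings_ms = {
    "single_before": 1150,
    "single_after": 268,
    "two_threads_before": 2330,
    "two_threads_after": 380,
}


def ratio(numerator, denominator):
    """Return how many times larger one reported timing is than another."""
    return timings_ms[numerator] / timings_ms[denominator]


print(f"single-thread speedup:       {ratio('single_before', 'single_after'):.2f}x")
print(f"two-thread speedup:          {ratio('two_threads_before', 'two_threads_after'):.2f}x")
print(f"two threads vs one (before): {ratio('two_threads_before', 'single_before'):.2f}x")
print(f"two threads vs one (after):  {ratio('two_threads_after', 'single_after'):.2f}x")
```

Before the fix, two threads took roughly twice as long as one, which is consistent with the cold path being serialized. After the fix the two-thread run is about 1.4 times the single-thread run.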
This fix is part of the 7.2.6 release (in the alpha channel right now).
As per comment 9, this issue is now fixed: run time is down from 1150ms to 268ms on the same hardware.
Hence closing this issue.