Notice (2018-05-24): bugzilla.xamarin.com is now in
Please join us on
Visual Studio Developer Community and in the
Mono organizations on
GitHub to continue tracking issues. Bugzilla will remain
available for reference in read-only mode. We will continue to work
on open Bugzilla bugs, copy them to the new locations
as needed for follow-up, and add the new items under Related
Our sincere thanks to everyone who has contributed on this bug
tracker over the years. Thanks also for your understanding as we
make these adjustments and improvements for the future.
Please create a new report on
GitHub or Developer Community with
your current version information, steps to reproduce, and relevant error
messages or log files if you are hitting an issue that looks similar to
this resolved bug and you do not yet see a matching new report.
In commit https://github.com/mono/mono/commit/209c4246adbba994dfcecbc1e93481016b8d9f44 seq points were enabled by default, which resulted in a regression for my specific setup.
The bug is rather weird, as it looks like Mono runtime can stop executing the program at random point by running into 100% CPU usage (infinite loop?) which causes the runtime to be stuck, without any crash or segfault, even after several hours.
The bug is not reproducable with MONO_DEBUG=no-compact-seq-points option set, which disables seq points once again (reverting to old behaviour).
I reproduced the issue not only in latest master, but also in older Mono versions (4.2 tested) with MONO_DEBUG=gen-compact-seq-points enabled, so this bug wasn't introduced in above commit, but much much earlier, perhaps it was even incompatible from the start but I just ran into it with my specific setup.
It's also not that easy to reproduce, as it looks like smaller projects are working 100% correctly, while the bug is hitting only bigger, more complex projects with more classes - I reproduced it by running ILRepack as well as running my own self-compiled xbuild'ed project, but I couldn't reproduce it with my own self-compiled two smaller projects, which seem to be working perfectly fine even in faulty setup.
I suspect this is the result of gcc-specific flag such as stripping debug symbols from output binary, that I'm currently trying to find out. I'll update this issue once I gather more relevant information which could help locate the culprit.
Created attachment 16692 [details]
Reproducable test case
I'm sorry for the format but as noted in the bug, this issue is specific and doesn't happen with every project, so it's easier for me to include something bigger right away in order to have reliable test case.
I was close - it wasn't gcc optimization flag, but in fact Mono one.
I attached reproducable test case that should be working on both new (master) and old (4.2/4.4) Mono versions, just remove unnecessary export if you're running master, as this is default now.
The culprit is combination of abcrem and tailc optimizations - using just one doesn't trigger it, but using both does. Likewise, -O=all also triggers the issue, but default run (without -O) seems to work fine, probably that's why it wasn't caught earlier.
Once again I'm sorry for "format" of the test case, but that was the best I could come up with. If it isn't satisfying, you can try with any bigger C# project, but the bug doesn't trigger if binary doesn't meet some "requirements" - it looks like size issue to me, but maybe it's something more specific, I don't know.
Hope it helps, I did everything I could to pinpoint the issue, rest is upon you. Good luck and thank you in advance!
I can reproduce this on osx with mono master.
Can you take a look at this one.
Hi Lukasz, thanks for the detailed bug report.
Compact sequence points are somewhat problematic because we emit a bunch of temporary sequence points opcodes that break optimizations that are not expecting them on the middle of others opcodes.
Our test suite verifies that the code generated with and without the sequence point opcode has the same size for each method. Running that test suite with -O=all showed no errors. Which means one of two things, either it does not cover the failing optimization, or the partially optimized code compiled with sequence point opcodes has the same size has the correct one.
I will run a script to check for diferences on the JIT compiled assembly between running your application with and without sequence points opcodes, hopefully some methods will diverge in size and we will be able to track the failing optimizations from there.
Hi, and thank you for your response Marcos.
Yes, as I said, this bug is quite hard to reproduce, and I'm not entirely sure why, that's why I attached binary and reproduce script instead of usual testcase in source form. Feel free to modify included reproduce.sh script in order to find differences in the JIT assembly that could help you spot the real culprit, because as I noted, this issue happens only in this specific case of turning on those two Mono optimizations, and not in any other case. I'd be happy to further help, but unfortunately I don't have enough knowledge to even try to guess what might be wrong, as I'm not that experienced in this field - I'd be happy to fix the issue and send PR otherwise.
Regardless, I'm happy that you were able to reproduce the bug, as that's most important. It's not ILRepack-specific bug, as I was able to reproduce it also in other projects, I just considered to attach this one as it was easiest to use for this purpose. You can remove -O parameter to check that Mono is running fine without those two optimizations.
Thank you in advance for taking look into this issue.
Did you manage to reproduce the issue?
The issue can be reproduced as Łukasz suggested.
Alexander Köplinger also discovered a good way of reproducing it, which requires only mcs. By doing:
MONO_PATH=mcs/class/lib/net_4_x/ mono/mini/mono -O=all mcs/class/lib/build/mcs.exe Program.cs
public static void Main(string args)
It only hangs when we use the param string in the Main method.
I changed our test suite to compare all the generated native opcodes, but we also have noise of a few differences caused by non determinism in the JIT compiler.
The pull request is here: https://github.com/mono/mono/pull/3425
I check manually with lldb and monobt the mcs test case and Mono.CSharp.VarianceDecl:CheckTypeVariance is calling Mono.CSharp.TypeSpec:get_TypeArguments in a infinite loop.
A diff on the jitted code with and without OP_IL_SEQ_POINT revealed differences:
The issue seems to be in:
000000000000042f movq %rax, %r13 | 000000000000042f xorl %eax, %eax
@Rodrigo It is quite hard to me to find the JIT compiler logic responsible by that difference. Can someone from the runtime team give me some insight on this?
More detailed gist:
I meant the change before the jmp:
0000000000000426 movq (%rbx,%rax), %r13 | 0000000000000426 movq (%rbx,%rax), %rax
Great investigative job Marcos! That looks like an incompatible change.
in progress: https://github.com/mono/mono/pull/3492
*** Bug 43004 has been marked as a duplicate of this bug. ***
Yes, I can confirm that this issue has been fixed in https://github.com/mono/mono/pull/3492. Big thanks to everybody looking into the problem!