Created attachment 6578 [details]
thread apply all backtrace
After running my program many times, it eventually hangs after ~24 hours. CPU usage during the hang is ~700%, meaning 7 cores are in use on a machine with 32 cores and 126.0 GB of RAM. The mono process is using less than 0.1% of the total available memory. The GDB backtrace (attached) indicates sgen is performing a collection. This is on 64-bit Fedora with mono 3.2.8.
Created attachment 6579 [details]
another gdb backtrace after the hung program is allowed to run for a while longer
Here is another backtrace from all threads, taken after I allowed the program to continue running while it was in this hung state. Note that the line numbers in sgen-nursery-allocator.c have changed, so presumably this is where the CPUs are spinning.
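For anyone who wants to capture comparable traces from a hung process, here is a minimal sketch. It is not necessarily the exact procedure used for the attachments above. It assumes gdb is installed and is permitted to attach to the target process. The function name dump_backtraces, the GDB override variable, and the output file names are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write a backtrace of every thread in a running
# process to a file.
# Usage: dump_backtraces PID OUTFILE
dump_backtraces() {
    pid=$1
    out=$2
    # GDB is an assumed override so a different debugger binary can be used.
    # -p attaches to the process, -batch exits once the commands have run,
    # and -ex executes the given gdb command.
    "${GDB:-gdb}" -p "$pid" -batch -ex "thread apply all backtrace" > "$out" 2>&1
}
```

Taking two dumps some time apart (for example dump_backtraces 12345 bt1.txt, then dump_backtraces 12345 bt2.txt a few minutes later) and diffing the files shows which frames moved between samples, which is the comparison described above.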
So far testing using mono 2.10.9 has produced no hangs after 120 runs. Mono 3.2.8 was hanging after ~80 runs. I will continue testing with mono 2.10.9 to be sure, but so far it seems like this bug does not exist in that version.
I have confirmed that the bug does not exist in mono 2.10.9. My most recent test using mono 3.2.8 resulted in a hang/deadlock after only 9 runs of my program, with each run taking about 10 min. This time the CPU usage is hovering at ~500% rather than ~700%.
Could you provide a test case?
Created attachment 6627 [details]
After I created this test case and compiled with mcs I didn't have the hang issue, but instead it caused a SIGSEGV (after about 25 rounds). The stacktrace indicated sgen was performing a nursery collection. On my machine each round took about 4 minutes so you might have to wait a while to see the SIGSEGV.
Created attachment 6628 [details]
SIGSEGV stack trace from test case
To run the test case download the file SgenHang.zip from the link above and extract.
Compile the program:
mcs -r:MathNet.Numerics.IO.dll -r:MathNet.Numerics.dll Program.cs
Run the program:
mono Program.exe 0.cleaned
Wait for the SIGSEGV. The program will cycle for up to 120 rounds, but the SIGSEGV should happen before that.
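Since the failures in this thread only show up after many repetitions, a small wrapper can automate the reruns. The following is a rough sketch under stated assumptions, not a script from the original report: the function name run_until_failure is hypothetical, and it treats any non-zero exit status as a failure.

```shell
#!/bin/sh
# Hypothetical helper: rerun a command until it fails or MAX_RUNS is reached.
# Usage: run_until_failure MAX_RUNS COMMAND [ARGS...]
run_until_failure() {
    max_runs=$1
    shift
    run=1
    while [ "$run" -le "$max_runs" ]; do
        echo "run $run of $max_runs"
        "$@"
        status=$?
        if [ "$status" -ne 0 ]; then
            # Any non-zero status (including a crash) stops the loop.
            echo "run $run failed with exit status $status"
            return "$status"
        fi
        run=$((run + 1))
    done
    echo "no failure in $max_runs runs"
    return 0
}
```

For example, run_until_failure 120 mono Program.exe 0.cleaned would repeat the test case up to 120 times. A hang will block this loop rather than end it. If GNU coreutils is available, prefixing the command with timeout (for example timeout 30m mono Program.exe 0.cleaned) turns a hang into a non-zero exit status. The 30-minute limit here is an arbitrary illustrative value.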
This bug is confirmed in mono 3.4.0 also. It happened after 72 cycles.
Any update on this bug? It has been confirmed in mono 3.6. It seems to need a machine with a high core count (>8) to reproduce, and it might actually be two different issues (one for the hang and one for the SIGSEGV), both related to garbage collection.
We haven't been able to reproduce it, which limits our ability to fix it.
Until we do, it will remain as is.
We have committed a fix for the hang in mono/d2f66f2d9b4de1d2f79f029b7bec10581084601b
It's not part of 3.6.0; it will catch the next train.
Thanks Rodrigo. To troubleshoot the other issue (SIGSEGV during sgen collection) I can provide access to a VM with 32 cores. Contact me if you are interested in that route.
Eric, as far as I can tell the bug goes away if you use the environment variable
Please use that as a workaround for now. We're looking into steps to fix the bug.
Eric, can you confirm that this issue is fixed on mono master?
Yes, this looks to be fixed in mono master. No crashes/hangs after 120 runs of the test case (took about 6 hours on a 40 core machine). Thanks Mark!