The test "bug-18026.exe" sometimes fails with an assertion in hazard-pointer.c.
This might be a known limitation of the current implementation of the runtime:
/*
 * If this assert fails we don't have enough overflow slots.
 * We should contemplate adding them dynamically. If we can
 * make mono_thread_small_id_alloc() lock-free we can just
 * allocate them on-demand.
 */
g_assert (small_id < HAZARD_TABLE_OVERFLOW);
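To make the failure mode concrete, here is a minimal sketch of a fixed-size pool of overflow slots claimed with a compare-and-swap. It is not the runtime's code: the names OVERFLOW_SLOTS, overflow_busy, overflow_slot_claim and overflow_slot_release are hypothetical, and the slot count of 4 is an assumption, since the real value of HAZARD_TABLE_OVERFLOW is not given in this report. Where this sketch returns -1, the runtime would hit the assertion quoted above.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical slot count; the real HAZARD_TABLE_OVERFLOW value is an unknown here. */
#define OVERFLOW_SLOTS 4

/* One busy flag per overflow slot; static storage starts out as all zeros (free). */
static atomic_int overflow_busy[OVERFLOW_SLOTS];

/* Claim a free slot and return its index, or -1 if every slot is busy.
 * The -1 case corresponds to the point where the quoted assertion fires. */
int overflow_slot_claim(void)
{
    for (;;) {
        int id;

        /* Scan for the first slot that currently looks free. */
        for (id = 0; id < OVERFLOW_SLOTS; ++id)
            if (!atomic_load(&overflow_busy[id]))
                break;

        if (id == OVERFLOW_SLOTS)
            return -1; /* pool exhausted */

        /* Try to take the slot; another thread may have won the race. */
        int expected = 0;
        if (atomic_compare_exchange_strong(&overflow_busy[id], &expected, 1))
            return id;
        /* Lost the race: rescan from the start. */
    }
}

/* Give a previously claimed slot back to the pool. */
void overflow_slot_release(int id)
{
    assert(id >= 0 && id < OVERFLOW_SLOTS);
    atomic_store(&overflow_busy[id], 0);
}
```

With a fixed pool like this, the assertion depends only on how many slots are held at the same moment, which would be consistent with the failure showing up mainly on machines with many cores.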
I've been unable to trigger this assertion locally, but it happens quite frequently (more than 50% of the time) when running the tests on TravisCI.
See https://s3.amazonaws.com/archive.travis-ci.org/jobs/23540160/log.txt and https://s3.amazonaws.com/archive.travis-ci.org/jobs/23540161/log.txt for traces of the failure.
Please provide a full backtrace of the crash.
You can probably get that by installing gdb in your test machine.
As I already said, I cannot reproduce the issue locally, hence it is not very easy to get a full backtrace.
Nonetheless, a backtrace of the thread that triggered the assertion is available here: https://s3.amazonaws.com/archive.travis-ci.org/jobs/23540160/log.txt
(see the last few lines).
Such a partial backtrace is, unfortunately, of no use.
Is this any better?
Uhm... No, I guess it stopped in the wrong place (and I should ignore SIGPWR, it seems to be used for coordinating the GC).
Yes, mono uses signals for a lot of things.
Quick question: in what environment did you get this backtrace?
I could not reproduce it with a 64-bit build on a 4-core VM.
The TravisCI environment seems to be a 64-bit VM on a 32-core system (I don't know how many cores actually exist in the physical machine).
I'm running another test with gdb set to nostop on SIGPWR and SIGXCPU.
It looks like attaching the debugger changes the timings enough to prevent the assertion and hide the bug again: https://s3.amazonaws.com/archive.travis-ci.org/jobs/23618000/log.txt
Don't run the tests under gdb.
Just make sure it's installed and on the PATH. When a crash happens, mono will ask it to attach and dump a backtrace.
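For readers unfamiliar with this approach, the general pattern of a process asking gdb to attach to itself can be sketched as below. This is an illustration of the fork-and-exec technique only, not Mono's actual implementation; format_pid and dump_backtrace_with_gdb are hypothetical names. It relies on gdb's -batch, -p and -ex options and on gdb being found through the PATH.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write pid as a decimal string into buf. Returns 0 on success, -1 if it does not fit. */
int format_pid(pid_t pid, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    int n = snprintf(buf, len, "%ld", (long)pid);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Fork a child that runs gdb against this process and prints a backtrace
 * of every thread. Returns gdb's exit status, or -1 on failure. */
int dump_backtrace_with_gdb(void)
{
    char pidbuf[32];
    if (format_pid(getpid(), pidbuf, sizeof pidbuf) != 0)
        return -1;

    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* execlp searches the PATH, which is why gdb only has to be installed there. */
        execlp("gdb", "gdb", "-batch", "-p", pidbuf,
               "-ex", "thread apply all bt", (char *)NULL);
        _exit(127); /* exec failed, e.g. gdb is not installed */
    }

    /* The parent stays alive while gdb inspects it. */
    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because gdb attaches only after the process has already failed, it does not perturb the timing of the test run itself. On systems that restrict ptrace, the attach may be refused.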
That worked! Thanks.
The trace is available at the end of this log https://s3.amazonaws.com/archive.travis-ci.org/jobs/23656453/log.txt
The bug here is that something else is not cleaning up its hazard pointers well enough. We need to investigate it.
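As a rough illustration of what "cleaning up" means here, the sketch below models one thread's hazard pointers as a small array that must be cleared after use. It is a single-threaded model with hypothetical names (HP_COUNT, ThreadHazardPointers, hp_set, hp_clear, hp_any_set), not the runtime's data structure, and it omits the memory barriers a real implementation needs. The assumption being illustrated is that a thread with any pointer still set needs extra bookkeeping (such as an overflow slot), so a leaked pointer keeps that cost alive indefinitely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical number of hazard pointers per thread. */
#define HP_COUNT 3

typedef struct {
    void *slots[HP_COUNT];
} ThreadHazardPointers;

/* Publish p as protected: other threads must not free it while it is set. */
void hp_set(ThreadHazardPointers *hp, int i, void *p)
{
    assert(i >= 0 && i < HP_COUNT);
    hp->slots[i] = p;
}

/* Withdraw the protection once the pointer is no longer in use. */
void hp_clear(ThreadHazardPointers *hp, int i)
{
    assert(i >= 0 && i < HP_COUNT);
    hp->slots[i] = NULL;
}

/* True while the thread still publishes at least one hazard pointer. */
bool hp_any_set(const ThreadHazardPointers *hp)
{
    for (int i = 0; i < HP_COUNT; ++i)
        if (hp->slots[i] != NULL)
            return true;
    return false;
}
```

A code path that calls hp_set without a matching hp_clear leaves hp_any_set returning true for the rest of the thread's life, which is the kind of leak the comment above suspects.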
I cannot reproduce it anymore on current master.
I believe it was fixed by 2285b4c10f3ed1bfabff391b6c5a7324067e51e7 or a9b07ba04ba870e2a681f7ec8ea253eabd1a15b5.
I left the test running overnight and it did not crash.
It was crashing in less than 5 minutes before.
I'm actually seeing this when trying to run an xUnit Android runner application in Jenkins. As above, the issue does not appear when a debugger is attached.
Can you provide a test case that shows the issue?