Bug 13604 - Mono 3.2.0 Process Crashes on sgen-os-posix info handshake not met
Summary: Mono 3.2.0 Process Crashes on sgen-os-posix info handshake not met
Status: RESOLVED FIXED
Alias: None
Product: Runtime
Classification: Mono
Component: General ()
Version: unspecified
Hardware: All Linux
Importance: --- normal
Target Milestone: ---
Assignee: Mark Probst
URL:
Depends on:
Blocks:
 
Reported: 2013-07-30 09:16 UTC by Thomas Holloway
Modified: 2013-08-12 12:51 UTC (History)

Tags:
Is this bug a regression?: ---
Last known good build:

Description Thomas Holloway 2013-07-30 09:16:49 UTC
I was previously running Mono 2.10.9 on CentOS, and I have a number of worker processes that continuously run long-running background work. They typically create a Task to report the current worker status to our database so we can keep track of processing speeds, etc. It's hard to tell exactly how or when the process crashes, but I am using log4net to automatically append errors to the output. Any time I see the following:

* Assertion at sgen-os-posix.c:60, condition `info->doing_handshake' not met

I can see that the process crashed. This didn't happen in 2.10.9; it only started happening when I upgraded to Mono 3.2. Can anyone point me in the right direction? I'm not sure where to start on this one :/

I manually removed the assertion from the source and rebuilt to see if I could get around this, but that didn't seem to work. The process didn't crash, but it looks like my thread is either not working, gone, or stuck in a permanently suspended state. It's a really tough problem for me because I don't have any kind of stack trace to work from :|
Comment 1 Thomas Holloway 2013-07-30 09:19:36 UTC
Just as a point of reference, I'm using ServiceStack.Redis's blocking pop mechanism in one of my threads. At some point it just stops processing items off of that blocking pop and appears to stop working. Again, there is no stack trace, just the above error message before the process dies.
Comment 2 Rodrigo Kumpera 2013-08-08 17:01:31 UTC
Could you post a backtrace of the crasher and a test case for it?
Comment 3 Thomas Holloway 2013-08-08 17:11:49 UTC
I will try to re-run this scenario with a core dump if possible. The stack trace doesn't usually show, and I'm not quite sure what I need to run or configure Mono with to ensure that it displays the information you need. It doesn't happen often either, so.. there's that.
Comment 4 Rodrigo Kumpera 2013-08-08 18:00:14 UTC
Make sure gdb is installed.

As an alternative, if nothing shows up, run with MONO_DEBUG=suspend-on-sigsegv
This will cause mono to hang instead of crashing. Then attach gdb and get a full thread dump with "t a a bt" (short for "thread apply all bt").
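
A minimal Python sketch of the two steps above, offered as an illustration rather than an official procedure. It only builds the environment and the gdb command line; it does not run anything. The option name is written as suspend-on-sigsegv, the spelling used in the mono(1) man page. The helper names, the program name worker.exe, and the PID 12345 are hypothetical. The gdb flags used (-p, -batch, -ex) are standard gdb command-line options.

```python
import os
import shlex


def mono_env(base_env=None):
    """Return a copy of the environment with MONO_DEBUG set so that
    mono suspends on a native SIGSEGV instead of exiting."""
    env = dict(os.environ if base_env is None else base_env)
    env["MONO_DEBUG"] = "suspend-on-sigsegv"
    return env


def gdb_thread_dump_argv(pid):
    """Build a gdb command line that attaches to `pid`, prints a backtrace
    for every thread, and exits. "t a a bt" typed at the gdb prompt is
    shorthand for the same "thread apply all bt" command."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


if __name__ == "__main__":
    # worker.exe and 12345 are placeholders for the real program and PID.
    print("MONO_DEBUG=" + mono_env({})["MONO_DEBUG"] + " mono worker.exe")
    print(shlex.join(gdb_thread_dump_argv(12345)))
```

Running gdb in batch mode like this makes it easy to redirect the full thread dump to a file and attach it to the bug report.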
Comment 6 Rodrigo Kumpera 2013-08-09 11:30:43 UTC
Mark,

Can you debug this crasher? It can be reproduced on amd64/linux with the sgen test suite, according to an email on monodev.
Comment 7 Mark Probst 2013-08-09 16:25:34 UTC
This should be fixed on master.  Please check.
Comment 8 Thomas Holloway 2013-08-11 17:11:59 UTC
I no longer see the issue. This patch appears to have worked for me. Thanks!
Comment 9 Charles Randall 2013-08-12 12:51:06 UTC
Working with Kumpera on the monodev IRC, I found that this problem was repeatable with the
test bug-10127.cs from mono/tests in the distribution.

There was a concern that the test itself was broken, and this was the suggested fix:

http://sprunge.us/ISQb

With this new test and this patch to fix the defect:

https://github.com/mono/mono/commit/2c45af25e2a027d749feef771a83a3c9c731f4aa

I am able to run this test hundreds of times without running into this particular failure.
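
For an intermittent failure like this one, a small loop that reruns the test and stops at the first non-zero exit can help confirm a fix. The following is a rough sketch of that idea, not the harness used in mono/tests; the function name run_repeatedly is hypothetical, and the command to run is left to the caller.

```python
import subprocess
import sys


def run_repeatedly(argv, runs):
    """Run `argv` up to `runs` times.

    Returns the 1-based index of the first run that exits with a non-zero
    status (a process killed by a signal, such as an abort from a failed
    assertion, also counts), or None if every run succeeds.
    """
    for i in range(1, runs + 1):
        result = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            return i
    return None


if __name__ == "__main__":
    # Self-check using the Python interpreter itself as the command.
    ok = run_repeatedly([sys.executable, "-c", "pass"], 3)
    print("no failures" if ok is None else f"first failure on run {ok}")
```

Assuming the test has been compiled to bug-10127.exe, a call such as run_repeatedly(["mono", "bug-10127.exe"], 300) would report the first failing run, if any.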