Created attachment 5186: actor.c and actor.cs
I can reproduce this on 2.10 and master.
I'm delaying adding a complete test case until I know it will actually be useful.
The test environment is JRuby running Akka. C function calls go through the JRuby FFI.
I am calling into Mono from various threads in Java. The application uses Akka, so these are Java threads running under the fork/join framework, which is where this originally became an issue. I tried various thread dispatchers in Akka, including pinned dispatchers, but that didn't make a difference. I then tried running the code under Java thread pools and executors I created myself, and it still failed. Interestingly enough, the only multi-threaded version that works is when I create a group of threads manually, with one Mono object being accessed from each thread. The other version that works is when there is just one thread and one Mono object.
Lowering the concurrency makes it take longer to trigger. It usually takes at least a few thousand calls before I get a segfault.
I've used several techniques for doing this while trying to narrow down the issue. I started with calling into C from java and getting a gchandle, then calling methods on the object the gchandle references. Then I tried not using gchandles, but sticking the mono object into an array in C# so it wouldn't get collected. Same errors. Then I moved to just calling mono via static methods and letting mono handle the object instantiation. All of these variations produce the same errors.
So on to the code. Forgive how messy it is, I've gone through so many iterations trying to track this down that it's pretty bad.
The ReceiveMessage static method in actor.cs is the latest test code I'm using. The first argument to ReceiveMessage is a string containing the Java thread id concatenated with the actor name. This is how I am pinning Java threads to Mono object instances. The calling code attaches the thread to Mono on every call. I've tried being strict about calling attach just once, and being more liberal; it doesn't seem to make a difference.
The C method that calls this is on_receive2 in actor.c.
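For readers without the attachment, the key used for pinning can be pictured roughly as follows. This is only a sketch and not the code from actor.c: the helper name make_actor_key and the exact key format (decimal thread id followed directly by the actor name, no separator) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from actor.c): writes "<thread id><actor name>"
 * into out. Returns 0 on success, -1 if the result does not fit. */
int make_actor_key(long thread_id, const char *actor_name,
                   char *out, size_t out_size)
{
    assert(actor_name != NULL && out != NULL);

    /* snprintf returns the length it wanted to write, so a value
     * >= out_size means the key was truncated. */
    int written = snprintf(out, out_size, "%ld%s", thread_id, actor_name);
    if (written < 0 || (size_t)written >= out_size)
        return -1;
    return 0;
}
```

With a key like this, every call from the same Java thread for the same actor name maps to the same string, which is what lets the managed side look up one object instance per thread.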
Most often I get a segfault when the MemoryStream is reading bytes. I see the following in the stack trace:
Thread 5 (Thread 0x7fdd3adfc700 (LWP 32405)):
#0 0x00007fddc48cb4b7 in __libc_waitpid (pid=pid@entry=32409, stat_loc=stat_loc@entry=0x7fdd3adf9cec, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:40
#1 0x00007fddabb1667c in mono_handle_native_sigsegv (signal=signal@entry=11, ctx=ctx@entry=0x7fdd3adfa780) at mini-exceptions.c:2377
#2 0x00007fddaba80e87 in mono_sigsegv_signal_handler (_dummy=11, info=0x7fdd3adfa8b0, context=0x7fdd3adfa780) at mini.c:6640
#3 0x00007fddc3c0b167 in call_chained_handler (context=0x7fdd3adfa780, siginfo=0x7fdd3adfa8b0, sig=11, actp=<optimized out>) at /build/buildd/openjdk-7-7u25-2.3.12/build/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:3791
#4 os::Linux::chained_handler (sig=sig@entry=11, siginfo=siginfo@entry=0x7fdd3adfa8b0, context=context@entry=0x7fdd3adfa780) at /build/buildd/openjdk-7-7u25-2.3.12/build/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:3809
#5 0x00007fddc3c0e807 in JVM_handle_linux_signal (sig=11, info=0x7fdd3adfa8b0, ucVoid=0x7fdd3adfa780, abort_if_unrecognized=<optimized out>) at /build/buildd/openjdk-7-7u25-2.3.12/build/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:508
#6 <signal handler called>
#7 mono_array_get_byte_length (array=0x7fddb001e550) at icall.c:6115
#8 ves_icall_System_Buffer_BlockCopyInternal (src=0x7fddb001e550, src_offset=<optimized out>, dest=<optimized out>, dest_offset=<optimized out>, count=<optimized out>) at icall.c:6186
However, I think this is just because it's the hot path. The underlying issue looks like memory getting stomped on somehow. In some tests I saw the call to Buffer.ByteLength (bytes) succeed, but then a few lines later it segfaulted with the above stack trace while calling the same method via protocol buffer decoding, which seems to imply the memory changed in between. Other times it fails on Buffer.ByteLength (bytes) at the top of ReceiveMessage, with the same stack trace.
This is the most common top part of the stacktrace, it's what I see 90% of the time:
at <unknown> <0xffffffff>
at (wrapper managed-to-native) System.Buffer.BlockCopyInternal (System.Array,int,System.Array,int,int) <0xffffffff>
at System.Buffer.BlockCopy (System.Array,int,System.Array,int,int) <0x0006b>
at ProtoBuf.Helpers.BlockCopy (byte[],int,byte[],int,int) <0x00023>
at ProtoBuf.BufferPool.ResizeAndFlushLeft (byte[]&,int,int,int) <0x0006b>
at ProtoBuf.ProtoReader.Ensure (int,bool) <0x0007b>
at ProtoBuf.ProtoReader.TryReadUInt32VariantWithoutMoving (bool,uint&) <0x00043>
at ProtoBuf.ProtoReader.ReadUInt32Variant (bool) <0x0002b>
at ProtoBuf.ProtoReader.ReadString () <0x00037>
at (wrapper dynamic-method) com.game_machine.entity_system.generated.Entity.proto_2 (object,ProtoBuf.ProtoReader) <0x00696>
at ProtoBuf.Serializers.CompiledSerializer.ProtoBuf.Serializers.IProtoSerializer.Read (object,ProtoBuf.ProtoReader) <0x0003f>
at ProtoBuf.Meta.RuntimeTypeModel.Deserialize (int,object,ProtoBuf.ProtoReader) <0x00150>
at ProtoBuf.Meta.TypeModel.DeserializeCore (ProtoBuf.ProtoReader,System.Type,object,bool) <0x00064>
at ProtoBuf.Meta.TypeModel.Deserialize (System.IO.Stream,object,System.Type,ProtoBuf.SerializationContext) <0x0009b>
at ProtoBuf.Meta.TypeModel.Deserialize (System.IO.Stream,object,System.Type) <0x0001f>
at ProtoBuf.Serializer.Deserialize<T> (System.IO.Stream) <0x00043>
at GameMachine.Actor.ByteArrayToEntity (byte[]) <0x00047>
at GameMachine.TestActor.OnReceive (object) <0x0006b>
at GameMachine.Actor.ReceiveMessage (string,string,string,byte[]) <0x000a5>
at (wrapper runtime-invoke) <Module>.runtime_invoke_void_object_object_object_object (object,intptr,intptr,intptr) <0xffffffff>
Right now I'm looking for what's most useful to submit to track this down. It's going to take some time to extract everything out into a clean test case, so I want to make sure if I do that, it's going to be useful information.
Hmm I think this has something to do with how I am creating the mono byte array in C.
If at the top of ReceiveMessage I have this it fails, just takes a while.
If I remove the byte[] argument to ReceiveMessage and don't pass the byte array, and have the following at the start of the method, it works.
byte[] b1 = System.Text.Encoding.UTF8.GetBytes ("TEST^&%$#");
No, it's not how I'm setting the array values. I get the same errors with just a newly created MonoArray with no values set.
Keep failing here at line 6115 in icall.c:
klass = array->obj.vtable->klass;
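To show why this particular line tends to be the first place a damaged or stale array pointer blows up, here is a simplified model. The Mock* types are stand-ins invented for this sketch and are not Mono's real headers; they only mimic the idea that the array starts with an object header whose vtable pointer leads to the class, and the byte-length calculation is reduced to length times element size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins, NOT Mono's actual definitions. */
typedef struct MockClass  { int32_t element_size; } MockClass;
typedef struct MockVTable { MockClass *klass; } MockVTable;
typedef struct MockObject { MockVTable *vtable; void *sync; } MockObject;
typedef struct MockArray {
    MockObject obj;        /* object header comes first */
    uint32_t   max_length; /* number of elements */
} MockArray;

/* Same shape as the failing line: two dependent pointer loads
 * (array -> vtable -> klass) happen before anything else is read.
 * If the header was overwritten or the object was moved or freed,
 * the fault shows up here. */
int32_t mock_array_get_byte_length(const MockArray *array)
{
    assert(array != NULL);
    MockClass *klass = array->obj.vtable->klass;
    return (int32_t)array->max_length * klass->element_size;
}

/* Illustration only: a NULL check catches a zeroed header, but it
 * cannot detect a dangling pointer that still looks non-NULL. */
int32_t mock_array_get_byte_length_checked(const MockArray *array)
{
    if (array == NULL || array->obj.vtable == NULL ||
        array->obj.vtable->klass == NULL)
        return -1;
    return mock_array_get_byte_length(array);
}
```

In this model a crash on the klass load says little about the caller that happened to touch the array. It only says the header was no longer valid by then, which fits the observation that the same array can pass one Buffer.ByteLength call and fail a later one.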
Can you provide a test case that doesn't require a JVM?
Can you still reproduce with latest mono? If that's the case, feel free to reopen, and please provide a repro case. Thank you.