Bug 10127 - WeakReference.Target can be garbage with SGEN?
Summary: WeakReference.Target can be garbage with SGEN?
Status: RESOLVED FIXED
Alias: None
Product: Runtime
Classification: Mono
Component: GC ()
Version: unspecified
Hardware: PC Mac OS
Importance: --- normal
Target Milestone: ---
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2013-02-07 11:59 UTC by Roope Kangas
Modified: 2013-02-14 17:47 UTC (History)

Tags:
Is this bug a regression?: ---
Last known good build:


Attachments
Example program to reproduce the problem (2.80 KB, application/octet-stream)
2013-02-13 07:34 UTC, Roope Kangas
Crash on OSX (31.90 KB, text/plain)
2013-02-13 07:34 UTC, Roope Kangas



Description Roope Kangas 2013-02-07 11:59:27 UTC
Hi!

We had a piece of code that cached references to arrays in a large table (65k) of WeakReferences.
When load testing our application (a game server), we noticed that it crashed quite consistently with something like this:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffb4a7b700 (LWP 24879)]
mono_class_is_assignable_from (klass="FList`1", oklass=0x7fff3c0858) at class.c:7301
7301		if (!oklass->inited)
(gdb) backtrace
#0  mono_class_is_assignable_from (klass="FList`1", oklass=0x7fff3c0858) at class.c:7301
#1  0x0000000000565fad in mono_object_isinst (obj=0x7ffff0c1d850, klass="FList`1") at object.c:5133
#2  0x0000000040021a4e in (wrapper managed-to-native) object:__icall_wrapper_mono_object_isinst (param0=<type 'exceptions.RuntimeError'>
Cannot access memory at address 0x7fff3c08a0
<type 'exceptions.RuntimeError'>
Cannot access memory at address 0x7fff3c08a0
140737232623696, param1=25658304) at xdb.il:836
#3  0x00000000401f1979 in Interner:Probe<T> (hc=0) at /var/lib/jenkins/jobs/Load test turf-server/workspace/MobileProto/Assets/Scripts/Shared/Utils/Cache.cs:229
#4  0x0000000040341813 in FList`1<IntVec3>:From (ie=...) at /var/lib/jenkins/jobs/Load test turf-server/workspace/MobileProto/Assets/Scripts/Shared/Utils/Arrays.cs:219
#5  0x00000000406c59b0 in FDict`2<IntVec3, FChunkData>:DiffKeys (this=..., another=...) at /var/lib/jenkins/jobs/Load test turf-server/workspace/MobileProto/Assets/Scripts/Shared/Utils/Collections.cs:90
#6  0x000000004053ebc1 in TurfEngine:RunBlockPhys (this=...) at /var/lib/jenkins/jobs/Load test turf-server/workspace/MobileProto/Assets/Scripts/Shared/TurfEngine.cs:2079
#7  0x000000004053e792 in ServerTurfEngine:BlockPhysUpdate (this=...) at /var/lib/jenkins/jobs/Load test turf-server/workspace/turf-server/source/ServerTurfEngine.cs:458
#8  0x00000000403940a2 in ServerTurfEngine:GameTick (this=...) at /var/lib/jenkins/jobs/Load test turf-server/workspace/turf-server/source/ServerTurfEngine.cs:395
#9  0x0000000040392f59 in ServerTurfEngine:<Start>m__C7 (this=...) at /var/lib/jenkins/jobs/Load test turf-server/workspace/turf-server/source/ServerTurfEngine.cs:244
#10 0x0000000040391d4c in GameThreadWorker:Tick (this=...) at /var/lib/jenkins/jobs/Load test turf-server/workspace/turf-server/source/GameThreadPool.cs:252
#11 0x00000000402c6368 in GameThreadWorker:Work (this=...) at /var/lib/jenkins/jobs/Load test turf-server/workspace/turf-server/source/GameThreadPool.cs:216
#12 0x00000000402c5fe4 in GameThreadWorker:<Start>m__92 (this=..., _=<value optimized out>)
#13 0x0000000040106bea in System.Threading.Thread:StartInternal (this=...) at /tmp/mono-2.10.9/mcs/class/corlib/System.Threading/Thread.cs:705
#14 0x00000000400247cb in (wrapper runtime-invoke) object:runtime_invoke_void__this__ (param0=System.Threading.ThreadStart = {...}, param1=<value optimized out>, param2=0, param3=1074813055) at xdb.il:893
#15 0x000000000041fb78 in mono_jit_runtime_invoke (method="System.Threading.ThreadStart:Invoke ()", obj=0x7ffff0ffc738, params=0x7fffb4a7ad90, exc=0x0) at mini.c:5791
#16 0x00000000005613eb in mono_runtime_invoke (method="System.Threading.ThreadStart:Invoke ()", obj=0x7ffff0ffc738, params=0x7fffb4a7ad90, exc=0x0) at object.c:2755
#17 0x00000000005b3619 in start_wrapper_internal (data=0x7fff9c0032e0) at threads.c:790
#18 start_wrapper (data=0x7fff9c0032e0) at threads.c:832
#19 0x00000000005dd962 in thread_start_routine (args=0xa025e0) at wthreads.c:287
#20 0x000000000058a52d in gc_start_thread (arg=0x7fff9c029bd0) at sgen-gc.c:6154
#21 0x00007ffff7537851 in start_thread (arg=0x7fffb4a7b700) at pthread_create.c:301
#22 0x00007ffff728511d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

The code in question just casts the Target of a WeakReference with the "as" operator. After looking at the related Mono sources, it seems that .Target may be returning pointers to memory that has already been garbage collected.

Removing this WeakReference based caching made our app a lot more stable.
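
For illustration, the access pattern described above can be sketched roughly as follows. This is a minimal sketch, not the original Cache.cs code: the class name WeakCache, the Slot helper, and the table layout are hypothetical. Only the use of WeakReference.Target together with the "as" cast comes from the report.

```csharp
using System;

// Hypothetical stand-in for the cache described in the report.
public sealed class WeakCache<T> where T : class
{
    // Fixed-size table of weak references (the report mentions a 65k table).
    private readonly WeakReference[] table;

    public WeakCache(int size)
    {
        table = new WeakReference[size];
    }

    // Maps a hash code to a table index (hypothetical helper).
    private int Slot(int hashCode)
    {
        return (hashCode & 0x7fffffff) % table.Length;
    }

    // Returns the cached object, or null if the slot is empty,
    // the referent has been collected, or it is not a T.
    public T Probe(int hashCode)
    {
        WeakReference wr = table[Slot(hashCode)];
        if (wr == null)
            return null;
        // The cast from the report: Target is expected to be either
        // a live object or null, never a dangling pointer.
        return wr.Target as T;
    }

    public void Store(int hashCode, T value)
    {
        table[Slot(hashCode)] = new WeakReference(value);
    }
}
```

With a correct collector, Probe can only ever return a live object or null. The crash in the backtrace occurs inside the type check behind the "as" cast, which is consistent with Target handing back an invalid object.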
Comment 1 Mark Probst 2013-02-12 14:52:26 UTC
Can you provide a test case?
Comment 2 Roope Kangas 2013-02-13 03:46:49 UTC
At the moment this has only happened in our game server, which is not open source. I will try to create a test case that simulates the behavior.
Comment 3 Roope Kangas 2013-02-13 07:34:33 UTC
Created attachment 3362
Example program to reproduce the problem

This is quite contrived but...

It uses a WeakReference-based cache from multiple threads. Only one thread at a time is given access to the cache's internal table, which avoids threading issues at the cost of performance in this example.

To avoid having to run the tester for a _long_ period of time, each thread has a 10% chance of calling GC.Collect(), which makes the GC trigger more often than normal.

Run the program for a while and you will get a SIGSEGV at __icall_wrapper_mono_object_isinst.
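
The attachment itself is not reproduced here. The following is only a rough sketch of a stress test with the shape described above, under stated assumptions: the names WeakCacheStress, Worker, and Run, the int[16] payload, and the bounded iteration count are all hypothetical. The lock around the table and the 10% chance of calling GC.Collect() follow the description.

```csharp
using System;
using System.Threading;

// Hypothetical stress test; not the attached program.
public static class WeakCacheStress
{
    private const int TableSize = 65536;
    private static readonly WeakReference[] Table = new WeakReference[TableSize];
    private static readonly object TableLock = new object();

    // Runs one worker loop and returns the number of cache hits.
    public static int Worker(int seed, int iterations)
    {
        var rng = new Random(seed);
        int hits = 0;
        for (int i = 0; i < iterations; i++)
        {
            int slot = rng.Next(TableSize);
            // Only one thread at a time touches the table.
            lock (TableLock)
            {
                WeakReference wr = Table[slot];
                int[] cached = wr == null ? null : wr.Target as int[];
                if (cached != null)
                    hits++;
                else
                    Table[slot] = new WeakReference(new int[16]);
            }
            // Roughly a 10% chance per iteration of forcing a collection.
            if (rng.Next(10) == 0)
                GC.Collect();
        }
        return hits;
    }

    // Starts the workers and waits for all of them to finish.
    public static void Run(int threadCount, int iterations)
    {
        var threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            int seed = t; // copy for the closure
            threads[t] = new Thread(() => Worker(seed, iterations));
            threads[t].Start();
        }
        foreach (Thread th in threads)
            th.Join();
    }
}
```

Calling something like WeakCacheStress.Run(8, 1000000) under SGen would exercise the same combination of locking, weak references, and frequent collections. Whether this particular sketch triggers the crash has not been verified.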
Comment 4 Roope Kangas 2013-02-13 07:34:59 UTC
Created attachment 3363
Crash on OSX
Comment 5 Roope Kangas 2013-02-13 07:35:46 UTC
Note: I wrote and tested the example program on OSX against Mono 2.10.9.

Mono JIT compiler version 2.10.9 (tarball Mon May  7 20:25:51 EDT 2012)
Copyright (C) 2002-2011 Novell, Inc, Xamarin, Inc and Contributors. www.mono-project.com
	TLS:           normal
	SIGSEGV:       normal
	Notification:  kqueue
	Architecture:  x86
	Disabled:      none
	Misc:          debugger softdebug 
	LLVM:          yes(2.9svn-mono)
	GC:            Included Boehm (with typed GC)
Comment 6 Roope Kangas 2013-02-13 07:37:02 UTC
The program seems to crash only with SGEN.
Comment 7 Mark Probst 2013-02-13 17:09:45 UTC
I can reproduce the crash. Interestingly, I can only reproduce it when the locking for the table is in there. I removed all the counting stuff, so locking isn't necessary for correctness anymore, but without the locking it doesn't crash.
Comment 8 Mark Probst 2013-02-14 17:47:36 UTC
Fixed in mono-2-10 and master.

Thank you!