Bug 2158 - [GTK] crash in find_window_for_ns_event
Summary: [GTK] crash in find_window_for_ns_event
Status: RESOLVED FIXED
Alias: None
Product: Xamarin Studio
Classification: Desktop
Component: General ()
Version: Trunk
Hardware: PC Mac OS
: Highest critical
Target Milestone: ---
Assignee: Bugzilla
URL:
: 4199 ()
Depends on:
Blocks:
 
Reported: 2011-11-22 13:37 UTC by Mikayla Hutchinson [MSFT]
Modified: 2013-05-29 15:11 UTC (History)

Tags: gtk
Is this bug a regression?: ---
Last known good build:


Attachments
Patch for debugging this issue (1.28 KB, patch)
2012-02-17 15:16 UTC, Kristian Rietveld (inactive)
Patch for debugging this issue (has a non-useful g_warning removed) (1.23 KB, patch)
2012-02-17 15:17 UTC, Kristian Rietveld (inactive)
Almost the same patch, but don't return uninitialized memory (1.30 KB, patch)
2012-04-10 11:41 UTC, Michael Natterer
testtext torture patch (2.12 KB, patch)
2012-07-09 19:54 UTC, Kristian Rietveld (inactive)
Patch that guards for return a combination of a window and coordinates out of window bounds (3.50 KB, patch)
2012-07-09 20:00 UTC, Kristian Rietveld (inactive)
Updated patch USE WITH CAUTION (4.83 KB, patch)
2012-10-08 07:26 UTC, Kristian Rietveld (inactive)
Patch that doesn't break press/release pairing (4.77 KB, patch)
2012-10-09 10:50 UTC, Michael Natterer



Description Mikayla Hutchinson [MSFT] 2011-11-22 13:37:13 UTC
When triggering the GTK context menu in a GtkEntry near the left edge of the screen, I saw the following crash. This was with the tip of the GTK+ 2.24 branch.

Native stacktrace:

	0   mono                                0x000d4e22 mono_handle_native_sigsegv + 376
	1   mono                                0x0000f0e4 mono_sigsegv_signal_handler + 322
	2   libsystem_c.dylib                   0x93ccc59b _sigtramp + 43
	3   ???                                 0xffffffff 0x0 + 4294967295
	4   libgdk-quartz-2.0.0.dylib           0x04b7941d find_window_for_ns_event + 503
	5   libgdk-quartz-2.0.0.dylib           0x04b7a261 gdk_event_translate + 543
	6   libgdk-quartz-2.0.0.dylib           0x04b7a9a2 _gdk_events_queue + 117
	7   libgdk-quartz-2.0.0.dylib           0x04b7b963 gdk_event_dispatch + 57
	8   libglib-2.0.0.dylib                 0x043b9962 g_main_dispatch + 489
	9   libglib-2.0.0.dylib                 0x043baf30 g_main_context_dispatch + 138
	10  libglib-2.0.0.dylib                 0x043bb444 g_main_context_iterate + 1193
	11  libglib-2.0.0.dylib                 0x043bbc3e g_main_loop_run + 956
	12  libgtk-quartz-2.0.0.dylib           0x045c6f13 gtk_dialog_run + 542
	13  ???                                 0x16143464 0x0 + 370422884
	14  ???                                 0x16143420 0x0 + 370422816
	15  ???                                 0x16143294 0x0 + 370422420
	16  ???                                 0x16142a78 0x0 + 370420344
	17  ???                                 0x11ce3d38 0x0 + 298728760
	18  ???                                 0x11ce3ca0 0x0 + 298728608
	19  ???                                 0x11ce3c74 0x0 + 298728564
	20  ???                                 0x11ce3c4c 0x0 + 298728524
	21  ???                                 0x11ce3c2f 0x0 + 298728495
	22  ???                                 0x11ce3bcc 0x0 + 298728396
	23  ???                                 0x0a8d449c 0x0 + 177030300
	24  ???                                 0x0a8f26ec 0x0 + 177153772
	25  ???                                 0x11ce3ad8 0x0 + 298728152
	26  ???                                 0x0dab23c6 0x0 + 229319622
	27  ???                                 0x0bdd0d89 0x0 + 199036297
	28  ???                                 0x0bda163c 0x0 + 198841916
	29  libglib-2.0.0.dylib                 0x043bd069 g_timeout_dispatch + 77
	30  libglib-2.0.0.dylib                 0x043b9962 g_main_dispatch + 489
	31  libglib-2.0.0.dylib                 0x043baf30 g_main_context_dispatch + 138
	32  libglib-2.0.0.dylib                 0x043bb444 g_main_context_iterate + 1193
	33  libglib-2.0.0.dylib                 0x043bbc3e g_main_loop_run + 956
	34  libgtk-quartz-2.0.0.dylib           0x046661b7 gtk_main + 239
	35  ???                                 0x0e33dfec 0x0 + 238280684
	36  ???                                 0x0e33dfb4 0x0 + 238280628
	37  ???                                 0x0e33df94 0x0 + 238280596
	38  ???                                 0x035d1730 0x0 + 56432432
	39  ???                                 0x004eff98 0x0 + 5177240
	40  ???                                 0x004efd9c 0x0 + 5176732
	41  ???                                 0x004efe56 0x0 + 5176918
	42  mono                                0x0000ee9f mono_jit_runtime_invoke + 1332
	43  mono                                0x001ee3a6 mono_runtime_invoke + 137
	44  mono                                0x001f0c60 mono_runtime_exec_main + 585
	45  mono                                0x001effee mono_runtime_run_main + 843
	46  mono                                0x000a5a26 mono_jit_exec + 200
	47  mono                                0x000a5c5d main_thread_handler + 555
	48  mono                                0x000a7d70 mono_main + 6993
	49  mono                                0x00001b2e mono_main_with_options + 536
	50  mono                                0x00001b8d main + 49
	51  mono                                0x000018d9 start + 53
Comment 1 Mikayla Hutchinson [MSFT] 2011-11-22 13:44:24 UTC
This might be related to bug 2157.
Comment 2 Jeffrey Stedfast 2012-01-09 15:21:20 UTC
Marking as Lowest w/o a reproducible test case
Comment 3 Miguel de Icaza [MSFT] 2012-01-09 15:39:26 UTC
This is still a crasher, upgrading the priority.

This is a Gtk+ issue, so we should ask the fine folks at Lanedo for some help.
Comment 4 Mikayla Hutchinson [MSFT] 2012-02-09 18:41:54 UTC
I saw a similar crash using the Mono beta 2.10.9 with the new GTK. It happened while using the text editor, no obvious trigger.

Process:         mono [1023]
Path:            /Users/USER/*/MonoDevelop.app/Contents/MacOS/bin/monodevelop
Identifier:      com.ximian.monodevelop
Version:         2.8.6 (2.8.6)
Code Type:       X86 (Native)
Parent Process:  launchd [266]

Date/Time:       2012-02-09 18:38:00.149 -0500
OS Version:      Mac OS X 10.7.3 (11D50)
Report Version:  9

Interval Since Last Report:          97421 sec
Crashes Since Last Report:           2
Per-App Interval Since Last Report:  153814 sec
Per-App Crashes Since Last Report:   1
Anonymous UUID:                      8DA76007-2B36-4F78-AD50-63448AF4B9F2

Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Exception Type:  EXC_BAD_ACCESS (SIGABRT)
Exception Codes: KERN_INVALID_ADDRESS at 0x000000008fe00000

VM Regions Near 0x8fe00000:
    CoreServices           000000001f39b000-000000001f735000 [ 3688K] rw-/rwx SM=COW  
--> 
    __TEXT                 000000008fe4d000-000000008fe80000 [  204K] r-x/rwx SM=COW  /usr/lib/dyld

Application Specific Information:
objc[1023]: garbage collection is OFF
abort() called

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib        	0x9bfde9c6 __pthread_kill + 10
1   libsystem_c.dylib             	0x958eef78 pthread_kill + 106
2   libsystem_c.dylib             	0x958dfbdd abort + 167
3   monodevelop                   	0x00093b60 mono_handle_native_sigsegv + 624
4   monodevelop                   	0x000055b8 mono_sigsegv_signal_handler + 248
5   libsystem_c.dylib             	0x9594459b _sigtramp + 43
6   ???                           	0xffffffff 0 + 4294967295
7   monodevelop                   	0x000054c0 mono_sigill_signal_handler + 64
8   libgdk-quartz-2.0.0.dylib     	0x047489a2 find_window_for_ns_event + 610
9   libgdk-quartz-2.0.0.dylib     	0x04749be4 gdk_event_translate + 724
10  libgdk-quartz-2.0.0.dylib     	0x0474a5c0 _gdk_events_queue + 128
11  libgdk-quartz-2.0.0.dylib     	0x0474b852 gdk_event_dispatch + 82
12  libglib-2.0.0.dylib           	0x0405226e g_main_dispatch + 510
13  libglib-2.0.0.dylib           	0x04053afb g_main_context_dispatch + 155
14  libglib-2.0.0.dylib           	0x040540a3 g_main_context_iterate + 1331
15  libglib-2.0.0.dylib           	0x040549ea g_main_loop_run + 1050
16  libgtk-quartz-2.0.0.dylib     	0x042fd200 gtk_main + 240
17  ???                           	0x0c9f2dfc 0 + 211758588
18  ???                           	0x0c9f2dc4 0 + 211758532
19  ???                           	0x0c9f2da4 0 + 211758500
20  ???                           	0x0400409c 0 + 67125404
21  ???                           	0x003eef90 0 + 4124560
22  ???                           	0x003eed9c 0 + 4124060
23  ???                           	0x003eee56 0 + 4124246
24  monodevelop                   	0x0000d612 mono_jit_runtime_invoke + 722
25  monodevelop                   	0x0016d28e mono_runtime_invoke + 126
26  monodevelop                   	0x00171354 mono_runtime_exec_main + 420
27  monodevelop                   	0x00176725 mono_runtime_run_main + 725
28  monodevelop                   	0x00069c15 mono_jit_exec + 149
29  monodevelop                   	0x0006c00d mono_main + 9197
30  monodevelop                   	0x00002869 main + 441
31  monodevelop                   	0x00002676 start + 54

Thread 1:
0   libsystem_kernel.dylib        	0x9bfdcc22 mach_msg_trap + 10
1   libsystem_kernel.dylib        	0x9bfdc1f6 mach_msg + 70
2   monodevelop                   	0x000d7c2a mach_exception_thread + 90
3   monodevelop                   	0x0020441d GC_start_routine + 93
4   libsystem_c.dylib             	0x958eced9 _pthread_start + 335
5   libsystem_c.dylib             	0x958f06de thread_start + 34

Thread 2:
0   libsystem_kernel.dylib        	0x9bfdcc5e semaphore_wait_trap + 10
1   monodevelop                   	0x001e4f6a mono_sem_wait + 26
2   monodevelop                   	0x0010c72a finalizer_thread + 74
3   monodevelop                   	0x001a9b06 start_wrapper + 422
4   monodevelop                   	0x001dcbea thread_start_routine + 154
5   monodevelop                   	0x0020441d GC_start_routine + 93
6   libsystem_c.dylib             	0x958eced9 _pthread_start + 335
7   libsystem_c.dylib             	0x958f06de thread_start + 34

Thread 3:: Dispatch queue: com.apple.libdispatch-manager
0   libsystem_kernel.dylib        	0x9bfdf90a kevent + 10
1   libdispatch.dylib             	0x93fbcc58 _dispatch_mgr_invoke + 969
2   libdispatch.dylib             	0x93fbb6a7 _dispatch_mgr_thread + 53

Thread 4:
0   libsystem_kernel.dylib        	0x9bfde83e __psynch_cvwait + 10
1   libsystem_c.dylib             	0x958f0e21 _pthread_cond_wait + 827
2   libsystem_c.dylib             	0x958a142c pthread_cond_wait$UNIX2003 + 71
3   monodevelop                   	0x001c7092 _wapi_handle_timedwait_signal_handle + 482
4   monodevelop                   	0x001c70d8 _wapi_handle_wait_signal_handle + 40
5   monodevelop                   	0x001d9d48 WaitForSingleObjectEx + 664
6   monodevelop                   	0x001a848d ves_icall_System_Threading_WaitHandle_WaitOne_internal + 77
7   ???                           	0x0400aa30 0 + 67152432
8   ???                           	0x0400a450 0 + 67150928
9   ???                           	0x0400a12a 0 + 67150122
10  ???                           	0x04009fb2 0 + 67149746
11  ???                           	0x030b0941 0 + 51054913
12  monodevelop                   	0x0000d612 mono_jit_runtime_invoke + 722
13  monodevelop                   	0x0016d28e mono_runtime_invoke + 126
14  monodevelop                   	0x0016d3cc mono_runtime_delegate_invoke + 92
15  monodevelop                   	0x001a9b32 start_wrapper + 466
16  monodevelop                   	0x001dcbea thread_start_routine + 154
17  monodevelop                   	0x0020441d GC_start_routine + 93
18  libsystem_c.dylib             	0x958eced9 _pthread_start + 335
19  libsystem_c.dylib             	0x958f06de thread_start + 34

Thread 5:
0   libsystem_kernel.dylib        	0x9bfdebb2 __semwait_signal + 10
1   libsystem_c.dylib             	0x958a17b9 nanosleep$UNIX2003 + 187
2   monodevelop                   	0x001dbdc7 SleepEx + 295
3   monodevelop                   	0x001a0fc7 monitor_thread + 119
4   monodevelop                   	0x001a9b06 start_wrapper + 422
5   monodevelop                   	0x001dcbea thread_start_routine + 154
6   monodevelop                   	0x0020441d GC_start_routine + 93
7   libsystem_c.dylib             	0x958eced9 _pthread_start + 335
8   libsystem_c.dylib             	0x958f06de thread_start + 34

Thread 6:
0   libsystem_kernel.dylib        	0x9bfdcc76 semaphore_timedwait_trap + 10
1   monodevelop                   	0x001e4ed5 mono_sem_timedwait + 309
2   monodevelop                   	0x001a40a1 async_invoke_thread + 2817
3   monodevelop                   	0x001a9b06 start_wrapper + 422
4   monodevelop                   	0x001dcbea thread_start_routine + 154
5   monodevelop                   	0x0020441d GC_start_routine + 93
6   libsystem_c.dylib             	0x958eced9 _pthread_start + 335
7   libsystem_c.dylib             	0x958f06de thread_start + 34

Thread 7:
0   libsystem_kernel.dylib        	0x9bfde83e __psynch_cvwait + 10
1   libsystem_c.dylib             	0x958f0e21 _pthread_cond_wait + 827
2   libsystem_c.dylib             	0x958a13e0 pthread_cond_timedwait$UNIX2003 + 70
3   monodevelop                   	0x001c702a _wapi_handle_timedwait_signal_handle + 378
4   monodevelop                   	0x001d9d6b WaitForSingleObjectEx + 699
5   monodevelop                   	0x001a848d ves_icall_System_Threading_WaitHandle_WaitOne_internal + 77
6   ???                           	0x0400aa30 0 + 67152432
7   ???                           	0x0c969ad0 0 + 211196624
8   ???                           	0x0c969a44 0 + 211196484
9   ???                           	0x0c969930 0 + 211196208
10  ???                           	0x04009fb2 0 + 67149746
11  ???                           	0x030b0941 0 + 51054913
12  monodevelop                   	0x0000d612 mono_jit_runtime_invoke + 722
13  monodevelop                   	0x0016d28e mono_runtime_invoke + 126
14  monodevelop                   	0x0016d3cc mono_runtime_delegate_invoke + 92
15  monodevelop                   	0x001a9b32 start_wrapper + 466
16  monodevelop                   	0x001dcbea thread_start_routine + 154
17  monodevelop                   	0x0020441d GC_start_routine + 93
18  libsystem_c.dylib             	0x958eced9 _pthread_start + 335
19  libsystem_c.dylib             	0x958f06de thread_start + 34

Thread 8:
0   libsystem_kernel.dylib        	0x9bfdebb2 __semwait_signal + 10
1   libsystem_c.dylib             	0x958a17b9 nanosleep$UNIX2003 + 187
2   monodevelop                   	0x001dbdc7 SleepEx + 295
3   monodevelop                   	0x001ac2a8 ves_icall_System_Threading_Thread_Sleep_internal + 88
4   ???                           	0x0c98c9ac 0 + 211339692
5   ???                           	0x0c98c928 0 + 211339560
6   ???                           	0x0c98c7dc 0 + 211339228
7   ???                           	0x04009fb2 0 + 67149746
8   ???                           	0x030b0941 0 + 51054913
9   monodevelop                   	0x0000d612 mono_jit_runtime_invoke + 722
10  monodevelop                   	0x0016d28e mono_runtime_invoke + 126
11  monodevelop                   	0x0016d3cc mono_runtime_delegate_invoke + 92
12  monodevelop                   	0x001a9b32 start_wrapper + 466
13  monodevelop                   	0x001dcbea thread_start_routine + 154
14  monodevelop                   	0x0020441d GC_start_routine + 93
15  libsystem_c.dylib             	0x958eced9 _pthread_start + 335
16  libsystem_c.dylib             	0x958f06de thread_start + 34

Thread 9:
0   libsystem_kernel.dylib        	0x9bfdf90a kevent + 10
1   monodevelop                   	0x001a1f36 tp_kqueue_wait + 1142
2   monodevelop                   	0x001a9b06 start_wrapper + 422
3   monodevelop                   	0x001dcbea thread_start_routine + 154
4   monodevelop                   	0x0020441d GC_start_routine + 93
5   libsystem_c.dylib             	0x958eced9 _pthread_start + 335
6   libsystem_c.dylib             	0x958f06de thread_start + 34

Thread 10:
0   libsystem_kernel.dylib        	0x9bfdcc76 semaphore_timedwait_trap + 10
1   monodevelop                   	0x001e4ed5 mono_sem_timedwait + 309
2   monodevelop                   	0x001a40a1 async_invoke_thread + 2817
3   monodevelop                   	0x001a9b06 start_wrapper + 422
4   monodevelop                   	0x001dcbea thread_start_routine + 154
5   monodevelop                   	0x0020441d GC_start_routine + 93
6   libsystem_c.dylib             	0x958eced9 _pthread_start + 335
7   libsystem_c.dylib             	0x958f06de thread_start + 34

Thread 11:
0   libsystem_kernel.dylib        	0x9bfdeb42 __select + 10
1   libglib-2.0.0.dylib           	0x04067c2f g_poll + 1295
2   libgdk-quartz-2.0.0.dylib     	0x0474acec select_thread_func + 332
3   libsystem_c.dylib             	0x958eced9 _pthread_start + 335
4   libsystem_c.dylib             	0x958f06de thread_start + 34

Thread 12:
0   libsystem_kernel.dylib        	0x9bfdcc76 semaphore_timedwait_trap + 10
1   monodevelop                   	0x001e4ed5 mono_sem_timedwait + 309
2   monodevelop                   	0x001a40a1 async_invoke_thread + 2817
3   monodevelop                   	0x001a9b06 start_wrapper + 422
4   monodevelop                   	0x001dcbea thread_start_routine + 154
5   monodevelop                   	0x0020441d GC_start_routine + 93
6   libsystem_c.dylib             	0x958eced9 _pthread_start + 335
7   libsystem_c.dylib             	0x958f06de thread_start + 34

Thread 13:
0   libsystem_kernel.dylib        	0x9bfdcc76 semaphore_timedwait_trap + 10
1   monodevelop                   	0x001e4ed5 mono_sem_timedwait + 309
2   monodevelop                   	0x001a40a1 async_invoke_thread + 2817
3   monodevelop                   	0x001a9b06 start_wrapper + 422
4   monodevelop                   	0x001dcbea thread_start_routine + 154
5   monodevelop                   	0x0020441d GC_start_routine + 93
6   libsystem_c.dylib             	0x958eced9 _pthread_start + 335
7   libsystem_c.dylib             	0x958f06de thread_start + 34

Thread 14:
0   libsystem_kernel.dylib        	0x9bfde83e __psynch_cvwait + 10
1   libsystem_c.dylib             	0x958f0e21 _pthread_cond_wait + 827
2   libsystem_c.dylib             	0x958a142c pthread_cond_wait$UNIX2003 + 71
3   monodevelop                   	0x001c7092 _wapi_handle_timedwait_signal_handle + 482
4   monodevelop                   	0x001c70d8 _wapi_handle_wait_signal_handle + 40
5   monodevelop                   	0x001d9d48 WaitForSingleObjectEx + 664
6   monodevelop                   	0x001a848d ves_icall_System_Threading_WaitHandle_WaitOne_internal + 77
7   ???                           	0x0400aa30 0 + 67152432
8   ???                           	0x0400a450 0 + 67150928
9   ???                           	0x0f1c291f 0 + 253503775
10  ???                           	0x04009fb2 0 + 67149746
11  ???                           	0x030b0941 0 + 51054913
12  monodevelop                   	0x0000d612 mono_jit_runtime_invoke + 722
13  monodevelop                   	0x0016d28e mono_runtime_invoke + 126
14  monodevelop                   	0x0016d3cc mono_runtime_delegate_invoke + 92
15  monodevelop                   	0x001a9b32 start_wrapper + 466
16  monodevelop                   	0x001dcbea thread_start_routine + 154
17  monodevelop                   	0x0020441d GC_start_routine + 93
18  libsystem_c.dylib             	0x958eced9 _pthread_start + 335
19  libsystem_c.dylib             	0x958f06de thread_start + 34

Thread 15:
0   libsystem_kernel.dylib        	0x9bfde83e __psynch_cvwait + 10
1   libsystem_c.dylib             	0x958f0e21 _pthread_cond_wait + 827
2   libsystem_c.dylib             	0x958a142c pthread_cond_wait$UNIX2003 + 71
3   monodevelop                   	0x001c7092 _wapi_handle_timedwait_signal_handle + 482
4   monodevelop                   	0x001c70d8 _wapi_handle_wait_signal_handle + 40
5   monodevelop                   	0x001d9d48 WaitForSingleObjectEx + 664
6   monodevelop                   	0x001a848d ves_icall_System_Threading_WaitHandle_WaitOne_internal + 77
7   ???                           	0x0400aa30 0 + 67152432
8   ???                           	0x0400a450 0 + 67150928
9   ???                           	0x0f36ae17 0 + 255241751
10  ???                           	0x04009fb2 0 + 67149746
11  ???                           	0x030b0941 0 + 51054913
12  monodevelop                   	0x0000d612 mono_jit_runtime_invoke + 722
13  monodevelop                   	0x0016d28e mono_runtime_invoke + 126
14  monodevelop                   	0x0016d3cc mono_runtime_delegate_invoke + 92
15  monodevelop                   	0x001a9b32 start_wrapper + 466
16  monodevelop                   	0x001dcbea thread_start_routine + 154
17  monodevelop                   	0x0020441d GC_start_routine + 93
18  libsystem_c.dylib             	0x958eced9 _pthread_start + 335
19  libsystem_c.dylib             	0x958f06de thread_start + 34

Thread 16:: com.apple.CFSocket.private
0   libsystem_kernel.dylib        	0x9bfdeb42 __select + 10
1   com.apple.CoreFoundation      	0x94f13ee5 __CFSocketManager + 1557
2   libsystem_c.dylib             	0x958eced9 _pthread_start + 335
3   libsystem_c.dylib             	0x958f06de thread_start + 34

Thread 17:
0   libsystem_kernel.dylib        	0x9bfdf02e __workq_kernreturn + 10
1   libsystem_c.dylib             	0x958eeccf _pthread_wqthread + 773
2   libsystem_c.dylib             	0x958f06fe start_wqthread + 30

Thread 0 crashed with X86 Thread State (32-bit):
  eax: 0x00000000  ebx: 0x13360674  ecx: 0xbfffe45c  edx: 0x9bfde9c6
  edi: 0xac6052c0  esi: 0x00000006  ebp: 0xbfffe478  esp: 0xbfffe45c
   ss: 0x00000023  efl: 0x00000246  eip: 0x9bfde9c6   cs: 0x0000000b
   ds: 0x00000023   es: 0x00000023   fs: 0x00000000   gs: 0x0000000f
  cr2: 0xac5fd738
Logical CPU: 0
Comment 5 Mikayla Hutchinson [MSFT] 2012-02-09 19:45:45 UTC
Happened again, again when clicking on the text editor.
Comment 6 Mikayla Hutchinson [MSFT] 2012-02-09 22:24:36 UTC
The most recent crash may have been caused by bug 3340.
Comment 7 Kristian Rietveld (inactive) 2012-02-10 13:55:07 UTC
The important bit of the trace is:

7   monodevelop                       0x000054c0 mono_sigill_signal_handler + 64
8   libgdk-quartz-2.0.0.dylib         0x047489a2 find_window_for_ns_event + 610
9   libgdk-quartz-2.0.0.dylib         0x04749be4 gdk_event_translate + 724
10  libgdk-quartz-2.0.0.dylib         0x0474a5c0 _gdk_events_queue + 128
11  libgdk-quartz-2.0.0.dylib         0x0474b852 gdk_event_dispatch + 82


It is likely find_window_for_ns_event() does not find the correct window (or returns NULL).

If you have an idea about how this can be reproduced, that would be very welcome.
Comment 8 Mikayla Hutchinson [MSFT] 2012-02-10 14:04:20 UTC
I saw this again so it doesn't appear to be related to the pango issue - but the cause was not obvious. I'll try running MD inside gdb to get more info next time.
Comment 9 Mikayla Hutchinson [MSFT] 2012-02-14 20:00:27 UTC
I'm seeing this crasher at least a couple of times a day, so I'd say it's our most important GTK+ bug at this point. However, I still haven't identified a trigger.
Comment 10 Kristian Rietveld (inactive) 2012-02-15 16:02:20 UTC
> I'm seeing this crasher at least a couple of times a day, so I'd say it's our
> most important GTK+ bug at this point. However, I still haven't identified a
> trigger.

Michael: can you confirm whether the stack trace is always the same?  If not, could you perhaps get us another trace?

When I was just looking at the stack trace again, it felt a bit suspicious:

7   monodevelop                       0x000054c0 mono_sigill_signal_handler + 64
8   libgdk-quartz-2.0.0.dylib         0x047489a2 find_window_for_ns_event + 610

I assumed that a SIGSEGV was caught when I posted comment 7. However, it is a SIGILL. SIGILL is usually an illegal instruction and I cannot really see how find_window_for_ns_event() could trigger that.  Unless mono triggers SIGILL for special occasions, is that the case?

If I disassemble find_window_for_ns_event, + 610 is the first instruction after the call to get_window_point_from_screen_point(). In find_window_for_ns_event() no pointer is dereferenced after the call to get_window_point_from_screen_point(), so I am wondering if a SIGSEGV due to GTK+ is likely.

In case the program counter information is unreliable it could also be a crash in get_window_point_from_screen_point(). Though the screen_point, x and y arguments are for this code path always set (x and y also by the caller of find_window_for_ns_event()). This leaves the possibility of window being NULL, which could happen if grab->window == NULL in find_window_for_ns_event().
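
The offset reasoning above can be made concrete with a little arithmetic. This is only a minimal sketch using the frame addresses from comment 4; `symbol_start` is a hypothetical helper, not part of any tool. Subtracting the "+ N" offset from a frame address gives the start address of the function, which is where to begin disassembling. For a frame that called into another function, the recorded address is a return address, so it points at the instruction after the call rather than at the call itself.

```c
#include <stdint.h>

/* Hypothetical helper: given a backtrace frame address and the "+ N"
 * offset printed next to the symbol name, return the address at which
 * the symbol starts. Addresses are 32-bit because the trace is i386. */
uint32_t symbol_start(uint32_t frame_address, uint32_t offset)
{
    return frame_address - offset;
}
```

With the values from comment 4, `symbol_start(0x047489a2, 610)` gives 0x04748740 for find_window_for_ns_event and `symbol_start(0x04749be4, 724)` gives 0x04749910 for gdk_event_translate.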
Comment 11 Mikayla Hutchinson [MSFT] 2012-02-15 16:50:40 UTC
They certainly always contain find_window_for_ns_event and are similar enough that I believed them to be the same, though as you mention one trace does have the mono_sigill_signal_handler frame.
Comment 12 Kristian Rietveld (inactive) 2012-02-17 15:14:16 UTC
The stack trace in the opening comment does not contain SIGILL and goes straight into a SIGSEGV from find_window_for_ns_event().  Due to the age of the trace, I cannot use the offsets into the assembler code anymore to get clues.

I revisited the stack trace from comment 4. Earlier I said the +610 for find_window_for_ns_event() is the instruction after the call to get_window_point_from_screen_point(). However, the +724 for gdk_event_translate() is also the instruction after the call to find_window_for_ns_event(), so (as usual actually) these offsets should be taken with a grain of salt.

That said, I think we should investigate the code that I suspect might be the problem around +610 some more.  I will attach a patch which adds two g_warning()s to this code which will be performed when one of the values is NULL.  Due to the added if guards, the crash, if caused by this code fragment, should disappear as well.

It would be great if somebody could run MonoDevelop with this for a few days to see if these warnings are hit.
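
For readers who do not have the attachment at hand, the "warn and bail out" idea described above can be sketched roughly as follows. This is not the attached patch: the `Point`, `Window` and `Grab` types and the function name are simplified stand-ins invented for illustration, and `fprintf` stands in for g_warning() so the sketch builds without GLib.

```c
#include <stdio.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the GDK structures. */
typedef struct { int x, y; } Point;
typedef struct { int origin_x, origin_y; } Window;
typedef struct { Window *window; } Grab;

/* Convert a screen point to grab-window coordinates.
 * Returns 1 and fills *out on success. If the grab or its window is
 * NULL, prints a warning, leaves *out untouched and returns 0 instead
 * of dereferencing the NULL pointer. */
int grab_point_from_screen_point(const Grab *grab, Point screen, Point *out)
{
    if (grab == NULL || grab->window == NULL) {
        fprintf(stderr, "warning: grab->window == NULL\n");
        return 0;
    }
    out->x = screen.x - grab->window->origin_x;
    out->y = screen.y - grab->window->origin_y;
    return 1;
}
```

The guard serves two purposes at once: if the warning shows up in the console, the suspected code path is confirmed, and the early return keeps the process alive long enough to collect more information.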
Comment 13 Kristian Rietveld (inactive) 2012-02-17 15:16:04 UTC
Created attachment 1373 [details]
Patch for debugging this issue

Patch to investigate if the suspected code fragment is indeed hit when one of the pointers is NULL.
Comment 14 Kristian Rietveld (inactive) 2012-02-17 15:17:38 UTC
Created attachment 1374 [details]
Patch for debugging this issue (has a non-useful g_warning removed)
Comment 15 Mikayla Hutchinson [MSFT] 2012-02-17 16:23:12 UTC
Thanks, I've added the patch in bockbuild. Jeff and I will run with it locally.
Comment 16 Mikayla Hutchinson [MSFT] 2012-04-02 17:23:55 UTC
I'm still seeing this on the latest Mono 2.10.9. Duncan, did we get this patch into those builds?
Comment 17 Mikayla Hutchinson [MSFT] 2012-04-02 18:51:58 UTC
Specifically, 210090008.
Comment 18 Duncan Mak 2012-04-02 19:50:23 UTC
I believe we do have the patch included. I checked the tarball that was used and I see the patch in there.
Comment 19 Mikayla Hutchinson [MSFT] 2012-04-03 00:27:13 UTC
I tried getting the line number with atos, and it just returned the same instruction offset as the crash report. And dwarfdump wasn't any better - it seems to have trouble reading anything from the DWARF resource in the dSYM. Maybe the dSYM shipped by Mono for gdk-quartz is broken.

The offsets with Mono 210090008 are:
8   libgdk-quartz-2.0.0.dylib           0x0466fa20 find_window_for_ns_event + 586
9   libgdk-quartz-2.0.0.dylib           0x046708de gdk_event_translate + 585
10  libgdk-quartz-2.0.0.dylib           0x04671022 _gdk_events_queue + 117

AFAICT from disassembling the file with otool, it's still the instruction after the call to
get_window_point_from_screen_point. There are a couple of g_log calls shortly before that point, so it looks like the patch is included.
Comment 20 Mikayla Hutchinson [MSFT] 2012-04-04 16:25:43 UTC
*** Bug 4199 has been marked as a duplicate of this bug. ***
Comment 21 Michael Natterer 2012-04-10 10:58:33 UTC
Hmm, the instruction after get_window_point_from_screen_point() is simply a
return. Let me cook up a patch that adds even more printf and makes sure
it returns NULL rather than uninitialized stack memory.
Comment 22 Michael Natterer 2012-04-10 11:41:44 UTC
Created attachment 1644 [details]
Almost the same patch, but don't return uninitialized memory

I ended up making only the change mentioned above.

While reading the code, I noticed that the different "toplevel" functions/users
in this file have different ideas about what a "toplevel" is; some additionally
check for WINDOW_IS_TOPLEVEL(). Especially evil seems find_toplevel_under_window()
which always returns display->pointer_info.toplevel_under_pointer but
only returns translated coordinates if WINDOW_IS_TOPLEVEL() is also TRUE.

This looks fishy. Kris, what do you think?
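
The mismatch described above, where a window is always returned but the coordinates are only filled in some of the time, can be illustrated with a small model. This is a sketch under stated assumptions, not the GDK source: the `Window` type, its `is_toplevel` flag and both function names are hypothetical, and whether returning NULL is the right behaviour for the real code is exactly the open question here.

```c
#include <stddef.h>

/* Hypothetical, simplified window: a flag and a screen origin. */
typedef struct {
    int is_toplevel;
    int origin_x, origin_y;
} Window;

/* Models the hazard: the window is always returned, but *x and *y are
 * only written for toplevels. For any other window the caller ends up
 * reading whatever was in its own variables before the call. */
Window *find_under_pointer_unsafe(Window *under_pointer,
                                  int screen_x, int screen_y,
                                  int *x, int *y)
{
    if (under_pointer != NULL && under_pointer->is_toplevel) {
        *x = screen_x - under_pointer->origin_x;
        *y = screen_y - under_pointer->origin_y;
    }
    return under_pointer;
}

/* One possible way to keep the outputs consistent: always write *x and
 * *y, and return NULL whenever no translated coordinates are produced. */
Window *find_under_pointer_checked(Window *under_pointer,
                                   int screen_x, int screen_y,
                                   int *x, int *y)
{
    *x = 0;
    *y = 0;
    if (under_pointer == NULL || !under_pointer->is_toplevel)
        return NULL;
    *x = screen_x - under_pointer->origin_x;
    *y = screen_y - under_pointer->origin_y;
    return under_pointer;
}
```

In the checked variant a non-NULL return value always comes with valid coordinates, so a caller cannot pair a window with stale or uninitialized values.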
Comment 23 Kristian Rietveld (inactive) 2012-04-10 16:12:02 UTC
> Hmm, the instruction after get_window_point_from_screen_point() is simply a
> return. Let me cook up a patch that adds even more printf and makes sure
> it returns NULL rather than uninitialized stack memory.


IIRC it already properly returns NULL, since gdk_window_get_effective_toplevel() will always return a proper value as far as I can see.


> which always returns display->pointer_info.toplevel_under_pointer but
> only returns translated coordinates if WINDOW_IS_TOPLEVEL() is also TRUE.
>
> This looks fishy. Kris, what do you think?

The several WINDOW_IS_TOPLEVEL() checks originate from a patch which made offscreen windows work with the backend without constantly crashing. The real issue could be that get_window_point_from_screen_point() is not offscreen-window-aware.
Comment 24 Kristian Rietveld (inactive) 2012-04-21 05:04:56 UTC
Michael, if you are seeing this crash regularly in the latest Mono 2.10.9, could you get us the messages printed to the console by the g_log()s?   This likely contains valuable debugging information for us.
Comment 25 Mikayla Hutchinson [MSFT] 2012-04-27 15:56:48 UTC
Sorry, I haven't always been running MD in the bockbuild environment, so it's taken a while to get the traces. Here it is, with the latest patchset:
Gdk-Critical: gdk_window_get_events: assertion `GDK_IS_WINDOW (window)' failed
GLib-GObject-Critical: g_object_ref: assertion `G_IS_OBJECT (object)' failed
Gdk-Warning: grab->window == NULL
Gdk-Critical: generate_grab_broken_event: assertion `window != NULL' failed
GLib-GObject-Critical: g_object_unref: assertion `G_IS_OBJECT (object)' failed

This was just when editing in the text editor, though the code completion or tooltip may have been involved. I think that bug 4497 *might* be a duplicate of this, since the symptoms are the same, and AFAIK we included an earlier version of the patch into recent Mono builds.
Comment 26 Kristian Rietveld (inactive) 2012-04-29 16:58:06 UTC
Thanks!  This is very valuable information.  Based on these log messages I think the code path that goes wrong is:

 # In proxy_button_event(), pointer_window is NULL in the if type == GDK_BUTTON_PRESS && .. clause.
 # This causes the warning from gdk_window_get_events() 
 # The call to _gdk_display_add_pointer_grab() with pointer_window == NULL could cause the g_object_ref warning.
 # The fact that a grab with a NULL window is registered causes the two further warnings.

Either _gdk_window_find_descendant_at() returns NULL, or the "Find the event window that gets the grab" code ends up with NULL.

The former is possible if:
 # the point is not in the given top-level window at all;

The latter is also a possibility, especially given the while condition explicitly guards for this.
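
As a rough illustration of the path described above, here is a minimal sketch of a guard that refuses to register a grab when the hit test returns NULL. The `Toplevel` type and the `find_window_at()` and `handle_button_press()` helpers are hypothetical stand-ins, not the real GDK code; only the idea (a point outside the toplevel yields a NULL pointer_window) comes from the analysis in this comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a toplevel GdkWindow; not the real GDK type. */
typedef struct {
    int width;
    int height;
} Toplevel;

/* Hypothetical stand-in for the hit test done by
 * _gdk_window_find_descendant_at(): returns NULL when the point
 * is not inside the toplevel at all. */
static const Toplevel *find_window_at(const Toplevel *toplevel, int x, int y)
{
    assert(toplevel != NULL);
    if (x < 0 || y < 0 || x >= toplevel->width || y >= toplevel->height)
        return NULL;
    return toplevel;
}

/* Handles a button press. Returns true and sets *grab_window if a grab
 * was registered; returns false if the press was dropped. */
static bool handle_button_press(const Toplevel *toplevel, int x, int y,
                                const Toplevel **grab_window)
{
    const Toplevel *pointer_window = find_window_at(toplevel, x, y);

    if (pointer_window == NULL)
        return false;   /* never register a grab with a NULL window */

    *grab_window = pointer_window;
    return true;
}
```

With a 100x50 toplevel, a press at (10, 10) registers a grab, while a press at (150, 10) is dropped and leaves the grab pointer untouched.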
Comment 27 Kristian Rietveld (inactive) 2012-04-29 18:16:21 UTC
I have to add that the "Find the event window that gets the grab" code stops iterating as soon as the next parent is NULL. w is not set to NULL in such cases. I get the impression that it is more likely that the former code gives the latter code a NULL pointer_window to start with.
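
To make that argument concrete, here is a sketch of a parent walk with the property described: it stops as soon as the next parent is NULL and never assigns NULL to w. The `Window` struct, the `BUTTON_PRESS_MASK` value and `find_grab_window()` are assumptions for illustration and only approximate the shape of the real loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a GdkWindow with just the fields needed here. */
typedef struct Window {
    struct Window *parent;
    unsigned int event_mask;
} Window;

#define BUTTON_PRESS_MASK 0x1u  /* hypothetical flag value */

/* Walks up from pointer_window to the first window that selects button
 * presses. The loop only advances while the next parent exists, so w
 * is never overwritten with NULL. */
static Window *find_grab_window(Window *pointer_window)
{
    Window *w = pointer_window;

    while (w != NULL && w->parent != NULL) {
        if (w->event_mask & BUTTON_PRESS_MASK)
            break;
        w = w->parent;
    }

    /* The result is NULL only if the input was already NULL. */
    assert(pointer_window == NULL || w != NULL);
    return w;
}
```

In this sketch a NULL result can only come from a NULL input, which matches the impression that the hit test, not this loop, is the source of the NULL pointer_window.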

This would imply that we have a possible case in which an event is generated with coordinates out of the bounds of the toplevel. (Or, the toplevel is resized in between the event being generated and handled?).  I need to think about whether we can come up with an easy way to stress test this somehow ...
Comment 28 Mikayla Hutchinson [MSFT] 2012-04-29 20:08:49 UTC
I think it has something to do with the code completion window, which gets hidden (but not destroyed) in response to keystrokes, and bug 4497 indicates something strange is going on with its grabs.
Comment 29 Mikayla Hutchinson [MSFT] 2012-05-07 17:18:05 UTC
I got a fresh trace, with symbols, on a fully updated and patched GTK+:

Thread 1 (process 13903):
#0  0x9221bfd5 in __wait4 ()
#1  0x95f974ec in waitpid$UNIX2003 ()
#2  0x000e174b in mono_handle_native_sigsegv (signal=11, ctx=0xbfffe918) at mini-exceptions.c:2192
#3  0x000115df in mono_sigsegv_signal_handler (_dummy=11, info=0xbfffe8d8, context=0xbfffe918) at mini.c:5917
#4  <signal handler called>
#5  0x939e7d47 in objc_msgSend ()
#6  0xbfffeb58 in ?? ()
#7  0x043a3589 in find_window_for_ns_event (nsevent=0x19ee7040, x=0xbfffeb58, y=0xbfffeb54, x_root=0xbfffeb50, y_root=0xbfffeb4c) at gdkevents-quartz.c:716
#8  0x043a4a18 in gdk_event_translate (event=0xc9b808, nsevent=0x19ee7040) at gdkevents-quartz.c:1325
#9  0x043a53f0 in _gdk_events_queue (display=0x10be800) at gdkevents-quartz.c:1528
#10 0x043a6732 in gdk_event_dispatch (source=0x65c860, callback=0, user_data=0x0) at gdkeventloop-quartz.c:670
#11 0x03b20d21 in g_main_dispatch (context=0x65c8c0) at gmain.c:2441
#12 0x03b2261b in g_main_context_dispatch (context=0x65c8c0) at gmain.c:3011
#13 0x03b22c4a in g_main_context_iterate (context=0x65c8c0, block=1, dispatch=1, self=0x1b6bff0) at gmain.c:3089
#14 0x03b2358d in g_main_loop_run (loop=0x1d840a0) at gmain.c:3297
#15 0x03e30a80 in gtk_main () at gtkmain.c:1256
#16 0x0de64bc4 in ?? ()
#17 0x0de64b8c in ?? ()
#18 0x0de64b6c in ?? ()
#19 0x01feed00 in ?? ()
#20 0x00741f90 in ?? ()
#21 0x00741d9c in ?? ()
#22 0x00741e56 in ?? ()
#23 0x0001132f in mono_jit_runtime_invoke (method=0x286681c, obj=0x0, params=0xbffff2d8, exc=0x0) at mini.c:5791
#24 0x00221aaa in mono_runtime_invoke (method=0x286681c, obj=0x0, params=0xbffff2d8, exc=0x0) at object.c:2755
#25 0x0022478c in mono_runtime_exec_main (method=0x286681c, args=0x4f6e00, exc=0x0) at object.c:3930
#26 0x002239f1 in mono_runtime_run_main (method=0x286681c, argc=0, argv=0xbffff5e0, exc=0x0) at object.c:3560
#27 0x000acecf in mono_jit_exec (domain=0x4efe00, assembly=0x1d3fd50, argc=1, argv=0xbffff5dc) at driver.c:944
#28 0x000ad120 in main_thread_handler (user_data=0xbffff508) at driver.c:1003
#29 0x000af578 in mono_main (argc=2, argv=0xbffff5d8) at driver.c:1855
#30 0x00002874 in mono_main_with_options (argc=2, argv=0xbffff5d8) at main.c:66
#31 0x00002908 in main (argc=2, argv=0xbffff5d8) at main.c:97

Here is the file with the current patchset: https://gist.github.com/d6c02072f333710dbbba#L716
Comment 30 Mikayla Hutchinson [MSFT] 2012-05-07 17:19:41 UTC
FYI, that was from Mono's gdb backtracing of itself, Apple Crash Reporter trace for the same crash was:
0   libsystem_kernel.dylib        	0x9221b9c6 __pthread_kill + 10
1   libsystem_c.dylib             	0x95fe4f78 pthread_kill + 106
2   libsystem_c.dylib             	0x95fd5bdd abort + 167
3   mono                          	0x000e17dc mono_handle_native_sigsegv + 908
4   mono                          	0x000115df mono_sigsegv_signal_handler + 351
5   libsystem_c.dylib             	0x9603a59b _sigtramp + 43
6   ???                           	0xffffffff 0 + 4294967295
7   mono                          	0x00011480 mono_sigill_signal_handler + 96
8   libgdk-quartz-2.0.0.dylib     	0x043a3589 find_window_for_ns_event + 713
9   libgdk-quartz-2.0.0.dylib     	0x043a4a18 gdk_event_translate + 776
10  libgdk-quartz-2.0.0.dylib     	0x043a53f0 _gdk_events_queue + 128
11  libgdk-quartz-2.0.0.dylib     	0x043a6732 gdk_event_dispatch + 82
12  libglib-2.0.0.dylib           	0x03b20d21 g_main_dispatch + 513 (gmain.c:2441)
13  libglib-2.0.0.dylib           	0x03b2261b g_main_context_dispatch + 155 (gmain.c:3014)
14  libglib-2.0.0.dylib           	0x03b22c4a g_main_context_iterate + 1466 (gmain.c:3092)
15  libglib-2.0.0.dylib           	0x03b2358d g_main_loop_run + 1037 (gmain.c:3296)
16  libgtk-quartz-2.0.0.dylib     	0x03e30a80 gtk_main + 240 (gtkmain.c:1257)
Comment 31 Kristian Rietveld (inactive) 2012-05-13 09:46:33 UTC
Michael, is the resolver in this case still patched to destroy the window instead of hiding it?


The crash appears to occur in get_window_point_from_screen_point, presumably at the point where a message is sent to the nswindow. This could mean the private->impl->toplevel pointer that we get from the grab->window is busted (or private->impl or private is busted if the crash happens earlier, but that is hard to see here).
Comment 32 Kristian Rietveld (inactive) 2012-05-13 12:28:43 UTC
To possibly get more information out of the stack traces, a full backtrace, generated with "bt full" would be nice.

So far, we still don't seem to be able to trigger this.
Comment 33 Mikayla Hutchinson [MSFT] 2012-05-14 12:10:56 UTC
Yes, that was with the patch to destroy the window.
Comment 34 Mikayla Hutchinson [MSFT] 2012-06-07 16:04:47 UTC
I've also seen this crash after the text editor pops up the error bubble animation at a compile error.
Comment 35 Kristian Rietveld (inactive) 2012-07-09 11:38:30 UTC
I *finally* managed to trigger this crash.  I managed to trigger it in a standalone GTK+ testcase that I wrote a couple of weeks ago, which pops up/pops down a popup window on every key press in a text view. Also the window changes size every time it is shown.  So it's good that we also managed to reproduce this outside of MonoDevelop.

Other prerequisites to reproduce this are running the test case under valgrind on a Snow Leopard machine and, while typing, making drag movements from inside the popup window to outside the popup window.

The trace produced by valgrind is as follows:


==96219== Invalid read of size 8
==96219==    at 0x1000A64E6: get_window_point_from_screen_point (gdkevents-quartz.c:395)
==96219==    by 0x1000A7AA9: find_window_for_ns_event (gdkevents-quartz.c:688)
==96219==    by 0x1000A6CD5: gdk_event_translate (gdkevents-quartz.c:1269)
==96219==    by 0x1000A696D: _gdk_events_queue (gdkevents-quartz.c:1441)
==96219==    by 0x1000A5978: gdk_event_dispatch (gdkeventloop-quartz.c:670)
==96219==    by 0x10180B16D: g_main_dispatch (gmain.c:2387)
==96219==    by 0x10180BDA4: g_main_context_dispatch (gmain.c:2924)
==96219==    by 0x10180BF53: g_main_context_iterate (gmain.c:2995)
==96219==    by 0x10180C360: g_main_loop_run (gmain.c:3189)
==96219==    by 0x1002A452A: gtk_main (gtkmain.c:1257)
==96219==    by 0x1000115CE: main (in /source/gnome/gtk+-2-24/tests/.libs/testtext)

Which is pretty much what is seen above.  What happens is a read from a NULL pointer (a NULL GdkWindow) in get_window_point_from_screen_point.  What is VERY strange is that exactly this condition should have been excluded in the trace given by Michael in comment 29, because the file that is claimed to have produced this crash includes NULL checks for this.  Though it could be the case that valgrind free() is setting everything to NULL such that we are dealing with a NULL pointer crash here and a random pointer crash in the normal case.

What appears to be causing this problem is that we are pushing in events that have coordinates outside of the window that is given in the event.  It looks like we cannot rely on pointer_info->toplevel_under_pointer in gdkevents-quartz.c while the window is constantly resizing, so essentially this is a race condition. Apart from that, there seems to be another code path that is resulting in wrong coordinate, window combos.

I am still investigating this so I can come up with a fix. After that, I will make an analysis whether the related bugs are really related. To be continued.
Comment 36 Kristian Rietveld (inactive) 2012-07-09 19:54:59 UTC
Created attachment 2181 [details]
testtext torture patch

For reference: the completely evil patch to testtext I used to trigger the issue.
Comment 37 Kristian Rietveld (inactive) 2012-07-09 20:00:01 UTC
Created attachment 2182 [details]
Patch that guards for return a combination of a window and coordinates out of window bounds

With this patch I no longer seem able to trigger this crash with the test case.  The patch still feels like a hack to me.  However, NSEvents do appear to come in with a window set, but with coordinates out of the bounds of this window. For button press/release events this will cause nothing but trouble, so we either have to find the right toplevel window or we simply ignore the event.

Finding the right toplevel could bring other problems; also, this scenario will very rarely occur.  So for now I have the impression that simply ignoring these events is a good pick.


Testing would of course be good, and I would love to learn about any potential regressions (which should in theory be rare, because the scenario rarely occurs).
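
For readers who do not want to open the attachment, the idea of the guard can be sketched as follows. This is not the attached patch: `WindowSize`, `point_in_window()` and `accept_button_event()` are hypothetical names, and the sketch assumes window-relative coordinates with the origin at the top-left corner.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of the window an event claims to belong to. */
typedef struct {
    int width;
    int height;
} WindowSize;

/* True if (x, y), given in window-relative coordinates, lies inside. */
static bool point_in_window(const WindowSize *size, int x, int y)
{
    assert(size != NULL);
    return x >= 0 && y >= 0 && x < size->width && y < size->height;
}

/* Returns true if the button event should be translated further and
 * false if it should simply be ignored. */
static bool accept_button_event(const WindowSize *size, int x, int y)
{
    if (size == NULL)
        return false;                 /* no window at all: ignore */
    return point_in_window(size, x, y);
}
```

An event at (200, 50) for a 200x100 window is rejected because the coordinates are one pixel past the right edge, which is the window/coordinate mismatch described above.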
Comment 38 Kristian Rietveld (inactive) 2012-07-09 20:12:08 UTC
It is hard to say whether fixing this bug also fixes bugs 4651 and 4497.

What did happen with this bug is that faulty grabs were created, grabs with a NULL (or perhaps in the case without valgrind, a busted) window pointer. This *could* make a case that bug 4497 is fixed, because in 4497 stuck grabs are experienced.  But I can absolutely not say for sure.

In bug 4651 we are dealing with a different stack trace. This trace cannot be created if event_window is NULL, because there's an if statement guarding for a NULL window pointer right at the start of _gdk_windowing_got_event().  However, recall that I mentioned in comment 35 that the trace in comment 29 could also not happen with a NULL window pointer, but, perhaps, when valgrind is not used we are dealing with a busted pointer instead of a NULL pointer. If so, this busted pointer could also come into play in this case, causing bug 4651.


So, we cannot really say whether 4497 and 4651 have been fixed as well now -- the above comments are purely speculations.
Comment 39 Mikayla Hutchinson [MSFT] 2012-07-10 19:43:13 UTC
Thanks, I landed the patch in bockbuild. It'll be a week or so before I can be reasonably sure it fixes the issue - sometimes the crash can happen several times in a few hours, sometimes it doesn't happen for several days...
Comment 40 Mikayla Hutchinson [MSFT] 2012-08-09 00:48:44 UTC
Unfortunately it looks like this doesn't fix it, I just saw:
Thread 1 (process 69871):
#0  0x95d8c095 in __wait4 ()
#1  0x9872c9ae in waitpid$UNIX2003 ()
#2  0x000e15cb in mono_handle_native_sigsegv (signal=11, ctx=0xbfffe8a8) at mini-exceptions.c:2192
#3  0x0001150f in mono_sigsegv_signal_handler (_dummy=11, info=0xbfffe868, context=0xbfffe8a8) at mini.c:5917
#4  <signal handler called>
#5  0x932d2c07 in objc_msgSend ()
#6  0xbfffeae8 in ?? ()
#7  0x05fb2b6d in find_window_for_ns_event (nsevent=0x49e3aa00, x=0xbfffeae8, y=0xbfffeae4, x_root=0xbfffeae0, y_root=0xbfffeadc) at gdkevents-quartz.c:775
#8  0x05fb45b8 in gdk_event_translate (event=0x13fb660, nsevent=0x49e3aa00) at gdkevents-quartz.c:1522
#9  0x05fb4f90 in _gdk_events_queue (display=0x3869800) at gdkevents-quartz.c:1732
#10 0x05fb62d2 in gdk_event_dispatch (source=0x358f2d0, callback=0, user_data=0x0) at gdkeventloop-quartz.c:670
#11 0x05616d21 in g_main_dispatch (context=0x358f330) at gmain.c:2441
#12 0x0561861b in g_main_context_dispatch (context=0x358f330) at gmain.c:3011
#13 0x05618c4a in g_main_context_iterate (context=0x358f330, block=1, dispatch=1, self=0x27b1050) at gmain.c:3089
#14 0x0561958d in g_main_loop_run (loop=0xccee940) at gmain.c:3297
#15 0x05a361e0 in gtk_main () at gtkmain.c:1257
#16 0x0f88a73c in ?? ()
#17 0x0f88a704 in ?? ()
#18 0x0f88a6e4 in ?? ()
#19 0x041d06c0 in ?? ()
#20 0x004eefe8 in ?? ()
#21 0x004eede4 in ?? ()
#22 0x004eeeaa in ?? ()
#23 0x0001125f in mono_jit_runtime_invoke (method=0x82b81c, obj=0x0, params=0xbffff2b8, exc=0x0) at mini.c:5791
#24 0x00224fda in mono_runtime_invoke (method=0x82b81c, obj=0x0, params=0xbffff2b8, exc=0x0) at object.c:2755
#25 0x00227c5c in mono_runtime_exec_main (method=0x82b81c, args=0x18007c0, exc=0x0) at object.c:3930
#26 0x00226ec1 in mono_runtime_run_main (method=0x82b81c, argc=1, argv=0xbffff5b4, exc=0x0) at object.c:3560
#27 0x000ad86f in mono_jit_exec (domain=0x65d2a0, assembly=0x275d330, argc=2, argv=0xbffff5b0) at driver.c:944
#28 0x000adac0 in main_thread_handler (user_data=0xbffff4d8) at driver.c:1003
#29 0x000aff02 in mono_main (argc=4, argv=0xbffff5a8) at driver.c:1855
#30 0x000027a4 in mono_main_with_options (argc=4, argv=0xbffff5a8) at main.c:66
#31 0x00002838 in main (argc=4, argv=0xbffff5a8) at main.c:97
Comment 41 Mikayla Hutchinson [MSFT] 2012-08-13 15:01:59 UTC
Something I've been noticing for some time is that this often happens when I click on the editor immediately after a build. 

After a build, MD uses a popup window to show an animation that "calls out" the first compiler error bubble. I think the crash is related to clicking on the editor while this window is appearing/disappearing.
Comment 42 Kristian Rietveld (inactive) 2012-10-08 07:26:34 UTC
Created attachment 2705 [details]
Updated patch USE WITH CAUTION

Recently, a related crash came to our attention in GIMP, see GNOME bug 684419. It turns out this crash was being triggered by clicks on the title bar. It is likely this occurs in MonoDevelop too, but I cannot comment on whether this fixes the crash in comment 40. The traces simply lack details ...

This is an updated version of the patch in comment 37 (July 9), that also fixes the crash of 684419. It changes a few things, by now checking whether the event occurs outside of the view frame, which should be more reliable and also automatically includes checking for the titlebar location, since the title bar is out of the bounds of the view frame.

Please use this patch WITH CAUTION: I have not tested it against the test case in comment 36 and hope to do that sometime this week. Once I have tested it I will make this known here.
Comment 43 Mikayla Hutchinson [MSFT] 2012-10-08 12:33:06 UTC
Thanks, I've applied it to my local GTK+ build and will keep an eye out for the crash.
Comment 44 Michael Natterer 2012-10-09 10:31:25 UTC
I completely agree for button press events, but won't that filter out
and break things when a press-move-release sequence ends outside
the window's view frame?
Comment 45 Michael Natterer 2012-10-09 10:41:26 UTC
Yes indeed, the patch has exactly the effect I suspected: it breaks
the expected invariant of press/release pairing.
Comment 46 Michael Natterer 2012-10-09 10:50:06 UTC
Created attachment 2717 [details]
Patch that doesn't break press/release pairing

New patch that does the filtering only on press events.
Seems to work fine for the use case in GNOME bug 684419
and doesn't break press/release event pairing.
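
The difference from the previous version can be sketched like this. It is a simplified illustration with hypothetical names (`EventType`, `Frame`, `should_deliver()`), not the attached patch: only presses are checked against the view frame, and everything else passes through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event kinds; the real code deals with NSEvent types. */
typedef enum {
    EVENT_BUTTON_PRESS,
    EVENT_BUTTON_RELEASE,
    EVENT_MOTION
} EventType;

/* Hypothetical view frame, in the same coordinate space as the event. */
typedef struct {
    int x, y, width, height;
} Frame;

static bool inside_frame(const Frame *f, int x, int y)
{
    return x >= f->x && x < f->x + f->width &&
           y >= f->y && y < f->y + f->height;
}

/* Only presses are filtered. Releases and motion always pass, so a
 * press-move-release sequence that ends outside the view frame still
 * delivers its release. */
static bool should_deliver(EventType type, const Frame *view_frame,
                           int x, int y)
{
    assert(view_frame != NULL);
    if (type != EVENT_BUTTON_PRESS)
        return true;
    return inside_frame(view_frame, x, y);
}
```

A press at (10, -5), which would land on the title bar above a frame starting at y = 0, is dropped, while a release at (400, 10) outside a 300x200 frame is still delivered.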
Comment 47 Mikayla Hutchinson [MSFT] 2012-10-10 12:38:37 UTC
Thanks, updated patch in bockbuild.
Comment 48 Mikayla Hutchinson [MSFT] 2012-10-12 19:18:35 UTC
I still get the crash with the patch. It happened when I clicked on the editor immediately after a build. I suspect it has something to do with the animated "bubble" popout window.

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib        	0x9183fa6a __pthread_kill + 10
1   libsystem_c.dylib             	0x99b4bacf pthread_kill + 101
2   libsystem_c.dylib             	0x99b824f8 abort + 168
3   mono-sgen                     	0x00094c30 mono_handle_native_sigsegv + 624
4   mono-sgen                     	0x00005628 mono_sigsegv_signal_handler + 248
5   libsystem_c.dylib             	0x99b3686b _sigtramp + 43
6   ???                           	0xffffffff 0 + 4294967295
7   mono-sgen                     	0x00005530 mono_sigill_signal_handler + 64
8   libgdk-quartz-2.0.0.dylib     	0x052b74ad find_window_for_ns_event + 701 (gdkevents-quartz.c:870)
9   libgdk-quartz-2.0.0.dylib     	0x052b8fbb gdk_event_translate + 779 (gdkevents-quartz.c:1636)
10  libgdk-quartz-2.0.0.dylib     	0x052b9a30 _gdk_events_queue + 128 (gdkevents-quartz.c:1851)
11  libgdk-quartz-2.0.0.dylib     	0x052bafde gdk_event_dispatch + 222 (gdkeventloop-quartz.c:672)
12  libglib-2.0.0.dylib           	0x048ffd21 g_main_dispatch + 513 (gmain.c:2441)
13  libglib-2.0.0.dylib           	0x0490161b g_main_context_dispatch + 155 (gmain.c:3014)
14  libglib-2.0.0.dylib           	0x04901c4a g_main_context_iterate + 1466 (gmain.c:3092)
15  libglib-2.0.0.dylib           	0x0490258d g_main_loop_run + 1037 (gmain.c:3296)
16  libgtk-quartz-2.0.0.dylib     	0x04d35be0 gtk_main + 240 (gtkmain.c:1258)
17  ???                           	0x1288657c 0 + 310928764
18  ???                           	0x12886544 0 + 310928708
19  ???                           	0x12886524 0 + 310928676
20  ???                           	0x02fda66c 0 + 50177644
21  ???                           	0x004c0fe8 0 + 4984808
22  ???                           	0x004c0de4 0 + 4984292
23  ???                           	0x004c0eaa 0 + 4984490
24  mono-sgen                     	0x0000d9e2 mono_jit_runtime_invoke + 722
25  mono-sgen                     	0x001aa27a mono_runtime_invoke + 170
26  mono-sgen                     	0x001acd9c mono_runtime_exec_main + 620
27  mono-sgen                     	0x001ac001 mono_runtime_run_main + 929
28  mono-sgen                     	0x00069715 mono_jit_exec + 149
29  mono-sgen                     	0x0006bca9 mono_main + 9609
30  mono-sgen                     	0x000028d9 main + 553
31  mono-sgen                     	0x00002665 start + 53
Comment 49 Kristian Rietveld (inactive) 2012-10-13 04:34:01 UTC
My plan is to put together another valgrind smash test mimicking the animated bubbles. This strategy has worked before, so hopefully it will help us fix this case as well.
Comment 50 Michael Natterer 2012-11-13 04:38:56 UTC
The patch here has gone off my radar. Kris, shouldn't we upstream the
patch in comment 46?
Comment 51 Michael Natterer 2012-11-13 04:45:04 UTC
What I mean is, while this patch doesn't seem to fix the crash Michael
pasted in comment 48, it does fix a whole class of other crashes, and
it looks correct to me.
Comment 52 Kristian Rietveld (inactive) 2012-11-14 02:43:17 UTC
> The patch here has gone off my radar. Kris, shouldn't we upstream the
> patch in comment 46?

I wanted to test the updated patches (that fix the issue in GIMP) with the valgrind test case.  I have done that now, so we should be upstreaming this now.
Comment 53 Kristian Rietveld (inactive) 2012-11-14 02:44:05 UTC
I will get back to comment 48 later of course.
Comment 54 Michael Natterer 2012-11-15 06:41:38 UTC
The patch has been pushed to upstream gtk-2-24, gtk-3-6 and master.
Comment 55 Mikayla Hutchinson [MSFT] 2012-11-15 17:20:23 UTC
Updated bockbuild.
Comment 56 Kristian Rietveld (inactive) 2012-11-24 14:20:44 UTC
To look into the trace in comment 48 (actually into the comment on the "bubbles", because the trace just looks like any other in this bug...) I have started to try to trigger crashes with a resizing popup window similar to the bubbles on top of a text view.  So far I have not succeeded.

What I did find is that the bubbles actually move to be behind the MonoDevelop window when one clicks inside the MonoDevelop window. With the normal bubbles this cannot be seen because the animation is so quick.  But if you extend the duration of the animation, this behavior does appear. Perhaps this is triggered when the press and release happen in the MonoDevelop editor and the bubble respectively, so I have to focus on events before the bubble and not just after the bubble has appeared. I will try this avenue next.
Comment 57 Kristian Rietveld (inactive) 2012-12-02 11:59:05 UTC
A button press before a simulated bubble window appears and releasing within it doesn't seem to be any problem.


I still do not really have a clue what could be causing the crashes Michael is still seeing.

My plan of action from now on would be the following:

 - For the earlier fix, we observed that the crash that happened could not actually happen due to guards being in place. Since we were quite happy having a fix, we did not further research this apparent discrepancy. My plan is to revisit these original fixes and figure out whether the guard worked or not (basically, did we really deal with NULL, a busted pointer, or an object already freed). Potentially, this could bring to light a race condition that could perhaps give an idea where to look for further crashes. In case of an object already being freed, can we figure out a simple debug tool (perhaps like NSZombies, though we don't have the flexibility the Objective-C runtime has) that we can use without valgrind, and then run MonoDevelop with this enabled?

 - In comment 41 Michael indicates that this often happens immediately after a build. I *assume* the build is triggered through the menu, by either clicking or using the menu accelerator. I want to see if this menu can have any impact on events that are delivered right after.  (For example, I did see in the past that clicking on the global menu bar gives the GTK+ application some events with NSWindow == NULL. The question is whether this happens after the use of accelerators too and how this affects us).

 - Also with regard to comment 41, I tried to emulate the bubble windows on top of text view. This didn't seem to bring up any issues.  I guess next, I should try to stress test this on top of the MonoDevelop text editor.
Comment 58 Kristian Rietveld (inactive) 2012-12-02 12:00:28 UTC
Michael, it would actually be quite good to know:

 (a) whether you are still observing this crash and whether this still regularly occurs after a build.

 (b) whether in comment 41 you triggered the build with the menu (either by clicking or keyboard shortcut) or by using a toolbar button.  (I would assume you used the keyboard shortcut).
Comment 59 Kristian Rietveld (inactive) 2012-12-10 03:08:02 UTC
More observations:

 1) I cannot trigger the bug with the test case with valgrind on my Lion machine.  Maybe the machine is too fast.

 2) On my Snow Leopard machine, the crash is triggered by grab_toplevel being NULL, so get_window_point_from_screen_point crashes due to a read from 0x18.

 3) I have pulled the GDK Quartz code through the Clang static analyzer and found nothing that could affect this. So it is likely not some obvious uninitialized variable causing this.


The guards that have been in place in the GTK+ compiled by bockbuild check for grab_toplevel being NULL. In that case, this crash cannot occur, though you will see a bunch of GTK+ assertions. I am using the stack trace here from comment 48, where "find_window_for_ns_event + 701" is the instruction after the call to get_window_point_from_screen_point which indicates that that call is in progress.

The only way (as far as I can see now) the crash can then occur is as follows:

 1) A non-implicit grab is in place on a toplevel window.

 2) Owner events is zero, such that grab->window is checked.

 3) grab->window must be non-NULL and resemble a valid GdkWindow object, otherwise you cannot get past the call to gdk_window_get_effective_toplevel().

 4) The return value (stored in grab_toplevel) of gdk_window_get_effective_toplevel() must be non NULL to make it into get_window_point_from_screen_point.  This implies that the effective toplevel call has not failed (you would have gotten NULL), so you either get grab->window back or another GdkWindow which did pass the call to gdk_window_get_window_type() (which checks NULL and GDK_IS_WINDOW).


So I *think* we can state grab_toplevel will either be NULL or a pointer which passes GDK_IS_WINDOW. NULL is guarded for.
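
The four conditions can be written down as a chain of checks. This is a sketch under stated assumptions: `Window`, `Grab`, `effective_toplevel()` and `grab_target()` are hypothetical stand-ins rather than the real GDK structures, and the sketch cannot model a freed or corrupted pointer, which would pass every check below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GdkWindow and the pointer grab info. */
typedef struct Window {
    struct Window *parent;
    bool is_toplevel;
} Window;

typedef struct {
    Window *window;
    bool implicit;
    bool owner_events;
} Grab;

/* Walks up to the nearest toplevel; returns NULL if there is none. */
static Window *effective_toplevel(Window *w)
{
    while (w != NULL && !w->is_toplevel)
        w = w->parent;
    return w;
}

/* Returns the toplevel whose coordinates would be translated, or NULL
 * if this code path is not taken. */
static Window *grab_target(const Grab *grab)
{
    if (grab == NULL || grab->implicit)
        return NULL;                         /* 1) non-implicit grab */
    if (grab->owner_events)
        return NULL;                         /* 2) owner_events is zero */
    if (grab->window == NULL)
        return NULL;                         /* 3) grab->window non-NULL */
    return effective_toplevel(grab->window); /* 4) may still be NULL */
}
```

A caller that checks the result for NULL before translating coordinates covers every case this sketch can express, so a crash past that point suggests a pointer that is non-NULL but no longer valid.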

How can we crash on a GdkWindow pointer? I guess possibly if we access a GdkWindow that has been freed. But a) it wouldn't surprise me if GObject scrubs the type id on destroy (need to check that), b) we have never found this case so far, because valgrind would complain about accessing freed memory. The other case is if we access a GdkWindow pointer that has been corrupted internally (either ->impl in GdkWindowObject, or the toplevel field in GdkWindowImplQuartz). Again, we haven't found a code path like this, because valgrind would IIRC give a hint of such corruption as well.


Is it possible this is caused by some Mono memory corruption like what is mentioned as a possibility in bug 7820?
Comment 60 Mikayla Hutchinson [MSFT] 2012-12-10 20:21:59 UTC
I don't think so, you'd think JIT memory corruption would be more widespread. This seems very specific.

OTOH that does raise a question, maybe that popup window is being shown by a thread? I'll check tomorrow.
Comment 61 Kristian Rietveld (inactive) 2012-12-11 14:44:01 UTC
Before suggesting this, I also thought JIT memory corruption would be more widespread and that this case is too specific.

Though from what I recall, this happens most often after completing a build and it is very rare. I would assume performing a build involves threading. Is there anything else that could possibly corrupt memory? Any other tools we can use for debugging?
Comment 62 Mikayla Hutchinson [MSFT] 2012-12-11 15:39:06 UTC
Nah, it's not a threading issue, we marshal the build completion event back onto the UI thread.

I hadn't seen this crash in a while, and I assumed it was because I'd trained myself not to click on the editor while the animation was running. However, I can't repro the issue now - and I tried for about 10 minutes. It may be that it's been inadvertently fixed by one of the other patches, or maybe it is a Mono JIT issue that's been fixed, or maybe I was just lucky.

Let's put this bug on hold for a while until we get another repro on the latest builds.
Comment 63 Kristian Rietveld (inactive) 2012-12-11 15:43:40 UTC
Sounds good to me -- thank you for testing.   I can assure you I did test repeatedly clicking on the editor while the animation was running (I even modified the animation to run for a longer period of time).


By the way, in case another repro is found I would be very interested in the log messages in the console from where MonoDevelop was launched next to the stack trace  (if MonoDevelop was launched from the console ...).
Comment 64 Mikayla Hutchinson [MSFT] 2013-05-29 12:25:10 UTC
This hasn't been reported on recent builds, I think we can consider it fixed :)
Comment 65 Michael Natterer 2013-05-29 15:11:28 UTC
There goes my oldest open browser tab... :)