Bug 10072 - double free or corruption from tp_poll_wait
Summary: double free or corruption from tp_poll_wait
Status: RESOLVED FIXED
Alias: None
Product: Runtime
Classification: Mono
Component: io-layer ()
Version: unspecified
Hardware: PC Mac OS
Importance: --- normal
Target Milestone: ---
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2013-02-06 03:57 UTC by Roope Kangas
Modified: 2013-08-26 08:26 UTC
CC List: 3 users

Tags:
Is this bug a regression?: ---
Last known good build:


Description Roope Kangas 2013-02-06 03:57:33 UTC
While running a load test on our server:

export MONO_DISABLE_AIO=1
gdb --args /opt/mono-2.10.9/bin/mono-sgen --debug turf-server.exe ... more args..

In gdb, set "handle SIGPIPE nostop noprint pass".

Run the server with around 500-600 connected sockets, each doing 1-3 send/recv operations per second.

After a while (a random amount of time, from one hour to several), gdb reports:

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffb7dad700 (LWP 2126)]
0x00007ffff71cf8a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64        return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb)
(gdb) backtrace
#0  0x00007ffff71cf8a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007ffff71d1085 in abort () at abort.c:92
#2  0x00007ffff720cfe7 in __libc_message (do_abort=2, fmt=0x7ffff72f47c0 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198
#3  0x00007ffff7212916 in malloc_printerr (action=3, str=0x7ffff72f4b00 "double free or corruption (out)", ptr=<value optimized out>) at malloc.c:6311
#4  0x00007ffff7215443 in _int_free (av=0x7ffff752be80, p=0x7fffac0022f0, have_lock=0) at malloc.c:4811
#5  0x00000000005ebb03 in monoeg_g_free (ptr=0x7fffac002300) at gmem.c:36
#6  0x00000000005ab10d in tp_poll_wait (p=0x906300) at ../../mono/metadata/tpool-poll.c:213
#7  0x00000000005b3551 in start_wrapper_internal (data=0x2304d20) at threads.c:784
#8  start_wrapper (data=0x2304d20) at threads.c:832
#9  0x00000000005dd962 in thread_start_routine (args=0x9ff550) at wthreads.c:287
#10 0x000000000058a52d in gc_start_thread (arg=0x2304ee0) at sgen-gc.c:6154
#11 0x00007ffff7537851 in start_thread (arg=0x7fffb7dad700) at pthread_create.c:301
#12 0x00007ffff728511d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115


I think I can narrow this down to our code trying to write to a closed socket. Avoiding such writes seems to avoid the problem.
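
For context, here is a minimal C sketch of the suspected trigger condition: a write to a stream socket whose peer has already gone away. It is not Mono code and does not reproduce the double free. The helper name write_after_peer_close is hypothetical, and the sketch assumes a Linux/POSIX system and a process that ignores SIGPIPE (under gdb, the "handle SIGPIPE nostop noprint pass" setting above only keeps gdb from stopping on the signal).

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not part of Mono: returns the errno produced by
 * writing to a socket whose peer has already been closed (0 if the
 * write unexpectedly succeeds). */
int write_after_peer_close(void)
{
    int fds[2];

    /* A connected pair of local stream sockets stands in for a client link. */
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
    assert(rc == 0);

    /* Ignore SIGPIPE so the failed write returns an error instead of
     * terminating the process. */
    signal(SIGPIPE, SIG_IGN);

    /* The peer goes away. */
    close(fds[1]);

    /* Write to the dead connection and capture errno before any other call. */
    ssize_t n = write(fds[0], "x", 1);
    int err = (n < 0) ? errno : 0;

    close(fds[0]);
    return err;
}
```

Calling write_after_peer_close() from a small main should report EPIPE on Linux. A server could treat that error as a signal to drop the connection rather than retry the write, which is one way to avoid writing to a closed socket.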
Comment 1 Rodrigo Kumpera 2013-08-23 14:28:32 UTC
Please provide a test case.
Comment 2 Roope Kangas 2013-08-26 05:33:42 UTC
Hi!

Since bug https://bugzilla.xamarin.com/show_bug.cgi?id=10127 was fixed, I have not seen this crash in load testing or in any other environment.
Comment 3 Rodrigo Kumpera 2013-08-26 08:26:53 UTC
Marking this one as fixed for now then.