Bug 9936 - UDP socket hangs and console prints: "Operation on non-blocking socket would block"
Status: RESOLVED FIXED
Alias: None
Product: Class Libraries
Classification: Mono
Component: System ()
Version: master
Hardware: PC Linux
Importance: --- normal
Target Milestone: Untriaged
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2013-01-31 05:56 UTC by Esben Laursen
Modified: 2013-04-08 06:37 UTC (History)

Tags:
Is this bug a regression?: ---
Last known good build:


Attachments
UdpClient.cs, client app (1.74 KB, text/plain)
2013-01-31 05:58 UTC, Esben Laursen
UdpServer.cs, Server App (963 bytes, text/plain)
2013-01-31 05:59 UTC, Esben Laursen



Description Esben Laursen 2013-01-31 05:56:59 UTC
My initial problem was with SharpSnmpLib (http://sharpsnmplib.codeplex.com/): GetMessage would hang for a very long time, at least 9 minutes.
I could also see that it had more "timeouts" than a Windows-based host, even though the socket timeout was set to 3000 ms.

I tried to investigate, so I created a simple client and server app that work in the following way:

1. Client initiates a new UDP socket
2. Client sends a datagram to the server with ~500 bytes of data
3. Client starts listening on the source port
4. Server reads the socket and mirrors the data back to the client on the source port
5. Client closes the socket and loops back to 1.
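
The five steps above can be sketched roughly as follows. This is a Python illustration, not the attached C# programs; the function names `echo_server` and `run_client`, the 2048-byte receive buffer, and the loopback address are assumptions made for the example. Only the ~500-byte payload and the 3 s timeout come from this report.

```python
import socket
import threading


def echo_server(sock, count):
    """Step 4: mirror `count` datagrams back to the sender's source port."""
    for _ in range(count):
        data, addr = sock.recvfrom(2048)
        sock.sendto(data, addr)


def run_client(server_addr, rounds, timeout_s=3.0):
    """Steps 1-3 and 5: open a fresh socket per round, send, wait for the echo, close."""
    payload = b"x" * 500  # roughly the datagram size described in the report
    echoed = 0
    for _ in range(rounds):
        # The `with` block closes the socket at the end of every round (step 5).
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
            client.settimeout(timeout_s)
            client.sendto(payload, server_addr)
            try:
                data, _addr = client.recvfrom(2048)
            except socket.timeout:
                print("receive timed out")
                continue
            if data == payload:
                echoed += 1
    return echoed


if __name__ == "__main__":
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    worker = threading.Thread(target=echo_server, args=(server, 5), daemon=True)
    worker.start()
    print(run_client(server.getsockname(), 5))
    worker.join(timeout=2)
    server.close()
```

Running both ends on one host, as here, corresponds to the "same Debian host" case below, which worked; reproducing the failure needed the client and server on different machines.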

This is "only" a single-threaded application, and my guess is that it's much worse with a multithreaded app.

Anyway, here is what I tried:

* Running server and client app in Windows 7, works fine
* Running client app on Windows 7 and server app on Debian, works fine
* Running client and server app on the SAME Debian host, works fine
* Running client app on Debian and the server App on Windows 7, breaks!

This is the output from the client app when it fails:

Operation on non-blocking socket would block
Operation on non-blocking socket would block
Operation on non-blocking socket would block
Operation on non-blocking socket would block
Operation on non-blocking socket would block 

It does not happen all the time, but often enough to cause a dramatic loss of performance. I am guessing this is the problem I am hitting in my own code, where 100+ threads send out 45+ UDP datagrams and expect a response (SNMPGet).

I have also noticed that the client app seems to be "leaking" sockets (if that even makes sense). I can see this with a "netstat -an | grep -P '^udp'" command. I would expect Mono to close the socket once it is done with it, but that does not always seem to be the case.

I have tested my app with Mono 2.6.7 and 3.0.3, but I am fairly sure that the problem also exists in the 2.8.x branch.

This is the Mono version I have primarily tested with:

mono-sgen --version
Mono JIT compiler version 3.0.3 (tarball Tue Jan 22 09:49:02 UTC 2013)
Copyright (C) 2002-2012 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
        TLS:           __thread
        SIGSEGV:       altstack
        Notifications: epoll
        Architecture:  x86
        Disabled:      none
        Misc:          softdebug
        LLVM:          supported, not enabled.
        GC:            sgen

I will try and upload the UdpClient.cs and UdpServer.cs that I have tested with once the bug report has been created.

Cheers

Esben
Comment 1 Esben Laursen 2013-01-31 05:58:41 UTC
Created attachment 3313 [details]
UdpClient.cs, client app

Just compile with mcs UdpClient.cs

Remember to change the IP address of the server ;-)
Comment 2 Esben Laursen 2013-01-31 05:59:39 UTC
Created attachment 3314 [details]
UdpServer.cs, Server App

Just compile with mcs UdpServer.cs and that should be it. It listens on port 5001 by default.
Comment 3 Andres G. Aragoneses 2013-02-13 12:46:25 UTC
Have you tested with the Boehm GC instead of sgen?
Comment 4 Esben Laursen 2013-02-14 03:38:11 UTC
Hi Andres,

Many thanks for your attention to this bug.

Here is the Mono version with Boehm:

root@agent:~# mono --version
Mono JIT compiler version 3.0.3 (tarball Tue Jan 22 09:49:02 UTC 2013)
Copyright (C) 2002-2012 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
        TLS:           __thread
        SIGSEGV:       altstack
        Notifications: epoll
        Architecture:  x86
        Disabled:      none
        Misc:          softdebug
        LLVM:          supported, not enabled.
        GC:            Included Boehm (with typed GC and Parallel Mark)


and this is the output of the client:

root@agent:~# mono UdpClient.exe
Started Client...
Operation on non-blocking socket would block
Operation on non-blocking socket would block
Operation on non-blocking socket would block
Operation on non-blocking socket would block
Operation on non-blocking socket would block
Operation on non-blocking socket would block
Operation on non-blocking socket would block

So yes, it's also a problem with Boehm.
Comment 5 Andres G. Aragoneses 2013-02-24 11:37:29 UTC
Hey Esben, I finally found time to look at this, and sadly was not able to reproduce the bug. It works perfectly for me with Mono 3.0.4 on 64-bit Ubuntu 12.10 (using Boehm). Can you give me more details about your setup?
Comment 6 Esben Laursen 2013-03-20 11:23:07 UTC
After some great help from Andres, it looks like the problem was related to my VMware instance and not Mono. I have tried to reproduce the problem with 2 physical devices and there is no problem.

However, the error description (Operation on non-blocking socket would block) is wrong, so I will leave the case open until Andres's pull request to fix this is either merged or rejected.

https://github.com/mono/mono/pull/601
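
One plausible reason the message is misleading, stated here as an assumption rather than something confirmed in this thread: on Linux, when a receive timeout set with SO_RCVTIMEO expires on a blocking socket, the call fails with EAGAIN/EWOULDBLOCK, the same errno a non-blocking socket reports when no data is ready. A runtime that translates that errno literally would print a "would block" message for what is really a timeout. The Python sketch below (not Mono code; `recv_with_kernel_timeout` is a hypothetical helper, and the `struct timeval` packing assumes a typical Linux layout) shows the errno in question.

```python
import errno
import socket
import struct


def recv_with_kernel_timeout(timeout_s):
    """Wait on a blocking UDP socket with SO_RCVTIMEO set; return the errno of the failure."""
    sec = int(timeout_s)
    usec = int((timeout_s - sec) * 1_000_000)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("127.0.0.1", 0))
        # Assumption: struct timeval is two native longs, as on typical Linux builds.
        timeval = struct.pack("ll", sec, usec)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVTIMEO, timeval)
        try:
            # Nobody sends to this socket, so the kernel timeout is what ends the wait.
            sock.recvfrom(2048)
        except OSError as exc:
            return exc.errno
    return None


if __name__ == "__main__":
    code = recv_with_kernel_timeout(0.2)
    print(code in (errno.EAGAIN, errno.EWOULDBLOCK))
```

The socket here is never switched to non-blocking mode, yet the failure carries the "would block" errno, so the caller has to know a timeout was configured in order to report it as a timeout.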
Comment 7 Andres G. Aragoneses 2013-04-08 06:04:09 UTC
Esben, the pull request has been merged, you can close the bug.
Comment 8 Esben Laursen 2013-04-08 06:37:52 UTC
The pull request has been merged, so the case is closed.