Bug 37624 - Websocket loses messages when data is sent before receiveAsync is called
Summary: Websocket loses messages when data is sent before receiveAsync is called
Status: RESOLVED FIXED
Alias: None
Product: Class Libraries
Classification: Mono
Component: System ()
Version: 4.2.0 (C6)
Hardware: PC Mac OS
: High normal
Target Milestone: Untriaged
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2016-01-12 18:01 UTC by David
Modified: 2018-04-30 17:45 UTC (History)
8 users (show)

Tags:
Is this bug a regression?: ---
Last known good build:


Attachments
Android Application to Test Websocket (17.04 KB, application/zip)
2016-01-12 18:01 UTC, David
Details


Notice (2018-05-24): bugzilla.xamarin.com is now in read-only mode.

Please join us on Visual Studio Developer Community and in the Xamarin and Mono organizations on GitHub to continue tracking issues. Bugzilla will remain available for reference in read-only mode. We will continue to work on open Bugzilla bugs, copy them to the new locations as needed for follow-up, and add the new items under Related Links.

Our sincere thanks to everyone who has contributed on this bug tracker over the years. Thanks also for your understanding as we make these adjustments and improvements for the future.


Please create a new report on GitHub or Developer Community with your current version information, steps to reproduce, and relevant error messages or log files if you are hitting an issue that looks similar to this resolved bug and you do not yet see a matching new report.

Related Links:
Status:
RESOLVED FIXED

Description David 2016-01-12 18:01:49 UTC
Created attachment 14557 [details]
Android Application to Test Websocket

When connecting on websocket, data that is sent prior to calling receiveAsync is occasionally lost.  The timeline is 
1. Open Client Websocket, 
2. Websocket server sends first message with data, 
3. Client calls receiveAsync. 
4. Server sends second message with data

Expected behavior: data is received from first message.
Actual behavior: return value for receiveAsync is occasionally the second message

To reproduce, 

Create a websocket connection to `ws://echo.websocket.org?.kl=Y`.  The parameter on that connection will tell the server to respond with bytes 0x82, 0x04, 0xef, 0x83, 0xbf, 0x4e.  Once open, call receiveAsync.  (You may need a sleep in between open a readAsync to get the timing right).  Notice that message never arrives, and the connection stays open.  Wireshark or capture the traffic via proxy and you will see 0x82, 0x04, 0xef, 0x83, 0xbf, 0x4e on the wire

I've attached a sample project, here is the main code

```

        private void WebSocketTest(string p)
        {
            try
            {
                ClientWebSocket ws = new ClientWebSocket();
                ws.ConnectAsync(new Uri(p), CancellationToken.None).Wait();

                ArraySegment<byte> message = new ArraySegment<byte>(new byte[100]);
                ArraySegment<byte> data = new ArraySegment<byte>(Encoding.UTF8.GetBytes("Test Message"));
                Task<WebSocketReceiveResult> t = Task<WebSocketReceiveResult>.Run(() =>
                {
                    return ws.ReceiveAsync(message, CancellationToken.None);
                });
                Console.WriteLine("Send");
                ws.SendAsync(data, WebSocketMessageType.Text, true, CancellationToken.None).Wait();
                
                t.Wait();
                // the first received message is TEXT message, the BINARY messsage was lost
                Console.WriteLine("Message Type: {0} Length: {1}", t.Result.MessageType, t.Result.Count);
                t = Task<WebSocketReceiveResult>.Run(() =>
                {
                    return ws.ReceiveAsync(message, CancellationToken.None);
                });
                
                t.Wait();
                Console.WriteLine("Message Type: {0} Length: {1}", t.Result.MessageType, t.Result.Count);
                Console.WriteLine("Close");
                Task t2 = ws.CloseAsync(WebSocketCloseStatus.Empty, "", CancellationToken.None);
                t2.Wait();
                Console.WriteLine("Complete");
            }
            catch (AggregateException ex)
            {
                Console.WriteLine("Exception: " + ex.InnerException.ToString());
            }

```
Comment 2 Marek Habersack 2016-02-01 20:17:13 UTC
It seems there's a race condition in the connection establishment code in the Mono implementation of the WebSockets API. I rewrote the code from OP (will elaborate a bit farther down) and ran it with desktop Mono (using mono master/509c982) to find that it works sometimes:

$ mono --debug t.exe  ws://echo.websocket.org?.kl=Y
First receive
Message Type: Binary Length: 4
Send
Message Type: Text Length: 12
Close
Complete

and other times it fails to receive the initial data. Additionally, the connection tear-down on time outs results in this exception:

$ mono --debug t.exe  ws://echo.websocket.org?.kl=Y
First receive

Unhandled Exception:
System.IndexOutOfRangeException: Index was outside the bounds of the array.
  at System.Net.WebSockets.ClientWebSocket+<ReceiveAsync>c__AnonStorey5.<>m__0 () [0x00247] in /home/grendel/tmp/build/mono/mono/mcs/class/System/System.Net.WebSockets/ClientWebSocket.cs:272 
  at System.Threading.Tasks.Task`1[TResult].InnerInvoke () <0x7f2286c0b860 + 0x00053> in <filename unknown>:0 
  at System.Threading.Tasks.Task.Execute () [0x00016] in /home/grendel/tmp/build/mono/mono/external/referencesource/mscorlib/system/threading/Tasks/Task.cs:2502 
--- End of stack trace from previous location where exception was thrown ---
  at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw () [0x0000c] in /home/grendel/tmp/build/mono/mono/external/referencesource/mscorlib/system/runtime/exceptionservices/exceptionservicescommon.cs:143 
  at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Threading.Tasks.Task task) [0x00047] in /home/grendel/tmp/build/mono/mono/external/referencesource/mscorlib/system/runtime/compilerservices/TaskAwaiter.cs:187 
  at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Threading.Tasks.Task task) [0x0002e] in /home/grendel/tmp/build/mono/mono/external/referencesource/mscorlib/system/runtime/compilerservices/TaskAwaiter.cs:156 
  at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd (System.Threading.Tasks.Task task) [0x0000b] in /home/grendel/tmp/build/mono/mono/external/referencesource/mscorlib/system/runtime/compilerservices/TaskAwaiter.cs:128 
  at System.Runtime.CompilerServices.TaskAwaiter`1[TResult].GetResult () <0x7f2286b797c0 + 0x00017> in <filename unknown>:0 
  at app+<WebSocketTestAsync>c__async0.MoveNext () [0x00128] in /tmp/t.cs:22 
--- End of stack trace from previous location where exception was thrown ---
  at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw () [0x0000c] in /home/grendel/tmp/build/mono/mono/external/referencesource/mscorlib/system/runtime/exceptionservices/exceptionservicescommon.cs:143 
  at System.Runtime.CompilerServices.AsyncMethodBuilderCore.<ThrowAsync>m__1 (System.Object state) [0x00000] in /home/grendel/tmp/build/mono/mono/external/referencesource/mscorlib/system/runtime/compilerservices/AsyncMethodBuilder.cs:1034 
  at System.Threading.QueueUserWorkItemCallback.WaitCallback_Context (System.Object state) [0x0000e] in /home/grendel/tmp/build/mono/mono/external/referencesource/mscorlib/system/threading/threadpool.cs:1307 
  at System.Threading.ExecutionContext.RunInternal (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state, Boolean preserveSyncCtx) [0x0008d] in /home/grendel/tmp/build/mono/mono/external/referencesource/mscorlib/system/threading/executioncontext.cs:957 
  at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state, Boolean preserveSyncCtx) [0x00000] in /home/grendel/tmp/build/mono/mono/external/referencesource/mscorlib/system/threading/executioncontext.cs:904 
  at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem () [0x0002a] in /home/grendel/tmp/build/mono/mono/external/referencesource/mscorlib/system/threading/threadpool.cs:1284 
  at System.Threading.ThreadPoolWorkQueue.Dispatch () [0x00096] in /home/grendel/tmp/build/mono/mono/external/referencesource/mscorlib/system/threading/threadpool.cs:857 
  at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback () [0x00000] in /home/grendel/tmp/build/mono/mono/external/referencesource/mscorlib/system/threading/threadpool.cs:1212 

The rewritten code looks as follows:

async void WebSocketTest (string p)
{
  var ws = new ClientWebSocket();
  await ws.ConnectAsync (new Uri (p), CancellationToken.None);
  var message = new ArraySegment<byte> (new byte [100]);
  var data = new ArraySegment<byte> (Encoding.UTF8.GetBytes ("Test Message"));

  Console.WriteLine ("First receive");
  WebSocketReceiveResult result = await ws.ReceiveAsync (message, CancellationToken.None);
  Console.WriteLine("Message Type: {0} Length: {1}", result.MessageType, result.Count);

  Console.WriteLine("Send");
  await ws.SendAsync (data, WebSocketMessageType.Text, true, CancellationToken.None);

  result = await ws.ReceiveAsync (message, CancellationToken.None);
  Console.WriteLine("Message Type: {0} Length: {1}", result.MessageType, result.Count);

  Console.WriteLine("Close");
  await ws.CloseAsync (WebSocketCloseStatus.Empty, String.Empty, CancellationToken.None);
  Console.WriteLine("Complete");
}

which fixes issues with the OP code but doesn't fix the actual bug. The problem with the OP code is that it makes use of the TPL directly in a manner that doesn't guarantee order of operations on the socket and it is perfectly possible that the SendAsync call is executed before the first ReceiveAsync, or vice-versa, especially that calling async from within a task spins up two threads (which is not necessary). The async/await version of the code is free of the issues. Another possible approach is to use .ContinueWith on the tasks returned from the *Async calls, but with async/await available it is not necessary/recommended.

Anyway, there's most certainly a bug in the Mono's System.Net.WebSockets implementation, or rather two of them. Since the issue isn't specific to Xamarin.Android, I'm moving the bug to the Mono BCL component.

I think the best course of action would be to import System.Net.WebSockets implementation from reference source, if possible.
Comment 4 Marek Safar 2017-04-03 12:50:53 UTC
It should be fixed in master