Bug 10464 - Azure Mobile Services timeout
Summary: Azure Mobile Services timeout
Status: RESOLVED FIXED
Alias: None
Product: Android
Classification: Xamarin
Component: Mono runtime / AOT Compiler ()
Version: 4.4.x
Hardware: PC Mac OS
: Normal normal
Target Milestone: ---
Assignee: Alex Rønne Petersen
URL:
Depends on:
Blocks:
 
Reported: 2013-02-20 08:58 UTC by Allie Miller
Modified: 2013-05-24 08:34 UTC (History)
4 users (show)

Tags:
Is this bug a regression?: ---
Last known good build:


Attachments
The Operation Has Timed Out Error (43.27 KB, image/png)
2013-02-20 08:58 UTC, Allie Miller
Details
Demo project (3.61 MB, application/zip)
2013-02-20 08:58 UTC, Allie Miller
Details
New demo project with Android Device Logging output (222.92 KB, application/x-zip-compressed)
2013-03-13 13:06 UTC, Olivier Ansquer
Details


Notice (2018-05-24): bugzilla.xamarin.com is now in read-only mode.

Please join us on Visual Studio Developer Community and in the Xamarin and Mono organizations on GitHub to continue tracking issues. Bugzilla will remain available for reference in read-only mode. We will continue to work on open Bugzilla bugs, copy them to the new locations as needed for follow-up, and add the new items under Related Links.

Our sincere thanks to everyone who has contributed on this bug tracker over the years. Thanks also for your understanding as we make these adjustments and improvements for the future.


Please create a new report on Developer Community or GitHub with your current version information, steps to reproduce, and relevant error messages or log files if you are hitting an issue that looks similar to this resolved bug and you do not yet see a matching new report.

Related Links:
Status:
RESOLVED FIXED

Description Allie Miller 2013-02-20 08:58:02 UTC
Created attachment 3427 [details]
The Operation Has Timed Out Error

When using Azure Mobile services from Mono for Android, the error message "The operation has timed out" appears.

Before the error message, the application waits 5 minutes (HttpWebRequest.ReadWriteTimeout default value).
You can reproduce the probleme with the attached demo project. When you click on the button, the application loads 5000 rows from azure. This usually fails about half the time.
Comment 1 Allie Miller 2013-02-20 08:58:58 UTC
Created attachment 3428 [details]
Demo project
Comment 3 Jonathan Pryor 2013-02-25 15:12:47 UTC
I believe that this is an Azure bug.

Things go "weird"/"wrong" when HttpWebResponse instances are not properly Dispose()d of, e.g.

https://bugzilla.novell.com/show_bug.cgi?id=648862#c9

Unfortunately, this appears to occur in:

azure-mobile-services-master/sdk/xamarin/android/src/Microsoft.Azure.Zumo.Android/ServiceFilter/ServiceFilterRequest.cs:
> ServiceFilterRequest.GetResponseAsync() doesn't appear to dispose.

azure-mobile-services-master/sdk/xamarin/libs/Xamarin.Auth/src/Xamarin.Auth/OAuth1Authenticator.cs
> OAuth1Authenticator.GetInitialUrlAsync() doesn't appear to Dispose() `respTask.Result`.

(and probably other spots; I'm only grepping for these things...)

The probable result is that HttpWebResponse instances are NOT being Dispose()d of, which causes a resource "leak", which results in timeouts.

The proper fix is to fix Azure so that it properly disposes of these instances.

The workaround is to "help dispose the HttpWebRespons instances" by provoking a GC. Change Activity1.Load() to call GC.Collect() within the `while` loop:

            while (currentTaskRowCount < maxRowCount)
            {
                System.GC.Collect ();
                ...

This allows me to reliably click the Load() button without getting any WebExceptions.
Comment 4 Olivier Ansquer 2013-02-26 03:25:21 UTC
Hi,

In ServiceFilterRequest, GetResponseAsync returns a ServiceFilterResponse.
In its constructor, this code is executed :

using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
   this.Content = reader.ReadToEnd();
}

So the Stream returned by GetResponseStream is correctly closed (it is the same code in HttpWebResponse.Dispose).

For the second example in "Xamarin.Auth", it is your code no ? And I think there is no problem in this code. GetResponseText close correctly the Stream in HttpWebResponse.

For your workaround, I will try it but if it works, the problem is not for me in Azure.

Best regards,

Olivier Ansquer
Comment 5 Olivier Ansquer 2013-02-26 04:09:48 UTC
I just test GC.Collect. This does not solve the problem.
I have uploaded 28000 row 250 by 250 (so 112 web requests).
The task was stopped 4 times by the problem. 4 times means a long wait of 20 minutes.

With a Windows 8 test application, it seems that there was no problem. I'll test again to be sure but it seems to happen only with my Android application.

Best regards,

Olivier Ansquer
Comment 6 Jonathan Pryor 2013-02-27 10:47:03 UTC
@Olivier: Regarding Comment #4, the problem isn't that `response.GetResponseStream()` isn't disposed, the problem is that `response` itself is not disposed, and it's `response` that needs to be disposed.
Comment 7 Olivier Ansquer 2013-02-27 11:06:06 UTC
Ok but if the code in the Dispose method is executed, where is the problem ?
The only thing made ​​in the code to the Dispose method is to close the Stream returned by the GetResponseStream method.
Is there hidden code executed when Dispose is called ?
I have also try do add Dispose in the code. It doesn't solved the problem.
Moreover, it happens that the timeout happens very quickly on the first request.
The code comes from https://github.com/xamarin/azure-mobile-services.
If Dispose calls are missing, can you (Xamarin) add them ?

In a Windows 8 application, with the same code without Dispose calls, I have done three uploaded (28000 rows * 3, 336 wbe requests) and it works perfectly (0 timeout).
Comment 8 Jonathan Pryor 2013-02-27 11:07:19 UTC
@Olivier: Regarding Comment #5, what target are you running on? I tested on a Nexus 10 and didn't observe the network timeouts after adding the GC.Collect().

Historically, things start getting "weird" on Mono when 3+ HttpWebRequests are "live" at once (i.e. haven't been disposed); you may be hitting this. Your W8 project, running on Windows 8, is a different platform and thus will exhibit different behavior.

That said, consider the following program:

  using System;
  using System.IO;
  using System.Net;

  class Test {
    public static void Main ()
    {
      string url = "http://example.com";
      for (int i = 0; i < 5; ++i) {
        try
        {
          Console.WriteLine ("Before web request {0}", i.ToString());
          var request = (HttpWebRequest) WebRequest.Create(url);
          HttpWebResponse response = request.GetResponse() as HttpWebResponse;
          Console.WriteLine ("After web request {0}", i.ToString ());
        }
        catch (WebException we) {
          Console.WriteLine ("attempt {0}: WebException: {1}; {2}",
              i.ToString(), we.Message, we.ToString());
        }
        catch (Exception sysExc) {
          Console.WriteLine ("attempt {0}: Exception: {1}; {2}",
              i.ToString(), sysExc.Message, sysExc.ToString());
        }
      }
    }
  }

This is the ~same scenario: we create an HttpWebResponse instance, and then we don't dispose of it. The result when executing on mono -- not Xamarin.Android, but normal desktop mono -- is:

> $ mono concurrent-web-requests.exe
> Before web request 0
> After web request 0
> Before web request 1
> After web request 1
> Before web request 2
> attempt 2: WebException: The request timed out; System.Net.WebException: The request timed out
>   at System.Net.HttpWebRequest.EndGetResponse (IAsyncResult asyncResult)
>   at System.Net.HttpWebRequest.GetResponse ()
>   at Test.Main ()
> Before web request 3
> After web request 3
> Before web request 4
> attempt 4: WebException: The request timed out; System.Net.WebException: The request timed out
>   at System.Net.HttpWebRequest.EndGetResponse (IAsyncResult asyncResult)
>   at System.Net.HttpWebRequest.GetResponse ()
>   at Test.Main ()

.NET is actually not significantly different:

> concurrent-web-requests.exe
> Before web request 0
> After web request 0
> Before web request 1
> After web request 1
> Before web request 2
> attempt 2: WebException: The operation has timed out; System.Net.WebException: The operation has timed out
>    at System.Net.HttpWebRequest.GetResponse()
>    at Test.Main()
> Before web request 3
> attempt 3: WebException: The operation has timed out; System.Net.WebException: The operation has timed out
>    at System.Net.HttpWebRequest.GetResponse()
>    at Test.Main()
> Before web request 4
> attempt 4: WebException: The operation has timed out; System.Net.WebException: The operation has timed out
>    at System.Net.HttpWebRequest.GetResponse()
>    at Test.Main()

In short, the response instance MUST be disposed for proper, sane, operation. Adding a `response.Dispose()` after the `request.GetResponse()` call allows the program to work without any exceptions on both Mono & .NET.
Comment 9 Olivier Ansquer 2013-02-27 11:14:44 UTC
The device is a Samsung Galaxy Note 2.
I have the same behavior with a Samsung Galaxy S2.
Comment 10 Jonathan Pryor 2013-02-27 11:20:15 UTC
@Olivier: Regarding Comment #7, the problem is in many places. :-)

The https://github.com/xamarin/azure-mobile-services repo is a fork of the "upstream" Microsoft-written repo https://github.com/WindowsAzure/azure-mobile-services. The files added to sdk/xamarin/android are copies of the windows files with async/await uses removed, as our current toolchain doesn't support C#5 await.

The problem is that Microsoft's source has the same bug: ServiceFilterRequest.GetResponseAsync() never disposes of the response instance:

https://github.com/WindowsAzure/azure-mobile-services/blob/ad64844f/sdk/windows/src/Microsoft.Azure.Zumo.Windows.WinMD/ServiceFilter/ServiceFilterRequest.cs#L135

The `response` instance is handed to ServiceFilterResponse(), which never disposes `response` itself, though it does dispose `response.GetResponseStream()`:

https://github.com/WindowsAzure/azure-mobile-services/blob/ad64844f/sdk/windows/src/Microsoft.Azure.Zumo.Windows.WinMD/ServiceFilter/ServiceFilterResponse.cs#L27

A (the?) proper fix would be to change ServiceFilterRequest.GetResponseAsync() to dispose of the instance:

    using (response)
        return new ServiceFilterResponse(response, errorStatus);

However, Microsoft doesn't do that. And since Microsoft didn't do that, our forked version doesn't either.

Even if we do fix our version, the Microsoft version will continue to have the bug so when we rebase our code, the bug may be re-introduced.

Furthermore, that may not be the only spot with this bug. The entire library needs to be reviewed to ensure that all WebResponse instances are properly disposed.
Comment 11 Olivier Ansquer 2013-02-27 11:22:39 UTC
I reproduce the timeout with your code but for me, the code in Mobile Service Client look more like this :

using System;
using System.IO;
using System.Net;

class Test
{
    public static void Main()
    {
        string url = "http://example.com";
        for (int i = 0; i < 5; ++i)
        {
            try
            {
                Console.WriteLine("Before web request {0}", i.ToString());
                var request = (HttpWebRequest)WebRequest.Create(url);
                HttpWebResponse response = request.GetResponse() as HttpWebResponse;
                Console.WriteLine("After web request {0}", i.ToString());
                // ADDED
                response.GetResponseStream().Close();
                // ADDED
            }
            catch (WebException we)
            {
                Console.WriteLine("attempt {0}: WebException: {1}; {2}",
                    i.ToString(), we.Message, we.ToString());
            }
            catch (Exception sysExc)
            {
                Console.WriteLine("attempt {0}: Exception: {1}; {2}",
                    i.ToString(), sysExc.Message, sysExc.ToString());
            }
        }
    }
}

With this code, in a .NET console application, no timeout.
Comment 12 Jonathan Pryor 2013-02-27 11:29:34 UTC
I've filed the "WebResponse is not disposed" issue at:

https://github.com/WindowsAzure/azure-mobile-services/issues/20
Comment 13 Jonathan Pryor 2013-02-27 11:41:03 UTC
@Olivier: Regarding Comment #10, very interesting. Adding `response.GetResponseStream().Close();` also allows desktop mono 2.10 to execute without error, as well as within Xamarin.Android 4.6.

Which suggests that the "demo" above is insufficiently complicated, or there's some other spot in azure-mobile-services which is creating a request/response and not getting the response stream either. (I haven't fully investigated the azure source.)

Some minor investigation later shows that `response.Dispose()` is ~identical to `response.GetResponseStream().Close()`, as `response.Dispose()` just closes the same stream that `GetResponseStream()` returns.

https://github.com/mono/mono/blob/mono-2-10/mcs/class/System/System.Net/HttpWebResponse.cs#L267
https://github.com/mono/mono/blob/mono-2-10/mcs/class/System/System.Net/HttpWebResponse.cs#L307

HOWEVER, just because they're currently the same doesn't mean that they always will be, so having `response.Dispose()` is still (arguably) more correct.
Comment 14 Jonathan Pryor 2013-02-27 11:42:40 UTC
@Olivier: Regarding Comment #5 and Comment #9, can you please provide the full stack trace printed in `adb logcat` for the WebException?
Comment 15 Olivier Ansquer 2013-02-27 17:02:50 UTC
I didn't success to get the WebException in adb logcat. I am a beginner with Android development.
But with my last tests, it seems that the error is linked to Tasks or threads.
In the Demo Project, replace takecount with 250 and maxRowCount with 28000.
You should reproduce the timeout.
If you replace :

Task.Factory.StartNew(Load).ContinueWith(OnError, TaskContinuationOptions.OnlyOnFaulted);

by

Load();

The application is not responding during uploading but the uploading seems complete without error every time.
Do you see an error in the code of my demo application ?
Comment 16 Jonathan Pryor 2013-02-27 23:53:11 UTC
I don't know why it fails when you make the changes suggested in Comment #15.

I also don't think those changes are a good idea:

> $ curl \
>     -H X-ZUMO-INSTALLATION-ID:... \
>     -H X-ZUMO-APPLICATION:... \
>     'https://konnect.azure-mobile.net/tables/Personne?$orderby=ModifieLe&$skip=17000&$top=250'
> > o.json

This is one of the URLs your app will be requesting. The above query returns 102215 bytes of data. This will be stored as a UTF-16 string prior to JSON deserialization, so memory use is twice that, so roughly 200KB.

Then we're doing (28000/250)=112 times, so you're downloading 100KB*112=11.2MB. For a mobile app. The internal strings will require twice that (~22MB), never mind all the intermediate byte[] buffers used for the decode process, which are "garbage" anyway because you then deserialize the JSON into an object graph...

This seems like a decidedly Bad Idea for a mobile platform, requiring heaps of memory use and lots of bandwidth to operate.

Does reducing the amount of data you download increase stability? :-)
Comment 18 Olivier Ansquer 2013-02-28 03:41:07 UTC
For Comment #13, I agree. It would be more correct to call Dispose although currently it would make no difference.

For Comment #16.
Obviously these changes are not a good idea but how do you explain the different behavior ? It would be interesting to find out why to help solve my problem, no ?

28000 is just for tests.
The real application is an application for professionals.
All users will use a Galaxy Note 2.
I think the real number of rows will be closer to 2000 than 28000.
Data will be stored locally in a sqlite database.
First synchronization will upload all the rows and can by done with a Wifi connection.
On subsequent synchronizations, only changes are transferred.

The problem is not the volume. If it still fails after a large number of lines, it would be understandable, but it fails very regularly after a few lines. I can not deliver an application with synchronization so unreliable. If I do some tests :
Test 1 : If I set takecount to 1, timeout happens after 2089 rows.
Test 2 : If I set takecount to 10, timeout happens after 11660 rows.
Test 3 : If I set takecount to 50, timeout happens after 850 rows.
Test 4 (same takecount) : If I set takecount to 50, timeout happens after 17550 rows.
Test 5 (same takecount) : If I set takecount to 50, timeout happens after 3350 rows.
Test 6 (sametakecount, Task.Factory.StartNew removed) : Application freezes but 28000 rows are loaded in 6 minutes without timeout.


This seems to fail less often with 50 than 250 lines but loading is slower.

PS : can you change the status, the problem is not solved :-)
Comment 19 Olivier Ansquer 2013-03-13 13:06:07 UTC
Created attachment 3600 [details]
New demo project with Android Device Logging output
Comment 20 Olivier Ansquer 2013-03-13 13:08:02 UTC
The problem occurs on a Galaxy Note 2 and on a Galaxy S 2 but it works fine in the emulator.
Comment 21 Olivier Ansquer 2013-05-24 02:35:13 UTC
The problem seems to be solved in version 4.7.4.
I have done tens of synchronizations without timeout with this version.
Do you know when a stable version will be available ?
Comment 22 Jonathan Pryor 2013-05-24 08:34:51 UTC
Closing as fixed as per Comment #21.