Documents List API client for C# timing out

I'm working on a simple wrapper for the Google Docs API in C#. The problem I'm running into is that my tests are timing out, but only sometimes. When I run all of my tests (only 12 of them), it usually hangs on the 8th one, which tests the delete function. After about 6.5 minutes it continues, but every test after it also times out, taking 6.5 minutes each. If I run the tests individually, they work fine every time.
Here is the first method that times out:
Updated to show exception handling
[TestMethod]
public void CanDeleteFile()
{
    var api = CreateApi();
    api.UploadFile("pic.jpg", "..\\..\\..\\pic.jpg", "image/jpeg");
    try
    {
        var files = api.GetDocuments();
        api.DeleteFile("pic.jpg");
        var lessFiles = api.GetDocuments();
        Assert.AreEqual(files.Count - 1, lessFiles.Count);
    }
    catch (Google.GData.Client.GDataRequestException ex)
    {
        using (StreamWriter writer = new StreamWriter("..\\..\\..\\errors.log", true))
        {
            string time = DateTime.Now.ToString();
            writer.WriteLine(time + ":\r\n\t" + ex.ResponseString);
        }
        throw; // rethrow without resetting the stack trace
    }
}
It times out on var lessFiles = api.GetDocuments(); (the second call to that method). I have other methods that call GetDocuments twice, and they don't time out, but this one does.
The method that all the test methods use that times out:
public AtomEntryCollection GetDocuments(DocumentType type = DocumentType.All, bool includeFolders = false)
{
    checkToken();
    DocumentsListQuery query = getQueryByType(type);
    query.ShowFolders = includeFolders;
    DocumentsFeed feed = service.Query(query);
    return feed.Entries;
}
It times out on this line: DocumentsFeed feed = service.Query(query);. This would be closer to acceptable if I were requesting an insane number of files. I'm not; I'm requesting 5-6, depending on which test I'm running.
Things I've tried:
Deleting all files from my google docs account, leaving only 1-2 files depending on the test for it to retrieve.
Running the tests individually (they all pass and nothing times out, but I shouldn't have to do this)
Checking my network speed to make sure it's not horribly slow (15 Mbps down, 4.5 Mbps up)
I'm out of ideas. Does anyone know why it might start timing out on me? Any suggestions are welcome.
edit
As @gowansg suggested, I implemented exponential backoff in my code. It started failing at the same point with the same exception. I then wrote a test to send 10000 requests for a complete list of all documents in my drive. It passed without any issues, even without exponential backoff. Next I modified my test class so it would keep track of how many requests were sent. My tests crash on request 11.
The complete exception:
Google.GData.Client.GDataRequestException was unhandled by user code
Message=Execution of request failed: https://docs.google.com/feeds/default/private/full
Source=GoogleDrive
StackTrace:
at GoogleDrive.GoogleDriveApi.GetDocuments(DocumentType type, Boolean includeFolders) in C:\Users\nlong\Desktop\projects\GoogleDrive\GoogleDrive\GoogleDriveApi.cs:line 105
at GoogleDrive.Test.GoogleDriveTests.CanDeleteFile() in C:\Users\nlong\Desktop\projects\GoogleDrive\GoogleDrive.Test\GoogleDriveTests.cs:line 77
InnerException: System.Net.WebException
Message=The operation has timed out
Source=System
StackTrace:
at System.Net.HttpWebRequest.GetResponse()
at Google.GData.Client.GDataRequest.Execute()
InnerException:
another edit
It seems that I only crash when requesting the list of documents after the second upload. I'm not sure why that is, but I'm definitely going to look into my upload method.

If your tests pass when run individually but not when run consecutively, then you may be hitting a request rate limit. I noticed in your comment on JotaBe's answer that you mentioned you were getting request timeout exceptions. You should take a look at the HTTP status code to figure out what to do. In the case of a 503, you should implement exponential backoff.
Updated Suggestion
Place a try-catch around the line that is throwing the exception and catch the Google.GData.Client.GDataRequestException. According to the source there are two properties that may be of value to you:
/// <summary>
/// this is the error message returned by the server
/// </summary>
public string ResponseString
{ ... }
and
//////////////////////////////////////////////////////////////////////
/// <summary>Read only accessor for response</summary>
//////////////////////////////////////////////////////////////////////
public WebResponse Response
{ ... }
Hopefully these contain some useful information for you, e.g., an HTTP Status Code.
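Concretely, the catch could look something like this. A minimal sketch, not the poster's code: the retry loop, the five-attempt budget, and the doubling delays are my own assumptions; only Response and ResponseString come from the GData client.

AtomEntryCollection GetDocumentsWithBackoff(GoogleDriveApi api)
{
    for (int attempt = 0; attempt < 5; attempt++)
    {
        try
        {
            return api.GetDocuments();
        }
        catch (Google.GData.Client.GDataRequestException ex)
        {
            var http = ex.Response as System.Net.HttpWebResponse;
            if (http != null && http.StatusCode == System.Net.HttpStatusCode.ServiceUnavailable)
            {
                // 503: back off 1s, 2s, 4s, 8s before retrying
                System.Threading.Thread.Sleep(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
                continue;
            }
            Console.WriteLine(ex.ResponseString); // the server's own error message
            throw;
        }
    }
    throw new TimeoutException("Gave up after repeated 503 responses.");
}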

You can explicitly set the time-outs for your unit tests. There is extensive information on this here:
How to: Set Time Limits for Running Tests
You can include Thread.Sleep(milliseconds) in your unit tests before the offending methods. Your requests are probably being rejected by Google Docs for being too many in too short a time.
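For example, a sketch combining both suggestions (the 30-second test budget and the 2-second pause are arbitrary values of mine, not from the posts above):

[TestMethod]
[Timeout(30000)] // fail the test after 30 s instead of hanging for 6.5 minutes
public void CanDeleteFile()
{
    System.Threading.Thread.Sleep(2000); // breathing room so requests aren't back to back
    // ... rest of the test as before ...
}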

Related

C# batch processing of async web responses hangs just before finishing

Here is the scenario.
I want to call 2 versions of an API (hosted on different servers), then cast their responses (they come as JSON) to C# objects and compare them.
An important note here is that I need to query the APIs a lot of times, ~3000. The reason for this is that I query an endpoint that takes an id and returns a specific object from the DB. So my queries look like http://myapi/v1/endpoint/id, and I basically use a loop to go through all of the ids.
Here is the issue
I start querying the API, and for the first 90% of all requests it is blazing fast (I get the response and I process it), and all that happens in under 5 seconds.
Then, however, I start to slow down. The next 50-100 requests can take between 1 and 5 seconds to process, and after that I come to a stop. No CPU usage, network activity is low (and I am pretty sure that activity is from other apps). My app just hangs.
UPDATE: Around 50% of the times I tested this, it does finally resume after quite a bit of time. But the other 50% it still just hangs
Here is what I am doing in-code
I have a list of Ids that I iterate to query the endpoint.
This is the main piece of code that queries the APIs and processes the responses.
var endPointIds = await GetIds(); // this queries a different endpoint to get all ids; there are no issues with it
var tasks = endPointIds.Select(async id =>
{
    var response1 = await _data.GetData($"{Consts.ApiEndpoint1}/{id}");
    var response2 = await _data.GetData($"{Consts.ApiEndpoint2}/{id}");
    return ProcessResponces(response1, response2);
});
var res = await Task.WhenAll(tasks);
var result = res.Where(r => r != null).ToList();
return result; // I never get to return the result; the app hangs before this is reached
This is the GetData() method
private async Task<string> GetAsync(string serviceUri)
{
    try
    {
        var request = WebRequest.CreateHttp(serviceUri);
        request.ContentType = "application/json";
        request.Method = WebRequestMethods.Http.Get;
        using (var response = await request.GetResponseAsync())
        using (var responseStream = response.GetResponseStream())
        using (var streamReader = new StreamReader(responseStream, Encoding.UTF8))
        {
            return await streamReader.ReadToEndAsync();
        }
    }
    catch
    {
        return string.Empty;
    }
}
I would link the ProcessResponces method as well; however, I tried mocking it to return a string like so:
private string ProcessResponces(string responseJson1, string responseJson2)
{
    // usually I would have 2 lines here that deserialize responseJson1 and responseJson2
    // using Newtonsoft.Json's DeserializeObject<>
    return "Fake success";
}
And even with this implementation my issue did not go away (the only difference it made is that I got fast responses for about 97% of my requests, but my code still ended up stopping at the last few), so I am guessing the main issue is not related to that method. But what it more or less does is deserialize both responses to C# objects, compare them, and return information about their equality.
Here are my observations after 4 hours of debugging
If I manually reduce the number of queries to my API (I used the .Take() method on the list of ids) the issue still persists. For example, on 1000 total requests I start hanging around the 900th, for 1500 on the 1400th, and so on. I believe the issue goes away at around 100-200 requests, but I am not sure, since it might just be too fast for me to notice.
Since this is currently a console app, I tried adding WriteLine() calls in some of my methods, and the issue seemed to go away (I am guessing the delay that writing to the console creates gives some time between requests, and that helps).
Lastly I did a concurrency profiling of my app, and it reported that there were a lot of contentions happening at the point where my app hangs. Opening the contention tab showed that they are mainly happening in System.IO.StreamReader.ReadToEndAsync().
Thoughts and Questions
Obviously, what can I do to resolve the issue?
Is my GetAsync() method wrong, should I be using something else instead of responseStream and streamReader?
I am not super proficient in asynchronous operations, maybe my flow of async/await operations is wrong.
Lastly, could it be something with the API controllers themselves? They are standard ASP.NET MVC 5 WebAPI controllers (version 5.2.3.0)
After long hours of tracking my requests with Fiddler and finally mocking my DataProvider (_data) to retrieve locally, from disk, it turned out that I had responses that were taking 30s+ to arrive (or even not arriving at all).
Since my .Select() is async, it always displayed info for the quick responses first, and then came to a halt as it waited for the slow ones. This gave the illusion that I was somehow loading the first X requests quickly and then stopping, when in reality I was simply shown the fastest X requests and then left waiting on the slow ones.
And to kind of answer my questions...
What can I do to resolve the issue: set a timeout that allows a maximum number of milliseconds/seconds for a request to finish (see the sketch below).
The GetAsync() method is alright.
Async/await operations are also correct; just keep in mind that doing an async select will return results ordered by the time it took them to finish.
The ASP.NET Framework controllers are perfectly fine and do not contribute to the issue.
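As a sketch of that timeout fix (my own adaptation, not the code actually used): HttpWebRequest.Timeout is ignored by GetResponseAsync, so one option is to race the request against Task.Delay and abort the request if the delay wins. The 30-second budget is an assumption.

private static async Task<string> GetAsync(string serviceUri)
{
    var request = WebRequest.CreateHttp(serviceUri);
    request.ContentType = "application/json";
    request.Method = WebRequestMethods.Http.Get;

    var responseTask = request.GetResponseAsync();
    if (await Task.WhenAny(responseTask, Task.Delay(TimeSpan.FromSeconds(30))) != responseTask)
    {
        request.Abort();     // faults the pending task; its exception is left unobserved
        return string.Empty; // treat the slow/dead endpoint as an empty response
    }

    using (var response = await responseTask)
    using (var reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
    {
        return await reader.ReadToEndAsync();
    }
}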

ArgumentException in asynchronous webservice calls

I'm running into a problem I haven't experienced before. I'm calling a method asynchronously, via the BeginXXX/EndXXX pattern, with an extra wrapper for some timing functionality, like so:
BeginGetStuffTimed(arguments)
{
    var timingstuff = TimingStuff;
    service.BeginGetStuff(CallbackMethodTimed, timingstuff);
}

CallbackMethodTimed(IAsyncResult result)
{
    SaveTiming(result.AsyncState...);
    service.EndGetStuff(result);
}
Now every once in a while, an exception gets thrown:
Async End called on wrong channel.
Parameter name: result
Since it doesn't always occur, this kind of puzzles me. I was thinking IIS couldn't keep up with the requests and goes fubar, so I added a pause in my calling code, and this does seem to work: the longer the pause, the less frequent the exception.
Of course this is no real solution, so I'm looking for some insights into this matter.
Edit: upon further inspection this seems to be unrelated to IIS; IIS can process up to 5000 simultaneous threads (per CPU), and I'm pushing nowhere near that limit.
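I can't verify this without seeing more of the code, but with WCF-style proxies this exception usually means the IAsyncResult handed to EndGetStuff came from a Begin call on a different channel instance. Here is a sketch of one way to guarantee the pairing; TimingState and the IService proxy type are hypothetical placeholders, not from the original code:

class TimingState
{
    public IService Service;  // the proxy that issued Begin
    public DateTime Started;
}

void BeginGetStuffTimed()
{
    var state = new TimingState { Service = service, Started = DateTime.UtcNow };
    service.BeginGetStuff(CallbackMethodTimed, state);
}

void CallbackMethodTimed(IAsyncResult result)
{
    var state = (TimingState)result.AsyncState;
    SaveTiming(DateTime.UtcNow - state.Started);
    state.Service.EndGetStuff(result); // End runs on the same channel as Begin
}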

Web response in C# .NET doesn't work more than a couple of times

I am developing an application using the Twitter API, and it involves writing a method to check if a user exists. Here is my code:
public static bool checkUserExists(string user)
{
    //string URL = "https://twitter.com/" + user.Trim();
    //string URL = "http://api.twitter.com/1/users/show.xml?screen_name=" + user.Trim();
    //string URL = "http://google.com/#hl=en&sclient=psy-ab&q=" + user.Trim();
    string URL = "http://api.twitter.com/1/users/show.json?screen_name=" + user.Trim();
    HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(URL);
    try
    {
        var webResponse = (HttpWebResponse)webRequest.GetResponse();
        return true;
    }
    //this part onwards does not matter
    catch (WebException ex)
    {
        if (ex.Status == WebExceptionStatus.ProtocolError && ex.Response != null)
        {
            var resp = (HttpWebResponse)ex.Response;
            if (resp.StatusCode == HttpStatusCode.NotFound)
            {
                return false;
            }
            else
            {
                throw new Exception("Unknown level 1 Exception", ex);
            }
        }
        else
        {
            throw new Exception("Unknown level 2 Exception", ex);
        }
    }
}
The problem is, calling the method does not work (it doesn't get a response) more than 2 or 3 times, using any of the URLs that have been commented out, including the Google search query (I thought it might be due to the Twitter API limit). On debug, it shows that it's stuck at:
var webResponse = (HttpWebResponse)webRequest.GetResponse();
Here's how I am calling it:
Console.WriteLine(TwitterFollowers.checkUserExists("handle1"));
Console.WriteLine(TwitterFollowers.checkUserExists("handle2"));
Console.WriteLine(TwitterFollowers.checkUserExists("handle3"));
Console.WriteLine(TwitterFollowers.checkUserExists("handle4"));
Console.WriteLine(TwitterFollowers.checkUserExists("handle5"));
Console.WriteLine(TwitterFollowers.checkUserExists("handle6"));
At most I get 2-3 lines of output. Could someone please point out what's wrong?
Update 1:
I sent 1 request every 15 seconds (well within the limit) and it still caused an error. On the other hand, sending a request, closing the app, and running it again works very well (on average that amounts to 1 request every 5 seconds). The rate limit is 150 calls per hour (Twitter FAQ).
Also, I did wait for a while, and got this exception at level 2:
http://pastie.org/3897499
Update 2:
It might sound surprising, but if I run Fiddler, it works perfectly, regardless of whether I target this process or not!
The effect you're seeing is almost certainly due to rate-limit type policies on the Twitter API (multiple requests in quick succession). They keep a tight watch on how you're using their API: the first step is to check their terms of use and policies on rate limiting, and make sure you're in compliance.
Two things jump out at me:
You're hitting the API with multiple requests in rapid succession. Most REST APIs, including Google search, are not going to allow you to do that. These APIs are very visible targets, and it makes sense that they'd be pro-active about preventing denial-of-service attacks.
You don't have a User Agent specified in your request. Most APIs require you to send them a meaningful UA, as a way of helping them identify you.
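For the second point, a user agent is a one-line fix before the request is sent (the UA string below is just an example):

HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(URL);
webRequest.UserAgent = "MyTwitterChecker/1.0 (you@example.com)"; // identify your client to the API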
Note that you're dealing with unmanaged resources underneath your HttpWebResponse, so calling Dispose() in a timely fashion, or wrapping the object in a using statement, is not only wise but important to avoid blocking.
Also, var is great for dealing with anonymous types, LINQ query results, and such, but it should not become a crutch. Why use var when you're well aware of the type? (i.e., you're already performing a cast to HttpWebResponse.)
Finally, services like this often limit the rate of connections per second and/or the number of simultaneous connections allowed to prevent abuse. By not disposing of your HttpWebResponse objects, you may be violating the permitted number of simultaneous connections. By querying too often you'd break the rate limit.
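A minimal sketch of that disposal applied to the try/catch from the question (wrapping both the success response and the error response, so each connection is returned to the pool immediately):

try
{
    using (var webResponse = (HttpWebResponse)webRequest.GetResponse())
    {
        return true; // the connection goes back to the pool here
    }
}
catch (WebException ex)
{
    using (var resp = ex.Response as HttpWebResponse)
    {
        if (resp != null && resp.StatusCode == HttpStatusCode.NotFound)
            return false;
        throw;
    }
}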

Adjusting HttpWebRequest Connection Timeout in C#

I believe after lengthy research and searching, I have discovered that what I want to do is probably better served by setting up an asynchronous connection and terminating it after the desired timeout... But I will go ahead and ask anyway!
Quick snippet of code:
HttpWebRequest webReq = (HttpWebRequest)HttpWebRequest.Create(url);
webReq.Timeout = 5000;
HttpWebResponse response = (HttpWebResponse)webReq.GetResponse();
// this takes ~20+ sec on servers that aren't on the proper port, etc.
I have an HttpWebRequest method that is in a multi-threaded application, in which I am connecting to a large number of company web servers. In cases where the server is not responding, the HttpWebRequest.GetResponse() is taking about 20 seconds to time out, even though I have specified a timeout of only 5 seconds. In the interest of getting through the servers on a regular interval, I want to skip those taking longer than 5 seconds to connect to.
So the question is: "Is there a simple way to specify/decrease a connection timeout for a WebRequest or HttpWebRequest?"
I believe that the problem is that the WebRequest measures the time only after the request is actually made. If you submit multiple requests to the same address, then the ServicePointManager will throttle your requests and only actually submit as many concurrent connections as the value of the corresponding ServicePoint.ConnectionLimit, which by default gets the value from ServicePointManager.DefaultConnectionLimit. The application CLR host sets this to 2, the ASP.NET host to 10. So if you have a multithreaded application that submits multiple requests to the same host, only two are actually placed on the wire; the rest are queued up.
I have not researched this to conclusive evidence that this is what really happens, but on a similar project I had, things were horrible until I removed the ServicePoint limitation.
Another factor to consider is the DNS lookup time. Again, this is my belief, not backed by hard evidence, but I think the WebRequest does not count the DNS lookup time against the request timeout. DNS lookup time can show up as a very big factor on some deployments.
And yes, you must code your app around WebRequest.BeginGetRequestStream (for POSTs with content) and WebRequest.BeginGetResponse (for GETs and POSTs). Synchronous calls will not scale (I won't go into details why, but that I do have hard evidence for). Anyway, the ServicePoint issue is orthogonal to this: the queueing behavior happens with async calls too.
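If you want to test whether the ServicePoint limit is what's throttling you, raising it is a one-liner (a sketch; 100 is an arbitrary value to tune to your concurrency):

// Raise the global per-host connection limit before issuing any requests.
ServicePointManager.DefaultConnectionLimit = 100;

// Or target a single host:
var sp = ServicePointManager.FindServicePoint(new Uri("http://example.com"));
sp.ConnectionLimit = 100;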
Sorry for tacking on to an old thread, but I think something that was said above may be incorrect/misleading.
From what I can tell, .Timeout is NOT the connection time; it is the TOTAL time allowed for the entire life of the HttpWebRequest and response. Proof:
I set:
.Timeout = 5000
.ReadWriteTimeout = 32000
The connect and post time for the HttpWebRequest took 26 ms,
but the subsequent call to HttpWebRequest.GetResponse() timed out in 4974 ms, thus proving that the 5000 ms was the time limit for the whole send-request/get-response set of calls.
I didn't verify whether DNS name resolution was measured as part of the time, as it's irrelevant to me; none of this works the way I really need it to. My intention was to time out more quickly when connecting to systems that weren't accepting connections, as shown by them failing during the connect phase of the request.
For example: I'm willing to wait 30 seconds on a connection request that has a chance of returning a result, but I only want to burn 10 seconds waiting to send a request to a host that is misbehaving.
Something I found later which helped is the .ReadWriteTimeout property. This, in addition to the .Timeout property, seemed to finally cut down on the time threads would spend trying to download from a problematic server. The default for .ReadWriteTimeout is 5 minutes, which for my application was far too long.
So, it seems to me:
.Timeout = time spent trying to establish a connection (not including lookup time)
.ReadWriteTimeout = time spent trying to read or write data after connection established
More info: HttpWebRequest.ReadWriteTimeout Property
Edit:
Per @KyleM's comment, the Timeout property is for the entire connection attempt, and reading up on it at MSDN shows:
Timeout is the number of milliseconds that a subsequent synchronous request made with the GetResponse method waits for a response, and the GetRequestStream method waits for a stream. The Timeout applies to the entire request and response, not individually to the GetRequestStream and GetResponse method calls. If the resource is not returned within the time-out period, the request throws a WebException with the Status property set to WebExceptionStatus.Timeout.
(Emphasis mine.)
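Putting the two knobs together in code (the values are examples only):

var request = (HttpWebRequest)WebRequest.Create(url);
request.Timeout = 10000;          // bounds the whole synchronous GetResponse()/GetRequestStream() exchange
request.ReadWriteTimeout = 30000; // bounds each read/write on the streams once connected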
From the documentation of the HttpWebRequest.Timeout property:
A Domain Name System (DNS) query may take up to 15 seconds to return or time out. If your request contains a host name that requires resolution and you set Timeout to a value less than 15 seconds, it may take 15 seconds or more before a WebException is thrown to indicate a timeout on your request.
Is it possible that your DNS query is the cause of the timeout?
No matter what we tried, we couldn't manage to get the timeout below 21 seconds when the server we were checking was down.
To work around this, we combined a TcpClient check to see if the domain was alive, followed by a separate check to see if the URL was active:
public static bool IsUrlAlive(string aUrl, int aTimeoutSeconds)
{
    try
    {
        //check the domain first
        if (IsDomainAlive(new Uri(aUrl).Host, aTimeoutSeconds))
        {
            //only now check the url itself
            var request = System.Net.WebRequest.Create(aUrl);
            request.Method = "HEAD";
            request.Timeout = aTimeoutSeconds * 1000;
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return response.StatusCode == HttpStatusCode.OK;
            }
        }
    }
    catch
    {
    }
    return false;
}

private static bool IsDomainAlive(string aDomain, int aTimeoutSeconds)
{
    try
    {
        using (TcpClient client = new TcpClient())
        {
            var result = client.BeginConnect(aDomain, 80, null, null);
            var success = result.AsyncWaitHandle.WaitOne(TimeSpan.FromSeconds(aTimeoutSeconds));
            if (!success)
            {
                return false;
            }
            // we have connected
            client.EndConnect(result);
            return true;
        }
    }
    catch
    {
    }
    return false;
}
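Usage is then a cheap probe before committing to a full request:

bool alive = IsUrlAlive("http://www.example.com", 5); // 5 s for the TCP connect, then 5 s for the HEAD request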

Timeout for Web Request

What is a reasonable amount of time to wait for a web request to return? I know this is maybe a little loaded as a question, but all I am trying to do is verify if a web page is available.
Maybe there is a better way?
try
{
    // Create the web request
    HttpWebRequest request = WebRequest.Create(this.getUri()) as HttpWebRequest;
    if (request != null)   // check for null before touching the request
    {
        request.Credentials = System.Net.CredentialCache.DefaultCredentials;
        // 2 minutes for timeout
        request.Timeout = 120 * 1000;
        // Get response
        response = request.GetResponse() as HttpWebResponse;
        connectedToUrl = processResponseCode(response);
    }
    else
    {
        logger.Fatal(getFatalMessage());
        string error = string.Empty;
    }
}
catch (WebException we)
{
    ...
}
catch (Exception e)
{
    ...
}
You need to consider how long the consumer of the web service is going to take. For example, if you are connecting to a DB web server and run a lengthy query, you need to make the web service timeout longer than the time the query will take; otherwise, the web service will (erroneously) time out.
I also use something like (consumer time) + 10 seconds.
Offhand I'd allow 10 seconds, but it really depends on what kind of network connection the code will be running with. Try running some test pings over a period of a few days/weeks to see what the typical response time is.
I would measure how long it takes for pages that do exist to respond. If they all respond in about the same amount of time, then I would set the timeout period to approximately double that amount.
Just wanted to add that a lot of the time I'll use an adaptive timeout. Could be a simple metric like:
period += (numTimeouts / numRequests > 0.01 ? someConstant : 0);
checked whenever you hit a timeout to try and keep timeouts under 1% (for example). Just be careful about decrementing it too low :)
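Fleshed out slightly, the bookkeeping might look like this (a sketch; the constants, the floor, and the step sizes are illustrative, and the double cast avoids integer division):

int timeoutMs = 10000;
int numRequests = 0, numTimeouts = 0;

void RecordResult(bool timedOut)
{
    numRequests++;
    if (timedOut) numTimeouts++;

    // Grow the timeout while more than 1% of requests time out;
    // otherwise drift back down, but never below a floor.
    if ((double)numTimeouts / numRequests > 0.01)
        timeoutMs += 1000;
    else if (timeoutMs > 5000)
        timeoutMs -= 250;
}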
The reasonable amount of time to wait for a web request may differ from one server to the next. If a server is at the far end of a high-delay link then clearly it will take longer to respond than when it is in the next room. But two minutes seems like it's more than ample time for a server to respond. The default timeout value for the PING command is expressed in seconds, not minutes. I suggest you look into the timeout values that are used by networking utilities like PING or TRACERT for inspiration.
I guess this depends on two things:
network speed/load (as others wrote, using ping might give you an idea about this)
the kind of page you are calling: e.g. is it a static HTML page or is it a page which might do some time-consuming operations (DB access, etc.)
Anyway, I think 2 minutes is a lot of time. I would definitely reduce the timeout to less than 30 seconds.
I realize this doesn't directly answer your question, but then an "answer" to this question is a little tough. Anyway, a tool I've used in the past to measure page load times from various parts of the world is Gomez. It's free, and if you haven't done this kind of testing before, it might be helpful in terms of giving you a firm idea of what typical page load times are for a given page from a given location.
I would only wait 30 seconds max, probably closer to 15. It really depends on what you are doing and what the result of an unsuccessful connection is. As I am sure you know, there are lots of reasons why you could get a timeout...
