After lengthy research and searching, I believe that what I want to do is probably better served by setting up an asynchronous connection and terminating it after the desired timeout... but I will go ahead and ask anyway!
Quick snippet of code:
HttpWebRequest webReq = (HttpWebRequest)HttpWebRequest.Create(url);
webReq.Timeout = 5000;
HttpWebResponse response = (HttpWebResponse)webReq.GetResponse();
// this takes ~20+ sec on servers that aren't on the proper port, etc.
I have a method that uses HttpWebRequest inside a multi-threaded application, in which I am connecting to a large number of company web servers. In cases where the server is not responding, HttpWebRequest.GetResponse() takes about 20 seconds to time out, even though I have specified a timeout of only 5 seconds. In the interest of getting through the servers on a regular interval, I want to skip those that take longer than 5 seconds to connect to.
So the question is: "Is there a simple way to specify/decrease a connection timeout for a WebRequest or HttpWebRequest?"
I believe that the problem is that the WebRequest measures the time only after the request is actually made. If you submit multiple requests to the same address, the ServicePointManager will throttle your requests and only actually submit as many concurrent connections as the value of the corresponding ServicePoint.ConnectionLimit, which by default gets its value from ServicePointManager.DefaultConnectionLimit. The application CLR host sets this to 2, the ASP.NET host to 10. So if you have a multi-threaded application that submits multiple requests to the same host, only two are actually placed on the wire; the rest are queued up.
I have not researched this to the point of conclusive evidence that this is what really happens, but on a similar project I worked on, things were horrible until I removed the ServicePoint limitation.
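If that is what is happening in your case, a minimal sketch of lifting the limit (the value 100 and the example URL are just placeholders) would be something like this, run early in application startup:
// Raise the global default before any requests to the host are created.
System.Net.ServicePointManager.DefaultConnectionLimit = 100;

// Or raise it for a single host only, via its ServicePoint:
var sp = System.Net.ServicePointManager.FindServicePoint(new Uri("http://example.com/"));
sp.ConnectionLimit = 100;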
Another factor to consider is the DNS lookup time. Again, this is my belief and not backed by hard evidence, but I think the WebRequest does not count the DNS lookup time against the request timeout. DNS lookup time can be a very big factor in some deployments.
And yes, you must code your app around WebRequest.BeginGetRequestStream (for POSTs with content) and WebRequest.BeginGetResponse (for GETs and POSTs). Synchronous calls will not scale (I won't go into the details of why, but that I do have hard evidence for). Anyway, the ServicePoint issue is orthogonal to this: the queueing behavior happens with async calls too.
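For reference, here is a rough sketch of that async pattern with a hard 5-second cut-off (the URL is a placeholder and this is the general shape, not production code). Note that the Timeout property has no effect on asynchronous requests, so the abort has to be scheduled yourself:
var request = (HttpWebRequest)WebRequest.Create("http://example.com/");

IAsyncResult asyncResult = request.BeginGetResponse(ar =>
{
    var req = (HttpWebRequest)ar.AsyncState;
    try
    {
        using (var response = (HttpWebResponse)req.EndGetResponse(ar))
        {
            // consume the response here
        }
    }
    catch (WebException)
    {
        // request failed or was aborted by the timeout below
    }
}, request);

// Abort the request if nothing has come back within 5 seconds.
ThreadPool.RegisterWaitForSingleObject(
    asyncResult.AsyncWaitHandle,
    (state, timedOut) => { if (timedOut) ((HttpWebRequest)state).Abort(); },
    request,
    TimeSpan.FromSeconds(5),
    true /* executeOnlyOnce */);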
Sorry for tacking on to an old thread, but I think something that was said above may be incorrect/misleading.
From what I can tell, .Timeout is NOT the connection time; it is the TOTAL time allowed for the entire life of the HttpWebRequest and response. Proof:
I set:
.Timeout=5000
.ReadWriteTimeout=32000
The connect and post for the HttpWebRequest took 26 ms, but the subsequent call to HttpWebRequest.GetResponse() timed out after 4974 ms, proving that the 5000 ms was the time limit for the whole send-request/get-response set of calls.
I didn't verify whether the DNS name resolution was measured as part of that time, because it is irrelevant to me: none of this works the way I really need it to. My intention was to time out more quickly when connecting to systems that weren't accepting connections, as shown by them failing during the connect phase of the request.
For example: I'm willing to wait 30 seconds on a connection request that has a chance of returning a result, but I only want to burn 10 seconds waiting to send a request to a host that is misbehaving.
Something I found later that helped is the .ReadWriteTimeout property. This, in addition to the .Timeout property, seemed to finally cut down the time threads would spend trying to download from a problematic server. The default for .ReadWriteTimeout is 5 minutes, which for my application was far too long.
So, it seems to me:
.Timeout = time spent trying to establish a connection (not including lookup time)
.ReadWriteTimeout = time spent trying to read or write data after connection established
More info: HttpWebRequest.ReadWriteTimeout Property
Edit:
Per @KyleM's comment, the Timeout property applies to the entire connection attempt, and reading up on it on MSDN shows:
Timeout is the number of milliseconds that a subsequent synchronous request made with the GetResponse method waits for a response, and the GetRequestStream method waits for a stream. The Timeout applies to the entire request and response, not individually to the GetRequestStream and GetResponse method calls. If the resource is not returned within the time-out period, the request throws a WebException with the Status property set to WebExceptionStatus.Timeout.
(Emphasis mine.)
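Putting the two properties together, a minimal sketch (the millisecond values are just the ones from my test, and url stands for the target address):
var request = (HttpWebRequest)WebRequest.Create(url);
request.Timeout = 5000;            // covers GetRequestStream + GetResponse, in milliseconds
request.ReadWriteTimeout = 32000;  // covers reads/writes on the stream after the response starts

using (var response = (HttpWebResponse)request.GetResponse())
using (var stream = response.GetResponseStream())
using (var reader = new StreamReader(stream))
{
    string body = reader.ReadToEnd();  // these reads are governed by ReadWriteTimeout
}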
From the documentation of the HttpWebRequest.Timeout property:
A Domain Name System (DNS) query may take up to 15 seconds to return or time out. If your request contains a host name that requires resolution and you set Timeout to a value less than 15 seconds, it may take 15 seconds or more before a WebException is thrown to indicate a timeout on your request.
Is it possible that your DNS query is the cause of the timeout?
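If you want to rule that out, one quick (and admittedly rough) check is to time the lookup on its own ("example.com" is a placeholder host):
var sw = System.Diagnostics.Stopwatch.StartNew();
try
{
    var addresses = System.Net.Dns.GetHostAddresses("example.com");
    Console.WriteLine("Resolved {0} address(es) in {1} ms", addresses.Length, sw.ElapsedMilliseconds);
}
catch (System.Net.Sockets.SocketException)
{
    Console.WriteLine("Lookup failed after {0} ms", sw.ElapsedMilliseconds);
}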
No matter what we tried we couldn't manage to get the timeout below 21 seconds when the server we were checking was down.
To work around this, we combined a TcpClient check to see whether the domain was alive with a separate check to see whether the URL itself was active:
public static bool IsUrlAlive(string aUrl, int aTimeoutSeconds)
{
    try
    {
        //check the domain first
        if (IsDomainAlive(new Uri(aUrl).Host, aTimeoutSeconds))
        {
            //only now check the url itself
            var request = System.Net.WebRequest.Create(aUrl);
            request.Method = "HEAD";
            request.Timeout = aTimeoutSeconds * 1000;
            // dispose the response so the underlying connection is released promptly
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return response.StatusCode == HttpStatusCode.OK;
            }
        }
    }
    catch
    {
        // any failure means the URL is not considered alive
    }
    return false;
}
private static bool IsDomainAlive(string aDomain, int aTimeoutSeconds)
{
    try
    {
        using (TcpClient client = new TcpClient())
        {
            var result = client.BeginConnect(aDomain, 80, null, null);
            var success = result.AsyncWaitHandle.WaitOne(TimeSpan.FromSeconds(aTimeoutSeconds));
            if (!success)
            {
                return false;
            }
            // we have connected
            client.EndConnect(result);
            return true;
        }
    }
    catch
    {
    }
    return false;
}
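Usage is then just one call (the URL and the 5-second budget are only examples):
bool alive = IsUrlAlive("http://www.example.com/", 5);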
Related
Basically, I am running a program that sends up to 7,000 HTTP requests per second on average, 24/7, in order to detect the latest changes on a website as quickly as possible.
However, every 2.5 to 3 minutes on average, my program slows down for around 10-15 seconds and drops from ~7K rq/s to less than 1,000.
Here are logs from my program, where you can see the amount of requests it sends every second:
https://pastebin.com/029VLxZG
When scrolling down through the logs, you can see it goes slower every ~3 minutes. Example: https://i.imgur.com/US0wPzm.jpeg
At first I thought it was my server's Ethernet connection going into a temporary "restricted" mode, and I even tried contacting my host about it. But then I ran 2 instances of my program simultaneously just to see what would happen, and I noticed that even though the issue (downtime) was happening on both, it wasn't always happening at the same time (it depended on when each instance was started, if you get what I mean). That meant the problem wasn't coming from the internet connection, but from my program itself.
I investigated a little more and found that as soon as my program drops from ~7K rq/s to ~700, a lot of RAM is freed up on my server.
I have taken 2 screenshots of the consecutive seconds just before and just after the slowdown occurs (including RAM metrics) so you can compare; you can view them here: https://imgur.com/a/sk2TYQZ (please note that I was using fewer threads here, which is why the average "normal" speed is ~2K rq/s instead of the ~7K mentioned before).
If you'd like to see more of it, here is the full record of the issue, in a video which lasts about 40 seconds: https://i.imgur.com/z27FlVP.mp4 - As you can see, after the RAM is freed up, its usage slowly goes up again, before the same process repeats every ~3 minutes.
For more context, here is the method I am using to send the HTTP requests (it is being called from a lot of threads concurrently, as my app is multi-threaded in order to be super fast):
public static async Task<bool> HasChangedAsync(string endpoint, HttpClient httpClient)
{
    const string baseAddress = "https://example.com/";
    string response = await httpClient.GetStringAsync(baseAddress + endpoint);
    return response.Contains("example");
}
One thing I did was replace the whole method body with await Task.Delay(25); return false;, and that fixed the issue: RAM usage barely increased.
This led me to believe the issue is HttpClient / my HTTP requests, but even though I tried replacing the GetStringAsync call with GetAsync, using both an HttpRequestMessage and an HttpResponseMessage (and disposing them with using), the behavior ended up being exactly the same.
So here I am, desperate for a fix, and without enough knowledge about memory, the garbage collector, etc. (if that's even relevant here) to fix this myself.
Please, Stack Overflow, do you have any idea?
Thanks a lot.
Your best bet would be to stream the response and then search chunks of it for what you are looking for. An example implementation could be something like the following:
using var response = await Client.GetAsync(BaseUrl, HttpCompletionOption.ResponseHeadersRead);
await using var stream = await response.Content.ReadAsStreamAsync();
using var reader = new StreamReader(stream);

string line = null;
while ((line = await reader.ReadLineAsync()) != null)
{
    if (line.Contains("example"))
    {
        // do whatever
    }
}
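The key piece is HttpCompletionOption.ResponseHeadersRead: it makes GetAsync return as soon as the headers have been read, so the body is consumed from the stream instead of being buffered entirely into memory first. Buffering whole response bodies is a plausible contributor to the RAM growth pattern described in the question.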
I am trying to achieve a high number of web requests per second.
In C#, I used multiple threads to send web requests and found that, no matter how many threads I created, the maximum is around 70 requests per second when the server responds quickly.
I tried to simulate a slow (timing-out) response using Fiddler in order to keep many concurrent web requests outstanding and get a better understanding.
With any number of threads, 2x requests fire instantly; afterwards, the queued requests fire one by one, very slowly, even though the earlier requests are still waiting for their responses. Once requests finish, the queued ones fire faster to replenish the number in flight. It is as if it takes time to initialize each connection once the pre-initialized amount is reached. Moreover, the responses are small enough that bandwidth is not an issue.
Below is the code.
I tried this on Windows XP and Windows 7, on different networks; the same thing happens.
public Form1()
{
    System.Net.ServicePointManager.DefaultConnectionLimit = 1000;
    for (int i = 0; i < 80; i++)
    {
        int copy = i;
        new Thread(() =>
        {
            submit_test(copy);
        }) { IsBackground = true }.Start();
    }
}
public void submit_test(int pos)
{
    var webRequest = (HttpWebRequest)WebRequest.Create("http://www.test.com/");
    webRequest.Method = "GET";
    using (HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse())
    {
    }
}
Is the network card limiting the number of requests that can be fired instantly?
I know a large server can handle thousands of incoming requests concurrently. Isn't sending out requests (establishing connections) essentially the same?
Please tell me whether using a server would help solve the problem.
Update clues:
1) I suspected the router was limiting things and unplugged it. No difference.
2) Fiddler shows that one queued request fires exactly every second.
3) I used the Apache benchmarking tool to send concurrent timing-out requests, and the same thing happened, so it is not likely to be a .NET problem.
4) I tried connecting to localhost instead. No difference.
5) I used BeginGetResponse instead. No difference.
6) I suspected this might be a Fiddler problem, so I also used Wireshark to capture traffic. It turns out the held outgoing requests were being emulated by Fiddler and the responses had, in fact, been received.
So there are actually no outstanding requests; it seems it is Fiddler that is queuing them. I will edit/close the question once I find a better way to test.
I have been stuck on this problem for a few days already. Please tell me about any possibility you can think of.
Finally, I found that my test was not accurate because of an implementation detail of Fiddler: requests are queued once there are 2x outstanding requests, for an unknown reason.
I set up a server and limited its bandwidth to simulate a slow (timing-out) response.
Using Wireshark, I can see that 150 SYNs can be sent in around 1.4 s as soon as my threads are ready.
There is a lot of overhead associated with creating threads directly. Try using the Task factory instead of Thread. Tasks use the ThreadPool under the covers, which reuses threads instead of continuously creating them.
for (int i = 0; i < 80; i++)
{
    int copy = i;
    Task.Factory.StartNew(() =>
    {
        submit_test(copy);
    });
}
Check out this other post on the topic:
Why so much difference in performance between Thread and Task?
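One caveat: if submit_test blocks for a long time in a synchronous GetResponse call, many such tasks can tie up ThreadPool threads and slow the ramp-up of concurrent requests. A hedged sketch of one way to signal that the work is blocking (reusing the same copy variable as above):
Task.Factory.StartNew(
    () => submit_test(copy),
    CancellationToken.None,
    TaskCreationOptions.LongRunning,  // hint that this work will block, so it gets a dedicated thread
    TaskScheduler.Default);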
Using a modified WebClient, I download data periodically from a service with the following characteristics:
The data download (~1GB) can take around 20 minutes
Sometimes the service decides not to return any data at all (request hangs), or takes minutes to hours to return the first byte.
I would like to fail fast in the event that the service does not return any data within a reasonable (configurable) amount of time, yet allow plenty of time for a download that is making progress to succeed.
It seems that the WebRequest.Timeout property controls the total time for the request to complete, while ReadWriteTimeout controls the total time available to read data once the data transfer begins.
Am I missing a property that would control the maximum amount of time to wait between establishing the connection and the first byte returning? If there is no such property, how can I approach the problem?
I am not aware of any additional timeout property that will achieve the result you are looking for. The first thought that comes to mind is attaching a handler to DownloadProgressChanged that will update a flag to indicate data has been received (not always accurate though).
Using a Timer or EventWaitHandle you could then block (or handle async if you prefer) for a short period of time and evaluate whether any data has been received. The code below is not a fully fleshed out example, but an idea of how it may be implemented.
using (var manualResetEvent = new ManualResetEvent(false))
using (var client = new WebClient())
{
    client.DownloadProgressChanged += (sender, e) => manualResetEvent.Set();
    client.DownloadDataAsync(new Uri("https://github.com/downloads/cbaxter/Harvester/Harvester.msi"));

    if (!manualResetEvent.WaitOne(5000))
        client.CancelAsync();
}
In the above example, the manualResetEvent.WaitOne will return true if DownloadProgressChanged was invoked. You will likely want to check e.BytesReceived > 0 and only set for non-zero values, but I think you get the idea?
I am developing an application using the Twitter API, and it involves writing a method to check whether a user exists. Here is my code:
public static bool checkUserExists(string user)
{
    //string URL = "https://twitter.com/" + user.Trim();
    //string URL = "http://api.twitter.com/1/users/show.xml?screen_name=" + user.Trim();
    //string URL = "http://google.com/#hl=en&sclient=psy-ab&q=" + user.Trim();
    string URL = "http://api.twitter.com/1/users/show.json?screen_name=" + user.Trim();
    HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(URL);
    try
    {
        var webResponse = (HttpWebResponse)webRequest.GetResponse();
        return true;
    }
    //this part onwards does not matter
    catch (WebException ex)
    {
        if (ex.Status == WebExceptionStatus.ProtocolError && ex.Response != null)
        {
            var resp = (HttpWebResponse)ex.Response;
            if (resp.StatusCode == HttpStatusCode.NotFound)
            {
                return false;
            }
            else
            {
                throw new Exception("Unknown level 1 Exception", ex);
            }
        }
        else
        {
            throw new Exception("Unknown level 2 Exception", ex);
        }
    }
}
The problem is, calling the method does not work (it doesn't get a response) more than 2 or 3 times, using any of the URLs that are commented out, including the Google search query (I thought it might be due to the Twitter API limit). On debug, it shows that it's stuck at:
var webResponse = (HttpWebResponse)webRequest.GetResponse();
Here's how I am calling it:
Console.WriteLine(TwitterFollowers.checkUserExists("handle1"));
Console.WriteLine(TwitterFollowers.checkUserExists("handle2"));
Console.WriteLine(TwitterFollowers.checkUserExists("handle3"));
Console.WriteLine(TwitterFollowers.checkUserExists("handle4"));
Console.WriteLine(TwitterFollowers.checkUserExists("handle5"));
Console.WriteLine(TwitterFollowers.checkUserExists("handle6"));
At most I get 2-3 lines of output. Could someone please point out what's wrong?
Update 1:
I sent 1 request every 15 seconds (well within the limit) and it still causes an error. On the other hand, sending a request, closing the app, and running it again works very well (which amounts on average to 1 request every 5 seconds). The rate limit is 150 calls per hour (Twitter FAQ).
Also, I did wait for a while, and got this exception at level 2:
http://pastie.org/3897499
Update 2:
It might sound surprising, but if I run Fiddler, it works perfectly, regardless of whether I target this process or not!
The effect you're seeing is almost certainly due to rate-limit type policies on the Twitter API (multiple requests in quick succession). They keep a tight watch on how you're using their API: the first step is to check their terms of use and policies on rate limiting, and make sure you're in compliance.
Two things jump out at me:
You're hitting the API with multiple requests in rapid succession. Most REST APIs, including Google search, are not going to allow you to do that. These APIs are very visible targets, and it makes sense that they'd be pro-active about preventing denial-of-service attacks.
You don't have a User Agent specified in your request. Most APIs require you to send them a meaningful UA, as a way of helping them identify you.
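For the second point, setting a UA is one line on the request from your code (the string itself is a placeholder; use something that actually identifies your app):
HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(URL);
webRequest.UserAgent = "MyTwitterChecker/1.0 (contact: you@example.com)";  // placeholder identification string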
Note that you're dealing with unmanaged resources underneath your HttpWebResponse, so calling Dispose() in a timely fashion, or wrapping the object in a using statement, is not only wise but important to avoid blocking.
Also, var is great for dealing with anonymous types, LINQ query results, and such, but it should not become a crutch. Why use var when you're well aware of the type (i.e. you're already performing a cast to HttpWebResponse)?
Finally, services like this often limit the rate of connections per second and/or the number of simultaneous connections allowed to prevent abuse. By not disposing of your HttpWebResponse objects, you may be violating the permitted number of simultaneous connections. By querying too often you'd break the rate limit.
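As a sketch, the success path of checkUserExists could look like this (same logic, just disposing the response):
// Dispose the response so the underlying connection is released promptly.
using (HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse())
{
    return true;
}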
What is a reasonable amount of time to wait for a web request to return? I know this is maybe a little loaded as a question, but all I am trying to do is verify if a web page is available.
Maybe there is a better way?
try
{
    // Create the web request
    HttpWebRequest request = WebRequest.Create(this.getUri()) as HttpWebRequest;
    if (request != null)
    {
        request.Credentials = System.Net.CredentialCache.DefaultCredentials;
        // 2 minutes for timeout
        request.Timeout = 120 * 1000;

        // Get response
        response = request.GetResponse() as HttpWebResponse;
        connectedToUrl = processResponseCode(response);
    }
    else
    {
        logger.Fatal(getFatalMessage());
        string error = string.Empty;
    }
}
catch (WebException we)
{
    ...
}
catch (Exception e)
{
    ...
}
You need to consider how long the consumer of the web service is going to take. For example, if you are connecting to a DB web server and you run a lengthy query, you need to make the web service timeout longer than the time the query will take; otherwise, the web service will (erroneously) time out.
I also use something like (consumer time) + 10 seconds.
Offhand I'd allow 10 seconds, but it really depends on what kind of network connection the code will be running with. Try running some test pings over a period of a few days/weeks to see what the typical response time is.
I would measure how long it takes for pages that do exist to respond. If they all respond in about the same amount of time, then I would set the timeout period to approximately double that amount.
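A rough sketch of that idea (the probe URL, the known-good page, and the 1-second floor are assumptions added for illustration):
// Measure a typical response time against a known-good page, then use ~double it as the timeout.
var sw = System.Diagnostics.Stopwatch.StartNew();
var probe = (HttpWebRequest)WebRequest.Create("http://www.example.com/known-good-page");
using (probe.GetResponse()) { }
sw.Stop();

var request = (HttpWebRequest)WebRequest.Create("http://www.example.com/page-to-check");
request.Timeout = Math.Max(1000, (int)(sw.ElapsedMilliseconds * 2));  // 1 s floor is an arbitrary safety margin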
Just wanted to add that a lot of the time I'll use an adaptive timeout. Could be a simple metric like:
period += (numTimeouts/numRequests > .01 ? someConstant: 0);
checked whenever you hit a timeout to try and keep timeouts under 1% (for example). Just be careful about decrementing it too low :)
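A small sketch of what that might look like in code (the 1% threshold and the 1-second increment are just example values, and all the names are illustrative):
class AdaptiveTimeout
{
    private int _timeoutMs = 5000;   // starting period
    private int _numRequests;
    private int _numTimeouts;

    public int TimeoutMs { get { return _timeoutMs; } }

    public void Record(bool timedOut)
    {
        _numRequests++;
        if (timedOut)
        {
            _numTimeouts++;
            // grow the period while more than 1% of requests are timing out
            if ((double)_numTimeouts / _numRequests > 0.01)
                _timeoutMs += 1000;
        }
    }
}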
The reasonable amount of time to wait for a web request may differ from one server to the next. If a server is at the far end of a high-delay link then clearly it will take longer to respond than when it is in the next room. But two minutes seems like it's more than ample time for a server to respond. The default timeout value for the PING command is expressed in seconds, not minutes. I suggest you look into the timeout values that are used by networking utilities like PING or TRACERT for inspiration.
I guess this depends on two things:
network speed/load (as others wrote, using ping might give you an idea about this)
the kind of page you are calling: e.g. is it a static HTML page or is it a page which might do some time-consuming operations (DB access, etc.)
Anyway, I think 2 minutes is a lot of time. I would definitely reduce the timeout to less than 30 seconds.
I realize this doesn't directly answer your question, but then an "answer" to this question is a little tough. Anyway, a tool I've used in the past to measure page load times from various parts of the world is Gomez. It's free, and if you haven't done this kind of testing before, it might be helpful in giving you a firm idea of what typical load times are for a given page from a given location.
I would only wait 30 seconds max, probably closer to 15. It really depends on what you are doing and what the consequence of an unsuccessful connection is. As I am sure you know, there are lots of reasons why you could get a timeout...