Windows Phone, multiple parallel HTTP requests, how many? - c#

In my Windows Phone 8 app, I'm fetching a list of items from a web API. After that I loop over all the items and get the details for each item.
Right now my code is something like this:
List<Plane> planes = await planeService.getPlanes(); // Get all planes from the web api
foreach (Plane plane in planes)
{
    var details = await planeService.getDetails(plane.id); // Get one plane's details from the web api
    drawDetails(details);
}
How can I improve this to make multiple requests in parallel, and what is a reasonable number of requests to run in parallel? The planes list can be anything from 0 to 100 objects, typically 20 at most.

How can I improve this to make multiple requests in parallel?
You can do the parallel processing like below (untested). It uses SemaphoreSlim to throttle the getDetails requests.
async Task ProcessPlanes()
{
    const int MAX_REQUESTS = 50;
    List<Plane> planes = await planeService.getPlanes(); // Get all planes from the web api
    var semaphore = new SemaphoreSlim(MAX_REQUESTS);

    Func<string, Task<Details>> getDetailsAsync = async (id) =>
    {
        await semaphore.WaitAsync();
        try
        {
            var details = await planeService.getDetails(id);
            drawDetails(details);
            return details;
        }
        finally
        {
            semaphore.Release();
        }
    };

    var tasks = planes.Select(plane => getDetailsAsync(plane.id));
    await Task.WhenAll(tasks.ToArray());
}
what is a reasonable number of requests running in parallel? The planes list
can be anything from 0 to 100 objects, typically max 20.
That largely depends on the server; I don't think there's a definitive answer to this. For example, check this question:
A reasonable number of simultaneous, asynchronous ajax requests
As far as the WP8 client goes, I believe it can spawn 100 parallel requests without a problem.

I don't know what the limit is for network connections, but there will be one.
If there weren't, the only concern would be the amount of memory used to keep that many requests alive.
So, assuming the underlying operating system handles throttling properly, I would do something like this:
List<Plane> planes = await planeService.getPlanes();
var allDetails = await Task.WhenAll(from plane in planes
                                    select planeService.getDetails(plane.id));
foreach (var details in allDetails)
{
    drawDetails(details);
}
NOTE: You should follow common naming conventions to help others understand your code. Asynchronous methods should be suffixed Async and, in C#, method names are PascalCase.

You should check out ServicePoint, which provides connection management for HTTP connections. The default maximum number of concurrent connections allowed by a ServicePoint object is 2, so if you need to increase it you can use the ServicePointManager.DefaultConnectionLimit property. Check the link on MSDN, where you can see a sample, and set the value you need. This might help you.
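For instance, a minimal sketch (the limit of 10 and the URL are arbitrary; the default must be raised before the first request to a given host, because each ServicePoint captures it when it is created):

```csharp
using System;
using System.Net;

class ConnectionLimitDemo
{
    static void Main()
    {
        // Raise the global default before any ServicePoint is created.
        ServicePointManager.DefaultConnectionLimit = 10;

        // The per-host ServicePoint created for this request now allows
        // up to 10 concurrent connections instead of the default 2.
        var request = (HttpWebRequest)WebRequest.Create("http://example.com/");
        Console.WriteLine(request.ServicePoint.ConnectionLimit);
    }
}
```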


C# batch processing of async web responses hangs just before finishing

Here is the scenario.
I want to call 2 versions of an API (hosted on different servers), then cast their responses (they come as JSON) to C# objects and compare them.
An important note here is that I need to query the APIs many times, ~3000. The reason is that I query an endpoint that takes an id and returns a specific object from the DB. So my queries look like http://myapi/v1/endpoint/id, and I basically use a loop to go through all of the ids.
Here is the issue
I start querying the API, and for the first 90% of all requests it is blazing fast (I get the response and process it), all within 5 seconds.
Then, however, things slow down. The next 50-100 requests take between 1 and 5 seconds each to process, and after that I come to a stop. No CPU usage, network activity is low (and I am pretty sure that activity is from other apps). My app just hangs.
UPDATE: Around 50% of the times I tested this, it does eventually resume after quite a while. The other 50% it just hangs.
Here is what I am doing in-code
I have a list of Ids that I iterate to query the endpoint.
This is the main piece of code that queries the APIs and processes the responses.
var endPointIds = await GetIds(); // this queries a different endpoint to get all ids; there are no issues with it
var tasks = endPointIds.Select(async id =>
{
    var response1 = await _data.GetData($"{Consts.ApiEndpoint1}/{id}");
    var response2 = await _data.GetData($"{Consts.ApiEndpoint2}/{id}");
    return ProcessResponces(response1, response2);
});
var res = await Task.WhenAll(tasks);
var result = res.Where(r => r != null).ToList();
return result; // I never get to return the result; the app hangs before this is reached
This is the GetData() method
private async Task<string> GetAsync(string serviceUri)
{
    try
    {
        var request = WebRequest.CreateHttp(serviceUri);
        request.ContentType = "application/json";
        request.Method = WebRequestMethods.Http.Get;
        using (var response = await request.GetResponseAsync())
        using (var responseStream = response.GetResponseStream())
        using (var streamReader = new StreamReader(responseStream, Encoding.UTF8))
        {
            return await streamReader.ReadToEndAsync();
        }
    }
    catch
    {
        return string.Empty;
    }
}
I would link the ProcessResponces method as well; however, I tried mocking it to return a string like so:
private string ProcessResponces(string responseJson1, string responseJson2)
{
    // usually I would have 2 lines here that deserialize responseJson1 and responseJson2 using Newtonsoft.Json's DeserializeObject<>
    return "Fake success";
}
Even with this implementation my issue did not go away (the only difference it made is that I managed to have fast requests for about 97% of my requests, but my code still ended up stopping at the last few), so I am guessing the main issue is not related to that method. What it more or less does is deserialize both responses to C# objects, compare them, and return information about their equality.
Here are my observations after 4 hours of debugging
If I manually reduce the number of queries to my API (I used the .Take() method on the list of ids) the issue still persists. For example, with 1000 total requests I start hanging around the 900th, with 1500 around the 1400th, and so on. I believe the issue goes away at around 100-200 requests, but I am not sure, since it might just be too fast for me to notice.
Since this is currently a console app, I tried adding WriteLine() calls in some of my methods, and the issue seemed to go away (I am guessing the delay that writing to the console creates gives some time between requests, and that helps).
Lastly, I did concurrency profiling of my app, and it reported a lot of contentions happening at the point where my app hangs. Opening the contention tab showed that they mainly happen in System.IO.StreamReader.ReadToEndAsync().
Thoughts and Questions
Obviously, what can I do to resolve the issue?
Is my GetAsync() method wrong, should I be using something else instead of responseStream and streamReader?
I am not super proficient in asynchronous operations, maybe my flow of async/await operations is wrong.
Lastly, could it be something with the API controllers themselves? They are standard ASP.NET MVC 5 WebAPI controllers (version 5.2.3.0)
After long hours of tracking my requests with Fiddler and finally mocking my DataProvider (_data) to retrieve locally, from disk, it turns out that I had responses that were taking 30s+ to arrive (or not arriving at all).
Since my .Select() is async, it always displayed info for the quick responses first and then came to a halt as it waited for the slow ones. This gave the illusion that I was somehow loading the first X requests quickly and then stopping, when in reality I was simply shown the fastest X requests and then left waiting for the slow ones.
And to kind of answer my questions...
What can I do to resolve the issue? Set a timeout that allows a maximum number of milliseconds/seconds for a request to finish.
The GetAsync() method is alright.
The async/await operations are also correct; just keep in mind that an async Select will yield results ordered by the time they took to finish.
The ASP.NET Framework controllers are perfectly fine and do not contribute to the issue.
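A minimal sketch of that timeout idea (the 10-second value and the WithTimeout helper are illustrative, not part of the original code; note that the abandoned request keeps running in the background, its result is simply ignored):

```csharp
using System;
using System.Threading.Tasks;

static class TimeoutExtensions
{
    // Returns the task's result, or a fallback value if it does not
    // complete within the given timeout. The original task is not
    // cancelled; it is abandoned.
    public static async Task<string> WithTimeout(
        this Task<string> task, TimeSpan timeout, string fallback = "")
    {
        var winner = await Task.WhenAny(task, Task.Delay(timeout));
        return winner == task ? await task : fallback;
    }
}
```

Each call inside the Select would then become something like `await _data.GetData($"{Consts.ApiEndpoint1}/{id}").WithTimeout(TimeSpan.FromSeconds(10))`, which matches GetData's existing convention of returning an empty string on failure.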

Need help deciding when it is a good idea to limit the number of thread pool threads a .NET app consumes

I have an HTTP client which basically invokes multiple web requests against an HTTP server. I execute each HTTP request on a thread pool thread (a synchronous call) and by default use 30 TCP connections (using httpwebrequest.servicepoint - http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.servicepoint.aspx ). Based on the system I am managing, there can be ~500/1000 thread pool threads waiting for I/O (the HTTP response).
Now I am wondering: do I need to limit the number of threads I use as well? (For example, http://msdn.microsoft.com/en-us/library/ee789351(v=vs.110).aspx System.Threading.Tasks - Limit the number of concurrent Tasks.)
EDIT
Yes, I think I need to limit the number of threads I use, as even though these threads are in a wait state they take up resources. This way I can control the number of resources/threads I use, which makes it easier for my component to be integrated with others without starving them of resources/threads.
EDIT 2
I have decided to completely embrace the async model, so that I won't be using thread pool threads to execute HTTP requests; rather, I can simply rely on the collaboration of the OS kernel and I/O completion port threads, which ensures that upon completion the response is delivered in a callback (this way I make the best use of CPU as well as resources). I am currently thinking of using webclient.uploaddatataskasync http://msdn.microsoft.com/en-us/library/system.net.webclient.uploaddatataskasync(v=vs.110).aspx, and updating the code accordingly. (A couple of references for details: HttpWebRequest and I/O completion ports, How does .NET make use of IO Threads or IO Completion Ports?)
EDIT 3
Basically, I have used the async network I/O .NET APIs mentioned above, which essentially removed the use of my parallel library. For details, please see the answer below (I have added it for convenience, in case anyone is interested!).
Pseudo code to give an idea of how I am invoking web requests using WebClient (note the semaphore must be shared across all requests, not created per request):
// pseudo code to represent a variable number of requests
// there can be ~500 to ~1000 of these
var sem = new Semaphore(initialCount: 50, maximumCount: 50);
foreach (var request in requests)
{
    // pseudo code which basically executes the web request on a thread pool thread
    // MY QUESTION: Is it OK to create as many worker threads as there are requests
    // and simply let them wait on a semaphore, or should I limit the concurrency?
    MyThreadPoolConcurrentLibrary.ExecuteAction(() =>
    {
        try
        {
            // using a semaphore because the HTTP server I am talking to recommends
            // sending 50 parallel requests over 30 TCP connections
            sem.WaitOne();
            // using my custom webclient so that I can configure tcp connections
            // (servicepoint connection limit), ssl validation, etc.
            using (MyCustomWebClient client = new MyCustomWebClient())
            {
                // http://msdn.microsoft.com/en-us/library/tdbbwh0a(v=vs.110).aspx
                // basically the worker thread simply waits here
                client.UploadData(address: "urladdress", data: bytesdata);
            }
        }
        finally
        {
            sem.Release(1);
        }
    });
}
MyThreadPoolConcurrentLibrary.WaitAll(/*...*/);
Basically, should I do something to limit the number of threads I consume, or let the thread pool take care of it (i.e., if my app reaches the thread pool's maximum thread limit, it queues the request anyway, so I can simply rely on that)?
Pseudo code showing my custom WebClient, where I configure tcp connections, ssl validation, etc.:
class MyCustomWebClient : WebClient
{
    protected override WebRequest GetWebRequest(Uri address)
    {
        HttpWebRequest request = (HttpWebRequest)base.GetWebRequest(address);
        request.KeepAlive = true;
        request.Timeout = 300;
        request.ServicePoint.ConnectionLimit = TCPConnectionsLimit;
        request.ServerCertificateValidationCallback = this.ServerCertificateValidationCallback;
        return request;
    }

    private bool ServerCertificateValidationCallback(object sender, System.Security.Cryptography.X509Certificates.X509Certificate certificate, System.Security.Cryptography.X509Certificates.X509Chain chain, System.Net.Security.SslPolicyErrors sslPolicyErrors)
    {
        throw new NotImplementedException();
    }
}
Best Regards.
Since I am performing network I/O (HTTP web requests), it is not a good idea to use synchronous HttpWebRequests and let thread pool threads block in sync calls. So I used the async network I/O operations (WebClient's async task methods) mentioned above in the question, as per the suggestions in the comments. This automatically removed the use of a number of threads in my component; for details, please see the pseudo code snippet below.
Here are some useful links that helped me adapt easily to a few of the C# 5.0 async concepts (async/await):
Deep Dive Video (good explanation of async/await state machine) http://channel9.msdn.com/events/TechDays/Techdays-2014-the-Netherlands/Async-programming-deep-dive
http://blog.stephencleary.com/2013/11/there-is-no-thread.html
async/await error handling: http://www.interact-sw.co.uk/iangblog/2010/11/01/csharp5-async-exceptions , http://msdn.microsoft.com/en-us/library/0yd65esw.aspx , How to better understand the code/statements from "Async - Handling multiple Exceptions" article?
Nice book: http://www.amazon.com/Asynchronous-Programming-NET-Richard-Blewett/dp/1430259205
class Program
{
    static SemaphoreSlim s_sem = new SemaphoreSlim(90, 90);
    static List<Task> s_tasks = new List<Task>();

    public static void Main()
    {
        for (int request = 1; request <= 1000; request++)
        {
            var task = FetchData();
            s_tasks.Add(task);
        }
        Task.WaitAll(s_tasks.ToArray());
    }

    private static async Task<string> FetchData()
    {
        try
        {
            s_sem.Wait();
            using (var wc = new MyCustomWebClient())
            {
                string content = await wc.DownloadStringTaskAsync(
                    new Uri("http://www.interact-sw.co.uk/oops/")).ConfigureAwait(continueOnCapturedContext: false);
                return content;
            }
        }
        finally
        {
            s_sem.Release(1);
        }
    }

    private class MyCustomWebClient : WebClient
    {
        protected override WebRequest GetWebRequest(Uri address)
        {
            var req = (HttpWebRequest)base.GetWebRequest(address);
            req.ServicePoint.ConnectionLimit = 30;
            return req;
        }
    }
}
Regards.
You could always simply aim for the same limits that browsers run under. That way, server admins can't really hate on you too much.
Now, the RFC says that you should limit connections to 2 per domain, but according to
http://www.stevesouders.com/blog/2008/03/20/roundup-on-parallel-connections/
many browsers go as high as 6 or 8 parallel connections (and that was back in 2008).
Browser        HTTP/1.1  HTTP/1.0
IE 6,7         2         4
IE 8           6         6
Firefox 2      2         8
Firefox 3      6         6
Safari 3,4     4         4
Chrome 1,2     6         ?
Chrome 3       4         4
Chrome 4+      6         ?
iPhone 2       4         ?
iPhone 3       6         ?
iPhone 4       4         ?
Opera 9.63     4         4
Opera 10.51+   8         ?
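If you want to mirror those browser figures from a .NET client, one option is to cap the connection limit per host rather than globally (a sketch; the value 6 matches the modern-browser column above, and the URL is a placeholder):

```csharp
using System;
using System.Net;

class BrowserLikeLimit
{
    static void Main()
    {
        // Cap concurrent connections to one specific host at a
        // browser-like 6, leaving all other hosts at the default.
        var servicePoint = ServicePointManager.FindServicePoint(
            new Uri("http://example.com/"));
        servicePoint.ConnectionLimit = 6;

        Console.WriteLine(servicePoint.ConnectionLimit);
    }
}
```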

Speed up reverse DNS lookups for large batch of IPs

For analytics purposes, I'd like to perform reverse DNS lookups on large batches of IPs. "Large" meaning at least tens of thousands per hour. I'm looking for ways to increase the processing rate, i.e. lower the processing time per batch.
Wrapping the async version of Dns.GetHostEntry into awaitable tasks has already helped a lot (compared to sequential requests), leading to a throughput of approx. 100-200 IPs/second:
static async Task DoReverseDnsLookups()
{
    // in reality, thousands of IPs
    var ips = new[] { "173.194.121.9", "173.252.110.27", "98.138.253.109" };
    var hosts = new Dictionary<string, string>();

    var tasks =
        ips.Select(
            ip =>
                Task.Factory.FromAsync(Dns.BeginGetHostEntry,
                                       (Func<IAsyncResult, IPHostEntry>)Dns.EndGetHostEntry,
                                       ip, null)
                    .ContinueWith(t =>
                        hosts[ip] = ((t.Exception == null) && (t.Result != null))
                            ? t.Result.HostName : null));

    var start = DateTime.UtcNow;
    await Task.WhenAll(tasks);
    var end = DateTime.UtcNow;

    Console.WriteLine("Resolved {0} IPs in {1}, that's {2}/sec.",
        ips.Count(), end - start,
        ips.Count() / (end - start).TotalSeconds);
}
Any ideas how to further improve the processing rate?
For instance, is there any way to send a batch of IPs to the DNS server?
Btw, I'm assuming that under the covers, I/O Completion Ports are used by the async methods - correct me if I'm wrong please.
Here are some tips so you can improve:
Cache the queries locally, since this information doesn't usually change for days or even years. That way you don't have to resolve every time.
Most DNS servers will automatically cache the information, so the next time it will resolve pretty fast. Usually the cache lasts 4 hours; at least that is the default on Windows servers. This means that if you run this process as a batch over a short period, it will perform better than if you resolve the addresses several times during the day, allowing the cache to expire.
It is good that you are using task parallelism, but you are still asking the same DNS servers configured on your machine. I think that having two machines using different DNS servers would improve the process.
I hope this helps.
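The first tip (a local cache) can be sketched like this; the class name and the injectable resolver are illustrative, not from the question. Caching the lookup Task itself also deduplicates concurrent lookups of the same IP (with the caveat that a failed lookup stays cached in this sketch):

```csharp
using System;
using System.Collections.Concurrent;
using System.Net;
using System.Threading.Tasks;

class ReverseDnsCache
{
    // Caches the lookup Task itself, so concurrent requests for the
    // same IP share one in-flight query instead of racing.
    private readonly ConcurrentDictionary<string, Task<string>> _cache =
        new ConcurrentDictionary<string, Task<string>>();

    private readonly Func<string, Task<string>> _resolve;

    public ReverseDnsCache(Func<string, Task<string>> resolve = null)
    {
        // Default resolver uses Dns; a custom one can be injected for testing.
        _resolve = resolve ?? (async ip =>
            (await Dns.GetHostEntryAsync(ip)).HostName);
    }

    public Task<string> GetHostNameAsync(string ip) =>
        _cache.GetOrAdd(ip, _resolve);
}
```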
As always, I would suggest using TPL Dataflow's ActionBlock instead of firing all requests at once and waiting for all to complete. Using an ActionBlock with a high MaxDegreeOfParallelism lets the TPL decide for itself how many calls to fire concurrently, which can lead to a better utilization of resources:
var block = new ActionBlock<string>(
    async ip =>
    {
        try
        {
            var host = (await Dns.GetHostEntryAsync(ip)).HostName;
            if (!string.IsNullOrWhiteSpace(host))
            {
                hosts[ip] = host;
            }
        }
        catch
        {
            return;
        }
    },
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 5000 });
I would also suggest adding a cache and making sure you don't resolve the same IP more than once.
When you use .NET's Dns class, it includes some fallbacks besides DNS (e.g. LLMNR), which makes it very slow. If all you need are DNS queries, you might want to use a dedicated library like ARSoft.Tools.Net.
P.S: Some remarks about your code sample:
You should be using GetHostEntryAsync instead of FromAsync.
The continuations can potentially run on different threads, so you should really be using a ConcurrentDictionary.
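Concretely, the declaration in the sample above could become this (a sketch; the IP/hostname values are placeholders for illustration only):

```csharp
using System;
using System.Collections.Concurrent;

class Program
{
    static void Main()
    {
        // Thread-safe replacement for a plain Dictionary<string, string>:
        // concurrent hosts[ip] = host writes from an ActionBlock with a
        // high MaxDegreeOfParallelism are then safe, since indexer
        // writes on ConcurrentDictionary are atomic.
        var hosts = new ConcurrentDictionary<string, string>();
        hosts["173.194.121.9"] = "example-host"; // placeholder values

        Console.WriteLine(hosts["173.194.121.9"]);
    }
}
```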

Factors limiting concurrent outstanding requests

I am trying to achieve a high number of web requests per second.
In C#, I used multiple threads to send web requests and found that no matter how many threads I created,
the maximum number of web requests is around 70 per second, even when the server responds quickly.
I tried to simulate timeout responses using Fiddler in order to create concurrent outstanding web requests and get a better understanding.
With any number of threads, 2x requests fire instantly; afterwards, the queued requests fire one by one very slowly, even though the earlier requests are still awaiting responses. Once some requests finish, the queued requests fire faster to replenish the amount. It's as if initialization takes time once the pre-initialized amount is reached. Moreover, the responses are small enough that bandwidth problems can be neglected.
Below is the code.
I tried on Windows XP and Windows 7 on different networks. The same thing happens.
public Form1()
{
    System.Net.ServicePointManager.DefaultConnectionLimit = 1000;
    for (int i = 0; i < 80; i++)
    {
        int copy = i;
        new Thread(() =>
        {
            submit_test(copy);
        }) { IsBackground = true }.Start();
    }
}

public void submit_test(int pos)
{
    var webRequest = (HttpWebRequest)WebRequest.Create("http://www.test.com/");
    webRequest.Method = "GET";
    using (HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse())
    {
    }
}
Is the network card limiting the instantly fired amount?
I know that a large server can handle thousands of incoming requests concurrently. Isn't that the same as sending out requests (establishing connections)?
Please tell me if using a server would help solve the problem.
Update clues:
1) I suspected the router was the limit and unplugged it. No difference.
2) Fiddler shows that one queued request fires exactly every second.
3) I used the Apache benchmarking tool to try to send concurrent timeout requests, and the same thing happens. It's not likely to be a .NET problem.
4) I tried connecting to localhost instead. No difference.
5) I used BeginGetResponse instead, and no difference.
6) I suspected this might be a Fiddler problem, so I used Wireshark as well to capture traffic. As it turns out, the held outgoing requests are emulated by Fiddler, and the responses were in fact received.
There are actually no outstanding requests. It seems that it is Fiddler queuing the requests. I will edit/close the question after I find a better method to test.
I have been stuck on this problem for a few days already. Please tell me any possibility you can think of.
Finally, I found that my test was not accurate due to an implementation detail of Fiddler: the requests are queued after 2x outstanding requests, for an unknown reason.
I set up a server and limited its bandwidth to simulate timeout responses.
Using Wireshark, I can see that 150 SYNs can be sent in around 1.4 s as soon as my threads are ready.
There is a lot of overhead associated with creating threads directly. Try using the Task factory instead of Thread. Tasks use the ThreadPool under the covers, which reuses threads instead of continuously creating them.
for (int i = 0; i < 80; i++)
{
    int copy = i;
    Task.Factory.StartNew(() =>
    {
        submit_test(copy);
    });
}
Check out this other post on the topic:
Why so much difference in performance between Thread and Task?

Boosting performance on async web calls

Background: I must call a web service 1500 times, and each call takes roughly 1.3 seconds to complete. (No control over this 3rd-party API.) Total time = 1500 * 1.3 = 1950 seconds / 60 seconds = roughly 32 minutes.
I came up with what I thought was a good solution; however, it did not pan out that great.
So I changed the calls to async web calls, thinking this would dramatically improve my results. It did not.
Example Code:
Pre-Optimizations:
foreach (var elmKeyDataElementNamed in findResponse.Keys)
{
    var getRequest = new ElementMasterGetRequest
    {
        Key = new elmFullKey
        {
            CmpCode = CodaServiceSettings.CompanyCode,
            Code = elmKeyDataElementNamed.Code,
            Level = filterLevel
        }
    };
    ElementMasterGetResponse getResponse;
    _elementMasterServiceClient.Get(new MasterOptions(), getRequest, out getResponse);
    elementList.Add(new CodaElement { Element = getResponse.Element, SearchCode = filterCode });
}
With Optimizations:
var tasks = findResponse.Keys.Select(elmKeyDataElementNamed => new ElementMasterGetRequest
{
    Key = new elmFullKey
    {
        CmpCode = CodaServiceSettings.CompanyCode,
        Code = elmKeyDataElementNamed.Code,
        Level = filterLevel
    }
}).Select(getRequest => _elementMasterServiceClient.GetAsync(new MasterOptions(), getRequest)).ToList();

Task.WaitAll(tasks.ToArray());

elementList.AddRange(tasks.Select(p => new CodaElement
{
    Element = p.Result.GetResponse.Element,
    SearchCode = filterCode
}));
Smaller Sampling Example:
To test easily, I did a smaller sampling of 40 records. This took 60 seconds with no optimizations; with the optimizations it took only 50 seconds. I would have thought it would be closer to 30 or better.
I used Wireshark to watch the transactions come through and realized the async version was not sending requests as fast as I assumed it would.
Async requests captured
Normal no optimization
You can see that the async version pushes a few requests very fast, then drops off...
Also note that between requests 10 and 11 it took nearly 3 seconds.
Is the overhead of creating threads for the tasks so large that it takes seconds?
Note: the tasks I am referring to are from the .NET 4.5 TAP task library.
Why wouldn't the requests come faster than that?
I was told the Apache web server I was hitting could hold 200 threads max, so I don't see an issue there.
Am I not thinking about this clearly?
When calling web services, is there little advantage to async requests?
Do I have a code mistake?
Any ideas would be great.
After many days of searching, I found this post, which solved my problem:
Trying to run multiple HTTP requests in parallel, but being limited by Windows (registry)
The reason the requests were not hitting the server quicker was due to my client-side code and nothing to do with the server. By default, C# only allows 2 concurrent requests.
See here: http://msdn.microsoft.com/en-us/library/system.net.servicepointmanager.defaultconnectionlimit.aspx
I simply added this line of code, and then all requests came through in milliseconds:
System.Net.ServicePointManager.DefaultConnectionLimit = 50;
