I'm working on an async HTTP crawler that gathers data from various services. At the moment I'm using thread pools that make serial HttpWebRequest calls to POST/GET data from the services.
I want to transition over to the async web calls (BeginGetRequestStream and BeginGetResponse), but I need some way to get the response data and POST stats (% completed with the write, and, more importantly, when it completes, etc.). I currently have an event that is raised from the object that spawns/contains the thread, signaling that HTTP data has been received. Is there an event on the web requests I can attach to in order to raise that already implemented event? That would make the transition most seamless.
Thanks for any help!!
The following code is copied/pasted (and edited) from this article about asynchronous web requests. It shows a basic pattern for writing asynchronous code in a somewhat organized fashion while keeping track of which responses go with which requests. When you're finished with the response, just fire an event that notifies the UI that a response finished.
private void ScanSites()
{
    // for each URL in the collection...
    foreach (Uri uri in uris)
    {
        WebRequest request = WebRequest.Create(uri);
        // RequestState is a custom class used to pass info to the callback;
        // 'data' is whatever per-request info you want to carry along
        RequestState state = new RequestState(request, data);
        IAsyncResult result = request.BeginGetResponse(
            new AsyncCallback(UpdateItem), state);
    }
}
private void UpdateItem (IAsyncResult result)
{
// grab the custom state object
RequestState state = (RequestState)result.AsyncState;
WebRequest request = (WebRequest)state.request;
// get the Response
HttpWebResponse response =
(HttpWebResponse)request.EndGetResponse(result);
// fire the event that notifies the UI that data has been retrieved...
}
Note you can replace the RequestState object with any sort of object you want that will help you keep track of things.
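For reference, a bare-bones RequestState might look something like this. This is my own sketch, shaped to match the snippet above rather than the exact class from the article:
public class RequestState
{
    // The request being processed plus any per-request data you want to
    // carry into the callback.
    public WebRequest request;
    public object data;

    public RequestState(WebRequest request, object data)
    {
        this.request = request;
        this.data = data;
    }
}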
You are probably already doing this, but if not, I believe this is a perfectly acceptable and clean way to tackle the problem. If this isn't what you were looking for, let me know.
You could pass a delegate (as part of the async "state" parameter) that needs to be called. Then, after your EndGetResponse call, do what you need and invoke that delegate with any parameters you need.
Personally, since you're moving to the async programming model (I assume to get better performance), I strongly suggest you move your workflow over to asynchronous as well. This model allows you to process the results as they come in, as fast as possible, without any blocking whatsoever.
Edit
On my blog there is an article
HttpWebRequest - Asynchronous Programming Model/Task.Factory.FromAsync
on this subject. I'm currently in the process of writing it, but I've presented a class that I think you could use in your situation. Take a look at either the GetAsync method or PostAsync method depending on what you need.
public static void GetAsyncTask(string url, Action<HttpWebRequestCallbackState> responseCallback,
string contentType = "application/x-www-form-urlencoded")
Notice the responseCallback parameter? Well that's the delegate I talked about earlier.
Here is an example of how you'd call it (I'm showing the PostAsync() method):
var iterations = 100;
for (int i = 0; i < iterations; i++)
{
var postParameters = new NameValueCollection();
postParameters.Add("data", i.ToString());
HttpSocket.PostAsync(url, postParameters, callbackState =>
{
if (callbackState.Exception != null)
throw callbackState.Exception;
Console.WriteLine(HttpSocket.GetResponseText(callbackState.ResponseStream));
});
}
The loop could be over your collection of URLs. In the case of a GET you don't need to send any (POST) parameters, and the callback is the lambda where I'm writing to the console. There you could do whatever you need, or you could pass in a delegate so the response processing is done "elsewhere".
Also, the callback method is an
Action<HttpWebRequestCallbackState>
where HttpWebRequestCallbackState is a custom class you can modify to include any information you need for your purposes. Or you could change the signature to take a plain Action.
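I won't reproduce the whole class from the post here, but judging purely from the usage above, a pared-down HttpWebRequestCallbackState would expose at least these members (a sketch inferred from the example, not the exact class from the blog):
public class HttpWebRequestCallbackState
{
    // Set when the request failed; null on success.
    public Exception Exception { get; set; }
    // The response body of a successful request.
    public System.IO.Stream ResponseStream { get; set; }
}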
You can use the System.Net.WebClient class:
var client = new WebClient();
client.DownloadDataCompleted += (s, args) => { /* do stuff here */ };
client.DownloadDataAsync(new Uri("http://someuri.com/"));
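Since the question also involves POSTing, WebClient covers that case too. A minimal sketch (the URL and form field below are placeholders):
var client = new WebClient();
client.UploadValuesCompleted += (s, args) =>
{
    // args.Result is the raw response body as a byte[].
    var responseText = Encoding.UTF8.GetString(args.Result);
    // ...raise your existing "data received" event here...
};
var form = new NameValueCollection { { "data", "42" } };
client.UploadValuesAsync(new Uri("http://someuri.com/post"), form);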
The second method below (the overload with a timeout) is the one I primarily use to get the response.
public string GetResponse()
{
// Get the original response.
var response = _request.GetResponse();
Status = ((HttpWebResponse) response).StatusDescription;
// Get the stream containing all content returned by the requested server.
_dataStream = response.GetResponseStream();
// Open the stream using a StreamReader for easy access.
var reader = new StreamReader(_dataStream);
// Read the content fully up to the end.
var responseFromServer = reader.ReadToEnd();
// Clean up the streams.
reader.Close();
if (_dataStream != null)
_dataStream.Close();
response.Close();
return responseFromServer;
}
/// <summary>
/// Custom timeout on responses
/// </summary>
/// <param name="millisec"></param>
/// <returns></returns>
public string GetResponse(int millisec)
{
    // Spin off a new thread that's safe for an ASP.NET application pool.
    var responseFromServer = "";
    Exception error = null;
    var resetEvent = new ManualResetEvent(false);
    ThreadPool.QueueUserWorkItem(arg =>
    {
        try
        {
            responseFromServer = GetResponse();
        }
        catch (Exception ex)
        {
            // Capture the exception so it can be rethrown on the calling thread
            // (throwing from a ThreadPool thread would tear down the process).
            error = ex;
        }
        finally
        {
            resetEvent.Set(); // end of thread
        }
    });
    // Handle a timeout in an ASP.NET-safe way.
    WaitHandle.WaitAll(new WaitHandle[] { resetEvent }, millisec);
    if (error != null)
        throw error;
    return responseFromServer;
}
I'm in a bit of a conundrum regarding multithreading.
I'm currently working on a real-time service using SignalR. The idea is that a connected user can request data from another connected user.
Below is a gist of what the request and response functions look like.
Consider the following code:
private readonly ConcurrentBag<MyObject> _sharedObjects = new ConcurrentBag<MyObject>();
The request:
[...]
var sharedObject = new MyObject();
_sharedObjects.Add(sharedObject);
ForwardRequestFireAndForget();
try
{
await Task.Delay(30000, sharedObject.myCancellationToken);
}
catch
{
return sharedObject.ResponseProperty;
}
_sharedObjects.TryTake(sharedObject);
[...]
The response:
[...]
var result = DoSomePossiblyVeryLengthyTaskHere();
var sharedObject = _sharedObjects
    .FirstOrDefault(x => /* x matches this request */);
// The request has timed out so the object isn't there anymore.
if(sharedObject == null)
{
return someResponse;
}
sharedObject.ResponseProperty = result;
// triggers the cancellation source
sharedObject.Cancel();
return someOtherResponse;
[...]
So basically a request is made to the server, forwarded to the other host and the function waits for cancellation or it times out.
The other host calls the respond function, which sets the response object and triggers myCancellationToken.
I am, however, unsure whether this represents a race condition.
In theory, could the responding thread retrieve the sharedObject while the requesting thread is still between the Task.Delay() completing and the TryTake() call?
That would mean the request has already timed out but the task just hasn't gotten around to removing the object from the bag, which leaves the data inconsistent.
What would be a guaranteed way to make sure that the first thing that runs after the Task.Delay() call is the TryTake() call?
You don't want to have the producer cancel the consumer's wait. That's way too much conflation of responsibilities.
Instead, what you really want is for the producer to send an asynchronous signal. This is done via TaskCompletionSource<T>. The consumer can add the object with an incomplete TCS, and then the consumer can (asynchronously) wait for that TCS to complete (or time out). Then the producer just gives its value to the TCS.
Something like this:
class MyObject
{
public TaskCompletionSource<MyProperty> ResponseProperty { get; } = new TaskCompletionSource<MyProperty>();
}
// request (consumer):
var sharedObject = new MyObject();
_sharedObjects.Add(sharedObject);
ForwardRequestFireAndForget();
var responseTask = sharedObject.ResponseProperty.Task;
if (await Task.WhenAny(Task.Delay(30000), responseTask) != responseTask)
return null;
_sharedObjects.TryTake(sharedObject);
return await responseTask;
// response (producer):
var result = DoSomePossiblyVeryLengthyTaskHere();
var sharedObject = _sharedObjects
    .FirstOrDefault(x => /* x matches this request */);
// The request has timed out so the object isn't there anymore.
if(sharedObject == null)
return someResponse;
sharedObject.ResponseProperty.TrySetResult(result);
return someOtherResponse;
The code above can be cleaned up a bit; specifically, it's not a bad idea to have the producer have a "producer view" of the shared object, and the consumer have a "consumer view", with both interfaces implemented by the same type. But the code above should give you the general idea.
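As a rough illustration of that last point (the names here are invented for the example), the two views could be two interfaces over the same TCS-backed class:
public interface IConsumerView
{
    // What the request side awaits.
    Task<MyProperty> Response { get; }
}

public interface IProducerView
{
    // What the response side calls; returns false if already completed.
    bool TrySetResponse(MyProperty value);
}

public class SharedRequest : IConsumerView, IProducerView
{
    private readonly TaskCompletionSource<MyProperty> _tcs =
        new TaskCompletionSource<MyProperty>();

    public Task<MyProperty> Response { get { return _tcs.Task; } }

    public bool TrySetResponse(MyProperty value)
    {
        return _tcs.TrySetResult(value);
    }
}
The consumer only ever sees a task to await; the producer only ever sees a method to call, which keeps the two responsibilities separate.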
I have a program which fetches the HTML code for ~500 web pages every 5 minutes.
It runs correctly until the first failure (unable to download the source within 6 seconds).
After that, all threads fail.
If I restart the program, it again runs correctly until the next failure.
Where am I going wrong, and what should I do to make this work better?
This function runs every 5 minutes:
foreach (Company company in companies)
{
string link = company.GetLink();
Thread t = new Thread(() => F(company, link));
t.Start();
if (!t.Join(TimeSpan.FromSeconds(6)))
{
Debug.WriteLine( company.Name + " Fails");
t.Abort();
}
}
and this function downloads the HTML code:
private void F(Company company, string link)
{
try
{
string htmlCode = GetInformationFromWeb.GetHtmlRequest(link);
company.HtmlCode = htmlCode;
}
catch (Exception ex)
{
}
}
and this class:
public class GetInformationFromWeb
{
public static string GetHtmlRequest(string url)
{
using (MyWebClient client = new MyWebClient())
{
client.Encoding = Encoding.UTF8;
string htmlCode = client.DownloadString(url);
return htmlCode;
}
}
}
and the WebClient subclass:
public class MyWebClient : WebClient
{
protected override WebRequest GetWebRequest(Uri address)
{
HttpWebRequest request = base.GetWebRequest(address) as HttpWebRequest;
request.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
return request;
}
}
If your foreach loops over 500 companies and creates a new thread for each one, your internet connection can easily become the bottleneck, so downloads exceed the 6-second timeout and fail very often.
I suggest you try Parallel.ForEach instead. Note MaxDegreeOfParallelism, which sets the maximum number of parallel executions; you can tune this to suit your needs.
Parallel.ForEach(companies, new ParallelOptions { MaxDegreeOfParallelism = 10 }, (company) =>
{
try
{
string htmlCode = GetInformationFromWeb.GetHtmlRequest(company.GetLink());
company.HtmlCode = htmlCode;
}
catch(Exception ex)
{
//ignore or process exception
}
});
I have four basic suggestions:
Use HttpClient instead of the obsolete WebClient (see the sketch after this list). HttpClient handles asynchronous operations natively and is far more flexible. You can even read the downloaded content into strings/streams on a different thread, since you can configure await not to resume on the original context, or set a timeout so that a request is cancelled with a TaskCanceledException after 6 seconds.
Avoid swallowing exceptions (as you do in your F function): it breaks debugging and obscures the real cause of problems. A correctly written program will not raise exceptions during normal operation.
You are using threads in a way that defeats their purpose; they don't even overlap, because the calling loop blocks in t.Join after starting each one. In .NET it is better to do this kind of multitasking with Tasks, for example Task.Run(async () => await YourTaskAsync()) (or AsyncContext.Run(...) if you need UI access), which doesn't block anything.
The whole GetInformationFromWeb class is currently pointless, and you are also spawning multiple client objects needlessly, since one HTTP client object can handle multiple requests. With HttpClient there is no extra bloat: you instantiate it once as a static field with all the necessary configuration and then call it from anywhere with as little code as client.GetStringAsync(uri).
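To make the HttpClient suggestion concrete, here is a rough sketch of how the whole 5-minute pass could look. The member names come from your own classes, the 6-second timeout just mirrors your current value, and this is a sketch rather than a drop-in replacement:
// requires System.Net.Http
private static readonly HttpClient Client = new HttpClient
{
    // Give up on any single download after 6 seconds.
    Timeout = TimeSpan.FromSeconds(6)
};

private static async Task DownloadAllAsync(IEnumerable<Company> companies)
{
    var tasks = companies.Select(async company =>
    {
        try
        {
            company.HtmlCode = await Client.GetStringAsync(company.GetLink());
        }
        catch (Exception ex)
        {
            // Log instead of swallowing, so failures stay visible.
            Debug.WriteLine(company.Name + " failed: " + ex.Message);
        }
    });
    await Task.WhenAll(tasks);
}
If you still want the gzip/deflate support your MyWebClient adds, pass an HttpClientHandler with AutomaticDecompression set into the HttpClient constructor.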
OT: Is it some kind of an academic project?
I am developing an application wherein the user enters a sentence into a text box. I use the TextBox.Text property to get the text as a string, and I call a method getTranslation() which internally invokes several async callbacks, since it has to:
establish a connection to the server
write the request to the POST stream
get the response callback from the server
process the response
return the response to the XAML page
In the XAML page's code-behind I first call searchOnline, passing the input text as a parameter. The next line of code then calls the returner method and sets the TextBlock content to the returned response.
These are my methods used to call the server.
public void searchOnline(string inputtxt)
{
//Lines of code
IAsyncResult writeRequestStreamCallback =
(IAsyncResult)req.BeginGetRequestStream(new AsyncCallback(RequestStreamReady), req);
}
private void RequestStreamReady(IAsyncResult ar)
{
//Lines of code
request.BeginGetResponse(new AsyncCallback(GetResponseCallback), request);
}
private void GetResponseCallback(IAsyncResult ar)
{
//Lines of code
IAsyncResult writeRequestStreamCallback = (IAsyncResult)serviceWebRequest.BeginGetResponse(new AsyncCallback(ServiceReady), serviceWebRequest);
}
private void ServiceReady(IAsyncResult ar)
{
//Lines of code
System.IO.StreamReader streamRead = new System.IO.StreamReader(streamResponse);
string responseString = streamRead.ReadToEnd();
searchresult = responseString;
}
public string returner()
{
return searchresult;
}
In the XAML page's code-behind I call the following code:
help.searchOnline(inputtextbox.Text);//line 1
outputtextbox.Text = help.returner();//line 2
outputtextbox.UpdateLayout();
My problem is how to make line 2 wait to update the TextBlock until the response has been received, i.e. until the callback chain started by line 1 has updated the search result string.
You don't need to rewrite the methods to be synchronous; you just want the program to continue on to updating outputtextbox once the data is there, so the operation that acquires the data should be awaited.
But if you need to perform other operations while this is in progress, you can use a Task:
(http://msdn.microsoft.com/en-us/library/system.threading.tasks.task(v=vs.110).aspx)
Through this class you can attach a result object (TResult) to the operation, then check its state and proceed the way you want, for example through the property:
public bool IsCompleted {get; }
which would tell you whether searchOnline has completed.
In addition to this property, there are methods such as Wait, which literally waits for the Task to complete execution, i.e. it waits for searchOnline to finish before you update outputtextbox.
I would like to reiterate that it is well worth studying the features of the System.Threading.Tasks.Task class.
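As a rough sketch of that idea, reusing the method names from the question (and assuming async/await is available on your target framework, e.g. via Microsoft.Bcl.Async on older Windows Phone projects), you could surface the whole callback chain as a single Task<string>:
// Completed by the last callback in the chain.
private TaskCompletionSource<string> _tcs;

public Task<string> SearchOnlineAsync(string inputtxt)
{
    _tcs = new TaskCompletionSource<string>();
    searchOnline(inputtxt); // kicks off the BeginGetRequestStream chain
    return _tcs.Task;
}

private void ServiceReady(IAsyncResult ar)
{
    //Lines of code
    var streamRead = new System.IO.StreamReader(streamResponse);
    string responseString = streamRead.ReadToEnd();
    _tcs.TrySetResult(responseString); // completes the Task with the response
}

// In the page (the containing event handler must be marked async):
// outputtextbox.Text = await help.SearchOnlineAsync(inputtextbox.Text);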
I hope it helps :)
I may be misunderstanding the flow of control, because by all accounts this seems like it should work. This is a Windows Phone 8 app. I am attempting to make a web request and then display the returned data. I am trying to get the data (here called 'key') in the following method:
public Task<String> getSingleStockQuote(String URI)
{
return Task.Run(() =>
{
String key = null;
HttpWebRequest request = HttpWebRequest.Create(URI) as HttpWebRequest;
HttpWebResponse response;
try
{
request.BeginGetResponse((asyncres) =>
{
HttpWebRequest responseRequest = (HttpWebRequest)asyncres.AsyncState;
response = (HttpWebResponse)responseRequest.EndGetResponse(asyncres);
key = returnStringFromStream(response.GetResponseStream());
System.Diagnostics.Debug.WriteLine(key);
}, request);
}
catch (Exception e)
{
System.Diagnostics.Debug.WriteLine("WebAccessRT getSingleStockQuote threw exception");
key = String.Empty;
}
return key;
});
}
...And I am calling this method like so:
WebAccessRT rt = new WebAccessRT();
await rt.getSingleStockQuote(stockTagURI);
System.Diagnostics.Debug.WriteLine("Past load data");
The WriteLine() in the BeginGetResponse callback is for testing purposes; it prints after "Past load data". I want BeginGetResponse to run and complete its operation (thus setting key) before the task returns. The data prints correctly to the console, just not in the desired order: key does get set and has a value, but only in the very last part that runs. Can someone point me in the right direction and/or see what's causing the problem above? Thinking this through, is the await operator simply waiting for the Task to return, and the Task returns right after spinning off its async call?
BeginGetResponse starts an asynchronous operation (hence the callback), so you cannot guarantee when it completes; your Task.Run lambda returns key before the callback has run. Remember that the code inside the BeginGetResponse callback is actually a separate method (see closures) that executes independently of getSingleStockQuote. You would need to await the response, e.g. via GetResponseAsync or by wrapping the begin/end pair in a Task, or (IMO even better, since you could greatly simplify your code) use HttpClient.
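For example, one way to restructure getSingleStockQuote so the await really does wait for the data is to wrap the begin/end pair with Task.Factory.FromAsync. This is only a sketch (I've renamed the method and assumed returnStringFromStream simply reads the stream into a string):
public async Task<string> GetSingleStockQuoteAsync(string uri)
{
    var request = (HttpWebRequest)WebRequest.Create(uri);
    try
    {
        // The await does not complete until EndGetResponse has produced a response.
        WebResponse response = await Task.Factory.FromAsync(
            request.BeginGetResponse, request.EndGetResponse, null);
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
    catch (WebException)
    {
        System.Diagnostics.Debug.WriteLine("GetSingleStockQuoteAsync failed");
        return string.Empty;
    }
}

// Caller: string key = await rt.GetSingleStockQuoteAsync(stockTagURI);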
I’m developing a Windows Phone 7.1 application, and trying to implement tombstoning.
For legal reasons I can't save my view model. I'm only saving an encrypted session ID, which can be used to load the view model data from the remote server.
On resume, I need to verify the session ID: if it has expired, I take the user to the login page of my app; if it's still OK, I reload the view model data from the server.
The problem is that HttpWebRequest lacks a blocking API. Moreover, while inside the page.OnNavigatedTo method after de-tombstoning, the method described here blocks forever.
I’ve worked around the problem by presenting my own splash screen.
However, I'd rather complete those RPC calls while the system-provided “Resuming…” splash screen is visible, i.e. before I return from the page.OnNavigatedTo method.
Any ideas how I can complete HTTP requests synchronously while inside page.OnNavigatedTo after de-tombstoning?
Let me start out by saying that Microsoft really tries to push you to do async calls for good reasons, which is why I wanted to emphasize it.
Now, if you really want to do it synchronously, I have an idea which I haven't been able to test myself. When using the HttpWebRequest class, there are two important methods, which you've probably used as well: BeginGetResponse and EndGetResponse.
These two methods work closely together: BeginGetResponse starts an asynchronous web request, and when the request has finished, EndGetResponse gives you the output. This is the way MS intends you to use it. The trick to doing it synchronously is that BeginGetResponse returns an IAsyncResult, and IAsyncResult exposes an AsyncWaitHandle that can be used to wait synchronously until the request is done, after which you can just call EndGetResponse and go on with your business. The same goes for BeginGetRequestStream and EndGetRequestStream.
But as I said before, I haven't tested this solution and it's purely theoretical. Let me know if it worked or not.
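In code, the idea looks roughly like this. It is untested, as noted above, the URL and the 10-second timeout are placeholders, and blocking like this on the UI thread is exactly the kind of thing that can hang in OnNavigatedTo, so treat it purely as a sketch:
var request = (HttpWebRequest)WebRequest.Create("http://example.com/session/check");
IAsyncResult asyncResult = request.BeginGetResponse(null, null);

// Block the current thread until the request completes (or 10 seconds pass).
if (asyncResult.AsyncWaitHandle.WaitOne(10000))
{
    var response = (HttpWebResponse)request.EndGetResponse(asyncResult);
    using (var reader = new StreamReader(response.GetResponseStream()))
    {
        string body = reader.ReadToEnd();
    }
}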
Good luck!
Update: another option is to use Reactive Extensions.
If you're on VS2010 you can install the AsyncCTP; when you do, an extension method is added that allows you to await the response.
static async Task<Stream> AsynchronousDownload(string url)
{
WebRequest request = WebRequest.Create(url);
WebResponse response = await request.GetResponseAsync();
return (response.GetResponseStream());
}
then (updated):
protected override async void OnNavigatedTo(NavigationEventArgs e)
{
base.OnNavigatedTo(e);
var myResponse = await AsynchronousDownload("http://stackoverflow.com");
}
or
If you're using VS2012 you can install the Microsoft.Bcl.Async lib and do the same thing as with the AsyncCTP: await the response.
or
You could implement something similar to Coroutines in Caliburn Micro. For this you implement the IResult interface.
public interface IResult
{
void Execute(ActionExecutionContext context);
event EventHandler<ResultCompletionEventArgs> Completed;
}
A possible implementation:
public class HttpWebRequestResult : IResult
{
public HttpWebRequest HttpWebRequest { get; set; }
public string Result { get; set; }
public HttpWebRequestResult(string url)
{
HttpWebRequest = (HttpWebRequest) HttpWebRequest.Create(url);
}
public void Execute (ActionExecutionContext context)
{
HttpWebRequest.BeginGetResponse (Callback, HttpWebRequest);
}
public void Callback (IAsyncResult asyncResult)
{
var httpWebRequest = (HttpWebRequest)asyncResult.AsyncState;
var httpWebResponse = (HttpWebResponse) httpWebRequest.EndGetResponse(asyncResult);
using (var reader = new StreamReader(httpWebResponse.GetResponseStream()))
Result = reader.ReadToEnd();
Completed (this, new ResultCompletionEventArgs ());
}
public event EventHandler<ResultCompletionEventArgs> Completed = delegate { };
}
Then to call it:
var httpWebRequestResult = new HttpWebRequestResult("http://www.google.com");
yield return httpWebRequestResult;
var result = httpWebRequestResult.Result;
This might be an example of grabbing the Coroutines implementation from CM and using it separately.
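Note that the yield return above only works inside an iterator method that something actually drives; in Caliburn.Micro that driver is, if I remember the API correctly, Coroutine.BeginExecute. A hypothetical call site (ProcessHtml is a made-up placeholder):
private IEnumerable<IResult> LoadPage()
{
    var httpWebRequestResult = new HttpWebRequestResult("http://www.google.com");
    yield return httpWebRequestResult;

    // Runs only after Completed has fired, so Result is populated here.
    ProcessHtml(httpWebRequestResult.Result);
}

// Kick it off, e.g. from a view model or an event handler:
Coroutine.BeginExecute(LoadPage().GetEnumerator());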