C# HttpWebRequest.BeginGetResponse blocks in one class but not another

I am validating a list of proxies using HttpWebRequest.BeginGetResponse. It works really well: I can validate thousands of proxies in seconds, and it doesn't block.
In another class within my project, I am calling the same code and it blocks.
Proxy validation method (Doesn't block):
public void BeginTest(IProxyTest test, Action<ProxyStatus> callback, int timeout = 10000)
{
var req = HttpWebRequest.Create(test.URL);
req.Proxy = new WebProxy(this.ToString());
req.Timeout = timeout;
WebHelper.BeginGetResponse(req, new Action<RequestCallbackState>(callbackState =>
{
if (callbackState.Exception != null)
{
callback(ProxyStatus.Invalid);
}
else
{
var responseStream = callbackState.ResponseStream;
using (var reader = new StreamReader(responseStream))
{
var responseString = reader.ReadToEnd();
if (responseString.Contains(test.Validation))
{
callback(ProxyStatus.Valid);
}
else
{
callback(ProxyStatus.Invalid);
}
}
}
}));
}
WebHelper.BeginGetResponse
public static void BeginGetResponse(WebRequest request, Action<RequestCallbackState> responseCallback)
{
Task<WebResponse> asyncTask = Task.Factory.FromAsync<WebResponse>(request.BeginGetResponse, request.EndGetResponse, null);
ThreadPool.RegisterWaitForSingleObject((asyncTask as IAsyncResult).AsyncWaitHandle, new WaitOrTimerCallback(TimeoutCallback), request, request.Timeout, true);
asyncTask.ContinueWith(task =>
{
WebResponse response = task.Result;
Stream responseStream = response.GetResponseStream();
responseCallback(new RequestCallbackState(responseStream));
responseStream.Close();
response.Close();
}, TaskContinuationOptions.NotOnFaulted);
//Handle errors
asyncTask.ContinueWith(task =>
{
var exception = task.Exception;
responseCallback(new RequestCallbackState(exception.InnerException));
}, TaskContinuationOptions.OnlyOnFaulted);
}
Other class with a similar method that also calls WebHelper.BeginGetResponse, but blocks (why?)
public void BeginTest(Action<ProxyStatus> callback, int timeout = 10000)
{
var req = HttpWebRequest.Create(URL);
req.Timeout = timeout;
WebHelper.BeginGetResponse(req, new Action<RequestCallbackState>(callbackState =>
{
if (callbackState.Exception != null)
{
callback(ProxyStatus.Invalid);
}
else
{
var responseStream = callbackState.ResponseStream;
using (var reader = new StreamReader(responseStream))
{
var responseString = reader.ReadToEnd();
if (responseString.Contains(Validation))
{
callback(ProxyStatus.Valid);
}
else
{
callback(ProxyStatus.Invalid);
}
}
}
}));
}
Calling code which blocks:
private async void validateTestsButton_Click(object sender, EventArgs e)
{
await Task.Run(() =>
{
foreach (var test in tests)
{
test.BeginTest((status) => test.Status = status);
}
});
}
Calling code which doesn't block:
public static async Task BeginTests(ICollection<Proxy> proxies, ICollection<ProxyJudge> judges, int timeout = 10000, IProgress<int> progress = null)
{
await Task.Run(() =>
{
foreach (var proxy in proxies)
{
proxy.BeginTest(judges.GetRandomItem(), new Action<ProxyStatus>(status =>
{
proxy.Status = status;
}), timeout);
}
});
}

Although this doesn't address your problem exactly, it might help you out a little.
Here are a few problems:
You are using APM (the Asynchronous Programming Model)
You are using the ThreadPool class, which seems a little old-fashioned
You are doing I/O-bound work and blocking threads on the thread pool
You are using a weird mix of the APM and TAP asynchronous models
And you are seemingly tying up your thread pool waiting on I/O
So you are doing I/O-bound work, and the best pattern to use, as you might have guessed, is the TAP async/await pattern. Basically, every time you wait on an I/O completion port you want to give that thread back to the operating system, in turn freeing up resources for where they are needed.
Also, you obviously want some degree of parallelism, and it's best to have at least some control over it.
I would suggest this is a nice job for TPL Dataflow and an ActionBlock.
Given
public class Proxy
{
public ProxyStatus ProxyStatus { get; set; }
public string ProxyUrl { get; set; }
public string Url { get; set; }
public string Error { get; set; }
}
ActionBlock Example
public static async Task DoWorkLoads(List<Proxy> results)
{
var options = new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = 50
};
var block = new ActionBlock<Proxy>(CheckUrlAsync, options);
foreach (var proxy in results)
{
block.Post(proxy);
}
block.Complete();
await block.Completion;
}
CheckUrlAsync Example
// note: I haven't tested this, add pepper and salt to taste
public static async Task CheckUrlAsync(Proxy proxy)
{
try
{
var request = WebRequest.Create(proxy.Url);
if (proxy.ProxyUrl != null)
request.Proxy = new WebProxy(proxy.ProxyUrl);
using (var response = await request.GetResponseAsync())
{
using (var responseStream = response.GetResponseStream())
{
using (var reader = new StreamReader(responseStream))
{
var responseString = reader.ReadToEnd();
if (responseString.Contains("asdasd"))
proxy.ProxyStatus = ProxyStatus.Valid;
else
proxy.ProxyStatus = ProxyStatus.Invalid;
}
}
}
}
catch (Exception e)
{
proxy.ProxyStatus = ProxyStatus.Error;
proxy.Error = e.Message;
}
}
Usage
await DoWorkLoads(proxies to test);
Summary
The code is neater: you aren't throwing actions all over the place, you are using async and await, you have ditched APM, you have control over the degree of parallelism, and you are being nice to the thread pool.

I solved the problem by wrapping the code of the mysteriously blocking BeginTest method in an Action, and then calling BeginInvoke on that Action.
I deduced this was caused by not setting the Proxy property on the HttpWebRequest in that method, which seemed to be causing a synchronous lookup of my system's proxy.
public void BeginTest(Action<ProxyStatus> callback, int timeout = 10000)
{
var action = new Action(() =>
{
var req = HttpWebRequest.Create(URL);
req.Timeout = timeout;
WebHelper.BeginGetResponse(req, new Action<RequestCallbackState>(callbackState =>
{
if (callbackState.Exception != null)
{
callback(ProxyStatus.Invalid);
}
else
{
var responseStream = callbackState.ResponseStream;
using (var reader = new StreamReader(responseStream))
{
var responseString = reader.ReadToEnd();
if (responseString.Contains(Validation))
{
callback(ProxyStatus.Valid);
}
else
{
callback(ProxyStatus.Invalid);
}
}
}
}));
});
action.BeginInvoke(null, null);
}
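If the synchronous system-proxy lookup really is the culprit, a lighter-weight alternative (a sketch, not something tested against this codebase) is to assign the request's Proxy property explicitly so the framework never has to auto-detect the system proxy on the calling thread. Note that Proxy = null means the request goes direct, which only makes sense here because this method intentionally does not use a proxy:
public void BeginTest(Action<ProxyStatus> callback, int timeout = 10000)
{
    var req = HttpWebRequest.Create(URL);
    req.Timeout = timeout;
    // Assigning Proxy explicitly (null means "no proxy") skips the automatic
    // system-proxy detection that can otherwise run synchronously on first use.
    req.Proxy = null;

    WebHelper.BeginGetResponse(req, callbackState =>
    {
        if (callbackState.Exception != null)
        {
            callback(ProxyStatus.Invalid);
            return;
        }
        using (var reader = new StreamReader(callbackState.ResponseStream))
        {
            callback(reader.ReadToEnd().Contains(Validation)
                ? ProxyStatus.Valid
                : ProxyStatus.Invalid);
        }
    });
}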

Related

SSIS Script Task using client.GetAsync(url) not waiting for response

I have an API call using client.GetAsync(url) within an SSIS script task, but for some reason it is not waiting for the response from the API and jumps back to the entry point of the script task, which is public void Main(). I've done some reading and found that this can result in a deadlock, but I tried all the variations I could find with no luck. Something else I don't understand is that the exact same code runs on a webpage and works perfectly: it waits for the response from the API and continues the flow.
Script Task entry point
The value of payload here is: Id = 5, Status = WaitingForActivation, Method = "{null}", Result = "{Not yet computed}".
If I debug and step back through the process again, I notice there are two threads: the one currently executing, and the old one holding the response I was expecting on the first call. I'm not sure what that means.
public void Main() {
// TODO: Add your code here
try {
PackageDetails packageInfo = new PackageDetails {
PackageNumber = 1234567891, Env = "Development", UserName = "USER"
};
var payload = API.ListHeadByPackAsync(packageInfo);
//var test = GetResponse();
Dts.TaskResult = (int) ScriptResults.Success;
} catch (Exception ex) {
System.Console.Write(ex.Message);
Dts.TaskResult = (int) ScriptResults.Failure;
}
}
API Call
public static class API {
public static async Task<PackageDetails> ListHeadByPackAsync(PackageDetails package) {
PackageDetails packageInfo = new PackageDetails();
try {
using(var client = new ApiClient(requestUrl, authToken)) {
var response = await client.GetAsync(); //-> not waiting for response
}
} catch (Exception err) {
switch (err.Message) {
//TODO:
}
}
return packageInfo;
}
}
Client
public class ApiClient: IDisposable {
private readonly TimeSpan _timeout;
private HttpClient _httpClient;
private HttpClientHandler _httpClientHandler;
private readonly string _baseUrl;
private readonly string _credentials;
//private const string MediaTypeXml = "application/csv";
public ApiClient(string baseUrl, string authToken, TimeSpan ? timeout = null) {
_baseUrl = baseUrl;
_credentials = Base64Encode(authToken);
_timeout = timeout ?? TimeSpan.FromSeconds(90);
}
public async Task < string > GetAsync() {
EnsureHttpClientCreated();
using(var response = await _httpClient.GetAsync(_baseUrl).ConfigureAwait(continueOnCapturedContext: false))
//-> after executing above line it will go straight to public void Main(), dose not wait for response
{
response.EnsureSuccessStatusCode();
return await response.Content.ReadAsStringAsync();
}
}
public void Dispose() {
_httpClientHandler?.Dispose();
_httpClient?.Dispose();
}
private void CreateHttpClient() {
_httpClientHandler = new HttpClientHandler {
AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip
};
_httpClient = new HttpClient(_httpClientHandler, false) {
Timeout = _timeout
};
if (!string.IsNullOrWhiteSpace(_baseUrl)) {
_httpClient.BaseAddress = new Uri(_baseUrl);
}
_httpClient.DefaultRequestHeaders.Add("Authorization", "Basic" + " " + _credentials);
}
private void EnsureHttpClientCreated() {
if (_httpClient == null) {
//ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
CreateHttpClient();
}
}
public static string Base64Encode(string token) {
var tokenBytes = System.Text.Encoding.UTF8.GetBytes(token);
return System.Convert.ToBase64String(tokenBytes);
}
}
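No accepted answer is recorded here, but the payload value (Status = WaitingForActivation) shows that the returned Task is never awaited, so Main finishes before the HTTP call completes. Since an SSIS script task's Main cannot be async, one minimal sketch (untested, reusing the question's own types) is to block on the task explicitly:
public void Main()
{
    try
    {
        var packageInfo = new PackageDetails
        {
            PackageNumber = 1234567891, Env = "Development", UserName = "USER"
        };
        // Block until the asynchronous call completes. A script task normally has no
        // UI SynchronizationContext, so this should not deadlock the way it would in
        // WinForms/ASP.NET code, but that is an assumption worth verifying.
        PackageDetails payload = API.ListHeadByPackAsync(packageInfo).GetAwaiter().GetResult();
        Dts.TaskResult = (int)ScriptResults.Success;
    }
    catch (Exception ex)
    {
        System.Console.Write(ex.Message);
        Dts.TaskResult = (int)ScriptResults.Failure;
    }
}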

Prevent task from starting when added to a concurrent queue

I am trying to add my task to a custom concurrent queue, but it keeps starting. How can I add the task object to the queue without starting it, so I can start it later in the code? Basically, what it's supposed to do for each piece is get the request stream, stream it to a file, and concurrently start the next piece.
My custom concurrent queue:
public sealed class EventfulConcurrentQueue<T> : ConcurrentQueue<T>
{
public ConcurrentQueue<T> Queue;
public EventfulConcurrentQueue()
{
Queue = new ConcurrentQueue<T>();
}
public void Enqueue(T item)
{
Queue.Enqueue(item);
OnItemEnqueued();
}
public int Count => Queue.Count;
public bool TryDequeue(out T result)
{
var success = Queue.TryDequeue(out result);
if (success)
{
OnItemDequeued(result);
}
return success;
}
public event EventHandler ItemEnqueued;
public event EventHandler<ItemDequeuedEventArgs<T>> ItemDequeued;
void OnItemEnqueued()
{
ItemEnqueued?.Invoke(this, EventArgs.Empty);
}
void OnItemDequeued(T item)
{
ItemDequeued?.Invoke(this, new ItemDequeuedEventArgs<T> { Item = item });
}
}
public sealed class ItemDequeuedEventArgs<T> : EventArgs
{
public T Item { get; set; }
}
The code I'm using to add the task to the Queue:
Parallel.ForEach(pieces, piece =>
{
//Open a http request with the range
var request = new HttpRequestMessage { RequestUri = new Uri(url) };
request.Headers.Range = new RangeHeaderValue(piece.start, piece.end);
//Send the request
var downloadTask = client.SendAsync(request).Result;
//Use interlocked to increment Tasks done by one
Interlocked.Add(ref OctaneEngine.TasksDone, 1);
//Add the task to the queue along with the start and end value
asyncTasks.Enqueue(new Tuple<Task, FileChunk>(
downloadTask.Content.ReadAsStreamAsync().ContinueWith(
task =>
{
using (var fs = new FileStream(piece._tempfilename,
FileMode.OpenOrCreate, FileAccess.Write))
{
task.Result.CopyTo(fs);
}
}), piece));
});
The code I am using to later start the tasks:
Parallel.ForEach(asyncTasks.Queue, async (task, state) =>
{
if (asyncTasks.Count > 0)
{
await Task.Run(() => task);
asyncTasks.TryDequeue(out task);
Interlocked.Add(ref TasksDone, 1);
}
});
I'm not sure what is going on, so any help would be greatly appreciated. Thank you!
It seems to me that you're over-complicating things here.
I'm not sure you need any queue at all.
This approach allows the network access work to occur concurrently:
using var semaphore = new SemaphoreSlim(1);
var tasks = pieces.Select(async piece => // the lambda must be async because it awaits
{
    var request = new HttpRequestMessage { RequestUri = new Uri(url) };
    request.Headers.Range = new RangeHeaderValue(piece.start, piece.end);
    // Send the request
    var download = await client.SendAsync(request);
    // Use Interlocked to increment the number of tasks done by one
    Interlocked.Increment(ref OctaneEngine.TasksDone);
    var stream = await download.Content.ReadAsStreamAsync();
    await semaphore.WaitAsync(); // Only allow one concurrent file write
    try
    {
        using (var fs = new FileStream(piece._tempfilename, FileMode.OpenOrCreate,
            FileAccess.Write))
        {
            await stream.CopyToAsync(fs);
        }
    }
    finally
    {
        semaphore.Release(); // release even if the copy throws
    }
    Interlocked.Increment(ref TasksDone);
});
await Task.WhenAll(tasks);
Because your work involves I/O, there is little benefit trying to use multiple threads via Parallel.ForEach; awaiting the async methods will release threads to process other requests.
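If you also want to cap how many pieces download at once (the Select above starts them all immediately), a second SemaphoreSlim can act as a throttle. This is only a sketch, and the limit of 4 is arbitrary:
var downloadGate = new SemaphoreSlim(4); // at most 4 pieces in flight at once (arbitrary)
var tasks = pieces.Select(async piece =>
{
    await downloadGate.WaitAsync();
    try
    {
        // ... same request / stream / file-copy body as above ...
    }
    finally
    {
        downloadGate.Release();
    }
});
await Task.WhenAll(tasks);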

Programmatically trigger listener.GetContext()

Is it possible to trigger the below code by using a trigger URL?
As opposed to triggering by visiting the URL in the browser.
var context = listener.GetContext();
Something like this?
var triggerURL = "http://www.google.ie/";
var request = (HttpWebRequest)WebRequest.Create(triggerURL);
Or is it possible to use a do-while loop, i.e. do create trigger while get context?
Instead of using listener.GetContext(), I was able to satisfy my requirement by using listener.BeginGetContext(new AsyncCallback(ListenerCallback), listener) and listener.EndGetContext(result), utilising the asynchronous call GetAsync.
public static string RunServerAsync(Action<string> triggerPost)
{
var triggerURL = "";
CommonCode(ref triggerURL);
if (listener.IsListening)
{
triggerPost(triggerURL);
}
while (listener.IsListening)
{
var context = listener.BeginGetContext(new AsyncCallback(ListenerCallback), listener);
context.AsyncWaitHandle.WaitOne(20000, true); //Stop listening after 20 seconds (20 * 1000).
listener.Close();
}
return plateString;
}
private static async void TriggerURL(string url)
{
var r = await DownloadPage(url);
}
static async Task<string> DownloadPage(string url)
{
using (var client = new HttpClient())
{
using (var r = await client.GetAsync(new Uri(url)))
{
if (r.IsSuccessStatusCode)
{
string result = await r.Content.ReadAsStringAsync();
return result;
}
else
{
return r.StatusCode.ToString();
}
}
}
}
private static void ListenerCallback(IAsyncResult result)
{
try
{
HttpListener listener = (HttpListener)result.AsyncState;
// Use EndGetContext to complete the asynchronous operation.
HttpListenerContext context = listener.EndGetContext(result);
if (context != null)
{
plateString = ProcessRequest(context);
}
else
{
plateString = "No response received!";
}
}
catch (Exception ex)
{
NLogManager.LogException(ex);
}
}

BeginGetContext performance

I have seen a lot of examples of BeginGetContext, but I have a feeling that all of them waste time. Maybe I am wrong; let's find out. Let's take, for instance, the really good example from the "Multi-threading with .Net HttpListener" topic:
public class HttpListenerCallbackState
{
private readonly HttpListener _listener;
private readonly AutoResetEvent _listenForNextRequest;
public HttpListenerCallbackState(HttpListener listener)
{
if (listener == null) throw new ArgumentNullException("listener");
_listener = listener;
_listenForNextRequest = new AutoResetEvent(false);
}
public HttpListener Listener { get { return _listener; } }
public AutoResetEvent ListenForNextRequest { get { return _listenForNextRequest; } }
}
public class HttpRequestHandler
{
private int requestCounter = 0;
private ManualResetEvent stopEvent = new ManualResetEvent(false);
public void ListenAsynchronously(IEnumerable<string> prefixes)
{
HttpListener listener = new HttpListener();
foreach (string s in prefixes)
{
listener.Prefixes.Add(s);
}
listener.Start();
HttpListenerCallbackState state = new HttpListenerCallbackState(listener);
ThreadPool.QueueUserWorkItem(Listen, state);
}
public void StopListening()
{
stopEvent.Set();
}
private void Listen(object state)
{
HttpListenerCallbackState callbackState = (HttpListenerCallbackState)state;
while (callbackState.Listener.IsListening)
{
callbackState.Listener.BeginGetContext(new AsyncCallback(ListenerCallback), callbackState);
int n = WaitHandle.WaitAny(new WaitHandle[] { callbackState.ListenForNextRequest, stopEvent});
if (n == 1)
{
// stopEvent was signalled
callbackState.Listener.Stop();
break;
}
}
}
private void ListenerCallback(IAsyncResult ar)
{
HttpListenerCallbackState callbackState = (HttpListenerCallbackState)ar.AsyncState;
HttpListenerContext context = null;
int requestNumber = Interlocked.Increment(ref requestCounter);
try
{
context = callbackState.Listener.EndGetContext(ar);
}
catch (Exception ex)
{
return;
}
finally
{
callbackState.ListenForNextRequest.Set();
}
if (context == null) return;
HttpListenerRequest request = context.Request;
if (request.HasEntityBody)
{
using (System.IO.StreamReader sr = new System.IO.StreamReader(request.InputStream, request.ContentEncoding))
{
string requestData = sr.ReadToEnd();
//Stuff I do with the request happens here
}
}
try
{
using (HttpListenerResponse response = context.Response)
{
//response stuff happens here
string responseString = "Ok";
byte[] buffer = Encoding.UTF8.GetBytes(responseString);
response.ContentLength64 = buffer.LongLength;
response.OutputStream.Write(buffer, 0, buffer.Length);
response.Close();
}
}
catch (Exception e)
{
}
}
}
Here we can see the main part for this topic:
while (callbackState.Listener.IsListening)
{
callbackState.Listener.BeginGetContext(new AsyncCallback(ListenerCallback), callbackState);
int n = WaitHandle.WaitAny(new WaitHandle[] { callbackState.ListenForNextRequest, stopEvent});
...
}
I can see this pattern in more or less all examples.
We start getting the context (getting the request = getting the network stream with the request), and then we block the current thread with a Wait/WaitAny call, so the thread that is getting the request does nothing until it has received the full request, and only then does it start getting a new one. For example, if we have a WCF request with a large object (which is deserialized from this request and serialized the same way on the other side), we will wait and WASTE TIME until we have received the full stream for this request.
It seems to me that this is really the synchronous way, not the asynchronous one. Instead, we could start getting the second request right after starting to get the first, without calling Wait and without blocking a thread. I think we would get much better performance that way. What do you think? Why do all the examples contain Wait?
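No answer is shown here, but the pattern the question is driving at, starting the next accept from inside the callback instead of blocking the accept loop with WaitAny, can be sketched roughly like this (untested; error handling and shutdown kept minimal):
private void Listen(object state)
{
    var callbackState = (HttpListenerCallbackState)state;
    // Issue the first accept; each completed accept queues the next one itself.
    callbackState.Listener.BeginGetContext(ListenerCallback, callbackState);
}
private void ListenerCallback(IAsyncResult ar)
{
    var callbackState = (HttpListenerCallbackState)ar.AsyncState;
    HttpListenerContext context;
    try
    {
        context = callbackState.Listener.EndGetContext(ar);
    }
    catch (Exception)
    {
        return; // listener stopped or the accept failed
    }
    // Start accepting the next request immediately, before handling this one.
    if (callbackState.Listener.IsListening)
    {
        callbackState.Listener.BeginGetContext(ListenerCallback, callbackState);
    }
    // ... handle context.Request / context.Response here, as in the original example ...
}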

Async unit testing with portable class libraries

I have a portable class library that needs to target at least .NET 4.5 and Silverlight 5. I'm running into an issue trying to write MSTest unit tests in VS 2012 because my library does not use the new async/await paradigm. Is there any way I can test this method?
public static void Get(string uri, string acceptHeader, Action<string> callback)
{
var request = (HttpWebRequest)WebRequest.Create(uri);
request.Accept = acceptHeader;
request.BeginGetResponse(o =>
{
var r = o.AsyncState as HttpWebRequest;
try
{
var response = r.EndGetResponse(o);
using (var sr = new StreamReader(response.GetResponseStream()))
{
callback(sr.ReadToEnd());
}
}
catch (Exception ex)
{
throw new WebException(string.Format("Unable to access {0}", uri), ex);
}
}, request);
}
First, I recommend you reconsider async/await. It's the wave of the future. Microsoft.Bcl.Async provides async support to portable libraries targeting .NET 4.5 and SL5.
But if you don't want to do that, you can still use async unit tests:
[TestMethod]
public async Task Get_RetrievesExpectedString()
{
var tcs = new TaskCompletionSource<string>();
var client = new ... // arrange
client.Get(uri, acceptHeader, result =>
{
tcs.SetResult(result);
});
var actual = await tcs.Task;
Assert.AreEqual(expected, actual);
}
Or if you want, you can do it "old-school":
[TestMethod]
public void Get_RetrievesExpectedString()
{
var mre = new ManualResetEvent(initialState: false);
string actual = null;
var client = new ... // arrange
client.Get(uri, acceptHeader, result =>
{
actual = result;
mre.Set();
});
mre.WaitOne();
Assert.AreEqual(expected, actual);
}
I just can't resist refactoring the code. You can use a closure to do the following:
public static void Get(string uri, string acceptHeader, Action<string> callback)
{
var request = (HttpWebRequest)WebRequest.Create(uri);
request.Accept = acceptHeader;
request.BeginGetResponse(o =>
{
try
{
var response = request.EndGetResponse(o);
using (var sr = new StreamReader(response.GetResponseStream()))
{
callback(sr.ReadToEnd());
}
}
catch (Exception ex)
{
throw new WebException(string.Format("Unable to access {0}", uri), ex);
}
}, null);
}
However, at the end of the day, you can just do the following:
public static async void Get(string uri, string acceptHeader, Action<string> callback)
{
    var request = (HttpWebRequest)WebRequest.Create(uri);
    request.Accept = acceptHeader;
    // Use the BeginGetResponse/EndGetResponse pair: we are downloading the response,
    // not writing a request body.
    var response = await Task.Factory.FromAsync<WebResponse>(
        request.BeginGetResponse,
        request.EndGetResponse,
        null);
    using (var sr = new StreamReader(response.GetResponseStream()))
    {
        callback(sr.ReadToEnd());
    }
}
Okay, so here is how I would do it:
async Task Main()
{
    {...}
    var request = (HttpWebRequest)WebRequest.Create(uri);
    request.Accept = acceptHeader;
    var response = await request.DownloadStringTaskAwait();
    DoSomeStuff(response);
}
// Define other methods and classes here
public static class HttpWebRequestExtension
{
    // Extension methods must be static; use the BeginGetResponse/EndGetResponse
    // pair, since we are reading the response rather than writing a request body.
    public static async Task<string> DownloadStringTaskAwait(this HttpWebRequest request)
    {
        var response = await Task.Factory.FromAsync<WebResponse>(
            request.BeginGetResponse,
            request.EndGetResponse,
            null);
        using (var sr = new StreamReader(response.GetResponseStream()))
        {
            return sr.ReadToEnd();
        }
    }
}
