So I have a block of code like
EventLog.WriteEntry("About to run the task");
// run the dequeued task
var task = PageRetrievals.Dequeue();
PageRetrieval retrieval = new PageRetrieval();
var continuation = task.ContinueWith(t => retrieval = t.Result);
task.Wait();
EventLog.WriteEntry("html = " + retrieval.Html.DocumentNode.OuterHtml);
where the WriteEntry calls are just my sanity check that this is working. But the second one isn't getting called, and I'm trying to figure out why my code isn't reaching that point.
The above block of code is inside a method that is
MyTimer.Elapsed += new ElapsedEventHandler(MethodThatInvokesTheAboveCode);
and the type of task is like
PageRetrievals.Enqueue(new Task<PageRetrieval>(() =>
new PageRetrieval()
{
Html = PageOpener.GetBoardPage(pagenum),
Page = PageType.Board,
Number = pagenum
}
));
where PageOpener.GetBoardPage simply gets the HTML from a URL, like
private static HtmlDocument GetDocumentFromUrl(string url)
{
var webget = new HtmlWeb();
var doc = webget.Load(url);
return webget.StatusCode == System.Net.HttpStatusCode.OK ? doc : null;
}
public static HtmlDocument GetBoardPage(int pageNumber)
{
return GetDocumentFromUrl(string.Format(BoardPageUrlFormat, pageNumber));
}
Is there anything about this that looks obviously wrong?
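One thing that stands out, assuming nothing else calls Start() on the queued tasks: a task built with the Task<T> constructor is "cold" and never runs until it is started, so task.Wait() would block forever and the second WriteEntry would never be reached. Note also that waiting on the antecedent does not wait for its continuation, so reading the task's own Result is safer than relying on retrieval having been assigned. A minimal sketch of that idea follows; ColdTaskDemo and its method names are made up for illustration, and a string stands in for PageRetrieval.

```csharp
using System;
using System.Threading.Tasks;

public static class ColdTaskDemo
{
    // A task built with the constructor is "cold": it stays in
    // TaskStatus.Created until something calls Start() on it.
    public static Task<string> CreateCold() =>
        new Task<string>(() => "page html");

    // Starts a dequeued cold task if nobody has started it yet, then
    // blocks for the result (the equivalent of the question's task.Wait()).
    public static string RunAndGetResult(Task<string> task)
    {
        if (task.Status == TaskStatus.Created)
            task.Start();

        // Result blocks until the task finishes and rethrows any failure
        // wrapped in an AggregateException.
        return task.Result;
    }
}
```

If the queue does not need deferred execution, enqueueing Task.Run(...) instead of new Task<T>(...) gives a task that is already running.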
Well, I'm building a web parsing app and having some trouble making it async.
I have a method which creates async tasks, and a decorator for RestSharp so I can make requests via a proxy. Basically, the code just makes up to 5 attempts at requesting the webpage.
The task returns a RestResponse whose status code is always 0. This is the problem, because if I do the same thing synchronously, it works.
private static async Task<HtmlNode> GetTableAsync(int page)
{
ProxyClient client = new ProxyClient((name) =>ProxyProvider.GetCoreNoCD(name),
serviceName, 10000, 10000);
var task = client.TryGetAsync(new Uri(GetPageUrl(page)), (res) =>
{
return res.IsSuccessStatusCode && res.IsSuccessful;
},5);
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml((await task).Content);
return doc.DocumentNode.SelectSingleNode("//div[@class=\"table_block\"]/table");
}
And this works as expected, but synchronously.
private static async Task<HtmlNode> GetTableAsync(int page)
{
ProxyClient client = new ProxyClient((name) =>ProxyProvider.GetCoreNoCD(name),
serviceName, 10000, 10000);
var task = client.TryGetAsync(new Uri(GetPageUrl(page)), (res) =>
{
return res.IsSuccessStatusCode && res.IsSuccessful;
},5);
task.Wait();
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(task.Result.Content);
return doc.DocumentNode.SelectSingleNode("//div[@class=\"table_block\"]/table");
}
ProxyClient's insides:
public async Task<RestResponse?> TryGetAsync(Uri uri,
Predicate<RestResponse> condition, int tryCount = 15,
List<KeyValuePair<string, string>> query = null,
List<KeyValuePair<string, string>> headers = null,
Method method = Method.Get, string body = null)
{
WebClient? client = null;
RestResponse? res = null;
for(int i = 0; i < tryCount; i++)
{
try
{
client = new WebClient(source.Invoke(serviceName), serviceName, timeout);
res = await client.GetResponseAsync(uri, query, headers, method, body);
if (condition(res))
return res;
}
catch(Exception)
{
///TODO:add log maybe?
}
finally
{
if (client != null)
{
client.SetCDToProxy(new TimeSpan(cd));
client.Dispose();
}
}
}
return res;
}
I have no idea how to make it work with async and don't understand why it doesn't work as expected.
I think it might have to do with the Task.Wait(). I would consider changing it to await, like this:
private static async Task<HtmlNode> GetTableAsync(int page)
{
ProxyClient client = new ProxyClient((name) =>ProxyProvider.GetCoreNoCD(name),
serviceName, 10000, 10000);
var statusOk = false;
var result = await client.GetAsync(new Uri(GetPageUrl(page)));
statusOk = result.IsSuccessStatusCode &&
result.StatusCode == HttpStatusCode.OK;
//do what you want based on statusOk
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(result.Content);
return doc.DocumentNode.SelectSingleNode("//div[@class=\"table_block\"]/table");
}
I decided to try different solutions, and it seems to work only if I return the task result directly, like this:
ProxyClient client = new ProxyClient((name) => ProxyProvider.GetCoreNoCD(name),
serviceName, 10000, 10000);
return await client.TryGetAsync(new Uri(GetPageUrl(page)), (res) =>
{ return res.IsSuccessStatusCode && res.IsSuccessful; });
I thought it could be some kind of misunderstanding of async/await on my part, but it seems not. Maybe it's some kind of RestSharp bug.
I think you're just checking the result too early. You need to look at the result after the await:
var task = client.TryGetAsync(...);
// Too early to check
var x = await task;
// Check now
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(x.Content);
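To make the "too early" point concrete, here is a small self-contained sketch. GetStatusAsync is only a stand-in for client.TryGetAsync (it is not part of RestSharp or of the question's ProxyClient), and the 50 ms delay simulates network latency.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitThenCheck
{
    // Hypothetical stand-in for an async request: completes after a
    // short delay and yields an HTTP-like status code.
    public static async Task<int> GetStatusAsync()
    {
        await Task.Delay(50);
        return 200;
    }

    public static async Task<bool> IsOkAsync()
    {
        Task<int> task = GetStatusAsync();

        // At this point the request is normally still in flight, so
        // there is no response to inspect yet.

        int status = await task;   // suspends this method without blocking the thread

        // Only from here on does the response hold real data.
        return status == 200;
    }
}
```

Inspecting a response object in the debugger before the await has completed can show default values (such as a status code of 0), which may be what was observed.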
This question already has answers here:
Why does System.Threading.Timer stop on its own?
I'm currently creating a new function for my Discord bot which alternates the bot's playing status. We have some Squad game servers and are pulling the data from Battlemetrics via web.Load and using SelectSingleNode with the XPath of the HTML response.
Now, onto the issue. I have created an async Update task which starts the first threading timer. This then calls the first async void method, which makes the GET request for one server's information and then starts a new threading timer that calls the next async void method to pull the new details.
This works wonderfully for a while, until it stops completely. Is this something I have to change in the Threading.Timer overloads?
Code View
_client.Ready += Update; is what tells the bot, on ready please run the update async task.
public async Task Update()
{
var timer = new System.Threading.Timer(new TimerCallback(FooAsync), null, 2000, Timeout.Infinite);
}
This part of the code is what is needed (due to being an async task, I believe) to then start alternating between the two async voids:
public async void FooAsync(Object obj)
{
var html = @"https://www.battlemetrics.com/servers/squad/9310072";
HtmlWeb web = new HtmlWeb();
var htmlDoc = web.Load(html);
// J4F Vars
var j4f = htmlDoc.DocumentNode.SelectSingleNode("/html/body/div[1]/div/div/div/div/div/div[1]/div[1]/dl/dd[2]/span");
var j4f1 = htmlDoc.DocumentNode.SelectSingleNode("/html/body/div[1]/div/div/div/div/div/div[1]/div[1]/div/dl/dd[1]");
var Status = _client.SetGameAsync($"J4F - Amount of Players: {j4f.InnerText},\n" + $"Current Map: {j4f1.InnerText},\n", "", ActivityType.Playing);
await Status;
var vis = _client.SetStatusAsync(UserStatus.Online);
await vis;
var timer = new System.Threading.Timer(new TimerCallback(Next), null, 2000, Timeout.Infinite);
}
Number two:
public async void Next(Object obj)
{
var html = @"https://www.battlemetrics.com/servers/squad/9376512";
HtmlWeb web = new HtmlWeb();
var htmlDoc = web.Load(html);
// J4F Vars
var j4f = htmlDoc.DocumentNode.SelectSingleNode("/html/body/div[1]/div/div/div/div/div/div[1]/div[1]/dl/dd[2]/span");
var j4f1 = htmlDoc.DocumentNode.SelectSingleNode("/html/body/div[1]/div/div/div/div/div/div[1]/div[1]/div/dl/dd[1]");
var Status1 = _client.SetGameAsync($"TPA - Amount of Players: {j4f.InnerText},\n" + $"Current Map: {j4f1.InnerText},\n", "", ActivityType.Playing);
await Status1;
var vis1 = _client.SetStatusAsync(UserStatus.Online);
await vis1;
var timer = new System.Threading.Timer(new TimerCallback(FooAsync), null, 2000, Timeout.Infinite);
}
Given that FooAsync and Next are almost identical I'd rewrite like this:
public async Task FooAndNextAsync(string url_suffix, string code)
{
var html = $@"https://www.battlemetrics.com/servers/squad/{url_suffix}";
HtmlWeb web = new HtmlWeb();
var htmlDoc = web.Load(html);
// J4F Vars
var j4f = htmlDoc.DocumentNode.SelectSingleNode("/html/body/div[1]/div/div/div/div/div/div[1]/div[1]/dl/dd[2]/span");
var j4f1 = htmlDoc.DocumentNode.SelectSingleNode("/html/body/div[1]/div/div/div/div/div/div[1]/div[1]/div/dl/dd[1]");
await _client.SetGameAsync($"{code} - Amount of Players: {j4f.InnerText},\n" + $"Current Map: {j4f1.InnerText},\n", "", ActivityType.Playing);
await _client.SetStatusAsync(UserStatus.Online);
}
Then using a timer is as simple as this:
var flag = false;
TimerCallback cb = async _ =>
{
// Toggle exactly once per tick so the two servers alternate.
flag = !flag;
if (flag)
await FooAndNextAsync("9310072", "J4F");
else
await FooAndNextAsync("9376512", "TPA");
};
var timer = new System.Threading.Timer(
new TimerCallback(cb), null,
TimeSpan.FromSeconds(2.0), TimeSpan.FromSeconds(2.0));
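One caveat for the timer version: a System.Threading.Timer that is referenced only by a local variable becomes eligible for garbage collection, which is the usual reason such a timer "stops on its own". A minimal sketch that keeps the timer in a field is shown below. StatusRotator and its update delegate are hypothetical names; the delegate stands in for FooAndNextAsync.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class StatusRotator : IDisposable
{
    private static readonly (string Id, string Code)[] Servers =
    {
        ("9310072", "J4F"),
        ("9376512", "TPA"),
    };

    private readonly Func<string, string, Task> _update;
    private Timer _timer;      // held in a field so the timer stays reachable
    private int _ticks = -1;

    public StatusRotator(Func<string, string, Task> update) => _update = update;

    // Returns the servers in round-robin order, one per call.
    public (string Id, string Code) NextServer()
    {
        int n = Interlocked.Increment(ref _ticks);
        return Servers[n % Servers.Length];
    }

    public void Start(TimeSpan period) =>
        _timer = new Timer(OnTick, null, period, period);

    private async void OnTick(object state)
    {
        try
        {
            var (id, code) = NextServer();
            await _update(id, code);
        }
        catch (Exception ex)
        {
            // An exception escaping an async void callback would take
            // down the process, so log it and keep ticking.
            Console.Error.WriteLine(ex);
        }
    }

    public void Dispose() => _timer?.Dispose();
}
```

The rotator itself also has to stay referenced (for example from a field of the bot class) for as long as the status should keep rotating.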
Personally, my preferred approach is to use Microsoft's Reactive Framework and do this instead:
var query =
from i in Observable.Interval(TimeSpan.FromSeconds(2.0))
let url_suffix = i % 2 == 0 ? "9310072" : "9376512"
let code = i % 2 == 0 ? "J4F" : "TPA"
let htmlDoc = new HtmlWeb().Load($@"https://www.battlemetrics.com/servers/squad/{url_suffix}")
let j4f = htmlDoc.DocumentNode.SelectSingleNode("/html/body/div[1]/div/div/div/div/div/div[1]/div[1]/dl/dd[2]/span")
let j4f1 = htmlDoc.DocumentNode.SelectSingleNode("/html/body/div[1]/div/div/div/div/div/div[1]/div[1]/div/dl/dd[1]")
from x in Observable.FromAsync(() => _client.SetGameAsync($"{code} - Amount of Players: {j4f.InnerText},\n" + $"Current Map: {j4f1.InnerText},\n", "", ActivityType.Playing))
from y in Observable.FromAsync(() => _client.SetStatusAsync(UserStatus.Online))
select y;
IDisposable subscription = query.Subscribe();
Stopping the observable subscription is as simple as calling subscription.Dispose().
I am trying to create a function to get the source code from a number of pages. After each page is grabbed, I want to update a label on my form indicating the progress (1 of 5, 2 of 5, etc.).
However, no matter what I try, the GUI completely freezes until the for loop has ended.
public List<List<string>> GetPages(string base_url, int num_pages)
{
var pages = new List<List<string>>();
var webGet = new HtmlWeb();
var task = Task.Factory.StartNew(() => {
for (int i = 0; i <= num_pages; i++)
{
UpdateMessage("Fetching page " + i + " of " + num_pages + ".");
var page = new List<string>();
var page_source = webGet.Load(url+i);
// (...)
page.Add(url+i);
page.Add(source);
pages.Add(page);
}
});
task.Wait();
return pages;
}
The call to this method looks like this:
List<List<string>> pages = site.GetPages(url, num_pages);
If I remove task.Wait(); the GUI unfreezes, the label updates properly, but the code continues without the needed multidimensional list.
I should say that I'm very new to C#. What the heck am I doing wrong?
Update
As per Darin, I have changed my method:
public async Task<List<List<string>>> GetPages(string url, int num_pages)
{
var pages = new List<List<string>>();
var webGet = new HtmlWeb();
for (int i = 0; i <= num_pages; i++)
{
UpdateMessage("Fetching page " + i + " of " + num_pages + ".");
var page = new List<string>();
var page_source = webGet.Load(url+i);
// (...)
page.Add(url+i);
page.Add(source);
pages.Add(page);
}
return pages;
}
And the call:
List<List<string>> pages = await site.GetPages(url, num_pages);
However, now I am getting this error:
The 'await' operator can only be used within an async method. Consider marking this method with the 'async' modifier and changing its return type to 'Task'.
But when I mark the method with async, the GUI still freezes.
Update 2
Whoops! I seem to have missed a piece of Darin's new method. I have now included await webGet.LoadAsync(url + i); in the method. I have also marked the method I am calling from as async.
Now, unfortunately, I'm getting this error:
'HtmlWeb' does not contain a definition for 'LoadAsync' and no extension method 'LoadAsync' accepting a first argument of type 'HtmlWeb' could be found (are you missing a using directive or an assembly reference?)
I've checked, I'm using .NET 4.5.2, and the HtmlAgilityPack in my References is the Net45 version. I have no idea what's going on now.
If I remove task.Wait(); the GUI unfreezes, the label updates
properly, but the code continues without the needed multidimensional
list.
That's normal. You should update your function so that it doesn't return the value but rather the task:
public Task<List<List<string>>> GetPages(string base_url, int num_pages)
{
var webGet = new HtmlWeb();
var task = Task.Factory.StartNew(() =>
{
var pages = new List<List<string>>();
for (int i = 0; i <= num_pages; i++)
{
UpdateMessage("Fetching page " + i + " of " + num_pages + ".");
var page = new List<string>();
var page_source = webGet.Load(base_url+i);
// (...)
page.Add(base_url+i);
page.Add(source);
pages.Add(page);
}
return pages;
});
return task;
}
and then when calling this function you will use the ContinueWith on the result:
var task = GetPages(baseUrl, numPages);
task.ContinueWith(t =>
{
List<List<string>> chapters = t.Result;
// Do something with the results here
});
Obviously before accessing t.Result in the continuation you probably would like to first check the other properties to see if the task completed successfully or if some exception was thrown so that you can act accordingly.
Also if you are using .NET 4.5 you may consider taking advantage of the async/await constructs:
public async Task<List<List<string>>> GetPages(string base_url, int num_pages)
{
var webGet = new HtmlWeb();
var pages = new List<List<string>>();
for (int i = 0; i <= num_pages; i++)
{
UpdateMessage("Fetching page " + i + " of " + num_pages + ".");
var page = new List<string>();
var page_source = await webGet.LoadAsync(base_url+i);
// (...)
page.Add(base_url+i);
page.Add(source);
pages.Add(page);
}
return pages;
}
and then:
List<List<string>> chapters = await GetPages(baseUrl, numPages);
// Do something with the results here.
Assuming WinForms, start by making the top-level event handler async void.
You then have an async method that can await a Task<List<List<string>>> method. That method does not have to be async itself.
private async void Button1_Click(...)
{
var pages = await GetPages(...);
// update the UI here
}
public Task<List<List<string>>> GetPages(string url, int num_pages)
{
...
return task;
}
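To fill in the elided GetPages body, here is one possible sketch that runs the blocking loads on a thread-pool thread and reports progress through IProgress<string>. The names PageFetcher, GetPagesAsync and loadPage are assumptions for illustration; loadPage stands in for a blocking call such as HtmlWeb.Load followed by whatever extracts the source. Pages are numbered from 1 here to match the "1 of 5" wording, which differs from the 0-based loop in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PageFetcher
{
    public static Task<List<List<string>>> GetPagesAsync(
        string baseUrl,
        int numPages,
        Func<string, string> loadPage,          // hypothetical blocking loader
        IProgress<string> progress = null)
    {
        // Task.Run moves the blocking loop off the calling (UI) thread.
        return Task.Run(() =>
        {
            var pages = new List<List<string>>();
            for (int i = 1; i <= numPages; i++)
            {
                progress?.Report($"Fetching page {i} of {numPages}.");
                string url = baseUrl + i;
                pages.Add(new List<string> { url, loadPage(url) });
            }
            return pages;
        });
    }
}
```

In the click handler, a Progress<string> created on the UI thread (for example new Progress<string>(msg => label1.Text = msg), where label1 is a placeholder control name) delivers each report back on the UI thread, so the label can be updated without touching controls from the worker.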
I'll start off by publishing the code that is troubled:
public async Task main()
{
Task t = func();
await t;
list.ItemsSource = jlist; //jlist previously defined
}
public async Task func()
{
TwitterService service = new TwitterService(_consumerKey, _consumerSecret);
service.AuthenticateWith(_accessToken, _accessTokenSecret);
TwitterGeoLocationSearch g = new TwitterGeoLocationSearch(40.758367, -73.982706, 25, 0);
SearchOptions s = new SearchOptions();
s.Geocode = g;
s.Q = "";
s.Count = 1;
service.Search(s, (statuses, response) => get_tweets(statuses, response));
void get_tweets(TwitterSearchResult statuses, TwitterResponse response)
{
//unimportant code
jlist.Add(info);
System.Diagnostics.Debug.WriteLine("done with get_tweets, jlist created");
}
I am having issues with the get_tweets(..) function running (on what I believe is a different thread), and the Task t is not awaited the way I expect in the main function. Basically, my issue is that list.ItemsSource = jlist is run before the get_tweets function has finished. Does anyone have a solution, or a direction to point me in?
First, create a TAP wrapper for TwitterService.Search, using TaskCompletionSource. So something like:
public static Task<Tuple<TwitterSearchResult, TwitterResponse>> SearchAsync(this TwitterService service, SearchOptions options)
{
var tcs = new TaskCompletionSource<Tuple<TwitterSearchResult, TwitterResponse>>();
service.Search(options, (status, response) => tcs.SetResult(Tuple.Create(status, response)));
return tcs.Task;
}
Then you can consume it using await:
SearchOptions s = new SearchOptions();
s.Geocode = g;
s.Q = "";
s.Count = 1;
var result = await service.SearchAsync(s);
get_tweets(result.Item1, result.Item2);
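The same wrapping pattern can be written once for any callback-style API. The sketch below is a generic version under stated assumptions: CallbackAdapter and FromCallback are made-up names, not part of TweetSharp or the BCL, and the start delegate is whatever kicks off the callback-based operation.

```csharp
using System;
using System.Threading.Tasks;

public static class CallbackAdapter
{
    // Turns "call me back with a T" into a Task<T>.
    public static Task<T> FromCallback<T>(Action<Action<T>> start)
    {
        // RunContinuationsAsynchronously keeps awaiting code from running
        // inline on the thread that invokes the callback.
        var tcs = new TaskCompletionSource<T>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        try
        {
            // TrySetResult tolerates a callback that fires more than once.
            start(result => tcs.TrySetResult(result));
        }
        catch (Exception ex)
        {
            // A failure while starting the operation surfaces through the task.
            tcs.TrySetException(ex);
        }
        return tcs.Task;
    }
}
```

With this helper, a call could look like await CallbackAdapter.FromCallback<TwitterSearchResult>(done => service.Search(s, (statuses, response) => done(statuses))), assuming only the statuses are needed.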
I'd like to ask about how to wait for multiple async http requests.
My code is like this :
public void Convert(XDocument input, out XDocument output)
{
var ns = input.Root.Name.Namespace;
foreach (var element in input.Root.Descendants(ns + "a"))
{
Uri uri = new Uri((string)element.Attribute("href"));
var wc = new WebClient();
wc.OpenReadCompleted += ((sender, e) =>
{
element.Attribute("href").Value = e.Result.ToString();
}
);
wc.OpenReadAsync(uri);
}
//I'd like to wait here until above async requests are all completed.
output = input;
}
Does anyone know a solution for this?
There is an article by Scott Hanselman in which he describes how to do non blocking requests. Scrolling to the end of it, there is a public Task<bool> ValidateUrlAsync(string url) method.
You could modify it like this (could be more robust about response reading)
public Task<string> GetAsync(string url)
{
var tcs = new TaskCompletionSource<string>();
var request = (HttpWebRequest)WebRequest.Create(url);
try
{
request.BeginGetResponse(iar =>
{
HttpWebResponse response = null;
try
{
response = (HttpWebResponse)request.EndGetResponse(iar);
using(var reader = new StreamReader(response.GetResponseStream()))
{
tcs.SetResult(reader.ReadToEnd());
}
}
catch(Exception exc) { tcs.SetException(exc); }
finally { if (response != null) response.Close(); }
}, null);
}
catch(Exception exc) { tcs.SetException(exc); }
return tcs.Task;
}
So with this in hand, you could then use it like this
var urls=new[]{"url1","url2"};
var tasks = urls.Select(GetAsync).ToArray();
var completed = Task.Factory.ContinueWhenAll(tasks,
completedTasks =>{
foreach(var result in completedTasks.Select(t=>t.Result))
{
Console.WriteLine(result);
}
});
completed.Wait();
//anything that follows gets executed after all urls have finished downloading
Hope this puts you in the right direction.
PS. this is probably as clear as it can get without using async/await
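For comparison, where async/await is available, the "wait for all of them" step might be sketched with Task.WhenAll. Downloads and FetchAllAsync are hypothetical names, and fetch is a placeholder for any Task-returning download, such as the GetAsync method above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class Downloads
{
    public static async Task<string[]> FetchAllAsync(
        IEnumerable<string> urls,
        Func<string, Task<string>> fetch)   // placeholder download function
    {
        // ToArray forces every request to start before any is awaited.
        Task<string>[] tasks = urls.Select(fetch).ToArray();

        // Completes when all downloads finish; the results come back in
        // the same order as the input urls. If any download fails, the
        // await rethrows one of the exceptions.
        return await Task.WhenAll(tasks);
    }
}
```

Unlike completed.Wait(), awaiting the combined task does not block the calling thread.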
Consider using continuation passing style. If you can restructure your Convert method like this,
public void ConvertAndContinueWith(XDocument input, Action<XDocument> continueWith)
{
var ns = input.Root.Name.Namespace;
var elements = input.Root.Descendants(ns + "a");
int incompleteCount = elements.Count();
foreach (var element in elements)
{
Uri uri = new Uri((string)element.Attribute("href"));
var wc = new WebClient();
wc.OpenReadCompleted += ((sender, e) =>
{
element.Attribute("href").Value = e.Result.ToString();
if (Interlocked.Decrement(ref incompleteCount) == 0)
// This is the final callback, so we can continue executing.
continueWith(input);
}
);
wc.OpenReadAsync(uri);
}
}
You then run that code like this:
XDocument doc = something;
ConvertAndContinueWith(doc, (finishedDocument) => {
// send the completed document to the web client, or whatever you need to do
});