I'll start off by publishing the code that is troubled:
public async Task main()
{
Task t = func();
await t;
list.ItemsSource = jlist; //jlist previously defined
}
public async Task func()
{
TwitterService service = new TwitterService(_consumerKey, _consumerSecret);
service.AuthenticateWith(_accessToken, _accessTokenSecret);
TwitterGeoLocationSearch g = new TwitterGeoLocationSearch(40.758367, -73.982706, 25, 0);
SearchOptions s = new SearchOptions();
s.Geocode = g;
s.Q = "";
s.Count = 1;
service.Search(s, (statuses, response) => get_tweets(statuses, response));
void get_tweets(TwitterSearchResult statuses, TwitterResponse response)
{
//unimportant code
jlist.Add(info);
System.Diagnostics.Debug.WriteLine("done with get_tweets, jlist created");
}
I am having issues with the get_tweets(..) function running (on what I believe a different thread) and the Task t is not awaited like I have in the main function. Basically, my issue is that the list.Itemsource = jlist is ran before the get_tweets function is finished. Does anyone have a solution or the right direction to point me in?
First, create a TAP wrapper for TwitterService.Search, using TaskCompletionSource. So something like:
public static Task<Tuple<TwitterSearchResult, TwitterResponse>> SearchAsync(this TwitterService service, SearchOptions options)
{
var tcs = new TaskCompletionSource<Tuple<TwitterSearchResult, TwitterResponse>>();
service.Search(options, (status, response) => tcs.SetResult(Tuple.Create(status, response)));
return tcs.Task;
}
Then you can consume it using await:
SearchOptions s = new SearchOptions();
s.Geocode = g;
s.Q = "";
s.Count = 1;
var result = await service.SearchAsync(s);
get_tweets(result.Item1, result.Item2);
Related
Well, I'm building web parsing app and having some troubles making it async.
I have a method which creates async tasks, and decorator for RestSharp so I can do requests via proxy. Basically in code it just does 5 tries of requesting the webpage.
Task returns RestResponse and it's status code is always 0. And this is the problem, because if I do the same synchronously, it works.
private static async Task<HtmlNode> GetTableAsync(int page)
{
ProxyClient client = new ProxyClient((name) =>ProxyProvider.GetCoreNoCD(name),
serviceName, 10000, 10000);
var task = client.TryGetAsync(new Uri(GetPageUrl(page)), (res) =>
{
return res.IsSuccessStatusCode && res.IsSuccessful;
},5);
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml((await task).Content);
return doc.DocumentNode.SelectSingleNode("//div[#class=\"table_block\"]/table");
}
And this works as expected, but synchronously.
private static async Task<HtmlNode> GetTableAsync(int page)
{
ProxyClient client = new ProxyClient((name) =>ProxyProvider.GetCoreNoCD(name),
serviceName, 10000, 10000);
var task = client.TryGetAsync(new Uri(GetPageUrl(page)), (res) =>
{
return res.IsSuccessStatusCode && res.IsSuccessful;
},5);
task.Wait();
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(task.Result.Content);
return doc.DocumentNode.SelectSingleNode("//div[#class=\"table_block\"]/table");
}
ProxyClient's insides:
public async Task<RestResponse?> TryGetAsync(Uri uri,
Predicate<RestResponse> condition, int tryCount = 15,
List<KeyValuePair<string, string>> query = null,
List<KeyValuePair<string, string>> headers = null,
Method method = Method.Get, string body = null)
{
WebClient? client = null;
RestResponse? res = null;
for(int i = 0; i < tryCount; i++)
{
try
{
client = new WebClient(source.Invoke(serviceName), serviceName, timeout);
res = await client.GetResponseAsync(uri, query, headers, method, body);
if (condition(res))
return res;
}
catch(Exception)
{
///TODO:add log maybe?
}
finally
{
if (client != null)
{
client.SetCDToProxy(new TimeSpan(cd));
client.Dispose();
}
}
}
return res;
}
I have no idea how to make it work with async and don't understand why it doesn't work as expected.
I think it might have to do with the Task.Wait() I would consider changing to await like this.
private static async Task<HtmlNode> GetTableAsync(int page)
{
ProxyClient client = new ProxyClient((name) =>ProxyProvider.GetCoreNoCD(name),
serviceName, 10000, 10000);
var statusOk = false;
var result = await client.GetAsync(new Uri(GetPageUrl(page));
statusOk = result.IsSuccessStatusCode &&
result.StatusCode == HttpStatusCode.OK;
//do what you want based on statusOk
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(result.Content);
return doc.DocumentNode.SelectSingleNode("//div[#class=\"table_block\"]/table");
}
Just decided to try different solutions, and seems like it works only if I return task result
Like this:
ProxyClient client = new ProxyClient((name) => ProxyProvider.GetCoreNoCD(name),
serviceName, 10000, 10000);
return await client.TryGetAsync(new Uri(GetPageUrl(page)), (res) =>
{ return res.IsSuccessStatusCode && res.IsSuccessful; });
I thought it could be some kind of misunderstanding of async/await, but seems like no. Maybe some kind of RestSharp bug.
I think you're just checking the result too early. You need to look at the result after the await:
var task = client.TryGetAsync(...);
// Too early to check
var x = await task;
// Check now
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(x.Content);
I wrote a web crawler and I want to know if my approach is correct. The only issue I'm facing is that it stops after some hours of crawling. No exception, it just stops.
1 - the private members and the constructor:
private const int CONCURRENT_CONNECTIONS = 5;
private readonly HttpClient _client;
private readonly string[] _services = new string[2] {
"https://example.com/items?id=ID_HERE",
"https://another_example.com/items?id=ID_HERE"
}
private readonly List<SemaphoreSlim> _semaphores;
public Crawler() {
ServicePointManager.DefaultConnectionLimit = CONCURRENT_CONNECTIONS;
_client = new HttpClient();
_semaphores = new List<SemaphoreSlim>();
foreach (var _ in _services) {
_semaphores.Add(new SemaphoreSlim(CONCURRENT_CONNECTIONS));
}
}
Single HttpClient instance.
The _services is just a string array that contains the URL, they are not the same domain.
I'm using semaphores (one per domain) since I read that it's not a good idea to use the network queue (I don't remember how it calls).
2 - The Run method, which is the one I will call to start crawling.
public async Run(List<int> ids) {
const int BATCH_COUNT = 1000;
var svcIndex = 0;
var tasks = new List<Task<string>>(BATCH_COUNT);
foreach (var itemId in ids) {
tasks.Add(DownloadItem(svcIndex, _services[svcIndex].Replace("ID_HERE", $"{itemId}")));
if (++svcIndex >= _services.Length) {
svcIndex = 0;
}
if (tasks.Count >= BATCH_COUNT) {
var results = await Task.WhenAll(tasks);
await SaveDownloadedData(results);
tasks.Clear();
}
}
if (tasks.Count > 0) {
var results = await Task.WhenAll(tasks);
await SaveDownloadedData(results);
tasks.Clear();
}
}
DownloadItem is an async function that actually makes the GET request, note that I'm not awaiting it here.
If the number of tasks reaches the BATCH_COUNT, I will await all to complete and save the results to file.
3 - The DownloadItem function.
private async Task<string> DownloadItem(int serviceIndex, string link) {
var needReleaseSemaphore = true;
var result = string.Empty;
try {
await _semaphores[serviceIndex].WaitAsync();
var r = await _client.GetStringAsync(link);
_semaphores[serviceIndex].Release();
needReleaseSemaphore = false;
// DUE TO JSON SIZE, I NEED TO REMOVE A VALUE (IT'S USELESS FOR ME)
var obj = JObject.Parse(r);
if (obj.ContainsKey("blah")) {
obj.Remove("blah");
}
result = obj.ToString(Formatting.None);
} catch {
result = string.Empty;
// SINCE I GOT AN EXCEPTION, I WILL 'LOCK' THIS SERVICE FOR 1 MINUTE.
// IF I RELEASED THIS SEMAPHORE, I WILL LOCK IT AGAIN FIRST.
if (!needReleaseSemaphore) {
await _semaphores[serviceIndex].WaitAsync();
needReleaseSemaphore = true;
}
await Task.Delay(60_000);
} finally {
// RELEASE THE SEMAPHORE, IF NEEDED.
if (needReleaseSemaphore) {
_semaphores[serviceIndex].Release();
}
}
return result;
}
4- The function that saves the result.
private async Task SaveDownloadedData(List<string> myData) {
using var fs = new FileStream("./output.dat", FileMode.Append);
foreach (var res in myData) {
var blob = Encoding.UTF8.GetBytes(res);
await fs.WriteAsync(BitConverter.GetBytes((uint)blob.Length));
await fs.WriteAsync(blob);
}
await fs.DisposeAsync();
}
5- Finally, the Main function.
static async Task Main(string[] args) {
var crawler = new Crawler();
var items = LoadItemIds();
await crawler.Run(items);
}
After all this, is my approach correct? I need to make millions of requests, will take some weeks/months to gather all data I need (due to the connection limit).
After 12 - 14 hours, it just stops and I need to manually restart the app (memory usage is ok, my VPS has 1 GB and it never used more than 60%).
How to run an async Task without blocking other tasks?
I have one function that iterates though a List but the problem is that when the function is called other functions won't work again until the first function is done. What are the ways of making the HandleAsync function non-blocking ?
public static async Task HandleAsync(Message message, TelegramBotClient bot)
{
await Search(message, bot); // This should be handled without working other possible functions. I have a function similar to this but which doesn't iterate though any list.
}
private static async Task Search(Message message, TelegramBotClient bot)
{
var textSplit = message.Text.Split(new[] {' '}, 2);
if (textSplit.Length == 1)
{
await bot.SendTextMessageAsync(message.From.Id, "Failed to fetch sales. Missing game name. ",
ParseMode.Html);
}
else
{
var search = await Program.itad.SearchGameAsync(textSplit[1], limit: 10, cts: Program.Cts);
if (search.Data != null)
{
var builder = new StringBuilder();
foreach (var deal in search.Data.List)
{
var title = deal.Title;
var plain = deal.Plain;
var shop = deal.Shop != null ? deal.Shop.Name : "N/A";
var urls = deal.Urls;
var priceNew = deal.PriceNew;
var priceOld = deal.PriceOld;
var priceCut = deal.PriceCut;
builder.AppendLine($"<b>Title:</b> {title}");
builder.AppendLine($"<b>Shop:</b> {shop}");
builder.AppendLine();
builder.AppendLine($"<b>Price:</b> <strike>{priceOld}€</strike> | {priceNew}€ (-{priceCut}%)");
var buttons = new[]
{
new[]
{
InlineKeyboardButton.WithUrl("Buy", urls.Buy.AbsoluteUri),
InlineKeyboardButton.WithUrl("History",
urls.Game.AbsoluteUri.Replace("info", "history"))
}
};
var keyboard = new InlineKeyboardMarkup(buttons);
var info = await Program.itad.GetInfoAsync(plain, cts: Program.Cts);
var image = info.Data.GameInfo.Image;
if (image == null) image = new Uri("https://i.imgur.com/J7zLBLg.png");
await TelegramBot.Bot.SendPhotoAsync(message.From.Id, new InputOnlineFile(image.AbsoluteUri),
builder.ToString(), ParseMode.Html, replyMarkup: keyboard,
cancellationToken: Program.Cts.Token);
builder.Clear();
}
}
else
{
await bot.SendTextMessageAsync(message.From.Id, "Failed to fetch sales. Game not found. ",
ParseMode.Html);
}
}
I'm trying to send multiple same requests at (almost) once to my WebAPI to do some performance testing.
For this, I am calling PerformRequest multiple times and wait for them using await Task.WhenAll.
I want to calculate the time that each request takes to complete plus the start time of each one of them. In my code,however, I don't know what happens if the result of R3 (request number 3) comes before R1? Would the duration be wrong?
From what I see in the results, I think the results are mixing with each other. For example, the R4's result sets as R1's result. So any help would be appreciated.
GlobalStopWatcher is a static class that I'm using to find the start time of each request.
Basically I want to make sure that elapsedMilliseconds and Duration of each request is associated with the request itself.
So that if the result of request 10th comes before the result of 1st request, then duration would be duration = elapsedTime(10th)-(startTime(1st)). Isn't that the case?
I wanted to add a lock but it seems impossible to add it where there's await keyword.
public async Task<RequestResult> PerformRequest(RequestPayload requestPayload)
{
var url = "myUrl.com";
var client = new RestClient(url) { Timeout = -1 };
var request = new RestRequest { Method = Method.POST };
request.AddHeaders(requestPayload.Headers);
foreach (var cookie in requestPayload.Cookies)
{
request.AddCookie(cookie.Key, cookie.Value);
}
request.AddJsonBody(requestPayload.BodyRequest);
var st = new Stopwatch();
st.Start();
var elapsedMilliseconds = GlobalStopWatcher.Stopwatch.ElapsedMilliseconds;
var result = await client.ExecuteAsync(request).ConfigureAwait(false);
st.Stop();
var duration = st.ElapsedMilliseconds;
return new RequestResult()
{
Millisecond = elapsedMilliseconds,
Content = result.Content,
Duration = duration
};
}
public async Task RunAllTasks(int numberOfRequests)
{
GlobalStopWatcher.Stopwatch.Start();
var arrTasks = new Task<RequestResult>[numberOfRequests];
for (var i = 0; i < numberOfRequests; i++)
{
arrTasks[i] = _requestService.PerformRequest(requestPayload, false);
}
var results = await Task.WhenAll(arrTasks).ConfigureAwait(false);
RequestsFinished?.Invoke(this, results.ToList());
}
Where I think you're going wrong with this is trying to use a static GlobalStopWatcher and then pushing this code into your function that you're testing.
You should keep everything separate and use a new instance of Stopwatch for each RunAllTasks call.
Let's make it so.
Start with these:
public async Task<RequestResult<R>> ExecuteAsync<R>(Stopwatch global, Func<Task<R>> process)
{
var s = global.ElapsedMilliseconds;
var c = await process();
var d = global.ElapsedMilliseconds - s;
return new RequestResult<R>()
{
Content = c,
Millisecond = s,
Duration = d
};
}
public class RequestResult<R>
{
public R Content;
public long Millisecond;
public long Duration;
}
Now you're in a position to test anything that fits the signature of Func<Task<R>>.
Let's try this:
public async Task<int> DummyAsync(int x)
{
await Task.Delay(TimeSpan.FromSeconds(x % 3));
return x;
}
We can set up a test like this:
public async Task<RequestResult<int>[]> RunAllTasks(int numberOfRequests)
{
var sw = Stopwatch.StartNew();
var tasks =
from i in Enumerable.Range(0, numberOfRequests)
select ExecuteAsync<int>(sw, () => DummyAsync(i));
return await Task.WhenAll(tasks).ConfigureAwait(false);
}
Note that the line var sw = Stopwatch.StartNew(); captures a new Stopwatch for each RunAllTasks call. Nothing is actually "global" anymore.
If I execute that with RunAllTasks(7) then I get this result:
It runs and it counts correctly.
Now you can just refactor your PerformRequest method to just do what it needs to:
public async Task<string> PerformRequest(RequestPayload requestPayload)
{
var url = "myUrl.com";
var client = new RestClient(url) { Timeout = -1 };
var request = new RestRequest { Method = Method.POST };
request.AddHeaders(requestPayload.Headers);
foreach (var cookie in requestPayload.Cookies)
{
request.AddCookie(cookie.Key, cookie.Value);
}
request.AddJsonBody(requestPayload.BodyRequest);
var response = await client.ExecuteAsync(request);
return response.Content;
}
Running the tests is easy:
public async Task<RequestResult<string>[]> RunAllTasks(int numberOfRequests)
{
var sw = Stopwatch.StartNew();
var tasks =
from i in Enumerable.Range(0, numberOfRequests)
select ExecuteAsync<string>(sw, () => _requestService.PerformRequest(requestPayload));
return await Task.WhenAll(tasks).ConfigureAwait(false);
}
If there's any doubt about the thread-safety of Stopwatch then you could do this:
public async Task<RequestResult<R>> ExecuteAsync<R>(Func<long> getMilliseconds, Func<Task<R>> process)
{
var s = getMilliseconds();
var c = await process();
var d = getMilliseconds() - s;
return new RequestResult<R>()
{
Content = c,
Millisecond = s,
Duration = d
};
}
public async Task<RequestResult<int>[]> RunAllTasks(int numberOfRequests)
{
var sw = Stopwatch.StartNew();
var tasks =
from i in Enumerable.Range(0, numberOfRequests)
select ExecuteAsync<int>(() => { lock (sw) { return sw.ElapsedMilliseconds; } }, () => DummyAsync(i));
return await Task.WhenAll(tasks).ConfigureAwait(false);
}
Something is definitely flawed in my understanding of async/await. I want a piece of code named SaveSearchCase to run asynchronously in background.
I want it to be fired and forget about it and continue with the current method's return statement.
public IList<Entities.Case.CreateCaseOutput> createCase(ARC.Donor.Data.Entities.Case.CreateCaseInput CreateCaseInput, ARC.Donor.Data.Entities.Case.SaveCaseSearchInput SaveCaseSearchInput)
{
..........
..........
..........
var AcctLst = rep.ExecuteStoredProcedure<Entities.Case.CreateCaseOutput>(strSPQuery, listParam).ToList();
if (!string.IsNullOrEmpty(AcctLst.ElementAt(0).o_case_seq.ToString()))
{
Task<IList<Entities.Case.SaveCaseSearchOutput>> task = saveCaseSearch(SaveCaseSearchInput, AcctLst.ElementAt(0).o_case_seq);
Task t = task.ContinueWith(
r => { Console.WriteLine(r.Result); }
);
}
Console.WriteLine("After the async call");
return AcctLst;
}
And the SaveCaseSearch looks like
public async Task<IList<Entities.Case.SaveCaseSearchOutput>> saveCaseSearch(ARC.Donor.Data.Entities.Case.SaveCaseSearchInput SaveCaseSearchInput,Int64? case_key)
{
Repository rep = new Repository();
string strSPQuery = string.Empty;
List<object> listParam = new List<object>();
SQL.CaseSQL.getSaveCaseSearchParameters(SaveCaseSearchInput, case_key,out strSPQuery, out listParam);
var AcctLst = await rep.ExecuteStoredProcedureAsync<Entities.Case.SaveCaseSearchOutput>(strSPQuery, listParam);
return (System.Collections.Generic.IList<ARC.Donor.Data.Entities.Case.SaveCaseSearchOutput>)AcctLst;
}
But when I see the debugger createCase method waits for SaveCaseSearch to complete first and then only
it prints "After Async Call "
and then returns . Which I do not want definitely .
So which way is my understanding flawed ? Please help to make it run async and continue with current method's print and return statement .
UPDATE
I updated the SaveCaseSearch method to reflect like :
public async Task<IList<Entities.Case.SaveCaseSearchOutput>> saveCaseSearch(ARC.Donor.Data.Entities.Case.SaveCaseSearchInput SaveCaseSearchInput,Int64? case_key)
{
return Task.Run<IList<Entities.Case.SaveCaseSearchOutput>>(async (SaveCaseSearchInput, case_key) =>
{
Repository rep = new Repository();
string strSPQuery = string.Empty;
List<object> listParam = new List<object>();
SQL.CaseSQL.getSaveCaseSearchParameters(SaveCaseSearchInput, case_key, out strSPQuery, out listParam);
var AcctLst = await rep.ExecuteStoredProcedureAsync<Entities.Case.SaveCaseSearchOutput>(strSPQuery, listParam);
return (System.Collections.Generic.IList<ARC.Donor.Data.Entities.Case.SaveCaseSearchOutput>)AcctLst;
});
}
But there is something wrong with the params. It says
Error 4 A local variable named 'SaveCaseSearchInput' cannot be declared in this scope because it would give a different meaning to 'SaveCaseSearchInput', which is already used in a 'parent or current' scope to denote something else C:\Users\m1034699\Desktop\Stuart_V2_12042016\Stuart Web Service\ARC.Donor.Data\Case\Search.cs 43 79 ARC.Donor.Data
Well this saveCaseSearch() method runs synchronously in main thread and this is the main problem here. Instead of returning result with a task you should return Task with operation itself. Here is some simplified example :
Runs synchronously and waits 5 seconds
public IList<int> A()
{
var AcctLst = new List<int> { 0, 2, 5, 8 };
if (true)
{
Task<IList<int>> task = saveCaseSearch();
Task t = task.ContinueWith(
r => { Console.WriteLine(r.Result[0]); }
);
}
Console.WriteLine("After the async call");
return AcctLst;
}
// runs sync and in the end returns Task that is never actually fired
public async Task<IList<int>> saveCaseSearch()
{
Thread.Sleep(5000);
return new List<int>() { 10, 12, 16 };
}
Runs asynchronously - fires task & forgets :
public IList<int> A()
{
... same code as above
}
// notice that we removed `async` keyword here because we just return task.
public Task<IList<int>> saveCaseSearch()
{
return Task.Run<IList<int>>(() =>
{
Thread.Sleep(5000);
return new List<int>() { 10, 12, 16 };
});
}
Here is full code for this example
Against all that I believe in pertaining to "fire-and-forget" you can do this by writing your code this way:
public Task<SaveCaseSearchOutput> SaveCaseSearch(
SaveCaseSearchInput saveCaseSearchInput,
long? caseKey)
{
var rep = new Repository();
var query = string.Empty;
var listParam = new List<object>();
SQL.CaseSQL
.getSaveCaseSearchParameters(
saveCaseSearchInput,
caseKey,
out query,
out listParam);
return rep.ExecuteStoredProcedureAsync<SaveCaseSearchOutput>(
strSPQuery,
istParam);
}
And then if the place where you would like to fire it and log when it returns (which is really what you have -- so you're not forgetting about it), do this:
public IList<CreateCaseOutput> CreateCase(
CreateCaseInput createCaseInput,
SaveCaseSearchInput saveCaseSearchInput)
{
// Omitted for brevity...
var AcctLst =
rep.ExecuteStoredProcedure<CreateCaseOutput>(
strSPQuery,
listParam)
.ToList();
if (!string.IsNullOrEmpty(AcctLst.ElementAt(0).o_case_seq.ToString()))
{
SaveCaseSearch(saveCaseSearchInput,
AcctLst.ElementAt(0).o_case_seq)
.ContinueWith(r => Console.WriteLine(r.Result));
}
Console.WriteLine("After the async call");
return AcctLst;
}
The issue was that you were using async and await in the SaveSearchCase function, and this basically means that your code is the opposite of "fire-and-forget".
As a side note, you should really just use async and await, and avoid the "fire-and-forget" idea! Make your DB calls asynchronous and leverage this paradigm for what it's worth!
Consider the following:
The SaveCaseSearch call can stay as I have defined it above.
public Task<SaveCaseSearchOutput> SaveCaseSearch(
SaveCaseSearchInput saveCaseSearchInput,
long? caseKey)
{
var rep = new Repository();
var query = string.Empty;
var listParam = new List<object>();
SQL.CaseSQL
.getSaveCaseSearchParameters(
saveCaseSearchInput,
caseKey,
out query,
out listParam);
return rep.ExecuteStoredProcedureAsync<SaveCaseSearchOutput>(
strSPQuery,
istParam);
}
Then in your call to it, do this instead:
public async Task<IList<CreateCaseOutput>> CreateCase(
CreateCaseInput createCaseInput,
SaveCaseSearchInput saveCaseSearchInput)
{
// Omitted for brevity...
var AcctLst =
await rep.ExecuteStoredProcedureAsync<CreateCaseOutput>(
strSPQuery,
listParam)
.ToList();
if (!string.IsNullOrEmpty(AcctLst.ElementAt(0).o_case_seq.ToString()))
{
await SaveCaseSearch(saveCaseSearchInput,
AcctLst.ElementAt(0).o_case_seq)
.ContinueWith(r => Console.WriteLine(r.Result));
}
Console.WriteLine("After the async call");
return AcctLst;
}
This makes for a much better solution!