Async parsing with AngleSharp - c#

I want to parse some data from a website. I found a tutorial; here is the code:
public static async void Test()
{
var config = Configuration.Default.WithDefaultLoader();
using var context = BrowsingContext.New(config);
var url = "http://webcode.me";
using var doc = await context.OpenAsync(url);
// var title = doc.QuerySelector("title").InnerHtml;
var title = doc.Title;
Console.WriteLine(title);
var pars = doc.QuerySelectorAll("p");
foreach (var par in pars)
{
Console.WriteLine(par.Text().Trim());
}
}
static void Main(string[] args)
{
Test();
}
But the program quits as soon as it reaches this line:
using var doc = await context.OpenAsync(url);

Nothing is waiting for your asynchronous method to complete, so the program quits. You can fix this by amending to use an async main method:
static Task Main(string[] args)
{
return Test();
}
Or, if you're using a version older than C# 7.1 (where async Main is not supported):
static void Main(string[] args)
{
Test().GetAwaiter().GetResult();
}
You'll also need to change the return type of Test to async Task:
public static async Task Test()
{
// ...
}
You might find the C# 7.1 docs on async main helpful.

Related

No results with C# web scraping

I'm a beginner and I want to try to do some web scraping with C#, but with this code, it does not return any results, even though it should return a full list of items.
static void Main(string[] args)
{
GetHtmlAsync();
Console.ReadLine();
}
private static async void GetHtmlAsync()
{
var url = "https://www.ebay.com/sch/i.html?_from=R40&_trksid=p2380057.m570.l1313&_nkw=playstation+5&_sacat=0";
var httpClient = new HttpClient();
var html = await httpClient.GetStringAsync(url);
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(html);
var ProductList = htmlDocument.DocumentNode.Descendants("ul").Where(node => node.GetAttributeValue("class", "").Equals("ListViewInner")).ToList();
}
When you use async in C#, you need to be async all the way down (and up).
So, to wait for GetHtmlAsync(), your caller needs to be async too, and GetHtmlAsync() must return something the caller can await.
An async void method cannot be awaited, so we return a Task instead. Tasks basically represent the "potential to return a value" and can be handed around irrespective of whether the potential has been reached, so you can have a Task<int> that will get you a number at some point, if you wait for it to do so.
If you want to have the result before your read line, you also need to await the result.
static async Task Main(string[] args)
{
await GetHtmlAsync();
Console.ReadLine();
}
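To make the "potential to return a value" idea concrete, here is a minimal sketch. ComputeAsync is a hypothetical method invented for illustration and is not part of the question's code. The Task<int> can be stored and handed around before it has finished, and await gives you the int once it exists.

```csharp
using System;
using System.Threading.Tasks;

class TaskDemo
{
    // Hypothetical helper: simulates slow work, then produces a number.
    public static async Task<int> ComputeAsync()
    {
        await Task.Delay(100);
        return 42;
    }

    static async Task Main()
    {
        Task<int> pending = ComputeAsync(); // work has started; the value is not ready yet
        Console.WriteLine("Task started, doing other things...");
        int value = await pending;          // execution resumes here once the value exists
        Console.WriteLine(value);
    }
}
```

Requires C# 7.1 or later because of the async Main method.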
Full Example
I don't know which implementation of HtmlDocument you are using, so there is one using statement that needs to be replaced below (using SOURCE.OF.HTMLDOCUMENT;).
using System;
using System.Linq; // needed for Where(...) and ToList()
using System.Net.Http;
using System.Threading.Tasks;
using SOURCE.OF.HTMLDOCUMENT;
namespace ConsoleApp1
{
class Program
{
static async Task Main(string[] args)
{
await GetHtmlAsync();
Console.ReadLine();
}
private static async Task GetHtmlAsync()
{
var url = "https://www.ebay.com/sch/i.html?_from=R40&_trksid=p2380057.m570.l1313&_nkw=playstation+5&_sacat=0";
var httpClient = new HttpClient();
var html = await httpClient.GetStringAsync(url);
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(html);
var ProductList = htmlDocument.DocumentNode.Descendants("ul").Where(node => node.GetAttributeValue("class", "").Equals("ListViewInner")).ToList();
Console.WriteLine(ProductList.Count);
}
}
}

Simple.OData.Client not returning results, no error [duplicate]

public class test
{
public async Task Go()
{
await PrintAnswerToLife();
Console.WriteLine("done");
}
public async Task PrintAnswerToLife()
{
int answer = await GetAnswerToLife();
Console.WriteLine(answer);
}
public async Task<int> GetAnswerToLife()
{
await Task.Delay(5000);
int answer = 21 * 2;
return answer;
}
}
If I want to call Go in the Main() method, how can I do that?
I am trying out the new C# features. I know I can hook the async method up to an event, so that triggering the event calls the async method.
But what if I want to call it directly in the Main method? How can I do that?
I did something like
class Program
{
static void Main(string[] args)
{
test t = new test();
t.Go().GetAwaiter().OnCompleted(() =>
{
Console.WriteLine("finished");
});
Console.ReadKey();
}
}
But it seems to be a deadlock, and nothing is printed on the screen.
Your Main method can be simplified. For C# 7.1 and newer:
static async Task Main(string[] args)
{
test t = new test();
await t.Go();
Console.WriteLine("finished");
Console.ReadKey();
}
For earlier versions of C#:
static void Main(string[] args)
{
test t = new test();
t.Go().Wait();
Console.WriteLine("finished");
Console.ReadKey();
}
This is part of the beauty of the async keyword (and related functionality): the use and confusing nature of callbacks is greatly reduced or eliminated.
Instead of Wait, you're better off using
new test().Go().GetAwaiter().GetResult()
since this will avoid exceptions being wrapped into AggregateExceptions, so you can just surround your Go() method with a try catch(Exception ex) block as usual.
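The difference between the two blocking calls can be seen in a small sketch. FailAsync and the two Catch... helpers are hypothetical names made up for this illustration; each helper reports the type of exception that reaches the catch block.

```csharp
using System;
using System.Threading.Tasks;

class ExceptionDemo
{
    // Hypothetical method that always fails after a short delay.
    public static async Task FailAsync()
    {
        await Task.Delay(10);
        throw new InvalidOperationException("boom");
    }

    // Blocks with Wait() and reports the exception type that surfaces.
    public static string CatchWithWait()
    {
        try { FailAsync().Wait(); return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    // Blocks with GetAwaiter().GetResult() and reports the exception type.
    public static string CatchWithGetResult()
    {
        try { FailAsync().GetAwaiter().GetResult(); return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    static void Main()
    {
        Console.WriteLine(CatchWithWait());      // AggregateException
        Console.WriteLine(CatchWithGetResult()); // InvalidOperationException
    }
}
```

With Wait() you would have to unwrap InnerException to get at the real error; with GetAwaiter().GetResult() an ordinary catch block sees the original exception.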
Since the release of C# v7.1 async main methods have become available to use which avoids the need for the workarounds in the answers already posted. The following signatures have been added:
public static Task Main();
public static Task<int> Main();
public static Task Main(string[] args);
public static Task<int> Main(string[] args);
This allows you to write your code like this:
static async Task Main(string[] args)
{
await DoSomethingAsync();
}
static async Task DoSomethingAsync()
{
//...
}
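The Task<int> signatures are useful when the process should report an exit code. The following is only a sketch under assumptions: RunAsync and the exit codes 0 and 1 are choices made for this example, and DoSomethingAsync is a placeholder for your own work.

```csharp
using System;
using System.Threading.Tasks;

class ExitCodeDemo
{
    // The int produced by the returned task becomes the process exit code.
    static async Task<int> Main(string[] args)
    {
        return await RunAsync();
    }

    // Hypothetical wrapper: maps success or failure to an exit code.
    public static async Task<int> RunAsync()
    {
        try
        {
            await DoSomethingAsync();
            return 0; // success
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine(ex.Message);
            return 1; // failure
        }
    }

    static async Task DoSomethingAsync()
    {
        await Task.Delay(100); // placeholder for real asynchronous work
    }
}
```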
class Program
{
static void Main(string[] args)
{
test t = new test();
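// Wait() keeps Main alive until Go has finished; without it the program would exit immediately
Task.Run(async () => await t.Go()).Wait();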
}
}
If you read the Result property of the returned task, there is no need to use GetAwaiter at all, because Result blocks until the task has completed.
static async Task<String> sayHelloAsync(){
await Task.Delay(1000);
return "hello world";
}
static void Main(string[] args){
var data = sayHelloAsync();
//implicitly waits for the result and makes synchronous call.
//no need for Console.ReadKey()
Console.Write(data.Result);
//synchronous call .. same as previous one
Console.Write(sayHelloAsync().GetAwaiter().GetResult());
}
If you want to wait for a task to be done and do some further processing:
sayHelloAsync().GetAwaiter().OnCompleted(() => {
Console.Write("done" );
});
Console.ReadLine();
If you are interested in getting the results from sayHelloAsync and do further processing on it:
sayHelloAsync().ContinueWith(prev => {
//prev.Result should have "hello world"
Console.Write("done do further processing here .. here is the result from sayHelloAsync" + prev.Result);
});
Console.ReadLine();
One last simple way to wait for a function:
static void Main(string[] args){
sayHelloAsync().Wait();
Console.Read();
}
static async Task sayHelloAsync(){
await Task.Delay(1000);
Console.Write( "hello world");
}
public static void Main(string[] args)
{
var t = new test();
Task.Run(async () => { await t.Go();}).Wait();
}
Use .Wait()
static void Main(string[] args){
SomeTaskManager someTaskManager = new SomeTaskManager();
Task<List<String>> task = Task.Run(() => someTaskManager.Execute());
task.Wait(); // blocks until Execute has finished
List<String> r = task.Result;
}
public class SomeTaskManager
{
public async Task<List<String>> Execute() {
List<String> summaries = new List<String>();
HttpClient client = new HttpClient();
client.BaseAddress = new Uri("http://localhost:4000/");
client.DefaultRequestHeaders.Accept.Clear();
// jsonEnvellope is the JSON request body; its definition is not shown in this answer
HttpContent httpContent = new StringContent(jsonEnvellope, Encoding.UTF8, "application/json");
client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
HttpResponseMessage httpResponse = await client.PostAsync("", httpContent);
if (httpResponse.Content != null)
{
string responseContent = await httpResponse.Content.ReadAsStringAsync();
dynamic answer = JsonConvert.DeserializeObject(responseContent);
summaries = answer[0].ToObject<List<String>>();
}
return summaries;
}
}
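Try blocking on the returned task. The Result property only exists on Task<T>; Go returns a plain Task, so use Wait() here:
class Program
{
static void Main(string[] args)
{
test t = new test();
t.Go().Wait();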
Console.ReadKey();
}
}
C# 9 top-level statements simplify things even more. You don't have to do anything extra to call async methods from your entry point; you can just do this:
using System;
using System.Threading.Tasks;
await Task.Delay(1000);
Console.WriteLine("Hello World!");
For more information see What's new in C# 9.0, Top-level statements:
The top-level statements may contain async expressions. In that case, the synthesized entry point returns a Task, or Task<int>.

Async method sending response back to Main

I have implemented a SOAP client using an async method. I want this method to return a string value that I get from the API server to my main thread or to another method (whichever method is calling). How do I do this?
MAIN THREAD
static void Main(string[] args)
{
TEXT().GetAwaiter().OnCompleted(() => { Console.WriteLine("finished"); });
Console.ReadKey();
// if I do it like this
// var test = TEXT().GetAwaiter().OnCompleted(() => { Console.WriteLine("finished"); });
// it gives me error: Cannot assign void to an implicitly-typed local variable
}
ASYNC METHOD
public static async Task<string> TEXT()
{
Uri uri = new Uri("http://myaddress");
HttpClient hc = new HttpClient();
hc.DefaultRequestHeaders.Add("SOAPAction", "Some Action");
var xmlStr = "SoapContent"; //not displayed here for simplicity
var content = new StringContent(xmlStr, Encoding.UTF8, "text/xml");
using (HttpResponseMessage response = await hc.PostAsync(uri, content))
{
var soapResponse = await response.Content.ReadAsStringAsync();
string value = await response.Content.ReadAsStringAsync();
return value; //how do I get this back to the main thread or any other method
}
}
In a pre-C# 7.1 console application, this can be achieved as simply as this:
public static void Main()
{
string result = TEXT().Result;
Console.WriteLine(result);
}
In this case TEXT can be treated as an ordinary method that returns Task<string>, so its result is available through the Result property. You don't need to deal with the awaiter directly.
At the same time, you should not do this in most other types of applications (WinForms, WPF, ASP.NET etc.), where blocking on Result can deadlock. There you will have to use async/await across your whole application:
public async Task SomeMethod()
{
string result = await TEXT();
// ... do something with result
}
If you plan to do a lot of async in a console application, I recommend using this sort of MainAsync pattern:
static public void Main(string[] args) //Entry point
{
MainAsync(args).GetAwaiter().GetResult();
}
static public async Task MainAsync(string[] args) //Async entry point
{
await TEXT();
Console.WriteLine("finished");
}
If you upgrade to C# 7.1 or later, you can then remove the Main method and use async main.
Or if you ever migrate this code to an ASP.NET or WinForms application, you can ignore Main and migrate the code in MainAsync (otherwise you will run afoul of the synchronization model and get deadlocked).
In C# 7.1+, you can use async Task Main:
static async Task Main(string[] args)
{
var result = await TEXT().ConfigureAwait(false);
Console.ReadKey();
}
For older versions of C#:
public static void Main(string[] args)
{
try
{
TEXT().GetAwaiter().GetResult();
}
catch (Exception ex)
{
Console.WriteLine($"There was an exception: {ex.ToString()}");
}
}

BrowserFetcher exit application using await

I'm using Puppeteer-Sharp to download the HTML of a site. I created a method called GetHtml which returns a string that contains the site content. The problem is that when I call the line await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision);
the application exits without any errors. This is my code:
public class Program
{
public static void Main(string[] args)
{
try
{
new FixtureController().AddUpdateFixtures();
}
catch (Exception ex)
{
new Logger().Error(ex);
}
}
}
public async Task AddFixtures()
{
int monthDays = DateTime.DaysInMonth(DateTime.Now.Year, DateTime.Now.Month);
var days = Enumerable.Range(1, monthDays).Select(x => x.ToString("D2")).ToArray();
HtmlDocument doc = new HtmlDocument(); //this is part of Htmlagilitypack library
foreach (var day in days)
{
//Generate url for this iteration
Uri url = new Uri("somesite/" + day);
var html = await NetworkHelper.GetHtml(url);
doc.LoadHtml(html);
}
}
So each foreach iteration generates a URL from which the data is downloaded, and the method GetHtml should return the HTML, but the application exits (without errors) when it reaches var html = ... This is the code of GetHtml:
public static async Task<string> GetHtml(Uri url)
{
try
{
//here the crash
await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision);
}
catch (Exception e)
{
//No breakpoint point firing
}
await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision);
var browser = await Puppeteer.LaunchAsync(new LaunchOptions
{
Headless = true
});
using (Page page = await browser.NewPageAsync())
{
await page.GoToAsync(url.ToString());
return await page.GetContentAsync();
}
}
Your main method does not wait for the result of your async call. The main method exits, closing the application. To fix it you need to wait for the async method to finish.
If you're using C# 7.1 or newer you can use async Main:
public class Program
{
public static async Task Main()
{
await TestAsync();
}
private static async Task TestAsync()
{
await Task.Delay(5000);
}
}
Otherwise you need to wait synchronously:
public class Program
{
public static void Main()
{
TestAsync().GetAwaiter().GetResult();
}
private static async Task TestAsync()
{
await Task.Delay(5000);
}
}

Run time error and Program exits when using - Async and await using C#

I am trying to use async and await in my program, but the program exits abruptly. I am trying to get the content length from a few random URLs, process it, and display the size in bytes for each URL.
Code:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Net;
namespace TestProgram
{
public class asyncclass
{
public async void MainCall() {
await SumPageSizes();
}
public async Task SumPageSizes(){
List<string> urllist = GetUrlList();
foreach (var url in urllist)
{
byte[] content = await GetContent(url);
Displayurl(content, url);
}
}
private void Displayurl(byte[] content, string url)
{
var length = content.Length;
Console.WriteLine("The bytes length for the url response " + url + " is of :" +length );
}
private async Task<byte[]> GetContent(string url)
{
var content = new MemoryStream();
try
{
var obj = (HttpWebRequest)WebRequest.Create(url);
WebResponse response = obj.GetResponse();
using (Stream stream = response.GetResponseStream())
{
await stream.CopyToAsync(content);
}
}
catch (Exception ex)
{
Console.WriteLine(ex.StackTrace);
}
return content.ToArray();
}
private List<string> GetUrlList()
{
var urllist = new List<string>(){
"http://msdn.microsoft.com/library/windows/apps/br211380.aspx",
"http://msdn.microsoft.com",
"http://msdn.microsoft.com/en-us/library/hh290136.aspx",
"http://msdn.microsoft.com/en-us/library/ee256749.aspx",
"http://msdn.microsoft.com/en-us/library/hh290138.aspx",
"http://msdn.microsoft.com/en-us/library/hh290140.aspx",
"http://msdn.microsoft.com/en-us/library/dd470362.aspx",
"http://msdn.microsoft.com/en-us/library/aa578028.aspx",
"http://msdn.microsoft.com/en-us/library/ms404677.aspx",
"http://msdn.microsoft.com/en-us/library/ff730837.aspx"
};
return urllist;
}
}
}
Main
public static void Main(string[] args)
{
asyncclass asyncdemo = new asyncclass();
asyncdemo.MainCall();
}
MainCall starts the asynchronous work and returns as soon as it hits the first incomplete await. No other line of code follows the call, so Main returns and your program ends.
To wait for it, use:
asyncdemo.MainCall().Wait();
For this to compile, you need to avoid async void and change MainCall to async Task, so that the caller has a task to wait on.
Since this is a console application, note that async and await can only be used on the Main method itself from C# 7.1 onwards; with older compilers you have to block as shown above.
The problem is that you don't await an asynchronous method, and therefore your application exits before the method has finished.
In C# 7.1 and later you can create an async entry point, which lets you use the await keyword.
public static async Task Main(string[] args)
{
asyncclass asyncdemo = new asyncclass();
await asyncdemo.MainCall();
}
If you want to bubble your exceptions from MainCall you need to change the return type to Task.
public async Task MainCall()
{
await SumPageSizes();
}
If you want to run your async code before C# 7.1, you can do the following.
public static void Main(string[] args)
{
asyncclass asyncdemo = new asyncclass();
asyncdemo.MainCall().Wait();
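// Note: this requires MainCall to return a Task. Wrapping an async void MainCall in
// Task.Run(() => asyncdemo.MainCall()).Wait() does not help, because that task
// completes as soon as MainCall reaches its first incomplete await.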
}
You have to be very careful when using async void methods. They cannot be awaited, so the caller has no way of knowing when they finish. One legitimate example of async void is an event handler, such as when you call an awaitable method inside a button click:
private async void Button_Click(object sender, RoutedEventArgs e)
{
// run task here
}
This way the UI won't be stuck waiting for the button click to complete.
On most custom methods you will almost always want to return a Task so that you are able to know when your method is finished.
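As a rough illustration of why returning a Task matters, the sketch below uses a hypothetical WorkAsync method and a Finished flag (both invented for this example). Because WorkAsync returns a Task, the caller can hold on to it and await it, and only after the await is the work guaranteed to be done.

```csharp
using System;
using System.Threading.Tasks;

class CompletionDemo
{
    public static bool Finished;

    // Returns a Task, so the caller can find out when the method is done.
    public static async Task WorkAsync()
    {
        await Task.Delay(100); // stands in for real asynchronous work
        Finished = true;
    }

    static async Task Main()
    {
        Task work = WorkAsync();     // the method has started but not finished
        Console.WriteLine(Finished); // the flag has not been set yet at this point
        await work;                  // wait for completion without blocking a thread
        Console.WriteLine(Finished); // True
    }
}
```

If WorkAsync were declared async void instead, there would be no Task to await, and Main could not tell when Finished becomes true.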
