How to make asynchronous calls using HtmlAgilityPack? - c#

I'm trying to get the table with id table-matches available here. The problem is that the table is loaded via AJAX, so I don't get the full HTML when I download the page:
string url = "";
using (HttpClient client = new HttpClient())
using (HttpResponseMessage response = client.GetAsync(url).Result)
using (HttpContent content = response.Content)
{
    string result = content.ReadAsStringAsync().Result;
}
The HTML returned does not contain any table. To check whether the library was at fault, I turned JavaScript off in Chrome (from the F12 developer console) and got the same result in the browser.
To fix this problem I thought of using a WebBrowser control, in particular:
HtmlElementCollection elements = webBrowser.Document.GetElementsByTagName("table");
But I'd like to ask whether I can also load the full HTML by making asynchronous calls. Has anyone encountered a similar problem?
Could you please share a solution? Thanks.

The main issue with this page is that the content inside table-matches is loaded via AJAX. Neither HttpClient nor HtmlAgilityPack can wait for the AJAX calls to be executed, so you need a different approach.
Approach #1 - Use any headless browser like PuppeteerSharp
using PuppeteerSharp;
using System;
using System.Threading.Tasks;
namespace PuppeteerSharpDemo
{
    class Program
    {
        private static String url = "";
        static void Main(string[] args)
        {
            var htmlAsTask = LoadAndWaitForSelector(url, "#table-matches .table-main");
        }
        public static async Task<string> LoadAndWaitForSelector(String url, String selector)
        {
            var browser = await Puppeteer.LaunchAsync(new LaunchOptions
            {
                Headless = true,
                ExecutablePath = @"c:\Program Files (x86)\Google\Chrome\Application\chrome.exe"
            });
            using (Page page = await browser.NewPageAsync())
            {
                await page.GoToAsync(url);
                // Wait until the AJAX-loaded table is present in the DOM
                await page.WaitForSelectorAsync(selector);
                return await page.GetContentAsync();
            }
        }
    }
}
For brevity, I've posted the output here. Once you have the HTML content, you can parse it with HtmlAgilityPack.
Approach #2 - Use pure Selenium WebDriver. Can be launched in headless mode.
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;
using System;
namespace SeleniumDemo
{
    class Program
    {
        private static IWebDriver webDriver;
        private static TimeSpan defaultWait = TimeSpan.FromSeconds(10);
        private static String targetUrl = "";
        private static String driversDir = @"../../Drivers/";
        static void Main(string[] args)
        {
            webDriver = new ChromeDriver(driversDir);
            webDriver.Navigate().GoToUrl(targetUrl);
            IWebElement table = webDriver.FindElement(By.Id("table-matches"));
            var innerHtml = table.GetAttribute("innerHTML");
        }
        #region (!) I didn't even use this, but it can be useful (!)
        public static IWebElement FindElement(By by)
        {
            try
            {
                var wait = new WebDriverWait(webDriver, defaultWait);
                return wait.Until(driver => driver.FindElement(by));
            }
            catch
            {
                return null;
            }
        }
        public static void WaitForAjax()
        {
            var wait = new WebDriverWait(webDriver, defaultWait);
            // Assumes the page uses jQuery: jQuery.active is the number of pending AJAX requests
            wait.Until(d => (bool)(d as IJavaScriptExecutor).ExecuteScript("return jQuery.active == 0"));
        }
        #endregion
    }
}
Approach #3 - Simulate ajax requests
If you analyse the page loading using Fiddler or browser's profiler (F12) you can see that all data is coming with these two requests:
So you can try to execute them directly using HttpClient. But in this case you may need to track authorization headers and maybe something else with each HTTP request.
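A minimal sketch of Approach #3, assuming you have already identified the AJAX endpoint and the headers it expects in Fiddler or the browser's network tab. The endpoint URL, the X-Requested-With header, the Referer value, and the AjaxReplay class name are placeholders for illustration, not values taken from the actual page.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class AjaxReplay
{
    // Builds a GET request that mimics what the browser sends for an AJAX call.
    // The headers below are assumptions; copy the real ones from the captured request.
    public static HttpRequestMessage BuildRequest(string endpointUrl, string refererUrl)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, endpointUrl);
        request.Headers.TryAddWithoutValidation("X-Requested-With", "XMLHttpRequest");
        request.Headers.Referrer = new Uri(refererUrl);
        return request;
    }

    // Sends the request and returns the raw response body (an HTML fragment or JSON).
    public static async Task<string> FetchAsync(HttpClient client, string endpointUrl, string refererUrl)
    {
        using (var request = BuildRequest(endpointUrl, refererUrl))
        using (var response = await client.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

If the response turns out to be an HTML fragment, the returned string can be passed to HtmlAgilityPack's HtmlDocument.LoadHtml; if it is JSON, it needs a JSON parser instead.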


Add BackgroundImage with EPPlus only allows path but cannot get path in Blazor WASM

This may not be 100% an EPPlus issue, but since this is Blazor WASM it appears I cannot get the file path to a static image in the wwwroot/images folder. I can get the URL and paste it into a browser and that works, and even adding that same path to the src attribute of an img works, but neither of those helps me.
FYI "background" in this context means a watermark.
It appears that the EPPlus dev team only wants a drive path to the file (e.g. C:\SomeFolder\SomeFile.png), and I don't see how to get that within Blazor WASM. I can get the bytes of the file in C#, and even a stream, but no direct path.
My code is the following:
using (var package = new ExcelPackage(fileName))
var sheet = package.Workbook.Worksheets.Add(exportModel.OSCode);
This returns an exception:
Unhandled exception rendering component: Can't find file /https:/localhost:44303/images/Draft.png
Noticing that leading / I even tried:
Which returned the same error:
Unhandled exception rendering component: Can't find file /images/Draft.png
So I think I need one of two possible answers:
A way to get a local drive path to the file so the .SetFromFile method is not going to error.
To have a way to set that BackgroundImage property with a byte array or stream of the image. There is this property BackgroundImage.Image but it is readonly.
Thanks to a slap in the face from @Panagiotis-Kanavos I wound up taking the processing out of the client and moving it to the server. With that, I was able to use Static Files to add the watermark with relatively little pain.
In case anyone may need the full solution (which I always find helpful) here it is:
Here is the code within the button click on the Blazor component or page:
private async Task GenerateFile(bool isFinal)
{
    var fileStream = await excelExportService.ProgramMap(exportModel);
    var fileName = "SomeFileName.xlsx";
    using var streamRef = new DotNetStreamReference(stream: fileStream);
    await jsRuntime.InvokeVoidAsync("downloadFileFromStream", fileName, streamRef);
}
That calls a client-side service that really just passes control over to the server:
public class ExcelExportService : IExcelExportService
{
    private const string baseUri = "api/excel-export";
    private readonly IHttpService httpService;
    public ExcelExportService(IHttpService httpService)
    {
        this.httpService = httpService;
    }
    public async Task<Stream> ProgramMap(ProgramMapExportModel exportModel)
    {
        return await httpService.PostAsJsonForStreamAsync<ProgramMapExportModel>($"{baseUri}/program-map", exportModel);
    }
}
Here is the server-side controller that catches the call from the client:
public class ExcelExportController : ControllerBase
{
    private readonly ExcelExportService excelExportService;
    public ExcelExportController(ExcelExportService excelExportService)
    {
        this.excelExportService = excelExportService;
    }
    public async Task<Stream> ProgramMap([FromBody] ProgramMapExportModel exportModel)
    {
        return await excelExportService.ProgramMap(exportModel);
    }
}
And that in-turn calls the server-side service where the magic happens:
public async Task<Stream> ProgramMap(ProgramMapExportModel exportModel)
var result = new MemoryStream();
ExcelPackage.LicenseContext = LicenseContext.Commercial;
var fileName = @$"Gets Overwritten";
using (var package = new ExcelPackage(fileName))
var sheet = package.Workbook.Worksheets.Add(exportModel.OSCode);
if (!exportModel.IsFinal)
var pathToDraftImage = @$"{Directory.GetCurrentDirectory()}\StaticFiles\Images\Draft.png";
result.Position = 0; // Without this, data does not get written
return result;
For some reason, this next method was not needed when doing this on the client-side but now that it is back here, I had to add a method that returned a stream specifically and used the ReadAsStreamAsync instead of ReadAsJsonAsync:
public async Task<Stream> PostAsJsonForStreamAsync<TValue>(string requestUri, TValue value, CancellationToken cancellationToken = default)
{
    Stream result = default;
    try
    {
        var responseMessage = await httpClient.PostAsJsonAsync(requestUri, value, cancellationToken);
        result = await responseMessage.Content.ReadAsStreamAsync(cancellationToken: cancellationToken);
    }
    catch (HttpRequestException e) { /* handling not shown in the original */ }
    return result;
}
Lastly, in order for it to give the end-user a download link, this was used (taken from the Microsoft Docs):
window.downloadFileFromStream = async (fileName, contentStreamReference) => {
    const arrayBuffer = await contentStreamReference.arrayBuffer();
    const blob = new Blob([arrayBuffer]);
    const url = URL.createObjectURL(blob);
    const anchorElement = document.createElement("a");
    anchorElement.href = url;
    anchorElement.download = fileName ?? "";
    anchorElement.click();
    anchorElement.remove();
    URL.revokeObjectURL(url);
}

System.NullReferenceException when reading browser log with selenium

I am using C# with Selenium's ChromeDriver. When I try to read the browser console log with Selenium, I get:
System.NullReferenceException: 'Object reference not set to an instance of an object.'
private void button1_Click(object sender, EventArgs e)
ChromeOptions options = new ChromeOptions();
options.SetLoggingPreference(LogType.Browser, LogLevel.Warning);
IWebDriver driver = new ChromeDriver(options);
driver.Url = "";
var entries = driver.Manage().Logs.GetLog(LogType.Browser); // System.NullReferenceException
foreach (var entry in entries)
This is my solution until Selenium 4 is out (will work also with Selenium 4).
It is quick and dirty and was designed to demonstrate how it can be done. Feel free to alter and improve it.
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Remote;
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Reflection;
using System.Text;
namespace GetChromeConsoleLog
{
    internal static class Program
    {
        private static void Main()
        {
            // setup options
            var options = new ChromeOptions();
            options.SetLoggingPreference(LogType.Browser, LogLevel.All);
            // do whatever actions
            var driver = new ChromeDriver(options)
            {
                Url = ""
            };
            var logs = driver.GetBrowserLogs();
            // extract logs (using the extension method GetBrowserLogs)
            foreach (var log in driver.GetBrowserLogs())
            {
                Console.WriteLine($"{log["timestamp"]}: {log["message"]}");
            }
            // cleanup
            // hold console
            Console.WriteLine("Press any key to exit...");
        }
    }
    public static class WebDriverExtensions
    {
        public static IEnumerable<IDictionary<string, object>> GetBrowserLogs(this IWebDriver driver)
        {
            // not a chrome driver
            if (driver.GetType() != typeof(ChromeDriver))
                return Array.Empty<IDictionary<string, object>>();
            // setup
            var endpoint = GetEndpoint(driver);
            var session = GetSession(driver);
            var resource = $"{endpoint}session/{session}/se/log";
            const string jsonBody = @"{""type"":""browser""}";
            // execute
            using (var httpClient = new HttpClient())
            {
                var content = new StringContent(jsonBody, Encoding.UTF8, "application/json");
                var response = httpClient.PostAsync(resource, content).GetAwaiter().GetResult();
                var responseBody = response.Content.ReadAsStringAsync().GetAwaiter().GetResult();
                return AsLogEntries(responseBody);
            }
        }
        private static string GetEndpoint(IWebDriver driver)
        {
            // setup
            const BindingFlags Flags = BindingFlags.Instance | BindingFlags.NonPublic;
            // get RemoteWebDriver type
            var remoteWebDriver = GetRemoteWebDriver(driver.GetType());
            // get this instance executor > get this instance internalExecutor
            var executor = remoteWebDriver.GetField("executor", Flags).GetValue(driver) as ICommandExecutor;
            // get URL
            var uri = executor.GetType().GetField("remoteServerUri", Flags).GetValue(executor) as Uri;
            // result
            return uri.AbsoluteUri;
        }
        private static Type GetRemoteWebDriver(Type type)
        {
            if (!typeof(RemoteWebDriver).IsAssignableFrom(type))
                return type;
            while (type != typeof(RemoteWebDriver))
                type = type.BaseType;
            return type;
        }
        private static SessionId GetSession(IWebDriver driver)
        {
            if (driver is IHasSessionId id)
                return id.SessionId;
            return new SessionId($"gravity-{Guid.NewGuid()}");
        }
        private static IEnumerable<IDictionary<string, object>> AsLogEntries(string responseBody)
        {
            // setup
            var value = $"{JToken.Parse(responseBody)["value"]}";
            return JsonConvert.DeserializeObject<IEnumerable<Dictionary<string, object>>>(value);
        }
    }
}
get logs
I have previously used the property above but I cannot get it to work currently with the ChromeDriver 75+. I found issues related to it reported here.
This issue was supposedly fixed as it was reported in GitHub issue #7323 here, but I have actually attempted to test this fix in ChromeDriver Nuget version 77.0.3865.4000 and it still proves to be an issue.
Further experimentation with the newer ChromeDriver version 78.0.3904.7000 (the latest stable version at the time of writing) shows that the issue still exists.
I have also experimented with using workarounds provided in other Selenium issue #7335 here back in September, and while this workaround does allow the driver to be instantiated, the logs are still inaccessible (and null).
Workaround when creating chromedriver instance: typeof(CapabilityType).GetField(nameof(CapabilityType.LoggingPreferences), BindingFlags.Static | BindingFlags.Public).SetValue(null, "goog:loggingPrefs");
Based on what MatthewSteeples said in that issue (see quote below), the fix is in place just not yet fully released to Nuget. Hopefully it will come in with the next release.
"The issue has been resolved but the fix is not (yet) available on NuGet so you'll need to roll your own if you need it before the next release is out" - MatthewSteeples September 25'th 2019
Edit: It may be worth mentioning the reason for using an older ChromeDriver Nuget is so that running automated tests locally and in the Hosted Azure Devops release pipeline is possible without manually modifying the nuget version locally.

Selenium. Bring-up window on the front

If you run the following code, the browser window is brought to the front and takes focus on every iteration of the loop.
public class Program
{
    private static void Main()
    {
        var driver = new ChromeDriver();
        for (int i = 0; i < 100; i++)
        {
            var ss = ((ITakesScreenshot)driver).GetScreenshot();
        }
    }
}
The question is: why does this happen, and can it be turned off? Headless mode is not an option for me.
It seems that this always happens when Selenium needs to save / read the file or start the process.
To take a screenshot, chromedriver activates the window. It's by design and there's no option to avoid it even though it's technically possible.
For the relevant sources have a look at
You could however avoid the effect by moving the window off-screen:
driver.Manage().Window.Position = new Point(-32000, -32000);
or by launching the browser off-screen:
var options = new ChromeOptions();
You can avoid the activation by taking the screenshot directly via the DevTools API. Here's a class to override GetScreenshot:
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Remote;
using JObject = System.Collections.Generic.Dictionary<string, object>;
class ChromeDriverEx : ChromeDriver
{
    public ChromeDriverEx(ChromeOptions options = null)
        : base(options ?? new ChromeOptions()) {
        var repo = base.CommandExecutor.CommandInfoRepository;
        repo.TryAddCommand("send", new CommandInfo("POST", "/session/{sessionId}/chromium/send_command_and_get_result"));
    }
    public new Screenshot GetScreenshot() {
        object response = Send("Page.captureScreenshot", new JObject {{"format", "png"}, {"fromSurface", true}});
        string base64 = (string)((JObject)response)["data"];
        return new Screenshot(base64);
    }
    protected object Send(string cmd, JObject args) {
        return this.Execute("send", new JObject {{"cmd", cmd}, {"params", args}}).Value;
    }
}
var driver = new ChromeDriverEx();
driver.Url = "";
When you invoke the Navigate().GoToUrl("url") method from your automation script, the script is expected to interact with elements on the web page, and for Selenium to interact with those elements it needs focus. Hence opening the browser, bringing it to the front, and giving it focus is the default behaviour of Navigate().GoToUrl("url").
Whether the browser runs in default mode or headless mode is controlled by the ChromeOptions/FirefoxOptions instance that is passed as an argument when the WebDriver instance is initialized. So Navigate().GoToUrl("url") has no influence on the mode of operation, i.e. default mode or headless mode.
Now consider the ITakesScreenshot.GetScreenshot method of the ITakesScreenshot interface, which is defined as:
Gets a Screenshot object representing the image of the page on the screen.
A WebDriver instance that implements ITakesScreenshot makes a best effort, depending on the browser, to return the following, in order of preference:
Entire page
Current window
Visible portion of the current frame
The screenshot of the entire display containing the browser
There may be some instances when the browser loses focus. In that case you can use IJavaScriptExecutor to regain the focus as follows:
((IJavaScriptExecutor)driver).ExecuteScript("window.focus();");
I was struggling with an issue where the generic GetScreenshot() caused the browser to lose focus during parallel testing. Some elements were being removed from the DOM and my tests were failing. I've come up with a working solution for Edge and Chrome 100+ with Selenium 4.1:
public Screenshot GetScreenshot()
{
    IHasCommandExecutor executor = webDriverInstance as IHasCommandExecutor;
    var sessionId = ((WebDriver)webDriverInstance).SessionId;
    var command = new HttpCommandInfo(HttpCommandInfo.PostCommand, $"/session/{sessionId}/chromium/send_command_and_get_result");
    executor.CommandExecutor.TryAddCommand("Send", command);
    var response = Send(executor, "Page.captureScreenshot", new JObject { { "format", "png" }, { "fromSurface", true } });
    var base64 = ((Dictionary<string, object>)response.Value)["data"];
    return new Screenshot(base64.ToString());
}
private Response Send(IHasCommandExecutor executor, string cmd, JObject args)
{
    var json = new JObject { { "cmd", cmd }, { "params", args } };
    var command = new Command("Send", json.ToString());
    return executor.CommandExecutor.Execute(command);
}

Explicit waits in Selenium C# doesn't work . What is wrong?

So I have this issue with explicit waits. I don't want to use Thread.Sleep(). This is a simple test that opens a page and then goes back and forward. The page takes about 2-3 seconds to load and I want to wait for it dynamically. I hope I'm not being too confusing. I did a lot of research but nothing works, so maybe I am doing something wrong. (I'm using ReSharper to run the unit tests.)
Here is my solution as well:
I am using an extension to the FindElement method, so I believed it would be enough to call this method and let it wait by itself. I have put some explanation in comments in the solution. I would appreciate it if somebody could help me. (Sorry for my imperfect English.)
using System;
using System.Threading;
using ConsoleApplication2.Extensions;
using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;
namespace ConsoleApplication2
class UnitTest1
private IWebDriver driver = new ChromeDriver();
public void Initialize()
driver.Url = "some url";
// driver.Manage().Timeouts().ImplicitlyWait(TimeSpan(20));
public void CheckBackForward()
//Go to first page in Online Help
// So if I am commenting this Thread.Sleep
// it will throw an exception at the extension method at line 13 at "@by"
// I've also checked this without extension method but still the same problem. It doesn't wait at all, it will throw this
// exception as soon as FindElement method is called.
// I know that I shouldn't mix explicit waits with implicit ones.
//Store title
var title_text = driver.FindElement(By.XPath("//div[@id='shellAreaContent']/div/div[2]/ol/li[3]/span"),60).Text;
//Check if Back is enabled
//Go back
//Check if Forward is enabled
//Go forward
//Store title
var title_text2 = driver.FindElement(By.XPath("//div[@id='shellAreaContent']/div/div[2]/ol/li[3]/span")).Text;
//Check if you are on the same page
public void EndTest()
And here is the extension:
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;
namespace ConsoleApplication2.Extensions
{
    public static class Extension
    {
        public static IWebElement FindElement(this IWebDriver driver, By by, int timeoutInSeconds)
        {
            if (timeoutInSeconds <= 0) return driver.FindElement(@by);
            var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(timeoutInSeconds));
            return wait.Until(drv => drv.FindElement(@by));
        }
    }
}
I have been coding with Selenium for 6+ months and I had the same problem as you. I created this extension method and it works for me every time.
What the code does is:
For up to 20 seconds, it checks every 500 ms whether the element is present on the page. If the element is still not found after 20 seconds, it throws an exception.
This will help you make a dynamic wait.
public static class SeleniumExtensionMethods {
public static WebDriverWait wait = new WebDriverWait(driver, TimeSpan.FromSeconds(20));
public static void SafeClick(this IWebElement webElement) {
try {
} catch (TargetInvocationException ex) {
and then replace this code of yours:
IWebElement x = driver.FindElement(By.XPath("//span/ul/li/span/a"));
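The polling behaviour described in this answer (check at a fixed interval until a timeout, then throw) can be sketched without Selenium. This is only an illustration of the idea, not WebDriverWait's actual implementation: the Polling class and WaitUntil method are hypothetical names, and unlike Selenium this sketch treats every exception from the condition as "not ready yet".

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class Polling
{
    // Repeatedly evaluates `condition` until it returns a non-null result
    // or `timeout` elapses. Exceptions thrown by `condition` are treated as
    // "not ready yet" and remembered for the final error message.
    public static T WaitUntil<T>(Func<T> condition, TimeSpan timeout, TimeSpan pollInterval) where T : class
    {
        var stopwatch = Stopwatch.StartNew();
        Exception lastError = null;
        while (true)
        {
            try
            {
                var result = condition();
                if (result != null) return result;
            }
            catch (Exception ex)
            {
                lastError = ex;
            }
            if (stopwatch.Elapsed >= timeout)
                throw new TimeoutException($"Condition not met within {timeout.TotalSeconds} s", lastError);
            Thread.Sleep(pollInterval);
        }
    }
}
```

With Selenium, the condition would be something like () => driver.FindElement(by), with a 20-second timeout and a 500 ms interval, which is roughly the loop that WebDriverWait.Until runs for you.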

Trying to get GitHub repo via Octokit

I'm building a simple tool for downloading .lua files from public GitHub repos via a link given by the user. I've started learning async methods, so I wanted to test myself.
It's a console application (for now). The ultimate goal is to get .lua files in a repo and ask the user which ones he wants downloaded, but I'll be happy if I connect to GH for now.
I'm using Octokit, a GitHub API integration for .NET.
This is the reduced code; I removed some unimportant stuff:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Octokit;
namespace GetThemLuas
class Program
static readonly GitHubClient Github = new GitHubClient(new ProductHeaderValue ("Testing123"), new Uri(""));
static void Main(string[] args)
Console.WriteLine("Welcome to GitHub repo downloader");
private static async void GetRepoTry4()
Console.WriteLine("Searching for data"); //returns here... code below is never ran
var searchResults = await Github.Search.SearchRepo(new SearchRepositoriesRequest("octokit"));
if (searchResults != null)
foreach (var result in searchResults.Items)
Console.WriteLine("Fetching data...."); //testing search
var myrepo = await Github.Repository.Get("Haacked", "");
Console.WriteLine("Done! :)");
Console.WriteLine("Repo loaded successfully!");
Console.WriteLine("Repo owner: " + myrepo.Owner);
Console.WriteLine("Repo ID: " + myrepo.Id);
Console.WriteLine("Repo Date: " + myrepo.CreatedAt);
catch (Exception e)
Console.WriteLine("Ayyyy... troubles"); //never trigged
The problem is the await keyword, as it terminates the method and returns.
I'm still learning async methods so it's possible I messed something up, but even ReSharper says it's fine.
I used var to replace the Task<T> stuff. It seems OK to me, with no warnings or errors.
I fixed the await issue. Now that I can finally connect to GH, trying to get the repo threw an exception at both calls to GH (tested by commenting out first one call, then the other). e.Message was a huge block of text.
I logged it into a file and it looks like an HTML document.
Change GetRepoTry4(); to Task.Run(async () => { await GetRepoTry4(); }).Wait(); and private static async void GetRepoTry4() to private static async Task GetRepoTry4().
This should get you at least wired up correctly enough to start debugging the real issue.
Generally speaking all async methods need to return a Task or Task<T> and all methods that return a Task or Task<T> should be async. Additionally, you should get your code into the dispatcher as quickly as possible and start using await.
The constructor with the Uri overload is intended for use with GitHub Enterprise installations, e.g:
static readonly GitHubClient Github = new GitHubClient(new ProductHeaderValue ("Testing123"), new Uri(""));
If you're just using it to connect to GitHub, you don't need to specify this:
static readonly GitHubClient Github = new GitHubClient(new ProductHeaderValue ("Testing123"));
You're seeing a HTML page because the base address is incorrect - all of the API-related operations use, which is the default.
Install the Octokit NuGet package for GitHub, then add the function below:
public JsonResult GetRepositoryDeatil(long id)
{
    var client = new GitHubClient(new ProductHeaderValue("demo"));
    var tokenAuth = new Credentials("xxxxxxx"); // NOTE: not real token
    client.Credentials = tokenAuth;
    var content = client.Repository.Content.GetAllContents(id).Result;
    List<RepositoryContent> objRepositoryContentList = content.ToList();
    return Json(objRepositoryContentList, JsonRequestBehavior.AllowGet);
}
Due to the use of the async/await you should change the definition of the method GetRepoTry4 to the following:
private static async Task GetRepoTry4()
Then in the Main method call it like so GetRepoTry4().Wait();. This will enable the method GetRepoTry4() to be awaited.
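As a minimal sketch of the pattern the answers describe, here is an async method that returns Task instead of void, plus a synchronous entry point that blocks until it completes. The AsyncDemo class and FetchNameAsync are hypothetical stand-ins; Task.Delay takes the place of the real GitHub call.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncDemo
{
    // Stand-in for a real network call such as a GitHub API request.
    public static async Task<string> FetchNameAsync()
    {
        await Task.Delay(50);
        return "done";
    }

    // Returns Task (not void), so the caller can wait for completion
    // and observe any exception thrown inside.
    public static async Task RunAsync()
    {
        var name = await FetchNameAsync();
        Console.WriteLine("Result: " + name);
    }

    // Synchronous wrapper that blocks until the async work has finished,
    // suitable for calling from a non-async Main.
    public static void Run()
    {
        RunAsync().GetAwaiter().GetResult();
    }
}
```

Calling AsyncDemo.Run() from Main keeps the process alive until the awaited work finishes, which is what GetRepoTry4().Wait() achieves in the question's code. On C# 7.1 or later, Main can instead be declared as static async Task Main() and await the method directly.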
