PuppeteerSharp - redundant Chromium command-line arguments when running in a container with a .NET Core app - c#

I have a task that requires me to create a microservice that uses PuppeteerSharp to take page screenshots. I use the ASP.NET Core Web API project template and launch PuppeteerSharp in the Startup.cs file. Below is the code for that:
public void ConfigureServices(IServiceCollection services)
{
    services.AddControllers();
    // Block until the browser is up. A fire-and-forget Task.Run would leave
    // the variable null at the point where AddSingleton is called.
    Browser puppeteerBrowser = LaunchPuppeteerBrowserAsync().GetAwaiter().GetResult();
    services.AddSingleton(puppeteerBrowser);
}
public static async Task<Browser> LaunchPuppeteerBrowserAsync()
{
    Console.WriteLine("Starting to launch CHROMIUM...");
    // Uncomment to let puppeteer download chromium
    //await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision);
    var browser = await Puppeteer.LaunchAsync(new LaunchOptions
    {
        // Comment to let puppeteer run downloaded chromium
        ExecutablePath = "/usr/bin/chromium",
        Args = new[]{ "--no-sandbox", "--disable-gpu-rasterization", "--disable-remote-extensions" },
        Headless = true,
        // Didn't help
        //IgnoredDefaultArgs = new []{ "--enable-gpu-rasterization", "--enable-remote-extensions", "--load-extension=" }
    });
    Console.WriteLine("CHROMIUM launched successfully");
    return browser;
}
The microservice also has to run in a container on our server. I created a Dockerfile based on Microsoft's standard Docker image for .NET Core applications from Docker Hub. To illustrate my problem, I created a test application and uploaded it to a GitHub repo, which you can find here:
ASP.net Core Web Api project
The repo contains a Dockerfile, which can be used to build the image like so:
*Please note that I'm using Windows 10 and installed Docker on my machine for test purposes.
Docker images have a layered structure. To take advantage of that, I added the download and installation of the Chromium browser as one of the steps in the Dockerfile. That way, when the first request reaches the server after launch, PuppeteerSharp does not have to download the browser before doing its job; it uses the executable that is already installed (the Linux counterpart of a Windows .exe).
That is why I pass an executable path to PuppeteerSharp when launching the browser in the code example above (ExecutablePath = "/usr/bin/chromium").
The container can be started the following way:
Now to the problem itself. First, I open a bash shell inside the container from CMD: docker exec -it e95e6d9fca63 bash
There I run this:
This makes the "ps" command available. I also install the "less" package (apt-get install less).
Then I run ps aux | less to list the processes running inside the container. The output shows the command-line arguments the chromium process was launched with. The ones that bother me are --enable-gpu-rasterization, --enable-remote-extensions and --load-extension=:
/usr/lib/chromium/chromium --show-component-extension-options --enable-gpu-rasterization --no-default-browser-check --disable-pings --media-router=0 --enable-remote-extensions --load-extension= --disable-background-networking --enable-features=NetworkService,NetworkServiceInProcess --disable-background-timer-throttling --disable-backgrounding-occluded-windows --disable-breakpad --disable-client-side-phishing-detection --disable-component-extensions-with-background-pages --disable-default-apps --disable-dev-shm-usage --disable-extensions --disable-features=TranslateUI,BlinkGenPropertyTrees --disable-hang-monitor --disable-ipc-flooding-protection --disable-popup-blocking --disable-prompt-on-repost --disable-renderer-backgrounding --disable-sync --force-color-profile=srgb --metrics-recording-only --no-first-run --enable-automation --password-store=basic --use-mock-keychain --headless --hide-scrollbars --mute-audio about:blank --no-sandbox --disable-gpu-rasterization --disable-remote-extensions --remote-debugging-port=0 --user-data-dir=/tmp/kgkyt1d1.bqe
The trailing --disable-gpu-rasterization --disable-remote-extensions arguments are the ones I set manually in the Args property of LaunchOptions (see the first code example).
I also tried the "IgnoredDefaultArgs" property of LaunchOptions because, according to the library's documentation, the arguments listed there are ignored. You can also see that happening in the source code.
Launcher code - creates a new chromium process, which has a "PrepareChromiumArgs" method
The arguments put into the "IgnoredDefaultArgs" array are removed from the resulting array. But that didn't help: --enable-gpu-rasterization --enable-remote-extensions --load-extension= are still there.
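The filtering described here can be modelled in a few lines. This is a simplified sketch, not PuppeteerSharp's actual code; `ArgsSketch` and `BuildArgs` are hypothetical names. It illustrates that an ignore list can only remove entries from the library's own default list, so a flag that reaches the command line by some other route would not be affected.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified model (an assumption, not the library's real implementation):
// default arguments minus the ignored ones, followed by user arguments.
public static class ArgsSketch
{
    public static string[] BuildArgs(
        IEnumerable<string> defaults,
        IEnumerable<string> ignored,
        IEnumerable<string> userArgs)
    {
        var ignoredSet = new HashSet<string>(ignored);
        return defaults
            .Where(arg => !ignoredSet.Contains(arg)) // drop ignored defaults
            .Concat(userArgs)                        // then append user arguments
            .ToArray();
    }
}
```

In this model, listing "--load-extension=" as ignored has no effect unless that flag is among the defaults passed in.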
The strange thing is that when I comment out the "ExecutablePath" property of "LaunchOptions" and let PuppeteerSharp fetch the browser itself, there are no such redundant arguments. I have checked it.
My guess is that the redundant arguments come from the executable I installed, and that they are added when the library starts it, somewhat like the way Windows lets you add extra command-line arguments in a file's properties. But is this possible in Linux?
Can anyone please help?
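One way to test that guess: on Linux, a launcher path can be a shell script that adds its own flags and then starts the real binary. The ps output above shows /usr/lib/chromium/chromium rather than the configured /usr/bin/chromium, which would be consistent with that. On Debian-based images the chromium package is, as far as is known here, such a script that reads extra flags from files under /etc/chromium.d; treat that as an assumption to verify in the actual image. The sketch below only checks whether a file starts with a "#!" script header or with the ELF magic bytes; `LauncherInspector` and `Classify` are hypothetical names.

```csharp
using System;
using System.IO;

public static class LauncherInspector
{
    // Classifies a file by its first bytes:
    // "script"  - starts with "#!" (interpreted script, e.g. a shell wrapper)
    // "elf"     - starts with 0x7F 'E' 'L' 'F' (native Linux binary)
    // "unknown" - anything else
    public static string Classify(string path)
    {
        byte[] header = new byte[4];
        int read;
        using (FileStream stream = File.OpenRead(path))
        {
            read = stream.Read(header, 0, header.Length);
        }

        if (read >= 2 && header[0] == (byte)'#' && header[1] == (byte)'!')
        {
            return "script";
        }
        if (read == 4 && header[0] == 0x7F && header[1] == (byte)'E'
            && header[2] == (byte)'L' && header[3] == (byte)'F')
        {
            return "elf";
        }
        return "unknown";
    }
}
```

If `LauncherInspector.Classify("/usr/bin/chromium")` reports a script, reading that script should show where the extra flags come from, and pointing ExecutablePath at the real binary would be one thing to try.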

Related

PuppeteerExtraSharp Package No Errors But Browser Is Not Launching

I am trying to get puppeteer stealth working and found the "PuppeteerExtraSharp" package on NuGet. No problems are reported, but the browser does not launch, and because I have to use the async method and never get any errors when it runs, I don't know how to fix the issue. I used the code from the publisher's official GitHub page.
official link:
https://github.com/Overmiind/Puppeteer-sharp-extra
The code:
// Initialization plugin builder
var extra = new PuppeteerExtra();
// Use stealth plugin
extra.Use(new StealthPlugin());
// Launch the puppeteer browser with plugins
var browser = await extra.LaunchAsync(new LaunchOptions()
{
    Headless = false
});
// Create a new page
var page = await browser.NewPageAsync();
await page.GoToAsync("http://google.com");
// Wait 2 seconds
await page.WaitForTimeoutAsync(2000);
// Take the screenshot
await page.ScreenshotAsync("extra.png");
I have also tried setting the path to my Chrome, and likewise to Chromium. That works with the NuGet package "PuppeteerSharp", but it does not work with PuppeteerExtraSharp.
My end goal is to run either playwright stealth or puppeteer stealth with C#.

System.Net.WebClient does not handle cache when launched from SYSTEM account

I am having trouble with a script I am writing that is using System.Net.WebClient (called from Powershell but I guess the problem should occur with everything that is using the same cache as System.Net.WebRequest):
For context (as there may be a better solution than what I found):
I made an extension for IE (yes, some clients still use it) in C# (yes, it's not recommended but I had no choice)
this extension needs to run with EPM activated (so low-privileged).
it needs a configuration file that is available on a server accessed by HTTPS.
the configuration needs to be available when IE is launched so we have to cache it (also, each tab has its own instance of the extension)
that cached configuration has to stay in a privileged folder (the extension injects code into some of the pages according to that configuration, so you don't want the user or any process to have write access to it)
To solve the problem of caching the configuration, I wrote a Powershell script that is launched through the task scheduler. The script uses System.Net.WebClient to download the file, and I set it to respect the cache of the file:
$webclient = New-Object System.Net.WebClient
$cacheLevel = [System.Net.Cache.RequestCacheLevel]::CacheIfAvailable
$webclient.CachePolicy = New-Object System.Net.Cache.RequestCachePolicy($cacheLevel)
When I launch the script using "Run As Administrator", the cache is respected (providing the server is well configured).
When I launch the script from the task scheduler (user NT AUTHORITY\SYSTEM, as I need privilege to be able to save the file in the extension installation dir), the cache is not respected and the file is downloaded every single time.
Any idea how to solve this issue? I need the caching to be able to poll the file without doing a full download (the file is small, but the number of users is high :D).
Maybe it would be possible to use the date of the file that was previously downloaded?
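The file-date idea maps onto an HTTP conditional GET: send the local file's last-write time in an If-Modified-Since header and treat a 304 Not Modified response as "nothing to download". Below is a minimal C# sketch of that, assuming the server honours If-Modified-Since and sends a Last-Modified header; `ConditionalDownload`, `BuildRequest` and `DownloadIfChangedAsync` are hypothetical names, and it uses HttpClient rather than the WebClient from the script.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class ConditionalDownload
{
    // Builds a GET request that carries the local file's timestamp, if any.
    public static HttpRequestMessage BuildRequest(string url, string localPath)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        if (File.Exists(localPath))
        {
            request.Headers.IfModifiedSince =
                new DateTimeOffset(File.GetLastWriteTimeUtc(localPath));
        }
        return request;
    }

    // Returns true if a new copy was written, false if the server answered 304.
    public static async Task<bool> DownloadIfChangedAsync(
        HttpClient client, string url, string localPath)
    {
        using (HttpRequestMessage request = BuildRequest(url, localPath))
        using (HttpResponseMessage response = await client.SendAsync(request))
        {
            if (response.StatusCode == HttpStatusCode.NotModified)
            {
                return false;
            }
            response.EnsureSuccessStatusCode();
            byte[] body = await response.Content.ReadAsByteArrayAsync();
            File.WriteAllBytes(localPath, body);

            // Keep the server's timestamp so the next poll compares like with like.
            DateTimeOffset? lastModified = response.Content.Headers.LastModified;
            if (lastModified.HasValue)
            {
                File.SetLastWriteTimeUtc(localPath, lastModified.Value.UtcDateTime);
            }
            return true;
        }
    }
}
```

This approach does not rely on the client-side cache at all, so it should behave the same regardless of which account runs the task.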

Use PuppeteerSharp in AWS Lambda function

I have a skill for the Amazon Echo. The lambda function is written in C#/.Net Core, and hosted on AWS. I now need to extend it with functionality that uses PuppeteerSharp. My first problem with this was that it downloads chromium and puts it in your application's directory by default. This gave me an error, as the filesystem is read only. I then tried the package HeadlessChromium.Puppeteer.Lambda.Dotnet, but my resulting skill was too large to upload as a lambda function, because it now had chromium embedded.
I'm now trying to override the default Puppeteer download path using BrowserFetcherOptions. This lets me download Chromium on AWS to the system temp path, but then Puppeteer can't find it, so I need to add ExecutablePath to the LaunchOptions object when I launch Puppeteer. I've tested this locally under Windows and it works; I now just need to write some general code to find the Chromium executable under AWS.
Is there a better way than searching for either chrome or chrome.exe under my download path (I need it to run under windows as well as AWS for the unit tests)? It feels like if Puppeteer provides this pair of options then there must be a better way to use them?
Is there a better way to use Puppeteer in a lambda function than what I'm attempting?
Here is the windows code that is currently working, with the chrome path hard coded:
string cpath = System.IO.Path.Combine(System.IO.Path.GetTempPath(), "chromium");
var bfopt = new BrowserFetcherOptions() { Path = cpath };
await new BrowserFetcher(bfopt).DownloadAsync(BrowserFetcher.DefaultRevision);
var browser = await Puppeteer.LaunchAsync(new LaunchOptions {
    ExecutablePath = System.IO.Path.Combine(cpath, "Win64-674921\\chrome-win\\chrome.exe"),
    Headless = true
});
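For reference, the search-based fallback mentioned above could look like the sketch below. It only assumes that the executable is named chrome.exe on Windows and chrome elsewhere, as stated in the question; `ChromiumLocator` and `FindExecutable` are hypothetical names. If the PuppeteerSharp version in use exposes the executable path on the object returned by DownloadAsync, that would be the more direct route, but this has not been confirmed here.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Runtime.InteropServices;

public static class ChromiumLocator
{
    // Searches downloadRoot recursively for the browser executable.
    // Returns the first match in ordinal path order, or null if none is found.
    public static string FindExecutable(string downloadRoot)
    {
        string fileName = RuntimeInformation.IsOSPlatform(OSPlatform.Windows)
            ? "chrome.exe"
            : "chrome";

        if (!Directory.Exists(downloadRoot))
        {
            return null;
        }

        return Directory
            .EnumerateFiles(downloadRoot, fileName, SearchOption.AllDirectories)
            .OrderBy(path => path, StringComparer.Ordinal)
            .FirstOrDefault();
    }
}
```

The result of `ChromiumLocator.FindExecutable(cpath)` could then be assigned to ExecutablePath in place of the hard-coded Windows path.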

How do I make the --viewport-size argument available in TuesPechkin?

I have a web application where I want to add a "Save as PDF" button on each page that when pressed will automatically convert the current web page to a PDF document available for download in the browser.
I have added TuesPechkin (v2.1.1) and TuesPechkin.Wkhtmltox.AnyCPU (v0.12.4.1) to my project references from NuGet. I got the PDF download working but it wasn't quite generating correctly since my site uses bootstrap (the PDF always exports in mobile view). I found that adding --viewport-size 1024x768 gives me the desired solution and was able to test this (outside the web app) with the latest wkhtmltopdf.exe (0.12.5) and the following command line:
wkhtmltopdf.exe --viewport-size 1024x768 -O Landscape -s A3 http://localhost/mywebpage output.pdf
When I look for the viewport-size argument in TuesPechkin, it is not available. This issue was created several years ago https://github.com/tuespetre/TuesPechkin/issues/122 and the author explained that it had not been added because the latest version of wkhtmltopdf at the time (0.12.4) did not expose that property in its API. Since then it has been added (0.12.5) https://github.com/wkhtmltopdf/wkhtmltopdf/issues/2609
So... I cloned the TuesPechkin.Wkhtmltox.AnyCPU repo https://github.com/cratu/TuesPechkin. I downloaded the latest wkhtmltopdf 32bit and 64bit dll's. I used 7-Zip to add these to gz files and updated the files in the TuesPechkin.Wkhtmltox.AnyCPU repo. I then added the following code to the GlobalSettings.cs file:
[WkhtmltoxSetting("viewportSize")]
public string ViewportSize { get; set; }
I could then set this property from my web application code:
var document = new HtmlToPdfDocument
{
    GlobalSettings =
    {
        DocumentTitle = this.Ets.Pages.Title,
        PaperSize = PaperKind.A3,
        Orientation = GlobalSettings.PaperOrientation.Landscape,
        ViewportSize = "1024x768"
    },
    Objects = {
        new ObjectSettings { PageUrl = this.Request.Url.AbsoluteUri }
    }
};
However, the PDF export from my web application still does not match the output of the exe. The new ViewportSize property does not seem to have any effect at all. What am I missing?
Update
I created a .NET console application and the ViewportSize property is working. I checked the correct dll's are being used by the web app but the ViewportSize property still doesn't work like it does in the console app. I restarted IIS & the application pool but still the same result.

Self-installing a service that executes with parameters

I have a console application that can optionally install itself as a service. This works fine, but I'd like to embed some arguments into the service startup, similar to (for example) Google's Update Service (which has the parameter /medsvc).
So let's say I'd like my service to start
MyService.exe RUN Test1
.. so that'd start up MyService.exe with the parameters RUN and Test1.
I can install the service fine, using
ManagedInstallerClass.InstallHelper(new[] {Assembly.GetExecutingAssembly().Location});
However, there's no parameters on the service. So if I try:
ManagedInstallerClass.InstallHelper(new[] {Assembly.GetExecutingAssembly().Location +" RUN Test1"});
I get a FileNotFoundException. Given that it's an array, I thought I'd try:
ManagedInstallerClass.InstallHelper(new[] {Assembly.GetExecutingAssembly().Location,"RUN","Test1"});
.. which gives the same exception, except that it's trying to find the file RUN now.
I can't find any specific documentation on how to achieve this - does anyone know if it is possible to embed parameters in with the service executable path? As another example, here's Google's Update Service with parameters - I'd like to ultimately achieve the same.
It took me a while to find this out, I hope it's still useful to someone.
First I found out that you are not supposed to run ManagedInstallerClass.InstallHelper according to the MSDN docs:
This API supports the product infrastructure and is not intended to be used directly from your code.
Then I found out I could just use my own ProjectInstaller (a component class I added containing a Service Installer and a Service Process Installer) to install the service like this:
ProjectInstaller projectInstaller = new ProjectInstaller();
string[] cmdline = { string.Format("/assemblypath={0} \"/myParam\"", Assembly.GetExecutingAssembly().Location) };
projectInstaller.Context = new InstallContext(null, cmdline);
System.Collections.Specialized.ListDictionary state = new System.Collections.Specialized.ListDictionary();
projectInstaller.Install(state);
Be sure to encapsulate your parameters in quotes and escape the quotes, otherwise your parameters will become part of the executable path and fail to start.
The end result will be a new service with the specified properties in your Service Installer and Service Process Installer, and a path just like in your screenshot (with the /medsvc parameter for example).
I use a console application inside my windows service. The Main method in Program.cs processes the command line args. The OnStart method starts the console application. It works great.
Windows Service to Run Constantly
HybridService Easily Switch Between Console Application and Service
Only parameters before location are being passed into the context for the installer.
Try this:
args = new[] { "/ServiceName=WinService1", Assembly.GetExecutingAssembly().Location };
ManagedInstallerClass.InstallHelper(args);
Reference from another answer: Passing Parameter Collection to Service via InstallHelper
