Has anyone seen any documentation on the WebView2 DevToolsProtocolHelper?
In another question I asked (How do I programmatically add a file to a fileupload control from a windows form to a webpage) it was suggested that I download and use the Microsoft.Web.WebView2.DevToolsProtocolExtensions. At first it seemed like it was going to be very straight forward to use but not so much.
Win forms App using c# and webview2
DevToolsProtocolHelper helper = webView21.CoreWebView2.GetDevToolsProtocolHelper()
Task<DOM.Node> t = helper.DOM.GetDocumentAsync();
Task<int> querySelectorResponse = helper.DOM.QuerySelectorAsync(t.Result.NodeId, "#fileupload");
_ = helper.DOM.SetFileInputFilesAsync(new string[] { filename }, querySelectorResponse.Result);
These 4 lines of code should get the document and search for the node fileupload. I get nothing but errors and I have not seen any real examples or documentation on this.
Any help would be greatly appreciated.
**** UPDATE *****
DevToolsProtocolHelper helper = webView21.CoreWebView2.GetDevToolsProtocolHelper();
DOM dom = helper.DOM;
DOM.Node t = await dom.GetDocumentAsync(-1,true);
int querySelectorResponse = await dom.QuerySelectorAsync(t.NodeId, "#fileupload");
_ = helper.DOM.SetFileInputFilesAsync(new string[] { filename }, t.NodeId);
Here is the latest version of my code and it seems I have made progress. When I used CEFSHARP, the IDs I got back from Document and the #fileUpload were always the same and it worked in uploading the file.
With this code above, I am getting IDs but they are always different and I am not getting the file to upload.
Another update, when I run this code (from a button click on the winform) a second time, I do get the proper ID (504) for the int querySelectorResponse = await dom.QuerySelectorAsync(t.NodeId, "#fileupload") line of code. Again, still not getting the file to upload to the page.
Again, any help would be greatly appreciated
The GetDevToolsProtocolHelper documentation is the 'How to' article on Using Chromium DevTools Protocol in WebView2.
Separately, you cannot use Task.Result with WebView2 tasks which I see you doing in the above code. WebView2 can only be used from the UI thread its created on and requires that UI thread to communicate task completions. You should be able to use await instead.
Related
The issue:
We have an application written in C# that uses UIAutomation to get the current text (either selected or the word behind the carret) in other applications (Word, OpenOffice, Notepad, etc.).
All is working great on Windows 10, even up to 21H2, last update check done today.
But we had several clients informing us that the application is closing abruptly on Windows 11.
After some debugging I've seen some System.AccessViolationException thrown when trying to use the TextPatternRange.GetText() method:
System.AccessViolationException: 'Attempted to read or write protected memory. This is often an indication that other memory is corrupt.'
What we've tried so far:
Setting uiaccess=true in manifest and signing the app : as mentionned here https://social.msdn.microsoft.com/Forums/windowsdesktop/en-US/350ceab8-436b-4ef1-8512-3fee4b470c0a/problem-with-manifest-and-uiaccess-set-to-true?forum=windowsgeneraldevelopmentissues => no changes (app is in C:\Program Files\
In addition to the above, I did try to set the level to "requireAdministrator" in the manifest, no changes either
As I've seen that it may come from a bug in Windows 11 (https://forum.emclient.com/t/emclient-9-0-1317-0-up-to-9-0-1361-0-password-correction-crashes-the-app/79904), I tried to install the 22H2 Preview release, still no changes.
Reproductible example
In order to be able to isolate the issue (and check it was not something else in our app that was causing the exception) I quickly made the following test (based on : How to get selected text of currently focused window? validated answer)
private void btnRefresh_Click(object sender, RoutedEventArgs e)
{
var p = Process.GetProcessesByName("notepad").FirstOrDefault();
var root = AutomationElement.FromHandle(p.MainWindowHandle);
var documentControl = new
PropertyCondition(AutomationElement.ControlTypeProperty,
ControlType.Document);
var textPatternAvailable = new PropertyCondition(AutomationElement.IsTextPatternAvailableProperty, true);
var findControl = new AndCondition(documentControl, textPatternAvailable);
var targetDocument = root.FindFirst(TreeScope.Descendants, findControl);
var textPattern = targetDocument.GetCurrentPattern(TextPattern.Pattern) as TextPattern;
string text = "";
foreach (var selection in textPattern.GetSelection())
{
text += selection.GetText(255);
Console.WriteLine($"Selection: \"{selection.GetText(255)}\"");
}
lblFocusedProcess.Content = p.ProcessName;
lblSelectedText.Content = text;
}
When pressing a button, this method is called and the results displayed in labels.
The method uses UIAutomation to get the notepad process and extract the selected text.
This works well in Windows 10 with latest update, crashes immediately on Windows 11 with the AccessViolationException.
On Windows 10 it works even without the uiaccess=true setting in the manifest.
Questions/Next steps
Do anyone know/has a clue about what can cause this?
Is Windows 11 way more regarding towards UIAutomation?
On my side I'll probably open an issue by Microsoft.
And one track we might follow is getting an EV and sign the app itself and the installer as it'll also enhance the installation process, removing the big red warnings. But as this is an app distributed for free we had not done it as it was working without it.
I'll also continue testing with the reproductible code and update this question should anything new appear.
I posted the same question on MSDN forums and got this answer:
https://learn.microsoft.com/en-us/answers/questions/915789/uiautomation-throws-accessviolationexception-on-wi.html
Using IUIautomation instead of System.Windows.Automation works on Windows 11.
So I'm marking this as solved but if anyone has another idea or knows what happens you're welcome to comment!
I am using WebView2 in a C# windows application and I want to be able to upload screen captures.
I capture the screen and save it to a file (jpg) on the local hard drive. I want to attach this file (upload) it to the FILEUPLOAD control on a web page being displayed on a windows form using WebView2.
I was using CEFSHARP browser and was using DOM to upload the file before we switched to WebView2.
CEFSHAPRP had the DOM objects wrapped in the browser control so it was very
easy to use this code :
if (client == null)
{
client = browser.GetDevToolsClient();
dom = client.DOM.GetDocumentAsync();
}
await Task.Run(async () =>
{
QuerySelectorResponse querySelectorResponse = await client.DOM.QuerySelectorAsync(dom.Result.Root.NodeId, "#fileupload");
_ = client.DOM.SetFileInputFilesAsync(new string[] { filename }, querySelectorResponse.NodeId);
});
I do not see anything like this built into the WebView2 control. I did see that WebView has another nuget called WebView2.DOM but it requires me to convert my entire project from .NET framework 4.7.2 to .netCore.
Any help would be greatly appreciated. I am very new to WebView2 and cannot see how I can access the DOM to make this happen. I am using .net framework 4.7.2
******* June 17 Update *****
I installed the Microsoft.Web.WebView2.DevToolsProtocolExtension and tried to use the existing framework I already had working with CEFSHARP.
DevToolsProtocolHelper helper = webView21.CoreWebView2.GetDevToolsProtocolHelper();
Task<DOM.Node> t = helper.DOM.GetDocumentAsync(-1,false);
await Task.Run(async () =>
{
var querySelectorResponse = await helper.DOM.QuerySelectorAsync(t.Result.NodeId, "#fileupload");
_ = helper.DOM.SetFileInputFilesAsync(new string[] { filename }, querySelectorResponse);
});
It gives me the following error when I try to execute the line :
var querySelectorResponse = await helper.DOM.QuerySelectorAsync(t.Result.NodeId, "#fileupload");
Unable to cast COM object of type 'System.__ComObject' to interface type 'Microsoft.Web.WebView2.Core.Raw.ICoreWebView2'. This operation failed because the QueryInterface call on the COM component for the interface with IID '{76ECEACB-0462-4D94-AC83-423A6793775E}' failed due to the following error: No such interface supported (Exception from HRESULT: 0x80004002 (E_NOINTERFACE)).
Any ideas. Thank you for all the advice and help I have received already.
I believe there would be three ways to do this;
Get the screen co-ordinates of the button, and then using send keys send an enter key to open the dialog. Then (currently webview2 cannot intercept this dialog like CefSharp) you would need to use the WinAPI to manipulate the file dialog window.
I have used 1 before. It is a little unreliable with the send keys and in waiting for the dialog to appear and then getting the timing right with the dialog.
Using javascript you should be able to add to the filelist files and then add them to the element. I have not tried this with Webview2.
Finally, and the one you already know can be used. In your question here you use DOM.SetFileInputFilesAsync which as amaitland pointed out comes from the chrome devtools protocol.. WebView2 also supports the Chrome Devtools protocols so you can effectively use the same code you already know.
I am coming back to work on a BOT that scraped data from a site once a day for my personal use.
However they have changed the code during COVID and now it seems they are loading in a lot of the content with Ajax/JavaScript.
I thought that if I did a WebRequest and obtained the response HTML from a URL, it should match the same content that I see in a browser (FF/Chrome) when I right click and "view source". I thought the actual DOM and generated source code would come later when those files were loaded as onload events fired, scripts lazily loaded and so on.
However the source HTML I obtain with my BOT is NOT the same as the HTML I see when viewing the source code. So my regular expressions that find certain links are not available to me.
Why am I seeing a difference between "view source" and a download of the HTML?
I can only think that when the page loads, SCRIPTS run that load other content into the page and that when I view source I am actually seeing a partial generated source rather than the original source code. Therefore is there a way I can call the page with my BOT, wait X seconds before obtaining the response to get this "onload" generated HTML?
Or even better a way for MY BOT (not using someone elses), to view generated source.
This BOT runs as a web service. I can find another site to scrape but it's just painful when I have all the regular expressions working on the source I see, except it's NOT the source my BOT obtains.
A bit confused at why my browser is showing me more content with a view source (not generated source), than my BOT gets when making a valid request.
Any help would be much appreciated this is almost an 8 year project that I have been doing on/off and this change has ruined one of the core parts of the system.
In response to OP's comment, here is the Java code for how to click at different parts on the screen to do this:
You could use Java's Robot class. I just learned about it a few days ago:
// Import
import java.awt.Robot;
// Code
void click(int x, int y, int btn) {
Robot robot = new Robot();
robot.mouseMove(x, y);
robot.mousePress(btn);
robot.mouseRelease(btn);
}
You would then run the click function with the x and y position to click, as well as the button (MouseEvent.BUTTON1, MouseEvent.BUTTON2, etc.)
After stringing together the right positions (this will vary depending on the screen) you could do just about anything.
To use shortcuts, just use the keyPress and keyRelease functions. Here is a good way to do this:
void key(int keyCode, boolean ctrl, boolean alt, boolean shift) {
if (ctrl)
robot.keyPress(KeyEvent.VK_CONTROL);
if (alt)
robot.keyPress(KeyEvent.VK_ALT);
if (shift)
robot.keyPress(KeyEvent.VK_SHIFT);
robot.keyPress(keyCode);
robot.keyRelease(keyCode);
if (ctrl)
robot.keyRelease(KeyEvent.VK_CONTROL);
if (alt)
robot.keyRelease(KeyEvent.VK_ALT);
if (shift)
robot.keyRelease(KeyEvent.VK_SHIFT);
}
Thus, something like Ctrl+Shift+I to open the inspect menu would look like this:
key(KeyEvent.VK_I, true, false, true);
Here are the steps to copy a website's code (from the inspector) with Google Chrome:
Ctrl + Shift + I
Right click the HTML tag
Select "Edit as HTML"
Ctrl + A
Ctrl + C
Then, you can use the technique from this StackOverflow to get the content from the clipboard:
Clipboard c = Toolkit.getDefaultToolkit().getSystemClipboard();
String text = (String) c.getData(DataFlavor.stringFlavor);
Using something like FileOutputStream to put the info into a file:
FileOutputStream output = new FileOutputStream(new File( PATH HERE ));
output.write(text.getBytes());
output.close();
I hope this helps!
I have seemed to have fixed it by just turning on the ability to store cookies in my custom HTTP (Bot/Scraper) class, that was being called from the class trying to obtain the data. Probably the site has a defense system to prevent visitors requesting pages and not the JS/CSS with a different session ID on each request.
However I would like to see some other examples because if it is just cookies then they could use JavaScript to test for JavaScript e.g an AJAX call to log if JS is actually on or some DOM manipulation to determine if you are really Human or not which would break it again.
Every site uses different methods to prevent scrapers, email harvesters, job rapists, link harvesters etc inc working out the standard time between requests for 100% verifiable humans and BOTS and then using those values to help determine spoofed user-agents etc. I wrote a whole system to stop BOTS at my last place of work and its a layered approach, just glad the cookies being enabled solved it on this site but it could easily be beefed up with other tricks to test for BOTS vs HUMANS.
I do know some Java, enough to work out what is going on anyway. My BOT is in C#.
I'm trying to duplicate the functionality in this sample app from here : http://code.msdn.microsoft.com/windowsapps/Licensing-API-Sample-19712f1a
into an app I'm writing. I've started with working on implementing the ability to buy a consumable item.
What I've done so far:
1.) Copy the function body into the event handler for my "buy" button.
2.) Copy the WindowsStoreProxy.xml from the working sample to replace the one in my project.
3.) Double and triple checked that trial mode is false.
Note:
CurrentAppSimulator.RequestProductPurchaseAsync("product2");
Does not bring up the gui to select a return code in my project code (it did in the sample). Changing "product2" to "2" fixed that problem. However, when the awaited RequestProductPurchaseAsync returns, the following expression:
licenseInformation.ProductLicenses["2"].IsActive
is still false when in the sample says it should be true, so my code never succeeds.
Are you reading the WindowsStoreProxy.xml into the Simulator?
StorageFolder proxyDataFolder = await Package.Current.InstalledLocation.GetFolderAsync("data");
StorageFile proxyFile = await proxyDataFolder.GetFileAsync("WindowsStoreProxy.xml");
await CurrentAppSimulator.ReloadSimulatorAsync(proxyFile);
I am automating a task using webbrowser control , the site display pages using frames.
My issue is i get to a point , where i can see the webpage loaded properly on the webbrowser control ,but when it gets into the code and i see the html i see nothing.
I have seen other examples here too , but all of those do no return all the browser html.
What i get by using this:
HtmlWindow frame = webBrowser1.Document.Window.Frames[1];
string str = frame.Document.Body.OuterHtml;
Is just :
The main frame tag with attributes like SRC tag etc, is there any way how to handle this?Because as i can see the webpage completely loaded why do i not see the html?AS when i do that on the internet explorer i do see the pages source once loaded why not here?
ADDITIONAL INFO
There are two frames on the page :
i use this to as above:
HtmlWindow frame = webBrowser1.Document.Window.Frames[0];
string str = frame.Document.Body.OuterHtml;
And i get the correct HTMl for the first frame but for the second one i only see:
<FRAMESET frameSpacing=1 border=1 borderColor=#ffffff frameBorder=0 rows=29,*><FRAME title="Edit Search" marginHeight=0 src="http://web2.westlaw.com/result/dctopnavigation.aspx?rs=WLW12.01&ss=CXT&cnt=DOC&fcl=True&cfid=1&method=TNC&service=Search&fn=_top&sskey=CLID_SSSA49266105122&db=AK-CS&fmqv=s&srch=TRUE&origin=Search&vr=2.0&cxt=RL&rlt=CLID_QRYRLT803076105122&query=%22LAND+USE%22&mt=Westlaw&rlti=1&n=1&rp=%2fsearch%2fdefault.wl&rltdb=CLID_DB72585895122&eq=search&scxt=WL&sv=Split" frameBorder=0 name=TopNav marginWidth=0 scrolling=no><FRAME title="Main Document" marginHeight=0 src="http://web2.westlaw.com/result/dccontent.aspx?rs=WLW12.01&ss=CXT&cnt=DOC&fcl=True&cfid=1&method=TNC&service=Search&fn=_top&sskey=CLID_SSSA49266105122&db=AK-CS&fmqv=s&srch=TRUE&origin=Search&vr=2.0&cxt=RL&rlt=CLID_QRYRLT803076105122&query=%22LAND+USE%22&mt=Westlaw&rlti=1&n=1&rp=%2fsearch%2fdefault.wl&rltdb=CLID_DB72585895122&eq=search&scxt=WL&sv=Split" frameBorder=0 borderColor=#ffffff name=content marginWidth=0><NOFRAMES></NOFRAMES></FRAMESET>
UPDATE
The two url of the frames are as follows :
Frame1 whose html i see
http://web2.westlaw.com/nav/NavBar.aspx?RS=WLW12.01&VR=2.0&SV=Split&FN=_top&MT=Westlaw&MST=
Frame2 whose html i do not see:
http://web2.westlaw.com/result/result.aspx?RP=/Search/default.wl&action=Search&CFID=1&DB=AK%2DCS&EQ=search&fmqv=s&Method=TNC&origin=Search&Query=%22LAND+USE%22&RLT=CLID%5FQRYRLT302424536122&RLTDB=CLID%5FDB6558157526122&Service=Search&SRCH=TRUE&SSKey=CLID%5FSSSA648523536122&RS=WLW12.01&VR=2.0&SV=Split&FN=_top&MT=Westlaw&MST=
And the properties of the second frame whose html i do not get are in the picture below:
Thank you
I paid for the solution of the question above and it works 100 %.
What i did was use this function below and it returned me the count to the tag i was seeking which i could not find :S.. Use this to call the function listed below:
FillFrame(webBrowser1.Document.Window.Frames);
private void FillFrame(HtmlWindowCollection hwc)
{
if (hwc == null) return;
foreach (HtmlWindow hw in hwc)
{
HtmlElement getSpanid = hw.Document.GetElementById("mDisplayCiteList_ctl00_mResultCountLabel");
if (getSpanid != null)
{
doccount = getSpanid.InnerText.Replace("Documents", "").Replace("Document", "").Trim();
break;
}
if (hw.Frames.Count > 0) FillFrame(hw.Frames);
}
}
Hope it helps people .
Thank you
For taking html you have to do it that way:
WebClient client = new WebClient();
string html = client.DownloadString(#"http://stackoverflow.com");
That's an example of course, you can change the address.
By the way, you need using System.Net;
This works just fine...gets BODY element with all inner elements:
Somewhere in your Form code:
wb.Url = new Uri("http://stackoverflow.com");
wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(wbDocumentCompleted);
And here is wbDocumentCompleted:
void wb1DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
var yourBodyHtml = wb.Document.Body.OuterHtml;
}
wb is System.Windows.Forms.WebBrowser
UPDATE:
The same as for the document, I think that your second frame is not loaded at the time you check for it's content...You can try solutions from this link. You will have to wait for your frames to be loaded in order to see its content.
The most likely reason is that frame index 0 has the same domain name as the main/parent page, while the frame index 1 has a different domain name. Am I correct?
This creates a cross-frame security issue, and the WB control just leaves you high and dry and doesn't tell you what on earth went wrong, and just leaves your objects, properties and data empty (will say "No Variables" in the watch window when you try to expand the object).
The only thing you can access in this situation is pretty much the URL and iFrame properties, but nothing inside the iFrame.
Of course, there are ways to overcome teh cross-frame security issues - but they are not built into the WebBrowser control, and they are external solutions, depending on which WB control you are using (as in, .NET version or pre .NET version).
Let me know if I have correctly identified your problem, and if so, if you would like me to tell you about the solution tailored to your setup & instance of the WB control.
UPDATE: I have noticed that you're doing a .getElementByTagName("HTML")(0).outerHTML to get the HTML, all you need to do is call this on the document object, or the .body object and that should do it. MyDoc.Body.innerHTML should get the the content you want. Also, notice that there are additional iFrames inside these documents, in case that is of relevance. Can you give us the main document URL that has these two URL's in it so we / I can replicate what you're doing here? Also, not sure why you are using DomElement but you should just cast it to the native object it wants to be cast to, either a IHTMLDocument2 or the object you see in the watch window, which I think is IHTMLFrameElement (if i recall correctly, but you will know what i mean once you see it). If you are trying to use an XML object, this could be the reason why you aren't able to get the HTML content, change the object declaration and casting if there is one, and give it a go & let us know :). Now I'm curious too :).