I've tried the built-in WPF WebBrowser (.NET ActiveX wrapper), Awesomium.NET, and Chromium.NET.
I'm not sure if I just missed something here, but I don't know of any way to raise an event when the DOM changes so that my C# code can execute. I want to avoid having a timer check/compare the DOM for changes each time if I can...
So I'm wondering whether I missed something in those controls that allows me to do what I want or, if not, whether there are any alternatives/methods to get a DOMChanged event.
I don't believe there's any way to do that by default with those controls (though I've never used Chromium.NET).
One thing you could try doing is using JavaScript to detect DOM changes. You could have your application register a callback function and write the JavaScript itself. Here's a rough example in Awesomium syntax:
myControl.CreateObject("JSCallback");
myControl.SetObjectCallback("JSCallback", "handleDOMChange", OnDOMChange);
// This is where you would write your JavaScript method that detects the DOM change and has it execute this handler method.
// Here's a post with some info on this (ignore cross-browser compatibility issues since it's all running within Awesomium): http://stackoverflow.com/questions/3219758/detect-changes-in-the-dom
myControl.ExecuteJavascript(
    @"function DOMChangeDetected()
      {
          JSCallback.handleDOMChange();
      }");
private void OnDOMChange(object sender, JSCallbackEventArgs eventArgs)
{
    // .NET event triggered from JS DOM change detection
}
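For the JavaScript half, here is a minimal sketch. It assumes the embedded engine is recent enough to expose MutationObserver; older WebKit builds may only offer mutation events such as DOMSubtreeModified, as discussed in the linked post. The names watchDom and ObserverCtor are hypothetical; JSCallback.handleDOMChange is the callback registered above.

```javascript
// Hypothetical helper: reports every batch of DOM mutations under `target`.
// ObserverCtor is injectable so the function can be exercised outside a browser.
function watchDom(target, onChange, ObserverCtor) {
  var Ctor = ObserverCtor ||
    (typeof MutationObserver !== 'undefined' ? MutationObserver : null);
  if (!Ctor) {
    throw new Error('MutationObserver is not available in this engine');
  }
  var observer = new Ctor(function (mutations) {
    // One call per batch of changes, not one per mutated node.
    onChange(mutations.length);
  });
  observer.observe(target, {
    childList: true,      // nodes added or removed
    attributes: true,     // attribute changes
    characterData: true,  // text changes
    subtree: true         // all descendants, not just direct children
  });
  return observer;        // call disconnect() on it to stop watching
}

// In the hosted page (assumes the host created JSCallback as shown above):
if (typeof document !== 'undefined' && typeof JSCallback !== 'undefined') {
  watchDom(document.documentElement, function () {
    JSCallback.handleDOMChange();
  });
}
```

Because the observer delivers mutations in batches, the .NET handler fires once per batch rather than once per changed node, which keeps the JS-to-C# traffic down on busy pages.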
Related
I'm working in .NET, C# to be specific, creating a Win Forms UserControl, which contains a WebBrowser control. The WebBrowser control hosts a page, which in turn uses a third-party javascript component. The problem I'm having is with invoking a javascript function to initialize the third-party javascript component and block the UI in the Windows Forms application until the component has been initialized, which the component notifies you of through an internal javascript event that it has.
Part of the problem is that the only way to change any configuration parameter of the third-party javascript component is to re-initialize it with the new configuration. So for example, if you want to make it read-only you have to re-initialize it with the read-only parameter.
I've got everything working in terms of calling Document.InvokeScript and then, in the web page, calling the UserControl method using window.external. The problem is how to block the UserControl code that makes the call to initialize the javascript component, so that it waits and doesn't return control to the user until the component's initialization has completed.
The reason I need it to work this way: I have a "Read-Only" checkbox on the form that changes the ReadOnly property of the UserControl, which controls whether the javascript component shows the data as read-only. If the user clicks that checkbox really quickly, you either get a javascript error or the checkbox gets out of sync with the actual read-only state of the javascript component. This seems to happen because the control hasn't finished re-initializing after its configuration changed and you're already trying to change it again.
I've spent hours and hours trying to work out a way to make it work using everything from AutoResetEvent to Application.DoEvents and so on, but don't seem to be able to get it working.
The closest I've found is Invoke a script in WebBrowser, and wait for it to finish running (synchronized) but that uses features introduced in VS2012 (and I'm using VS2010) and I don't think it would work anyway as it's a bit different in that you're not waiting for a javascript event to fire.
Any help would be greatly appreciated.
The problem in the first place is the requirement to "block" the UI thread until some event has been fired. It's usually possible to re-factor the application to use asynchronous event handlers (with or without async/await), to yield execution control back to the message loop and avoid any blocking.
Now let's say, for some reason, you cannot re-factor your code. In this case, you'd need a secondary modal message loop. You'd also need to disable the main UI while you're waiting for the event, to avoid nasty re-entrancy scenarios. The waiting itself should be user-friendly (e.g., use the wait cursor or a progress animation) and non-busy (avoid burning CPU cycles on a tight loop with DoEvents).
One way to do this is to use a modal dialog with a user-friendly message, which gets automatically dismissed when the desired JavaScript event/callback has occurred. Here's a complete example:
using System;
using System.Runtime.InteropServices;
using System.Windows.Forms;

namespace WbTest
{
    [ComVisible(true)]
    [ClassInterface(ClassInterfaceType.None)]
    [ComDefaultInterface(typeof(IScripting))]
    public partial class MainForm : Form, IScripting
    {
        WebBrowser _webBrowser;
        Action _onScriptInitialized;

        public MainForm()
        {
            InitializeComponent();
            _webBrowser = new WebBrowser();
            _webBrowser.Dock = DockStyle.Fill;
            _webBrowser.ObjectForScripting = this;
            this.Controls.Add(_webBrowser);
            this.Shown += MainForm_Shown;
        }

        void MainForm_Shown(object sender, EventArgs e)
        {
            var dialog = new Form
            {
                Width = 100,
                Height = 50,
                StartPosition = FormStartPosition.CenterParent,
                ShowIcon = false,
                ShowInTaskbar = false,
                ControlBox = false,
                FormBorderStyle = FormBorderStyle.FixedSingle
            };
            dialog.Controls.Add(new Label { Text = "Please wait..." });

            // Start loading the page once the dialog is up; the script calls back after 2 seconds.
            dialog.Load += (_, __) => _webBrowser.DocumentText =
                "<script>setTimeout(function() { window.external.OnScriptInitialized(); }, 2000)</script>";

            // Keep the dialog open until the script callback has arrived.
            var canClose = false;
            dialog.FormClosing += (_, args) =>
                args.Cancel = !canClose;
            _onScriptInitialized = () => { canClose = true; dialog.Close(); };

            Application.UseWaitCursor = true;
            try
            {
                dialog.ShowDialog();
            }
            finally
            {
                Application.UseWaitCursor = false;
            }
            MessageBox.Show("Initialized!");
        }

        // IScripting
        public void OnScriptInitialized()
        {
            _onScriptInitialized();
        }
    }

    [ComVisible(true)]
    [InterfaceType(ComInterfaceType.InterfaceIsIDispatch)]
    public interface IScripting
    {
        void OnScriptInitialized();
    }
}
Which looks like this:
Another option (a less user-friendly one) is to use something like WaitOneAndPump from here. You'd still need to take care of disabling the main UI and showing some kind of waiting feedback to the user.
Updated to address the comment. Is your WebBrowser actually a part of the UI and visible to the user? Should the user be able to interact with it? If so, you cannot use a secondary thread to execute JavaScript. You need to do it on the main thread and keep pumping messages, but WaitOne doesn't pump most Windows messages (it only pumps a small fraction of them, related to COM). You might be able to use WaitOneAndPump, which I mentioned above. You'd still need to disable the UI while waiting, to avoid re-entrancy.
Anyhow, that'd still be a kludge. You really shouldn't be blocking the execution just to keep the linear code flow. If you can't use async/await, you can always implement a simple state machine class and use callbacks to continue from where it left off. That's how it used to be before async/await.
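As an illustration of the callback-driven idea, here is a small two-state (idle/busy) sketch. It sits on the page side rather than in the C# host, and it is only one possible arrangement. Everything in it is hypothetical: createInitQueue is an invented name, and initComponent(config, done) stands in for the third-party component's re-initialization call plus its "initialized" event, whose real API the question doesn't show.

```javascript
// Hypothetical wrapper: `initComponent(config, done)` stands in for the
// third-party component's (re)initialization call and its "initialized" event.
function createInitQueue(initComponent, onSettled) {
  var busy = false;     // true while an initialization is in flight
  var pending = null;   // most recent config requested while busy

  function start(config) {
    busy = true;
    initComponent(config, function () {
      if (pending !== null) {
        // Something changed while we were initializing: apply the latest request.
        var next = pending;
        pending = null;
        start(next);
      } else {
        busy = false;
        onSettled(config);  // e.g. notify the host through window.external
      }
    });
  }

  return function request(config) {
    if (busy) {
      pending = config;  // coalesce rapid changes; only the last one matters
    } else {
      start(config);
    }
  };
}
```

With something like this, rapid clicks on the "Read-Only" checkbox never overlap two initializations: intermediate requests are dropped, and the component ends up in the state that was requested last.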
Which event is raised in Internet Explorer (IE9) when the F5 key (refresh) is pressed? And how can I catch it with a handler in my BHO?
Note:
I have created a BHO in C# for IE9. My class implements IObjectWithSite, which allows me to add handlers through the SetSite function.
public int SetSite(object site)
{
    webBrowser = (SHDocVw.WebBrowser)site;
    //events here...
    return 0; // S_OK
}
If you are developing a browser plugin that injects Javascript, I found it useful to hook both ondocumentcomplete and ondownloadcomplete.
Ondocumentcomplete fires as soon as the DOM has been loaded and can be manipulated, but it misses refreshes.
Ondownloadcomplete waits until all resources (e.g., images) have downloaded, but catches refreshes. This delay can be quite long.
By hooking both, you get a responsive plugin most of the time, and you don't miss refreshes. Your javascript can then include a check to avoid running twice. Something like:
// Inject the code, but only once
if (typeof myplugin == 'undefined') {
    myplugin = new function () {
        // Your code runs here.
    };
}
I found the following page to be informative:
Alternative way to detect refresh in a BHO
There is no direct method, and it is hard to implement across different versions of IE. You can, however, use a combination of events to achieve it. Be warned that the following approaches are not foolproof.
Links:
MSDN Forum
Detecting the IE Refresh button
Refresh and DISPID_DOCUMENTCOMPLETE
Maintaining focus across postbacks is apparently a difficult task. Searching Google, you will find a ton of people who want the same thing, but all hook it up differently, and mostly, custom-ly. I would like to avoid a custom implementation, especially if there's a way it's supported by .NET. Only after some very deep searching did I come across PostBackOptions.TrackFocus, mentioned quietly in another Stack Overflow post. According to MSDN:
"Gets or sets a value indicating whether the postback event should return the page to the current scroll position and return focus to the current control."
Holy crap, this is supported by .NET 4? AWESOME. But we have a ton of custom controls, so how does .NET know how to set the focus on a control? I have no idea. Looking at the MSDN documentation for System.Web.UI.Control, there's an interesting method:
public virtual void Focus()
"Use the Focus method to set the initial focus of the Web page to the
control. The page will be opened in the browser with the control
selected."
Alright, clearly overridable. But what is the recommended way of doing so? It returns void, and there are no examples; I was unable to find anyone overriding this method in their implementation. However, after overriding it and doing nothing more than throwing an exception, it becomes evident that this is not how ASP.NET restores focus to a control that had focus before the postback: it never gets called.
After a ton of debugging using Firebug, I have found that enabling PostBackOptions.TrackFocus works! Sometimes. It is apparent that the focus of a control is only maintained when the control calls the __doPostBack JavaScript method. Other controls that launch a postback (when pressing Enter inside the control) call WebForm_OnSubmit(), which doesn't update the ASP.NET hidden field __LASTFOCUS. __doPostBack calls WebForm_OnSubmit() after setting the hidden fields.
This is where I'm currently stuck. It looks as if I need to get everything to call __doPostBack, no matter what. There's very, very little documentation on the use of TrackFocus. So does anyone have any tips from here?
I've been maintaining focus across postbacks using the method in this article:
(i.e., store the focus in the __LASTFOCUS hidden field on the client-side field-enter event for all controls)
http://www.codeproject.com/KB/aspnet/MainatinFocusASPNET.aspx
If you've gotten as far as having __LASTFOCUS show up on the page, this should get you most of the rest of the way...
Note: It'd be nice to find a way to keep the extra javascript from bloating __VIEWSTATE for example.
It was working pretty well for me until I figured out that some of my pages included the hidden __LASTFOCUS field and some of my pages didn't. (That's what prompted me to search around and find your question) Now I'm just trying to figure out how to make sure __LASTFOCUS always shows up on every page I want to keep track of focus on... (Looks like I'll have to open a separate question about it)
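A rough client-side sketch of that idea follows. It is not the article's code, and it assumes the page already renders the __LASTFOCUS hidden field and that the browser supports addEventListener. The name installFocusTracker is hypothetical, and the doc parameter exists only so the function can be exercised without a real page.

```javascript
// Records the id of the most recently focused element in the __LASTFOCUS
// hidden field so the server can see it on the next postback.
// `installFocusTracker` is a hypothetical name; `doc` is injectable for testing.
function installFocusTracker(doc) {
  var field = doc.getElementById('__LASTFOCUS');
  if (!field) {
    return false;  // page did not render the hidden field; nothing to update
  }
  // 'focus' does not bubble, so listen in the capture phase to see every control.
  doc.addEventListener('focus', function (event) {
    var id = event.target && event.target.id;
    if (id) {
      field.value = id;
    }
  }, true);
  return true;
}

// In a real page, install the tracker on the current document.
if (typeof document !== 'undefined') {
  installFocusTracker(document);
}
```

A single capturing listener on the document avoids wiring a handler to every control, so it also covers controls that submit through WebForm_OnSubmit() without going through __doPostBack.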
Here is what I just did. Assuming you have a handler in your code behind that takes care of the event and has a signature like this:
protected void myEventHandler(object sender, EventArgs e)
You can use this line of code to restore focus back to the sending object:
ScriptManager.RegisterStartupScript((WebControl) sender, sender.GetType(), "RestoreFocusMethod", "document.getElementById(\"" + ((WebControl) sender).ClientID + "\").focus();", true);
Just using the Focus() method of the sending control will reposition the page (if you are scrolled down a bit), but this approach works beautifully. And if you have specific handlers for your control, you can use the control itself rather than casting the sender to a WebControl, like this:
protected void CityListDropDown_SelectedIndexChanged(object sender, EventArgs e)
{
...
ScriptManager.RegisterStartupScript(CityListDropDown, CityListDropDown.GetType(), "CityDropDownRefocus", "document.getElementById(\"" + CityListDropDown.ClientID + "\").focus();", true);
}
Usually when I subscribe to an event, I use the Visual Studio builtin function to generate the method. So if I want to bind a clicked event to a button after I write += I click tab one time to generate the code after +=, and then tab again to create the empty method associated with this event.
So for a button clicked event, I will end up with something like this:
button.Clicked += new EventHandler(button_Clicked);
void button_Clicked(object sender, EventArgs e) {
    throw new NotImplementedException();
}
Since I prefer the shorter syntax for binding the event handler, I always go back to the autogenerated line and change it to look like this:
button.Clicked += button_Clicked;
My question is simple: is there any way to make VS automatically prefer this syntax over the default one, so I don't have to go and change it manually every time?
This applies to both VS2008 and VS2010.
No, this is not under your control to modify.
It is easier for them to keep it in the old style so that it always works no matter which version of C# is being targeted. Otherwise they would have to make the generated code conditional on the C# version and I can imagine that is just more work than it is worth. Unfortunately this code generation is not extensible so you will need to modify your code manually yourself.
You could try third-party extras such as ReSharper to gain extra productivity, as they implement many cool features by accessing the object model and modifying it.
As far as I know, no.
I'm sure I've read it somewhere authoritative, but I can't remember where right now
This annoys me as well. However I use ReSharper which offers a few choices when creating event handlers such as creating a new method, adding a lambda or anonymous method, or using any existing methods that have the appropriate signature.
Also, R# will highlight any redundant code and let you remove it easily, either from a single site or from the entire project/solution.
I'm having a problem screen-scraping some data from this website using the MSHTML COM component. I have a WebBrowser control on my WPF form.
The code that retrieves the HTML elements is in the WebBrowser's LoadCompleted event handler. After I set the value on the HTMLInputElement and call the click method on the HTMLInputButtonElement, it refuses to submit the request and display the next page.
I looked at the HTML for the button's onclick attribute: it calls a JavaScript function that processes my request, which makes me wonder whether calling the JavaScript function is causing the problem. Funnily enough, when I take my code out of the LoadCompleted method and put it inside a button click event handler, it does take me to the next page, whereas the LoadCompleted method didn't. But doing that defeats the point of trying to screen-scrape the page automatically.
On another thought: with the code inside the LoadCompleted method, perhaps the HTMLInputButtonElement is not fully rendered on the page, which results in the click event not firing. Yet when I inspected the object at run time, it did hold the submit button element and its state was reported as completed, which baffles me even more.
Here is the code I used inside the LoadCompleted method and the click method on the button:
private void browser_LoadCompleted(object sender, NavigationEventArgs e)
{
    HTMLDocument dom = (HTMLDocument)browser.Document;
    IHTMLElementCollection elementCollection = dom.getElementsByName("PCL_NO_FROM.PARCEL_RANGE.XTRACKING.1-1-1.");
    HTMLInputElement inputBox = null;
    if (elementCollection.length > 0)
    {
        foreach (HTMLInputElement element in elementCollection)
        {
            if (element.name.Equals("PCL_NO_FROM.PARCEL_RANGE.XTRACKING.1-1-1."))
            {
                inputBox = element;
            }
        }
    }
    inputBox.value = "Test";
    elementCollection = dom.getElementsByName("SUBMIT.DUM_CONTROLS.XTRACKING.1-1.");
    HTMLInputButtonElement submitButton = null;
    if (elementCollection.length > 0)
    {
        foreach (HTMLInputButtonElement element in elementCollection)
        {
            if (element.name.Equals("SUBMIT.DUM_CONTROLS.XTRACKING.1-1."))
            {
                submitButton = element;
            }
        }
    }
    submitButton.click();
}
FYI: This is the URL of the web page I'm trying to access using MSHTML,
http://track.dhl.co.uk/tracking/wrd/run/wt_xtrack_pw.entrypoint.
There are many possibilities:
You may try putting your code in other events, such as Navigation Completed or Download Completed.
You may need to explicitly evaluate the OnClick event after the click() function.
Using the MS WebBrowser control is easier than using the MSHTML COM.
To make life easier, you may just use a web-scraping library such as the IRobotSoft ActiveX control to automate your entire process.
Delay in OnBeforeNavigate can cause click actions to fail.
We have noticed that with some submit actions OnBeforeNavigate is called twice, especially where onClick is used. The first call is before the onClick action is performed, the second is after it is complete.
Turn off your BHO, put a breakpoint on onClick, step over the submit action return jsSubmit() and then wait a bit and you should be able to cause the same issue without your automation.
Any delay >150ms on the second call to OnBeforeNavigate causes some failure in page load/navigation to the result.
Edit:
Having tried our own automation of this DHL page we don't currently have an issue with the timing described above.