WinForms WebBrowser HtmlDocument.Write behaves differently in different solutions - c#

Recently, my application has crashed when trying to display a rather lengthy (but otherwise simple) HTML e-mail.
The crash was caused by mshtml.dll getting a stack overflow (exception code 0xc00000fd). Of note here is that this didn't throw an exception but it actually just crashed the program as a whole. The error was retrieved from the Windows event log.
In the process of debugging, I created a smaller sample solution to try and narrow down the issue. However, not only does it work fine in the sample solution, it behaves completely different from the main program despite running the same code even for the simplest of HTML strings.
The code is as follows:
var webBrowser1 = new System.Windows.Forms.WebBrowser();
webBrowser1.AllowNavigation = false;
webBrowser1.AllowWebBrowserDrop = false;
webBrowser1.Navigate("about:blank");
var doc = webBrowser1.Document.OpenNew(true);
doc.Write("<HTML><BODY>This is a new HTML document.</BODY></HTML>");
var count = doc.All.Count;
var html = doc.All[0].OuterHtml;
In the sample solution this evaluates to:
count = 4; // [HTML, HEAD, TITLE, BODY]
html = "<HTML><HEAD></HEAD>\r\n<BODY>This is a new HTML document.</BODY></HTML>";
Meanwhile in the main program it comes out to:
count = 3; // [HTML, HEAD, BODY]
html = "<html><head></head><body>This is a new HTML document.</body></html>";
These are small discrepancies but that is largely due to the simple HTML used. The one that causes the crash has rather significant differences.
I am absolutely stumped as to how the result can be so vastly different.
The documentation for HtmlDocument.Write(string) states that:
It is recommended that you write an entire valid HTML document using the Write method, including HTML and BODY tags. However, if you write just HTML elements, the Document Object Model (DOM) will supply these elements for you.
But I have no idea how the DomDocument is provided nor why they would be different in the first place. Both solutions are running in x64 Debug mode and Net-Framework 4.6.2.
Both load the module: C:\WINDOWS\assembly\GAC\Microsoft.mshtml\7.0.3300.0__b03f5f7f11d50a3a\Microsoft.mshtml.dll
How is it possible that these produce different results?!
Any and all help welcome.
Thanks in advance.

The difference in behavior stems from the registry entry as proposed by Jimi and linked here
HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION
To reiterate, in the above registry the main program had an entry "ApplicationFileName.exe"=dword:00002af9 while my new test-application didn't.
This, in and for itself, does not explain the crashing of mshtml.dll itself but since the question was about the difference in behavior I'll post this as the answer. The crash is most likely linked to the outdated version used by Visual Studio but I haven't yet had the chance to look into some of the proposed fixes for that.

Related

UIAutomation throws AccessViolationException on Windows 11

The issue:
We have an application written in C# that uses UIAutomation to get the current text (either selected or the word behind the carret) in other applications (Word, OpenOffice, Notepad, etc.).
All is working great on Windows 10, even up to 21H2, last update check done today.
But we had several clients informing us that the application is closing abruptly on Windows 11.
After some debugging I've seen some System.AccessViolationException thrown when trying to use the TextPatternRange.GetText() method:
System.AccessViolationException: 'Attempted to read or write protected memory. This is often an indication that other memory is corrupt.'
What we've tried so far:
Setting uiaccess=true in manifest and signing the app : as mentionned here https://social.msdn.microsoft.com/Forums/windowsdesktop/en-US/350ceab8-436b-4ef1-8512-3fee4b470c0a/problem-with-manifest-and-uiaccess-set-to-true?forum=windowsgeneraldevelopmentissues => no changes (app is in C:\Program Files\
In addition to the above, I did try to set the level to "requireAdministrator" in the manifest, no changes either
As I've seen that it may come from a bug in Windows 11 (https://forum.emclient.com/t/emclient-9-0-1317-0-up-to-9-0-1361-0-password-correction-crashes-the-app/79904), I tried to install the 22H2 Preview release, still no changes.
Reproductible example
In order to be able to isolate the issue (and check it was not something else in our app that was causing the exception) I quickly made the following test (based on : How to get selected text of currently focused window? validated answer)
private void btnRefresh_Click(object sender, RoutedEventArgs e)
{
var p = Process.GetProcessesByName("notepad").FirstOrDefault();
var root = AutomationElement.FromHandle(p.MainWindowHandle);
var documentControl = new
PropertyCondition(AutomationElement.ControlTypeProperty,
ControlType.Document);
var textPatternAvailable = new PropertyCondition(AutomationElement.IsTextPatternAvailableProperty, true);
var findControl = new AndCondition(documentControl, textPatternAvailable);
var targetDocument = root.FindFirst(TreeScope.Descendants, findControl);
var textPattern = targetDocument.GetCurrentPattern(TextPattern.Pattern) as TextPattern;
string text = "";
foreach (var selection in textPattern.GetSelection())
{
text += selection.GetText(255);
Console.WriteLine($"Selection: \"{selection.GetText(255)}\"");
}
lblFocusedProcess.Content = p.ProcessName;
lblSelectedText.Content = text;
}
When pressing a button, this method is called and the results displayed in labels.
The method uses UIAutomation to get the notepad process and extract the selected text.
This works well in Windows 10 with latest update, crashes immediately on Windows 11 with the AccessViolationException.
On Windows 10 it works even without the uiaccess=true setting in the manifest.
Questions/Next steps
Do anyone know/has a clue about what can cause this?
Is Windows 11 way more regarding towards UIAutomation?
On my side I'll probably open an issue by Microsoft.
And one track we might follow is getting an EV and sign the app itself and the installer as it'll also enhance the installation process, removing the big red warnings. But as this is an app distributed for free we had not done it as it was working without it.
I'll also continue testing with the reproductible code and update this question should anything new appear.
I posted the same question on MSDN forums and got this answer:
https://learn.microsoft.com/en-us/answers/questions/915789/uiautomation-throws-accessviolationexception-on-wi.html
Using IUIautomation instead of System.Windows.Automation works on Windows 11.
So I'm marking this as solved but if anyone has another idea or knows what happens you're welcome to comment!

Rotativa ActionAsPdf() Very Slow

Using Rotativa 1.6.4 from NuGet and have noticed the following issue using the code below.
ActionAsPdf hangs randomly for indeterminate amount of time.
Code below that is hanging:
var pdfResult = new ActionAsPdf("Report", new {id = Request.Params["id"]})
{
Cookies = cookieCollection,
FormsAuthenticationCookieName = FormsAuthentication.FormsCookieName,
CustomSwitches = "--load-error-handling ignore"
};
Background info that may help:
The customSwitches is in use to ignore a documented issue calling wkhtmltopdf.exe using the ActionAsPdf, but it does not suppress errors in the code only in the wkhtmltopdf call.
Observations, usage and testing:
It works but when running the application (whether or not stepping through code), it can be anywhere from 10 seconds up to about 4 minutes between hitting the pdfResult = new ActionAsPdf and finally entering into the "Report" action being called. Can't discern anything actually happening in the output window of Visual Studio, no errors are being thrown that I have found. Just random slow transition into the Reports() action.
I can run the Reports() action directly via URL and it never slows like this and is quite fast for PDF generation. I am running it using the ActionAsPdf to obtain the binary to save to file system and send via email, which is the prescribed method of doing so for this library.
The behavior exists on both a local Windows 10 dev box and a remote Server 2008R2 Test box. .Net 4.5.1 on both boxes, default IIS on each.
Questions I have:
Any idea on what might cause this slow down and how to remedy it?
I ended up using UrlAsPdf() instead of ActionAsPdf() and it works. Seems there may be some issues with the ActionAsPdf() and I have filed a bug with Rotative project on GitHub. The ActionAsPdf() is still marked as beta, so hopefully it get's fixed in future versions or by the community.
In my case, I had to do few more tweaks along with using UrlAsPdf(). I have narrowed down the issue to the cookie collection that I was adding. So I tried just adding the cookie that I needed, and the issue was resolved. Following is the sample code that I have used.
var report = new UrlAsPdf(url);
Dictionary<string, string> cookieCollection = new Dictionary<string, string>();
foreach (var key in Request.Cookies.AllKeys)
{
if (Crypto.Hash("_user").Equals(key))
{
cookieCollection.Add(key, Request.Cookies.Get(key).Value);
break;
}
}
report.Cookies = cookieCollection;
report.FormsAuthenticationCookieName = FormsAuthentication.FormsCookieName;

In coded ui how do you correctly retrieve a browser window that contains an embedded Adobe PDF reader in browser

I'am running across this issue when I'm debugging or running my coded UI automation project, where i get the exception labeled "{"COM object that has been separated from its underlying RCW cannot be used." System.Exception {System.Runtime.InteropServices.InvalidComObjectException}" everytime i come from a browser window that contains a pdf reader embedded in it. This happens every time I retrieve the window and try to click back. It barfs when i perform the back method on it. I've tried different things but none has worked including the playback wait.
var hereIsmypdf = ReturnPDFDoc();
public BrowserWindow ReturnPDFDoc()
{
Playback.Wait(1000);
var myPdFdoc = GlobalVariables.Browser;
return myPdFdoc;
}
hereIsmypdf.Back();
The only way i was able to get around this issue was not to use the BrowserWindow class. I ended up using the WinWindow class and just getting the tab of the window from it. The BrowserWindow class seemed to trigger the exception "COM object that has been separated from its underlying RCW cannot be used." System.Exception {System.Runtime.InteropServices.InvalidComObjectException}" everytime i tried to retrieve it. I hope this helps someone one or maybe someone has a better way to handle this issue.
For the people that voted my question down, i really did try to figure it out. Sorry i wasnt clear about what i was asking the community or couldn't properly articulate what this pain was. I'm sure someone probably is going through the same pain i did and having a hard time articulating whats going on.
Here is my code on what i ended up doing
public WinTabPage ReturnPDFDoc()
{
WinWindow Wnd = new WinWindow();
Wnd.SearchProperties[BrowserWindow.PropertyNames.ClassName] = "IEFrame";
WinTabList tabRoWlist = new WinTabList(Wnd);
tabRoWlist.SearchProperties[WinTabPage.PropertyNames.Name] = "Tab Row";
WinTabPage myTab = new WinTabPage(tabRoWlist);
myTab.SearchConfigurations.Add(SearchConfiguration.AlwaysSearch);
myTab.SearchProperties[WinTabPage.PropertyNames.Name] = "something";
//UITestControlCollection windows = newWin.FindMatchingControls();
return myTab;
}

ABCPdf giving "Unable to render HTML. Unable to apply JScript" even with simple OnLoadScript

I'm trying to do some simple DOM manipulation when a page is rendered as a PDF using ABCPdf. I followed what they document here: http://www.websupergoo.com/helppdf9net/source/5-abcpdf/xhtmloptions/2-properties/usescript.htm
But when I try something as simple as the following:
var doc = new Doc();
doc.HtmlOptions.UseScript = true;
doc.HtmlOptions.UseNoCache = true;
doc.HtmlOptions.PageCachePurge();
doc.HtmlOptions.OnLoadScript = #"var reportElms = document.getElementsByClassName(""report"");";
doc.Page = doc.AddPage();
doc.AddImageUrl(Url.Action("TestPdf", "Pdf", new { }, "http"));
I get the exception:
Unable to render HTML. Unable to apply JScript.
COM error 80020101.
Script 'var reportElms = document.getElementsByClassName("report");'.
Any thoughts as to what I'm doing wrong?
Not even the built in functions work
I'm even getting the same exception with the following script:
doc.HtmlOptions.OnLoadScript = #"
window.ABCpdf_RenderWait();
window.ABCpdf_RenderComplete();";
Btw, I'm using version 8 because that's what we have a licence for.
Edit:
I was missing the .external for the ABCpdf_RenderWait() and ABCpdf_RenderComplete() calls. It works if you reference them properly (imagine that):
doc.HtmlOptions.OnLoadScript = #"
window.external.ABCpdf_RenderWait();
window.external.ABCpdf_RenderComplete();";
Though as I mention in my answer, there are a lot of security hoops that need to be jumped through for IE also.
So I didn't actually get the IE engine to execute JavaScript the way I wanted but I was able to find a solution using the Gecko engine. The original NuGet install did not include the Gecko DLL, so I just downloaded the standalone install and added the DLLs manually.
After that everything worked exactly as expected.
I believe that the IE engine didn't work because it requires a lot of security configuration, because the FAQs spend a lot of time discussing debugging of security: http://www.websupergoo.com/support.htm#6.7

FOP and IKVM in .NET - Images Not Working

UPDATE2: I got it working completely now! Scroll way down to find out how...
UPDATE: I got it working! Well... partially. Scroll down for the answer...
I'm trying to get my FO file to show an external image upon transforming it to PDF (or RTF for that matter, but I'm not sure whether RTFs are even capable of displaying images (they are)) with FOP, but I can't seem to get it working. (The question asked here is different than mine.)
I am using IKVM 0.46.0.1 and have compiled a FOP 1.0 dll to put in .NET; this code worked fine when I didn't try to add images:
private void convertFoByMimetype(java.io.File fo, java.io.File outfile, string mimetype)
{
OutputStream output = null;
try
{
FOUserAgent foUserAgent = fopFactory.newFOUserAgent();
// configure foUserAgent as desired
// Setup outputput stream. Note: Using BufferedOutputStream
// for performance reasons (helpful with FileOutputStreams).
output = new FileOutputStream(outfile);
output = new BufferedOutputStream(output);
// Construct fop with desired output format
Fop fop = fopFactory.newFop(mimetype, foUserAgent, output);
// Setup JAXP using identity transformer
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(); // identity transformer
// Setup input stream
Source src = new StreamSource(fo);
// Resulting SAX events (the generated FO) must be piped through to FOP
Result res = new SAXResult(fop.getDefaultHandler());
// Start XSLT transformation and FOP processing
transformer.transform(src, res);
}
catch (Exception ex)
...
}
However, when I (or rather a DocBook2FO transformation) added the following code:
<fo:external-graphic src="url(images/interface.png)" width="auto" height="auto" content-width="auto" content-height="auto" content-type="content-type:image/png"></fo:external-graphic>
into the FO file, the image did not show. I read through a bit of the FAQ on Apache's site, which says:
3.3. Why is my graphic not rendered?
Most commonly, the external file is not being found by FOP. Check the
following:
Empty or wrong baseDir setting.
Spelling errors in the file name (including using the wrong case).
...
Other options did not seem to be my case (mainly for the reason below - "The Weird Part"). I tried this:
...
try
{
fopFactory.setBaseURL(fo.getParent());
FOUserAgent foUserAgent = fopFactory.newFOUserAgent();
foUserAgent.setBaseURL(fo.getParent());
FOURIResolver fourir = fopFactory.getFOURIResolver();
foUserAgent.setURIResolver(fourir);
// configure foUserAgent as desired
...
with no avail.
The Weird Part
When I use the command-line implementation of FOP, it works fine and displays my image with no problem. (I don't want to go the run-command-line-from-program route, because I don't want to force the users to install Java AND the .NET framework when they want to use my program.)
The png file is generated from GDI+ from within my application (using Bitmap.Save). I also tried different png files, but none of them worked for me.
Is there anything I might be missing?
Thanks a bunch for getting this far
UPDATE and possible answer
So I might have figured out why it didn't work. I put some time into studying the code (before I basically just copypasted it without thinking about it much). The problem is indeed in the wrong basedir setting.
The key is in this chunk of code:
// Setup JAXP using identity transformer
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(); // identity transformer
// Setup input stream
Source src = new StreamSource(fo);
// Resulting SAX events (the generated FO) must be piped through to FOP
Result res = new SAXResult(fop.getDefaultHandler());
// Start XSLT transformation and FOP processing
transformer.transform(src, res);
What happens here is an identity transformation, which routes its own result into an instance of FOP I've created before. This effectively changes the basedir of the routed FO into that of the application's executable. I have yet to figure out how to do this without a transformation and route my input directly into FOP, but for the moment I worked around this by copying my images into the executable's directory.
Which is where another problem came in. Now whenever I try to execute the code, I get an exception at the line that says transformer.transform(src, res);, which confuses the pants out of me, because it doesn't say anything. The ExceptionHelper says:
java.lang.ExceptionInInitializerError was caught
and there is no inner exception or exception message. I know this is hard to debug just from what I wrote, but I'm hoping there might be an easy fix.
Also, this e-mail seems vaguely related but there is no answer to it.
UPDATE2
Finally, after a few sleepless nights, I managed to get it working with one of the simplest ways possible.
I updated IKVM, compiled fop with the new version and replaced the IKVM references with the new dlls. The error no longer occurs and my image renders fine.
I hope this helps someone someday
I'm using very similar code, although without the FOUserAgent, Resolvers, etc. and it works perfectly.
Did you try setting the src attribute in the XSLT without the url() function?
What might help you diagnose the problem further are the following statements:
java.lang.System.setProperty("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.SimpleLog")
java.lang.System.setErr(New java.io.PrintStream(New TraceStream(TraceStream.Level.Error)))
java.lang.System.setOut(New java.io.PrintStream(New TraceStream(TraceStream.Level.Info)))
Where TraceStream is a .NET implementation of a java.io.OutputStream which writes to your favorite logger.
I posted a version for the Common.Logging package at http://pastebin.com/XH1Wg7jn.
Here's a post not to leave the question unanswered, see "Update 1" and "Update 2" in the original post for the solution.

Categories