I am trying to convert a content script I wrote for Google Chrome into an IE add-on, mainly using the code in this answer.
I needed to inject a stylesheet, and I found a way to do it using JavaScript. I thought I might be able to do the same using C#. Here's my code:
[ComVisible(true)]
[Guid(/* replaced */)]
[ClassInterface(ClassInterfaceType.None)]
public class SimpleBHO: IObjectWithSite
{
private WebBrowser webBrowser;
void webBrowser_DocumentComplete(object pDisp, ref object URL)
{
var document2 = webBrowser.Document as IHTMLDocument2;
var document3 = webBrowser.Document as IHTMLDocument3;
// trying to add a '<style>' element to the header. this does not work.
var style = document2.createElement("style");
style.innerHTML = ".foo { background-color: red; }";// this line is the culprit!
style.setAttribute("type", "text/css");
var headCollection = document3.getElementsByTagName("head");
var head = headCollection.item(0, 0) as IHTMLDOMNode;
var result = head.appendChild(style as IHTMLDOMNode);
// trying to replace an element in the body. this part works if
// adding style succeeds.
var journalCollection = document3.getElementsByName("elem_id");
var journal = journalCollection.item(0, 0) as IHTMLElement;
journal.innerHTML = "<div class=\"foo\">Replaced!</div>";
// trying to execute some JavaScript. this part works as well if
// adding style succeeds.
document2.parentWindow.execScript("alert('Hi!')");
}
int IObjectWithSite.SetSite(object site)
{
if (site != null)
{
webBrowser = (WebBrowser)site;
webBrowser.DocumentComplete += new DWebBrowserEvents2_DocumentCompleteEventHandler(webBrowser_DocumentComplete);
}
else
{
webBrowser.DocumentComplete -= new DWebBrowserEvents2_DocumentCompleteEventHandler(webBrowser_DocumentComplete);
webBrowser = null;
}
return 0;
}
/* some code (e.g.: IObjectWithSite.GetSite) omitted to improve clarity */
}
If I just comment out the following line...
style.innerHTML = ".foo { background-color: red; }";
... the rest of the code executes perfectly (The element #elem_id is replaced and the JavaScript I injected is executed).
What am I doing wrong when trying to inject the stylesheet? Is this even possible?
EDIT: I found out that the site I'm trying to inject CSS into requests Document Mode 5, and when Compatibility View is disabled, my code works perfectly. But how do I make it work even when Compatibility View is enabled?
After a lot of experimenting, I found that the only failsafe way to inject stylesheets is to inject them using JavaScript, executed with IHTMLWindow2.execScript().
I used the following JavaScript:
var style = document.createElement('style');
document.getElementsByTagName('head')[0].appendChild(style);
var sheet = style.styleSheet || style.sheet;
if (sheet.insertRule) {
sheet.insertRule('.foo { background-color: red; }', 0);
} else if (sheet.addRule) {
sheet.addRule('.foo', 'background-color: red;', 0);
}
The above JavaScript was executed in the following fashion:
// This code is written inside a BHO written in C#
var document2 = webBrowser.Document as IHTMLDocument2;
document2.parentWindow.execScript(@"
/* Here, we have the same JavaScript mentioned above */
var style = docu....
...
}");
document2.parentWindow.execScript("alert('Hi!')");
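The same insertRule/addRule fallback can be wrapped in a small helper when more than one rule has to be injected. This is only a sketch, not part of the original answer: the name injectRules and its doc parameter are hypothetical, and the rules are assumed to be passed as [selector, declarations] pairs.

```javascript
// Hypothetical helper (not from the original answer): appends a new <style>
// element to <head> and adds each rule to its sheet, using insertRule where
// it exists and the legacy IE addRule otherwise.
// `doc` is any document-like object; `rules` is an array of
// [selector, declarations] pairs.
function injectRules(doc, rules) {
  var style = doc.createElement('style');
  doc.getElementsByTagName('head')[0].appendChild(style);
  var sheet = style.styleSheet || style.sheet;
  for (var i = 0; i < rules.length; i++) {
    var selector = rules[i][0];
    var declarations = rules[i][1];
    if (sheet.insertRule) {
      // Standards path: one complete rule string per call.
      sheet.insertRule(selector + ' { ' + declarations + ' }', i);
    } else if (sheet.addRule) {
      // Legacy IE path: selector and declarations are separate arguments.
      sheet.addRule(selector, declarations, i);
    }
  }
  return sheet;
}
```

From the BHO, the function text followed by a call such as injectRules(document, [['.foo', 'background-color: red;']]) would be passed as one string to execScript, in the same way as the script above.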
Good day, everyone.
How can I apply predefined styles of a Word document to inserted HTML?
Like:
builder.InsertHtml(post.Title)
// apply style from document "Media-title"
builder.InsertHtml(post.Content)
// apply style "Media-content"
Please note that the InsertHtml() overload with useBuilderFormatting does not override the styles of HTML text that has inline styles. You can implement INodeChangingCallback to apply styles or formatting to the HTML text. Please check the following code snippet for reference.
public static void HtmlFormatting()
{
// Create a blank document object
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
// Set up and pass the object which implements the handler methods.
doc.NodeChangingCallback = new HandleNodeChanging_FontChanger();
// Insert sample HTML content
builder.InsertHtml("<p>Hello World</p>");
doc.NodeChangingCallback = null;
builder.InsertHtml("<p>Some Test Text</p>");
doc.Save(@"Out.docx");
}
public class HandleNodeChanging_FontChanger : INodeChangingCallback
{
// Implement the NodeInserted handler to set default font settings for every Run node inserted into the Document
void INodeChangingCallback.NodeInserted(NodeChangingArgs args)
{
// Change the font of inserted text contained in the Run nodes.
if (args.Node.NodeType == NodeType.Run)
{
Run run = (Run)args.Node;
Console.WriteLine(run.Text);
run.Font.StyleName = "Intense Emphasis";
// Aspose.Words.Font font = ((Run)args.Node).Font;
// font.Size = 24;
// font.Name = "Arial";
}
}
void INodeChangingCallback.NodeInserting(NodeChangingArgs args)
{
// Do Nothing
}
void INodeChangingCallback.NodeRemoved(NodeChangingArgs args)
{
// Do Nothing
}
void INodeChangingCallback.NodeRemoving(NodeChangingArgs args)
{
// Do Nothing
}
}
I work with Aspose as a Developer Evangelist.
Well, after a couple of hours of searching for a solution, I got this to work:
builder.ParagraphFormat.ClearFormatting();
builder.ParagraphFormat.Style = styles["word_style1"];
builder.Writeln(post.Title);
builder.InsertParagraph();
builder.ParagraphFormat.Style = styles["word_style2"];
builder.InsertHtml(post.Annotation, true);
builder.InsertParagraph();
builder.ParagraphFormat.Style = styles["word_style3"];
builder.InsertHyperlink(post.Url, post.Url, false);
P.S.: I hope there are workarounds or improvements that do it better.
Using MvcMailer, the problem is that our emails are being sent without our CSS as inline style attributes.
PreMailer.Net is a C# Library that can read in an HTML source string, and return a resultant HTML string with CSS in-lined.
How do we use them together? Using the scaffolding example in the MvcMailer step-by-step guide, we start out with this example method in our UserMailer Mailer class:
public virtual MvcMailMessage Welcome()
{
return Populate(x => {
x.ViewName = "Welcome";
x.To.Add("some-email@example.com");
x.Subject = "Welcome";
});
}
Simply install PreMailer.Net via NuGet.
Update the Mailer class:
public virtual MvcMailMessage Welcome()
{
var message = Populate(x => {
x.ViewName = "Welcome";
x.To.Add("some-email@example.com");
x.Subject = "Welcome";
});
message.Body = PreMailer.Net.PreMailer.MoveCssInline(message.Body).Html;
return message;
}
Done!
If you have a text body with HTML as an alternate view (which I recommend) you'll need to do the following:
var message = Populate(m =>
{
m.Subject = subject;
m.ViewName = viewName;
m.To.Add(model.CustomerEmail);
m.From = new System.Net.Mail.MailAddress(model.FromEmail);
});
// get the BODY so we can process it
var body = EmailBody(message.ViewName);
var processedBody = PreMailer.Net.PreMailer.MoveCssInline(body, true).Html;
// start again with alternate view
message.AlternateViews.Clear();
// add BODY as alternate view
var htmlView = AlternateView.CreateAlternateViewFromString(processedBody, new ContentType("text/html"));
message.AlternateViews.Add(htmlView);
// add linked resources to the HTML view
PopulateLinkedResources(htmlView, message.LinkedResources);
Note: Even if you think you don't care about text it can help with spam filters.
I recommend reading the source for MailerBase to get a better idea what's going on cos all these Populate methods get confusing.
Note: This may not run as-is, but you get the idea. I have code (not shown) that parses for any img tags and adds them as automatic attachments.
The important part is to clear the HTML alternate view. You must have a .text.cshtml file for the text view.
If you're using ActionMailer.Net(.Next), you can do this:
protected override void OnMailSending(MailSendingContext context)
{
if (context.Mail.IsBodyHtml)
{
var inlineResult = PreMailer.Net.PreMailer.MoveCssInline(context.Mail.Body);
context.Mail.Body = inlineResult.Html;
}
for (var i = 0; i < context.Mail.AlternateViews.Count; i++)
{
var alternateView = context.Mail.AlternateViews[i];
if (alternateView.ContentType.MediaType != AngleSharp.Network.MimeTypeNames.Html) continue;
using (alternateView) // make sure it is disposed
{
string content;
using (var reader = new StreamReader(alternateView.ContentStream))
{
content = reader.ReadToEnd();
}
var inlineResult = PreMailer.Net.PreMailer.MoveCssInline(content);
context.Mail.AlternateViews[i] = AlternateView.CreateAlternateViewFromString(inlineResult.Html, alternateView.ContentType);
}
}
base.OnMailSending(context);
}
If you don't like using AngleSharp.Network.MimeTypeNames, you can just use "text/html". AngleSharp comes as a dependency of ActionMailer.Net.
I'm using SgmlReader to convert HTML to XML. The output goes into an XmlDocument object, whose InnerText property I can then use to extract the plain text from the website. I'm trying to get the text to look as clean as possible by removing any JavaScript. Looping through the XML and removing any <script type="text/javascript"> is easy enough, but I've hit a brick wall when any jQuery or styling isn't encapsulated in any tags. Can anybody help me out?
Sample Code:
Step one:
Once I use the WebClient class to download the HTML, I save it, then open the file with the TextReader class.
Step two:
Create the SgmlReader and set its input stream to the text reader:
// setup SGMLReader
Sgml.SgmlReader sgmlReader = new Sgml.SgmlReader();
sgmlReader.DocType = "HTML";
sgmlReader.WhitespaceHandling = WhitespaceHandling.All;
sgmlReader.CaseFolding = Sgml.CaseFolding.ToLower;
sgmlReader.InputStream = reader;
// create document
doc = new XmlDocument();
doc.PreserveWhitespace = true;
doc.XmlResolver = null;
doc.Load(sgmlReader);
Step three:
Once I have an XmlDocument, I use doc.InnerText to get my plain text.
Step four:
I can easily remove JavaScript tags like so:
XmlNodeList nodes = document.GetElementsByTagName("text/javascript");
for (int i = nodes.Count - 1; i >= 0; i--)
{
nodes[i].ParentNode.RemoveChild(nodes[i]);
}
Some stuff still slips through. Here's an example of the output for one particular website I'm scraping:
Criminal and Civil Enforcement | Fraud | Office of Inspector General | U.S. Department of Health and Human Services
#fancybox-right {
right:-20px;
}
#fancybox-left {
left:-20px;
}
#fancybox-right:hover span, #fancybox-right span
#fancybox-right:hover span, #fancybox-right span {
left:auto;
right:0;
}
#fancybox-left:hover span, #fancybox-left span
#fancybox-left:hover span, #fancybox-left span {
right:auto;
left:0;
}
#fancybox-overlay {
/* background: url('/connections/images/wc-overlay.png'); */
/* background: url('/connections/images/banner.png') center center no-repeat; */
}
$(document).ready(function(){
$("a[rel=photo-show]").fancybox({
'titlePosition' : 'over',
'overlayColor' : '#000',
'overlayOpacity' : 0.9
});
$(".title-under").fancybox({
'titlePosition' : 'outside',
'overlayColor' : '#000',
'overlayOpacity' : 0.9
})
});
That jQuery and styling need to be removed.
I just threw this together in LinqPad based on the html of this page and it properly removes the script and style tags.
void Main()
{
string htmlPath = @"C:\Users\Jschubert\Desktop\html\test.html";
var sgmlReader = new Sgml.SgmlReader();
var stringReader = new StringReader(File.ReadAllText(htmlPath));
sgmlReader.DocType = "HTML";
sgmlReader.WhitespaceHandling = WhitespaceHandling.All;
sgmlReader.CaseFolding = Sgml.CaseFolding.ToLower;
sgmlReader.InputStream = stringReader;
// create document
var doc = new XmlDocument();
doc.PreserveWhitespace = true;
doc.XmlResolver = null;
doc.Load(sgmlReader);
List<XmlNode> nodes = doc.GetElementsByTagName("script")
.Cast<XmlNode>().ToList();
var byType = doc.SelectNodes("script[@type = 'text/javascript']")
.Cast<XmlNode>().ToList();
var style = doc.GetElementsByTagName("style").Cast<XmlNode>().ToList();
nodes.AddRange(byType);
nodes.AddRange(style);
for (int i = nodes.Count - 1; i >= 0; i--)
{
nodes[i].ParentNode.RemoveChild(nodes[i]);
}
doc.DumpFormatted();
stringReader.Close();
sgmlReader.Close();
}
Casting to XmlNode to use the generic list is not ideal, but I did it for the sake of space and demonstration.
Also, you shouldn't need both
doc.GetElementsByTagName("script") and
doc.SelectNodes("script[@type = 'text/javascript']").
Again, I did that for the sake of demonstration.
If you have other scripts and you only want to remove JavaScript, use the latter. If you're removing all script tags, use the first one. Or, use both if you want.
I use C#.net.
I wrote a JavaScript function to show and hide a pair of divs (expanded and collapsed). It works well in IE but not in Firefox: the function is not even called, and I get the error "Error: ctl00_cpContents_dlSearchList_ctl08_profiledetailscollapse is not defined".
My JavaScript is as follows
function displayDiv(divCompact, divExpand) {
//alert('1');
var str = "ctl00_cpContents_";
var divstyle = new String();
// alert("ibtnShowHide" + ibtnShowHide);
divstyle = divCompact.style.display;
if (divstyle.toLowerCase() == "block" || divstyle == "") {
divCompact.style.display = "none";
divExpand.style.display = "block";
// ibtnShowHide.ImageUrl = "images/expand_img.GIF";
}
else {
// ibtnShowHide.ImageUrl = "images/restore_img.GIF";
divCompact.style.display = "block";
divExpand.style.display = "none";
}
return false;
}
ctl00_cpContents_dlSearchList_ctl08_profiledetailscollapse is an element id generated by ASP.NET. It's a profiledetailscollapse control inside dlSearchList.
The JavaScript variable "ctl00_cpContents_dlSearchList_ctl08_profiledetailscollapse" is not defined. Firefox does not automatically create, for each element with an id, a variable in the global scope named after that id and containing a reference to the element.
You might want to consider using jQuery to make sure that your DOM manipulation is cross-browser compatible.
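Without jQuery, the usual cross-browser fix is to pass the ids as strings and look the elements up with document.getElementById. The following is a minimal sketch of that idea applied to the function from the question; the name displayDivById and the optional doc parameter are assumptions made for illustration.

```javascript
// Sketch: resolve both elements by id instead of relying on global variables
// named after the ids (which Firefox does not create).
// `doc` is optional and defaults to the page's document.
function displayDivById(compactId, expandId, doc) {
  doc = doc || document;
  var divCompact = doc.getElementById(compactId);
  var divExpand = doc.getElementById(expandId);
  if (!divCompact || !divExpand) {
    return false; // nothing to toggle if either element is missing
  }
  var current = divCompact.style.display.toLowerCase();
  if (current === 'block' || current === '') {
    divCompact.style.display = 'none';
    divExpand.style.display = 'block';
  } else {
    divCompact.style.display = 'block';
    divExpand.style.display = 'none';
  }
  return false; // same contract as the original: cancel the default action
}
```

The caller would then pass the generated ids as quoted strings, for example displayDivById('ctl00_cpContents_dlSearchList_ctl08_profiledetailscollapse', ...) with the second id filled in; in ASP.NET markup those strings are typically emitted from each control's ClientID.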
I have a project that I am working on in VS2005. I have added a WebBrowser control. I add a basic empty page to the control
private const string _basicHtmlForm = "<html> "
+ "<head> "
+ "<meta http-equiv='Content-Type' content='text/html; charset=utf-8'/> "
+ "<title>Test document</title> "
+ "<script type='text/javascript'> "
+ "function ShowAlert(message) { "
+ " alert(message); "
+ "} "
+ "</script> "
+ "</head> "
+ "<body><div id='mainDiv'> "
+ "</div></body> "
+ "</html> ";
private string _defaultFont = "font-family: Arial; font-size:10pt;";
private void LoadWebForm()
{
try
{
_webBrowser.DocumentText = _basicHtmlForm;
}
catch(Exception ex)
{
MessageBox.Show(ex.Message);
}
}
and then add various elements via the dom (using _webBrowser.Document.CreateElement). I am also loading a css file:
private void AddStyles()
{
try
{
mshtml.HTMLDocument currentDocument = (mshtml.HTMLDocument) _webBrowser.Document.DomDocument;
mshtml.IHTMLStyleSheet styleSheet = currentDocument.createStyleSheet("", 0);
TextReader reader = new StreamReader(Path.Combine(Path.GetDirectoryName(Application.ExecutablePath),"basic.css"));
string style = reader.ReadToEnd();
styleSheet.cssText = style;
}
catch(Exception ex)
{
MessageBox.Show(ex.Message);
}
}
Here is the css page contents:
body {
background-color: #DDDDDD;
}
.categoryDiv {
background-color: #999999;
}
.categoryTable {
width:599px; background-color:#BBBBBB;
}
#mainDiv {
overflow:auto; width:600px;
}
The style page is loading successfully, but the only elements on the page that are being affected are the ones that are initially in the page (body and mainDiv). I have also tried including the CSS in a <style> element in the header section, but it still only affects the elements that are there when the page is created.
So my question is, does anyone have any idea why the CSS is not being applied to elements that are created after the page is loaded? I have also tried not applying the CSS until after all of my elements are added, but the results don't change.
I made a slight modification to your AddStyles() method and it works for me.
Where are you calling it from? I called it from "_webBrowser_DocumentCompleted".
I have to point out that I am calling AddStyles after I modify the DOM.
private void AddStyles()
{
try
{
if (_webBrowser.Document != null)
{
IHTMLDocument2 currentDocument = (IHTMLDocument2)_webBrowser.Document.DomDocument;
int length = currentDocument.styleSheets.length;
IHTMLStyleSheet styleSheet = currentDocument.createStyleSheet(@"", length + 1);
//length = currentDocument.styleSheets.length;
//styleSheet.addRule("body", "background-color:blue");
TextReader reader = new StreamReader(Path.Combine(Path.GetDirectoryName(Application.ExecutablePath), "basic.css"));
string style = reader.ReadToEnd();
styleSheet.cssText = style;
}
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
}
Here is my DocumentCompleted handler (I added some styles to basic.css for testing):
private void _webBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
HtmlElement element = _webBrowser.Document.CreateElement("p");
element.InnerText = "Hello World1";
_webBrowser.Document.Body.AppendChild(element);
HtmlElement divTag = _webBrowser.Document.CreateElement("div");
divTag.SetAttribute("class", "categoryDiv");
divTag.InnerHtml = "<p>Hello World2</p>";
_webBrowser.Document.Body.AppendChild(divTag);
HtmlElement divTag2 = _webBrowser.Document.CreateElement("div");
divTag2.SetAttribute("id", "mainDiv2");
divTag2.InnerHtml = "<p>Hello World3</p>";
_webBrowser.Document.Body.AppendChild(divTag2);
AddStyles();
}
This is what I get (modified the style to make it as ugly as a single human being can hope to make it :D ):
One solution is to inspect the HTML prior to setting the DocumentText and inject the CSS on the client side. I don't set the control's Url property, but rather get the HTML via WebClient and then set the DocumentText. Maybe setting DocumentText (or in your case Document) after you manipulate the DOM could get it to re-render properly.
private const string CSS_960 = @"960.css";
private const string SCRIPT_FMT = @"<style TYPE=""text/css"">{0}</style>";
private const string HEADER_END = @"</head>";
public void SetDocumentText(string value)
{
this.Url = null; // can't have both URL and DocText
this.Navigate("About:blank");
string css = null;
string html = value;
// check for known CSS file links and inject the resourced versions
if(html.Contains(CSS_960))
{
css = GetEmbeddedResourceString(CSS_960);
html = html.Insert(html.IndexOf(HEADER_END), string.Format(SCRIPT_FMT,css));
}
if (Document != null) {
Document.Write(string.Empty);
}
DocumentText = html;
}
It would be quite hard to say unless you send a link to this.
But usually the best method for style-related work is to have the CSS already in the page, and in your C# code only add ids or classes to elements so that the styles take effect.
I have found that generated tags with a class attribute do not get their styles applied.
This is my workaround that is done after the document is generated:
public static class WebBrowserExtensions
{
public static void Redraw(this WebBrowser browser)
{
string temp = Path.GetTempFileName();
File.WriteAllText(temp, browser.Document.Body.Parent.OuterHtml,
Encoding.GetEncoding(browser.Document.Encoding));
browser.Url = new Uri(temp);
}
}
I use a similar control instead of WebBrowser. I load an HTML page with "default" style rules and change the rules from within the program.
(Drawback: maintenance. When I need to add a rule, I also need to change it in code.)
' ----------------------------------------------------------------------
Public Sub mcFontOrColorsChanged(ByVal isRefresh As Boolean)
' ----------------------------------------------------------------------
' Notify whichever is concerned:
Dim doc As mshtml.HTMLDocument = Me.Document
If (doc.styleSheets Is Nothing) Then Return
If (doc.styleSheets.length = 0) Then Return
Dim docStyleSheet As mshtml.IHTMLStyleSheet = CType(doc.styleSheets.item(0), mshtml.IHTMLStyleSheet)
Dim docStyleRules As mshtml.HTMLStyleSheetRulesCollection = CType(docStyleSheet.rules, mshtml.HTMLStyleSheetRulesCollection)
' Note: the following is needed separately from 'Case "BODY"
Dim docBody As mshtml.HTMLBodyClass = CType(doc.body, mshtml.HTMLBodyClass)
If Not (docBody Is Nothing) Then
docBody.style.backgroundColor = colStrTextBg
End If
Dim i As Integer
Dim maxI As Integer = docStyleRules.length - 1
For i = 0 To maxI
Select Case (docStyleRules.item(i).selectorText)
Case "BODY"
docStyleRules.item(i).style.fontFamily = fName ' "Times New Roman" | "Verdana" | "courier new" | "comic sans ms" | "Arial"
Case "P.myStyle1"
docStyleRules.item(i).style.fontSize = fontSize.ToString & "pt"
Case "TD.myStyle2" ' do nothing
Case ".myStyle3"
docStyleRules.item(i).style.fontSize = fontSizePath.ToString & "pt"
docStyleRules.item(i).style.color = colStrTextFg
docStyleRules.item(i).style.backgroundColor = colStrTextBg
Case Else
Debug.WriteLine("Rule " & i.ToString & " " & docStyleRules.item(i).selectorText)
End Select
Next i
If (isRefresh) Then
Me.myRefresh(curNode)
End If
End Sub
It could be that styles are applied only to objects that exist at the time the page is loaded. Just because you add a node to the DOM tree doesn't mean that all of its attributes can be manipulated and rendered inside the browser.
The methods above seem to use an approach that reloads the page (DOM), which suggests that this may be the case.
In short, refresh the page after you've added an element.
It sounds as though phq has experienced this. The way I would approach it is to add a reference to jQuery to your HTML document (from the start).
Then, inside the page, create a JavaScript function that accepts the element id and the name of the class to apply. Inside the function, use jQuery to dynamically apply the class in question or to modify the CSS directly. For example, use jQuery's .addClass or .css functions to modify the element.
From there, in your C# code, after you add the element dynamically, invoke this JavaScript as described by Rick Strahl here: http://www.west-wind.com/Weblog/posts/493536.aspx
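A minimal sketch of such an in-page function follows. It swaps jQuery's .addClass for plain DOM calls so that it has no dependencies; the name applyClass and the optional doc parameter are hypothetical.

```javascript
// Sketch of the in-page helper described above, using plain DOM calls in
// place of jQuery's .addClass. Returns true if the element ends up carrying
// the class, false if no element with that id exists.
// `doc` is optional and defaults to the page's document.
function applyClass(elementId, className, doc) {
  doc = doc || document;
  var el = doc.getElementById(elementId);
  if (!el) {
    return false;
  }
  var existing = (el.className || '').split(/\s+/);
  var classes = [];
  for (var i = 0; i < existing.length; i++) {
    if (existing[i] === className) {
      return true; // already present, nothing to change
    }
    if (existing[i] !== '') {
      classes.push(existing[i]);
    }
  }
  classes.push(className);
  // Re-assigning className gives the browser a chance to re-evaluate styles.
  el.className = classes.join(' ');
  return true;
}
```

With this function defined in the page's script block, the WinForms side could call it after adding an element, for instance through HtmlDocument.InvokeScript("applyClass", new object[] { "mainDiv2", "categoryDiv" }). Whether this alone forces the control to re-apply the stylesheet is not confirmed by the thread.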