Change font of html content converted to string using web browser control - c#

I am using this to convert HTML content to XAML but I need to change font size of the content. So, I am trying to use this to change the font size but I am getting doc as null. Any idea why?
Here's my code-
public static void DocumentPropertyChanged(DependencyObject target, DependencyPropertyChangedEventArgs e)
{
WebBrowser browser = target as WebBrowser;
var doc = browser.Document as HTMLDocument;
if (browser != null)
{
string document = e.NewValue as string;
browser.NavigateToString(document);
}
if (doc != null)
{
doc.execCommand("FontSize", false, 12);
doc.execCommand("FontFamily", false, "Arial");
}
}

Try this:
public static void DocumentPropertyChanged(DependencyObject target, DependencyPropertyChangedEventArgs e)
{
if (!(target is WebBrowser)) // Handles null and other weird things.
throw new Exception("target is not a WebBrowser!");
WebBrowser browser = target as WebBrowser;
string document = e.NewValue as string;
if (document == null)
throw new Exception("e.NewValue is not a string!");
browser.NavigateToString(document);
var doc = browser.Document as HTMLDocument;
if (doc != null)
{
doc.execCommand("FontSize", false, 12);
doc.execCommand("FontFamily", false, "Arial");
}
else
{
throw new Exception("browser.Document is not an HTMLDocument!");
}
}
I think this requires a reference to Microsoft.mshtml.dll and a using mshtml; statement, if you haven't done that yet.
I've added all of the throws, because I'm not 100% confident where the problem is, having not run this code.
So for what it's worth, I hope this helps.
Edit...
The documentation states that NavigateToString loads the content asynchronously.
By assigning doc after the navigation, the above code appears to work for very short documents, but that cannot be trusted. Assigning doc before navigation does not work.
A better solution might be to handle the WebBrowser.LoadCompleted event, which fires once the document has finished downloading, before interacting with the WebBrowser.Document property (Navigated fires earlier, when the download has only started):
XAML:
<WebBrowser Name="browser" LoadCompleted="Browser_LoadCompleted"/>
CS:
public static void DocumentPropertyChanged(DependencyObject target, DependencyPropertyChangedEventArgs e)
{
WebBrowser browser = target as WebBrowser;
if (browser != null)
{
string document = e.NewValue as string;
browser.NavigateToString(document);
}
}
private void Browser_LoadCompleted(object sender, NavigationEventArgs e)
{
// Use the sender instead of a field, so the handler works for whichever WebBrowser raised the event.
var browser = sender as WebBrowser;
var doc = browser != null ? browser.Document as HTMLDocument : null;
if (doc != null)
{
doc.execCommand("FontSize", false, 12);
doc.execCommand("FontFamily", false, "Arial");
}
}
Note: This code will run your execCommand (which I have not tested btw) on every document loaded into the WebBrowser. If this is a problem we can fix it.
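If the goal is only to set a default font, one alternative that avoids the timing problem altogether is to add a CSS rule to the HTML string before passing it to NavigateToString. This is a minimal sketch, not the approach from the answer above. It assumes the content does not set its own fonts (inline styles in the content would override the rule). HtmlFontStyler and WithBodyFont are hypothetical names, not part of WPF:

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlFontStyler
{
    // Hypothetical helper: returns the HTML with a <style> rule for the body font added.
    public static string WithBodyFont(string html, string fontFamily, int fontSizePt)
    {
        if (html == null) throw new ArgumentNullException(nameof(html));

        string style = "<style>body { font-family: " + fontFamily +
                       "; font-size: " + fontSizePt + "pt; }</style>";

        // Prefer to place the rule right after the opening <head> tag.
        // The (\s[^>]*)? part keeps the pattern from matching <header>.
        Match head = Regex.Match(html, @"<head(\s[^>]*)?>", RegexOptions.IgnoreCase);
        if (head.Success)
        {
            return html.Insert(head.Index + head.Length, style);
        }

        // HTML fragment without a <head>: prepend the rule instead.
        return style + html;
    }
}
```

In the property-changed callback this would be used as browser.NavigateToString(HtmlFontStyler.WithBodyFont(document, "Arial", 12)); so no access to browser.Document is needed.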

Related

Manipulating HTML document before displaying into WPF WebBrowser control

I have to change inner html code before showing it in the WebBrowser.
Test page - http://aksmod.ru/skajrim-mod-kukri-ot-aksyonov-v5-0/
I tried to use AngleSharp.Scripting, but it doesn't work correctly (the ads don't load):
var config = new Configuration().WithDefaultLoader().WithJavaScript();
var document = BrowsingContext.New(config).OpenAsync(address).Result;
//do something
return document.DocumentElement.OuterHtml;
Later I thought about using LoadCompleted, but the result was the same:
private void Wb_LoadCompleted(object sender, NavigationEventArgs e)
{
Console.WriteLine("Loaded");
string url = e.Uri.ToString();
if (!(url.StartsWith("http://") || url.StartsWith("https://")))
{ }
if (e.Uri.AbsolutePath != wb.Source.AbsolutePath)
{ }
else
{
Console.WriteLine("Full Loaded");
HTMLDocument html = (HTMLDocument)wb.Document;
var value = html.getElementsByTagName("html").item(index: 0);
//do something
wb.NavigateToString(value.OuterHtml);
}
}
The event just doesn't fire for this page (although it works fine for some other sites).
So, what am I missing?
Update 1
MCVE
XAML
<Grid>
<WebBrowser Name="wb" />
</Grid>
Code behind
public partial class MainWindow : Window
{
public MainWindow()
{
InitializeComponent();
wb.Navigated += Wb_Navigated;
wb.LoadCompleted += Wb_LoadCompleted;
wb.Navigate("http://aksmod.ru/skajrim-mod-kukri-ot-aksyonov-v5-0/");
}
private void Wb_LoadCompleted(object sender, NavigationEventArgs e)
{
Console.WriteLine("Loaded");
string url = e.Uri.ToString();
if (!(url.StartsWith("http://") || url.StartsWith("https://")))
{ }
if (e.Uri.AbsolutePath != wb.Source.AbsolutePath)
{ }
else
{
Console.WriteLine("Full Loaded");
HTMLDocument html = (HTMLDocument)wb.Document;
var value = html.getElementsByTagName("html").item(index: 0);
//do something
wb.NavigateToString(value.OuterHtml);
}
}
private void Wb_Navigated(object sender, NavigationEventArgs e)
{
FieldInfo fiComWebBrowser = typeof(WebBrowser)
.GetField("_axIWebBrowser2",
BindingFlags.Instance | BindingFlags.NonPublic);
if (fiComWebBrowser == null) return;
object objComWebBrowser = fiComWebBrowser.GetValue(wb);
if (objComWebBrowser == null) return;
objComWebBrowser.GetType().InvokeMember(
"Silent", BindingFlags.SetProperty, null, objComWebBrowser,
new object[] { true });
Console.WriteLine("Navigated");
}
}
The ads are embedded as an iframe within the page you presented. In my case, the ad URL loaded in the iframe is something like https://cdn.254a.com/images/hosted/elv/retargeting/v5/728x90.html?... (check with the web browser's inspector tool).
The ad probably does not allow framing in your page (check what the ad returns in the X-Frame-Options header field). If this is the issue, it should be possible to implement a proxy for the ad and let the proxy change the X-Frame-Options header.
In this case, if the ad URL is HTTPS (and not just HTTP), you'd need a proxy that acts as a man-in-the-middle; see the accepted answer to "What's the point of the X-Frame-Options header?". Alternatively, you could replace the ad URL with your proxy's URL and pass the original URL as a query argument. The proxy then acts as an HTTPS client, fetches the content, modifies the header, and returns the content to your page over plain HTTP.
You can use http://html-agility-pack.net to manipulate the HTML code in C#.

Cannot get rendered html via WebBrowser

I want to get the HTML code of a website. In a browser I can usually just click ‘View Page Source’ in the context menu or something similar, but how can I automate that? I’ve tried the WebBrowser class, but sometimes it doesn’t work. I am not a web developer, so I don’t really know whether my approach even makes sense. I think the main problem is that I sometimes get HTML for which not all scripts have run yet, so it is incomplete. I have this problem with, for example, this site: http://www.sreality.cz/en/search/for-sale/praha
My code (I’ve tried to make it small but runnable on its own):
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;
using System.Windows.Forms;
namespace WebBrowserForm
{
internal static class Program
{
[STAThread]
private static void Main()
{
Application.EnableVisualStyles();
Application.SetCompatibleTextRenderingDefault(false);
for (int i = 0; i < 10; i++)
{
Form1 f = new Form1();
f.ShowDialog();
}
// Now I can check Form1.List and see that some html is final and some is not
}
}
public class Form1 : Form
{
public static List<string> List = new List<string>();
private const string Url = "http://www.sreality.cz/en/search/for-sale/praha";
private System.Windows.Forms.WebBrowser webBrowser1;
public Form1()
{
this.webBrowser1 = new System.Windows.Forms.WebBrowser();
this.SuspendLayout();
this.webBrowser1.Dock = System.Windows.Forms.DockStyle.Fill;
this.webBrowser1.Name = "webBrowser1";
this.webBrowser1.TabIndex = 0;
this.ResumeLayout(false);
Load += new EventHandler(Form1_Load);
this.webBrowser1.ObjectForScripting = new MyScript();
}
private void Form1_Load(object sender, EventArgs e)
{
webBrowser1.Navigate(Url);
webBrowser1.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);
}
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
if (webBrowser1.ReadyState == WebBrowserReadyState.Complete)
{
// Final html for 99% of web pages, but unfortunately not for all
string tst = webBrowser1.Document.GetElementsByTagName("HTML")[0].OuterHtml;
webBrowser1.DocumentCompleted -= new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);
Application.DoEvents();
webBrowser1.Navigate("javascript: window.external.CallServerSideCode();");
Application.DoEvents();
}
}
[ComVisible(true)]
public class MyScript
{
public void CallServerSideCode()
{
HtmlDocument doc = ((Form1)Application.OpenForms[0]).webBrowser1.Document;
string renderedHtml = doc.GetElementsByTagName("HTML")[0].OuterHtml;
// here I sometimes get full html but sometimes the same as in webBrowser1_DocumentCompleted method
List.Add(renderedHtml);
((Form1)Application.OpenForms[0]).Close();
}
}
}
}
I would expect to get the final HTML in the ‘webBrowser1_DocumentCompleted’ method. It usually works, but not with this site. So I tried to get the HTML from my own code that is invoked from within the page, in the method ‘CallServerSideCode’. Strangely, I sometimes get the final HTML (basically the same as when I do it manually in a browser) and sometimes not. I think the problem is that my script starts before the whole page is rendered rather than after. But I am not really sure, since this kind of thing is far from my comfort zone and I don’t really understand what I am doing. I’m just trying to apply something I found on the internet.
So, does anyone know what is wrong with the code? Or, even more importantly, how to easily get the final HTML from the site?
Any help is appreciated.
You should use the WebClient class to download the HTML page; no display control is necessary.
The method you want is DownloadString.
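As a rough sketch of that suggestion: the console program below downloads the page from the question with WebClient.DownloadString. Note that this returns the markup exactly as the server sends it, before any script has run, so for script-heavy pages it will not match what a browser shows after rendering. The UTF-8 encoding is an assumption about the site, and PageDownloader is a hypothetical name:

```csharp
using System;
using System.Net;
using System.Text;

internal static class PageDownloader
{
    private static void Main()
    {
        using (var client = new WebClient())
        {
            // Assumption: the page is served as UTF-8.
            client.Encoding = Encoding.UTF8;

            // Raw HTML as sent by the server; client-side scripts are not executed.
            string html = client.DownloadString("http://www.sreality.cz/en/search/for-sale/praha");
            Console.WriteLine("Downloaded {0} characters.", html.Length);
        }
    }
}
```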
Maybe it will help if you append the call to your external function to the end of the body and wrap it in jQuery's DOM-ready function. I mean something like this:
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
if (webBrowser1.ReadyState == WebBrowserReadyState.Complete)
{
// Final html for 99% of web pages, but unfortunately not for all
string tst = webBrowser1.Document.GetElementsByTagName("HTML")[0].OuterHtml;
webBrowser1.DocumentCompleted -= new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);
HtmlElement body = webBrowser1.Document.GetElementsByTagName("body")[0];
HtmlElement scriptEl = webBrowser1.Document.CreateElement("script");
IHTMLScriptElement element = (IHTMLScriptElement)scriptEl.DomElement;
element.text = "$(function() { window.external.CallServerSideCode(); });";
body.AppendChild(scriptEl);
}
}
[ComVisible(true)]
public class MyScript
{
public void CallServerSideCode()
{
HtmlDocument doc = ((Form1)Application.OpenForms[0]).webBrowser1.Document;
string renderedHtml = doc.GetElementsByTagName("HTML")[0].OuterHtml;
// here I sometimes get full html but sometimes the same as in webBrowser1_DocumentCompleted method
List.Add(renderedHtml);
((Form1)Application.OpenForms[0]).Close();
}
}

WebClient on Store Universal Apps

I'm using this code in a Windows desktop app to get the values of a combobox. I then need to select one of them, which updates the page with new information using JavaScript.
private WebBrowser withEventsField_wb;
WebBrowser wb {
get { return withEventsField_wb; }
set {
if (withEventsField_wb != null) {
withEventsField_wb.Navigated -= navigated;
}
withEventsField_wb = value;
if (withEventsField_wb != null) {
withEventsField_wb.Navigated += navigated;
}
}
}
private void Form1_Load(object sender, EventArgs e)
{
wb = new WebBrowser();
wb.Navigate("https://academicos.ubi.pt/online/horarios.aspx?p=a");
}
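// The handler must match WebBrowserNavigatedEventHandler, otherwise "Navigated += navigated" does not compile.
private void navigated(object sender, WebBrowserNavigatedEventArgs e)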
{
HtmlElementCollection allelements = wb.Document.All;
HtmlElement year = default(HtmlElement);
foreach (HtmlElement webpageelement in allelements) {
if (webpageelement.GetAttribute("id").Contains("ContentPlaceHolder1_ddlAnoLect") == true) {
year = webpageelement;
HtmlElementCollection yoptions = year.Children;
foreach (HtmlElement yopt in yoptions) {
ComboBox1.Items.Add(yopt.InnerText);
}
}
}
}
But now I'm trying to do the same in a Universal App (Windows Phone/Windows) and I'm unable to. I know that I have to use HttpClient, but it does not work like a WebBrowser. The WebBrowser is created only in code to fetch the data I need, and at each step the website does not refresh normally but uses jQuery to load the new information.
Any help?
Well, after a lot of searching I found something that helps and even gave me another idea:
http://blog.gauravchouhan.com/tag/advance-web-scraping-using-c/
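For the first step only (reading the year options from the initial page), a sketch with HttpClient might look like the following. It assumes the select element and its option elements are already present in the HTML the server returns and that the markup is simple enough for a regular expression. A real HTML parser would be more robust, and the later steps that the page loads via script are not covered. YearOptionsReader, ParseOptions and LoadYearsAsync are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

public static class YearOptionsReader
{
    // Returns the text of each <option> inside the first <select> whose id contains idFragment.
    // Regex-based, so it only suits simple, well-formed markup.
    public static List<string> ParseOptions(string html, string idFragment)
    {
        var result = new List<string>();
        Match select = Regex.Match(
            html,
            "<select[^>]*id=\"[^\"]*" + Regex.Escape(idFragment) + "[^\"]*\"[^>]*>(.*?)</select>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        if (!select.Success) return result;

        foreach (Match option in Regex.Matches(
            select.Groups[1].Value,
            "<option[^>]*>(.*?)</option>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline))
        {
            result.Add(WebUtility.HtmlDecode(option.Groups[1].Value.Trim()));
        }
        return result;
    }

    // Downloads the page from the question and extracts the year options.
    public static async Task<List<string>> LoadYearsAsync()
    {
        using (var client = new HttpClient())
        {
            string html = await client.GetStringAsync("https://academicos.ubi.pt/online/horarios.aspx?p=a");
            return ParseOptions(html, "ContentPlaceHolder1_ddlAnoLect");
        }
    }
}
```

The returned list can then be added to the combobox items in the same way as in the desktop code.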

How to handle WPF WebBrowser control navigation exception

Let's say that the WPF WebBrowser control hits a navigation error and the page is not shown.
So there is an exception inside the WPF WebBrowser control.
I found some similar questions here, but they are not what I need.
In fact, I need some method or object that exposes the exception so that I can get at it.
How can we handle it?
Thank you!
P.S. There is an approach for the WinForms WebBrowser control... Can we do something similar for the WPF WebBrowser control?
public Form13()
{
InitializeComponent();
this.webBrowser1.Navigate("http://blablablabla.bla");
SHDocVw.WebBrowser axBrowser = (SHDocVw.WebBrowser)this.webBrowser1.ActiveXInstance;
axBrowser.NavigateError +=
new SHDocVw.DWebBrowserEvents2_NavigateErrorEventHandler(axBrowser_NavigateError);
}
void axBrowser_NavigateError(object pDisp, ref object URL,
ref object Frame, ref object StatusCode, ref bool Cancel)
{
if (StatusCode.ToString() == "404")
{
MessageBox.Show("Page not found");
}
}
P.S. #2 Hosting the WinForms WebBrowser control in a WPF app is not an answer, I think.
I'm struggling with a similar problem. When the computer loses its internet connection, we want to handle that in a nice way.
For lack of a better solution, I hooked up the Navigated event of the WebBrowser and look at the URL of the document. If it starts with res://ieframe.dll, I'm pretty confident that some error has occurred.
Maybe it is possible to look at the document and see whether the server returned a 404.
private void Navigated(object sender, NavigationEventArgs navigationEventArgs)
{
var browser = sender as WebBrowser;
if(browser != null)
{
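// Use the sender's document (the original read it from a behavior's AssociatedObject, which is not shown here).
var doc = browser.Document as HTMLDocument;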
if (doc != null)
{
if (doc.url.StartsWith("res://ieframe.dll"))
{
// Do stuff to handle error navigation
}
}
}
}
It's an old question but since I have just suffered through this, I thought I may as well share. First, I implemented Markus' solution but wanted something a bit better as our Firewall remaps 403 message pages.
I found an answer here (amongst other places) that suggests using NavigationService as it has a NavigationFailed event.
In your XAML, add:
<Frame x:Name="frame"/>
In your code-behind's constructor, add:
frame.Navigated += new System.Windows.Navigation.NavigatedEventHandler(frame_Navigated);
frame.NavigationFailed += frame_NavigationFailed;
frame.LoadCompleted += frame_LoadCompleted;
frame.NavigationService.Navigate(new Uri("http://theage.com.au"));
The handlers can now deal with either a failed navigation or a successful one:
void frame_NavigationFailed(object sender, System.Windows.Navigation.NavigationFailedEventArgs e)
{
e.Handled = true;
// TODO: Goto an error page.
}
private void frame_Navigated(object sender, System.Windows.Navigation.NavigationEventArgs e)
{
System.Diagnostics.Trace.WriteLine(e.WebResponse.Headers);
}
BTW: This is on the .Net 4.5 framework
It is also possible to use a dynamic approach here:
wb.Navigated += delegate(object sender, NavigationEventArgs args)
{
dynamic doc = ((WebBrowser)sender).Document;
var url = doc.url as string;
if (url != null && url.StartsWith("res://ieframe.dll"))
{
// Do stuff to handle error navigation
}
};
I'd been struggling with this issue for some time. I discovered a cleaner way to handle this than the accepted answer. Checking for res://ieframe.dll didn't always work for me; sometimes the document URL is null when a navigation error happens.
Add the following references to your project:
Microsoft.mshtml
Microsoft.VisualStudio.OLE.Interop
SHDocVw (Under COM it's called "Microsoft Internet Controls")
Create the following helper class:
using System;
using System.Diagnostics.CodeAnalysis;
using System.Runtime.InteropServices;
using System.Windows.Controls;
using System.Windows.Navigation;
/// <summary>
/// Adds event handlers to a webbrowser control
/// </summary>
internal class WebBrowserHelper
{
[SuppressMessage("StyleCop.CSharp.NamingRules", "SA1310:FieldNamesMustNotContainUnderscore", Justification = "consistent naming")]
private static readonly Guid SID_SWebBrowserApp = new Guid("0002DF05-0000-0000-C000-000000000046");
internal WebBrowserHelper(WebBrowser browser)
{
// Add event handlers
browser.Navigated += this.OnNavigated;
// Navigate to about:blank to setup the browser event handlers in first call to OnNavigated
browser.Source = null;
}
internal delegate void NavigateErrorEvent(string url, int statusCode);
internal event NavigateErrorEvent NavigateError;
private void OnNavigated(object sender, NavigationEventArgs e)
{
// Grab the browser and document instance
var browser = sender as WebBrowser;
var doc = browser?.Document;
// Check if this is a nav to about:blank
var aboutBlank = new Uri("about:blank");
if (aboutBlank.IsBaseOf(e.Uri))
{
Guid serviceGuid = SID_SWebBrowserApp;
Guid iid = typeof(SHDocVw.IWebBrowser2).GUID;
IntPtr obj = IntPtr.Zero;
var serviceProvider = doc as Microsoft.VisualStudio.OLE.Interop.IServiceProvider;
if (serviceProvider?.QueryService(ref serviceGuid, ref iid, out obj) == 0)
{
// Set up event handlers
var webBrowser2 = Marshal.GetObjectForIUnknown(obj) as SHDocVw.IWebBrowser2;
var webBrowserEvents2 = webBrowser2 as SHDocVw.DWebBrowserEvents2_Event;
if (webBrowserEvents2 != null)
{
// Add event handler for navigation error
webBrowserEvents2.NavigateError -= this.OnNavigateError;
webBrowserEvents2.NavigateError += this.OnNavigateError;
}
}
}
}
/// <summary>
/// Invoked when navigation fails
/// </summary>
[SuppressMessage("StyleCop.CSharp.NamingRules", "SA1305:FieldNamesMustNotUseHungarianNotation", Justification = "consistent naming")]
[SuppressMessage("StyleCop.CSharp.NamingRules", "SA1306:FieldNamesMustBeginWithLowerCaseLetter", Justification = "consistent naming")]
private void OnNavigateError(object pDisp, ref object URL, ref object Frame, ref object StatusCode, ref bool Cancel)
{
// Null-conditional call: avoids a NullReferenceException when nobody has subscribed.
this.NavigateError?.Invoke(URL as string, (int)StatusCode);
}
}
Then in your window class:
// Init the UI
this.InitializeComponent();
this.WebBrowserHelper = new WebBrowserHelper(this.MyBrowserPane);
// Handle nav error
this.WebBrowserHelper.NavigateError += this.OnNavigateError;
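The OnNavigateError handler registered in the window class is not shown. The status code passed to NavigateError can be either an HTTP status or an HRESULT (for example INET_E_RESOURCE_NOT_FOUND, 0x800C0005), so a handler might format it along these lines. This is only a sketch; NavigateErrorText and Describe are hypothetical names:

```csharp
using System;

public static class NavigateErrorText
{
    // Builds a short message for a navigation error reported by the helper's NavigateError event.
    public static string Describe(string url, int statusCode)
    {
        // Values in the HTTP range are treated as HTTP status codes.
        if (statusCode >= 100 && statusCode <= 599)
        {
            return "HTTP " + statusCode + " for " + url;
        }

        // Anything else is shown as a hexadecimal HRESULT.
        return "Navigation error 0x" + statusCode.ToString("X8") + " for " + url;
    }
}
```

A window-level handler with the signature void OnNavigateError(string url, int statusCode) could then pass its arguments to Describe and show or log the result.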

Strange behavior with WPF MediaElement

I am currently using MediaElement to play a variety of different files, and I seem to have most of it working.
One thing I noticed is that Audio files (in this case mp3's specifically) refuse to play on the first attempt. Sometimes you can hear a millisecond (very unattractive) worth of sound. More like a blip and then nothing. Any subsequent attempt to load music works just fine, odd. Videos will play on the first attempt, and so will streamed media. This seems to only apply to local audio files.
The code that starts audio and video files is pretty much identical:
private void lvVideos_MouseDoubleClick(object sender, MouseButtonEventArgs e)
{
var depObj = e.OriginalSource as DependencyObject;
if (depObj != null)
{
var parent = depObj.FindVisualAncestor<ListViewItem>();
if (parent != null && lvVideos.SelectedItem != null)
{
State = PlayState.Closed;
Video video = lvVideos.SelectedItem as Video;
if (video == null) return;
lblTrackName.Text = video.Title;
MediaPlayer.Source = null;
MediaPlayer.Source = new Uri(video.Location);
CurrentMedia = MediaType.Video;
State = PlayState.Playing;
}
}
}
private void lvMusic_MouseDoubleClick(object sender, MouseButtonEventArgs e)
{
var depObj = e.OriginalSource as DependencyObject;
if (depObj != null)
{
var parent = depObj.FindVisualAncestor<ListViewItem>();
if (parent != null && lvMusic.SelectedItem != null)
{
State = PlayState.Closed;
Music song = lvMusic.SelectedItem as Music;
if (song == null) return;
lblTrackName.Text = song.Title;
MediaPlayer.Source = null;
MediaPlayer.Source = new Uri(song.Location);
CurrentMedia = MediaType.Music;
State = PlayState.Playing;
}
}
}
As you can see, I attempted to null the Source property prior to loading the audio, to no avail. I have managed to come up with a dirty hack of a workaround, which involves setting the source to a file that is guaranteed to fail (the app's .exe) and playing it as the app initializes. This allows the first music file loaded to play properly.
Has anybody else come across this before? Are there any fixes?
EDIT: I feel stupid. Apparently the culprit was mediaElement.ScrubbingEnabled = true; which (according to the documentation) is a seemingly useful option. Perhaps it should only be enabled for remote streams?
Apparently the culprit was mediaElement.ScrubbingEnabled = true; which (according to the documentation) is a seemingly useful option. Perhaps it should only be enabled for remote streams?
