How can I get the value of an HTML element with CefSharp?
I know how to do this with the default WebBrowser control:
Dim Elem As HtmlElement = WebBrowser1.Document.GetElementByID("id")
But I didn't find anything similar for CefSharp. The main reason I am using CefSharp is that part of the website uses iframes to store the source, and the default WebBrowser doesn't support that. Also, does CefSharp have an option for InvokeMember or a similar call?
I'm using the latest release of CefSharp by the way.
There is a really good example of how to do this in their FAQ.
https://github.com/cefsharp/CefSharp/wiki/Frequently-asked-questions#2-how-do-you-call-a-javascript-method-that-return-a-result
Here is the code for the lazy. Pretty self explanatory and it worked well for me.
string script = string.Format("document.getElementById('startMonth').value;");
browser.EvaluateScriptAsync(script).ContinueWith(x =>
{
var response = x.Result;
if (response.Success && response.Result != null)
{
var startDate = response.Result;
//startDate is the value of a HTML element.
}
});
This is the only way that worked for me (version 57.0.0.0):
((CefSharp.Wpf.ChromiumWebBrowser)chromeBrowser).FrameLoadEnd += Browser_FrameLoadEnd;
....
async void Browser_FrameLoadEnd(object sender, CefSharp.FrameLoadEndEventArgs e)
{
Console.WriteLine("cef-"+e.Url);
if (e.Frame.IsMain)
{
string HTML = await e.Frame.GetSourceAsync();
Console.WriteLine(HTML);
}
}
This worked for me. You can modify it by yourself.
private async void TEST()
{
string script = "document.getElementsByClassName('glass')[0]['firstElementChild']['firstChild']['wholeText']";
JavascriptResponse response = await browser.EvaluateScriptAsync(script);
label1.Text = response.Result.ToString();
}
Maybe this can do your job.
private async void TEST()
{
string script = "document.getElementById('id').value"; // JavaScript is case-sensitive: document.getElementById
JavascriptResponse response = await browser.EvaluateScriptAsync(script);
string resultS = response.Result.ToString(); // whatever you need
}
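With CefSharp, you can get an element's value using JavaScript.
For example:
m_browser.ExecuteScriptAsync("document.getElementById('id1');");
Note that ExecuteScriptAsync does not return a result; use EvaluateScriptAsync when you need the value back.
You can learn JavaScript from W3Schools.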
And I think you should read this passage.
Have fun.
string script = @"document.getElementById('id_element').style;";
browser.EvaluateScriptAsync(script).ContinueWith(x=> {
var response = x.Result;
if (response.Success && response.Result != null)
{
System.Dynamic.ExpandoObject abc = (System.Dynamic.ExpandoObject)response.Result;
foreach (KeyValuePair<string,object> item in abc)
{
string key = item.Key.ToString();
string value = item.Value.ToString();
}
}
});
It's working for me.
Related
I'm after some help with some code I've written to extract data using HttpClient.
I am new to writing code, so I can't find my problem. Could someone please help me troubleshoot this?
I expect the data of the table I'm scraping to be written to the console.
Any help appreciated.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using HtmlAgilityPack;
namespace weatherCheck
{
class Program
{
private static void Main(string[] args)
{
GetHtmlAsync();
Console.ReadLine();
}
protected static async void GetHtmlAsync()
{
var url = "https://www.weatherzone.com.au/vic/melbourne/melbourne";
var httpClient = new HttpClient();
var html = await httpClient.GetStringAsync(url);
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(html);
//grab the rain chance, rain in mm and date
var MyTable = Enumerable.FirstOrDefault(htmlDocument.DocumentNode.Descendants("table")
.Where(table => table.Attributes.Contains("id"))
, table => table.Attributes["id"].Value == "forecast-table");
List<HtmlNode> rows = htmlDocument.DocumentNode.SelectNodes("//tr").ToList();
foreach (var row in rows)
{
try
{
if (MyTable != null)
{
Console.WriteLine(MyTable.GetAttributeValue("forecast-table", " "));
}
}
catch (Exception)
{
}
}
}
}
}
I used your code to look up the values, but it didn't produce anything for me either. When I look at htmlDocument.DocumentNode.OuterHtml to view the entire HTML it is scraping, I don't see anything in the document that reflects an attribute forecast-table.
Also, you are validating MyTable each time you loop through rows. You should validate row != null along with printing attribute from row.
var MyTable = Enumerable.FirstOrDefault(htmlDocument.DocumentNode.Descendants("table")
.Where(table => table.Attributes.Contains("id")), table => table.Attributes["id"].Value == "forecast-table");
List<HtmlNode> rows = htmlDocument.DocumentNode.SelectNodes("//tr").ToList();
foreach (var row in rows)
{
try
{
if (row != null) // Here, it should be row, not My Table along with MyTable in line below.
Console.WriteLine(row.GetAttributeValue("forecast-table", " "));
}
catch (Exception)
{
}
}
Problem is
You should also know that the HTML you view using Dev Tools in Chrome is not the same as the HTML that HtmlAgilityPack sees. Chrome renders the page after executing its scripts, whereas HtmlAgilityPack simply gives you the raw HTML of the page. This is why you are not able to get the value of forecast-table.
From the documentation: GetAttributeValue(name, def) returns def if the attribute is not found.
So in your case it prints the default you passed (" ", a single space), because the attribute is not found.
Remove async and await, and block on the task returned by httpClient.GetStringAsync(url) instead:
var html =httpClient.GetStringAsync(url).Result;
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(html);
And print,
Console.WriteLine(MyTable.GetAttributeValue("forecast-table","SOME_TEXT_HERE").ToString());
I have the following method for replacing emoticons in a string using C#:
public static string Emotify(string inputText)
{
var emoticonFolder = EmoticonFolder;
var emoticons = new Hashtable(100)
{
{":)", "facebook-smiley-face-for-comments.png"},
{":D", "big-smile-emoticon-for-facebook.png"},
{":(", "facebook-frown-emoticon.png"},
{":'(", "facebook-cry-emoticon-crying-symbol.png"},
{":P", "facebook-tongue-out-emoticon.png"},
{"O:)", "angel-emoticon.png"},
{"3:)", "devil-emoticon.png"},
{":/", "unsure-emoticon.png"},
{">:O", "angry-emoticon.png"},
{":O", "surprised-emoticon.png"},
{"-_-", "squinting-emoticon.png"},
{":*", "kiss-emoticon.png"},
{"^_^", "kiki-emoticon.png"},
{">:(", "grumpy-emoticon.png"},
{":v", "pacman-emoticon.png"},
{":3", "curly-lips-emoticon.png"},
{"o.O", "confused-emoticon-wtf-symbol-for-facebook.png"},
{";)", "wink-emoticon.png"},
{"8-)", "glasses-emoticon.png"},
{"8| B|", "sunglasses-emoticon.png"}
};
var sb = new StringBuilder(inputText.Length);
for (var i = 0; i < inputText.Length; i++)
{
var strEmote = string.Empty;
foreach (string emote in emoticons.Keys)
{
if (inputText.Length - i >= emote.Length && emote.Equals(inputText.Substring(i, emote.Length), StringComparison.InvariantCultureIgnoreCase))
{
strEmote = emote;
break;
}
}
if (strEmote.Length != 0)
{
sb.AppendFormat("<img src=\"{0}{1}\" alt=\"\" class=\"emoticon\" />", emoticonFolder, emoticons[strEmote]);
i += strEmote.Length - 1;
}
else
{
sb.Append(inputText[i]);
}
}
return sb.ToString();
}
It works great and 'seems' pretty fast, however I realised there is a slight problem with HTML.
This method breaks pages that contain a link because of the :/ emoticon: it breaks the http:// by sticking an image in the middle. I'm trying to figure out a way to adapt this method to take links into account and ignore them, but without sacrificing performance.
Any help or pointers greatly appreciated.
HTML Agility Pack and regex will be your friends here. You could have a decorator where your decorations build up the src. Can we have an example of the src that causes the issue? :)
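One way to keep links intact, sketched here under assumptions rather than as a tested fix: find URL-like spans with a regex first, copy them through unchanged, and run the existing emoticon replacement only on the text between them. The class name EmoticonUrlSketch, the method EmotifyOutsideUrls, and the simplified URL pattern are all hypothetical; the emotify parameter stands in for the Emotify method from the question.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class EmoticonUrlSketch
{
    // Assumption: a deliberately simple URL pattern (scheme followed by
    // non-whitespace characters). Real-world input may need a stricter one.
    private static readonly Regex UrlPattern =
        new Regex(@"\b(?:https?|ftp)://\S+", RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Applies 'emotify' to every stretch of text that is not part of a URL
    // and copies the URLs through untouched.
    public static string EmotifyOutsideUrls(string inputText, Func<string, string> emotify)
    {
        var sb = new StringBuilder(inputText.Length);
        var last = 0;
        foreach (Match url in UrlPattern.Matches(inputText))
        {
            // Text before the URL gets the normal emoticon treatment.
            sb.Append(emotify(inputText.Substring(last, url.Index - last)));
            // The URL itself is appended verbatim, so ":/" inside it survives.
            sb.Append(url.Value);
            last = url.Index + url.Length;
        }
        // Remaining text after the last URL.
        sb.Append(emotify(inputText.Substring(last)));
        return sb.ToString();
    }
}
```

It could be called as EmoticonUrlSketch.EmotifyOutsideUrls(text, Emotify). Because the pattern runs to the next whitespace character, an emoticon typed directly after a link with no space in between would be left unreplaced; the regex scan is a single extra pass over the input, so the cost should stay close to that of the original loop.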
Here is what I am trying to do: my employer wants to be able to do 301 redirects with regular expressions using aliases in Sitecore. The way I am trying to implement this is:
a single-line text field
with a checkbox to tell Sitecore that it will be a regular expression. I am a noob in .NET and Sitecore; how can I implement this? Here is an example: http://postimg.org/image/lwr524hkn/
The kind of redirect I want to handle looks like this (it could be product in place of solution):
exemple.com/en/solution/platform-features to
exemple.com/en/platform-features
I based the code on http://www.cmssource.co.uk/blog/2011/December/modifying-sitecore-alias-to-append-custom-query-strings-via-301-redirect, which handles query strings; I want to use a regular expression instead.
namespace helloworld.Website.SC.Common
{
public class AliasResolver : Sitecore.Pipelines.HttpRequest.AliasResolver
{
// Beginning of the Methods
public new void Process(HttpRequestArgs args)
{
Assert.ArgumentNotNull(args, "args");
if (!Settings.AliasesActive)
{
Tracer.Warning("Aliases in AliasResolver are not active.");
}
else
{
Sitecore.Data.Database database = Context.Database;
if (database == null)
{
Tracer.Warning("There is no context in the AliasResolver.");
}
else
{
try
{
Profiler.StartOperation("Resolve virgin alias pipeline.");
Item item = ItemManager.GetItem(FileUtil.MakePath("/sitecore/system/aliases", args.LocalPath, '/'), Language.Current, Sitecore.Data.Version.First, database, SecurityCheck.Disable);
if (item != null)
{
//Alias exists (now we have the alias item)
if (item.Fields["Regular Expressions"] != null)
{
if (!String.IsNullOrEmpty(item.Fields["Regular Expressions"].Value) && !args.Url.QueryString.Contains("aproc"))
{
var reg = new Regex(@"(?<Begin>([^/]*/){2})[^/]*/(?<End>.*)");
var match = reg.Match(@"exemple.com/en/solution/platform-features");
var result = match.Groups["Begin"].Value + match.Groups["End"].Value;
}
}
}
Profiler.EndOperation();
}
catch (Exception ex)
{
Log.Error("Had a problem in the VirginAliasResolver. Error: " + ex.Message, this);
}
}
}
}
///<summary>
/// Once a match is found and we have a Sitecore Item, we can send the 301 response.
///</summary>
private static void SendResponse(string redirectToUrl, HttpRequestArgs args)
{
args.Context.Response.Status = "301 Moved Permanently";
args.Context.Response.StatusCode = 301;
args.Context.Response.AddHeader("Location", redirectToUrl);
args.Context.Response.End();
}
}
}
PS: I know there are modules for this, but my employer wants it done this way, and I am reaching out for help since I have been trying to add this feature for a week.
So if I understand correctly, you do not want to select an Alias by path:
Item item = ItemManager.GetItem(FileUtil.MakePath("/sitecore/system/aliases", args.LocalPath, '/'), Language.Current, Sitecore.Data.Version.First, database, SecurityCheck.Disable);
But rather find an Alias by comparing a Regex field to the URL. I have not tested this, but it could be something like:
var originalUrl = HttpContext.Current.Request.Url;
var allAliases = Sitecore.Context.Database.SelectItems("/sitecore/system/aliases//*");
var foundAlias = allAliases.FirstOrDefault( alias =>
!string.IsNullOrEmpty(alias["Regular Expressions"]) &&
Regex.IsMatch(HttpContext.Current.Request.Url.ToString(), alias["Regular Expressions"]));
Then, if foundAlias != null, you can retrieve the url and redirect like you do in your private SendResponse function.
var linkField = (LinkField)foundAlias.Fields["linked item"];
var targetUrl = linkField.Url;
using (new SecurityDisabler())
{
if (string.IsNullOrEmpty(targetUrl) && linkField.TargetItem != null)
targetUrl = LinkManager.GetItemUrl(linkField.TargetItem);
}
SendResponse(targetUrl, args);
Again, I have not tested this so don't shoot me if it needs some corrections, but this should help you get on your way.
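If the alias is also supposed to compute the target URL from the pattern (as in the solution/platform-features example in the question), the rewrite step on its own could look roughly like the sketch below. It uses only System.Text.RegularExpressions; the class name RegexRedirectSketch, the method TryRewrite, and the idea of storing a replacement string next to the pattern are assumptions for illustration, not part of Sitecore.

```csharp
using System.Text.RegularExpressions;

public static class RegexRedirectSketch
{
    // Returns the rewritten URL, or null when the pattern does not match.
    // 'pattern' and 'replacement' are assumed to come from fields on the
    // alias item (hypothetical; the question only shows a pattern field).
    public static string TryRewrite(string url, string pattern, string replacement)
    {
        var regex = new Regex(pattern, RegexOptions.IgnoreCase);
        if (!regex.IsMatch(url))
        {
            return null;
        }
        // Replace only the first match; ${Name} in the replacement string
        // refers to a named group in the pattern.
        return regex.Replace(url, replacement, 1);
    }
}
```

With the pattern from the question anchored at both ends, @"^(?<Begin>([^/]*/){2})[^/]*/(?<End>.*)$", and the replacement "${Begin}${End}", the input exemple.com/en/solution/platform-features becomes exemple.com/en/platform-features. A non-null result could then be passed to SendResponse.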
Using MvcMailer, the problem is that our emails are being sent without our CSS as inline style attributes.
PreMailer.Net is a C# Library that can read in an HTML source string, and return a resultant HTML string with CSS in-lined.
How do we use them together? Using the scaffolding example in the MvcMailer step-by-step guide, we start out with this example method in our UserMailer Mailer class:
public virtual MvcMailMessage Welcome()
{
return Populate(x => {
x.ViewName = "Welcome";
x.To.Add("some-email@example.com");
x.Subject = "Welcome";
});
}
Simply install PreMailer.Net via NuGet.
Update the Mailer class:
public virtual MvcMailMessage Welcome()
{
var message = Populate(x => {
x.ViewName = "Welcome";
x.To.Add("some-email@example.com");
x.Subject = "Welcome";
});
message.Body = PreMailer.Net.PreMailer.MoveCssInline(message.Body).Html;
return message;
}
Done!
If you have a text body with HTML as an alternate view (which I recommend) you'll need to do the following:
var message = Populate(m =>
{
m.Subject = subject;
m.ViewName = viewName;
m.To.Add(model.CustomerEmail);
m.From = new System.Net.Mail.MailAddress(model.FromEmail);
});
// get the BODY so we can process it
var body = EmailBody(message.ViewName);
var processedBody = PreMailer.Net.PreMailer.MoveCssInline(body, true).Html;
// start again with alternate view
message.AlternateViews.Clear();
// add BODY as alternate view
var htmlView = AlternateView.CreateAlternateViewFromString(processedBody, new ContentType("text/html"));
message.AlternateViews.Add(htmlView);
// add linked resources to the HTML view
PopulateLinkedResources(htmlView, message.LinkedResources);
Note: Even if you think you don't care about the text version, it can help with spam filters.
I recommend reading the source for MailerBase to get a better idea of what's going on, because all these Populate methods get confusing.
Note: This may not run as-is, but you get the idea. I have code (not shown) that parses for any img tags and adds them as automatic attachments.
The important part is to clear the HTML alternate view. You must have a .text.cshtml file for the text view.
If you're using ActionMailer.Net(.Next), you can do this:
protected override void OnMailSending(MailSendingContext context)
{
if (context.Mail.IsBodyHtml)
{
var inlineResult = PreMailer.Net.PreMailer.MoveCssInline(context.Mail.Body);
context.Mail.Body = inlineResult.Html;
}
for (var i = 0; i < context.Mail.AlternateViews.Count; i++)
{
var alternateView = context.Mail.AlternateViews[i];
if (alternateView.ContentType.MediaType != AngleSharp.Network.MimeTypeNames.Html) continue;
using (alternateView) // make sure it is disposed
{
string content;
using (var reader = new StreamReader(alternateView.ContentStream))
{
content = reader.ReadToEnd();
}
var inlineResult = PreMailer.Net.PreMailer.MoveCssInline(content);
context.Mail.AlternateViews[i] = AlternateView.CreateAlternateViewFromString(inlineResult.Html, alternateView.ContentType);
}
}
base.OnMailSending(context);
}
If you don't like using AngleSharp.Network.MimeTypeNames, you can just use "text/html". AngleSharp comes as a dependency of ActionMailer.Net.
I use Html Agility Pack for parsing HTML and it is great,
but I have run into a problem.
This is my background code:
public static HtmlDocument GetXHtmlFromUri2(string uri)
{
HttpClient client = HttpClientFactory.Create(new CustomeHeaderHandler());
var htmlDoc = new HtmlDocument()
{
OptionCheckSyntax = true,
OptionFixNestedTags = true,
OptionAutoCloseOnEnd = true,
OptionReadEncoding = true,
OptionDefaultStreamEncoding = Encoding.UTF8,
};
htmlDoc.LoadHtml(client.GetStringAsync(uri).Result);
return htmlDoc;
}
I use Html Agility Pack in a Web API (MVC 4) project, and this is the GET method logic:
//GET api/values
public string GetHtmlFlights()
{
var result = ClientFlightTabale.GetXHtmlFromUri2("http://ikiafids.ir/departureFA.html");
HtmlNode node = result.DocumentNode.SelectSingleNode("//table[1]/tbody/tr[1]");
string temp = node.FirstChild.InnerHtml.Trim();
return temp;
}
But when I call this method (from a browser and from Fiddler), I get an exception with this message:
Object reference not set to an instance of an object. The exception points to this line:
string temp = node.FirstChild.InnerHtml.Trim();
Can anyone help me, please?
I think you are looking for something like this:
var result = ClientFlightTabale.GetXHtmlFromUri2("http://ikiafids.ir/departureFA.html");
var tableNode = result.DocumentNode.SelectSingleNode("//table[1]");
var titles = tableNode.Descendants("th")
.Select(th => th.InnerText)
.ToList();
var table = tableNode.Descendants("tr").Skip(1)
.Select(tr => tr.Descendants("td")
.Select(td => td.InnerText)
.ToList())
.ToList();
I think your selector is wrong. Try this instead?
result.DocumentNode.SelectSingleNode("//table/tr[1]")