HtmlAgilityPack XPath case ignoring

When I use
SelectSingleNode("//meta[@name='keywords']")
it doesn't work, but when I use the same case as in the original document it works fine:
SelectSingleNode("//meta[@name='Keywords']")
So the question is: how can I make the match ignore case?

If the case of the actual value is unknown, I think you have to use translate. I believe it's:
SelectSingleNode("//meta[translate(@name,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz')='keywords']")
It's a hack, but it's the only option in XPath 1.0 (other than doing the opposite and translating to upper case).
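To try the translate() approach in isolation, here is a minimal sketch that uses only System.Xml instead of HtmlAgilityPack (an assumption made so the snippet has no external dependency; the XPath string it builds is the same one you would pass to SelectSingleNode above). The class and method names are made up for illustration.

```csharp
using System;
using System.Xml;

public static class XPathCaseDemo
{
    private const string Upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private const string Lower = "abcdefghijklmnopqrstuvwxyz";

    // Builds an XPath 1.0 expression that lower-cases @name before comparing.
    // The value is embedded in single quotes, so it must not contain one.
    public static string MetaByName(string name)
    {
        return "//meta[translate(@name,'" + Upper + "','" + Lower + "')='"
            + name.ToLowerInvariant() + "']";
    }

    // Returns the content attribute of the first matching meta element,
    // or null when no element matches.
    public static string FindContent(string xml, string name)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var node = doc.SelectSingleNode(MetaByName(name)) as XmlElement;
        return node == null ? null : node.GetAttribute("content");
    }
}
```

Keep in mind that translate() only folds the 26 ASCII letters listed in its arguments, so accented or non-Latin characters are still compared case-sensitively.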

If you need a more comprehensive solution, you can write an extension function for the XPath processor which performs a case-insensitive comparison. It is quite a bit of code, but you only write it once.
After implementing the extension you can write your query as follows:
"//meta[@name[Extensions:CaseInsensitiveComparison('Keywords')]]"
where Extensions:CaseInsensitiveComparison is the extension function implemented in the sample below.
NOTE: this is not well tested; I just threw it together for this response, so error handling etc. is non-existent!
The following is the code for the custom XSLT context, which provides one or more extension functions:
using System;
using System.Xml.XPath;
using System.Xml.Xsl;
using System.Xml;
using HtmlAgilityPack;
public class XsltCustomContext : XsltContext
{
public const string NamespaceUri = "http://XsltCustomContext";
public XsltCustomContext()
{
}
public XsltCustomContext(NameTable nt)
: base(nt)
{
}
public override IXsltContextFunction ResolveFunction(string prefix, string name, XPathResultType[] ArgTypes)
{
// Check that the function prefix is for the correct namespace
if (this.LookupNamespace(prefix) == NamespaceUri)
{
// Lookup the function and return the appropriate IXsltContextFunction implementation
switch (name)
{
case "CaseInsensitiveComparison":
return CaseInsensitiveComparison.Instance;
}
}
return null;
}
public override IXsltContextVariable ResolveVariable(string prefix, string name)
{
return null;
}
public override int CompareDocument(string baseUri, string nextbaseUri)
{
return 0;
}
public override bool PreserveWhitespace(XPathNavigator node)
{
return false;
}
public override bool Whitespace
{
get { return true; }
}
// Class implementing the XSLT Function for Case Insensitive Comparison
class CaseInsensitiveComparison : IXsltContextFunction
{
private static XPathResultType[] _argTypes = new XPathResultType[] { XPathResultType.String };
private static CaseInsensitiveComparison _instance = new CaseInsensitiveComparison();
public static CaseInsensitiveComparison Instance
{
get { return _instance; }
}
#region IXsltContextFunction Members
public XPathResultType[] ArgTypes
{
get { return _argTypes; }
}
public int Maxargs
{
get { return 1; }
}
public int Minargs
{
get { return 1; }
}
public XPathResultType ReturnType
{
get { return XPathResultType.Boolean; }
}
public object Invoke(XsltContext xsltContext, object[] args, XPathNavigator navigator)
{
// Perform the function of comparing the current element to the string argument
// NOTE: You should add some error checking here.
string text = args[0] as string;
return string.Equals(navigator.Value, text, StringComparison.InvariantCultureIgnoreCase);
}
#endregion
}
}
You can then use the extension function in your XPath queries. Here is an example for our case:
class Program
{
static string html = "<html><meta name=\"keywords\" content=\"HTML, CSS, XML\" /></html>";
static void Main(string[] args)
{
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
XPathNavigator nav = doc.CreateNavigator();
// Create the custom context and add the namespace to the context
XsltCustomContext ctx = new XsltCustomContext(new NameTable());
ctx.AddNamespace("Extensions", XsltCustomContext.NamespaceUri);
// Build the XPath query using the new function
XPathExpression xpath =
XPathExpression.Compile("//meta[@name[Extensions:CaseInsensitiveComparison('Keywords')]]");
// Set the context for the XPath expression to the custom context containing the
// extensions
xpath.SetContext(ctx);
var element = nav.SelectSingleNode(xpath);
// Now we have the element
}
}

This is how I do it:
HtmlNodeCollection MetaDescription = document.DocumentNode.SelectNodes("//meta[@name='description' or @name='Description' or @name='DESCRIPTION']");
string metaDescription = MetaDescription != null ? HttpUtility.HtmlDecode(MetaDescription.FirstOrDefault().Attributes["content"].Value) : string.Empty;

Alternatively, use LINQ, which supports case-insensitive matching:
node = doc.DocumentNode.Descendants("meta")
.Where(meta => meta.Attributes["name"] != null)
.Where(meta => string.Equals(meta.Attributes["name"].Value, "keywords", StringComparison.OrdinalIgnoreCase))
.Single();
But you have to do an ugly null check for the attributes in order to prevent a NullReferenceException...
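The null check can be folded into the comparison, because string.Equals accepts null arguments. Here is a minimal sketch of that idea using LINQ to XML from the base class library rather than HtmlAgilityPack (an assumption, made to keep the snippet dependency-free; MetaLookup and FindContent are illustrative names). Casting a missing XAttribute to string yields null, which is what makes the separate Where clause unnecessary here.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class MetaLookup
{
    // Returns the content of the first <meta> whose name attribute matches,
    // ignoring case, or null when there is no such element.
    public static string FindContent(string xml, string name)
    {
        XElement meta = XDocument.Parse(xml)
            .Descendants("meta")
            .FirstOrDefault(m => string.Equals(
                (string)m.Attribute("name"),   // null when the attribute is absent
                name,
                StringComparison.OrdinalIgnoreCase));

        return meta == null ? null : (string)meta.Attribute("content");
    }
}
```

Using FirstOrDefault instead of Single also avoids an exception when the page has no matching element or more than one.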

Related

How to fix missing namespace prefix in XML

I have an XML file which I can't modify by myself. It contains the following root element:
<foo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" noNamespaceSchemaLocation="some.xsd">
As you can see the prefix xsi: is missing for noNamespaceSchemaLocation. This causes the XmlReader to not find the schema information while validating. If I add the prefix all is good. But as I said I can't modify the XML file (besides testing). I get them from an external source and my tool should automatically validate them.
Is there a possibility to make the XmlReader interpret noNamespaceSchemaLocation without the xsi: prefix? I don't want to add the prefix inside the XML in a preprocessing step or something like that as the sources should exactly remain as they are.

The XML is wrong and you need to fix it. Either get your supplier to improve the quality of what they send, or repair it on arrival.
I don't know why you want to retain the broken source (all quality standards say that's bad practice), but it's certainly possible to keep the broken original as well as the repaired version.
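If "repair it on arrival" is acceptable, the repair can happen purely in memory, so the file on disk stays exactly as delivered. The following is a minimal sketch under that assumption, using LINQ to XML; SchemaLocationFixer is a made-up name, and the sketch only moves the attribute into the xsi namespace, it does not perform the validation itself.

```csharp
using System.Xml.Linq;

public static class SchemaLocationFixer
{
    public static readonly XNamespace Xsi =
        "http://www.w3.org/2001/XMLSchema-instance";

    // Parses the XML and, if the root carries an unprefixed
    // noNamespaceSchemaLocation attribute, replaces it with the
    // xsi-qualified attribute of the same value.
    public static XDocument Repair(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        XAttribute broken = doc.Root.Attribute("noNamespaceSchemaLocation");
        if (broken != null)
        {
            doc.Root.SetAttributeValue(Xsi + "noNamespaceSchemaLocation", broken.Value);
            broken.Remove();
        }
        return doc;
    }
}
```

The repaired XDocument can then be handed to whatever validation code you already have, for example through doc.CreateReader(), while the original file is never rewritten.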

The internals of XmlReader are ugly and undocumented, so this solution is like playing with fire.
What I propose is an XmlTextReader subclass that "adds" the missing namespace. You can feed this FixingXmlTextReader directly to XDocument.Load(), or you can wrap it in another reader (XmlReader.Create and the XmlValidatingReader constructor both accept an XmlReader as a parameter).
public class FixingXmlTextReader : XmlTextReader
{
public override string NamespaceURI
{
get
{
if (NodeType == XmlNodeType.Attribute && base.LocalName == "noNamespaceSchemaLocation")
{
return NameTable.Add("http://www.w3.org/2001/XMLSchema-instance");
}
return base.NamespaceURI;
}
}
public override string Prefix
{
get
{
if (NodeType == XmlNodeType.Attribute && base.NamespaceURI == string.Empty && base.LocalName == "noNamespaceSchemaLocation")
{
return NameTable.Add("xsi");
}
return base.Prefix;
}
}
public override string Name
{
get
{
if (NodeType == XmlNodeType.Attribute && base.NamespaceURI == string.Empty && base.LocalName == "noNamespaceSchemaLocation")
{
return NameTable.Add(Prefix + ":" + LocalName);
}
return base.Name;
}
}
public override string GetAttribute(string localName, string namespaceURI)
{
if (localName == "noNamespaceSchemaLocation" && namespaceURI == "http://www.w3.org/2001/XMLSchema-instance")
{
namespaceURI = string.Empty;
}
return base.GetAttribute(localName, namespaceURI);
}
public override string GetAttribute(string name)
{
if (name == "xsi:noNamespaceSchemaLocation")
{
name = "noNamespaceSchemaLocation";
}
return base.GetAttribute(name);
}
// There are tons of constructors, take the ones you need
public FixingXmlTextReader(Stream stream) : base(stream)
{
}
public FixingXmlTextReader(TextReader input) : base(input)
{
}
public FixingXmlTextReader(string url) : base(url)
{
}
}
Like:
using (var reader = new FixingXmlTextReader("XMLFile1.xml"))
using (var reader2 = XmlReader.Create(reader, new XmlReaderSettings
{
}))
{
// Use the reader2!
}
or
using (var reader = new FixingXmlTextReader("XMLFile1.xml"))
{
var xdoc = new XmlDocument();
xdoc.Load(reader);
}

Remove Extraneous Semicolons in C# Using Roslyn - (replace w empty trivia)

I've figured out how to open a solution and then iterate through the Projects and then Documents. I'm stuck with how to look for C# Classes, Enums, Structs, and Interfaces that may have an extraneous semicolon at the end of the declaration (C++ style). I'd like to remove those and save the .cs files back to disk. There are approximately 25 solutions written at my current company that I would run this against. Note: The reason we are doing this is to move forward with a better set of coding standards. (And I'd like to learn how to use Roslyn to do these 'simple' adjustments)
Example (UPDATED):
class Program
{
static void Main(string[] args)
{
string solutionFile = @"S:\source\dotnet\SimpleApp\SimpleApp.sln";
IWorkspace workspace = Workspace.LoadSolution(solutionFile);
var proj = workspace.CurrentSolution.Projects.First();
var doc = proj.Documents.First();
var root = (CompilationUnitSyntax)doc.GetSyntaxRoot();
var classes = root.DescendantNodes().OfType<ClassDeclarationSyntax>();
foreach (var decl in classes)
{
ProcessClass(decl);
}
Console.ReadKey();
}
private static SyntaxNode ProcessClass(ClassDeclarationSyntax node)
{
ClassDeclarationSyntax newNode;
if (node.HasTrailingTrivia)
{
foreach (var t in node.GetTrailingTrivia())
{
var es = new SyntaxTrivia();
es.Kind = SyntaxKind.EmptyStatement;
// kind is readonly - what is the right way to create
// the right SyntaxTrivia?
if (t.Kind == SyntaxKind.EndOfLineTrivia)
{
node.ReplaceTrivia(t, es);
}
}
return // unsure how to do transform and return it
}
}
Example Code I Want to Transform
using System;
public class Person
{
public string FirstName { get; set; }
public string LastName { get; set; }
};
// note: the semicolon at the end of the Person class

Here is a little program that removes the optional semicolon after all class, struct, interface and enum declarations within a solution. The program loops through the documents in the solution and uses a SyntaxRewriter to rewrite the syntax tree. If any changes were made, the original code files are overwritten with the new source.
using System;
using System.IO;
using System.Linq;
using Roslyn.Compilers.CSharp;
using Roslyn.Services;
namespace TrailingSemicolon
{
class Program
{
static void Main(string[] args)
{
string solutionfile = @"c:\temp\mysolution.sln";
var workspace = Workspace.LoadSolution(solutionfile);
var solution = workspace.CurrentSolution;
var rewriter = new TrailingSemicolonRewriter();
foreach (var project in solution.Projects)
{
foreach (var document in project.Documents)
{
SyntaxTree tree = (SyntaxTree)document.GetSyntaxTree();
var newSource = rewriter.Visit(tree.GetRoot());
if (newSource != tree.GetRoot())
{
File.WriteAllText(tree.FilePath, newSource.GetText().ToString());
}
}
}
}
class TrailingSemicolonRewriter : SyntaxRewriter
{
public override SyntaxNode VisitClassDeclaration(ClassDeclarationSyntax node)
{
return RemoveSemicolon(node, node.SemicolonToken, t => node.WithSemicolonToken(t));
}
public override SyntaxNode VisitInterfaceDeclaration(InterfaceDeclarationSyntax node)
{
return RemoveSemicolon(node, node.SemicolonToken, t => node.WithSemicolonToken(t));
}
public override SyntaxNode VisitStructDeclaration(StructDeclarationSyntax node)
{
return RemoveSemicolon(node, node.SemicolonToken, t => node.WithSemicolonToken(t));
}
public override SyntaxNode VisitEnumDeclaration(EnumDeclarationSyntax node)
{
return RemoveSemicolon(node, node.SemicolonToken, t => node.WithSemicolonToken(t));
}
private SyntaxNode RemoveSemicolon(SyntaxNode node,
SyntaxToken semicolonToken,
Func<SyntaxToken, SyntaxNode> withSemicolonToken)
{
if (semicolonToken.Kind != SyntaxKind.None)
{
var leadingTrivia = semicolonToken.LeadingTrivia;
var trailingTrivia = semicolonToken.TrailingTrivia;
SyntaxToken newToken = Syntax.Token(
leadingTrivia,
SyntaxKind.None,
trailingTrivia);
bool addNewline = semicolonToken.HasTrailingTrivia
&& trailingTrivia.Count() == 1
&& trailingTrivia.First().Kind == SyntaxKind.EndOfLineTrivia;
var newNode = withSemicolonToken(newToken);
if (addNewline)
return newNode.WithTrailingTrivia(Syntax.Whitespace(Environment.NewLine));
else
return newNode;
}
return node;
}
}
}
}
Hopefully it is something along the lines of what you were looking for.

The semicolon has to be stored in the ClassDeclaration node because, according to the C# specification, it is an optional token at the end of that production (the "opt" suffix marks an optional element):
class-declaration:
    attributes_opt class-modifiers_opt partial_opt class identifier type-parameter-list_opt
    class-base_opt type-parameter-constraints-clauses_opt class-body ;_opt
UPDATE
According to Roslyn's documentation, you cannot actually change syntax trees, because they are immutable structures. That's probably why Kind is read-only. You can, however, create a new tree using the With* methods defined for each changeable property, together with ReplaceNode. There is a good example in the Roslyn documentation:
var root = (CompilationUnitSyntax)tree.GetRoot();
var oldUsing = root.Usings[1];
var newUsing = oldUsing.WithName(name); //changes the name property of a Using statement
root = root.ReplaceNode(oldUsing, newUsing);
For converting your new tree into code again (aka pretty printing), you could use the GetText() method from the compilation unit node (in our example, the root variable).
You can also extend a SyntaxRewriter class for performing code transformations. There is an extensive example for doing so in the official Roslyn website; take a look at this particular walkthrough. The following commands write the transformed tree back to the original file:
SyntaxNode newSource = rewriter.Visit(sourceTree.GetRoot());
if (newSource != sourceTree.GetRoot())
{
File.WriteAllText(sourceTree.FilePath, newSource.GetFullText());
}
where rewriter is an instance of a SyntaxRewriter.

Equivalent code of CreateObject in C#

I have some code in VB6. Can anyone tell me how to write it in C#? The code is below:
Set Amibroker = CreateObject("Broker.Application")
Set STOCK = Amibroker.Stocks.Add(ticker)
Set quote = STOCK.Quotations.Add(stInDate)
quote.Open = stInOpen
quote.High = stInHigh
quote.Low = stInlow
quote.Close = stInYcp
quote.Volume = stInVolume
Set STOCK = Nothing
Set quote = Nothing
What is the equivalent of CreateObject in C#? I tried to add a reference to the COM object, but I can't find any COM object named Broker.Application or amibroker.

If you are using .net 4 or later, and therefore can make use of dynamic, you can do this quite simply. Here's an example that uses the Excel automation interface.
Type ExcelType = Type.GetTypeFromProgID("Excel.Application");
dynamic ExcelInst = Activator.CreateInstance(ExcelType);
ExcelInst.Visible = true;
If you can't use dynamic then it's much more messy.
Type ExcelType = Type.GetTypeFromProgID("Excel.Application");
object ExcelInst = Activator.CreateInstance(ExcelType);
ExcelType.InvokeMember("Visible", BindingFlags.SetProperty, null,
ExcelInst, new object[1] {true});
Trying to do very much of that will sap the lifeblood from you.
COM is so much easier if you can use early bound dispatch rather than late bound as shown above. Are you sure you can't find the right reference for the COM object?

If you use .NET Framework 4.0 and above, you can use this pattern:
public sealed class Application: MarshalByRefObject {
private readonly dynamic _application;
// Methods
public Application() {
const string progId = "Broker.Application";
_application = Activator.CreateInstance(Type.GetTypeFromProgID(progId));
}
public Application(dynamic application) {
_application = application;
}
public int Import(ImportType type, string path) {
return _application.Import((short) type, path);
}
public int Import(ImportType type, string path, string defFileName) {
return _application.Import((short) type, path, defFileName);
}
public bool LoadDatabase(string path) {
return _application.LoadDatabase(path);
}
public bool LoadLayout(string path) {
return _application.LoadLayout(path);
}
public int Log(ImportLog action) {
return _application.Log((short) action);
}
public void Quit() {
_application.Quit();
}
public void RefreshAll() {
_application.RefreshAll();
}
public void SaveDatabase() {
_application.SaveDatabase();
}
public bool SaveLayout(string path) {
return _application.SaveLayout(path);
}
// Properties
public Document ActiveDocument {
get {
var document = _application.ActiveDocument;
return document != null ? new Document(document) : null;
}
}
public Window ActiveWindow {
get {
var window = _application.ActiveWindow;
return window != null ? new Window(window) : null;
}
}
public AnalysisDocs AnalysisDocs {
get {
var analysisDocs = _application.AnalysisDocs;
return analysisDocs != null ? new AnalysisDocs(analysisDocs) : null;
}
}
public Commentary Commentary {
get {
var commentary = _application.Commentary;
return commentary != null ? new Commentary(commentary) : null;
}
}
public Documents Documents {
get {
var documents = _application.Documents;
return documents != null ? new Documents(documents) : null;
}
}
public string DatabasePath {
get { return _application.DatabasePath; }
}
public bool Visible {
get { return _application.Visible != 0; }
set { _application.Visible = value ? 1 : 0; }
}
public string Version {
get { return _application.Version; }
}
}
Next you must wrap all the other AmiBroker OLE Automation classes in the same way. For example, here is a wrapper for the Commentary class:
public sealed class Commentary : MarshalByRefObject {
// Fields
private readonly dynamic _commentary;
// Methods
internal Commentary(dynamic commentary) {
_commentary = commentary;
}
public void Apply() {
_commentary.Apply();
}
public void Close() {
_commentary.Close();
}
public bool LoadFormula(string path) {
return _commentary.LoadFormula(path);
}
public bool Save(string path) {
return _commentary.Save(path);
}
public bool SaveFormula(string path) {
return _commentary.SaveFormula(path);
}
}

Here's a snippet from the C# code I used to automate AmiBroker (from when I went down that path). You'll need to reference System.Runtime.InteropServices.
System.Type objType = System.Type.GetTypeFromProgID("Broker.Application");
dynamic comObject = System.Activator.CreateInstance(objType);
comObject.Import(0, fileName, "default.format");
comObject.RefreshAll();
Typing a dot won't bring up the comObject internal methods, though.
All I can say about this method is - it works, like a charm, but stay away from it, like David said. I got my inspiration for this method from:
http://www.codeproject.com/Articles/148959/How-the-new-C-dynamic-type-can-simplify-access-to
For another angle of attack, you may want to check out (I think this is early binding):
http://adamprescott.net/2012/04/05/net-vb6-interop-tutorial/
Hope at least some of this helps you. I've used both these methods with AmiBroker and C#, but I ended up leaving them behind. COM and AmiBroker don't mix well. Even TJ says so.
Good luck anyway.

ami2py will read AmiBroker data into Python. The current version is 0.8.1. WARNING: it only provides day resolution on data.
The following few lines of code will read a symbol from AmiBroker into a pandas DataFrame:
import pandas
import ami2py
folder='C:/Program Files/AmiBroker/Databases/StockCharts'
symbol='indu'
df = pandas.DataFrame()
symbolData = ami2py.AmiDataBase(folder).get_dict_for_symbol(symbol)
for z in ['Year', 'Month', 'Day', 'Open', 'High', 'Low', 'Close', 'Volume']:
    df[symbol+':'+z] = symbolData[z]
print(df.describe())

Parse/Extract information from text template

I need to extract information from incoming (e.g. XML) data based on a given template.
The template may be XML or plain text (comma separated). For each type of message there exists a template, e.g.
<SomeMessage>
<Id>$id</Id>
<Position>
<X>$posX</X>
<Y>$posY</Y>
<Z>$posZ</Z>
</Position>
</SomeMessage>
The incoming data for example is:
<SomeMessage>
<Id>1</Id>
<Position>
<X>0.5f</X>
<Y>1.0f</Y>
<Z>0.0f</Z>
</Position>
</SomeMessage>
Now I need to extract information about $id, $posX, etc.
Parser p = new Parser(templateString);
int id = p.Extract("id", incomingString);
float posx = p.Extract("posX", incomingString);
I need something like the difference between the incoming data and the template, and then to extract the information at the appropriate position. Because there are several templates which contain different information and may be extended in the future, I am looking for a general approach.
The template in this case may also be
$id,$posX,$posY,$posZ
and the incoming data would be then
1,0.5f,1.0f,0.0f
The latter case may be easier to parse, but I need a solution which is able to handle both (XML template as well as non-XML).

You could create a parsing class having a property for each field:
class Parser
{
public string PositionX { get; set; }
public string PositionY { get; set; }
public string PositionZ { get; set; }
public Parser(XmlNode item)
{
this.PositionX = GetNodeValue(item, "Position/X");
this.PositionY = GetNodeValue(item, "Position/Y");
this.PositionZ = GetNodeValue(item, "Position/Z");
}
}
I can supply a routine that generates such parsing classes from sample XML if you're interested, as long as arrays are not involved. GetNodeValue is a method that runs an XPath query and returns the value for that path (basically XmlNode.SelectSingleNode with some extra parsing added to it).

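It would probably be a good idea to use an interface with a separate parser implementation for each format. The XML parser below is only a sketch, but it gives you the idea. With the static XElement.Parse you can parse well-formed XML strings for easier usage.
using System;
using System.Globalization;
using System.Xml.Linq;

public interface IParser
{
    Message Parse(String Payload);
}
// Position Class (float, because the sample data contains values such as 0.5f)
public class Position
{
    public float X { get; private set; }
    public float Y { get; private set; }
    public float Z { get; private set; }
    public Position(float X, float Y, float Z)
    {
        this.X = X;
        this.Y = Y;
        this.Z = Z;
    }
}
// Message Class
public class Message
{
    public String ID { get; private set; }
    public Position Position { get; private set; }
    public Message(String ID, Position Position)
    {
        this.ID = ID;
        this.Position = Position;
    }
}
// Parser Class
public class XMLParser : IParser
{
    public Message Parse(string Payload)
    {
        var result = XElement.Parse(Payload);
        var position = result.Element("Position");
        return new Message(
            result.Element("Id").Value,
            new Position(
                ParseFloat(position.Element("X").Value),
                ParseFloat(position.Element("Y").Value),
                ParseFloat(position.Element("Z").Value)));
    }
    // The sample data carries a trailing 'f' (e.g. "0.5f"), which float.Parse rejects
    private static float ParseFloat(string text)
    {
        return float.Parse(text.TrimEnd('f'), CultureInfo.InvariantCulture);
    }
}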

For each template create a parser definition file of the format:
Parser Type (XML or CSV)
Variable1, path
variable2, path
etc
For XML, the path might be someMessage,Position,x.
For CSV, you might forget the path and just list the variables in order.
Then, when you read in your template file, you pick up the parser type and the path to each of the variables. If you have a lot of hierarchical information you'll have to apply a bit of imagination to this, but it should be fine for the simple case you've given.
For anything beyond CSV you will probably have to use a parser, but the basics of XML/XPath are pretty simple to find.
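For the comma-separated case, "list the variables in order" can be sketched in a few lines: split the template and the data on commas and pair them up by position. This is only an illustration under the assumption that values never contain commas or quotes; CsvTemplate and Extract are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;

public static class CsvTemplate
{
    // Pairs each "$name" placeholder in the template with the value found at
    // the same position in the data line. Fields without a leading '$' are
    // treated as literals and skipped.
    public static Dictionary<string, string> Extract(string template, string data)
    {
        string[] names = template.Split(',');
        string[] values = data.Split(',');
        if (names.Length != values.Length)
            throw new FormatException("Template and data have different field counts.");

        var result = new Dictionary<string, string>();
        for (int i = 0; i < names.Length; i++)
        {
            string name = names[i].Trim();
            if (name.StartsWith("$", StringComparison.Ordinal))
                result[name.Substring(1)] = values[i].Trim();
        }
        return result;
    }
}
```

With the template $id,$posX,$posY,$posZ and the data line 1,0.5f,1.0f,0.0f from the question, the returned dictionary maps "posX" to the string "0.5f"; converting that string to a number is left to the caller.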

using System;
using System.IO;
using System.Xml;
class TemplateParse {
XmlDocument xdoc;
string GetPath(XmlNode node, string val, string path){
if(node.HasChildNodes){
if(node.ChildNodes.Count == 1 && node.FirstChild.NodeType == XmlNodeType.Text)
return (node.FirstChild.Value == val) ? path + "/" + node.Name : String.Empty;
foreach(XmlNode cnode in node.ChildNodes){
if(cnode.NodeType != XmlNodeType.Element) continue;
string result = GetPath(cnode, val, path + "/" + node.Name);
if(result != String.Empty) return result;
}
}
return "";
}
public TemplateParse(string templateXml){
xdoc = new XmlDocument();
xdoc.LoadXml(templateXml);
}
public string Extract(string valName, string data){
string xpath = GetPath((XmlNode)xdoc.DocumentElement, "$" + valName, "/");
var doc = new XmlDocument();
doc.LoadXml(data);
return doc.SelectSingleNode(xpath).InnerText;
// var value = doc.SelectSingleNode(xpath).InnerText;
// var retType = typeof(T);
// return (T)retType.InvokeMember("Parse", System.Reflection.BindingFlags.InvokeMethod, null, null, new []{value});
}
}
class Sample {
static void Main(){
string templateString = File.ReadAllText(@".\template.xml");
string incomingString = File.ReadAllText(@".\data.xml");
var p = new TemplateParse(templateString);
string[] names = new [] { "id", "posX", "posY", "posZ" };
foreach(var name in names){
var value = p.Extract(name, incomingString);
Console.WriteLine("{0}:{1}", name, value);
}
}
}
OUTPUT
id:1
posX:0.5f
posY:1.0f
posZ:0.0f

Getting a parameterless method to act like a Func<ReturnT>

I'm trying to make a part of my code more fluent.
I have a string extension that makes an HTTP request out of the string and returns the response as a string. So I can do something like...
string _html = "http://www.stackoverflow.com".Request();
I'm trying to write an extension that will keep trying the request until it succeeds. My signature looks something like...
public static T KeepTrying<T>(this Func<T> KeepTryingThis) {
// Code to ignore exceptions and keep trying goes here
// Returns the result of KeepTryingThis if it succeeds
}
I intend to call it something like...
string _html = "http://www.stackoverflow.com".Request.KeepTrying();
Alas, that doesn't seem to work =). I tried making it into a lambda first but that doesn't seem to work either.
string _html = (() => "http://www.stackoverflow.com".Request()).KeepTrying();
Is there a way to do what I'm trying to do while keeping the syntax fairly fluent?
Suggestions much appreciated.
Thanks.

You can't call an extension method on a method group or on a lambda expression. I blogged about this a while ago.
I suspect you could cast to Func<string>:
string _html = ((Func<string>)"http://www.stackoverflow.com".Request)
.KeepTrying();
but that's pretty nasty.
One alternative would be to change Request() to return a Func, and use:
string _html = "http://www.stackoverflow.com".Request().KeepTrying();
Or if you wanted to keep the Request method itself simple, just add a RequestFunc method:
public static Func<string> RequestFunc(this string url)
{
return () => url.Request();
}
and then call:
string _html = "http://www.stackoverflow.com".RequestFunc().KeepTrying();

Why not turn this on its head?
static T KeepTrying<T>(Func<T> func) {
T val = default(T);
while (true) {
try {
val = func();
break;
} catch { }
}
return val;
}
var html = KeepTrying(() => "http://www.stackoverflow.com".Request());
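The loop above retries forever and swallows every exception, which can hang the caller if the request can never succeed. A bounded variant is sketched below; the maxAttempts and delay parameters and the Retry class name are assumptions added for illustration, not part of the original answer.

```csharp
using System;
using System.Threading;

public static class Retry
{
    // Calls func until it succeeds or maxAttempts is reached, pausing between
    // attempts. If the final attempt also fails, its exception propagates.
    public static T KeepTrying<T>(Func<T> func, int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException("maxAttempts");

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return func();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Only swallow the failure while attempts remain.
                Thread.Sleep(delay);
            }
        }
    }
}
```

It would be called in the same style as the original, for example Retry.KeepTrying(() => url.Request(), 5, TimeSpan.FromSeconds(1)), where Request is the string extension from the question. The exception filter (when) requires C# 6 or later.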

What about enhancing the Request?
string _html = "http://www.stackoverflow.com".Request(RequestOptions.KeepTrying);
string _html = "http://www.stackoverflow.com".Request(RequestOptions.Once);
RequestOptions is an enum. You could also have more options, timeout arguments, number of retries etc.
OR
public static string RepeatingRequest(this string url) {
string response = null;
while ( response == null /* or however you detect failure */ ) {
response = url.Request();
}
return response;
}
string _html = "http://www.stackoverflow.com".RepeatingRequest();

AFAIK you can write an extension method that extends a Func<T> delegate, but the compiler doesn't know what you mean here:
string _html = "http://www.stackoverflow.com".Request.KeepTrying(); // won't work
But if you explicitly cast to the delegate type, it will work:
string _html = ((Func<string>)"http://www.stackoverflow.com".Request).KeepTrying(); // works
The question here is whether code readability is really improved by an extension method in this case.

I wouldn't write an extension method for string. Use a more specific type, like the Uri.
The full code:
public static class Extensions
{
public static UriRequest Request(this Uri uri)
{
return new UriRequest(uri);
}
public static UriRequest KeepTrying(this UriRequest uriRequest)
{
uriRequest.KeepTrying = true;
return uriRequest;
}
}
public class UriRequest
{
public Uri Uri { get; set; }
public bool KeepTrying { get; set; }
public UriRequest(Uri uri)
{
this.Uri = uri;
}
public string ToHtml()
{
var client = new System.Net.WebClient();
do
{
try
{
using (var reader = new StreamReader(client.OpenRead(this.Uri)))
{
return reader.ReadToEnd();
}
}
catch (WebException ex)
{
// log ex
}
}
while (KeepTrying);
return null;
}
public static implicit operator string(UriRequest uriRequest)
{
return uriRequest.ToHtml();
}
}
Calling it:
string html = new Uri("http://www.stackoverflow.com").Request().KeepTrying();
