Essential Objects WebView how to navigate through the HTML-tree? - c#

I am using the Essential Objects library to read websites.
I've done that before with the Windows Forms WebBrowser control, but this time the website does not work with it, so I had to switch to the EO WebView.
The documentation is so poor that I can't find an answer.
In the Windows Forms WebBrowser you have an HtmlElementCollection, which is in principle a list of HtmlElement objects.
On these elements you can read attributes, call InvokeMember("Click"), and navigate through child and parent elements.
What is the equivalent of HtmlElementCollection / HtmlElement in EO WebView?
How can I navigate through the HTML tree?
BTW: I am using it together with C#.

See the documentation: here, here, here.
Essentially, you have to rely on the ability to execute JavaScript.
You can access the document JavaScript object in a couple of ways:
JSObject document = (JSObject)_webView.EvalScript("document");
//or: Document document = _webView.GetDOMWindow().document;
GetDOMWindow().document returns an EO.WebBrowser.DOM.Document instance; that type derives from JSObject and offers some extra properties (e.g., there's a body property that gets you the BODY element of type EO.WebBrowser.DOM.Element).
But overall, the API these offer is not much richer.
You can use JSObject like this:
// access a property on the JavaScript object:
jsObj["children"]
// access an element of an array-like JavaScript object:
var children = (JSObject)jsObj["children"];
var first = (JSObject)children[0];
// (note that you have to cast; all these have the `object` return type)
// access an attribute on the associated DOM element
jsObj.InvokeFunction("getAttribute", "class")
// etc.
It's all a bit fiddly, but you can write some extension methods to make your life easier (however, see the note on performance below):
public static class JSObjectExtensions
{
public static string GetTagName(this JSObject jsObj)
{
return (jsObj["tagName"] as string ?? string.Empty).ToUpper();
}
public static string GetID(this JSObject jsObj)
{
return jsObj["id"] as string ?? string.Empty;
}
public static string GetAttribute(this JSObject jsObj, string attribute)
{
return jsObj.InvokeFunction("getAttribute", attribute) as string ?? string.Empty;
}
public static JSObject GetParent(this JSObject jsObj)
{
return jsObj["parentElement"] as JSObject;
}
public static IEnumerable<JSObject> GetChildren(this JSObject jsObj)
{
var childrenCollection = (JSObject)jsObj["children"];
int childObjectCount = (int)childrenCollection["length"];
for (int i = 0; i < childObjectCount; i++)
{
yield return (JSObject)childrenCollection[i];
}
}
// Add a few more if necessary
}
Then you can do something like this:
private void TraverseElementTree(JSObject root, Action<JSObject> action)
{
action(root);
foreach(var child in root.GetChildren())
TraverseElementTree(child, action);
}
Here's an example of how you could use this method:
TraverseElementTree(document, (currentElement) =>
{
string tagName = currentElement.GetTagName();
string id = currentElement.GetID();
if (tagName == "TD" && id.StartsWith("codetab"))
{
string elementClass = currentElement.GetAttribute("class");
// do something...
}
});
But, again, it's a bit fiddly - while this seems to work reasonably well, you'll need to experiment a bit to find any tricky parts that can result in errors, and figure out how to modify the approach to make it more stable.
Note on performance
Another alternative is to use JavaScript for most of the element processing, and just return the values you need to be used in your C# code. Depending on how complex the logic is, this is likely going to be more efficient in certain scenarios, as it would result in a single browser engine round trip, so it's something to consider if performance becomes an issue. (See the Performance section here.)
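As a rough sketch of that idea, the TD/"codetab" filtering from the traversal example can be pushed into a single script so that only one string crosses the browser boundary. This assumes that EvalScript (used earlier) hands a JavaScript string back as a .NET string, which is worth verifying against the EO documentation. The evaluator is passed in as a delegate so the helper does not depend on the EO types; CodeTabQuery and GetCodeTabClasses are hypothetical names.

```csharp
using System;

public static class CodeTabQuery
{
    // JavaScript that runs entirely inside the page: select the TD elements whose
    // id starts with "codetab" and join their class attributes, one per line.
    public const string Script =
        "Array.prototype.map.call(" +
        "document.querySelectorAll('td[id^=\"codetab\"]')," +
        "function (e) { return e.getAttribute('class') || ''; }).join('\\n')";

    // evalScript is expected to wrap a call such as s => webView.EvalScript(s) (assumption).
    // A null or empty result from the script yields an empty array.
    public static string[] GetCodeTabClasses(Func<string, object> evalScript)
    {
        var joined = evalScript(Script) as string;
        return string.IsNullOrEmpty(joined)
            ? new string[0]
            : joined.Split('\n');
    }
}
```

With a real WebView this would be called as CodeTabQuery.GetCodeTabClasses(s => _webView.EvalScript(s)), giving one round trip instead of one per element and property.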

Since this is one of the rare questions handling the element/child issue of the EO.WebBrowser and since I'm more into VB.NET, I was struggling to get the above working. It took me quite some hours to figure it out, so for everybody who is also more into VB.NET, here are some working VB equivalents.
Accessing your document and retrieving an element with children:
win = WebView1.GetDOMWindow()
doc = win.document
Dim element As EO.WebBrowser.DOM.Element = doc.getElementById("ddlSamples")
Access an element of an array-like object:
Dim children = CType(element("children"), EO.WebBrowser.JSObject)
Dim childrenCount As Integer = CInt(children("length"))
Get e.g. the selected OPTION-child within the element and get the innerText of the child:
Dim child As EO.WebBrowser.JSObject
Dim txt As String = Nothing
For c = 0 To childrenCount - 1
child = CType(children(c), EO.WebBrowser.JSObject)
If GetAttribute(child, "selected") = "selected" Then
txt = CStr(child("text"))
Exit For
End If
Next
And of course, declaring the 'GetAttribute' function:
Public Shared Function GetAttribute(ByVal jsObj As EO.WebBrowser.JSObject, ByVal attribute As String) As String
Return If(TryCast(jsObj.InvokeFunction("getAttribute", attribute), String), String.Empty)
End Function
Hopefully people will benefit from it.

Related

How to extend Selenium to find a button with a specific text in C#, and making it work with implicit wait?

I'm trying to create an extension method whose task is to find a button with a specific text. This is what I currently have:
public static IWebElement FindButtonByPartialText(this ISearchContext searchContext, string partialText)
{
partialText = partialText.ToLowerInvariant();
var elements = searchContext.FindElements(By.CssSelector("button, input[type='button']"));
foreach (var e in elements)
{
if (e.TagName == "INPUT")
{
if (e.GetAttribute("value")?.ToLowerInvariant().Contains(partialText) == true)
return e;
}
else if (e.Text.ToLowerInvariant().Contains(partialText))
return e;
}
throw new Exception("foo");
}
(I have set the implicit wait option to 30 seconds)
So, if the page hasn't loaded properly yet, but there is at least one button on the page (with the incorrect text), this will fail miserably I expect, because it doesn't have the proper implicit wait behavior.
What is the correct way to create this extension method so that it waits the proper amount of time before throwing?
You cannot use a CSS selector. You will need to use an xpath expression instead. The | character allows you to search for multiple different kinds of elements. In this case, you can search for buttons or inputs:
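// XPath 1.0 (what browsers implement) has no lower-case(); translate() maps A-Z to a-z instead.
private const string Upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
private const string Lower = "abcdefghijklmnopqrstuvwxyz";
// The leading . keeps the search inside searchContext when it is an element rather than the driver.
private const string FindButtonByPartialTextXPath = ".//button[contains(translate(., '{1}', '{2}'), '{0}')]|.//input[@type = 'button' and contains(translate(@value, '{1}', '{2}'), '{0}')]";
public static IWebElement FindButtonByPartialText(this ISearchContext searchContext, string partialText)
{
var xpath = string.Format(FindButtonByPartialTextXPath, partialText.ToLowerInvariant(), Upper, Lower);
var locator = By.XPath(xpath);
// FindElement honors the implicit wait: it retries until a match appears or the timeout expires
return searchContext.FindElement(locator);
}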
Note: implicit waits will not guarantee an element is visible or something a user can interact with.
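If the condition cannot be expressed as a single locator (for example, the button must also be displayed), one option is to poll explicitly. The following is a minimal, library-independent sketch of such a polling loop; Polling and WaitFor are hypothetical names, and Selenium's support package ships its own WebDriverWait class that serves the same purpose.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class Polling
{
    // Calls probe repeatedly until it returns a non-null result.
    // Throws TimeoutException once the timeout has elapsed without a result.
    public static T WaitFor<T>(Func<T> probe, TimeSpan timeout, TimeSpan interval)
        where T : class
    {
        var clock = Stopwatch.StartNew();
        while (true)
        {
            T result = probe();
            if (result != null)
                return result;
            if (clock.Elapsed >= timeout)
                throw new TimeoutException("Condition was not met within " + timeout + ".");
            Thread.Sleep(interval);
        }
    }
}
```

The probe would typically call FindElements and return the first element that passes the extra checks (or null), so the original loop over buttons can stay as it is and simply be retried until the timeout.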

How to recognize dashed commands from Console user input?

Let's say the user enters -path D:\TestFolder\Test.txt -output c -formula x+1-20 in the console.
That means they want to process the file located at that path, output the results to the console, and apply that formula to each number in the file.
What is the proper way to recognize commands if they start with a dash?
For example, I want to build an object from that string:
public class UserInput
{
public string Path { get; set; }
public string OutputParam { get; set; }
public string Formula { get; set; }
public UserInput(string input) { /* parse input and fill the properties here */ }
}
Depending on your target framework, and depending on the complexity of the input parameters, you may want to consider using a 3rd party package to do the parsing for you. There are plenty to choose from, each with their own quirks and best use cases (one example is Command Line Parser, which can work with .NET Standard; on .NET Core, Microsoft's System.CommandLine package can do what you need as well). Each of these has its own implementation and specifics on how to use it, so showing examples might not be as helpful if you're not interested in parsing complex user input.
If you are specifically simply trying to parse only the string -path <path> -output <file> -formula <formula>, you could simply write a helper function to return an array of values after parsing, rather than creating a class (and having to deal with static and all that in your Main function). If you want to create a custom class to handle things like mapping your mathematical formula to something, or manipulate the data in some way, you should probably refactor the example below.
Helper:
private static string[] ParseInput(string[] input)
{
// Returns the token that follows the given flag, e.g. the path after "-path".
string ValueAfter(string flag)
{
var index = Array.IndexOf(input, flag);
// a missing flag (index -1) or a flag without a value is a formatting error
if (index < 0 || index + 1 >= input.Length)
throw new ArgumentException("Input string was not formatted correctly.");
// the value sits one index past the flag
return input[index + 1];
}
var results = new string[3];
results[0] = ValueAfter("-path"); // the 'path' value
results[1] = ValueAfter("-output"); // the 'output' value
results[2] = ValueAfter("-formula"); // the formula value
return results;
}
You can then use this in the main program:
...
static void Main(string[] args)
{
...
try
{
var results = ParseInput(args);
// do stuff with results[0] through [2]
...
}
catch (ArgumentException ex)
{
Console.WriteLine($"ERROR: {ex.Message}");
return;
}
...
}
Note: In general I would avoid hardcoding stuff like this, because it makes your application nearly impossible to extend as you get new requirements. For example, if you ever decide you want a -formula2 input argument, you now need to recode your entire helper function and everything that consumes its result. On the flip side, I also do not advise installing large 3rd party packages that "do everything" if you have a very specific requirement that (you assume) will never change. It's really up to your requirements and scope to decide whether you need a larger solution to solve your problem.
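A middle ground between the hardcoded helper and a full parsing package is to collect every "-name value" pair into a dictionary, so that a new option such as -formula2 needs no parser changes. This is only a sketch under the assumption that every option takes exactly one value; DashArgs is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public static class DashArgs
{
    // Turns { "-path", "a.txt", "-output", "c" } into { path: "a.txt", output: "c" }.
    public static Dictionary<string, string> Parse(string[] args)
    {
        var options = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i < args.Length; i++)
        {
            string token = args[i];
            if (token.Length < 2 || token[0] != '-')
                throw new ArgumentException("Expected an option starting with '-', got '" + token + "'.");
            if (i + 1 >= args.Length)
                throw new ArgumentException("Option '" + token + "' has no value.");
            // The next token is always taken as the value, even if it starts with '-',
            // so a formula such as "-x+1" is still accepted.
            options[token.Substring(1)] = args[++i];
        }
        return options;
    }
}
```

In Main, DashArgs.Parse(args) would then be queried with TryGetValue("path", ...) and so on, leaving it to the caller to decide which options are required.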

Formatting a method signature loses indentation

I'm creating a Code Fix which turns the access modifier of detected methods public. The implementation is straightforward: remove all existing access modifiers and add public at the front. Afterwards I replace the node and return the solution.
This however results in a modifier list that looks like this: publicvirtual void Method(). On top of the modifiers being pasted against each other, that line of code is wrongly indented. It looks like this:
[TestClass]
public class MyClass
{
[TestMethod]
publicvirtual void Method()
{
}
}
So as a solution I format the code instead. Using
var formattedMethod = Formatter.Format(newMethod,
newMethod.Modifiers.Span,
document.Project.Solution.Workspace,
document.Project.Solution.Workspace.Options);
I can format the modifiers but they are still wrongly indented:
[TestClass]
public class MyClass
{
[TestMethod]
public virtual void Method()
{
}
}
I assume this is because of trivia but prepending the formatted method with the original method's leading trivia does not make a difference. I want to avoid formatting the entire document because, well, this isn't an action to format the entire document.
The entire relevant implementation of this Code Fix:
private Task<Solution> MakePublicAsync(Document document, SyntaxNode root, MethodDeclarationSyntax method)
{
var removableModifiers = new[]
{
SyntaxFactory.Token(SyntaxKind.InternalKeyword),
SyntaxFactory.Token(SyntaxKind.ProtectedKeyword),
SyntaxFactory.Token(SyntaxKind.PrivateKeyword)
};
var modifierList = new SyntaxTokenList()
.Add(SyntaxFactory.Token(SyntaxKind.PublicKeyword))
.AddRange(method.Modifiers.Where(x => !removableModifiers.Select(y => y.RawKind).Contains(x.RawKind)));
var newMethod = method.WithModifiers(modifierList);
var formattedMethod = Formatter.Format(newMethod, newMethod.Modifiers.Span, document.Project.Solution.Workspace, document.Project.Solution.Workspace.Options);
var newRoot = root.ReplaceNode(method, formattedMethod.WithLeadingTrivia(method.GetLeadingTrivia()));
var newDocument = document.WithSyntaxRoot(newRoot);
return Task.FromResult(newDocument.Project.Solution);
}
Instead of calling Formatter.Format manually, just put the Formatter.Annotation on your fixed nodes, and the CodeFix engine will call it automatically for you.
The issue is that you need to call Format on the root of the tree, but specify the span of the tree you want formatted, otherwise the formatter will run on just the tree you pass in, with no context from its parent.
The problem was that I had my tests indented in the string representation itself, like this:
var original = @"
using System;
using System.Text;
namespace ConsoleApplication1
{
class MyClass
{
void Method(Nullable<int> myVar = 5)
{
}
}
}";
As you can see there's still a tab between the left margin and the actual code. Apparently the Roslyn formatter can't handle this scenario (which is, admittedly, not a common situation).
In a situation unlike this though, you're probably interested in the formatter which is why I'll accept Kevin's answer.

Caching attribute for method?

Maybe this is dreaming, but is it possible to create an attribute that caches the output of a function (say, in HttpRuntime.Cache) and returns the value from the cache instead of actually executing the function when the parameters to the function are the same?
When I say function, I'm talking about any function, whether it fetches data from a DB, whether it adds two integers, or whether it spits out the content of a file. Any function.
Your best bet is Postsharp. I have no idea if they have what you need, but that's certainly worth checking. By the way, make sure to publish the answer here if you find one.
EDIT: also, googling "postsharp caching" gives some links, like this one: Caching with C#, AOP and PostSharp
UPDATE: I recently stumbled upon this article: Introducing Attribute Based Caching. It describes a postsharp-based library on http://cache.codeplex.com/ if you are still looking for a solution.
I have just the same problem: I have multiple expensive methods in my app and I need to cache their results. Some time ago I just copy-pasted similar code, but then I decided to factor this logic out of my domain.
This is how I did it before:
static List<News> _topNews = null;
static DateTime _topNewsLastUpdateTime = DateTime.MinValue;
const int CacheTime = 5; // In minutes
public IList<News> GetTopNews()
{
if (_topNewsLastUpdateTime.AddMinutes(CacheTime) < DateTime.Now)
{
_topNews = GetList(TopNewsCount);
_topNewsLastUpdateTime = DateTime.Now; // remember when the cache was refreshed
}
return _topNews;
}
And that is how I can write it now:
public IList<News> GetTopNews()
{
return Cacher.GetFromCache(() => GetList(TopNewsCount));
}
Cacher - is a simple helper class, here it is:
public static class Cacher
{
const int CacheTime = 5; // In minutes
static Dictionary<long, CacheItem> _cachedResults = new Dictionary<long, CacheItem>();
public static T GetFromCache<T>(Func<T> action)
{
long code = action.GetHashCode();
if (!_cachedResults.ContainsKey(code))
{
lock (_cachedResults)
{
if (!_cachedResults.ContainsKey(code))
{
_cachedResults.Add(code, new CacheItem { LastUpdateTime = DateTime.MinValue });
}
}
}
CacheItem item = _cachedResults[code];
if (item.LastUpdateTime.AddMinutes(CacheTime) >= DateTime.Now)
{
return (T)item.Result;
}
T result = action();
_cachedResults[code] = new CacheItem
{
LastUpdateTime = DateTime.Now,
Result = result
};
return result;
}
}
class CacheItem
{
public DateTime LastUpdateTime { get; set; }
public object Result { get; set; }
}
A few words about Cacher. You might notice that I don't use Monitor.Enter() (lock(...)) while computing results. That is because replacing the CacheItem reference is atomic, and it is fine if more than one thread computes and stores a result at the same time, since every one of them is valid. Note, however, that Dictionary<TKey, TValue> itself is not safe for concurrent reads and writes, so if the cache is hit from several threads the indexer accesses should be wrapped in the same lock (or the dictionary replaced with a ConcurrentDictionary). Hash codes are also not guaranteed to be unique, so two different delegates could in principle share a key.
You could add a dictionary to your class, using a comma-separated string that includes the function name and its arguments as the key, and the result as the value. Your functions can then check the dictionary for the existence of that value. Save the dictionary in the cache so that it exists for all users.
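The dictionary idea above can be sketched as follows, keyed by the method name plus its comma-separated arguments and with a per-entry lifetime. MethodCache and its members are hypothetical names; the key format is naive (an argument whose text contains a comma could collide with another call), so treat this as a starting point rather than a finished cache.

```csharp
using System;
using System.Collections.Concurrent;

public static class MethodCache
{
    private sealed class Entry
    {
        public DateTime ExpiresUtc;
        public object Value;
    }

    // ConcurrentDictionary makes the lookups and writes themselves thread safe.
    private static readonly ConcurrentDictionary<string, Entry> Entries =
        new ConcurrentDictionary<string, Entry>();

    // methodName plus the comma-separated arguments form the cache key.
    public static T Get<T>(string methodName, object[] args, TimeSpan lifetime, Func<T> compute)
    {
        string key = methodName + "(" + string.Join(",", args) + ")";
        Entry entry;
        if (Entries.TryGetValue(key, out entry) && entry.ExpiresUtc > DateTime.UtcNow)
            return (T)entry.Value;

        // Two threads may compute the same value at the same time; the last write wins.
        T value = compute();
        Entries[key] = new Entry { ExpiresUtc = DateTime.UtcNow + lifetime, Value = value };
        return value;
    }
}
```

A method body then becomes something like return MethodCache.Get("GetTopNews", new object[] { TopNewsCount }, TimeSpan.FromMinutes(5), () => GetList(TopNewsCount)); turning this into an attribute still needs an AOP tool such as PostSharp, as the other answers note.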
PostSharp is your one stop shop for this if you want to create a [Cache] attribute (or similar) that you can stick on any method anywhere. Previously when I used PostSharp I could never get past how slow it made my builds (this was back in 2007ish, so this might not be relevant anymore).
An alternate solution is to look into using Html.RenderPartial with ASP.NET MVC in combination with output caching. This is a great solution for serving HTML for widgets / page regions.
Another solution that would be with MVC would be to implement your [Cache] attribute as an ActionFilterAttribute. This would allow you to take a controller method and tag it to be cached. It would only work for controller methods since the AOP magic only can occur with the ActionFilterAttributes during the MVC pipeline.
Implementing AOP through ActionFilterAttribute has evolved to be the go-to solution for my shop.
AFAIK, frankly, no.
But this would be quite an undertaking to implement within the framework in order for it to work generically for everybody in all circumstances, anyway - you could, however, tailor something quite sufficient to needs by simply (where simplicity is relative to needs, obviously) using abstraction, inheritance and the existing ASP.NET Cache.
If you don't need attribute configuration but accept code configuration, maybe MbCache is what you're looking for?

Implementing a CVAR system

I would like to implement what I know as a CVAR System, I'm not entirely sure on what the official name of it is (if any).
It's essentially a system used in some programs and video games, where a user can pull down a console and input a command, such as "variable 500" to set that variable to 500. Instances of this can be found in any Half-Life game, Doom and Quake games, and many more. The general idea seems to be to hide the underlying architecture, but still allow protected access, for instance, one may be able to view the value for, say, gravity, but not change it. Some of these values may also be functions, for instance, a user may be able to input "create " to create an enemy type at their location, or some other location specified.
Looking through the Half Life 2 SDK, and from what I remember on the GoldSrc SDK, it seems like they at least implemented "flagging" of sorts, where certain commands would only work under certain conditions, such as if another value was set, or if the user has some permission level.
My initial thought was to create a Dictionary, or an object similar to do that, and use that to bind string values to function delegates, as well as keep a "protection" level of sorts, to limit usage of certain commands. However, this seems rather cumbersome, as I believe I would have to go through and add in a new entry manually for each value or function I wanted to implement. I also don't know if this would give me the control level I'm looking for.
I believe what I would ideally like is a CVAR System class, as well as a Register function that can take in, say, a variable/function delegate, a string to access it, and whatever protection level I need. This way I can add what I need as I see it, so everything stays in its related classes and files.
I'm really just looking for some ideas here, so my questions are:
Has anyone ever done something like this before, and if so, how?
Would my implementation work? (Theoretically, if not, can you think of a better way?)
If someone is more knowledgeable with how one of the previously mentioned titles does it, can you elaborate on that a bit? It seems to be hard to find documentation on them.
I'm not really looking for specific code, just more of structuring design. And it doesn't have to be "commercial" or work just like another, I just need something to get me going.
Were you thinking about something like this?
class CVAR
{
[ProtectionLevel(CVARFlags.InGameOnly | CVARFlags.Admin)]
private float gravity = 0.1f;
[ProtectionLevel(CVARFlags.InGameOnly | CVARFlags.Admin)]
private float friction = 0.1f;
[ProtectionLevel(CVARFlags.ReadOnly)]
private string serverVersion = "x.x.x";
public void SetCVARValue(string commandLine) {
string cvarName = GetCvarName(commandLine); // parse the cmd line and get the cvar name from there
object cvarValue = GetCvarValue(commandLine); // parse the value from the string
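// the cvars are private instance fields, so binding flags are required (needs using System.Reflection)
FieldInfo field = typeof(CVAR).GetField(cvarName, BindingFlags.NonPublic | BindingFlags.Instance);
if (field == null) return; // unknown cvar: error report
object[] attributes = field.GetCustomAttributes(typeof(ProtectionLevelAttribute), false);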
if(attributes.Length > 0) {
ProtectionLevelAttribute attr = (ProtectionLevelAttribute)attributes[0];
if(attr.CheckConditions(World.Instance)) {
field.SetValue(this, cvarValue);
} else {
// error report
}
}
}
}
You could write a parser that looks for commands like
/object_property value
/object_method arg1 arg2
A dictionary, like you suggested, could map those strings to properties and functions. The creation of the dictionary could be done dynamically using reflection by looping through eligible objects, taking their public methods and accessors, and generating a string for them.
Then the dictionary could be mapped in a class for convenience and error checking.
For the methods, the dictionary values could be delegates that take 0..n arguments. For the properties/fields, you will need to do some data binding between your actual fields and the dictionary values, unless your objects themselves refer to the dictionary for their values, in which case the values live in only one place.
To do so, you could simply register your properties using reflection in the object constructor, then call the dictionary in your properties.
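[Flags]
public enum CVarAccessibilities
{
// flag values must be distinct powers of two; with Settable = 0, HasFlag(Settable) would always be true
None = 0,
Settable = 1,
Gettable = 2
}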
public class CVar<T>
{
public CVarAccessibilities Accessibility { get; set; }
T val;
public T Value {
get { return val; }
set
{
if (!Accessibility.HasFlag(CVarAccessibilities.Settable))
return; // just don't set it, maybe print some warning
val = value;
}
}
}
public static class CVarRegistry
{
static readonly Dictionary<string, object> CVars = new Dictionary<string, object>();
static CVarRegistry() { /* use reflection to fill the dictionary with CVar<T> instances */ }
public static T GetValue<T>(Type owner, string paramName)
{
object cvar;
if (!CVars.TryGetValue(owner.Name + "_" + paramName, out cvar))
throw new MyCustomException();
return ((CVar<T>)cvar).Value;
}
public static void SetValue<T>(Type owner, string paramName, T value)
{
object cvar;
if (!CVars.TryGetValue(owner.Name + "_" + paramName, out cvar))
throw new MyCustomException();
((CVar<T>)cvar).Value = value;
}
}
public class MyObject
{
public static int MyRegisteredValue
{
get { return Global.CVarRegistry.GetValue<int>(typeof(MyObject), "MyRegisteredValue"); }
set { Global.CVarRegistry.SetValue(typeof(MyObject), "MyRegisteredValue", value); }
}
}
Hope that helps!
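For comparison, the Register-style design described in the question can be sketched without reflection: each class registers a name, a protection flag set, and a getter/setter pair, and the console only talks to the registry. All names here (CvarSystem, CvarFlags, Register, Execute) are hypothetical, and the permission model is reduced to a single isAdmin switch for illustration.

```csharp
using System;
using System.Collections.Generic;

[Flags]
public enum CvarFlags { None = 0, ReadOnly = 1, AdminOnly = 2 }

public sealed class CvarSystem
{
    private sealed class Entry
    {
        public CvarFlags Flags;
        public Func<string> Get;
        public Action<string> Set;
    }

    private readonly Dictionary<string, Entry> _entries =
        new Dictionary<string, Entry>(StringComparer.OrdinalIgnoreCase);

    // Each class registers its own variables, so the list is not maintained in one place.
    public void Register(string name, CvarFlags flags, Func<string> get, Action<string> set)
    {
        _entries[name] = new Entry { Flags = flags, Get = get, Set = set };
    }

    // Handles console input such as "gravity" (read) or "gravity 500" (write).
    public string Execute(string commandLine, bool isAdmin)
    {
        string[] parts = commandLine.Split(new[] { ' ' }, 2, StringSplitOptions.RemoveEmptyEntries);
        Entry entry;
        if (parts.Length == 0 || !_entries.TryGetValue(parts[0], out entry))
            return "unknown cvar";
        if (parts.Length == 1)
            return parts[0] + " = " + entry.Get();
        if ((entry.Flags & CvarFlags.ReadOnly) != 0)
            return parts[0] + " is read-only";
        if ((entry.Flags & CvarFlags.AdminOnly) != 0 && !isAdmin)
            return parts[0] + " requires admin rights";
        entry.Set(parts[1].Trim());
        return parts[0] + " = " + entry.Get();
    }
}
```

A physics class could then call cvars.Register("gravity", CvarFlags.AdminOnly, () => gravity.ToString(), s => gravity = float.Parse(s)) in its constructor, which keeps the variable next to the code that uses it while the console sees only strings.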
This is more commonly known as 'tweak' variables.
Good discussion here: https://gamedev.stackexchange.com/questions/3631/tweaking-and-settings-runtime-variable-modification-and-persistence
