Library for traversing object property tree on C#

I want to have a method that could traverse an object by property names and get me the value of the property.
More specifically, as input I have a string like "Model.Child.Name", and I want this method to take an object and get me the value that could be found programmatically via: object.Model.Child.Name.
I understand that the only way to do this is to use Reflection, but I don't want to write this code on my own, because I believe that there are pitfalls. Moreover, I think it is a fairly common task.
Is there any well-known implementation of an algorithm like that in C#?

Reflection is the way to go.
See: Reflection to access properties at runtime
You can take a look at ObjectDumper and modify the source code to suit your requirements.
ObjectDumper takes a .NET object and dumps it to a string, file, TextWriter, etc.

This is not that difficult to write. Yes, there are some pitfalls, but it's good to know the pitfalls.
The algorithm is straightforward: it traverses a tree structure. At each node you check whether it is a primitive value (int, string, char, etc.). If it's not one of these types, then it's a structure that has one or more primitives and needs to be traversed down to its primitives.
The pitfalls are dealing with nulls, nullable types, value versus reference types, etc. Straightforward stuff that every developer should know about.
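
As a rough illustration of the reflection approach discussed above, here is a minimal sketch that walks a dot-separated path such as "Model.Child.Name". The class name PropertyPath and the method name GetValue are made up for this example, and the choices to return null when an intermediate value is null and to throw when a property is missing are assumptions, not an established convention. Indexers, fields and non-public members are not handled.

```csharp
using System;
using System.Reflection;

// Hypothetical helper, named for illustration only.
public static class PropertyPath
{
    // Follows a dot-separated chain of public instance properties.
    // Assumption: a null intermediate value ends the walk and yields null.
    public static object GetValue(object root, string path)
    {
        if (root == null) throw new ArgumentNullException(nameof(root));
        if (string.IsNullOrEmpty(path))
            throw new ArgumentException("Path must not be empty.", nameof(path));

        object current = root;
        foreach (string name in path.Split('.'))
        {
            if (current == null) return null;

            // Look the property up on the runtime type of the current node.
            PropertyInfo property = current.GetType()
                .GetProperty(name, BindingFlags.Public | BindingFlags.Instance);
            if (property == null)
                throw new ArgumentException(
                    "Property '" + name + "' not found on type '" +
                    current.GetType().FullName + "'.", nameof(path));

            current = property.GetValue(current, null);
        }
        return current;
    }
}
```

A call would look like PropertyPath.GetValue(someObject, "Model.Child.Name"); the result is typed as object, so the caller casts it to the expected type.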

Related

Alias for long lookups

I am working on a project that has nested Lists of classes. Hence my code looks like this when I want to get a variable:
MainClass.subclass1[element1].subClass2[element2].subClass3[element3].value;
I was wondering how I could get an alias for subClass3 so I can get all the variables in it without having to look in all the subclasses, like this.
subClass3Alias.value
In C++ this would be easy: simply have a pointer pointing to it. But C# does not really have pointers.

No need for pointers: types in C# are usually reference types anyway, meaning that you can just copy the reference and the copy will refer to the original object (like pointers in C++):
var subclassAlias = MainClass.subclass1[element1].subClass2[element2].subClass3[element3];
Now you can use subclassAlias.value.
A slightly different thing occurs if your type happens to be a value type: in that case, the above will still work – but subclassAlias will be a value copy of the original value, meaning that changes to subclassAlias will not be reflected in the original object.
That said, this looks like suspicious code anyway – normally such deep levels of nesting are a sign of bad design and violate the Law of Demeter.
(Incidentally, in C++ you wouldn’t use pointers either.)
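
The difference between aliasing a reference type and copying a value type can be seen in a small sketch. The types Node and Point and the class AliasDemo are invented for this example and stand in for whatever subClass3 actually is.

```csharp
using System;

public class Node { public int Value; }     // reference type (hypothetical)
public struct Point { public int Value; }   // value type (hypothetical)

public static class AliasDemo
{
    // The alias and the original refer to the same object,
    // so a write through the alias is visible through the original.
    public static int MutateClassAlias()
    {
        var original = new Node { Value = 1 };
        var alias = original;
        alias.Value = 42;
        return original.Value;
    }

    // Assigning a struct copies it, so the original keeps its old value.
    public static int MutateStructCopy()
    {
        var original = new Point { Value = 1 };
        var copy = original;
        copy.Value = 42;
        return original.Value;
    }
}
```

MutateClassAlias returns 42 and MutateStructCopy returns 1, which is the behaviour the answer describes.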

Should type information be encoded in parse tree?

I am working on a project including a small DSL. Lexing and parsing a string in this language results in a parse tree, implemented as an abstract class called Expr, which then has many of the usual derived classes such as AssignmentExpr, InvokeExpr, AdditionExpr, et cetera, corresponding to parse tree nodes which are assignments, function invocations, additions and so forth. The project is implemented in C#.
I am currently considering the implementation of type inference for this DSL. This means that I would like to be able to take an instance of the Expr class and return something encoding information about the types of the different nodes in the tree. This type information depends on a symbol table (types of variables) and a function table (function signatures). Thus, I would like to do something like:
TypedExpr typedExpr = inferTypes(expr, symbolTable, functionTable)
Here, TypedExpr would ideally be like Expr, except with a Type property giving the type of the expression. This, however, presents the following design problems:
It would make sense for TypedExpr to inherit from Expr and simply implement an additional property, Type. However, this would create two parallel inheritance hierarchies, one for TypedExpr (TypedAssignmentExpr, TypedInvokeExpr et cetera) and one for Expr (AssignmentExpr, InvokeExpr, et cetera). This is inconvenient to maintain, and the problem expands if further extensions of parse trees are required. I am not sure how this can be mitigated. One possibility would be the bridge design pattern, but I don't think this is capable of entirely solving the problem.
Alternatively, Expr could simply implement a Type property, which is then null at the time of construction from the parser, and later filled out by the type inference algorithm. However, passing around objects with null fields invites NullReferenceExceptions. The TypedExpr idea would have mitigated this. Furthermore, given that the idea of the Expr class is to express a parse tree, type information is not really a part of the tree: typing is context-sensitive, and requires particular symbol and function tables.
Third, the type inference method could also simply return a Dictionary<Expr, Type> which encodes type information about all nodes. This would mean that Expr remains representative of just the parse tree. The drawback of this is that the dictionary object constructed does not have any obvious properties showing that it is linked specifically to the Expr object passed to the type inference method.
I am not entirely satisfied with any of the three solutions given above.
My question is: What are the benefits and drawbacks of various approaches to this problem? Should type information be encoded directly in the parse tree, or should a parallel tree class be used? Or is the Dictionary solution the best? Is there an accepted "best practice" solution?

Go ahead with option two. This is what can be considered a “best practice”.
The reason is that a compiler usually works in many passes (stages, phases), parsing being the first one and type resolution another. You can later add an optimization pass, a code generation pass, etc. Usually, a single data structure, an abstract syntax tree (AST; or parse tree), is maintained along these passes.
The idea that “passing around objects with null fields invites NullReferenceExceptions” is an unfounded worry. You have to handle invalid cases and introduce counter-measures to validate inputs / outputs anyway. Compilers, including simple expression processors, are pretty complex things driven by complicated rules, which involve high degrees of data structure complexity and application logic you can't simply avoid.
It is very normal for an AST to have uninitialized data. Each compilation pass after the initial construction of the AST by the parser then manipulates the AST and computes more information (like your type resolution phase). The AST may even change substantially, e.g. due to an optimization pass.
Side note: modern compilers, such as the latest C# compiler, employ an immutability policy over ASTs and other internal data structures. In that case each pass builds its own new data structure. You could then design a new set of data structures for each pass, but that may turn into overly complex code to maintain. Someone from the C# compiler team could elaborate more on this topic.
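
To make option two concrete, here is a minimal sketch of a mutable tree whose type slot is empty after parsing and filled in by a later pass. Only Expr and AdditionExpr come from the question; LiteralExpr, VariableExpr, the InferredType property and the TypeInference class are hypothetical names. The function table is left out, and the rule that both operands of an addition must have the same type is a simplification chosen for the example.

```csharp
using System;
using System.Collections.Generic;

public abstract class Expr
{
    // Null after parsing; set by the type inference pass.
    public Type InferredType { get; set; }
}

public sealed class LiteralExpr : Expr      // hypothetical node type
{
    public object Value { get; }
    public LiteralExpr(object value) { Value = value; }
}

public sealed class VariableExpr : Expr     // hypothetical node type
{
    public string Name { get; }
    public VariableExpr(string name) { Name = name; }
}

public sealed class AdditionExpr : Expr
{
    public Expr Left { get; }
    public Expr Right { get; }
    public AdditionExpr(Expr left, Expr right) { Left = left; Right = right; }
}

public static class TypeInference
{
    // Annotates every node in place, bottom-up.
    public static void InferTypes(Expr expr, IDictionary<string, Type> symbolTable)
    {
        switch (expr)
        {
            case LiteralExpr literal:
                literal.InferredType = literal.Value.GetType();
                break;
            case VariableExpr variable:
                if (!symbolTable.TryGetValue(variable.Name, out Type declared))
                    throw new InvalidOperationException("Unknown variable '" + variable.Name + "'.");
                variable.InferredType = declared;
                break;
            case AdditionExpr addition:
                InferTypes(addition.Left, symbolTable);
                InferTypes(addition.Right, symbolTable);
                // Simplifying assumption: no implicit conversions.
                if (addition.Left.InferredType != addition.Right.InferredType)
                    throw new InvalidOperationException("Operand types differ.");
                addition.InferredType = addition.Left.InferredType;
                break;
            default:
                throw new NotSupportedException(expr.GetType().Name);
        }
    }
}
```

Later passes can then check that InferredType is set before they rely on it, which is the kind of validation the answer says a compiler needs anyway.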

Snappy names for a ReferenceType value and a ValueType value

I have 2 classes. One handles a ReferenceType value, another does the same on a ValueType value. This is the only difference, but it is important. I am struggling to find a decent name for each class:
ReferenceTypeValueHandler and ValueTypeValueHandler?
Neah, ValueTypeValue sounds confusing.
ClassValueHandler and StructValueHandler?
I shouldn't use "Class" in a name of a class, should I?
NullableValueHandler and NonNullableValueHandler?
"Nullable" is already used for nullable value types (Nullable<>)
HeapValueHandler and StackValueHandler?
That's dumb. Exploiting the fact that reference type values are stored in the heap and value type values are in the stack, who cares? Also "Stack" is confusing implying it has something to do with a stack.
Any more ideas?
Update:
Some people suggest I should explain the purpose of the class. Well, although I don't think it's important, here it is: I am working on an XML to entity deserializer. I use XmlReader to take advantage of streaming reads rather than working with a DOM. As I read the XML I build entities. Some entities are just wrappers for some other ones. These wrappers can take either a single entity or a collection (enumerable) of entities. As for those which take a single entity, this entity has to be provided, and it has to be provided exactly once. If the XML doesn't have it, it's a problem. If the XML has more than one, it's a problem too.
So for keeping the entity and ensuring that it is provided exactly once I have a class ValueKeeper<TValue>. It has 2 methods: TakeValue(TValue value) and TValue ClaimValue(). The TakeValue method takes the value and checks whether a value was already provided before; if so, it throws an exception with appropriate details. The ClaimValue method is called once the reading of the wrapper XML is finished and the wrapper entity has to be created over the scraped value. This method checks whether there is a value that was received via the TakeValue method; if so, it just returns that value, and if not, it throws an exception.
Now, the problem is that for reference type values I am using a comparison to null in order to see whether the value was provided. To make such a comparison possible there must be a generic constraint on the TValue type parameter: where TValue: class. Having this constraint in place, I cannot use this class for value type values. So I need another class that does the same, but operates on values where TValue: struct, using a Nullable<TValue> field to keep either the provided or the not-yet-provided value. Now, with 2 classes I cannot get along with just ValueKeeper; I need one name for the reference type value and another for the value type value. This is where the question comes up. I need a way to express this subtle difference. But again, it's not important what the class does; what's important is to find an appropriate way to make this difference clear.
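
For readers who want to see the two classes side by side, here is a sketch reconstructed from the description above. The class names ReferenceValueKeeper and StructValueKeeper are placeholders only (naming them is exactly the open question), and the use of InvalidOperationException is an assumption, since the text only says that an exception with appropriate details is thrown.

```csharp
using System;

// Placeholder name: keeps a reference type value, using null as "not provided".
public sealed class ReferenceValueKeeper<TValue> where TValue : class
{
    private TValue value;

    public void TakeValue(TValue value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        if (this.value != null)
            throw new InvalidOperationException("A value was already provided.");
        this.value = value;
    }

    public TValue ClaimValue()
    {
        if (value == null)
            throw new InvalidOperationException("No value was provided.");
        return value;
    }
}

// Placeholder name: keeps a value type value in a Nullable<TValue> field.
public sealed class StructValueKeeper<TValue> where TValue : struct
{
    private TValue? value;

    public void TakeValue(TValue value)
    {
        if (this.value.HasValue)
            throw new InvalidOperationException("A value was already provided.");
        this.value = value;
    }

    public TValue ClaimValue()
    {
        if (!value.HasValue)
            throw new InvalidOperationException("No value was provided.");
        return value.Value;
    }
}
```

The two bodies differ only in how "not yet provided" is represented, which is why the generic constraint ends up being the only thing the names have to distinguish.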

I wouldn't agree that the rest of the class name is not important. You want to make your code speak for itself and to make it easy for the reader to understand the concepts you had in mind when designing the classes/structs. The class names you suggest would give me no idea of what the class is actually doing. I suggest searching for more concrete names: How is the value being handled? What value?
How do struct and class values differ from each other apart from that one is a class and the other one a struct? There must be some more difference because otherwise it wouldn't make sense to have the same thing as a struct and as a class (DRY).
If it's a very abstract operation you perform, try to search for the pattern, or a general name for a concept. To keep the value and make sure it was provided sounds a bit like a caching mechanism?
Secondly, you're facing a semantic issue here: what is the term which subsumes 'values' of value types and 'values' of reference types? We could simply consult the inheritance chain of the .NET framework here and call both of them an object.
So, in this case, something like CacheForValueTypeObjects and CacheForReferenceTypeObjects could work. I don't know whether cache expresses the purpose well, but if not, I would try to search for a term which best describes the 'final' purpose of the class, the reason why its there.
I bet you didn't think 'Well, what I really need now is a ValueTypeValueHandler!'. There was something more to it. ;) I like this kind of questions, thanks!

Convention while using Helper Casting Functions

I recently started using functions to make casting easier on my fingers. For instance, I had something like this
((Dictionary<string,string>)value).Add(foo);
and converted it to a tiny little helper function so I can do this
ToDictionary(value).Add(foo);
Is this against the coding standards?
Also, what about simpler examples? For example in my scripting engine I've considered making things like this
((StringVariable)arg).Value="foo";
be
ToStringVar(arg).Value="foo";
I really just dislike how, in order to cast a value and immediately get a property from it, you must enclose it in double parentheses. I have a feeling the last one is much worse than the first one, though.
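
For reference, the helper described above presumably amounts to something like the following sketch. The containing class name CastHelpers is made up, and the body is a guess at what the asker wrote: a plain cast wrapped in a method.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container class for the helper.
public static class CastHelpers
{
    // Wraps the cast so call sites avoid the double parentheses.
    // Throws InvalidCastException if value is some other type.
    public static Dictionary<string, string> ToDictionary(object value)
    {
        return (Dictionary<string, string>)value;
    }
}
```

The call site then reads CastHelpers.ToDictionary(value).Add("key", "text"), with the same runtime behaviour as the inline cast.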

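Ignoring for a moment that you may actually need to do this casting - which I personally doubt - if you really just want to "save your fingers", you can use a using alias directive to shorten the name of your generic types.
At the top of your file, with all the other usings (in real code the right-hand side must use fully qualified type names, since an alias directive does not see the file's other using directives):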
using ShorterType = Dictionary<string, Dictionary<int, List<Dictionary<OtherType, ThisIsRidiculous>>>>;

I don't think so. You've also done something nice in that it's a bit easier to read and see what's going on. GLib (in C) provides casting macros for its classes, so this isn't a new concept. Just don't go overboard trying to save your fingers.

In general, I would consider this to be code smell. In most situations where the type of casting you describe is necessary, you could get the same behavior by proper use of interfaces (Java) or virtual inheritance (C++) in addition to generics/templates. It is much safer to leave that responsibility of managing types to the compiler than attempting to manage it yourself.
Without additional context, it is hard to say much about the example you have included. There are certainly situations in which the type of casting you describe is unavoidable, but they're the exception rather than the rule. For example, the type of casting (and the associated helper functions/macros) you're describing is extremely commonplace in generic C libraries.

Property or Method? [duplicate]

This question already has answers here:
Closed 13 years ago.
Possible Duplicate:
Properties vs Methods
When is it best to use a property or a method?
I have a class which is a logger. When I create this class I pass in a file name.
My file name for the log is fairly simple, basically it just gets the application path and then combines it with myapp.log.
Now instead of having the log file name lines in my method I want to create a new method to get this.
So my question is: since it's fairly simple and there are no parameters, is creating a property a good idea instead of creating a method?

Properties are typically used to store state for an object. Methods are typically used to perform an action on the object or return a result. Properties offer both getters and setters, and these can have different scope (at least in .NET 2.0). There are also some advantages to using properties over methods for serialization or cloning, and UI controls look for properties via reflection to display values.

Properties can be used to return simple values. Methods should always be used when fetching the value might incur any kind of performance hit. Simple linear operations can be fine in properties, though.

Ask yourself whether it's an aspect of your class (something it has) versus a behaviour of your class (something it does).
In your case, I'd suggest a property is the way to go.

I'd definitely go with the property. If you were doing something complex or computationally or time intensive, then you would go the method route. Properties can hide the fact that a complex operation is taking place, so I like to reserve properties for fast operations and ones that actually describe a property on the object. Simply: Methods "do" something, properties describe.

When you want to use it like a variable, you should go for a property. When you want it to be clear that this is a method, you should use a method.
As a property is a method, it depends on the semantic/design you want to communicate here.

Properties should be used to wrap instance variables or provide simple calculated fields. The rule of thumb that I use is: if there is anything more than very light processing, make it a method.

If you are not doing anything significant, use properties.
In your case, a readonly property (get only) should be good.
Methods make sense when you are doing something other than returning reference to an internal member.

Properties are a design smell.
They are sometimes appropriate in library classes, where the author cannot know how the data will be used but must simply output the same value that was put in (e.g. the Key and Value properties of the KeyValuePair class.)
IMHO some uses of properties in library classes are bad. e.g. assigning to the InnerHTML property of a DOM element triggers parsing. This should probably be a method instead.
But in most application classes, you do know exactly how the value will be used, and it is not the class's responsibility to remind you what value you put in, only to use the data to do its job. Messages should exercise capabilities, not request information.
And if the value you want is computed by the class (or if the class is a factory and the value is a newly-created object), using a method makes it more clear that there is computation involved.
If you are setting the log filename, then there is also the side effect of opening the file (which presumably may throw an exception?) That should probably be a method.
And if the filename is just part of the log message, you do not need a getter for the property, only a setter. But then you would have a write-only property. Some people find these confusing, so I avoid them.
So I would definitely go for the method.

The answer in the duplicate question is correct. MSDN has a very good article on the differences and when one should be used over the other. http://msdn.microsoft.com/en-us/library/ms229054.aspx
In my case I believe using the Property would be correct because it just returns the path of the exe + a file name combined.
If however I decided to pass a file name to get it to combine with the exe path, then I would use a method.
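
The distinction drawn in this thread can be sketched as follows. The Logger class shape, the constructor parameter baseDirectory and the member names LogFileName and GetLogFilePath are assumptions made for the example; only the file name myapp.log comes from the question. In the asker's setup the directory would be the application path rather than a constructor argument.

```csharp
using System;
using System.IO;

public class Logger
{
    private readonly string baseDirectory;

    // Assumption: the directory is injected to keep the sketch testable.
    public Logger(string baseDirectory)
    {
        if (baseDirectory == null) throw new ArgumentNullException(nameof(baseDirectory));
        this.baseDirectory = baseDirectory;
    }

    // No parameters and a cheap computation: a get-only property fits.
    public string LogFileName
    {
        get { return Path.Combine(baseDirectory, "myapp.log"); }
    }

    // Takes an argument, so it has to be a method.
    public string GetLogFilePath(string fileName)
    {
        return Path.Combine(baseDirectory, fileName);
    }
}
```

Callers read logger.LogFileName like a field, while logger.GetLogFilePath("other.log") makes it visible that an input is involved.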
