Using reflection to SetValue on property with private setter - c#

I ran into what I thought was a bug but is actually a feature, detailed in this post. Can anyone explain to me why this is allowed? It seems like a legacy quirk/bug that became useful.

I'm not sure which part of that you think is a bug, but it has always been possible to access the internals of a class via reflection when you cannot do so at compile time. This is by design. Numerous aspects of the CLR rely on reflection to access fields, such as serialization. The compiled IL needs to be able to access all fields of all objects, or else you couldn't set private fields from within your class.
The access modifiers in C# are not a security mechanism. If you are relying on a field being private to prevent anyone from setting it from the outside, you're doing something wrong. They exist to clearly delineate which parts of your class are its public contract (and thus, in theory, stable) from those parts that are implementation details (and thus can change without notice).
If you choose to use reflection to change the internal state of an object, despite all indications that you should leave it alone, you are taking the stability of your application into your own hands, and you get what you deserve.

Reflection on non-public members is allowed only for Full Trust code, which is already able to do anything (including poking directly at the memory of the process). So having a supported way of changing values, even for private properties, does not make code any less secure. It keeps the reflection API consistent and allows useful scenarios, especially for testing.
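For reference, here is a minimal sketch of the behaviour the question is about. The Account class and the PrivateSetterDemo helper are hypothetical names made up for illustration; only PropertyInfo.SetValue is the real API under discussion.

```csharp
using System;
using System.Reflection;

// Hypothetical type for illustration only.
public class Account
{
    // Writing "account.Balance = ..." outside this class does not compile.
    public decimal Balance { get; private set; }
}

public static class PrivateSetterDemo
{
    // Assigns Balance from outside the class by going through reflection.
    public static void SetBalance(Account account, decimal value)
    {
        // The property counts as public because its getter is public,
        // so no BindingFlags are needed to find it.
        PropertyInfo property = typeof(Account).GetProperty("Balance");

        // SetValue invokes the private set accessor; the compile-time
        // accessibility check simply does not apply here.
        property.SetValue(account, value);
    }
}
```

Calling PrivateSetterDemo.SetBalance(account, 42m) changes account.Balance even though the setter is private.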

Related

How do get set methods stop dependencies?

So I understand that if we want to change the implementation details of a class, using those details outside of the class will cause errors when things are changed, which is why we set those fields to private. However, if we use get set methods with a private field doesn't this do the same thing? If I decided I didn't want my class to have a name and a username, just a name, and I deleted the private username field, the get / set methods would break with that and the places where those methods are used would also break. Isn't referencing one class a dependency no matter what in case we change that class's methods or fields? What is the point of Get Set methods then and how do they stop code from breaking like this?
However, if we use get set methods with a private field doesn't this do the same thing?
Yes. Arguably, yes. The original idea of Object Oriented Programming, as Alan Kay (who coined the term) initially thought about it, has been distorted. Alan Kay has expressed his dislike for setters:
Lots of so called object oriented languages have setters and when you have a setter on an object you turned it back into a data structure.
-- Alan Kay - Programming and Scaling (video).
Isn't referencing one class a dependency no matter what in case we change that class's methods or fields?
Correct. If you are referencing a class from another, your classes are tightly coupled. In that case a change in one class will propagate to the other, regardless of whether the change is in public fields, getters, setters or something else.
If you are using an interface or similar indirection, they are loosely coupled. This looseness gives you an opportunity to stop the propagation of the change. Which you may or may not do.
Finally, if you are using an observer pattern or similar (e.g. events or listeners), you can have classes decoupled. This is, in a way, retrofitting the idea of passing messages as originally conceived by Alan Kay.
What is the point of Get Set methods then and how do they stop code from breaking like this?
They allow you to change the internal representation of the class. While the common approach is to have setters and getters correspond to a field, that does not have to be the case. A getter might return a constant, or compute a value from multiple fields. Similarly, a setter might update multiple fields (or even do nothing).
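As a rough sketch of that idea (the Person class and its splitting rule are made up for illustration), here FullName has no field of its own: the getter computes it from two fields and the setter updates both.

```csharp
using System;

// Hypothetical class: callers only see properties, never the fields.
public class Person
{
    private string firstName = "";
    private string lastName = "";

    public string FirstName
    {
        get { return firstName; }
        set { firstName = value ?? ""; }
    }

    public string LastName
    {
        get { return lastName; }
        set { lastName = value ?? ""; }
    }

    // No backing field: computed from two fields on read,
    // and split back into those two fields on write.
    public string FullName
    {
        get { return (firstName + " " + lastName).Trim(); }
        set
        {
            string text = (value ?? "").Trim();
            int space = text.IndexOf(' ');
            // Assumption for the sketch: everything before the first space is the first name.
            firstName = space < 0 ? text : text.Substring(0, space);
            lastName = space < 0 ? "" : text.Substring(space + 1).Trim();
        }
    }
}
```

If the class later stored a single name field instead, FirstName, LastName and FullName could keep their signatures and the calling code would not have to change.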
Reasons to have setters:
They give you an opportunity to implement validations.
They give you an opportunity to raise "changed" events.
They might be necessary to work with other systems (e.g. some Dependency Injection frameworks, also some User Interface frameworks).
You need to update multiple fields to keep an invariant. Presumably updating those other fields doesn't result in some public property changing value in an unexpected way (and doesn't break the single responsibility principle, but that should be obvious). See the principle of least astonishment.
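The first two reasons in the list above could look something like this. Thermostat, its 5 to 35 range and the TargetChanged event are all invented for the sketch.

```csharp
using System;

// Hypothetical class showing a setter that validates and notifies.
public class Thermostat
{
    private double target = 20.0;

    // Raised only when the stored value actually changes.
    public event EventHandler TargetChanged;

    public double Target
    {
        get { return target; }
        set
        {
            // Validation: reject values outside an assumed supported range.
            if (value < 5.0 || value > 35.0)
                throw new ArgumentOutOfRangeException(nameof(value), "Target must be between 5 and 35.");

            // Nothing to do (and nothing to announce) if the value is unchanged.
            if (value == target) return;

            target = value;
            TargetChanged?.Invoke(this, EventArgs.Empty);
        }
    }
}
```

With a public field there would be no place to put either the range check or the notification.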
Reasons to have getters:
They give you an opportunity to implement lazy initialization.
They give you an opportunity to return computed values.
They might make debugging easier. Consider some getters for DEBUG builds only.
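A small sketch of the lazy initialization case, using Lazy<T> from the base class library. The Report class and its BuildCount counter are hypothetical and exist only to make the laziness observable.

```csharp
using System;

// Hypothetical class: the expensive value is built on first read only.
public class Report
{
    private readonly Lazy<string> body;

    // Counts how often the expensive step ran, for demonstration purposes.
    public int BuildCount { get; private set; }

    public Report()
    {
        // The factory is not invoked here, only stored.
        body = new Lazy<string>(BuildBody);
    }

    // The getter triggers the build on first access and reuses the result afterwards.
    public string Body
    {
        get { return body.Value; }
    }

    private string BuildBody()
    {
        BuildCount++;
        return "generated report body";
    }
}
```

Reading Body twice runs BuildBody once. A public field would have to be filled in eagerly, in the constructor.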
If you had public fields, and then you decided you needed anything like what I described above, you may want to change to getters and setters. Plus, that change requires recompiling the code that uses it (even if the source is the same, which would be the case with C# properties). That is a reason it is advised to do it preemptively, in particular in code libraries (so that an application that uses the library does not have to be recompiled if the library changes to a newer version that needs these changes).
These are reasons to not have getters: Often, getters exist to access a member to call a method on it, which leads to very awkward interfaces (see Law of Demeter). Or to make a decision, which may lead to a Time-of-check to time-of-use bug, which also means the interface is not thread-safe ready. Or to do a computation, which is often better done by a method on the class itself (Tell, Don't Ask).
As for setters, aside from being a code smell of bad encapsulation, they could be indicative of an unintended state machine. If code needs to call a setter (change the state) to make sure it has the intended value before calling a method, just make it a parameter (yes, even if you are going to repeat that parameter in a lot of methods). Such an interface is easy to misuse, and it is not thread-safe ready. In general, avoid any interface design in which the code using it has to call things in an order that the interface does not enforce. A good design will not let you call things in an order that results in an invalid state (see poka-yoke). Of course, not every contract can be expressed in the interface; we have exceptions for the rest.
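To make the "just make it a parameter" advice concrete, here is a sketch with two made-up formatter types. Neither comes from a real library.

```csharp
using System;

// Easy to misuse: the caller has to remember to set Prefix before calling Format,
// and two threads sharing one instance can overwrite each other's Prefix.
public class StatefulFormatter
{
    public string Prefix { get; set; }

    public string Format(int value)
    {
        return Prefix + value;
    }
}

// Harder to misuse: everything Format needs arrives with the call,
// so there is no order of calls to get wrong.
public static class Formatter
{
    public static string Format(string prefix, int value)
    {
        if (prefix == null) throw new ArgumentNullException(nameof(prefix));
        return prefix + value;
    }
}
```

Forgetting the setter in the first version silently formats without a prefix; forgetting the argument in the second version does not compile.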
A thread-safe ready interface is one that can be implemented in a thread-safe fashion. If an interface is not thread-safe ready, the only way to avoid threading problems while using it is to wrap access to it with locks external to it, regardless of how the interface is implemented. Often this is because the interface prevents consolidating reads and writes, leading to a Time-of-check to time-of-use bug or an ABA problem.
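Here is a sketch of the difference, using a hypothetical Inventory class. Checking Stock and then acting on the result is the time-of-check to time-of-use pattern; TryRemove consolidates the read and the write so the class can guard both itself.

```csharp
using System;

// Hypothetical class for illustration only.
public class Inventory
{
    private readonly object gate = new object();
    private int stock;

    public Inventory(int initialStock)
    {
        stock = initialStock;
    }

    // Reading this and then deciding what to do is racy for callers:
    // another thread may take the last item between the check and the action,
    // no matter how carefully this getter itself is locked.
    public int Stock
    {
        get { lock (gate) { return stock; } }
    }

    // Thread-safe ready: the check and the update are one operation under one lock.
    public bool TryRemove(int quantity)
    {
        lock (gate)
        {
            if (quantity <= 0 || quantity > stock) return false;
            stock -= quantity;
            return true;
        }
    }
}
```

If the class only offered the Stock getter plus an unconditional removal method, every caller would need its own external lock around the pair.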
There is value in public fields, when appropriate, too. In particular for performance, and for interoperability with native code. You will find, for example, that Vector types used in game development libraries often have public fields for their coordinates.
As you can see, there can be good reasons for both having and not having getters and setters. Similarly, there can be good reasons for both having or not having public fields. Plus, either case can be problematic if not used appropriately.
We have guidelines and "best practices" to avoid the pitfalls. Not having public fields is a very good default. And not every field needs getters and setters. However, you can make getters and setters, and you can make fields public. Do that if you have a good reason to do it.
If you make every field public you will likely run into trouble, breaking encapsulation. If you make getters and setters for each and every field, it is not much better. Use them thoughtfully.

Do internal types compromise good API design?

It seems to me that anytime I come across internal calls or types, it's like I hit a road block.
Even if they are accessible in code, as in open source, it still feels like they are not usable parts of the API itself, i.e. as if modifying them is discouraged.
Should one keep oneself from using the internal keyword unless it's absolutely necessary?
I am asking this for an open-source API. Still, not everyone will want to change the API; most will just use it to write their own code for the app itself.
There is nothing wrong with having an internal type in your DLL that is not a part of your public API. In fact, if you have anything other than a trivial DLL, it is more likely a sign of bad design if you don't have an internal type (or at least a non-public type).
Why? Public APIs are a way of exposing the parts of your object model you want a consumer to use. Having an API of entirely public types means that you want the consumer to see literally everything in your DLL.
Think of the versioning issues that come along with that stance. Changing literally anything in your object model is a breaking change. Having internal types allows you great flexibility in your model while avoiding breaking changes to your consumers.
Internal types are types that are explicitly meant to be kept out of the API. You should only mark things internal that you don't want people to see.
My guess is that you're coming across types that are internal, but would have been valuable additions to the public API. I've seen this in quite a few projects. That's a different issue, though - it's really the same issue as whether a private type should have been public.
In general, a good project should have internal or private types. They help implement the required feature set without bloating the public API. Keeping the public API as small as possible to provide the required feature set is part of what makes a library usable.
An API comprises its public types and members; anything else is an implementation detail.
That being said, I think that internal types can be very useful especially when you want to return interface types from your API and don't want to expose the concrete types that you have used to implement those interfaces. This gives the API designer a lot of flexibility.
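A minimal sketch of that pattern, with made-up names (IGreeter, DefaultGreeter, Greeters): the interface and the factory are public, the implementing class is internal.

```csharp
using System;

// Public contract: this is all a consumer of the library can see.
public interface IGreeter
{
    string Greet(string name);
}

// Implementation detail: can be renamed, split or replaced
// without it being a breaking change for consumers.
internal sealed class DefaultGreeter : IGreeter
{
    public string Greet(string name)
    {
        return "Hello, " + name + "!";
    }
}

// Public entry point that hands out the interface, never the concrete type.
public static class Greeters
{
    public static IGreeter Create()
    {
        return new DefaultGreeter();
    }
}
```

Consumers call Greeters.Create() and program against IGreeter; they cannot take a compile-time dependency on DefaultGreeter from another assembly.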

Why can reflection access protected/private member of class in C#?

Why can reflection access protected/private member of class in C#?
Is this not safe for the class, why is reflection given such power? Is this an anti-pattern?
Member accessibility is not a security feature. It is there to protect the programmer against himself or herself. It helps implementing encapsulation but it is by no means a security feature.
Reflection is tedious enough to use that people normally don't go out of their way to use it to access non-public members. It's also quite slow. Reflection is normally used in special cases only. However, nothing can protect completely against human stupidity: if someone wants to abuse reflection they can easily do it, but even without the reflection API they can achieve the same thing (if they're running in full trust, that is) if they are determined enough.
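To show what "going out of your way" looks like, here is a sketch that overwrites a private field. Counter and PrivateFieldDemo are hypothetical; the point is that you have to ask for non-public members explicitly with BindingFlags, and that the member is found by a string the compiler never checks.

```csharp
using System;
using System.Reflection;

// Hypothetical type for illustration only.
public class Counter
{
    private int count = 3;

    public int Count
    {
        get { return count; }
    }
}

public static class PrivateFieldDemo
{
    public static void Overwrite(Counter counter, int value)
    {
        // Private members are only returned when NonPublic is requested explicitly.
        FieldInfo field = typeof(Counter).GetField(
            "count", BindingFlags.Instance | BindingFlags.NonPublic);

        // GetField returns null if the field was renamed; nothing catches that at compile time.
        if (field == null) throw new MissingFieldException("Counter", "count");

        field.SetValue(counter, value);
    }
}
```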
This is necessary for scenarios such as remoting, serialization, materialization, etc. You shouldn't use it blindly, but note that these facilities have always been available in any system (essentially, by addressing the memory directly). Reflection simply formalises it, and places controls and checks in the way - which you aren't seeing because you are presumably running at "full trust", so you are already stronger than the system that is being protected.
If you try this in partial trust, you'll see much more control over the internal state.
Is it an anti-pattern?
Only if your code uses it inappropriately. For example, consider the following (valid for a WCF data-contract):
[DataMember]
private int foo;
public int Foo { get {return foo;} set {foo = value;} }
Is it incorrect for WCF to support this? I suspect not... there are multiple scenarios where you want to serialize something that isn't part of the public API, without having a separate DTO. Likewise, LINQ-to-SQL will materialize into private members if you so elect.
Reflection is absolutely necessary for a debugger. Imagine that you are stepping through your program and are unable to see the values of your private variables. That's probably the reason why reflection works in .NET and Java the way it does: to make debugging really easy.
If we didn't need debuggers, then I can imagine that reflection would be restricted more, in the spirit of OOP.

How do I safely use an obfuscator?

When I attempt to use Dotfuscator on my application, I get an application error when I run it.
Dotfuscator (and all obfuscators) are typically safe to run on an application, but they do occasionally cause problems. Without specific details of your problem, it's difficult to diagnose.
However, one common problem with obfuscators is when you mix them with reflection. Since you're changing the type names, but not strings, any time you try to reflect on objects with a specific string name, and use the reflection namespace to construct objects, you'll likely have problems.
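A sketch of why that combination breaks, using a made-up ReportJob class. The string literal is just data to an obfuscator, while typeof is compiled to a reference to the type itself.

```csharp
using System;

// Hypothetical class that an obfuscator might rename to something like "a".
public class ReportJob
{
}

public static class TypeLookup
{
    // Fragile: after renaming, no type called "ReportJob" exists any more,
    // so this returns null (the literal itself is left untouched).
    public static Type ByName()
    {
        return Type.GetType("ReportJob");
    }

    // Robust: the compiler emits a reference to the type, not its name,
    // so the lookup follows the type through any renaming.
    public static Type ByToken()
    {
        return typeof(ReportJob);
    }
}
```

Anything built on ByName, such as Activator.CreateInstance on the result, fails only at run time, which matches the "application error when I run it" symptom.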
Most of the problems I have encountered with obfuscation revolve around types that can't have their names changed, because something needs to reflect on them (your code or the runtime).
For example, if you have a class that is being used as a web service proxy, you can't safely obfuscate the class name:
public class MyWebServiceProxy : SoapHttpClientProtocol
{
}
Also, some obfuscators cannot handle generic methods and classes.
The trick is that you need to find these types and prevent the obfuscator from renaming them. This is done with the Obfuscation attribute:
[global::System.Reflection.Obfuscation(Exclude=true, Feature="renaming")]
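Applied to a class, that looks roughly like the sketch below. The proxy class here is an empty stand-in without the SOAP base class, and ObfuscationCheck is a made-up helper that only shows the attribute is ordinary metadata. Whether and how a given obfuscator honours it depends on the tool, so treat this as an assumption to check against your obfuscator's documentation.

```csharp
using System;
using System.Reflection;

// Stand-in for a type that something looks up by name at run time.
[Obfuscation(Exclude = true, Feature = "renaming")]
public class MyWebServiceProxy
{
}

public static class ObfuscationCheck
{
    // Reports whether a type carries an attribute excluding it from renaming.
    public static bool IsExcludedFromRenaming(Type type)
    {
        // The attribute may be applied more than once, so inspect all of them.
        object[] attributes = type.GetCustomAttributes(typeof(ObfuscationAttribute), false);
        foreach (ObfuscationAttribute attribute in attributes)
        {
            if (attribute.Exclude && attribute.Feature == "renaming") return true;
        }
        return false;
    }
}
```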
Another thing that can be a problem with obfuscators is serialization using BinaryFormatter, since it changes the field names. I have some users who use protobuf-net for serialization on their obfuscated code for this reason.

Justification for Reflection in C#

I have wondered about the appropriateness of reflection in C# code. For example I have written a function which iterates through the properties of a given source object and creates a new instance of a specified type, then copies the values of properties with the same name from one to the other. I created this to copy data from one auto-generated LINQ object to another in order to get around the lack of inheritance from multiple tables in LINQ.
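The function described above is not shown in the question, so the following is only a guess at its shape. PropertyCopier, CustomerRow and CustomerDto are invented names, and the matching rule (same name, assignable type, public getter and setter) is an assumption.

```csharp
using System;
using System.Reflection;

// Hypothetical source and target types for illustration only.
public class CustomerRow
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string RowVersion { get; set; }
}

public class CustomerDto
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Note { get; set; }
}

public static class PropertyCopier
{
    // Creates a TTarget and copies every public property that exists on both sides.
    public static TTarget CopyTo<TTarget>(object source) where TTarget : class, new()
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        var target = new TTarget();
        foreach (PropertyInfo from in source.GetType().GetProperties())
        {
            // Skip indexers and properties without a public getter.
            if (from.GetIndexParameters().Length > 0 || from.GetGetMethod() == null) continue;

            PropertyInfo to = typeof(TTarget).GetProperty(from.Name);

            // Skip properties that are missing, have no public setter, or have an incompatible type.
            if (to == null || to.GetSetMethod() == null) continue;
            if (!to.PropertyType.IsAssignableFrom(from.PropertyType)) continue;

            to.SetValue(target, from.GetValue(source, null), null);
        }
        return target;
    }
}
```

Properties that exist on only one side (RowVersion, Note) are silently ignored, which is convenient and also the reason a rename can go unnoticed until run time.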
However, I can't help but think code like this is really 'cheating', i.e. rather than using the provided language constructs to achieve a given end it allows you to circumvent them.
To what degree is this sort of code acceptable? What are the risks? What are legitimate uses of this approach?
Sometimes using reflection can be a bit of a hack, but a lot of the time it's simply the most fantastic code tool.
Look at the .Net property grid - anyone who's used Visual Studio will be familiar with it. You can point it at any object and it will produce a simple property editor. That uses reflection, in fact most of VS's toolbox does.
Look at unit tests - they're loaded by reflection (at least in NUnit and MSTest).
Reflection allows dynamic-style behaviour from static languages.
The one thing it really needs is duck typing - the C# compiler already supports this: you can foreach anything that looks like IEnumerable, whether it implements the interface or not. You can use the C#3 collection initializer syntax on any class that implements IEnumerable and has a method called Add.
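The foreach part can be sketched like this. Countdown is a made-up class that implements no interface at all; the compiler only looks for a public GetEnumerator() whose result has MoveNext() and Current.

```csharp
using System;

// Hypothetical class: note the absence of IEnumerable.
public class Countdown
{
    private readonly int start;

    public Countdown(int start)
    {
        this.start = start;
    }

    // foreach binds to this method by its shape, not through an interface.
    public Enumerator GetEnumerator()
    {
        return new Enumerator(start);
    }

    public class Enumerator
    {
        private int next;

        public Enumerator(int start)
        {
            next = start + 1;
        }

        public int Current
        {
            get { return next; }
        }

        // Steps down by one and stops once zero is reached.
        public bool MoveNext()
        {
            next--;
            return next > 0;
        }
    }
}

public static class CountdownDemo
{
    // Adds up the values produced by the countdown, e.g. 3 + 2 + 1.
    public static int Sum(Countdown countdown)
    {
        int total = 0;
        foreach (int n in countdown) total += n;
        return total;
    }
}
```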
Use reflection wherever you need dynamic-style behaviour - for instance you have a collection of objects and you want to check the same property on each.
The risks are similar to those of dynamic types - compile time errors become run time ones. Your code is not as 'safe' and you have to react accordingly.
The .Net reflection code is very quick, but not as fast as the explicit call would have been.
I agree, it gives me the "it works, but it feels like a hack" feeling. I try to avoid reflection whenever possible. I have been burned many times after refactoring code which had reflection in it. The code compiles fine, the tests even run, but under special circumstances (which the tests didn't cover) the program blows up at run time because of my refactoring of one of the objects the reflection code poked into.
Example 1: Reflection in an OR mapper: you change the name or the type of a property in your object model, and it blows up at run time.
Example 2: You are in a SOA shop. Web Services are completely decoupled (or so you think). They have their own set of generated proxy classes, but in the mapping you decide to save some time and you do this:
ExternalColor c = (ExternalColor)Enum.Parse(typeof(ExternalColor),
internalColor.ToString());
Under the covers this is also reflection but done by the .net framework itself. Now what happens if you decide to rename InternalColor.Grey to InternalColor.Gray? Everything looks ok, it builds fine, and even runs fine.. until the day some stupid user decides to use the color Gray... at which point the mapper will blow up.
Reflection is a wonderful tool that I could not live without. It can make programming much easier and faster.
For instance, I use reflection in my ORM layer to be able to assign properties with column values from tables. If it wasn't for reflection I would have had to create a copy class for each table/class mapping.
As for the external color exception above: the problem is not Enum.Parse, but that the coder did not catch the proper exception. Since a string is parsed, the coder should always assume that the string can contain an incorrect value.
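One way to write that mapping defensively is Enum.TryParse, which reports failure through its return value instead of throwing. The two enums below are cut-down stand-ins for the ones in the earlier example, shown after the rename so that the names no longer line up; ColorMapper is a made-up helper.

```csharp
using System;

// Stand-ins for the two enums, after InternalColor.Grey was renamed to Gray.
public enum InternalColor { Red, Gray }
public enum ExternalColor { Red, Grey }

public static class ColorMapper
{
    // Returns false instead of throwing when the name has no counterpart.
    public static bool TryMap(InternalColor color, out ExternalColor mapped)
    {
        // TryParse also accepts numeric strings, so IsDefined guards against
        // values that parse but are not named members of ExternalColor.
        return Enum.TryParse(color.ToString(), out mapped)
            && Enum.IsDefined(typeof(ExternalColor), mapped);
    }
}
```

The caller still has to decide what an unmapped color means, but the failure is now an explicit branch rather than an exception in production.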
The same problem applies to all advanced programming in .Net. "With great power, comes great responsibility". Using reflection gives you much power. But make sure that you know how to use it properly. There are dozens of examples on the web.
It may be just me, but the way I'd get into this is by creating a code generator - using reflection at runtime is a bit costly and untyped. Creating classes that would get generated according to your latest code and copy everything in a strongly typed manner would mean that you will catch these errors at build-time.
For instance, a generated class may look like this:
static class AtoBCopier
{
    public static B Copy(A item)
    {
        return new B() { Prop1 = item.Prop1, Prop2 = item.Prop2 };
    }
}
If either class doesn't have the properties or their types change, the code doesn't compile. Plus, there's a huge improvement in speed.
I recently used reflection in C# for finding implementations of a specific interface. I had written a simple batch-style interpreter that looked up "actions" for each step of the computation based on the class name. Reflecting the current namespace then pops up the right implementation of my IStep interface that can be Execute()ed. This way, adding new "actions" is as easy as creating a new derived class - no need to add it to a registry, or even worse: forgetting to add it to a registry...
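The original interpreter is not shown, so this is only a sketch of the lookup it describes. IStep comes from the text; its Execute signature, the two step classes and StepRegistry are assumptions.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Assumed shape of the interface; the real one may differ.
public interface IStep
{
    string Execute();
}

// Adding a new action means adding a class like these, nothing else.
public class CompileStep : IStep
{
    public string Execute() { return "compiled"; }
}

public class PackageStep : IStep
{
    public string Execute() { return "packaged"; }
}

public static class StepRegistry
{
    // Finds a concrete IStep implementation by class name and instantiates it.
    public static IStep Create(string className)
    {
        Type stepType = typeof(IStep).Assembly.GetTypes()
            .FirstOrDefault(t => t.Name == className
                              && t.IsClass
                              && !t.IsAbstract
                              && typeof(IStep).IsAssignableFrom(t));

        if (stepType == null)
            throw new ArgumentException("No step named '" + className + "'.", nameof(className));

        // Requires a public parameterless constructor on the step class.
        return (IStep)Activator.CreateInstance(stepType);
    }
}
```

StepRegistry.Create("PackageStep").Execute() runs the matching class; an unknown name fails with an ArgumentException at run time, which is the trade-off for not keeping a registry.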
Reflection makes it very easy to implement plugin architectures where plugin DLLs are automatically loaded at runtime (not explicitly linked at compile time).
These can be scanned for classes that implement/extend relevant interfaces/classes. Reflection can then be used to instantiate instances of these on demand.
