C#: why are there no automatically generated equals/gethashcode/==/!=? [closed]


Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I would like to know whether there is a specific reason why C# doesn't have automatically generated Equals, GetHashCode, operator ==, and operator != geared towards value comparison in reference types.
Explanation:
I do not see an easy way to quickly request a "compare the actual objects" operation for the values/contents of reference types. Coming from a C++ background, I have the impression that this is something the compiler should do automatically at the user's simple request.
The lack of that feature most likely indicates that it goes against the language's "design goal"/"vision"/"philosophy". So I would like to know why this functionality was deemed unimportant.
--original text--
As far as I can tell, Equals pretty much amounts to a few comparisons to null, an attempted cast, and a field-by-field comparison.
GetHashCode pretty much amounts to combining the hashes of all members using some operation (multiply with overflow, xor, anything).
As far as I can tell, this should be automated: either the methods should be generated by default, or there should be a simple way to request a default implementation. However, there is no such thing. Why?
As I understand it, it is either a massive technical oversight that has persisted for years, or some kind of language philosophy I'm not aware of.
So, what is the reason?

In order for a compiler/framework to usefully auto-generate equivalence-related methods, it would need to be able to distinguish two kinds of equivalence and multiple kinds of reference. For example, suppose Foo has a single field of type int[], and two instances of Foo hold references to different arrays holding the sequence {1,2,3}. Whether or not a comparison between references to those instances should report them equal would depend upon the purpose for which Foo holds the array reference and the purpose for which the references to Foo objects are held by the code requesting the comparison.
If neither array's contents will ever be altered, the two Foo instances should report each other as being permanently equivalent (and also presently equivalent); if the arrays can be modified, but only at the request of code holding references to the Foo instances, then the instances should report themselves as being presently equivalent, but not permanently equivalent [code which holds the only reference to a Foo instance, and never shares it or calls any of its mutating methods, can know that the state of that instance will never change even if the instance itself doesn't know that]. If references to the arrays are in the hands of outside code that might modify them, then the instances are not equivalent even though the arrays presently hold the same value.
Since the type system has no way of knowing what kind of comparison to do on int[] fields, there's no way it can generate a semantically meaningful equality override.
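To make the distinction concrete, here is a minimal sketch of the Foo example. The member names (Values, SameInstance, SameContentsNow) are made up for illustration and are not part of any framework API; the point is only that "the same instance" and "the same contents right now" are different questions, and the second answer can change later.

```csharp
using System;
using System.Linq;

// Hypothetical type from the answer: a class with a single int[] field.
public sealed class Foo
{
    public int[] Values { get; }
    public Foo(params int[] values) { Values = values; }
}

public static class FooEqualityDemo
{
    // Identity: true only when both variables refer to the same instance.
    public static bool SameInstance(Foo a, Foo b) => ReferenceEquals(a, b);

    // "Presently equivalent": the arrays hold the same sequence at this moment.
    public static bool SameContentsNow(Foo a, Foo b) => a.Values.SequenceEqual(b.Values);

    public static void Main()
    {
        var first = new Foo(1, 2, 3);
        var second = new Foo(1, 2, 3);

        Console.WriteLine(first.Equals(second));           // False (default reference equality)
        Console.WriteLine(SameContentsNow(first, second)); // True

        second.Values[0] = 99; // outside code mutates the shared array
        Console.WriteLine(SameContentsNow(first, second)); // False
    }
}
```

Which of the two helpers a generated Equals should call depends on who else can reach the arrays, and that is information the compiler does not have.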

For value types, Equals and GetHashCode are implemented for you automatically (though the default implementation may fall back to reflection, so a hand-written one is usually faster).
And for reference types, it's not clear whether you want to compare the contents or compare the references. I've used both. If I'm writing an immutable type, I probably want its Equals to compare its contents. For anything else, I probably want the default Equals implementation that only returns true if I compare an instance to itself (reference equality); comparing contents would be the wrong thing in this case.
So, for value types (which are defined by their contents), .NET gives you what you want (but not as performant as what you could write yourself). For reference types, you have to opt into content equality, since often that wouldn't be what you want.
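A small sketch of the three cases described above. The type names are hypothetical, and the hash combination in ImmutablePoint is just one common choice, not a recommendation from the answer.

```csharp
using System;

// Value type: the inherited ValueType.Equals compares the fields with no extra code.
public struct PointStruct
{
    public int X, Y;
}

// Reference type without an override: Equals means reference equality.
public class PointClass
{
    public int X, Y;
}

// Immutable reference type that opts into content equality by hand.
public sealed class ImmutablePoint : IEquatable<ImmutablePoint>
{
    public int X { get; }
    public int Y { get; }
    public ImmutablePoint(int x, int y) { X = x; Y = y; }

    public bool Equals(ImmutablePoint other) => !(other is null) && X == other.X && Y == other.Y;
    public override bool Equals(object obj) => Equals(obj as ImmutablePoint);
    public override int GetHashCode() => unchecked((X * 397) ^ Y);
}

public static class EqualityDemo
{
    public static void Main()
    {
        Console.WriteLine(new PointStruct { X = 1, Y = 2 }.Equals(new PointStruct { X = 1, Y = 2 })); // True
        Console.WriteLine(new PointClass { X = 1, Y = 2 }.Equals(new PointClass { X = 1, Y = 2 }));   // False
        Console.WriteLine(new ImmutablePoint(1, 2).Equals(new ImmutablePoint(1, 2)));                 // True
    }
}
```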

Related

Why does C++ allow functions that don't actually return a value? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
In C++, a function with a non-void return type without a return statement is allowed. So, the following code will compile:
std::string give_me_a_string()
{
}
In C#, however, such a method is not allowed. So, the following code will not compile:
public string GiveMeAString()
{
}
Why is this the case? What was the design rationale in these two languages?

C++ requires code to be "well-behaved" in order to be executed in a defined manner, but the language doesn't try to be smarter than the programmer – when a situation arises that could lead to undefined behaviour, the compiler is free to assume that such a situation can actually never happen at runtime, even though it cannot be proved via its static analysis.
Flowing off the end of a function is equivalent to a return with no value; this results in undefined behavior in a value-returning function.
Calling such a function is a legitimate action; only flowing off its end without providing a value is undefined. I'd say there are legitimate (and mostly legacy) reasons for permitting this, for example you might be calling a function that always throws an exception or performs longjmp (or does so conditionally but you know it always happens in this place, and [[noreturn]] only came in C++11).
This is a double-edged sword though, as while not having to provide a value in a situation you know cannot happen can be advantageous to further optimization of the code, you could also omit it by mistake, akin to reading from an uninitialized variable. There have been lots of mistakes like this in the past, so that's why modern compilers warn you about this, and sometimes also insert guards that make this somewhat manageable at runtime.
As an illustration, an overly optimizing compiler could assume that a function that never produces its return value actually never returns, and it could proceed with this reasoning up to the point of creating an empty main method instead of your code.
C#, on the other hand, has different design principles. It is meant to be compiled to intermediate code, not native code, and thus its definability rules must comply with the rules of the intermediate code. And CIL must be verifiable in order to be executed in some places, so a situation like flowing off the end of a function must be detected beforehand.
Another principle of C# is disallowing undefined behaviour in common cases. Since it is also younger than C++, it could assume that computers are efficient enough to support more powerful static analysis than was practical in the early days of C++. The compilers can afford to detect this situation, and since the CIL has to be verifiable, only two actions were viable: silently emit code that throws an exception (sort of assert false), or disallow this completely. Since C# also had the advantage of learning from C++'s lessons, the developers chose the latter option.
This still has its drawbacks – there are helper methods that are made to never return, and there is still no way to statically represent this in the language, so you have to use something like return default; after calling such methods, potentially confusing anyone who reads the code.
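The drawback mentioned last can be sketched as follows. Guard.Fail and Lookup.NameOf are hypothetical names invented for this illustration.

```csharp
using System;

public static class Guard
{
    // Hypothetical helper that always throws. Nothing in its signature tells
    // the compiler that control never comes back to the caller.
    public static void Fail(string message) => throw new InvalidOperationException(message);
}

public static class Lookup
{
    public static string NameOf(int code)
    {
        if (code == 1) return "one";

        Guard.Fail("unknown code " + code);
        return default; // never reached, but without it the compiler reports CS0161
    }

    public static void Main()
    {
        Console.WriteLine(NameOf(1)); // one
    }
}
```

Removing the return default; line makes the method fail to compile with "not all code paths return a value", even though the line can never execute.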

Why is there no inverse to object.ToString()? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
It seems like a good design decision that the System.Object class, and hence all classes, in .NET provide a ToString() method which, unsurprisingly, returns a string representation of the object. Additionally in C# this method is implemented for native types so that they integrate nicely with the type system.
This often comes in handy when user interaction is required. For example, objects can directly be held in GUI widgets like lists and are "automatically" displayed as text.
What is the rationale in the language design to not provide a similarly general object.FromString(string) method?
Other questions and their answers discuss possible objections, but I find them not convincing.
The parse could fail, while a conversion to string is always possible.
Well, that does not keep Parse() methods from existing, does it? If exception handling is considered an undesirable design, one could still define a TryParse() method whose standard implementation for System.Object simply returns false, but which is overridden for concrete types where it makes sense (e.g. the types where this method exists today anyway).
Alternatively, at a minimum it would be nice to have an IParseable interface which declares a ParseMe() or TryParse() method, along the lines of ICloneable.
Tim Schmelter's comment, "Roll your own": That works, of course. But I cannot write general code for native types or, say, IPAddress if I must parse the values; instead I have to resort to type introspection or write wrappers which implement a self-defined interface, which is either maintenance-unfriendly or tedious and error-prone.
Comment by Damien: An interface can only declare non-static functions for reasons discussed here by Eric Lippert. This is a very valid objection. A static TryParse() method cannot be specified in an interface. A virtual ParseMe(string) method though needs a dummy object, which is a kludge at best and impossible at worst (with RAII). I almost suspect that this is the main reason such an interface doesn't exist. Instead there is the elaborate type conversion framework, one of the alternatives mentioned as solutions to the "static interface" oxymoron.
But even given the objections listed, the absence of a general parsing facility in the type system or language appears to me as an awkward asymmetry, given that a general ToString() method exists and is extremely useful.
Was that ever discussed during language/CLR design?

It seems like a good design decision that the System.Object class, and hence all classes, in .NET provide a ToString() method
Maybe to you. It's always seemed like a really bad idea to me.
which, unsurprisingly, returns a string representation of the object.
Does it though? For the vast majority of types, ToString returns the name of the type. How is that a string representation of the object?
No, ToString was a bad design in the first place. It has no clear contract. There's no clear guidance on what its semantics should be, aside from having no side effects and producing a string.
Since ToString has no clear contract, there is practically nothing you can safely use it for except for debugger output. I mean really, think about it: when was the last time you called ToString on object in production code? I never have.
The better design therefore would have been methods static string ToString<T>(T) and static string ToString(object) on the Debug class. Those could have then produced "null" if the object is null, or done some reflection on T to determine if there is a debugger visualizer for that object, and so on.
So now let's consider the merits of your actual proposal, which is a general requirement that all objects be deserializable from string. Note that first, obviously this is not the inverse operation of ToString. The vast majority of implementations of ToString do not produce anything that you could use even in theory to reconstitute the object.
So is your proposal that ToString and FromString be inverses? That then requires that every object not just be "represented" as a string, but that it actually be round trip serializable to string.
Let's think of an example. I have an object representing a database table. Does ToString on that table now serialize the entire contents of the table? Does FromString deserialize it? Suppose the object is actually a wrapper around a connection that fetches the table on demand; what do we serialize and deserialize then? If the connection needs my password, does it put my password into the string?
Suppose I have an object that refers to another object, such that I cannot deserialize the first object without also having the second in hand. Is serialization recursive across objects? What about objects where the graph of references contains loops; how do we deal with those?
Serialization is difficult, and that's why there are entire libraries devoted to it. Making it a requirement that all types be serializable and deserializable is onerous.
Even supposing that we wanted to do so, why string of all things? Strings are a terrible serialization data type. They can't easily hold binary data, they have to be entirely present in memory at once, they can't be more than a billion characters tops, they have no structure to them, and so on. What you really want for serialization is a structured binary storage system.
But even given the objections listed, the absence of a general parsing facility in the type system or language appears to me as an awkward asymmetry, given that a general ToString() method exists and is extremely useful.
Those are two completely different things that have nothing to do with each other. One is a super hard problem best solved by libraries devoted to it, and the other is a trivial little debugging aid with no specification constraining its output.
Was that ever discussed during language/CLR design?
Was ToString ever discussed? Obviously it was; it got implemented. Was a generalized serialization library ever discussed? Obviously it was; it got implemented. I'm not sure what you're getting at here.

Why is there no inverse to object.ToString()?
Because object should hold the bare minimum functionality required by every object. Comparing for equality and converting to a string (for a lot of reasons) are two such things. Converting from a string isn't. The problem is: how should it convert? Using JSON? Binary? XML? Something else? There isn't one uniform way to convert from a string. Hence, this would unnecessarily bloat the object class.
Alternatively, at a minimum it would be nice to have an IParseable interface
There is: IXmlSerializable for example, or one of the many alternatives.
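The question itself points at the type conversion framework as the existing alternative. As a rough sketch of what "general" parsing looks like with it, the hypothetical helper below (GeneralParser.TryParse is not a framework method) asks System.ComponentModel.TypeDescriptor for a converter; it only works for types that have a registered TypeConverter able to read strings.

```csharp
using System;
using System.ComponentModel;

public static class GeneralParser
{
    // Hypothetical helper: looks up the registered TypeConverter for T and
    // asks it to convert from a culture-invariant string.
    public static bool TryParse<T>(string text, out T value)
    {
        TypeConverter converter = TypeDescriptor.GetConverter(typeof(T));
        if (converter.CanConvertFrom(typeof(string)))
        {
            try
            {
                value = (T)converter.ConvertFromInvariantString(text);
                return true;
            }
            catch (Exception)
            {
                // Converters signal malformed input by throwing; fall through.
            }
        }

        value = default;
        return false;
    }

    public static void Main()
    {
        Console.WriteLine(TryParse("42", out int number) && number == 42); // True
        Console.WriteLine(TryParse("abc", out int _));                     // False
    }
}
```

The try/catch is part of the cost: the converter API has no non-throwing variant, which is one reason it feels heavier than a plain TryParse method.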

Best way to pair a Pre-made Class and primitive type C# [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
Basically, I want to efficiently pair multiple bools and bytes (to store values) with a pre-made class (RenderTarget2D, if you must know).
Obviously, I can wrap this all in a class; however, there are situations where I will have many of these, and I would like to save memory where possible (i.e. use a struct).
I know it's bad behaviour to use a struct with reference variables, and I'd prefer not to just use separate variables to hold the information (I would rather pair it all together).
Essentially, I want a structure that holds a reference to a class, a bool, and a byte, and I want to create a 2D array of these to hold many (thus I am looking to mitigate memory usage).
Am I overlooking an obvious solution?

Understanding the question as:
You want something that holds a bool, a byte and an instance of the class RenderTarget2D.
If that is the case you can use Tuple<bool, byte, RenderTarget2D>.
Creating a custom class or struct is also a viable option. In fact there is a proposal for "C# 7" to include language native tuples (that will not be System.Tuple) and as currently written, they will be structs.
You may also want to consider that having a reference to the RenderTarget2D may prolong its lifespan.
Struct vs Class
The struct takes in memory (when compacted) the size of a bool plus the size of a byte plus the size of a reference (to RenderTarget2D). If you have an array of 600 by 600 (360000) of such structs, it takes 360000 times the size of the struct in memory.
If you use classes, the array will have 360000 references to the actual location of the data that in total takes at least as much as the array of structs.
So using the structs should take less memory...
But when you take a struct out of your data structure, you are actually making a copy. So each time you fetch an item from your array into a variable and read a property of that item, you are making a copy of the item and reading the property from that copy.
If you want to update it, you need to read it (which makes a copy, as mentioned above), edit it, and then put it back... and that copies the data into the array.
So, if memory is the main concern, use a struct. To quote Jon Skeet: "so long as you're aware of the consequences".
Using struct also means less RAM round trips. Not only because it avoids resolving the references, but also because the data is guaranteed to be close together. This allows for better performance because the CPU will load a chunk (or the totality) of the data structure in cache and since the code is not using references outside of that it will be able to keep it in cache instead of going to load another thing.
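The copy behaviour can be seen in a short sketch. Cell is a hypothetical struct, and its Target field is typed as object purely so the example compiles without the graphics library; in the real code it would be a RenderTarget2D. Note one detail that is an addition to the answer: elements of a plain array are variables, so a field can also be assigned in place, whereas the indexer of a collection such as List<T> hands back a copy.

```csharp
using System;

// Hypothetical cell type; object stands in for RenderTarget2D.
public struct Cell
{
    public object Target;
    public bool Flag;
    public byte Value;
}

public static class GridDemo
{
    public static void Main()
    {
        var grid = new Cell[600, 600];

        Cell copy = grid[0, 0];               // copies the struct out of the array
        copy.Value = 7;                       // changes only the local copy
        Console.WriteLine(grid[0, 0].Value);  // 0

        grid[0, 0] = copy;                    // writing back copies it into the array
        Console.WriteLine(grid[0, 0].Value);  // 7

        grid[1, 1].Flag = true;               // array elements can be edited in place
        Console.WriteLine(grid[1, 1].Flag);   // True
    }
}
```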

Why use int, float etc. instead of object? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I've been reading a book on C# and it explains that int, float, double etc. are "basic" types meaning that they store the information at the lowest level of the language, while a type such as 'object' puts information in the memory and then the program has to access this information there. I don't know exactly what this means as a beginner, though!
The book however does not explain what the difference is. Why would I use int or string or whatever instead of just object every time, as object is essentially any of these types?
How does it impact the performance of the program?

This is a very broad subject, and I'm afraid your question as currently stated is prone to being put on hold as such. This answer will barely scratch the surface. Try to educate yourself more, and then you'll be able to ask a more specific question.
Why would I use int or string or whatever instead of just object every time, as object is essentially any of these types?
Basically you use the appropriate types to store different types of information in order to execute useful operations on them.
Just as you don't "add" two objects, you can't get the substring of a number.
So:
int foo = 42;
int bar = 21;
int fooBar = foo + bar;
This won't work if you declared the variables as object. You can do an addition because the numeric types have mathematical operators defined on them, such as the + operator.
You can refer to an integer type as an object (or any type really, as in C# everything inherits from object):
object foo = 42;
However now you won't be able to add this foo to another number. It is said to be a boxed value type.
Where exactly these different types are stored is a different subject altogether, about which a lot has been written already. See for example Why are Value Types created on the Stack and Reference Types created on the Heap?. Also relevant is the difference between value types and reference types, as pointed out in the comments.
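A short sketch of what the boxed value means in practice. AddToBoxed is a hypothetical helper name used only for this illustration.

```csharp
using System;

public static class BoxingDemo
{
    // Unboxes the value with an explicit cast, then adds to it.
    // Throws InvalidCastException if the object does not hold an int.
    public static int AddToBoxed(object boxed, int amount) => (int)boxed + amount;

    public static void Main()
    {
        object foo = 42;                          // boxing: the int is wrapped in an object
        // int bad = foo + 21;                    // does not compile: + is not defined for object and int
        Console.WriteLine(AddToBoxed(foo, 21));   // 63
    }
}
```

The cast is checked at run time, so the error the compiler would have caught with an int variable turns into an exception that only appears when the code runs.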

C# is a strongly typed language, which means that the compiler checks that the types of the variables and methods that you use are always consistent. This is what prevents you from writing things like this:
void PrintOrder(Order order)
{
...
}
PrintOrder("Hello world");
because it would make no sense.
If you just use object everywhere, the compiler can't check anything. And anyway, it wouldn't let you access the members of the actual type, because it doesn't know that they exist. For instance, this works:
OrderPrinter printer = new OrderPrinter();
printer.PrintOrder(myOrder);
But this doesn't:
object printer = new OrderPrinter();
printer.PrintOrder(myOrder);
because there is no PrintOrder method defined in the class Object.
This can seem constraining if you come from a loosely-typed language, but you'll come to appreciate it, because it lets you detect lots of potential errors at compile time, rather than at runtime.

What the book is referring to is basically the difference between value types (int, float, struct, etc.) and reference types (string, object, etc.).
Value types store their content directly; for local variables this typically means memory on the stack, which is efficient. Reference types (almost anything that can have the null value) store the address where the data is. Reference types are allocated on the heap, which is less efficient than the stack because there is a cost to allocating and deallocating the memory used to store your data (and it is only deallocated by the garbage collector).
So if you use object every time, it will be slower to allocate the memory and slower to reclaim it.

C#: How should ToString() be implemented? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
The problems are:
GUI libraries like to use ToString as a default representation for classes. There it needs to be localized.
ToString is used for logging. There it should provide programming related information, is not translated and includes internal states like surrogate keys and enum values.
ToString is used by many string operations which take objects as arguments, for instance String.Format, when writing to streams. Depending on the context you expect something different.
ToString is too limited if there are many different representations of the same object, e.g. a long and a short form.
Because of the different usages, there are many different kinds of implementation. So they are too unreliable to be really useful.
How should ToString be implemented to be useful? When should ToString be used, when should it be avoided?
The .NET Framework documentation says:
This method returns a human-readable string that is culture-sensitive.
There is a similar question, but not the same.

It seems you have great expectations from a tiny little method :) As far as I know, it's not a good idea to use a general method in so many different contexts, especially when its behavior can differ from class to class.
Here are my suggestions:
1. Do not let GUI libraries use ToString() of your objects. Instead use more meaningful properties (almost all controls can be customized to show properties other than ToString), for example via DisplayMember.
2. When getting some information about an object (for logging or other usages), let somebody decide (another object or the object itself) what should be provided and how it should be displayed. (A strategy pattern may come in handy.)

Here is a nice article which explains Overriding System.Object.ToString() and Implementing IFormattable

It depends on the intended usage of your class.
Many classes don't have a natural string representation (e.g. a Form object). Then I would implement ToString as an informative method (Form text, size, and so on) useful when debugging.
If the class is meant to give information to the user then I would implement ToString as a default representation of the value. If you have a Vector object for instance, then ToString might return the vector as an X and Y coordinate. Here I would also add alternative methods if there are other ways to describe the class. So for the Vector I might add a method that returns a description as an angle and a length.
For debugging purposes you may also want to add the DebuggerDisplay attribute to your class. This tells how to display the class in the debugger, but it doesn't affect the string representation.
You may also want to consider making the value returned by ToString to be parseable so that you can create an object from a string representation. Like you can do with the Int32.Parse method.
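As a sketch of the Vector suggestions above, assuming a two-component vector: the "x;y" text format, the ToPolarString name, and the Parse method are all choices made for this example, not an established convention.

```csharp
using System;
using System.Globalization;

public struct Vector
{
    public double X { get; }
    public double Y { get; }
    public Vector(double x, double y) { X = x; Y = y; }

    // Default representation: the coordinates. The "R" format and the invariant
    // culture keep the text parseable regardless of the machine's locale.
    public override string ToString() =>
        X.ToString("R", CultureInfo.InvariantCulture) + ";" +
        Y.ToString("R", CultureInfo.InvariantCulture);

    // Alternative description as an angle (in radians) and a length.
    public string ToPolarString() =>
        string.Format(CultureInfo.InvariantCulture, "angle {0:F3} rad, length {1:F3}",
            Math.Atan2(Y, X), Math.Sqrt(X * X + Y * Y));

    // Inverse of ToString, in the spirit of Int32.Parse.
    public static Vector Parse(string text)
    {
        string[] parts = text.Split(';');
        if (parts.Length != 2) throw new FormatException("Expected \"x;y\".");
        return new Vector(
            double.Parse(parts[0], CultureInfo.InvariantCulture),
            double.Parse(parts[1], CultureInfo.InvariantCulture));
    }
}
```

With this, Vector.Parse(v.ToString()) gives back the same coordinates, while ToPolarString is for display only and is not meant to be parsed.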

Another wrinkle to consider is the tight integration between ToString and Visual Studio's debugger. The Watch window displays the result of ToString as the value of the expression, so if your method performs any lazy-loading, has any side-effects, or takes a long time, then you may see strange behavior or the debugger may appear to hang. Granted, these qualities are not the mark of a well designed ToString method, but they happen (e.g. a naive "fetch the translation from the database" implementation).
Consequently, I consider the default ToString method (without parameters) to be a Visual Studio debugging hook -- with the implication that it should not generally be overloaded for use by the program outside of a debugging context.
While those in the know leverage the debugging attributes (DebuggerTypeProxyAttribute, DebuggerDisplayAttribute, DebuggerBrowsableAttribute) to customize the debugger, many (including myself) generally consider the default output as generated by ToString and displayed in the Watch windows to be good enough.
I understand that this is a rather strict perspective -- writing off ToString as a debugger hook -- but I find that implementing IFormattable seems to be the more reliable and extensible route.
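A minimal sketch of the IFormattable route, which also covers the "long and short form" problem from the question. The Order type and the "S"/"L" format strings are invented for this example; only the IFormattable interface itself comes from the framework.

```csharp
using System;
using System.Globalization;

public sealed class Order : IFormattable
{
    public int Id { get; }
    public decimal Total { get; }
    public Order(int id, decimal total) { Id = id; Total = total; }

    // Parameterless ToString delegates to the short form.
    public override string ToString() => ToString("S", CultureInfo.CurrentCulture);

    // "S" = short form, "L" = long form; the provider controls number formatting.
    public string ToString(string format, IFormatProvider formatProvider)
    {
        if (string.IsNullOrEmpty(format)) format = "S";
        switch (format)
        {
            case "S":
                return "Order " + Id.ToString(formatProvider);
            case "L":
                return "Order " + Id.ToString(formatProvider) +
                       ", total " + Total.ToString("N2", formatProvider);
            default:
                throw new FormatException("Unknown format: " + format);
        }
    }
}

public static class OrderDemo
{
    public static void Main()
    {
        var order = new Order(7, 1234.5m);
        // String.Format passes the format string and provider through to IFormattable.
        Console.WriteLine(string.Format(CultureInfo.InvariantCulture, "{0:L}", order)); // Order 7, total 1,234.50
    }
}
```

The caller picks the representation and the culture, so the parameterless ToString can stay a simple default.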

Personally, I don't implement ToString that often. In many cases, it wouldn't make a whole lot of sense, since a type's main role may be to define behavior, not data. In other cases, it simply doesn't matter because no clients ever need it.
In any case, here are some cases where it makes sense (not an exhaustive list):
If the result of ToString could conceivably be parsed back into an instance of the type without data loss.
When the type has a simple (i.e. not complex) value.
When the main purpose of the type is to format data into text.
I don't agree that there is a conflict between the usage scenarios that you list. When display is the main purpose, ToString should provide a user-friendly text, but for logging (or rather, as you describe it, for tracing) I would say that you shouldn't be tracing a UI-specific element in any case, but rather an object whose purpose is to write detailed trace data.
So there is no conflict because it should not be the same type according to the Single Responsibility Principle.
Remember that you can always overload the ToString method if you need more control.
