Why do unboxed types have methods? - C#

Why can you do things like
int i = 10;
i.ToString();
'c'.Equals('d');
1.ToString();
true.GetType();
in C#? Those things right there are either primitive, literal, unboxed, or any combination of those things; so why do they have methods? They are not objects and so should not have methods. Is this syntax sugar for something else? If so, what? I can understand having functions that do these things, for example:
string ToString(int number)
{
    // Do mad code to build the string representation.
    string newString = "" + number;
    return newString;
}
but in that case you would call it as a function, not a method:
string ranch = ToString(1);
What's going on here?
edit:
Just realised C# isn't a Java clone anymore and the rules are totally different. Oops :P

They act like that because the spec says so (and it's pretty nice):
1.28 Value types
A value type is either a struct type or an enumeration type. C# provides a set of predefined struct types called the simple types.
The simple types are identified through reserved words.
...
1.28.4 Simple types
C# provides a set of predefined struct types called the simple types. The simple types are identified through reserved words, but these reserved words are simply aliases for predefined struct types in the System namespace, as described in the table below.
...
Because a simple type aliases a struct type, every simple type has members. For example, int has the members declared in System.Int32 and the members inherited from System.Object, and the following statements are permitted:
int i = int.MaxValue; // System.Int32.MaxValue constant
string s = i.ToString(); // System.Int32.ToString() instance method
string t = 123.ToString(); // System.Int32.ToString() instance method
The simple types differ from other struct types in that they permit certain additional operations:
Most simple types permit values to be created by writing literals (§1.16.4). For example, 123 is a literal of type int and 'a' is a literal of type char. C# makes no provision for literals of struct types in general, and nondefault values of other struct types are ultimately always created through instance constructors of those struct types.
As the spec explains, simple types have some superpowers: the ability to be const, a special literal syntax that can be used instead of new, and the capacity to be computed at compile time (2+2 is actually written as 4 in the final MSIL stream).
But methods (as well as operators) aren't among those superpowers; all structs can have them.
The specification (for C# 4.0; my copy-paste is from an earlier version) can be downloaded from the Microsoft website: C# Language Specification 4.0
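To see that in action, here is a minimal sketch of an ordinary user-defined struct with methods; the Meters type is made up for illustration, not something from the spec:
struct Meters
{
    private readonly double value;

    public Meters(double value) { this.value = value; }

    // Plain instance methods, no different in kind from those on int or char.
    public Meters Add(Meters other) => new Meters(value + other.value);

    public override string ToString() => value + " m";
}
Calling new Meters(2).Add(new Meters(3)).ToString() yields "5 m" through exactly the same mechanism as 123.ToString().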

Eric Lippert's recent article Inheritance and Representation explains. (Spoiler: you are confusing inheritance and representation.)
Not sure why you claim that the integer i, the character 'c' and the integer 1 are not objects. They are.

In C# all primitive types are actually structures.

So that you can use them!
It's convenient to be able to do so, so you can.
Now, in order to do so, primitives can be treated as structs. E.g. a 32-bit integer can be processed as a 32-bit integer, but it can also be processed as public struct Int32 : IComparable, IFormattable, IConvertible, IComparable<int>, IEquatable<int>. We mostly get the best of both worlds.
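As a small sketch of that "best of both worlds" point (the variable names are illustrative), an int can be used both directly and through the interfaces Int32 implements:
using System;

class IntAsStruct
{
    static void Main()
    {
        int a = 3, b = 7;

        // int is System.Int32, a struct, so it has instance methods...
        Console.WriteLine(a.CompareTo(b)); // negative, because 3 < 7

        // ...and it can be used through the interfaces Int32 implements
        // (this assignment boxes the value).
        IComparable<int> comparable = a;
        Console.WriteLine(comparable.CompareTo(b));
    }
}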

Related

Design reasons behind making ToUpper a static method on Char

In C#, we have this non-static method on the type string:
"abc".ToUpper()
but for char, we need to use a static method:
char.ToUpper('a')
When C# is introduced to beginners, they always expect to be able to write the following:
'a'.ToUpper()
Does anyone have insights as to why it was designed like this?
The only thing I can think of is performance but then I would have expected a static ToUpper() for the type string too.
The difference lies in the fact that string is a reference type, while char is a keyword that represents the .NET Framework's Char structure. When you call Char.ToUpper('a') you are actually using the Char structure in C#. Structures are value types, and Char, like the other built-in simple types, is immutable.
Since Char is immutable, methods that would mutate the value itself cannot work as expected (see Why are Mutable Structs Evil), so a static method is used instead. When calling Char.ToUpper(aChar) you are not actually changing aChar; instead, you are creating a new character that is the uppercase representation of the one you passed in as a parameter and returning it. The example below demonstrates this.
Char aChar = 'a';
Char.ToUpper(aChar);
//aChar still equals 'a'
Char bChar = 'b';
bChar = Char.ToUpper(bChar);
//bChar now equals 'B'
The reason char has other methods that allow you to do things like 'a'.Equals('a'); is because value types and reference types both inherit from Object, which defines those methods (technically, value types derive from System.ValueType, which inherits from System.Object). These methods do not make any changes to the value itself.
Edit - Why this question is actually speculation
As I'm very curious to see if there's an actual answer to "why do chars not have a .ToUpper() method", I decided to check the C# 5 Language Specification document, and I found the following:
char is an Integral Type (pg 80), which is a subset of Simple Types. Simple Types themselves are just predefined Struct Types. Struct types are Value Types that "can declare constants, fields, methods, properties, indexers, operators, instance constructors, static constructors, and nested types" (pg 79).
string is a Class Type, which is a Reference Type (pg 85). Class Types define "a data structure that contains data members (constants and fields), function members (methods, properties, events, indexers, operators, instance constructors, destructors and static constructors), and nested types" (pg 84).
At this point, it is obvious that chars can support a .ToUpper() method (which is why the extension method works). However, as the question states, they do not support one. At this point I'm convinced any reasoning as to why this is true is pure speculation (unless you're on the C# team, of course).
Hans Passant mentioned that it is possible to achieve this syntax easily via extension methods. I'll provide the code here in case anyone is deeply attached to using that syntax.
public static class MyExtensionMethods
{
    public static char ToUpper(this char c)
    {
        return char.ToUpper(c);
    }
}
Then you can do:
'a'.ToUpper()
(sorry, not enough space in a comment -- I know this isn't a complete answer.)
This seems to be a pattern across all the primitive types; int, double, and bool, for example, also lack such helper methods (beyond basics like ToString() and Equals()). So it's not just char -- it's a property of all the primitive types that C# defines.
I would guess (and it is a guess) that any time you access data, you're either directly accessing bits of RAM -- the primitive values like int and char and byte -- or you're accessing a .NET construct like an object or struct. A char is always 2 bytes at a particular memory address. So the framework can treat it like a raw memory location.
If we try to treat the raw RAM as objects, you'll either have to 'box' everything to do any work, or it's just not possible. My guess is that you can't do core features like virtual method dispatch on primitives, and that the world of objects and the world of primitives have to be kept separate.
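One concrete way to observe the boundary this answer is guessing at (a sketch, assuming the usual C# compiler behavior): calling a method the value type overrides needs no boxing, while calling one only defined on System.Object boxes the value first.
using System;

class BoxingDemo
{
    static void Main()
    {
        int i = 42;

        // Int32 overrides ToString, so the call is made directly on the value.
        string s = i.ToString();

        // GetType is defined only on System.Object and cannot be overridden,
        // so the int is boxed into a heap object before the call.
        Type t = i.GetType();

        Console.WriteLine("{0} ({1})", s, t); // 42 (System.Int32)
    }
}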
Anyway, hope that advances the conversation on some level...

When to use String and string? [duplicate]

What are the differences between these two and which one should I use?
string s = "Hello world!";
String s = "Hello world!";
string is an alias in C# for System.String.
So technically, there is no difference. It's like int vs. System.Int32.
As far as guidelines, it's generally recommended to use string any time you're referring to an object.
e.g.
string place = "world";
Likewise, I think it's generally recommended to use String if you need to refer specifically to the class.
e.g.
string greet = String.Format("Hello {0}!", place);
This is the style that Microsoft tends to use in their examples.
It appears that the guidance in this area may have changed, as StyleCop now enforces the use of the C# specific aliases.
Just for the sake of completeness, here's a brain dump of related information...
As others have noted, string is an alias for System.String. Assuming your code using String compiles to System.String (i.e. you haven't got a using directive for some other namespace with a different String type), they compile to the same code, so at execution time there is no difference whatsoever. This is just one of the aliases in C#. The complete list is:
object: System.Object
string: System.String
bool: System.Boolean
byte: System.Byte
sbyte: System.SByte
short: System.Int16
ushort: System.UInt16
int: System.Int32
uint: System.UInt32
long: System.Int64
ulong: System.UInt64
float: System.Single
double: System.Double
decimal: System.Decimal
char: System.Char
Apart from string and object, the aliases are all to value types. decimal is a value type, but not a primitive type in the CLR. The only primitive type which doesn't have an alias is System.IntPtr.
In the spec, the value type aliases are known as "simple types". Literals can be used for constant values of every simple type; no other value types have literal forms available. (Compare this with VB, which allows DateTime literals, and has an alias for it too.)
There is one circumstance in which you have to use the aliases: when explicitly specifying an enum's underlying type. For instance:
public enum Foo : UInt32 {} // Invalid
public enum Bar : uint {} // Valid
That's just a matter of the way the spec defines enum declarations - the part after the colon has to be the integral-type production, which is one token of sbyte, byte, short, ushort, int, uint, long, ulong, char... as opposed to a type production as used by variable declarations for example. It doesn't indicate any other difference.
Finally, when it comes to which to use: personally I use the aliases everywhere for the implementation, but the CLR type for any APIs. It really doesn't matter too much which you use in terms of implementation - consistency among your team is nice, but no-one else is going to care. On the other hand, it's genuinely important that if you refer to a type in an API, you do so in a language-neutral way. A method called ReadInt32 is unambiguous, whereas a method called ReadInt requires interpretation. The caller could be using a language that defines an int alias for Int16, for example. The .NET framework designers have followed this pattern, good examples being in the BitConverter, BinaryReader and Convert classes.
String stands for System.String and it is a .NET Framework type. string is an alias in the C# language for System.String. Both of them are compiled to System.String in IL (Intermediate Language), so there is no difference. Choose what you like and use that. If you code in C#, I'd prefer string as it's a C# type alias and well-known by C# programmers.
I can say the same about (int, System.Int32) etc..
The best answer I have ever heard about using the provided type aliases in C# comes from Jeffrey Richter in his book CLR Via C#. Here are his 3 reasons:
I've seen a number of developers confused, not knowing whether to use string or String in their code. Because in C# the string (a keyword) maps exactly to System.String (an FCL type), there is no difference and either can be used.
In C#, long maps to System.Int64, but in a different programming language, long could map to an Int16 or Int32. In fact, C++/CLI treats long as an Int32. Someone reading source code in one language could easily misinterpret the code's intention if he or she were used to programming in a different programming language. In fact, most languages won't even treat long as a keyword and won't compile code that uses it.
The FCL has many methods that have type names as part of their method names. For example, the BinaryReader type offers methods such as ReadBoolean, ReadInt32, ReadSingle, and so on, and the System.Convert type offers methods such as ToBoolean, ToInt32, ToSingle, and so on. Although it's legal to write the following code, the line with float feels very unnatural to me, and it's not obvious that the line is correct:
BinaryReader br = new BinaryReader(...);
float val = br.ReadSingle(); // OK, but feels unnatural
Single val = br.ReadSingle(); // OK and feels good
So there you have it. I think these are all really good points. I however, don't find myself using Jeffrey's advice in my own code. Maybe I am too stuck in my C# world but I end up trying to make my code look like the framework code.
string is a reserved word, but String is just a class name.
This means that string cannot be used as a variable name by itself.
If for some reason you wanted a variable called string, you'd see only the first of these compiles:
StringBuilder String = new StringBuilder(); // compiles
StringBuilder string = new StringBuilder(); // doesn't compile
If you really want a variable named string you can use @ as a prefix:
StringBuilder @string = new StringBuilder();
Another critical difference: Stack Overflow highlights them differently.
There is one difference - you can't use String without using System; beforehand.
It's been covered above; however, when you look a type up by name via reflection (e.g. Type.GetType("System.String")), you must use the CLR name String -- the C# alias string won't be resolved.
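A minimal sketch of what that means in practice, assuming the usual Type.GetType name lookup:
using System;

class ReflectionNames
{
    static void Main()
    {
        // The runtime resolves CLR type names, not C# aliases.
        Console.WriteLine(Type.GetType("System.String") != null); // True
        Console.WriteLine(Type.GetType("string") == null);        // True: the alias is unknown
    }
}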
System.String is the .NET string class - in C# string is an alias for System.String - so in use they are the same.
As for guidelines I wouldn't get too bogged down and just use whichever you feel like - there are more important things in life and the code is going to be the same anyway.
If you find yourself building systems where it is necessary to specify the size of the integers you are using, and so tend to use Int16, Int32, UInt16, UInt32, etc., then it might look more natural to use String - and when moving around between different .NET languages it might make things more understandable - otherwise I would use string and int.
I prefer the capitalized .NET types (rather than the aliases) for formatting reasons. The .NET types are colored the same as other object types (the value types are proper objects, after all).
Conditional and control keywords (like if, switch, and return) are lowercase and colored dark blue (by default). And I would rather not have the disagreement in use and format.
Consider:
String someString;
string anotherString;
string and String are identical in all ways (except the uppercase "S"). There are no performance implications either way.
Lowercase string is preferred in most projects due to the syntax highlighting
This YouTube video demonstrates practically how they differ.
But now for a long textual answer.
When we talk about .NET there are two different things: there is the .NET Framework, and there are the languages (C#, VB.NET, etc.) which use that framework.
"System.String" a.k.a. "String" (capital "S") is a .NET Framework data type, while "string" is a C# data type.
In short, "string" is an alias (the same thing called with a different name) for "String". So technically both of the below code statements will give the same output.
String s = "I am String";
or
string s = "I am String";
In the same way, there are aliases for other C# data types as shown below:
object: System.Object, string: System.String, bool: System.Boolean, byte: System.Byte, sbyte: System.SByte, short: System.Int16 and so on.
Now the million-dollar question from a programmer's point of view: so when should you use "String" and when "string"?
First, to avoid confusion, use one of them consistently. But from a best-practices perspective, when you declare a variable it's good to use "string" (small "s"), and when you are using it as a class name then "String" (capital "S") is preferred.
In the code below, the left-hand side is a variable declaration and it is declared using "string". On the right-hand side, we are calling a static method, so "String" is more sensible.
string s = String.Concat("I am ", "String");
C# is a language which is used together with the CLR.
string is a type in C#.
System.String is a type in the CLR.
When you use C# together with the CLR string will be mapped to System.String.
Theoretically, you could implement a C#-compiler that generated Java bytecode. A sensible implementation of this compiler would probably map string to java.lang.String in order to interoperate with the Java runtime library.
Lower case string is an alias for System.String.
They are the same in C#.
There's a debate over whether you should use the System types (System.Int32, System.String, etc.) types or the C# aliases (int, string, etc). I personally believe you should use the C# aliases, but that's just my personal preference.
string is just an alias for System.String. The compiler will treat them identically.
The only practical difference is the syntax highlighting as you mention, and that you have to write using System if you use String.
Both are the same. But from a coding-guidelines perspective it's better to use string instead of String, as that is what developers generally use. For example, instead of using Int32 we use int, since int is an alias for Int32.
FYI
“The keyword string is simply an alias for the predefined class System.String.” - C# Language Specification 4.2.3
http://msdn2.microsoft.com/En-US/library/aa691153.aspx
As the others are saying, they're the same. StyleCop rules, by default, enforce the use of string as a C# code-style best practice, except when referencing System.String static functions, such as String.Format, String.Join, String.Concat, etc...
New answer after 6 years and 5 months (procrastination).
While string is a reserved C# keyword that always has a fixed meaning, String is just an ordinary identifier which could refer to anything. Depending on members of the current type, the current namespace and the applied using directives and their placement, String could be a value or a type distinct from global::System.String.
I shall provide two examples where using directives will not help.
First, when String is a value of the current type (or a local variable):
class MySequence<TElement>
{
    public IEnumerable<TElement> String { get; set; }

    void Example()
    {
        var test = String.Format("Hello {0}.", DateTime.Today.DayOfWeek);
    }
}
The above will not compile because IEnumerable<> does not have a non-static member called Format, and no extension methods apply. In the above case, it may still be possible to use String in other contexts where a type is the only possibility syntactically. For example String local = "Hi mum!"; could be OK (depending on namespace and using directives).
Worse: Saying String.Concat(someSequence) will likely (depending on usings) go to the Linq extension method Enumerable.Concat. It will not go to the static method string.Concat.
Secondly, when String is another type, nested inside the current type:
class MyPiano
{
    protected class String
    {
    }

    void Example()
    {
        var test1 = String.Format("Hello {0}.", DateTime.Today.DayOfWeek);
        String test2 = "Goodbye";
    }
}
Neither statement in the Example method compiles. Here String is always a piano string, MyPiano.String. No member (static or not) Format exists on it (or is inherited from its base class). And the value "Goodbye" cannot be converted into it.
Using System types makes it easier to port between C# and VB.Net, if you are into that sort of thing.
Against what seems to be common practice among other programmers, I prefer String over string, just to highlight the fact that String is a reference type, as Jon Skeet mentioned.
string is an alias (or shorthand) of System.String. That means, by typing string we mean System.String. You can read more at this link: 'string' is an alias/shorthand of System.String.
I'd just like to add this to lfoust's answer, from Richter's book:
The C# language specification states, “As a matter of style, use of the keyword is favored over use of the complete system type name.” I disagree with the language specification; I prefer to use the FCL type names and completely avoid the primitive type names. In fact, I wish that compilers didn’t even offer the primitive type names and forced developers to use the FCL type names instead. Here are my reasons:
I’ve seen a number of developers confused, not knowing whether to use string or String in their code. Because in C# string (a keyword) maps exactly to System.String (an FCL type), there is no difference and either can be used. Similarly, I’ve heard some developers say that int represents a 32-bit integer when the application is running on a 32-bit OS and that it represents a 64-bit integer when the application is running on a 64-bit OS. This statement is absolutely false: in C#, an int always maps to System.Int32, and therefore it represents a 32-bit integer regardless of the OS the code is running on. If programmers would use Int32 in their code, then this potential confusion is also eliminated.
In C#, long maps to System.Int64, but in a different programming language, long could map to an Int16 or Int32. In fact, C++/CLI does treat long as an Int32. Someone reading source code in one language could easily misinterpret the code’s intention if he or she were used to programming in a different programming language. In fact, most languages won’t even treat long as a keyword and won’t compile code that uses it.
The FCL has many methods that have type names as part of their method names. For example, the BinaryReader type offers methods such as ReadBoolean, ReadInt32, ReadSingle, and so on, and the System.Convert type offers methods such as ToBoolean, ToInt32, ToSingle, and so on. Although it’s legal to write the following code, the line with float feels very unnatural to me, and it’s not obvious that the line is correct:
BinaryReader br = new BinaryReader(...);
float val = br.ReadSingle(); // OK, but feels unnatural
Single val = br.ReadSingle(); // OK and feels good
Many programmers that use C# exclusively tend to forget that other programming languages can be used against the CLR, and because of this, C#-isms creep into the class library code. For example, Microsoft’s FCL is almost exclusively written in C# and developers on the FCL team have now introduced methods into the library such as Array’s GetLongLength, which returns an Int64 value that is a long in C# but not in other languages (like C++/CLI). Another example is System.Linq.Enumerable’s LongCount method.
I didn't get his opinion before I read the complete paragraph.
String (System.String) is a class in the base class library. string (lower case) is a reserved word in C# that is an alias for System.String. Int32 vs. int is a similar situation, as is Boolean vs. bool. These C#-specific keywords enable you to declare primitives in a style similar to C.
It's a matter of convention, really. string just looks more like C/C++ style. The general convention is to use whatever shortcuts your chosen language has provided (e.g. int for Int32). This goes for "object" and decimal as well.
Theoretically this could help to port code into some future 64-bit standard in which "int" might mean Int64, but that's not the point, and I would expect any upgrade wizard to change any int references to Int32 anyway just to be safe.
@JaredPar (a developer on the C# compiler and prolific SO user!) wrote a great blog post on this issue. I think it is worth sharing here. It is a nice perspective on our subject.
string vs. String is not a style debate
[...]
The keyword string has concrete meaning in C#. It is the type System.String which exists in the core runtime assembly. The runtime intrinsically understands this type and provides the capabilities developers expect for strings in .NET. Its presence is so critical to C# that if that type doesn’t exist the compiler will exit before attempting to even parse a line of code. Hence string has a precise, unambiguous meaning in C# code.
The identifier String though has no concrete meaning in C#. It is an identifier that goes through all the name lookup rules as Widget, Student, etc … It could bind to string or it could bind to a type in another assembly entirely whose purposes may be entirely different than string. Worse it could be defined in a way such that code like String s = "hello"; continued to compile.
class TricksterString {
    void Example() {
        String s = "Hello World"; // Okay but probably not what you expect.
    }
}

class String {
    public static implicit operator String(string s) => null;
}
The actual meaning of String will always depend on name resolution. That means it depends on all the source files in the project and all the types defined in all the referenced assemblies. In short it requires quite a bit of context to know what it means.
True that in the vast majority of cases String and string will bind to the same type. But using String still means developers are leaving their program up to interpretation in places where there is only one correct answer. When String does bind to the wrong type it can leave developers debugging for hours, filing bugs on the compiler team, and generally wasting time that could’ve been saved by using string.
Another way to visualize the difference is with this sample:
string s1 = 42; // Errors 100% of the time
String s2 = 42; // Might error, might not, depends on the code
Many will argue that while this information is technically accurate, using String is still fine because it's exceedingly rare that a codebase would define a type of this name. Or that when String is defined, it's a sign of a bad codebase.
[...]
You’ll see that String is defined for a number of completely valid purposes: reflection helpers, serialization libraries, lexers, protocols, etc … For any of these libraries String vs. string has real consequences depending on where the code is used.
So remember when you see the String vs. string debate this is about semantics, not style. Choosing string gives crisp meaning to your codebase. Choosing String isn’t wrong but it’s leaving the door open for surprises in the future.
Note: I copy/pasted most of the blog post for archival reasons. I skipped some parts, so I recommend reading the full blog post if you can.
String is not a keyword, so it can be used as an identifier, whereas string is a keyword and cannot. From a functional point of view both are the same.
Coming late to the party: I use the CLR types 100% of the time (well, except if forced to use the C# type, but I don't remember when the last time that was).
I originally started doing this years ago, as per the CLR books by Richter. It made sense to me that all CLR languages ultimately have to be able to support the set of CLR types, so using the CLR types yourself provided clearer, and possibly more "reusable", code.
Now that I've been doing it for years, it's a habit and I like the coloration that VS shows for the CLR types.
The only real downer is that auto-complete uses the C# type, so I end up re-typing automatically generated types to specify the CLR type instead.
Also, now, when I see "int" or "string", it just looks really wrong to me, like I'm looking at 1970's C code.
There is no difference.
The C# keyword string maps to the .NET type System.String - it is an alias that keeps to the naming conventions of the language.
Similarly, int maps to System.Int32.
There's a quote on this issue from Daniel Solis' book.
All the predefined types are mapped directly to underlying .NET types. The C# type names (string) are simply aliases for the .NET types (String or System.String), so using the .NET names works fine syntactically, although this is discouraged. Within a C# program, you should use the C# names rather than the .NET names.
Yes, there's no difference between them, just like bool and Boolean.
string is a keyword, and you can't use string as an identifier.
String is not a keyword, and you can use it as an identifier:
Example
string String = "I am a string";
The keyword string is an alias for System.String. Aside from the keyword issue, the two are exactly equivalent.
typeof(string) == typeof(String) == typeof(System.String)

How are the "primitive" types defined non-recursively?

Since a struct in C# consists of the bits of its members, you cannot have a value type T which includes any T fields:
// Struct member 'T.m_field' of type 'T' causes a cycle in the struct layout
struct T { T m_field; }
My understanding is that an instance of the above type could never be instantiated*—any attempt to do so would result in an infinite loop of instantiation/allocation (which I guess would cause a stack overflow?**)—or, alternately, another way of looking at it might be that the definition itself just doesn't make sense; perhaps it's a self-defeating entity, sort of like "This statement is false."
Curiously, though, if you run this code:
BindingFlags privateInstance = BindingFlags.NonPublic | BindingFlags.Instance;

// Give me all the private instance fields of the int type.
FieldInfo[] int32Fields = typeof(int).GetFields(privateInstance);

foreach (FieldInfo field in int32Fields)
{
    Console.WriteLine("{0} ({1})", field.Name, field.FieldType);
}
...you will get the following output:
m_value (System.Int32)
It seems we are being "lied" to here***. Obviously I understand that the primitive types like int, double, etc. must be defined in some special way deep down in the bowels of C# (you cannot define every possible unit within a system in terms of that system... can you?—different topic, regardless!); I'm just interested to know what's going on here.
How does the System.Int32 type (for example) actually account for the storage of a 32-bit integer? More generally, how can a value type (as a definition of a kind of value) include a field whose type is itself? It just seems like turtles all the way down.
Black magic?
*On a separate note: is this the right word for a value type ("instantiated")? I feel like it carries "reference-like" connotations; but maybe that's just me. Also, I feel like I may have asked this question before—if so, I forget what people answered.
**Both Martin v. Löwis and Eric Lippert have pointed out that this is neither entirely accurate nor an appropriate perspective on the issue. See their answers for more info.
***OK, I realize nobody's actually lying. I didn't mean to imply that I thought this was false; my suspicion had been that it was somehow an oversimplification. After coming to understand (I think) thecoop's answer, it makes a lot more sense to me.
As far as I know, within a field signature that is stored in an assembly, there are certain hardcoded byte patterns representing the 'core' primitive types - the signed/unsigned integers, and floats (as well as strings, which are reference types and a special case). The CLR knows natively how to deal with those. Check out Partition II, section 23.2.12 of the CLR spec for the bit patterns of the signatures.
Within each primitive struct ([mscorlib]System.Int32, [mscorlib]System.Single etc) in the BCL is a single field of that native type, and because a struct is exactly the same size as its constituent fields, each primitive struct is the same bit pattern as its native type in memory, and so can be interpreted as either, by the CLR, C# compiler, or libraries using those types.
From C#, int, double etc are synonyms of the mscorlib structs, which each have their primitive field of a type that is natively recognised by the CLR.
(There's an extra complication here, in that the CLR spec specifies that any types that have a 'short form' (the native CLR types) always have to be encoded as that short form (int32), rather than valuetype [mscorlib]System.Int32. So the C# compiler knows about the primitive types as well, but I'm not sure of the exact semantics and special-casing that goes on in the C# compiler and CLR for, say, method calls on primitive structs)
So, due to Gödel's incompleteness theorem, there has to be something 'outside' the system by which it can be defined. This is the Magic that lets the CLR interpret 4 bytes as a native int32 or as an instance of [mscorlib]System.Int32, which C# aliases as int.
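A minimal sketch of the size point above; the MyInt32 wrapper is made up for illustration, while the real System.Int32 relies on the special-casing just described:
using System;
using System.Runtime.InteropServices;

// A user-defined single-field wrapper, similar in shape to Int32's m_value field.
struct MyInt32
{
    private readonly int m_value;

    public MyInt32(int value) { m_value = value; }

    public override string ToString() => m_value.ToString();
}

class SizeDemo
{
    static void Main()
    {
        // A struct is exactly the size of its constituent fields:
        // the wrapper occupies the same 4 bytes as the int it contains.
        Console.WriteLine(Marshal.SizeOf(typeof(int)));     // 4
        Console.WriteLine(Marshal.SizeOf(typeof(MyInt32))); // 4
    }
}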
My understanding is that an instance of the above type could never be instantiated any attempt to do so would result in an infinite loop of instantiation/allocation (which I guess would cause a stack overflow?)—or, alternately, another way of looking at it might be that the definition itself just doesn't make sense;
That's not the best way of characterizing the situation. A better way to look at it is that the size of every struct must be well-defined. An attempt to determine the size of T goes into an infinite loop, and therefore the size of T is not well-defined. Therefore, it's not a legal struct because every struct must have a well-defined size.
It seems we are being lied to here
There's no lie. An int is a struct that contains a field of type int. An int is of known size; it is by definition four bytes. Therefore it is a legal struct, because the size of all its fields is known.
How does the System.Int32 type (for example) actually store a 32-bit integer value
The type doesn't do anything. The type is just an abstract concept. The thing that does the storage is the CLR, and it does so by allocating four bytes of space on the heap, on the stack, or in registers. How else do you suppose a four-byte integer would be stored, if not in four bytes of memory?
how does the System.Type object referenced with typeof(int) present itself as though this value is itself an everyday instance field typed as System.Int32?
That's just an object, written in code like any other object. There's nothing special about it. You call methods on it, it returns more objects, just like every other object in the world. Why do you think there's something special about it?
Three remarks, in addition to thecoop's answer:
Your assertion that recursive structs inherently couldn't work is not entirely correct. It's more like a statement "this statement is true": which is true if it is. It's plausible to have a type T whose only member is of type T: such an instance might consume 0 bytes, for example (since its only member consumes 0 bytes). Recursive value types only stop working if you have a second member (which is why they are disallowed).
Take a look at Mono's definition of Int32. As you can see, it actually is a type containing itself (since int is just an alias for Int32 in C#). There is certainly "black magic" involved (i.e. special-casing), as the comments explain: the runtime will look up the field by name and just expect that it's there - I also assume that the C# compiler will special-case the presence of int here.
In PE assemblies, type information is represented through "type signature blobs". These are sequences of type declarations, e.g. for method signatures, but also for fields. The list of available primitive types in such a signature is defined in section 22.1.15 of the CLR specification; a copy of the allowed values is in the CorElementType enumeration. Apparently, the reflection API maps these primitive types to their corresponding System.XYZ valuetypes.

c# enums: can they take members and functions like java enums?

In Java it is possible to give an enum a constructor as well as member variables and functions.
I was wondering if something like this is possible in C# enums as well. If so, how?
Thanks a lot!
The only way to do something similar to this is to use extension methods, which can make it appear as though the enumeration has member methods.
Other than that, you could create a companion struct type to your enumeration that has a property for the enumeration value and then adds additional properties and methods to support that value.
It is not. In Java, enumerations are classes, while in C# an enumeration is just syntactic sugar wrapping a primitive type.
Enums are strongly typed constants. They are essentially unique types that allow you to assign symbolic names to integral values. In the C# tradition, they are strongly typed, meaning that an enum of one type may not be implicitly assigned to an enum of another type even though the underlying values of their members are the same. Along the same lines, integral types and enums are not implicitly interchangeable. All assignments between different enum types and integral types require an explicit cast.
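A short sketch of those casting rules; the enum names here are illustrative:
using System;

enum Color { Red = 1, Green = 2 }
enum Fruit { Apple = 1 }

class CastDemo
{
    static void Main()
    {
        Color c = Color.Red;

        int i = (int)c;     // enum -> integral type: explicit cast required
        Fruit f = (Fruit)i; // integral type -> enum: also explicit
        // Fruit g = c;     // compile error: no conversion between enum types

        Console.WriteLine("{0} {1}", i, f); // 1 Apple
    }
}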
You can't use member variables or constructors in an enum. Maybe what you are looking for is a struct.
A struct type is a value type that can contain constructors, constants, fields, methods, properties, indexers, operators, events, and nested types.
You could imitate the Java type-safe enum pattern (which was so common in Java before enum was introduced in Java 5 to address it); see the sketch after this answer.
See Item 21 here (warning, PDF link) for a description.
You would do this if the object functionality was more important than the switch functionality, since in C# you can get the type safety without it (which you couldn't in Java before 5).
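For illustration, a minimal sketch of that type-safe enum pattern in C# (the Suit type and its members are made up, not taken from the linked Item 21):
public sealed class Suit
{
    public static readonly Suit Hearts = new Suit("Hearts", isRed: true);
    public static readonly Suit Spades = new Suit("Spades", isRed: false);

    public string Name { get; }
    public bool IsRed { get; }

    // Private constructor: the only instances are the static fields above,
    // which gives type safety plus member data and methods.
    private Suit(string name, bool isRed)
    {
        Name = name;
        IsRed = isRed;
    }

    public override string ToString() => Name;
}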
One thing I've always loved to do is to use the Description attribute on my enums, so I can easily store three values per member (the name, the numeric value, and the description):
public enum States
{
    [Description("Florida")]
    FL = 124
}
And then I had a class that easily reads to/from the Description attribute, so I could store a whole database code table in an enum file. Aside from all the posters bringing up extension methods, you could use attributes to drive logic with your enum classes.
You would still need to leverage another class to actually do something with the enum, but you could use attributes to add more depth to your enum instead of just having the key/value pair that it basically is.
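A minimal sketch of such a reader, assuming the standard System.ComponentModel.DescriptionAttribute; the helper name GetDescription is made up:
using System;
using System.ComponentModel;
using System.Reflection;

public static class EnumDescriptions
{
    // Returns the [Description] text of an enum member,
    // or the member's name if no attribute is present.
    public static string GetDescription(Enum value)
    {
        FieldInfo field = value.GetType().GetField(value.ToString());
        var attribute = (DescriptionAttribute)Attribute.GetCustomAttribute(
            field, typeof(DescriptionAttribute));
        return attribute != null ? attribute.Description : value.ToString();
    }
}
For the States enum above, EnumDescriptions.GetDescription(States.FL) would return "Florida".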
Not directly, but you can use extension methods to provide similar functionality.
This is not possible in C#. Enums can only have name / value members.
As far as I am aware, no, you can't in C#. Although why would you want to? It seems a bit of an odd thing to attach variables and functions to!
You can define extension methods for enum types, but you can't add state to enums, since enums are represented as simple integer types internally, and there'd be nowhere to store the state.
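As a sketch of that approach (the enum and method are made up for illustration), the extension method supplies behavior while the enum itself stays a plain integral constant:
using System;

public enum Planet { Mercury, Venus, Earth }

public static class PlanetExtensions
{
    // Enables Planet.Earth.SurfaceGravity() syntax; the data lives in the
    // method, not in the enum, since enums cannot carry state.
    public static double SurfaceGravity(this Planet planet)
    {
        switch (planet)
        {
            case Planet.Mercury: return 3.7;
            case Planet.Venus: return 8.87;
            case Planet.Earth: return 9.81;
            default: throw new ArgumentOutOfRangeException(nameof(planet));
        }
    }
}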
