Using generics instead of casting for AsEnumerable

Using generics instead of casting for AsEnumerable - c#

I am new to generics in C# and while reading a book stumbled upon an example:
var cars = from car in data.AsEnumerable()
where
car.Field<string>("Color") == "Red"
select new
{
ID = car.Field<int>("CarID"),
Make = car.Field<string>("Make")
};
The author says that car.Field<string>("Color") gives the additional compile-time checking comparing to (string)car["Color"]. But how does the compiler know that car.Field<string>("Color") is compilable for "Color" and not for "CarID"? Or there is some kind of another "additional compile-time checking" that I miss?

It doesn't give you any additional compile-time checking. If you use the wrong type, in both cases you'll get an exception during run-time.
But it can be useful to do additional stuff that simple cast can't. For example Field<int>("CarId") could call a method that converts the string in the field to an int.
And assuming you're talking about DataRow.Field<T>(), then, according to the documentation, it's useful mostly for dealing with null values and nullable types correctly.

The compiler doesn't know, your specifying that field "Color" is of type string. Internally, the Field<T>() method does it's magic to make that happen.
If you perform a cast ((string)car["Color"]) you could encounter a runtime exception if the field value cannot be converted to the destination type.
From memory, if you specify car.Field<string>("ColorID"), you will be able to safely convert int to string without any issues.

Specifically, the real benefit of car.Field<string>("Color") is that it encapsulates testing field values for equality with DBNull.Value, making your code cleaner and easier to read.
In your example, if the value of the "Color" field is null, the Field<T> extension method will return null, while car["Color"] will return DBNull.Value. You can't cast DBNull.Value to string, so the expression (string)car["Color"] raises an InvalidCastException in that case.
Before the development of the DataSetExtensions class, you would need a somewhat verbose expression to assign the value of that field to a string variable:
var color = DBNull.Value.Equals(car["Color"]) ? null : (string)car["Color"];
Another benefit of Field<T> is, as svick notes, that it enables you to work with nullable types and reference types using the same syntax.

Related

Difference between named field and named column in datatable

What is the difference bewteen the two below please?
var dtime = (DateTime)datatable[0]["SomeDateTime"];
var dtime = datatable[0].field<DateTime>("SomeDateTime");
EDIT
and this:
var dtime = Convert.ToDateTime(datatable[0]["SomeDateTime"]);

The difference is that the first uses an explicit cast by yourself whereas the second casts behind the scenes in the DataRow extension method Field.
So if the datetime-column can contain null-values you would use DateTime?:
DateTime? dtime = datatable.Rows[0].Field<DateTime?>("SomeDateTime");
So i would suggest to use Field. It's strongly typed(hides the cast) and supports nullable types. The explicit cast via (TypeName) is also less readable.

In the first line
var dtime = (DateTime)datatable[0]["SomeDateTime"];
you use the indexer property that returns an object. So you need to cast it yourself to the actual type you expect the object to be.
The second case
var dtime = datatable[0].Field<DateTime>("SomeDateTime");
calls the generic extension method Field<T> that tries to do the casting for you and returns the type you specified (so DateTime in your example).
You can inspect the different implementation in the reference source for the indexer and the Field<T>() extension method

The main difference between the examples given is the method of conversion
(Type) takes what ever type you have and forces it to become that type, erroring it not suitible
<Type> is passing in a generic type for the implementation of field to use in its typeless definition, which means that there is behind the scenes logic helping you convert
and Convert.ToType() is using the converter class to do the conversion, as its call required types to be compatible with the IConvertable interface this means someone has put some thought into making that type convertable which mean your likely to get a successful result
otherwise you are just using different means to access the named field (either the this property or the field method

Why is the default value of the string type null instead of an empty string?

It's quite annoying to test all my strings for null before I can safely apply methods like ToUpper(), StartWith() etc...
If the default value of string were the empty string, I would not have to test, and I would feel it to be more consistent with the other value types like int or double for example.
Additionally Nullable<String> would make sense.
So why did the designers of C# choose to use null as the default value of strings?
Note: This relates to this question, but is more focused on the why instead of what to do with it.

Why is the default value of the string type null instead of an empty
string?
Because string is a reference type and the default value for all reference types is null.
It's quite annoying to test all my strings for null before I can
safely apply methods like ToUpper(), StartWith() etc...
That is consistent with the behaviour of reference types. Before invoking their instance members, one should put a check in place for a null reference.
If the default value of string were the empty string, I would not have
to test, and I would feel it to be more consistent with the other
value types like int or double for example.
Assigning the default value to a specific reference type other than null would make it inconsistent.
Additionally Nullable<String> would make sense.
Nullable<T> works with the value types. Of note is the fact that Nullable was not introduced on the original .NET platform so there would have been a lot of broken code had they changed that rule.(Courtesy #jcolebrand)

Habib is right -- because string is a reference type.
But more importantly, you don't have to check for null each time you use it. You probably should throw a ArgumentNullException if someone passes your function a null reference, though.
Here's the thing -- the framework would throw a NullReferenceException for you anyway if you tried to call .ToUpper() on a string. Remember that this case still can happen even if you test your arguments for null since any property or method on the objects passed to your function as parameters may evaluate to null.
That being said, checking for empty strings or nulls is a common thing to do, so they provide String.IsNullOrEmpty() and String.IsNullOrWhiteSpace() for just this purpose.

You could write an extension method (for what it's worth):
public static string EmptyNull(this string str)
{
return str ?? "";
}
Now this works safely:
string str = null;
string upper = str.EmptyNull().ToUpper();

You could also use the following, as of C# 6.0
string myString = null;
string result = myString?.ToUpper();
The string result will be null.

Empty strings and nulls are fundamentally different. A null is an absence of a value and an empty string is a value that is empty.
The programming language making assumptions about the "value" of a variable, in this case an empty string, will be as good as initiazing the string with any other value that will not cause a null reference problem.
Also, if you pass the handle to that string variable to other parts of the application, then that code will have no ways of validating whether you have intentionally passed a blank value or you have forgotten to populate the value of that variable.
Another occasion where this would be a problem is when the string is a return value from some function. Since string is a reference type and can technically have a value as null and empty both, therefore the function can also technically return a null or empty (there is nothing to stop it from doing so). Now, since there are 2 notions of the "absence of a value", i.e an empty string and a null, all the code that consumes this function will have to do 2 checks. One for empty and the other for null.
In short, its always good to have only 1 representation for a single state. For a broader discussion on empty and nulls, see the links below.
https://softwareengineering.stackexchange.com/questions/32578/sql-empty-string-vs-null-value
NULL vs Empty when dealing with user input

Why the designers of C# chose to use null as the default value of
strings?
Because strings are reference types, reference types are default value is null. Variables of reference types store references to the actual data.
Let's use default keyword for this case;
string str = default(string);
str is a string, so it is a reference type, so default value is null.
int str = (default)(int);
str is an int, so it is a value type, so default value is zero.

The fundamental reason/problem is that the designers of the CLS specification (which defines how languages interact with .net) did not define a means by which class members could specify that they must be called directly, rather than via callvirt, without the caller performing a null-reference check; nor did it provide a meany of defining structures which would not be subject to "normal" boxing.
Had the CLS specification defined such a means, then it would be possible for .net to consistently follow the lead established by the Common Object Model (COM), under which a null string reference was considered semantically equivalent to an empty string, and for other user-defined immutable class types which are supposed to have value semantics to likewise define default values. Essentially, what would happen would be for each member of String, e.g. Length to be written as something like [InvokableOnNull()] int String Length { get { if (this==null) return 0; else return _Length;} }. This approach would have offered very nice semantics for things which should behave like values, but because of implementation issues need to be stored on the heap. The biggest difficulty with this approach is that the semantics of conversion between such types and Object could get a little murky.
An alternative approach would have been to allow the definition of special structure types which did not inherit from Object but instead had custom boxing and unboxing operations (which would convert to/from some other class type). Under such an approach, there would be a class type NullableString which behaves as string does now, and a custom-boxed struct type String, which would hold a single private field Value of type String. Attempting to convert a String to NullableString or Object would return Value if non-null, or String.Empty if null. Attempting to cast to String, a non-null reference to a NullableString instance would store the reference in Value (perhaps storing null if the length was zero); casting any other reference would throw an exception.
Even though strings have to be stored on the heap, there is conceptually no reason why they shouldn't behave like value types that have a non-null default value. Having them be stored as a "normal" structure which held a reference would have been efficient for code that used them as type "string", but would have added an extra layer of indirection and inefficiency when casting to "object". While I don't foresee .net adding either of the above features at this late date, perhaps designers of future frameworks might consider including them.

Because a string variable is a reference, not an instance.
Initializing it to Empty by default would have been possible but it would have introduced a lot of inconsistencies all over the board.

If the default value of string were the empty string, I would not have to test
Wrong! Changing the default value doesn't change the fact that it's a reference type and someone can still explicitly set the reference to be null.
Additionally Nullable<String> would make sense.
True point. It would make more sense to not allow null for any reference types, instead requiring Nullable<TheRefType> for that feature.
So why did the designers of C# choose to use null as the default value of strings?
Consistency with other reference types. Now, why allow null in reference types at all? Probably so that it feels like C, even though this is a questionable design decision in a language that also provides Nullable.

Perhaps if you'd use ?? operator when assigning your string variable, it might help you.
string str = SomeMethodThatReturnsaString() ?? "";
// if SomeMethodThatReturnsaString() returns a null value, "" is assigned to str.

A String is an immutable object which means when given a value, the old value doesn't get wiped out of memory, but remains in the old location, and the new value is put in a new location. So if the default value of String a was String.Empty, it would waste the String.Empty block in memory when it was given its first value.
Although it seems minuscule, it could turn into a problem when initializing a large array of strings with default values of String.Empty. Of course, you could always use the mutable StringBuilder class if this was going to be a problem.

Since string is a reference type and the default value for reference type is null.

Since you mentioned ToUpper(), and this usage is how I found this thread, I will share this shortcut (string ?? "").ToUpper():
private string _city;
public string City
{
get
{
return (this._city ?? "").ToUpper();
}
set
{
this._city = value;
}
}
Seems better than:
if(null != this._city)
{ this._city = this._city.ToUpper(); }

Maybe the string keyword confused you, as it looks exactly like any other value type declaration, but it is actually an alias to System.String as explained in this question.
Also the dark blue color in Visual Studio and the lowercase first letter may mislead into thinking it is a struct.

Nullable types did not come in until 2.0.
If nullable types had been made in the beginning of the language then string would have been non-nullable and string? would have been nullable. But they could not do this du to backward compatibility.
A lot of people talk about ref-type or not ref type, but string is an out of the ordinary class and solutions would have been found to make it possible.

What is the type of null literal?

Dear all, I wonder what is the type of null literal in C#?
In Java, the null literal is of the special null type:
There is also a special null type, the type of the expression null, which has no name. Because the null type has no name, it is impossible to declare a variable of the null type or to cast to the null type. The null reference is the only possible value of an expression of null type. The null reference can always be cast to any reference type.
In C++11, there is nullptr (the recommended version of the old buddy NULL), which is of type std::nullptr_t.
I searched MSDN about C#, but the specification doesn't seem to say anything about that.

According to the ECMA C# language specification:
9.4.4.6 The null literal:
The type of a null-literal is the null type (§11.2.7).
11.2.7 The null type:
The null literal (§9.4.4.6) evaluates to the null value, which is used
to denote a reference not pointing at any object or array, or the
absence of a value. The null type has a single value, which is the
null value. Hence an expression whose type is the null type can
evaluate only to the null value. There is no way to explicitly write
the null type and, therefore, no way to use it in a declared type.
Moreover, the null type can never be the type inferred for a type
parameter (§25.6.4)
So to answer your question, null is it's own type - the null type.
Although it's odd how it's not mentioned in the C# 4.0 language specification or the C# 3.0 language specification but is mentioned in the overview of C# 3.0, the ECMA C# language specification and the C# 2.0 language specification.

UPDATE: This question was the subject of my blog in July 2013. Thanks for the great question!
J.Kommer's answer is correct (and good on them for doing what was evidently a lot of spec digging!) but I thought I'd add a little historical perspective.
When Mads and I were sorting out the exact wording of various parts of the specification for C# 3.0 we realized that the "null type" was bizarre. It is a "type" with only one value. It is a "type" that Reflection knows nothing about. It is a "type" that doesn't have a name, that GetType never returns, that you can't specify as the type of a local variable or field or anything. In short, it really is a "type" that is there only to make the type system "complete", so that every compile-time expression has a type.
Except that C# already had expressions that had no type: method groups in C# 1.0, anonymous methods in C# 2.0 and lambdas in C# 3.0 all have no type. If all those things can have no type, we realized that "null" need not have a type either. Therefore we removed references to the useless "null type" in C# 3.0.
As an implementation detail, the Microsoft implementations of C# 1.0 through 5.0 all do have an internal object to represent the "null type". They also have objects to represent the non-existing types of lambdas, anonymous methods and method groups. This implementation choice has a number of pros and cons. On the pro side, the compiler can ask for the type of any expression and get an answer. On the con side, it means that sometimes bugs in the type analysis that really ought to have crashed the compiler instead cause semantic changes in programs. My favourite example of that is that it is possible in C# 2.0 to use the illegal expression "null ?? null"; due to a bug the compiler fails to flag it as an erroneous usage of the ?? operator, and goes on to infer that the type of this expression is "the null type", even though that is not a null literal. That then goes on to cause many other downstream bugs as the type analyzer tries to make sense of the type.
In Roslyn we will probably not use this strategy; rather, we'll simply bake into the compiler implementation that some expressions have no type.

Despite of having no runtime type, null can be cast to a type at compile time, as this example shows.
At runtime, you can find that variable stringAsObject holds a string, not only an object, but you cannot find any type for variables nullString and nullStringAsObject.
public enum Answer { Object, String, Int32, FileInfo };
private Answer GetAnswer(int i) { return Answer.Int32; }
private Answer GetAnswer(string s) { return Answer.String; }
private Answer GetAnswer(object o) { return Answer.Object; }
[TestMethod]
public void MusingAboutNullAtRuntimeVsCompileTime()
{
string nullString = null;
object nullStringAsObject = (string)null;
object stringAsObject = "a string";
// resolved at runtime
Expect.Throws(typeof(ArgumentNullException), () => Type.GetTypeHandle(nullString));
Expect.Throws(typeof(ArgumentNullException), () => Type.GetTypeHandle(nullStringAsObject));
Assert.AreEqual(typeof(string), Type.GetTypeFromHandle(Type.GetTypeHandle(stringAsObject)));
Assert.AreEqual(typeof(string), stringAsObject.GetType());
// resolved at compile time
Assert.AreEqual(Answer.String, this.GetAnswer(nullString));
Assert.AreEqual(Answer.Object, this.GetAnswer(nullStringAsObject));
Assert.AreEqual(Answer.Object, this.GetAnswer(stringAsObject));
Assert.AreEqual(Answer.Object, this.GetAnswer((object)null));
Assert.AreEqual(Answer.String, this.GetAnswer((string)null));
Assert.AreEqual(Answer.String, this.GetAnswer(null));
}
// Uncommenting the following method overload
// makes the last statement in the test case ambiguous to the compiler
// private Answer GetAnswer(FileInfo f) { return Answer.FileInfo; }

C# implicit conversions

I'm currently working on an application where I need to load data from an SQL database and then assign the retrieved values into the properties of an object. I'm doing this by using reflection since the property names and column names are the same. However, many of the properties are using a custom struct type that is basically a currency wrapper for the decimal type. I have defined an implicit conversion in my struct:
public static implicit operator Currency(decimal d)
{
return new Currency(d);
}
This works fine when I use it in code. However, when I have this:
foreach (PropertyInfo p in props)
{
p.SetValue(this, table.Rows[0][p.Name], null);
}
It throws an ArgumentException stating that it cannot convert from System.Decimal to Currency. I'm confused since it works fine in any other circumstance.

Unfortunately, these user-defined conversion operators are not used by the runtime; they are only used by the compiler at compile time. So if you take a strongly-typed decimal and assign it to a strongly-typed Currency, the compiler will insert a call to your conversion operator and everybody's happy. However, when you call SetValue as you're doing here, the runtime expects you to give it a value of the appropriate type; the runtime has no idea that this conversion operator exists, and won't ever call it.

I think you need to first unbox the value in table.Rows[0][p.Name] as a decimal.
In other words:
foreach (PropertyInfo p in props)
{
if (p.PropertyType == typeof(Currency))
{
Currency c = (decimal)table.Rows[0][p.Name];
p.SetValue(this, c, null);
}
else
{
p.SetValue(this, table.Rows[0][p.Name], null);
}
}
This is an issue I've seen once or twice before, so I actually decided to write a blog post about it. Anybody looking for a little more explanation, feel free to give it a read.

I assume that table is of type DataTable in your code, so the first indexer returns a DataRow, and the second one returns an object. Then PropertyInfo.SetValue also takes an object as the second argument. At no place in this code a cast happens in the code, which is why the overloaded conversion operator is not applied.
Generally speaking, it is only applied when static types are known (forgetting about dynamic in C# 4.0 for the moment). It is not applied when boxing and unboxing things. In this case, the indexer on DataRow boxes the value, and PropertyInfo.SetValue tries to unbox it to a different type - and fails.

Whilst I am not answering your problem, I think in this kind of situation it would be more appropriate to use a ORM Framework like the Entity Framework or NHibernate which will map your tables into your domain objects and handle all the conversions for you. Using something like reflection to figure out what fields to fill in a domain object is a slow way to do it.

is this explicit cast necessary when I know it isn't null?

I have the following:
MyEnum? maybeValue = GetValueOrNull();
if (null != maybeValue)
{
MyEnum value = (MyEnum)maybeValue;
}
What I want to know is if that explicit (MyEnum) cast is necessary on an instance of type MyEnum?. This seems like a simple question, I just felt paranoid that there could possibly be some runtime error if I just did MyEnum value = maybeValue within that if statement.

For a nullable type, you can do
if (maybeValue.HasValue)
{
MyEnum value = maybeValue.Value;
}

Since you're using nullable types, try using
if(maybeValue.HasValue)
{
MyEnum value = maybeValue.Value; // no cast needed! Yay!
}

A number of answers have noted that using the Value property avoids the cast. This is correct and it answers the question:
is that explicit (MyEnum) cast necessary on an instance of type "MyEnum?" ?
However, we haven't addressed the other concern:
I just felt paranoid that there could possibly be some runtime error if I just did MyEnum value = maybeValue within that if statement.
Well, first off, you cannot simply assign a nullable value to a variable of its underlying type. You have to do something to do the conversion explicitly.
However, if you do so in the manner you describe -- checking whether there is a value first -- then this is safe. (Provided of course that no one is mutating the variable containing the nullable value between the call to HasValue and the fetch of the value.)
If you use ILDASM or some other tool to examine the generated code you will discover that casting a nullable value to its underlying type is simply generated as an access to the Value property; using the cast operator or accessing the Value property is a difference which actually makes no difference at all. The Value property accessor will throw if HasValue is false.
Use whichever syntax you feel looks better. I personally would probably choose the "Value" syntax over the cast syntax because I think it reads better to say "if it has a value, gimme the value" than "if it has a value, convert it to its underlying type".

I believe that when you define a variable as a nullable type, it adds a .HasValue property and a .Value property. You should be able to use those to avoid any casting.
You can rewrite your code like this
MyEnum? maybeValue = GetValueOrNull();
if (maybeValue.HasValue == true)
{
MyEnum value = maybeValue.Value;
}

Another option would be to set up your enum to have a Default (0) value and then you would be able to just do the following:
MyEnum value = GetValueOrNull().GetValueOrDefault();
However, this would require that your code have knowledge of what the default value means and possibly handle it differently from what the other enum types do.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.