An issue came up on another forum and I knew how to fix it, but it revealed a feature of the compiler peculiar to me. The person was getting the error "Embedded statement cannot be a declaration or labeled statement" because they had a declaration of a variable following an if statement with no brackets. That was not their intent, but they had commented out the line of code immediately following the if statement, which made the variable declaration the de facto line of code to execute. Anyway, that's the background, which brings me to this.
The following code is illegal
if (true)
int i = 7;
However, if you wrap that in brackets, it's all legal.
if (true)
{
int i = 7;
}
Neither piece of code is useful. Yet the second one is OK. What specifically is the explanation for this behavior?
The C# language specification distinguishes between three types of statements (see chapter 8 for more details). In general you can have these statements:
labeled-statement - my guess that this is for the old-fashioned goto statement
declaration-statement - which would be a variable declaration
embedded-statement - which includes pretty much all the remaining statements
In the if statement the body has to be embedded-statement, which explains why the first version of the code doesn't work. Here is the syntax of if from the specification (section 8.7.1):
if ( boolean-expression ) embedded-statement
if ( boolean-expression ) embedded-statement else embedded-statement
A variable declaration is declaration-statement, so it cannot appear in the body. If you enclose the declaration in brackets, you'll get a statement block, which is an embedded-statement (and so it can appear in that position).
When you don't include the brackets, it executes the next line as if it were surrounded by brackets. Since it doesn't make much sense to declare a variable in that line (you wouldn't be able to use it ever), the C# compiler will not allow this to prevent you from accidentally doing it without knowing (which might introduce subtle bugs).
Here's part of Eric Lippert has to say about the C# compiler on this SO answer about name resolution:
...C# is not a "guess what the user
meant" language...the compiler by
design complains loudly if the best
match is something that doesn't work
All compilers will allow you to compile code that is useless or of exceedingly low use. There are simply too many ways that a developer can use the language to create constructs with no use. Having the compiler catch all of them is simply too much effort and typically not worth it.
The second case is called out directly in the C# language specification at the start of section 8.0
The example results in a compile-time error because an if statement requires an embedded-statement rather than a statement for its if branch. If this code were permitted, then the variable i would be declared, but it could never be used. Note, however, that by placing i’s declaration in a block, the example is valid.
Example Code
void F(bool b) {
if (b)
int i = 44;
}
Adding the closing and opening braces on the else part of the if helped me like i have done below as opposed to what i was doing before adding them;
Before: this caused the error:
protected void btnAdd_Click(object sender, EventArgs e)
{
if (btnAdd.Text == "ADD")
{
CATEGORY cat = new CATEGORY
{
NAME = tbxCategory.Text.Trim(),
TOTALSALEVALUE = tbxSaleValue.Text.Trim(),
PROFIT = tbxProfit.Text.Trim()
};
dm.AddCategory(cat, tbxCategory.Text.Trim());
}
else
// missing brackets - this was causing the error
var c = getCategory();
c.NAME = tbxCategory.Text.Trim();
c.TOTALSALEVALUE = tbxSaleValue.Text.Trim();
c.PROFIT = tbxProfit.Text.Trim();
dm.UpdateCategory(c);
btnSearchCat_Click(btnSearchCat, e);
}
After: Added brackets in the else branch
protected void btnAdd_Click(object sender, EventArgs e)
{
if (btnAdd.Text == "ADD")
{
CATEGORY cat = new CATEGORY
{
NAME = tbxCategory.Text.Trim(),
TOTALSALEVALUE = tbxSaleValue.Text.Trim(),
PROFIT = tbxProfit.Text.Trim()
};
dm.AddCategory(cat, tbxCategory.Text.Trim());
}
else
{
var c = getCategory();
c.NAME = tbxCategory.Text.Trim();
c.TOTALSALEVALUE = tbxSaleValue.Text.Trim();
c.PROFIT = tbxProfit.Text.Trim();
dm.UpdateCategory(c);
}
btnSearchCat_Click(btnSearchCat, e);
}
Related
While going through new C# 7.0 features, I stuck up with discard feature. It says:
Discards are local variables which you can assign but cannot read
from. i.e. they are “write-only” local variables.
and, then, an example follows:
if (bool.TryParse("TRUE", out bool _))
What is real use case when this will be beneficial? I mean what if I would have defined it in normal way, say:
if (bool.TryParse("TRUE", out bool isOK))
The discards are basically a way to intentionally ignore local variables which are irrelevant for the purposes of the code being produced. It's like when you call a method that returns a value but, since you are interested only in the underlying operations it performs, you don't assign its output to a local variable defined in the caller method, for example:
public static void Main(string[] args)
{
// I want to modify the records but I'm not interested
// in knowing how many of them have been modified.
ModifyRecords();
}
public static Int32 ModifyRecords()
{
Int32 affectedRecords = 0;
for (Int32 i = 0; i < s_Records.Count; ++i)
{
Record r = s_Records[i];
if (String.IsNullOrWhiteSpace(r.Name))
{
r.Name = "Default Name";
++affectedRecords;
}
}
return affectedRecords;
}
Actually, I would call it a cosmetic feature... in the sense that it's a design time feature (the computations concerning the discarded variables are performed anyway) that helps keeping the code clear, readable and easy to maintain.
I find the example shown in the link you provided kinda misleading. If I try to parse a String as a Boolean, chances are I want to use the parsed value somewhere in my code. Otherwise I would just try to see if the String corresponds to the text representation of a Boolean (a regular expression, for example... even a simple if statement could do the job if casing is properly handled). I'm far from saying that this never happens or that it's a bad practice, I'm just saying it's not the most common coding pattern you may need to produce.
The example provided in this article, on the opposite, really shows the full potential of this feature:
public static void Main()
{
var (_, _, _, pop1, _, pop2) = QueryCityDataForYears("New York City", 1960, 2010);
Console.WriteLine($"Population change, 1960 to 2010: {pop2 - pop1:N0}");
}
private static (string, double, int, int, int, int) QueryCityDataForYears(string name, int year1, int year2)
{
int population1 = 0, population2 = 0;
double area = 0;
if (name == "New York City")
{
area = 468.48;
if (year1 == 1960) {
population1 = 7781984;
}
if (year2 == 2010) {
population2 = 8175133;
}
return (name, area, year1, population1, year2, population2);
}
return ("", 0, 0, 0, 0, 0);
}
From what I can see reading the above code, it seems that the discards have a higher sinergy with other paradigms introduced in the most recent versions of C# like tuples deconstruction.
For Matlab programmers, discards are far from being a new concept because the programming language implements them since very, very, very long time (probably since the beginning, but I can't say for sure). The official documentation describes them as follows (link here):
Request all three possible outputs from the fileparts function:
helpFile = which('help');
[helpPath,name,ext] = fileparts('C:\Path\data.txt');
The current workspace now contains three variables from fileparts: helpPath, name, and ext. In this case, the variables are small. However, some functions return results that use much more memory. If you do not need those variables, they waste space on your system.
Ignore the first output using a tilde (~):
[~,name,ext] = fileparts(helpFile);
The only difference is that, in Matlab, inner computations for discarded outputs are normally skipped because output arguments are flexible and you can know how many and which one of them have been requested by the caller.
I have seen discards used mainly against methods which return Task<T> but you don't want to await the output.
So in the example below, we don't want to await the output of SomeOtherMethod() so we could do something like this:
//myClass.cs
public async Task<bool> Example() => await SomeOtherMethod()
// example.cs
Example();
Except this will generate the following warning:
CS4014 Because this call is not awaited, execution of the
current method continues before the call is completed. Consider
applying the 'await' operator to the result of the call.
To mitigate this warning and essentially ensure the compiler that we know what we are doing, you can use a discard:
//myClass.cs
public async Task<bool> Example() => await SomeOtherMethod()
// example.cs
_ = Example();
No more warnings.
To add another use case to the above answers.
You can use a discard in conjunction with a null coalescing operator to do a nice one-line null check at the start of your functions:
_ = myParam ?? throw new MyException();
Many times I've done code along these lines:
TextBox.BackColor = int32.TryParse(TextBox.Text, out int32 _) ? Color.LightGreen : Color.Pink;
Note that this would be part of a larger collection of data, not a standalone thing. The idea is to provide immediate feedback on the validity of each field of the data they are entering.
I use light green and pink rather than the green and red one would expect--the latter colors are dark enough that the text becomes a bit hard to read and the meaning of the lighter versions is still totally obvious.
(In some cases I also have a Color.Yellow to flag something which is not valid but neither is it totally invalid. Say the parser will accept fractions and the field currently contains "2 1". That could be part of "2 1/2" so it's not garbage, but neither is it valid.)
Discard pattern can be used with a switch expression as well.
string result = shape switch
{
Rectangule r => $"Rectangule",
Circle c => $"Circle",
_ => "Unknown Shape"
};
For a list of patterns with discards refer to this article: Discards.
Consider this:
5 + 7;
This "statement" performs an evaluation but is not assigned to something. It will be immediately highlighted with the CS error code CS0201.
// Only assignment, call, increment, decrement, and new object expressions can be used as a statement
https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/compiler-messages/cs0201?f1url=%3FappId%3Droslyn%26k%3Dk(CS0201)
A discard variable used here will not change the fact that it is an unused expression, rather it will appear to the compiler, to you, and others reviewing your code that it was intentionally unused.
_ = 5 + 7; //acceptable
It can also be used in lambda expressions when having unused parameters:
builder.Services.AddSingleton<ICommandDispatcher>(_ => dispatcher);
I don't understand the use case of var patterns in C#7. MSDN:
A pattern match with the var pattern always succeeds. Its syntax is
expr is var varname
where the value of expr is always assigned to a local variable named
varname. varname is a static variable of the same type as expr.
The example on MSDN is pretty useless in my opinion, especially because the if is redundant:
object[] items = { new Book("The Tempest"), new Person("John") };
foreach (var item in items) {
if (item is var obj)
Console.WriteLine($"Type: {obj.GetType().Name}, Value: {obj}");
}
Here i don't see any benefits, you could have the same if you access the loop variable item directly which is also of type Object. The if is confusing as well because it's never false.
I could use var otherItem = item or use item diectly.
Can someone explain the use case better?
The var pattern was very frequently discussed in the C# language repository given that it’s not perfectly clear what its use case is and given the fact that is var x does not perform a null check while is T x does, making it appear rather useless.
However, it is actually not meant to be used as obj is var x. It is meant to be used when the left hand side is not a variable on its own.
Here are some examples from the specification. They all use features that are not in C# yet but this just shows that the introduction of the var pattern was primarly made in preparation for those things, so they won’t have to touch it again later.
The following example declares a function Deriv to construct the derivative of a function using structural pattern matching on an expression tree:
Expr Deriv(Expr e)
{
switch (e) {
// …
case Const(_): return Const(0);
case Add(var Left, var Right):
return Add(Deriv(Left), Deriv(Right));
// …
}
Here, the var pattern can be used inside the structures to “pull out” elements from the structure. Similarly, the following example simplifies an expression:
Expr Simplify(Expr e)
{
switch (e) {
case Mult(Const(0), _): return Const(0);
// …
case Add(Const(0), var x): return Simplify(x);
}
}
As gafter writes here, the idea is also to have property pattern matching, allowing the following:
if (o is Point {X is 3, Y is var y})
{ … }
Without checking the design notes on Github I'd guess this was added more for consistency with switch and as a stepping stone for more advanced pattern matching cases,
From the original What’s New in C# 7.0 post :
Var patterns of the form var x (where x is an identifier), which always match, and simply put the value of the input into a fresh variable x with the same type as the input.
And the recent dissection post by Sergey Teplyakov :
if you know what exactly is going on you may find this pattern useful. It can be used for introducing a temporary variable inside the expression:
This pattern essentially creates a temporary variable using the actual type of the object.
public void VarPattern(IEnumerable<string> s)
{
if (s.FirstOrDefault(o => o != null) is var v
&& int.TryParse(v, out var n))
{
Console.WriteLine(n);
}
}
The warning righ before that snippet is also significant:
It is not clear why the behavior is different in the Release mode only. But I think all the issues falls into the same bucket: the initial implementation of the feature is suboptimal. But based on this comment by Neal Gafter, this is going to change: "The pattern-matching lowering code is being rewritten from scratch (to support recursive patterns, too). I expect most of the improvements you seek here will come for "free" in the new code. But it will be some time before that rewrite is ready for prime time.".
According to Christian Nagel :
The advantage is that the variable declared with the var keyword is of the real type of the object,
Only thing I can think of offhand is if you find that you've written two identical blocks of code (in say a single switch), one for expr is object a and the other for expr is null.
You can combine the blocks by switching to expr is var a.
It may also be useful in code generation scenarios where, for whatever reason, you've already written yourself into a corner and always expect to generate a pattern match but now want to issue a "match all" pattern.
In most cases it is true, the var pattern benefit is not clear, and can even be a bad idea. However as a way of capturing anonymous types in temp variable it works great.
Hopefully this example can illustrate this:
Note below, adding a null case avoids var to ever be null, and no null check is required.
var sample = new(int id, string name, int age)[] {
(1, "jonas", 50),
(2, "frank", 48) };
var f48 = from s in sample
where s.age == 48
select new { Name = s.name, Age = s.age };
switch(f48.FirstOrDefault())
{
case var choosen when choosen.Name == "frank":
WriteLine(choosen.Age);
break;
case null:
WriteLine("not found");
break;
}
class Program
{
static void Main(string[] args)
{
var x = new Program();
Console.Write(x.Text);
Console.Write(x.Num);
//Console.Write(x.Num);//line A
}
private string Text_;
public string Text
{
get
{
return Text_ ?? (Text_ = "hello");//line B
}
}
private int? Num_;
public int Num
{
get
{
return (int)(Num_ ?? (Num_ = 42));//line C
}
}
}
I'm using Visual Studio 2010 to get code coverage results. It shows that line B is totally covered while line C is partially covered. I would expect that line B is partially covered as well instead of being totally covered. Why is it the case that the code coverage results shows line B as being totally covered?
To demonstrate that it's working "correctly" for Num property, uncomment line A and run the coverage. It should show line C as totally being covered.
When I rewrite the code to a more verbose form (see below), it works correctly and reports that Text_ is partially covered. I prefer to use the former for its brevity, and would like to know if the two forms are equivalent too. Thanks in advance.
if (Text_ != null)
{
return Text_;
}
else
{
return Text_ = "hello";
}
The ?? on Nullable<T> expands to something more complex at the compiler, i.e.
a ?? b
is really
a.HasValue ? a.GetValueOrDefault() : b
Now, since a was null/empty for the only time it was executed, the a.GetValueOrDefault() has never been called. The underlying code when using a string (or any flat reference) is simpler.
In reality, just call it twice to make this go away:
Console.Write(x.Text); // first call; performs init
Console.Write(x.Text); // test once initialized
Console.Write(x.Num); // first call; performs init
Console.Write(x.Num); // test once initialized
Without going into the IL generated by this line I'd suspect that the compiler has done some internal optimisation that causes your "line B" to be regarded as a single statement.
The principal difference between the two statements is that line B refers to an object, whereas line C uses a shorthand reference to Num_.Value.
In general, it's definitely not worth getting stressed about "100% code coverage" -- as this example indicates, sometimes the instrumentation is wrong, anyway. What's important is that most of the code is covered, at least the principal execution paths through business logic.
Coverage is no replacement for code reviews by another human.
This one's really an offshoot of this question, but I think it deserves its own answer.
According to section 15.13 of the ECMA-334 (on the using statement, below referred to as resource-acquisition):
Local variables declared in a
resource-acquisition are read-only, and shall include an initializer. A
compile-time error occurs if the
embedded statement attempts to modify
these local variables (via assignment
or the ++ and -- operators) or
pass them as ref or out
parameters.
This seems to explain why the code below is illegal.
struct Mutable : IDisposable
{
public int Field;
public void SetField(int value) { Field = value; }
public void Dispose() { }
}
using (var m = new Mutable())
{
// This results in a compiler error.
m.Field = 10;
}
But what about this?
using (var e = new Mutable())
{
// This is doing exactly the same thing, but it compiles and runs just fine.
e.SetField(10);
}
Is the above snippet undefined and/or illegal in C#? If it's legal, what is the relationship between this code and the excerpt from the spec above? If it's illegal, why does it work? Is there some subtle loophole that permits it, or is the fact that it works attributable only to mere luck (so that one shouldn't ever rely on the functionality of such seemingly harmless-looking code)?
I would read the standard in such a way that
using( var m = new Mutable() )
{
m = new Mutable();
}
is forbidden - with reason that seem obious.
Why for the struct Mutable it is not allowed beats me. Because for a class the code is legal and compiles fine...(object type i know..)
Also I do not see a reason why changing the contents of the value type does endanger the RA. Someone care to explain?
Maybe someone doing the syntx checking just misread the standard ;-)
Mario
I suspect the reason it compiles and runs is that SetField(int) is a function call, not an assignment or ref or out parameter call. The compiler has no way of knowing (in general) whether SetField(int) is going to mutate the variable or not.
This appears completely legal according to the spec.
And consider the alternatives. Static analysis to determine whether a given function call is going to mutate a value is clearly cost prohibitive in the C# compiler. The spec is designed to avoid that situation in all cases.
The other alternative would be for C# to not allow any method calls on value type variables declared in a using statement. That might not be a bad idea, since implementing IDisposable on a struct is just asking for trouble anyway. But when the C# language was first developed, I think they had high hopes for using structs in lots of interesting ways (as the GetEnumerator() example that you originally used demonstrates).
To sum it up
struct Mutable : IDisposable
{
public int Field;
public void SetField( int value ) { Field = value; }
public void Dispose() { }
}
class Program
{
protected static readonly Mutable xxx = new Mutable();
static void Main( string[] args )
{
//not allowed by compiler
//xxx.Field = 10;
xxx.SetField( 10 );
//prints out 0 !!!! <--- I do think that this is pretty bad
System.Console.Out.WriteLine( xxx.Field );
using ( var m = new Mutable() )
{
// This results in a compiler error.
//m.Field = 10;
m.SetField( 10 );
//This prints out 10 !!!
System.Console.Out.WriteLine( m.Field );
}
System.Console.In.ReadLine();
}
So in contrast to what I wrote above, I would recommend to NOT use a function to modify a struct within a using block. This seems wo work, but may stop to work in the future.
Mario
This behavior is undefined. In The C# Programming language at the end of the C# 4.0 spec section 7.6.4 (Member Access) Peter Sestoft states:
The two bulleted points stating "if the field is readonly...then
the result is a value" have a slightly surprising effect when the
field has a struct type, and that struct type has a mutable field (not
a recommended combination--see other annotations on this point).
He provides an example. I created my own example which displays more detail below.
Then, he goes on to say:
Somewhat strangely, if instead s were a local variable of struct type
declared in a using statement, which also has the effect of making s
immutable, then s.SetX() updates s.x as expected.
Here we see one of the authors acknowledge that this behavior is inconsistent. Per section 7.6.4, readonly fields are treated as values and do not change (copies change). Because section 8.13 tells us using statements treat resources as read-only:
the resource variable is read-only in the embedded statement,
resources in using statements should behave like readonly fields. Per the rules of 7.6.4 we should be dealing with a value not a variable. But surprisingly, the original value of the resource does change as demonstrated in this example:
//Sections relate to C# 4.0 spec
class Test
{
readonly S readonlyS = new S();
static void Main()
{
Test test = new Test();
test.readonlyS.SetX();//valid we are incrementing the value of a copy of readonlyS. This is per the rules defined in 7.6.4
Console.WriteLine(test.readonlyS.x);//outputs 0 because readonlyS is a value not a variable
//test.readonlyS.x = 0;//invalid
using (S s = new S())
{
s.SetX();//valid, changes the original value.
Console.WriteLine(s.x);//Surprisingly...outputs 2. Although S is supposed to be a readonly field...the behavior diverges.
//s.x = 0;//invalid
}
}
}
struct S : IDisposable
{
public int x;
public void SetX()
{
x = 2;
}
public void Dispose()
{
}
}
The situation is bizarre. Bottom line, avoid creating readonly mutable fields.
I was checking some of the code that make up LINQ extensions in Reflector, and this is the kind of code I come across:
private bool MoveNext()
{
bool flag;
try
{
switch (this.<>1__state)
{
case 0:
this.<>1__state = -1;
this.<set>5__7b = new Set<TSource>(this.comparer);
this.<>7__wrap7d = this.source.GetEnumerator();
this.<>1__state = 1;
goto Label_0092;
case 2:
this.<>1__state = 1;
goto Label_0092;
default:
goto Label_00A5;
}
Label_0050:
this.<element>5__7c = this.<>7__wrap7d.Current;
if (this.<set>5__7b.Add(this.<element>5__7c))
{
this.<>2__current = this.<element>5__7c;
this.<>1__state = 2;
return true;
}
Label_0092:
if (this.<>7__wrap7d.MoveNext())
{
goto Label_0050;
}
this.<>m__Finally7e();
Label_00A5:
flag = false;
}
fault
{
this.System.IDisposable.Dispose();
}
return flag;
}
Was there a reason for Microsoft to write it this way?
Also what does the <> syntax mean, in lines like:
switch (this.<>1__state)
I have never seen it written before a variable, only after.
The MSIL is still valid 2.x code and the <> names you're seeing are auto generated by the C# 3.x compilers.
For example:
public void AttachEvents()
{
_ctl.Click += (sender,e) => MessageBox.Show( "Hello!" );
}
Translates to something like:
public void AttachEvents()
{
_ctl.Click += new EventHandler( <>b_1 );
}
private void <>b_1( object sender, EventArgs e )
{
MessageBox.Show( "Hello!" );
}
I should also note that the reason you're seeing it like that in Reflector is that you don't have .NET 3.5 optimization turned on. Go to View | Options and change Optimization to .NET 3.5 and it will do a better job of translating the generated identifiers back to their lamda expressions.
You're seeing the internal guts of the finite state machines that the C# compiler emits on your behalf when it handles iterators.
Jon Skeet has some great articles (Iterator block implementation details and Iterators, iterator blocks and data pipelines) on this subject. See also Chapter 6 of his book.
There was previously an SO post on this subject.
And, finally, Microsoft Research has a nice paper on the subject.
Read until your heart is content.
Identifiers starting with <> aren't valid C# identifiers, so I suspect they use them to mangle the names without fear of conflict, as no identifier in the C# code could be the same.
As to why it's hard to read, I suspect that it's more down to the fact it's easy to generate.
This is code that is automatically generated when you use iterators. The <> is used to ensure there are no collisions, and also to prevent you from accessing the compiler-generator classes directly in your code.
See the following for more information:
Using C# Yield for Readability and Performance
C# Iterators
These are types that have been auto-generated by the compiler from iterator methods.
The compiler will do exactly the same sort of thing to your own iterators. For example, write something like this and then take a look at the actual generated code in Reflector:
public IEnumerable<int> GetRandom()
{
Random rng = new Random();
while (true)
{
yield return rng.Next();
}
}
This is state machine that is automatically generated from an iterator, such as the following:
static IEnumerable<Func<KeyValuePair<int, int>>> FunnyMethod() {
for (var i = 0; i < 10; i++) {
var localVar = i;
yield return () => new KeyValuePair(localVar, i);
}
}
This method will return 10 for all of the values.
The compiler transforms these methods into state machines that store their state in the <>1__state field and call each part of the iterator for a different value of the field.
The <> part is part of the generated field name, and is chosen so as not to conflict with anything.
You must understand what Reflector does. It is not getting the source code back. That's not what a developer at MS wrote. :) It takes Intermediate Language (IL) and systematically converts it back to C# (or VB.NET). In doing so, it must come up with an approach. As you know there are many ways to skin a cat in code that will eventually lead to the same IL. Reflector has to pick a way to move back wards from IL to a higher level language and use that way every time.
(Fixed per comment, thank you.)