Why does C# allow code blocks without a preceding statement (e.g. if, else, for, while)?
void Main()
{
{ // any sense in this?
Console.Write("foo");
}
}
The { ... } has at least the side-effect of introducing a new scope for local variables.
I tend to use them in switch statements to give each case its own scope. That lets me define local variables with the same name as close as possible to where they are used, and it also signals that they are valid only at the case level.
In the context you give, there is no significance. Writing a constant string to the console is going to work the same way anywhere in program flow. [1]
Instead, you typically use them to restrict the scope of some local variables. This is further elaborated here and here. Look at João Angelo’s answer and Chris Wallis’s answer for brief examples. I believe the same applies to some other languages with C-style syntax as well, not that they’d be relevant to this question though.
[1] Unless, of course, you decide to try to be funny and create your own Console class, with a Write() method that does something entirely unexpected.
It is not so much a feature of C# as it is a logical side-effect of the many C-syntax languages that use braces to define scope.
In your example the braces have no effect at all, but in the following code they define the scope, and therefore the visibility, of a variable:
This is allowed as i falls out of scope in the first block and is defined again in the next:
{
{
int i = 0;
}
{
int i = 0;
}
}
This is not allowed as i has fallen out of scope and is no longer visible in the outer scope:
{
{
int i = 0;
}
i = 1;
}
And so on and so on.
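The same rule can be sketched in C++, one of the C-syntax languages this answer alludes to. This is only an illustration: the function name sum_of_scoped_values is made up for the example, and the answer itself is about C#.

```cpp
#include <cassert>

// Hypothetical helper: each bare block below gets its own variable 'i'.
int sum_of_scoped_values() {
    int total = 0;
    {
        int i = 1;   // visible only until the closing brace of this block
        total += i;
    }
    {
        int i = 2;   // a different variable that happens to share the name
        total += i;
    }
    // 'i' is not declared at this point; referring to it would not compile.
    assert(total == 3);
    return total;
}
```

Calling sum_of_scoped_values() yields 3, and uncommenting a use of i after the second block would make the compiler reject the function.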
I consider {} as a statement that can contain several statements.
Consider an if statement that consists of a boolean expression followed by one statement.
This would work:
if (true) Console.Write("FooBar");
This would work as well:
if (true)
{
Console.Write("Foo");
Console.Write("Bar");
}
If I'm not mistaken this is called a block statement.
Since {} can contain other statements it can also contain other {}.
The scope of a variable is defined by its parent {} (block statement).
The point that I'm trying to make is that {} is just a statement, so it doesn't require an if or whatever...
The general rule in C-syntax languages is "anything between { } should be treated as a single statement, and it can go wherever a single statement could":
After an if.
After a for, while or do.
Anywhere in code.
For all intents and purposes, it's as if the language grammar included this:
<statement> ::= <definition of valid statement> | "{" <statement-list> "}"
<statement-list> ::= <statement> | <statement-list> <statement>
That is, "a statement can be composed of (various things) or of an opening brace, followed by a statement list (which may include one or more statements), followed by a closing brace". I.e. "a { } block can replace any statement, anywhere". Including in the middle of code.
Not allowing a { } block anywhere a single statement can go would actually have made the language definition more complex.
Because C++ (and Java) allowed code blocks without a preceding statement.
C++ allowed them because C did.
You could say it all comes down to the fact that USA programming language design (C based) won out over European programming language design (Modula-2 based).
(Control statements act on a single statement; statements can be grouped to create new statements.)
// if (a == b)
// if (a != b)
{
// do something
}
Because it maintains the scope of a statement or function. This is really useful for managing large code.
{
{
// This 'i' can be used only within this scope; outside of it, 'i' is not accessible.
int i = 0;
}
{
int i = 0;
}
}
You asked "why" C# allows code blocks without preceding statements. The question "why" could also be interpreted as "what would be possible benefits of this construct?"
Personally, I use statement-less code blocks in C# where readability is greatly improved for other developers, while keeping in mind that the code block limits the scope of local variables. For example, consider the following code snippet, which is a lot easier to read thanks to the additional code blocks:
OrgUnit world = new OrgUnit() { Name = "World" };
{
OrgUnit europe = new OrgUnit() { Name = "Europe" };
world.SubUnits.Add(europe);
{
OrgUnit germany = new OrgUnit() { Name = "Germany" };
europe.SubUnits.Add(germany);
//...etc.
}
}
//...commit structure to DB here
I'm aware that this could be solved more elegantly by using methods for each structure level. But then again, keep in mind that things like sample data seeders usually need to be quick.
So even though the code above is executed linearly, the code structure represents the "real-world" structure of the objects, thus making it easier for other developers to understand, maintain and extend.
I've inherited some code that makes occasional use of the following if notation:
if (a)
foo();
{
if (b)
boo();
moo();
}
I'm not sure how to read that naturally but the code compiles and runs.
Can someone please explain how this works so that I can rewrite it in more human readable format? Alternatively, could someone explain why this notation could be useful?
The code you've posted would be better written as:
if (a)
{
foo();
}
if (b)
{
boo();
}
moo();
Braces in C# have two purposes:
They create a scope for variable declarations.
They group statements together so that conditionals and loops and such can apply to several statements at a time.
Whoever wrote the code you've posted chose not to use them for the second purpose. if statements can be totally legitimate without using any braces, but they'll only apply to the statement that immediately follows them (like the call to foo() after the first if).
Because there is a legitimate use case for braces that has nothing to do with control flow, however, it is perfectly acceptable for someone to put braces in random places that have nothing to do with the if statements.
This code:
foo();
{
var a = boo();
}
{
var a = moo();
}
... is equivalent to this code:
foo();
var a = boo();
var b = moo();
... but you'll notice that I couldn't name the second variable a because it's no longer separated from the first variable by scoping braces.
Alternatively, could someone explain why this notation could be useful?
There are three possibilities I can think of:
They wanted to reuse variable names in different parts of the method, so they created a new variable scope with the braces.
They thought braces might be a nice visual aid to encapsulate all the logic that's found inside them.
They were actually confused, and the program doesn't work the way they think it does.
The first two possibilities assume they're actually doing more than calling foo() and moo() in the real production code.
In any case, I don't think any of these possibilities are good reasons to write code like this, and you are totally within your rights to want to rewrite the code in a way that's easier to understand.
The curly brace starts after foo() rather than after if (a) (as in if (a) {), so these braces are useless in this context.
if (a)
foo();
//{
if (b)
boo();
moo();
//}
which is equal to
if (a)
foo();
if (b)
boo();
moo();
Braces like those are used to limit context. Variables created inside these braces will exist only until the brace ends.
One of my programming philosophies is to define a variable just before it is first used. For example, for a variable 'x', I usually don't write code like this:
var total = 0;
int x;
for (int i = 0; i < 100000; i++)
{
x = i;
total += x;
}
Instead, I prefer this:
var total = 0;
for (int i = 0; i < 100000; i++)
{
var x = i;
total += x;
}
This is just example code; don't worry about what it actually computes.
What are the downsides of the second way? Performance?
Don't bother yourself with performance unless you really really need to (hint: 99% of the time you don't need to).
My usual philosophy (which has been confirmed by books like "The Art of Readable Code") is to declare variables in the smallest scope possible. The reason is that, in terms of readability and code comprehension, the fewer variables you have to think about at any one time the better. Defining variables in a smaller scope definitely helps with that.
Also, if the compiler can determine that moving the variable outside the for loop (as in your example) would not change the outcome but would avoid creating and destroying it on every iteration, it will often do that for you. That's another reason not to bother with performance: the compiler is usually smarter about it than we are.
There are no performance implications, only scoping ones. You should always define variables in the innermost scope possible. This improves the readability of your program.
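As a rough illustration of the point that only the scope differs, here is the question's loop written both ways in C++. The function names are invented for this sketch, and long long is used for the total only so that the sum does not overflow.

```cpp
#include <cassert>

// Variable declared ahead of the loop, far from its only use.
long long total_with_outer_variable() {
    long long total = 0;
    int x;
    for (int i = 0; i < 100000; i++) {
        x = i;
        total += x;
    }
    return total;
}

// Variable declared in the innermost scope that needs it.
long long total_with_inner_variable() {
    long long total = 0;
    for (int i = 0; i < 100000; i++) {
        int x = i;
        total += x;
    }
    return total;
}

// Both versions compute the same value; only the visibility of 'x' differs.
void check_both_versions_agree() {
    assert(total_with_outer_variable() == total_with_inner_variable());
}
```

For a plain int like this, there is nothing to construct or destroy per iteration, so the choice comes down to readability.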
The only "downside" is that the second version needs compiler support. Old compilers needed to know all the variables the function (or a scope inside it) would be using, so you had to declare the variables in a special section (Pascal) or at the beginning of the block (C). This is not really a problem nowadays: C (prior to C99) is the only language still in wide use that does not support declaring variables anywhere.
The problem is that C is the most common first-language they teach in schools and universities. They teach you C, and force you to declare all variables at the beginning of the block. Then they teach you a more modern language, and because you are already used to declaring all variables at the beginning, they need to teach you to not do it.
If your first language allows you to declare a variable anywhere in the function's body, you would instinctively declare it just before you use it, and they wouldn't need to tell you that declaring variables beforehand is bad just like they don't need to tell you that smashing your computer with a 5-kilo hammer is bad.
I recommend, like most, to keep variables within an inner scope, but exceptions occur and I think that is what you are seeking.
C++ potentially has expensive constructor/destructor time that would be best paid for once, rather than N times. Compare
void TestPrimacyOfNUnsignedLongs(int n) {
PrimeList List; // Makes a list of all unsigned long primes, once
for (int i = 0; i<n; i++) {
unsigned long x = random_ul();
if (List.IsAPrime(x)) DoThis();
}
}
or
void TestPrimacyOfNUnsignedLongs(int n) {
for (int i = 0; i<n; i++) {
PrimeList List; // Rebuilds the list of all unsigned long primes on every iteration
unsigned long x = random_ul();
if (List.IsAPrime(x)) DoThis();
}
}
Certainly, I could put List inside the for loop, but at a significant run time cost.
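To make the cost difference visible, here is a small sketch that counts constructor calls. ExpensiveTable is a made-up stand-in for PrimeList (its contains() check is a placeholder, not a primality test), and the function names are assumptions for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for an object that is expensive to construct.
struct ExpensiveTable {
    static int constructions;   // counts how many times the constructor ran
    ExpensiveTable() { ++constructions; }
    bool contains(unsigned long value) const { return value % 2 == 0; }
};
int ExpensiveTable::constructions = 0;

// Object built once, before the loop.
int constructions_when_hoisted(int n) {
    ExpensiveTable::constructions = 0;
    ExpensiveTable table;
    for (int i = 0; i < n; i++) {
        (void)table.contains(static_cast<unsigned long>(i));
    }
    assert(ExpensiveTable::constructions == 1);
    return ExpensiveTable::constructions;
}

// Object rebuilt on every iteration of the loop.
int constructions_when_inside_loop(int n) {
    ExpensiveTable::constructions = 0;
    for (int i = 0; i < n; i++) {
        ExpensiveTable table;
        (void)table.contains(static_cast<unsigned long>(i));
    }
    return ExpensiveTable::constructions;
}
```

With n = 1000 the first function constructs the table once and the second constructs it 1000 times, which is the run-time cost described above.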
Having all variables of the same scope in the same place in the code makes it easier to see what variables you have and what data types they are. You don't have to look through the entire code to find them.
You have different scopes for the x variable. In the second example, you won't be able to use the x variable outside the loop.
I mean other than using it when required for functions, classes, if, while, switch, try-catch.
I didn't know that it could be done like this until I saw this SO question.
In the above link, Eli mentioned that "They use it to fold up their code in logical sections that don't fall into a function, class, loop, etc. that would usually be folded up."
What other uses are there besides those mentioned?
Is it a good idea to use curly braces to limit the scope of your variables and expand the scope only if required (working on a "need-to-access" basis)? Or is it actually silly?
How about using scopes just so that you can use the same variable names in different scopes but in the same bigger scope? Or is it a better practise to reuse the same variable (if you want to use the same variable name) and save on deallocating and allocating (I think some compilers can optimise on this?)? Or is it better to use different variable names altogether?
I do if I am using a resource which I want to free at a specific time, e.g.:
void myfunction()
{
{
// Open serial port
SerialPort port("COM1", 9600);
port.doTransfer(data);
} // Serial port gets closed here.
for(int i = 0; i < data.size(); i++)
doProcessData(data[i]);
// etc...
}
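The timing of the release can be checked with a small self-contained sketch. LoggedPort is a hypothetical stand-in for SerialPort that records events in a log instead of touching hardware; none of these names come from the answer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical resource: logs "open" on construction and "close" on destruction.
class LoggedPort {
public:
    explicit LoggedPort(std::vector<std::string>& log) : log_(log) {
        log_.push_back("open");
    }
    ~LoggedPort() { log_.push_back("close"); }
    void transfer() { log_.push_back("transfer"); }
private:
    std::vector<std::string>& log_;
};

std::vector<std::string> run_with_scoped_port() {
    std::vector<std::string> log;
    {
        LoggedPort port(log);
        port.transfer();
    }   // destructor runs here, so "close" is logged before "process"
    log.push_back("process");
    assert(log.size() == 4);
    return log;
}
```

The returned log reads open, transfer, close, process: the resource is released at the closing brace of the inner block, not at the end of the function.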
I would not use curly braces for that purpose for a couple of reasons.
If your particular function is big enough that you need to do various scoping tricks, perhaps break the function into smaller sub-functions.
Introducing braces for scoping to reuse variable names is only going to lead to confusion and trouble in code.
Just my 2 cents, but I have seen a lot of these types of things in other best practice materials.
C++:
Sometimes you need to introduce an extra brace level of scope to reuse variable names when it makes sense to do so:
switch (x) {
case 0:
int i = 0;
foo(i);
break;
case 1:
int i = 1;
bar(i);
break;
}
The code above doesn't compile. You need to make it:
switch (x) {
case 0:
{
int i = 0;
foo(i);
}
break;
case 1:
{
int i = 1;
bar(i);
}
break;
}
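A compilable version of the braced form might look like the following sketch. The function name value_for_case and the returned values are assumptions made for illustration, standing in for foo and bar.

```cpp
#include <cassert>

// Each case body is wrapped in its own block, so each 'i' is a separate variable
// and no case label jumps past another case's initialization.
int value_for_case(int x) {
    int result = -1;
    switch (x) {
    case 0: {
        int i = 10;   // lives only inside the case-0 block
        result = i;
        break;
    }
    case 1: {
        int i = 20;   // a separate 'i', so there is no redeclaration clash
        result = i;
        break;
    }
    default:
        break;
    }
    assert(result == -1 || result == 10 || result == 20);
    return result;
}
```

Without the inner braces, both declarations of i would sit in the single scope of the switch body, which is why the unbraced version is rejected.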
The most common "non-standard" use of scoping that I use regularly is to utilize a scoped mutex.
void MyClass::Somefun()
{
//do some stuff
{
// example implementation that has a mutex passed into a lock object:
scopedMutex lockObject(m_mutex);
// protected code here
} // mutex is unlocked here
// more code here
}
This has many benefits, but the most important is that the lock will always be cleaned up, even if an exception is thrown in the protected code.
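Here is a minimal sketch of that guarantee using the standard std::lock_guard. TrackedMutex is a hypothetical wrapper invented for this example; it only adds a flag so the lock state can be observed, and it is not the scopedMutex class from the answer.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical wrapper around std::mutex that remembers whether it is held.
struct TrackedMutex {
    std::mutex inner;
    bool locked = false;
    void lock()   { inner.lock(); locked = true; }
    void unlock() { locked = false; inner.unlock(); }
};

// Returns true if the mutex was held inside the block and released afterwards.
bool lock_released(bool fail) {
    TrackedMutex m;
    bool held_inside = false;
    try {
        {
            std::lock_guard<TrackedMutex> guard(m);
            held_inside = m.locked;
            if (fail) throw std::runtime_error("failure in protected code");
        }   // on the normal path, guard's destructor unlocks here
        assert(!m.locked);
    } catch (const std::runtime_error&) {
        // on the exceptional path, stack unwinding has already destroyed guard
    }
    return held_inside && !m.locked;
}
```

Both lock_released(false) and lock_released(true) return true, so the mutex is released whether the protected code finishes normally or throws.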
The most common use, as others have said, is to ensure that destructors run when you want them to. It's also handy for making platform-specific code a little clearer:
#if defined( UNIX )
if( some unix-specific condition )
#endif
{
// This code should always run on Windows but
// only if the above condition holds on unix
}
Code built for Windows doesn't see the if, only the braces. This is much clearer than:
#if defined( UNIX )
if( some unix-specific condition ) {
#endif
// This code should always run on Windows but
// only if the above condition holds on unix
#if defined( UNIX )
}
#endif
It can be a boon to code generators. Suppose you have an Embedded SQL (ESQL) compiler; it might want to convert an SQL statement into a block of code that needs local variables. By using a block, it can reuse fixed variable names over and over, rather than having to create all the variables with separate names. Granted, that's not too hard, but it is harder than necessary.
As others have said, this is fairly common in C++ due to the all-powerful RAII (resource acquisition is initialization) idiom/pattern.
For Java programmers (and maybe C#, I don't know) this will be a foreign concept because heap-based objects and GC kill RAII. IMHO, being able to put objects on the stack is the greatest single advantage of C++ over Java and makes well-written C++ code MUCH cleaner than well-written Java code.
I only use it when I need to release something by the means of RAII and even then only when it should be released as early as I possibly can (releasing a lock for example).
Programming in Java I have quite often wanted to limit scope within a method, but it never occurred to me to use a label. Since I uppercase my labels when using them as the target of a break, using a mixed case labeled block like you have suggested is just what I have wanted on these occasions.
Often the code blocks are too short to break out into a small method, and often the code is in a framework method (like startup() or shutdown()) where it's actually better to keep the code together in one method.
Personally I hate the plain floating/dangling braces (though that's because we are a strict banner style indent shop), and I hate the comment marker:
// yuk!
some code
{
scoped code
}
more code
// also yuk!
some code
/* do xyz */ {
scoped code
}
some more code
// this I like
some code
DoXyz: {
scoped code
}
some more code
We considered using "if(true) {" because the Java spec specifically says these will be optimized away in compilation (as will the entire content of an if(false) - it's a debugging feature), but I hated that in the few places I tried it.
So I think your idea is a good one, not at all silly. I always thought I was the only one who wanted to do this.
Yes, I use this technique because of RAII. I also use this technique in plain C since it brings the variables closer together. Of course, I should be thinking about breaking up the functions even more.
One thing I do that is probably stylistically controversial is put the opening curly brace on the line of the declaration or put a comment right on it. I want to decrease the amount of wasted vertical space. This is based on the Google C++ Style Guide recommendation.
/// c++ code
/// references to boost::test
BOOST_TEST_CASE( curly_brace )
{
// init
MyClass instance_to_test( "initial", TestCase::STUFF ); {
instance_to_test.permutate(42u);
instance_to_test.rotate_left_face();
instance_to_test.top_gun();
}
{ // test check
const uint8_t kEXP_FAP_BOOST = 240u;
BOOST_CHECK_EQUAL( instance_to_test.get_fap_boost(), kEXP_FAP_BOOST);
}
}
I agree with agartzke. If you feel that you need to segment larger logical code blocks for readability, you should consider refactoring to clean up busy and cluttered members.
It has its place, but I don't think that doing it so that $foo can be one variable here and a different variable there, within the same function or other (logical, rather than lexical) scope is a good idea. Even though the compiler may understand that perfectly, it seems too likely to make life difficult for humans trying to read the code.
The company I'm working at has a static analysis policy to keep local variable declarations near the beginning of a function. Many times, the usage is many lines after the first line of a function so I cannot see the declaration and the first reference at the same time on the screen. What I do to 'circumvent' the policy is to keep the declaration near the reference, but provide additional scope by using curly braces. It increases indentation though, and some may argue that it makes the code uglier.
I'm reading through someone else's code, and I see a lot of instances of this. I'll provide a snippet. It's a library function that wraps NHibernate. It's the fifth line, after the session is created, that I'm confused about.
public T GetById<T>(string id) where T : BaseObject
{
T retObj = null;
ISession session = EnsureCurrentSession();
{
retObj = session.Get<T>(id);
}
return retObj;
}
At first glance I thought it was an example of the using statement, but it's not. As far as I can see, the curly braces might as well not be there. The only practical purpose for setting up a block there would be to create variables inside and their scope be limited to the block, but that's not happening here.
Or am I missing something?
That code looks like an incomplete edit; the code is legal but weird.
To follow up on your statement:
The only practical purpose for setting up a block there would be to create variables inside and their scope be limited to the block
That is a practical purpose for creating a block but not the only purpose. For example:
class C
{
public int x;
void M()
{
x = 123;
if (whatever)
{
int x = q;
}
}
}
This code is not legal because the simple name x is used inconsistently throughout the block which first uses it. x means this.x at first, and a local variable later. That's not legal in C#; in C# a name may only mean one thing throughout the block which first uses the name.
You could "fix" the problem by...
class C
{
public int x;
void M()
{
{
x = 123;
}
if (whatever)
{
int x = q;
}
}
}
Because now the two blocks that use the same name to mean two different things do not overlap in any way. But that is a dumb way to fix the problem; the better thing to do is rename the local.
The braces here are superfluous. You are correct, however, that you can add braces in order to limit variables to the scope of that block. This pattern is very rarely used, though.
Indeed in this case it does nothing. It might be the leftover of a using block actually, i.e. the code could have looked like this in a previous version:
using (ISession session = EnsureCurrentSession())
{
retObj = session.Get<T>(id);
}
As it is now, I would review how EnsureCurrentSession is implemented. Possibly the using should really be there, or if not, remove the braces.
I think you're not missing anything - the braces don't do anything in this case.
Actually, the braces after a new statement are meant to initialize variables in the class. This subject was discussed earlier in another post.
You're not missing anything. They are superfluous and, frankly, it is bad to leave such code in place. But they might have been added out of habit, for cases where the code does require a local scope or a using block.
My best guess is that there used to be an if (session != null) there once upon a time. Then there was a code review where it was pointed out that the test is unnecessary because EnsureCurrentSession() never returns null; it throws an Exception if the session is not current.
Just curious: Why is the syntax for try catch in C# (Java also?) hard coded for multiple statements? Why doesn't the language allow:
int i;
string s = DateTime.Now.Seconds % 2 == 1 ? "1" : "not 1";
try
i = int.Parse(s);
catch
i = 0;
The example is for trivial purposes only. I know there's int.TryParse.
Consider the fact that there are really three (or more) code blocks in play here:
try {}
catch (myexception)
{}
catch (myotherexception)
{}
finally
{}
Keep in mind that these are in the scope of a larger context and the exceptions not caught are potentially caught further up the stack.
Note that this is basically the same thing as a class construct that also has the {} structure.
Say for instance you might have:
try
try
if (iAmnotsane)
beatMe(please);
catch (Exception myexception)
catch (myotherexception)
logerror("howdy")
finally
Now, does that second catch belong to the first or the second try? What about the finally? So you see, the optional/multiple portions make the braces a requirement.
UPDATE: This question was the subject of my blog on December 4th, 2012. There are a number of insightful comments on the blog that you might also be interested in. Thanks for the great question!
As others have noted, the proposed feature introduces ambiguities that are confusing. I was interested to see if there were any other justifications for the decision to not support the feature, so I checked the language design notes archive.
I see nothing in the language design notes archive that justifies this decision. As far as I know, C# does it that way because that's how other languages with similar syntax do it, and they do it that way because of the ambiguity problem.
I did learn something interesting though. In the initial design of C# there was no try-catch-finally! If you wanted a try with a catch and a finally then you had to write:
try
{
try
{
XYZ();
}
catch(whatever)
{
DEF();
}
}
finally
{
ABC();
}
which, not surprisingly, is exactly how the compiler analyzes try-catch-finally; it just breaks it up into try-catch inside try-finally upon initial analysis and pretends that's what you said in the first place.
More or less, this is a play on the dangling else problem.
For example,
if( blah )
if ( more blah )
// do some blah
else
// no blah I suppose
Without curly braces, the else is ambiguous because you don't know if it's associated with the first or second if statement. So you have to fall back on a compiler convention (e.g. in Pascal or C, the compiler assumes the dangling else is associated with the closest if statement) to resolve the ambiguity, or fail the compile entirely if you don't want to allow such ambiguity in the first place.
Similarly,
try
try
// some code that throws!
catch(some blah)
// which try block are we catching???
catch(more blah )
// not so sure...
finally
// totally unclear what try this is associated with.
You could solve it with a convention, where catch blocks are always associated with the closest try, but I find this solution generally allows programmers to write code that is potentially dangerous. For example, in C, this:
if( blah )
    if( more blah )
        x = blah;
    else
        x = blahblah;
...is how the compiler would interpret this if/if/else block. However, it's also perfectly legitimate to screw up your indenting and write:
if( blah )
    if( more blah )
        x = blah;
else
    x = blahblah;
...which now makes it appear like the else is associated with the outer if statement, when in fact it is associated with the inner if statement due to C conventions. So I think requiring the braces goes a long way towards resolving ambiguity and preventing a rather sneaky bug (these sorts of issues can be trivial to miss, even during code inspection). Languages like Python don't have this issue since indentation and whitespace matter.
If you assume that the designers of C# simply chose to use the same syntax as C++, then the question becomes why braces are necessary with single-statement try and catch blocks in C++. The simple answer is that Bjarne Stroustrup thought the syntax was easier to explain.
In The Design and Evolution of C++ Stroustrup writes:
"The try keyword is completely redundant and so are the { } braces except where multiple statements are actually used in a try-block or a handler."
He goes on to give an example where the try keyword and { } are not needed. He then writes:
"However, I found this so difficult to explain that the redundancy was introduced to save support staff from confused users."
Reference:
Stroustrup, Bjarne (1994). The Design and Evolution of C++. Addison-Wesley.
The first thing I can think of is that the curly braces create a block with its own variable scope.
Look at the following code
try
{
int foo = 2;
}
catch (Exception)
{
Console.WriteLine(foo); // The name 'foo' does not exist in the current context
}
foo is not accessible in the catch block due to the variable scoping. I think this makes it easier to reason about whether a variable has been initialized before use or not.
Compare with this code
int foo;
try
{
foo = 2;
}
catch (Exception)
{
Console.WriteLine(foo); // Use of unassigned local variable 'foo'
}
Here you cannot guarantee that foo is initialized.
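For comparison, the same scoping applies to try and catch blocks in C++. C++ has no definite-assignment check like C#'s, so this sketch initializes the variable up front. It mirrors the question's parsing example, with std::stoi standing in for int.Parse; the function names are made up.

```cpp
#include <cassert>
#include <exception>
#include <string>

// Parses s as an int, falling back to 0 when the conversion throws.
int parse_or_zero(const std::string& s) {
    int value = 0;              // declared outside so both try and catch can see it
    try {
        value = std::stoi(s);   // throws std::invalid_argument for "not 1"
    } catch (const std::exception&) {
        value = 0;              // anything declared inside the try block is out of scope here
    }
    return value;
}

void check_parse_or_zero() {
    assert(parse_or_zero("1") == 1);
    assert(parse_or_zero("not 1") == 0);
}
```

Because the try block and the handler are separate scopes, anything both of them need has to be declared before the try.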
try // 1
try // 2
something();
catch { // A
}
catch { // B
}
catch { // C
}
Does B catch try 1 or 2?
I don't think you can resolve this unambiguously, since the snippet might mean:
try // 1
{
try // 2
something();
catch { // A
}
}
catch { // B
}
catch { // C
}
or:
try // 1
{
try // 2
something();
catch { // A
}
catch { // B
}
}
catch { // C
}
Probably to discourage overuse. A try-catch block is big and ugly, and you're going to notice when you're using it. This mirrors the effect that a catch has on your application's performance - catching an exception is extremely slow compared to a simple boolean test.
In general you should avoid errors, not handle them. In the example you give, a much more efficient method would be to use
if(!int.TryParse(s, out i))
i=0;
The rationale is that it's more maintainable (easier to change, less likely to break, ergo higher quality):
it's clearer, and
it's easier to change because if you need to add a line to your blocks you don't introduce a bug.
As to why exception handling is different than conditional expressions...
If/Else is conditional upon an expression to use one of two (or more If/Else if/Else) paths in the code
Try/Catch is part of exception handling, it is not a conditional expression. Try/Catch/Finally operates only when an exception has been thrown inside the scope of the Try block.
Exception handling will traverse up the stack/scope until it finds a Catch block that will catch the type of exception that was thrown. Forcing scope identifiers makes this check for blocks simplified. Forcing you to scope when dealing with exceptions seems like a good idea, it also is a good indication that this is part of exception handling rather than normal code. Exceptions are exceptions, not something you really want happening normally but know can happen and want to handle when they do happen.
EDIT: One more reason I can think of is that CATCH is mandatory after a TRY, unlike ELSE. Hence there needs to be a definite way to define the TRY block.
Another way of looking at this…
Given all the maintenance problems that have been created by “if”, “while”, “for” and “foreach” statements without braces, a lot of companies have coding standards that always require braces on statements that act on a “block”.
So they make you write:
if (itIsSo)
{
ASingleLineOfCode();
}
Rather than:
if (itIsSo)
ASingleLineOfCode();
(Note as indenting is not checked by the compiler, it can't be depended on to be right)
A good case could be made for designing a language that always requires the braces, but then too many people would have hated C# due to having to always use the braces. However, for try/catch there was no expectation of being able to get away without using the braces, so it was possible to require them without too many people complaining.
Given a choice I would much rather have if/endIf (and while/endWhile) as the block delimiters, but the USA got its way on that one. (C got to define what most languages look like rather than Modula-2; after all, most of what we do is defined by history, not logic.)
The simplest (I think) answer is that each block of code in C/C++/C# requires curly braces.
EDIT #1
In response to negative votes, directly from MSDN:
try-catch (C# Reference)
The try-catch statement consists of a try block followed by one or more catch clauses, which specify handlers for different exceptions.
As the definition says, it is a block, so it requires curly braces. That is why we cannot use it without { }.