What is the purpose of anonymous { } blocks in C style languages (C, C++, C#)
Example -
void function()
{
{
int i = 0;
i = i + 1;
}
{
int k = 0;
k = k + 1;
}
}
Edit - Thanks for all of the excellent answers!
It limits the scope of variables to the block inside the { }.
Brackets designate an area of scope - anything declared within the brackets is invisible outside of them.
Furthermore, in C++ an object allocated on the stack (e.g. without the use of 'new') will be destructed when it goes out of scope.
In some cases it can also be a way to highlight a particular piece of a function that the author feels is worthy of attention for people looking at the source. Whether this is a good use or not is debatable, but I have seen it done.
They are often useful for RAII purposes, which means that a given resource will be released when the object goes out of scope. For example:
void function()
{
{
std::ofstream out( "file.txt" );
out << "some data\n";
}
// You can be sure that "out" is closed here
}
By creating a new scope they can be used to define local variables in a switch statement.
e.g.
switch (i)
{
case 0 :
int j = 0; // error!
break;
vs.
switch (i)
{
case 0 :
{
int j = 0; // ok!
}
break;
{ ... } opens up a new scope
In C++, you can use them like this:
void function() {
// ...
{
// lock some mutex.
mutex_locker lock(m_mutex);
// ...
}
// ...
}
Once control goes out of the block, the mutex locker is destroyed. And in its destructor, it would automatically unlock the mutex that it's connected to. That's very often done, and is called RAII (resource acquisition is initialization) and also SBRM (scope bound resource management). Another common application is to allocate memory, and then in the destructor free that memory again.
Another purpose is to do several similar things:
void function() {
// set up timer A
{
int config = get_config(TIMER_A);
// ...
}
// set up timer B
{
int config = get_config(TIMER_B);
// ...
}
}
It will keep things separate so one can easily find out the different building blocks. You may use variables having the same name, like the code does above, because they are not visible outside their scope, thus they do not conflict with each other.
Another common use is with OpenGL's glPushMatrix() and glPopMatrix() functions to create logical blocks relating to the matrix stack:
glPushMatrix();
{
glTranslate(...);
glPushMatrix();
{
glRotate(...);
// draw some stuff
}
glPopMatrix();
// maybe draw some more stuff
}
glPopMatrix();
class ExpensiveObject {
public:
ExpensiveObject() {
// acquire a resource
}
~ExpensiveObject() {
// release the resource
}
}
int main() {
// some initial processing
{
ExpensiveObject obj;
// do some expensive stuff with the obj
} // don't worry, the variable's scope ended, so the destructor was called, and the resources were released
// some final processing
}
Scoping of course. (Has that horse been beaten to death yet?)
But if you look at the language definition, you see patterns like:
if ( expression ) statement
if ( expression ) statement else statement
switch ( expression ) statement
while ( expression ) statement
do statement while ( expression ) ;
It simplifies the language syntax that compound-statement is just one of several possible statement's.
compound-statement: { statement-listopt }
statement-list:
statement
statement-list statement
statement:
labeled-statement
expression-statement
compound-statement
selection-statement
iteration-statement
jump-statement
declaration-statement
try-block
You are doing two things.
You are forcing a scope restriction on the variables in that block.
You are enabling sibling code blocks to use the same variable names.
They're very often used for scoping variables, so that variables are local to an arbitrary block defined by the braces. In your example, the variables i and k aren't accessible outside of their enclosing braces so they can't be modified in any sneaky ways, and that those variable names can be re-used elsewhere in your code. Another benefit to using braces to create local scope like this is that in languages with garbage collection, the garbage collector knows that it's safe to clean up out-of-scope variables. That's not available in C/C++, but I believe that it should be in C#.
One simple way to think about it is that the braces define an atomic piece of code, kind of like a namespace, function or method, but without having to actually create a namespace, function or method.
As far as I understand, they are simply for scoping. They allow you to reuse variable names in the parent/sibling scopes, which can be useful from time to time.
EDIT: This question has in fact been answered on another Stack Overflow question. Hope that helps.
As the previous posters mentioned, it limits the use of a variable to the scope in which it is declared.
In garbage collected languages such as C# and Java, it also allows the garbage collector to reclaim memory used by any variables used within the scope (although setting the variables to null would have the same effect).
{
int[] myArray = new int[1000];
... // Do some work
}
// The garbage collector can now reclaim the memory used by myArray
It's about the scope, it refers to the visibility of variables and methods in one part of a program to another part of that program, consider this example:
int a=25;
int b=30;
{ //at this point, a=25, b=30
a*=2; //a=50, b=30
b /= 2; //a=50,b=15
int a = b*b; //a=225,b=15 <--- this new a it's
// declared on the inner scope
}
//a = 50, b = 15
If you are limited to ANSI C, then they could be used to declare variables closer to where you use them:
int main() {
/* Blah blah blah. */
{
int i;
for (i = 0; i < 10; ++i) {
}
}
}
Not neccessary with a modern C compiler though.
A useful use-cas ihmo is defining critical sections in C++.
e.g.:
int MyClass::foo()
{
// stuff uncritical for multithreading
...
{
someKindOfScopeLock lock(&mutexForThisCriticalResource);
// stuff critical for multithreading!
}
// stuff uncritical for multithreading
...
}
using anonymous scope there is no need calling lock/unlock of a mutex or a semaphore explicitly.
I use it for blocks of code that need temporary variables.
One thing to mention is that scope is a compiler controlled phenomenon. Even though the variables go out of scope (and the compiler will call any destructors; POD types are optimised immediately into the code), they are left on the stack and any new variables defined in the parent scope do not overwrite them on gcc or clang (even when compiling with -Ofast). Except it is undefined behaviour to access them via address because the variables have conceptually gone out of scope at the compiler level -- the compiler will stop you accessing them by their identifiers.
#include <stdio.h>
int main(void) {
int* c;
{
int b = 5;
c=&b;
}
printf("%d", *c); //undefined behaviour but prints 5 for reasons stated above
printf("%d", b); //compiler error, out of scope
return 0;
}
Also, for, if and else all precede anonymous blocks aka. compound statements, which either execute one block or the other block based on a condition.
Related
https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/proposals/csharp-8.0/using#using-declaration
The lifetime of a using local will extend to the end of the scope in which it is declared. The using locals will then be disposed in the reverse order in which they are declared.
My question is: when is a using local considered out-of-scope?
Is it necessarily at the end of the block?
Is it necessarily right after its last use in the block?
Or is it implementation-defined, so that it could be either, or even anywhere in-between?
In other words:
{
using var res = Res();
res.DoStuff();
somethingElse.DoWhatever();
res.DoMoreStuff();
// 100 more statements that have nothing to do with res
}
Is this always equivalent to this (1)?
{
using (var res = Res()) {
res.DoStuff();
somethingElse.DoWhatever();
res.DoMoreStuff();
// 100 more statements that have nothing to do with res
}
}
Or always to this (2)?
{
using (var res = Res()) {
res.DoStuff();
somethingElse.DoWhatever();
res.DoMoreStuff();
}
// 100 more statements that have nothing to do with res
}
Or is this an implementation detail?
Does the spec define this? What is such "scope" technically? If one of the above is always the case, is there a reason to prefer this behavior over the other? I'd assume (2) is better, but maybe I'm wrong.
I know it probably doesn't change much for high-level programming, but I'm curious.
Somewhat related: Does a using block guarantee that the object will not be disposed until the end of the block?
This is what a Scope means according to the language spec:
The scope of a name is the region of program text within which it is possible to refer to the entity declared by the name without qualification of the name.
[...]
The scope of a local variable declared in a local_variable_declaration (ยง12.6.2) is the block in which the declaration occurs.
Note that a using declaration is just the word using added to a local variable declaration. According to the using declaration proposal,
The language will allow for using to be added to a local variable declaration.
Therefore, your using declaration is equivalent to
{
using (var res = Res()) {
res.DoStuff();
somethingElse.DoWhatever();
res.DoMoreStuff();
// 100 more statements that have nothing to do with res
}
}
The outermost { ... } denotes a Block.
In the .NET Blog in Do more with patterns in C# 8.0 we can read:
Using declarations are simply local variable declarations with a using keyword in front, and their contents are disposed at the end of the current statement block.
And then we can see:
static void Main(string[] args)
{
using var options = Parse(args);
if (options["verbose"]) { WriteLine("Logging..."); }
} // options disposed here
Also, about
The using locals will then be disposed in the reverse order in which they are declared.
using (var v1 = Class1()) // disposed third
using (var v2 = Class2()) // disposed second
using (var v3 = Class3()) // disposed first
{
. . . .
}
I have ASP.Net Core 2.1, C# application. I am using Clr Heap Allocation Analyzer
https://marketplace.visualstudio.com/items?itemName=MukulSabharwal.ClrHeapAllocationAnalyzer
One of the methods looks as below
Ex#1
public void ConfigureServices(IServiceCollection services) {
services.AddSingleton<IPocoDynamo>(serviceProvider => {
var pocoDynamo = new PocoDynamo(serviceProvider.GetRequieredService<IAmazonDynamoDB>());
pocoDynamo.SomeMethod();
return pocoDynamo;
});
}
Ex.#2
public async Task<EventTO> AddEvent(EventTO eventObj)
{
try
{
throw new Exception("Error!");
}
catch (Exception ex)
{
Logger.Log(ex, eventObj);
return null;
}
}
I am using DI throughout the app. But wherever the analyzer is finding new keyword thing, it is warning as
HAA0502 Explicit new reference type allocation
Also wherever Lambda expression is used, it is warning as (like in ex#1)
Warning HAA0301 Heap allocation of closure Captures:
What is causing this & how to address this?
Thanks!
Heap Allocation Analyzer is used to mark all the allocations your code performs. This is not something you would like to have always on: consider the following silly code
public static string MyToString(object? o)
{
if (o == null)
throw new ArgumentNullException(nameof(o)); // HAA0502 here
return o.ToString() ?? string.Empty;
}
The analyzer will emit HAA0502 in the form of warning as information on the marked line to tell you that you are allocating a new instance. Now, it is obvious in this case what you are doing, and it is a trivial warning, but the purpose of the analyzer is to help you spot nasty allocations that might turn your code into something slower.
Now consider this silly code here:
public static void Test1()
{
for (int i = 0; i < 100; i++)
{
var a = i + 1;
var action = new Action(
() => // HAA0301 Heap allocation of closure Capture: a
{
Console.WriteLine(a);
}
);
action();
}
}
Other than HAA0502 which will be marked on new Action( because we are creating a new object, there is an additional warning on the lambda: HAA0301. This is why the analyzer gets more useful: here the analyzer is telling you that the runtime will create a new object containing your captured variable a. If you are not familiar with this, you may think at that code to get transformed in something like this (for explanatory purposes only):
private sealed class Temp1
{
public int Value1 { get; }
public Temp1(int value1)
{
Value1 = value1;
}
public void Method1()
{
Console.WriteLine(Value1);
}
}
public static void Test1()
{
for (int i = 0; i < 100; i++)
{
var a = i + 1;
var t = new Temp1(a);
t.Method1();
}
}
In the latter code, it becomes evident that at every iteration you are allocating an object.
The main question you may have is: is allocating an object a problem? In 99.9% of the cases it is not a problem and you may embrace the simplicity of writing readable, precise and concise code without dealing with low level details, but if you are caught in performance issues (i.e. the remaining 0.01%), the analyzer can get quite handy as it shows in one shots where you or the compiler in your behalf is allocating something. Allocating objects require a future garbage collector cycle to reclaim the memory.
Regarding your code, you are initializing a service via DI with the factory pattern: that code runs once. Therefore there is no surprise you are allocating a new object. So you can safely suppress the warning on this portion of code. You may use the IDE to let generate the suppression code. This is why I suggest to keep the analyzer disabled and enable it only when hunting performance problems.
public void func1()
{
object obj= null;
try
{
obj=new object();
}
finally
{
obj = null;
}
}
is there any advantage of assigning null to a reference in finally block in regards of memory management of large objects?
Let's deal with the explicit and implicit questions here.
Q: First and foremost, is there a point in assigning null to a local variable when you're done with it?
A: No, none at all. When compiled with optimizations and not running under a debugger, the JITter knows the segment of the code where a variable is in use and will automatically stop considering it a root when you've passed that segment. In other words, if you assign something to a variable, and then at some point never again read from it, it may be collected even if you don't explicitly set it to null.
So your example can safely be written as:
public void func1()
{
object obj = new object();
// implied more code here
}
If no code in the "implied more code here" ever accesses the obj variable, it is no longer considered a root.
Note that this changes if running in a non-optimized assembly, or if you hook up a debugger to the process. In that case the scope of variables is artificially extended until the end of their scope to make it easier to debug.
Q: Secondly, what about fields in the surrounding class?
A: Here it can definitely make a difference.
If the object surrounding your method is kept alive for an extended period of time, and the need for the contents of a field has gone, then yes, setting the field to null will make the old object it referenced eligible for collection.
So this code might have merit:
public class SomeClass
{
private object obj;
public void func1()
{
try
{
obj=new object();
// implied more code here
}
finally
{
obj = null;
}
}
}
But then, why are you doing it like this? You should instead strive to write cleaner code that doesn't rely on surrounding state. In the above code you should instead refactor the "implied more code here" to be passed in the object to use, and remove the global field.
Obviously, if you can't do that, then yes, setting the field to null as soon as its object reference is no longer needed is a good idea.
Fun experiment, if you run the below code in LINQPad with optimizations on, what do you expect the output to be?
void Main()
{
var s = new Scary();
s.Test();
}
public class Scary
{
public Scary()
{
Console.WriteLine(".ctor");
}
~Scary()
{
Console.WriteLine("finalizer");
}
public void Test()
{
Console.WriteLine("starting test");
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
Console.WriteLine("ending test");
}
}
Answer (mouseover to show when you think you've got it):
.ctor
starting test
finalizer
ending test
Explanation:
Since the implicit this parameter to an instance method is never used inside the method, the object surrounding the method is collected, even if the method is currently running.
Most of the code I have seen deletes the pointer in the finalizer/destructor:
public ref class CPPObjectWrapper
{
private:
CPPObject *_cppObject;
public:
CPPObjectWrapper()
{
_cppObject = new CPPObject();
}
CPPObjectWrapper(IntPtr ^ptr)
{
_cppObject = ptr->ToPointer();
}
~CPPObjectWrapper()
{
delete _cppObject;
_cppObject = 0;
}
!CPPObjectWrapper()
{
if (_cppObject != 0) delete _cppObject;
}
IntPtr^ GetPointer()
{
return gcnew IntPtr(_cppObject);
}
}
My question is what would be standard practice if the library your wrapping does something like this:
void AddObject(CPPObject *cppObject)
{
// adds to a std::list
}
CPPObject* FindObject(/* criteria */)
{
// return reference to std::list item based on criteria
}
If your c# wrapper does this:
void AddObject(CPPObjectWrapper ^cppObject)
{
_internal->addObject(cppObject->GetPointer()->ToPointer());
}
CPPObjectWrapper^ FindObject(/* criteria */)
{
CPPObject *cppObject = _internal->findObject(/* criteria */);
return gcnew CPPObjectWrapper(gcnew IntPtr(cppObjet));
}
You run into a memory issue because your managed object should not delete the pointer because its referenced in another object. The same is true when returning. Would you simply add functionality to tell your managed wrapper not to free the memory when ownership is transferred?
A classic situation when dealing with mixed mode projects, and your suggestion is OK!
It would make sense to have a bool in the constructor that tells it not to destroy the pointer if the same object is used in another non-wrapped object. The ideal case is that every object was wrapped, and the destruction would be done by the CLR.
You can make a generic base class out of this (using the code you already have there), setting the bool by default by the subclass. You are guaranteed to have this functionality many times over. Another tip is to have a virtual OnFinalize() method that is called from the CLR destructor (!) that can do special operations in the subclass, like calling some special free function provided by the native library.
I'm writing HFT trading software. I do care about every single microsecond. Now it written on C# but i will migrate to C++ soon.
Let's consider such code
// Original
class Foo {
....
// method is called from one thread only so no need to be thread-safe
public void FrequentlyCalledMethod() {
var actions = new List<Action>();
for (int i = 0; i < 10; i++) {
actions.Add(new Action(....));
}
// use actions, synchronous
executor.Execute(actions);
// now actions can be deleted
}
I guess that ultra-low latency software should not use "new" keyword too much, so I moved actions to be a field:
// Version 1
class Foo {
....
private List<Action> actions = new List<Action>();
// method is called from one thread only so no need to be thread-safe
public void FrequentlyCalledMethod() {
actions.Clear()
for (int i = 0; i < 10; i++) {
actions.Add(new Action { type = ActionType.AddOrder; price = 100 + i; });
}
// use actions, synchronous
executor.Execute(actions);
// now actions can be deleted
}
And probably I should try to avoid "new" keyword at all? I can use some "pool" of pre-allocated objects:
// Version 2
class Foo {
....
private List<Action> actions = new List<Action>();
private Action[] actionPool = new Action[10];
// method is called from one thread only so no need to be thread-safe
public void FrequentlyCalledMethod() {
actions.Clear()
for (int i = 0; i < 10; i++) {
var action = actionsPool[i];
action.type = ActionType.AddOrder;
action.price = 100 + i;
actions.Add(action);
}
// use actions, synchronous
executor.Execute(actions);
// now actions can be deleted
}
How far should I go?
How important to avoid new?
Will I win anything while using preallocated object which I only need to configure? (set type and price in example above)
Please note that this is ultra-low latency so let's assume that performance is preferred against readability maintainability etc. etc.
In C++ you don't need new to create an object that has limited scope.
void FrequentlyCalledMethod()
{
std::vector<Action> actions;
actions.reserve( 10 );
for (int i = 0; i < 10; i++)
{
actions.push_back( Action(....) );
}
// use actions, synchronous
executor.Execute(actions);
// now actions can be deleted
}
If Action is a base class and the actual types you have are of a derived class, you will need a pointer or smart pointer and new here. But no need if Action is a concrete type and all the elements will be of this type, and if this type is default-constructible, copyable and assignable.
In general though, it is highly unlikely that your performance benefits will come from not using new. It is just good practice here in C++ to use local function scope when that is the scope of your object. This is because in C++ you have to take more care of resource management, and that is done with a technique known as "RAII" - which essentially means taking care of how a resource will be deleted (through a destructor of an object) at the point of allocation.
High performance is more likely to come about through:
proper use of algorithms
proper parallel-processing and synchronisation techniques
effective caching and lazy evaluation.
As much as I detest HFT, I'm going to tell you how to get maximum performance out of each thread on a given piece of iron.
Here's an explanation of an example where a program as originally written was made 730 times faster.
You do it in stages. At each stage, you find something that takes a good percentage of time, and you fix it.
The keyword is find, as opposed to guess.
Too many people just eyeball the code, and fix what they think will help, and often but not always it does help, some.
That's guesswork.
To get real speedup, you need to find all the problems, not just the few you can guess.
If your program is doing new, then chances are at some point that will be what you need to fix.
But it's not the only thing.
Here's the theory behind it.
For high-performance trading engines at good HFT shops, avoiding new/malloc in C++ code is a basic.