What is the difference between these two method calls in C#?

For example I have some class with one void method in it.
This is my class:
class MyClassTest
{
public void Print()
{
Console.WriteLine("Hello");
}
}
I am new to classes and a little confused. Is there a difference between these two method calls?
Here is my main method
static void Main(string[] args)
{
//first call
MyClassTest ms = new MyClassTest();
ms.Print();
//second call
new MyClassTest().Print();
}

Use the form below when you want to keep a reference to the constructed object and perform further operations on it later:
MyClassTest ms = new MyClassTest();
ms.Print();
Use the form below, by contrast, when you no longer care about the object after construction and are only interested in calling the method Print:
new MyClassTest().Print();
The subtle difference between the two is object lifetime: an object that is still referenced and used for further operations will most likely be destroyed later than one that is no longer referenced. In the second example the object has no remaining references once Print returns, so the GC (garbage collector) is free to reclaim it.

There's no difference, actually. Use the first form when you need to refer to the MyClassTest instance later in your program. Use the second form as fire-and-forget. If you plan to use the second form heavily, consider making the Print method static.
The IL shows no difference, except that the version with the variable (the WithInstance method) stores the reference in a local and loads it back onto the stack (the stloc.0 and ldloc.0 IL instructions):
MyClassTest ms = new MyClassTest();
ms.Print();
new MyClassTest().Print();

Your two calls perform the same semantic operation in C#. The difference is that your first call creates the ms variable, which suggests to the reader that you intend to use it again in your code; in fact you do, by calling ms.Print() afterwards.
Your second call does not declare any variable. This signals that your intent is to call the Print method on a brand new MyClassTest instance exactly once, and that you don't care about the instance you just created.
Side note: when compiling in release mode, the C# compiler will compact and reduce variable usage, therefore your two calls will compile the same and they will be like your second call.

In this particular case, no.
Any time you call a method on a result of another method call, new, property access, etc. as per:
new MyClassTest().Print();
It's akin to if you did:
var temp = new MyClassTest();
temp.Print();
So in this case your two examples are the same.
There are some variants where they differ.
One would be value-type objects that are accessed from array or field access. Here the access might use the address of the actual object rather than making a copy. It's also possible for the opposite to happen, where instead of an implicit temporary local being created an explicit local is removed, but it's not promised. Note that with mutable value-types the code with and without a temporary local is also not semantically the same for these cases (but it is for a case closer to your example, where the object is the result of a method call that wasn't a ref return to a ref variable).
Another is when it is inside a yield-using or async method. Here the locals in your method become fields in an object produced (which either implements the IEnumerable<T> and/or IEnumerator<T> for yield or the Task for async) while the "invisible" temporary locals I described above do not. (The compiler could, and likely will in the future, do a better job at getting rid of some of these that don't exist after yield or async calls and therefore don't really have to be fields, but for the time being all the locals become fields).
So there are a few times when explicit locals with a single operation on them are slightly different to doing the operation directly on the means by which you obtained the value, though your example is not one of those times.
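To make the value-type case concrete, here is a minimal sketch. The Counter struct and the TempLocalDemo class are hypothetical names invented for this illustration; they are not from the question. It shows that calling a mutating method directly on an array element changes the element, while going through an explicit local only changes a copy.

```csharp
using System;

// Hypothetical mutable value type, used only for illustration.
public struct Counter
{
    public int Value;
    public void Increment() { Value++; }
}

public static class TempLocalDemo
{
    // Calls Increment directly on the array element, so the element itself is mutated.
    public static int IncrementInPlace(Counter[] counters)
    {
        counters[0].Increment();
        return counters[0].Value;
    }

    // Copies the element into an explicit local first, so only the copy is mutated.
    public static int IncrementViaLocal(Counter[] counters)
    {
        Counter temp = counters[0];
        temp.Increment();
        return counters[0].Value;
    }

    public static void Main()
    {
        Console.WriteLine(IncrementInPlace(new Counter[1]));  // prints 1: the element changed
        Console.WriteLine(IncrementViaLocal(new Counter[1])); // prints 0: the element is untouched
    }
}
```

With a reference type such as MyClassTest both forms would act on the same object, which is why the two calls in the question behave identically.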

Related

Why does this lambda closure generate garbage although it is not executed at runtime?

I've noticed that the following code generates heap allocations which trigger the garbage collector at some point and I would like to know why this is the case and how to avoid it:
private Dictionary<Type, Action> actionTable = new Dictionary<Type, Action>();
private void Update(int num)
{
Action action;
// if (!actionTable.TryGetValue(typeof(int), out action))
if (false)
{
action = () => Debug.Log(num);
actionTable.Add(typeof(int), action);
}
action?.Invoke();
}
I understand that using a lambda such as () => Debug.Log(num) will generate a small helper class (e.g. <>c__DisplayClass7_0) to hold the local variable. This is why I wanted to test if I could cache this allocation in a dictionary. However, I noticed, that the call to Update leads to allocations even when the lambda code is never reached due to the if-statement. When I comment out the lambda, the allocation disappears from the profiler. I am using the Unity Profiler tool (a performance reporting tool within the Unity game engine) which shows such allocations in bytes per frame while in development/debug mode.
I surmise that the compiler or JIT compiler generates the helper class for the lambda for the scope of the method even though I don't understand why this would be desirable.
Finally, is there any way of caching delegates in this manner without allocating and without forcing the calling code to cache the action in advance? (I do know, that I could also allocate the action once in the client code, but in this example I would strictly like to implement some kind of automatic caching because I do not have complete control over the client).
Disclaimer: This is mostly a theoretical question out of interest. I do realize that most applications will not benefit from micro-optimizations like this.
Servy's answer is correct and gives a good workaround. I thought I might add a few more details.
First off: implementation choices of the C# compiler are subject to change at any time and for any reason; nothing I say here is a requirement of the language and you should not depend on it.
If you have a closed-over outer variable of a lambda then all closed-over variables are made into fields of a closure class, and that closure class is allocated from the long-term pool ("the heap") as soon as the function is activated. This happens regardless of whether the closure class is ever read from.
The compiler team could have chosen to defer creation of the closure class until the first point where it was used: where a local was read or written or a delegate was created. However, that would then add additional complexity to the method! That makes the method larger, it makes it slower, it makes it more likely that you'll have a cache miss, it makes the jitter work harder, it makes more basic blocks so the jitter might skip an optimization, and so on. This optimization likely does not pay for itself.
However, the compiler team does make similar optimizations in cases where it is more likely to pay off. Two examples:
The 99.99% likely scenario for an iterator block (a method with a yield return in it) is that the IEnumerable will have GetEnumerator called exactly once. The generated enumerable therefore has logic that implements both IEnumerable and IEnumerator; the first time GetEnumerator is called, the object is cast to IEnumerator and returned. The second time, we allocate a second enumerator. This saves one object in the highly likely scenario, and the extra code generated is pretty simple and rarely called.
It is common for async methods to have a "fast path" that returns without ever awaiting -- for example, you might have an expensive asynchronous call the first time, and then the result is cached and returned the second time. The C# compiler generates code that avoids creating the "state machine" closure until the first await is encountered, and therefore prevents an allocation on the fast path, if there is one.
These optimizations tend to pay off, but 99% of the time when you have a method that makes a closure, it actually makes the closure. It's not really worth deferring it.
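As a rough illustration of the behaviour described above, the following hand-written sketch approximates what the compiler does with the Update method from the question. It is an assumption-laden approximation, not actual compiler output: the Closure class, the Allocations counter and the createDelegate parameter are all invented here, and Console.WriteLine stands in for Unity's Debug.Log.

```csharp
using System;

public static class ClosureSketch
{
    // Hypothetical stand-in for the compiler-generated closure class
    // (the real one gets a generated name such as <>c__DisplayClass7_0).
    private sealed class Closure
    {
        public int num;
        public void Invoke() { Console.WriteLine(num); }
    }

    // Counts closure objects; exists only to make the allocations visible.
    public static int Allocations;

    // Approximation of the lowered Update(int num) method.
    public static void Update(int num, bool createDelegate)
    {
        var closure = new Closure(); // allocated on entry, before any branch runs
        Allocations++;
        closure.num = num;           // the captured parameter now lives in the closure

        Action action = null;
        if (createDelegate)
        {
            action = closure.Invoke; // the delegate itself is only allocated here
        }
        action?.Invoke();
    }

    public static void Main()
    {
        Update(1, createDelegate: false);
        Update(2, createDelegate: false);
        Console.WriteLine(Allocations); // prints 2: one closure per call, although no delegate was created
    }
}
```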
I surmise that the compiler or JIT compiler generates the helper class for the lambda for the scope of the method even though I don't understand why this would be desirable.
Consider the case where there's more than one anonymous method with a closure in the same method (a common enough occurrence). Do you want to create a new instance for every single one, or just have them all share a single instance? They went with the latter. There are advantages and disadvantages to either approach.
Finally, is there any way of caching delegates in this manner without allocating and without forcing the calling code to cache the action in advance?
Simply move that anonymous method into its own method, so that when that method is called the anonymous method is created unconditionally.
private void Update(int num)
{
Action action = null;
// if (!actionTable.TryGetValue(typeof(int), out action))
if (false)
{
Action CreateAction()
{
return () => Debug.Log(num);
}
action = CreateAction();
actionTable.Add(typeof(int), action);
}
action?.Invoke();
}
(I didn't check if the allocation happened for a nested method. If it does, make it a non-nested method and pass in the int.)
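The non-nested variant mentioned in the note above might look like the sketch below. The ActionCache class name is made up for this example, and Console.WriteLine replaces Debug.Log so the code runs outside Unity. Because Update itself no longer contains a lambda, the closure should only be allocated when CreateAction is actually called; this is worth confirming in the profiler rather than taking on trust.

```csharp
using System;
using System.Collections.Generic;

public class ActionCache
{
    private readonly Dictionary<Type, Action> actionTable = new Dictionary<Type, Action>();

    public void Update(int num)
    {
        Action action;
        if (!actionTable.TryGetValue(typeof(int), out action))
        {
            // The closure is created inside CreateAction, so only on a cache miss.
            action = CreateAction(num);
            actionTable.Add(typeof(int), action);
        }
        action.Invoke();
    }

    // Non-nested method: the lambda captures this method's own parameter,
    // so Update has nothing to hoist into a closure class.
    private static Action CreateAction(int num)
    {
        return () => Console.WriteLine(num);
    }

    public static void Main()
    {
        var cache = new ActionCache();
        cache.Update(1); // prints 1 and caches the delegate
        cache.Update(2); // prints 1 again: the cached delegate captured the first value
    }
}
```

Note that, as in the original code, the cached delegate keeps the value of num from the first call.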

Passing Variables into a class once as setup vs passing it multiple times? C#

Right, so I'm looking for a performance impact/code style answer here so this might not be the right place to ask.
I've got a DataRow 'reader' set up from a Database result and then I extract the values into an object. Since for each variable I want to check for null values and return a 'failure' value if it is null I have to perform the check:
try
{
if (reader[property] != DBNull.Value)
{
var = reader[property];
}
else
{
var = failureValue;
}
}
catch (ArgumentException e)
{
// DISPLAY ERROR ABOUT NO VALUE CALLED Property
var = failureValue;
}
catch (Exception e)
{
// DISPLAY GENERIC ERROR ABOUT VALUE
var = failureValue;
}
In my current situation I perform that check 47 times in one class, so I moved the code into a method and then called it. It's important to note that I put it in a superclass for my data access classes so every one of them could use it.
My question now, when I created this method I originally passed in the full DataRow 'reader' every time the method was called. Should I make a static variable within the superclass and set it when I set up the 'reader'? Thereby, allowing the method to access a static variable within itself and not having to perform the passing of the full reader every time? Or is it slower to change it over like this thread suggests?
EDIT: As both the first answer and comment have questioned, the variable does need to be static as the superclass is never initialised itself, just inherited from.
From a performance perspective it makes no difference at all (maybe some nanoseconds, but you won't care about that) whether you pass the reader again and again or store it in a variable. In contrast to what's said in the post you've mentioned, there is no difference in where an object is stored between declaring it in a method and declaring it in the class itself. Whether it is stored on the stack or the heap is determined by other means.
However from an API-perspective you should ask yourself if the dependency (in your case the reader) is needed for all members (or at least a few) of your class or just one single. I suggest passing the dependencies where you really need them.
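A minimal sketch of passing the row where it is needed could look like this. DataAccessBase, PersonAccess, GetValueOrDefault and the column names are hypothetical, chosen only for the example. It also checks for the column up front instead of catching ArgumentException, which is a design choice of this sketch rather than something the question requires.

```csharp
using System;
using System.Data;

// Hypothetical base class for the data access classes.
public abstract class DataAccessBase
{
    // Returns the column value, or failureValue when the column is missing or holds DBNull.
    protected static T GetValueOrDefault<T>(DataRow row, string property, T failureValue)
    {
        if (!row.Table.Columns.Contains(property))
        {
            Console.Error.WriteLine("No column named '" + property + "'.");
            return failureValue;
        }
        return row.IsNull(property) ? failureValue : (T)row[property];
    }
}

// Hypothetical derived class that reads two columns through the shared helper.
public sealed class PersonAccess : DataAccessBase
{
    public static string ReadName(DataRow row) { return GetValueOrDefault(row, "Name", "(unknown)"); }
    public static int ReadAge(DataRow row) { return GetValueOrDefault(row, "Age", -1); }
}

public static class Program
{
    public static void Main()
    {
        var table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Age", typeof(int));
        DataRow row = table.Rows.Add("Ada", DBNull.Value);

        Console.WriteLine(PersonAccess.ReadName(row)); // prints Ada
        Console.WriteLine(PersonAccess.ReadAge(row));  // prints -1 because Age is DBNull
    }
}
```

The row is an ordinary parameter, so nothing outlives the call and no static state is shared between data access classes.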
Another approach is to set the dependency through a setter before calling your method for the first time:
var instance = new MyType();
instance.SetReader(reader);
for(int i = 0; i < 100; i++) instance.DoSomething();
As per your EDIT: just because your base class isn't instantiated on its own doesn't mean all its members should be static. Actually, your base class is instantiated whenever you create a new instance of one of its child classes (which can be converted to the base class, after all). So you can just use an ordinary instance member that is inherited by all deriving classes.
You may want to have a look at dependency injection.
You should pass instances rather than using a class-level, static variable.
The reason is because you want to maintain tight control over the scope and thus lifetime of your variables. You don't want them to exist any longer than they need to or be available outside of the scope within which they're needed.
Let's leave aside for the moment the fact that a data reader -- e.g. SqlDataReader -- implements the disposable pattern; you would still be making your application unnecessarily complicated. It may not seem to matter that much for a small application with relatively few classes, methods, and variables, but small applications have a habit of growing into larger, more-complex ones.
As I alluded to above, data readers like SqlDataReader implement IDisposable, which means that the way you are supposed to use instances of them is within a using statement:
using (var reader = command.ExecuteReader())
{
// ... do stuff
}
That way, the reader is guaranteed to be disposed when the block exits, even in the event of an exception. What's important to note is that the variable reader only exists within the scope of the using block. Narrower scope not only makes it easier to read your code and know what's happening where and when, it also means your objects become eligible for garbage collection as soon as they are no longer needed.
Think about it: if you had a static, class-level variable, when would the object it refers to be garbage collected? Not until the application exits, unless you remember to clear the variable. That is very hard to debug in cases where you're having memory issues.

Is this double instantiation harmful, or simply unnecessary?

While perusing the legacy source, I found this:
DataSet myUPC = new DataSet();
myUPC = dbconn.getDataSet(dynSQL);
ReSharper rightly "grays out" the "new DataSet()" part of it, and recommends, "Remove redundant initializer," but is it as innocuous as that? Does the compiler simply dispose of the first instance just prior to the second assignment? IOW, is the first assignment simply unnecessary, or is it potentially harmful?
Does the compiler simply dispose of the first instance just prior to the second assignment?
No, there's no automatic disposal here.
IOW, is the first assignment simply unnecessary, or is it potentially harmful?
It's harmful in two small ways:
It makes more work for both the initialization code and the garbage collector. It's unlikely to be significant, but it's there. If the constructor acquired some native resource that could be more serious.
It makes your code look like it wants to do something it doesn't actually want to do. You don't want to create a new empty DataSet, so why do so?
Just initialize the variable with the value you really want:
DataSet myUPC = dbconn.getDataSet(dynSQL);
Now your code shows exactly what you want to do. (I would fix the method name so that it follows .NET naming conventions, mind you.)
It's usually just unnecessary.
It would only be actively harmful if the DataSet constructor initiated some long running background thread or allocated a huge amount of memory which would stay around until the redundant object was garbage collected, which isn't instantaneous.
A well-mannered constructor shouldn't do these things, so you're probably safe. Still, I would take note and fix the code whenever I saw this: as Jon Skeet points out, it makes your code do unnecessary work creating and discarding an object you have no intention of using, and it looks like you're missing some code.
IOW, is the first assignment simply unnecessary, or is it potentially harmful?
The first assignment is unnecessary, but also potentially harmful, depending on the type. The first instance will become eligible for GC, but still get initialized (for no reason) and never used.
It'll hang around until it's garbage collected, as there are no other references to it. However, if the constructor has side effects (presumably DataSet's constructor doesn't), it could also be harmful.
myUPC is overwritten with the output of dbconn.getDataSet(). This is because getDataSet() is a factory method that returns an object of type DataSet.
Yes, as indicated by the other answers it is actually somewhat harmful, mostly because it allocates an object that is never used and must be garbage collected eventually. But let me go into a bit more detail.
Could the first instance be disposed?
DataSet myUPC = new DataSet();
myUPC = dbconn.getDataSet(dynSQL);
Does the compiler simply dispose of the first instance just prior to the second assignment?
Assuming you mean by dispose the GC (garbage collector) collecting the new unused instance, then the answer is: no. Let me elaborate:
The GC may run at any time it pleases, for example when the heap will soon be full, or when trying to allocate an object that won't fit in the heap's remaining space. So the GC may also (just by chance) run exactly between your first and your second statement. However, this will not collect your new DataSet() object because there is a reference to it in the local variable myUPC. Objects are only considered for collection when there are no references to them (see note 1).
1) Actually, objects are only considered for collection when there is no chain of references from a so-called root to the object. Roots include static fields, method arguments, local variables and evaluation stacks.
Could the constructor call be optimized away?
DataSet myUPC; /* Optimized away? */
myUPC = dbconn.getDataSet(dynSQL);
Also, the Just-In-Time compiler can't simply optimize the constructor call away because it may influence things other than the object being initialized (i.e. have side effects). For example, if the compiler optimized the constructor call away in the code below, the constructor would not print anything on the console. This is not desired or expected, and therefore the constructor call has to stay in there and result in a new instance.
class MyClass
{
public MyClass()
{
Console.WriteLine("Constructor called!");
}
}
abstract class X
{
void Do()
{
MyClass my = new MyClass(); // Should always print "Constructor called!"
my = GetMyClass();
// ...
}
protected abstract MyClass GetMyClass();
}

Performance and static method versus public method

I have a helper method that takes a begin date and an end date and through certain business logic yields an integer result. This helper method is sometimes called in excess of 10,000 times for a given set of data (though this doesn't occur often).
Question:
Considering performance only, is it more efficient to make this helper method as a static method to some helper class, or would it be more gainful to have the helper method as a public method to a class?
Static method example:
// an iterative loop
foreach (var result in results) {
int daysInQueue = HelperClass.CalcDaysInQueue(dtBegin, dtEnd);
}
Public member method example:
// an iterative loop
HelperClass hc = new HelperClass();
foreach (var result in results) {
int daysInQueue = hc.CalcDaysInQueue(dtBegin, dtEnd);
}
Thanks in advance for the help!
When you call an instance method the compiler always invisibly passes one extra parameter, available inside that method under the name this. Static methods are not called on behalf of any object, so they don't have a this reference.
I see a few benefits of marking utility methods as static:
small performance improvement, you don't pay for a reference to this which you don't really use. However I doubt you will ever see the difference.
convenience - you can call static method wherever and whenever you want, the compiler is not forcing you to provide an instance of an object, which is not really needed for that method
readability: instance method should operate on instance's state, not merely on parameters. If it's an instance method not needing an instance to work, it's confusing.
The difference in performance here is effectively nothing. You will have a hard time actually measuring the difference in time (and getting over the "noise" of other stuff going on with your CPU), that's how small it will be.
Unless you happen to go and perform a whole bunch of database queries or read in several gigabytes of info from files in the constructor of the object (I'm assuming here that it's just empty), it will have a fairly small cost, and since it's out of the loop it doesn't scale at all.
You should be making this decision based on what logically makes sense, not based on performance, until you have a strong reason to believe that there is a significant, and necessary performance gain to be had by violating standard practices/readability/etc.
In this particular case your operation is logically 'static'. There is no state that is used, so there is no need to have an instance of the object, as such the method should be made static. Others have said that it might perform better, which is very possibly true, but that shouldn't be why you make it static. If the operation logically made sense as an instance method you shouldn't try to force it into a static method just to try to get it to run faster; that's learning the wrong lesson here.
Just benchmark it :) In theory a static method should be faster since it leaves out the virtual call overhead, but this overhead might not be significant in your case (but I'm not even sure what language the example is in). Just time both loops with a large enough number of iterations for it to take a minute or so and see for yourself. Just make sure you use non-trivial data so your compiler doesn't optimize the calls out.
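A rough timing harness along those lines is sketched below. The method bodies are placeholders for the real business logic, the static variant was renamed CalcDaysInQueueStatic so both can live in one class, and the iteration count is arbitrary. Treat the numbers as indicative only; a dedicated benchmarking library would give more trustworthy results.

```csharp
using System;
using System.Diagnostics;

public class HelperClass
{
    // Placeholder logic; the real business rules are not shown in the question.
    public static int CalcDaysInQueueStatic(DateTime begin, DateTime end) { return (end - begin).Days; }
    public int CalcDaysInQueue(DateTime begin, DateTime end) { return (end - begin).Days; }
}

public static class CallBenchmark
{
    public static void Main()
    {
        const int iterations = 10000000;
        DateTime begin = new DateTime(2020, 1, 1);
        long sum = 0;

        // Time the static call.
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sum += HelperClass.CalcDaysInQueueStatic(begin, begin.AddDays(i % 365));
        }
        sw.Stop();
        Console.WriteLine("static:   " + sw.ElapsedMilliseconds + " ms");

        // Time the instance call; the instance is created once, outside the loop.
        HelperClass hc = new HelperClass();
        sw.Restart();
        for (int i = 0; i < iterations; i++)
        {
            sum += hc.CalcDaysInQueue(begin, begin.AddDays(i % 365));
        }
        sw.Stop();
        Console.WriteLine("instance: " + sw.ElapsedMilliseconds + " ms");

        // Use the accumulated result so the calls cannot be optimized away.
        Console.WriteLine(sum);
    }
}
```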
Based on my understanding, it would be more beneficial for performance to make it a static method. This means that there isn't an instance of the object created, although the performance difference would be negligible, I think. That is the case if there isn't some data that has to be recreated every time you call the static function, which could be stored in the class object.
You say 'considering performance only'. In that case you should fully focus on what's inside
HelperClass.CalcDaysInQueue(dtBegin, dtEnd);
And not on the 0.0001% of runtime spent in calling that routine. If it's a short routine the JIT compiler will inline it anyway and in that case there will be NO performance difference between the static and instance method.

Automatically calling method after code block

I'm adding the notion of actions that are repeatable after a set time interval in my game.
I have a class that manages whether a given action can be performed.
Callers query whether they can perform the action by calling CanDoAction, then if so, perform the action and record that they've done the action with MarkActionDone.
if (WorldManager.CanDoAction(playerControlComponent.CreateBulletActionId))
{
// Do the action
WorldManager.MarkActionDone(playerControlComponent.CreateBulletActionId);
}
Obviously this could be error prone, as you could forget to call MarkActionDone, or possibly you could forget to call CanDoAction to check.
Ideally I want to keep a similar interface, without having to pass around Actions or anything like that, as I'm running on the Xbox and would prefer to avoid passing actions around and invoking them. Particularly as there would have to be a lot of closures involved, since the actions are typically dependent on surrounding code.
I was thinking of somehow (ab)using the IDisposable interface, as that would ensure MarkActionDone is called at the end. However, I don't think I can skip the using block if CanDoAction would be false.
Any ideas?
My preferred approach would be to keep this logic as an implementation detail of WorldManager (since that defines the rules about whether an action can be performed), using a delegate pattern:
public class WorldManager
{
public bool TryDoAction(ActionId actionId, Action action)
{
if (!this.CanDoAction(actionId)) return false;
try
{
action();
return true;
}
finally
{
this.MarkActionDone(actionId);
}
}
private bool CanDoAction(ActionId actionId) { ... }
private void MarkActionDone(ActionId actionId) { ... }
}
This seems to fit best with SOLID principles, since it avoids any other class having to 'know' about the 'CanDoAction', 'MarkActionDone' implementation detail of WorldManager.
Update
Using an AOP framework, such as PostSharp, may be a good choice to ensure this aspect is added to all necessary code blocks in a clean manner.
If you want to minimize GC pressure, I would suggest using interfaces rather than delegates. If you use IDisposable, you can't avoid having Dispose called, but you could have the IDisposable implementation use a flag to indicate that the Dispose method shouldn't do anything. Beyond the fact that delegates have some built-in language support, there isn't really anything they can do that interfaces cannot, but interfaces offer two advantages over delegates:
Using a delegate which is bound to some data will generally require creating a heap object for the data and a second for the delegate itself. Interfaces don't require that second heap instance.
In circumstances where one can use generic types which are constrained to an interface, instead of using interface types directly, one may be able to avoid creating any heap instances, as explained below (since back-tick formatting doesn't work in list items). A struct that combines a delegate to a static method along with data to be consumed by that method can behave much like a delegate, without requiring a heap allocation.
One caveat with the second approach: Although avoiding GC pressure is a good thing, the second approach may end up creating a very large number of types at run-time. The number of types created will in most cases be bounded, but there are circumstances where it could increase without bound. I'm not sure if there would be any convenient way to determine the full set of types that could be produced by a program in cases where static analysis would be sufficient. (In the general case, where static analysis does not suffice, determining whether any particular run-time type would get produced would be equivalent to the Halting Problem. But just as many programs can in practice be statically determined to always halt or never halt, I would expect that in practice one could often identify a closed set of types that a program could produce at run-time.)
Edit
The formatting in point #2 above was messed up. Here's the explanation, cleaned up.
Define a type ConditionalCleaner<T> : IDisposable, which holds an instance of T and an Action<T> (both supplied in the constructor, probably with the Action<T> as the first parameter). In the IDisposable.Dispose() method, if the Action<T> is non-null, invoke it on the T. In a SkipDispose() method, null out the Action<T>.
For convenience, you may want to also define ConditionalCleaner<T,U> : IDisposable similarly (perhaps three- and four-argument versions as well), and you may want to define a static class ConditionalCleaner with generic Create<T>, Create<T,U>, etc. methods. One could then say e.g. using (var cc = ConditionalCleaner.Create(Console.WriteLine, "ABCDEF")) {...} or ConditionalCleaner.Create((x) => {Console.WriteLine(x);}, "ABCDEF") to have the indicated action performed when the using block exits.
The biggest requirement if one uses a lambda expression is to ensure that the lambda expression doesn't close over any local variables or parameters from the calling function; anything the calling function wants to pass to the lambda expression must be an explicit parameter thereof. Otherwise the system will define a class object to hold any closed-over variables, as well as a new delegate pointing to it.
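The description above could be sketched roughly as follows, for the single-argument case only. WorldManagerSketch and TryFire are hypothetical, and its CanDoAction rule (each action may run once) is a stand-in for the game's real timing rule. The cleaner here is a class, so it still costs one heap object per using block.

```csharp
using System;
using System.Collections.Generic;

public sealed class ConditionalCleaner<T> : IDisposable
{
    private Action<T> cleanup;
    private readonly T state;

    public ConditionalCleaner(Action<T> cleanup, T state)
    {
        this.cleanup = cleanup;
        this.state = state;
    }

    // After this call, Dispose does nothing.
    public void SkipDispose() { cleanup = null; }

    public void Dispose()
    {
        if (cleanup != null) cleanup(state);
    }
}

public static class ConditionalCleaner
{
    // Factory method so that T can be inferred from the arguments.
    public static ConditionalCleaner<T> Create<T>(Action<T> cleanup, T state)
    {
        return new ConditionalCleaner<T>(cleanup, state);
    }
}

// Hypothetical stand-in for the game's WorldManager.
public static class WorldManagerSketch
{
    public static readonly List<int> Done = new List<int>();

    public static bool CanDoAction(int actionId) { return !Done.Contains(actionId); }
    public static void MarkActionDone(int actionId) { Done.Add(actionId); }

    public static bool TryFire(int actionId)
    {
        // The lambda takes its input as a parameter and captures nothing.
        using (var cc = ConditionalCleaner.Create(id => MarkActionDone(id), actionId))
        {
            if (!CanDoAction(actionId))
            {
                cc.SkipDispose(); // nothing to record when the action is not allowed
                return false;
            }
            // Do the action here.
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(TryFire(7)); // prints True and records the action on exit
        Console.WriteLine(TryFire(7)); // prints False and records nothing
    }
}
```

Because the action id is passed as the explicit state argument, the calling code does not need a closure over its locals.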
