Are static methods always held in memory? - c#

My whole development team thinks, static methods are a terrible thing to use.
I really don't see any disadvantages in some cases. When I needed a stateless method before, I always used static methods for that purpose.
I agree with some of their points, e.g. I know they are quite hard to test (although it's not impossible).
What I don't get is, that they claim, static methods are always held in memory and will fill the basic memory usage. So, if you are using 100 static methods in your program, when the program starts all methods are loaded into memory and will fill the memory unnecessarily. Furthermore static methods increase the risk of memory leaks.
Is that true?
It's quite inconvenient to have to create a new instance of a class just to call the method. But thats how they do it right now, the create a instance in mid of a method and call that method, that could be just a static one.

There is no distinction between static and instance methods as far as memory is concerned. Instance methods only have an extra argument, this. Everything else is exactly the same. Also the basic way in which extension methods were easy to add to C#, all that was needed was syntax to expose that hidden this argument.
Methods occupy space for their machine code, the actual code that the processor executes. And a table that describes how the method stores objects, that helps the garbage collector to discover object roots held in local variables and CPU registers. This space is taken from the "loader heap", an internal data structure that the CLR creates that is associated with the AppDomain. Happens just once, when the method first executes, just-in-time. Releasing that space requires unloading that appdomain. Static variables are also allocated in the loader heap.
Do not throw away the big advantage of static methods. They can greatly improve the readability and maintainability of code. Thanks to their contract, they cannot alter the object state. They can therefore have very few side-effects, makes it really easy to reason about what they do. However, if they make you add static variables then they do the exact opposite, global variables are evil.

It's quite inconvenient to have to create a new instance of a class just to call the method. But thats how they do it right now
They are being ridiculous, to be blunt.
As commenter Groo has pointed out, at the native compiled level, an instance method isn't even all that different from a static method. It's just that there's an implicit parameter being passed to the instance method, while with the static method "what you see is what you get".
The runtime may optimize access to a method. It may not JIT-compile the method until the first time it's executed. It may not even load the IL from your assembly into memory until the IL is actually needed. But it can perform these optimizations with static and instance methods equally well.
In fact, forcing all methods to be instance methods is worse than using static method, because it means that for some methods, one is arbitrarily creating an otherwise-useless object. While the runtime may be able to detect the object reference is unused and so cause it to have a minimum lifespan, it can't avoid allocating the object altogether, even a degenerate object will take up some memory while it's alive, and it will add to the cost of garbage collection.
And beyond all that, let's suppose for a moment your colleagues were correct. What would you have gained? Any measurable performance difference at all? Doubtful. Let's face it: managed code, and especially C#, has the potential to hide any number of performance-draining implementation details from us. The main reason we use a managed code language like C# is that we gain so much in productivity and code-correctness, that these possible inefficiencies are well worth it. Most of the time, and especially on modern computers, they are practically invisible.
As it happens, your colleagues are not correct, and it is not in any way beneficial to make into instance methods, methods that otherwise could be static methods. But even if that wasn't the case, to spend time writing obfuscated, unexpressive code to avoid some unmeasured, unproven performance cost is a waste. (And I know no one bothered to compare the actual performance difference, because if they had, they'd have found no improvement in performance by eliminating all static methods).

Related

Should variables be reused to optimize resource utilization?

I am Using Microsoft Visual C# 2010. I have several methods that use a large bitmap for local processing, and each method can be called several times.I can declare a global variable and reuse it:
Bitmap workPic, editPic;
...
void Method1() {
workPic = new Bitmap(editPic);
...
}
void Method2() {
workPic = new Bitmap(editPic.Width * 2, editPic.Height * 2);
...
}
or declare a local variable in each method:
Bitmap editPic;
...
void Method1() {
Bitmap workPic = new Bitmap(editPic);
...
}
void Method2() {
Bitmap workPic = new Bitmap(editPic.Width * 2, editPic.Height * 2);
...
}
The second way is better for code clarity (local variables for local use). Is there a difference in terms of resource utilization?
If you intend to keep the memory allocated to you can use workPic again after the method, you should register it as class variable. If not, you can free memory (always a good idea) by letting it go out of scope.
Allocating one variable doesn't matter much to the framework which manages memory. Only if you recreate a variable inside a tight loop you may benefit by reusing the variable. If you have basic types, you even reuse the same memory. Else, only the reference to the allocated memory is kept, so not that much benefit you have from there.
Note it is very important to Dispose your workPic since now you have a memory leak in the unmanaged memory behind Bitmap. Preferably use using.
Why Global Variables Should Be Avoided When Unnecessary
Non-locality -- Source code is easiest to understand when the scope of its individual elements are limited. Global variables can be
read or modified by any part of the program, making it difficult to
remember or reason about every possible use.
No Access Control or Constraint Checking -- A global variable can be get or set by any part of the program, and any rules regarding its
use can be easily broken or forgotten. (In other words, get/set
accessors are generally preferable over direct data access, and this
is even more so for global data.) By extension, the lack of access
control greatly hinders achieving security in situations where you may
wish to run untrusted code (such as working with 3rd party plugins).
Implicit coupling -- A program with many global variables often has tight couplings between some of those variables, and couplings
between variables and functions. Grouping coupled items into cohesive
units usually leads to better programs.
Concurrency issues -- if globals can be accessed by multiple threads of execution, synchronization is necessary (and too-often
neglected). When dynamically linking modules with globals, the
composed system might not be thread-safe even if the two independent
modules tested in dozens of different contexts were safe.
Namespace pollution -- Global names are available everywhere. You may unknowingly end up using a global when you think you are using a
local (by misspelling or forgetting to declare the local) or vice
versa. Also, if you ever have to link together modules that have the
same global variable names, if you are lucky, you will get linking
errors. If you are unlucky, the linker will simply treat all uses of
the same name as the same object.
Memory allocation issues -- Some
environments have memory allocation schemes that make allocation of
globals tricky. This is especially true in languages where
"constructors" have side-effects other than allocation (because, in
that case, you can express unsafe situations where two globals
mutually depend on one another). Also, when dynamically linking
modules, it can be unclear whether different libraries have their own
instances of globals or whether the globals are shared.
Testing and Confinement - source that utilizes globals is somewhat more difficult to test because one cannot readily set up a 'clean'
environment between runs. More generally, source that utilizes global
services of any sort (e.g. reading and writing files or databases)
that aren't explicitly provided to that source is difficult to test
for the same reason. For communicating systems, the ability to test
system invariants may require running more than one 'copy' of a system
simultaneously, which is greatly hindered by any use of shared
services - including global memory - that are not provided for sharing
as part of the test.
reference: http://c2.com/cgi/wiki?GlobalVariablesAreBad
The main thing to understand here is that field and variable is only holding a reference, the memory will be allocated to the object(s) created by "new".
So in both cases all created bitmap objects need to go through garbage collection.
Difference is that object only referenced in method will be ready to be collected right after method execution, when the object which still have reference in a field will be ready to be collected only when the object containing the field also will be ready to be collected.
The only case when it make sense to introduce the field is when you have the same object reused through the life cycle of the host object.
In cases when you recreate object in the beginning of the method definitely variable is recommended.

Why don't all member variables need volatile for thread safety even when using Monitor? (why does the model really work?)

(I know they don't but I'm looking for the underlying reason this actually works without using volatile since there should be nothing preventing the compiler from storing a variable in a register without volatile... or is there...)
This question stems from the discord of thought that without volatile the compiler (can optimize in theory any variable in various ways including storing it in a CPU register.) While the docs say that is not needed when using synchronization like lock around variables. But there is really in some cases seemingly no way the compiler/jit can know you will use them or not in your code path. So the suspicion is something else is really happening here to make the memory model "work".
In this example what prevents the compiler/jit from optimizing _count into a register and thus having the increment done on the register rather then directly to memory (later writing to memory after the exit call)? If _count were volatile it would seem everything should be fine, but a lot of code is written without volatile. It makes sense the compiler could know not to optimize _count into a register if it saw a lock or synchronization object in the method.. but in this case the lock call is in another function.
Most documentation says you don't need to use volatile if you use a synchronization call like lock.
So what prevents the compiler from optimizing _count into a register and potentially updating just the register within the lock? I have a feeling that most member variables won't be optimized into registers for this exact reason as then every member variable would really need to be volatile unless the compiler could tell it shouldn't optimize (otherwise I suspect tons of code would fail). I saw something similar when looking at C++ years ago local function variables got stored in registers, class member variables did not.
So the main question is, is it really the only way this possibly works without volatile that the compiler/jit won't put class member variables in registers and thus volatile is then unnecessary?
(Please ignore the lack of exception handling and safety in the calls, but you get the gist.)
public class MyClass
{
object _o=new object();
int _count=0;
public void Increment()
{
Enter();
// ... many usages of count here...
count++;
Exit();
}
//lets pretend these functions are too big to inline and even call other methods
// that actually make the monitor call (for example a base class that implemented these)
private void Enter() { Monitor.Enter(_o); }
private void Exit() { Monitor.Exit(_o); } //lets pretend this function is too big to inline
// ...
// ...
}
Entering and leaving a Monitor causes a full memory fence. Thus the CLR makes sure that all writing operations before the Monitor.Enter / Monitor.Exit become visible to all other threads and that all reading operations after the method call "happen" after it. That also means that statements before the call cannot be moved after the call and vice versa.
See http://www.albahari.com/threading/part4.aspx.
The best guess answer to this question would appear to be that that any variables that are stored in CPU registers are saved to memory before any function would be called. This makes sense because compiler design viewpoint from a single thread would require that, otherwise the object might appear to be inconsistent if it were used by other functions/methods/objects.
So it may not be so much as some people/articles claim that synchronization objects/classes are detected by the compilers and non-volatile variables are made safe through their calls. (Perhaps they are when a lock is used or other synchronization objects in the same method, but once you have calls in another method that calls those synchronization objects probably not), instead it is likely that just the fact of calling another method is probably enough to cause the values stored in CPU registers to be saved to memory. Thus not requiring all variables to be volatile.
Also I suspect and others have suspected too that fields of a class are not optimized as much due to some of the threading concerns.
Some notes (my understanding):
Thread.MemoryBarrier() is mostly a CPU instruction to insure writes/reads don't bypass the barrier from a CPU perspective. (This is not directly related to values stored in registers) So this is probably not what directly causes to save variables from registers to memory (except just by the fact it is a method call as per our discussion here, would likely cause that to happen- It could have really been any method call though perhaps to affect all class fields that were used being saved from registers)
It is theoretically possible the JIT/Compiler could also take that method into an account in the same method to ensure variables are stored from CPU registers. But just following our simple proposed rule of any calls to another method or class would result in saving variables stored in registers to memory. Plus if someone wrapped that call in another method (maybe many methods deep), the compiler wouldn't likely analyze that deep to speculate on execution. The JIT could do something but again it likely wouldn't analyze that deep, and both cases need to ensure locks/synchronization work no matter what, thus the simplest optimization is the likely answer.
Unless we have anyone that writes the compilers that can confirm this its all a guess, but its likely the best guess we have of why volatile is not needed.
If that rule is followed synchronization objects just need to employ their own call to MemoryBarrier when they enter and leave to ensure the CPU has the most up to date values from its write caches so they get flushed so proper values can be read. On this site you will see that is what is suggested implicit memory barriers: http://www.albahari.com/threading/part4.aspx
So what prevents the compiler from optimizing _count into a register
and potentially updating just the register within the lock?
There is nothing in the documentation that I am aware that would preclude that from happening. The point is that the call to Monitor.Exit will effectively guarantee that the final value of _count will be committed to memory upon completion.
It makes sense the compiler could know not to optimize _count into a
register if it saw a lock or synchronization object in the method..
but in this case the lock call is in another function.
The fact that the lock is acquired and released in other methods is irrelevant from your point of view. The model memory defines a pretty rigid set of rules that must be adhered to regarding memory barrier generators. The only consequence of putting those Monitor calls in another method is that JIT compiler will have a harder time complying with those rules. But, the JIT compiler must comply; period. If the method calls get to complex or nested too deep then I suspect the JIT compiler punts on any heuristics it might have in this regard and says, "Forget it, I'm just not going to optimize anything!"
So the main question is, is it really the only way this possibly works
without volatile that the compiler/jit won't put class member
variables in registers and thus volatile is then unnecessary?
It works because the protocol is to acquire the lock prior to reading _count as well. If the readers do not do that then all bets are off.

Should I use static function in c# where many calls the same func?

John's console application calls my DLL function many times (~15x per sec). I am thinking to put this function as a static method.
I know that :
It can only access static props and objects.
It doesn't need an instance to run the function.
But I don't know if these are the only questions which i need to ask myself.
Each John's calls to my function is in a new thread that he creates.
If there is an error in my function, how will this affect all other calls?
Should I make this function a regular function with instance to the class (which John will create)?
What about GC?
What is the best practice answer for this question?
Sounds like there could be a problem. Calling a method which operates on static (shared) objects in a multithread environment should ring some alert bells for you.
Review your code and if there's a chance that a shared object is accessed from two (or more) threads at the same time, make the object an instance field and make your methods instance methods.
Of course, whether or not there is a risk depends much on the actual code (which you don't show) but making all calls nonstatic means that you lower the potential risk.
Generally, if your static method doesn't use any state (i.e. both reading and writing to static fields and properties, or calling other static methods that do so), there won't be any side effects for the caller, no matter how many threads they start.
If you're asking for best practice, static methods are mostly a bad idea though. If at all, static methods should be only used for very generic utility functionality.
It's not recommended because you can't predict if requirements change and you need some state one day. Then you'd better use a class that the caller can instantiate, and you'll break all existing code that uses your function.
About garbage collection, yes sure that has some overhead, but that is currently the sacrifice if you go the route of memory-managed OO. If you want more control, use unsafe code or a language like C++ or Vala.
I would agree with Wiktor (+1), but would also add that if synchronization is required in the static method, it may be more efficient to use multiple instances. Otherwise, having many threads might be pointless as only one can access a critical section at a time.

Performance of very big classes

I have a class called "simulation" and a method for solving simulation.
class sim
{
void Step()
{
}
other methods (20+)...
}
Sim class is only instantiated once during the program.
Step method is called in the order of millions during the program.
Step method uses a lot of local variables (100+). None of those locals are used in other methods.
Is it better to make those local variables a member of the class or keep them as local in Step() for better performance?
It depends. What kinds of variables: primitive types or objects? If the latter, your IL code is still going to chase their pointers. If the former, it depends on what order you access them in, and what CPU you are targeting. Optimizing the layout of variables should come pretty far down the list of performance tuning activities, especially in C# when you're dependent on what assembly your IL is translated into.
As usual with optimizations: first measure performance to identify bottlenecks. Then consider what you can do to remove them. Until you know that you need to do something like that, just write your code as clearly as possible: don't expose local variables unnecessarily by lifting them into the class level. And do consider splitting that Step() method: a large method is hard to understand and therefore even harder to optimize.
As a general rule, you should minimise the scope of variables and only increase the scope if it proves absolutely necessary. Converting locals to member variables is a poor design choice, and thus needs a very strong justification.
Also note that local variables only have a cost if they have non-trivial constructors. A local variable with a noop constructor or no constructor at all has no setup cost, so it would be pointless to expand its scope.
If some of the variables contain objects that can be reused in between multiple method calls, you'd skip the overhead of disposing the old instance and creating a new instance with each method call. For example, lookups in a dependency injection container that won't change in between method calls could be 'cached' in a field.
But aside from that, you probably won't get much of a performance increase. You could give it a try and use a profiler to measure your code performance. A profiler may also help you to identify other bottlenecks in your code.

GarbageCollector, Dispose or static Methods?

I developed a few classes last month. They grow big (round 30-40 Methods each class).
I never take a thought of Memory Leaks, GarbageColletor or something like this (I must say this is my first own big project).
Now I have classes with Methods, 15 Classes Round About, each class min. 20 methods. 50% are Linq-Classes in the DAL, 50% BusinessClasses with BusinessLogic. NO Class uses global variables (no need), so theoretically I can make them static classes + methods. At the moment they aren't, I initialize a class object and use the class - and not disposing it.
Where I should start when I be angry of having Memory Leaks etc. when the system runs by ~100 users?
Don't worry about the methods of your classes since they do not consume memory: each method exists only once, in the class definition. What really takes memory is the data contained in the objects in the form of fields.
About disposing objects (I assume .NET here), it is not necessary unless you use unmanaged resources. The garbage collector will take care of freeing all the managed resources (that is, plain objects with their data) when necessary.
If you want more information about the .NET garbage collector and how to deal with memory leaks, you can for example look here: http://www.codeproject.com/KB/dotnet/Memory_Leak_Detection.aspx. But if you are at the start of your project, I would concentrate on getting a clear and maintainable design, rather than on memory management issues.
Whether you can make a class static does not depend on the use of 'global' variables but on the fact if a class uses instance fields (class member variables). If your methods do not use instance data the can be static (and there is a light preference to do so). If all methods are static you can make the class static as well.

Categories