I am debating the pros and cons of a couple of utility classes I have. The classes have a couple of properties which are set prior to calling the class methods. However, I was wondering if there are any cons to this approach rather than sending a variable along with the method call? There are typically only one or two methods in these classes.
Thank you.
I don't know what your class looks like, so I'll make a guess...
I assume you have something like this:
public class MyClass
{
public static int X { get; set; }
public static void MyMethod()
{
Console.WriteLine("X = {0}", X);
}
}
And you call it like this:
MyClass.X = 42;
MyClass.MyMethod();
There are at least two problems with this approach:
There is no obvious indication that you need to set X before calling MyMethod.
It makes the method non-thread-safe: if both thread1 and thread2 call it, you can get an interleaving like this:
thread1 sets X to 42
thread2 sets X to 99
thread1 calls MyMethod => prints 99 instead of 42
thread2 calls MyMethod => prints 99
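The interleaving above can be replayed deterministically on a single thread. This is only a sketch: the class name SharedState and the methods Describe and ReplayInterleaving are made up for illustration, and Describe returns the text instead of printing it so the result can be inspected.

```csharp
public static class SharedState
{
    // Shared static state, as in the guessed MyClass above.
    public static int X { get; set; }

    public static string Describe()
    {
        return string.Format("X = {0}", X);
    }

    // Replays the problematic ordering step by step on one thread.
    public static string[] ReplayInterleaving()
    {
        X = 42;                   // "thread1" sets X
        X = 99;                   // "thread2" sets X
        var first = Describe();   // "thread1" calls the method
        var second = Describe();  // "thread2" calls the method
        return new[] { first, second };
    }
}
```

Both entries of the returned array are "X = 99"; the value 42 that the first caller wrote is lost.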
A better approach is to pass the value as a parameter to the method:
public class MyClass
{
public static void MyMethod(int x)
{
Console.WriteLine("X = {0}", x);
}
}
And call it like this:
MyClass.MyMethod(42);
This solves the two problems mentioned before:
it's clear that you need to provide the value of x to MyMethod
there is no state stored in the class, so the method is thread-safe
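As a rough illustration of the second point, the sketch below calls a parameter-only method from many parallel iterations. The names Formatter, Describe and DescribeInParallel are hypothetical, and the method returns a string rather than writing to the console so that each result can be checked.

```csharp
using System.Threading.Tasks;

public static class Formatter
{
    // No shared state: the result depends only on the argument.
    public static string Describe(int x)
    {
        return string.Format("X = {0}", x);
    }

    // Calls Describe concurrently; each iteration writes only to its own slot.
    public static string[] DescribeInParallel(int count)
    {
        var results = new string[count];
        Parallel.For(0, count, i => results[i] = Describe(i));
        return results;
    }
}
```

Whatever the scheduling, DescribeInParallel(100)[42] is "X = 42", because no call can see another call's argument.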
I'd say it mostly depends on how you tend to invoke those methods. There are situations where either approach might be preferred.
When you tend to pass the same values over multiple invocations, it may be more convenient to instantiate a class which holds those values as properties, read-only or writable depending on the exact needs. You can then call your method multiple times conveniently without repeating yourself much. A good example of this is the HttpClient: you configure it once, and then call certain methods multiple times.
This approach also works well if you need to maintain some state between method invocations.
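A minimal sketch of the "configure once, call many times" style, assuming a made-up PriceFormatter class (it is not from the question): the values that would otherwise be repeated on every call are fixed in the constructor and exposed as read-only properties.

```csharp
using System.Globalization;

public sealed class PriceFormatter
{
    // Set once in the constructor and never changed afterwards.
    public string CurrencySymbol { get; }
    public int Decimals { get; }

    public PriceFormatter(string currencySymbol, int decimals)
    {
        CurrencySymbol = currencySymbol;
        Decimals = decimals;
    }

    // Only the value that varies per call is passed as an argument.
    public string Format(decimal amount)
    {
        return CurrencySymbol + amount.ToString("F" + Decimals, CultureInfo.InvariantCulture);
    }
}
```

For example, after var formatter = new PriceFormatter("$", 2); the call formatter.Format(3.5m) returns "$3.50". Because the properties are read-only, sharing one instance between threads does not reintroduce the problem of mutable state.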
However, by default, if the above considerations do not apply, I would recommend having pure static methods. They are self-contained, they don't behave differently based on relatively external factors (property values set some time ago). You don't need to worry before each call whether you've set the properties correctly, as all the values are passed in. Finally, self-contained methods are easier to understand and use in multi-threading scenarios.
I have a singleton client class in my solution, which calls an external service/APIs. There are no locks inside this class (nothing protecting variables being accessed by multiple threads).
Suppose this singleton instance receives request 1 and, while handling it, receives another request (request 2). What happens? Does it process request 1 to completion and then serve request 2? Or does it start serving request 2 at the same time, which in turn might overwrite variables in this singleton class?
Thanks for the help!
When two threads concurrently execute a method on a single instance of a class, arguments passed to a method and variables defined within that method are not overwritten. The values of fields and properties, however, can be changed.
This class is thread safe:
public class Calculator
{
public long Add(int a, int b)
{
var result = a + b;
return result;
}
}
The arguments (a, b) and the variable (result) are stored in the stack frame, memory which is allocated each time the method is executed. So a separate result variable exists for each method call. Two method calls cannot share or overwrite that variable.
Similarly, if this method gets called while it's already executing, the a and b arguments are not overwritten.
As a result, any number of threads can safely call the Add method on a single instance of Calculator. They can do this concurrently. One execution does not wait for the other.
This class is not thread safe:
public class Calculator
{
private int _a;
private int _b;
public long Add(int a, int b)
{
_a = a;
_b = b;
var result = _a + _b;
return result;
}
}
It's a contrived example. The difference here is that calling Add modifies the state of the class, changing the _a and _b fields. It would be the same if these were properties instead of fields.
If two threads tried to execute this at once you would get unpredictable results. Right before the first thread adds _a and _b another thread might change the value of one of those fields.
This is the simple version. There are more complicated scenarios. Suppose, for instance, we pass a List<int> as an argument to a method. If another thread has a reference to the same list, both threads could try to modify it, or one could modify it while the other is reading it, all with unexpected results. How to manage all of that is outside the scope of this answer.
Here are a few takeaways:
Don't add state to a class (fields and properties that change after its constructor is called) unless it's needed. There are scenarios where we must, and in some cases it's the whole purpose of the class. But if you can choose between the examples above, always choose the first one.
Whenever we pass around references to objects like lists or instances of our own classes, consider what would happen if two threads had access to that object. Then do one of the following:
Don't pass them around if they're not thread safe
Make them thread safe before they're passed around
Be very, very careful. This is the worst option because it puts a burden on us and future developers to simulate the behavior of the code in our heads and see potential problems. It's so much better to prevent problems than to figure out how to navigate around them.
Another way of describing it: Concurrency is like plutonium. It's powerful and useful, but we must always know where it is and make sure it never leaks.
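For the option "make them thread safe before they're passed around", one possible sketch is to wrap the list so that every access goes through a lock. SafeNumberLog is a hypothetical name, and a lock is only one of several ways to do this.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class SafeNumberLog
{
    private readonly object _gate = new object();
    private readonly List<int> _numbers = new List<int>();

    // Only one thread at a time can modify the underlying list.
    public void Add(int value)
    {
        lock (_gate)
        {
            _numbers.Add(value);
        }
    }

    public int Count
    {
        get { lock (_gate) { return _numbers.Count; } }
    }

    // Hands out a copy, so callers never hold a reference to the shared list.
    public int[] Snapshot()
    {
        lock (_gate)
        {
            return _numbers.ToArray();
        }
    }
}
```

With this wrapper, running Parallel.For(0, 1000, i => log.Add(i)) against a single instance leaves log.Count at 1000. The same loop against a bare List<int> is not safe and may lose items or throw.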
Performance-wise, is there any difference between doing this:
public static class StaticTestClass
{
    public static void Function(object param)
    {
        // Do stuff with "param"
    }
}
and this:
public class ConstructedTestClass
{
    private object classParam;

    public ConstructedTestClass(object param)
    {
        classParam = param;
    }

    public void Function()
    {
        // Do stuff with "classParam"
    }
}
I think there wouldn't be any performance difference if it is done a single time, but what if I have to do it many times and call Function() many times?
Will having many instances of ConstructedTestClass have a memory impact?
And will calling Function within StaticTestClass with the parameter have any performance impact?
PS: There are similar questions to this, but I can't find one that addresses performance over many calls.
EDIT: I ran some tests, and these are the results.
With 1000000000 iterations, creating a ConstructedTestClass on each iteration:
Static way: 72542ms
Constructed way: 83579ms
In this case the static way is faster. Then I tried not creating an instance each time Function() is called, and these are the results (100000000 samples):
Static way: 7203ms
Constructed way: 7259ms
In this case there's almost no difference, so I guess I can use whichever I prefer, since I won't be creating 1000000000 instances of the class.
Technically yes, the static method will be slightly faster per call, because a call to a static method doesn't have to check that the object it is invoked on is non-null (there is no such object). This happens behind the scenes. (Technically there is also some slight overhead in setting up the object, etc.)
This is not really a good reason to choose one over the other under most circumstances, though. They have different purposes. A static method can't maintain state in instance variables the way an object can.
In your case I would probably pick the static method. Based on the code you show, you don't have a real need to maintain a reference to the object you want to do something to. Perform a function on it, and be done with it.
With the other approach you have to create an object, then call the method. Furthermore the way it's set up, you have to instantiate a new object for each target object you have to perform the action on, because there is a reference stored in a private variable the method acts on. To me this would be more confusing from a readability perspective.
One difference is that the created objects have to be garbage collected. That overhead doesn't occur for the static call.
I tested it for 100000000 iterations:
static version takes ~0.7 seconds
non-static version (creating the instance once and calling the method n times) takes ~0.7 seconds
non-static version (creating one instance per call) takes ~1.4 seconds
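The test code behind these numbers is not shown, so the following is only a guess at how such a comparison could be set up with Stopwatch. All names here (StaticWorker, InstanceWorker, CallBenchmark) are hypothetical, and the trivial "work" is a stand-in for whatever Function does.

```csharp
using System;
using System.Diagnostics;

public static class StaticWorker
{
    public static int Work(int param) { return param + 1; }
}

public sealed class InstanceWorker
{
    private readonly int _param;
    public InstanceWorker(int param) { _param = param; }
    public int Work() { return _param + 1; }
}

public static class CallBenchmark
{
    // Times an arbitrary piece of work.
    public static TimeSpan Measure(Action body)
    {
        var watch = Stopwatch.StartNew();
        body();
        watch.Stop();
        return watch.Elapsed;
    }

    // Static call, parameter passed each time.
    public static long RunStatic(int iterations)
    {
        long sum = 0;
        for (int i = 0; i < iterations; i++) sum += StaticWorker.Work(i);
        return sum;
    }

    // One instance created up front, method called n times.
    public static long RunSharedInstance(int iterations)
    {
        var worker = new InstanceWorker(1);
        long sum = 0;
        for (int i = 0; i < iterations; i++) sum += worker.Work();
        return sum;
    }

    // A new instance allocated for every call.
    public static long RunInstancePerCall(int iterations)
    {
        long sum = 0;
        for (int i = 0; i < iterations; i++) sum += new InstanceWorker(i).Work();
        return sum;
    }
}
```

A call such as CallBenchmark.Measure(() => CallBenchmark.RunInstancePerCall(100000000)) gives a rough elapsed time for one variant. Timings from a loop like this depend heavily on the build configuration, JIT warm-up and garbage collector, so they should be read as indicative only.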
I have a class Meterage which I expect to be instantiated several times, probably in quick succession. Each instance will need to know the location of the Dropbox folder on the executing machine, and I have code for this.
The class currently has a variable:
private string dropboxPath = string.Empty;
to hold the path, but I am considering making this static to save repeated execution of
this.LocateDropboxFolder();
in the constructor. But I am a little concerned by the switch: what if two constructors try to set this at the same time? Would this code in the constructor be safe (LocateDropboxFolder becomes static too in this example):
public Meterage()
{
if (dropboxPath == string.Empty)
{
LocateDropboxFolder();
}
}
I think my concerns are perhaps irrelevant as long as I don't have construction occurring in multiple threads?
If the field is made static, then a static field initializer or a static constructor is the easy way to initialize it. Either one is executed at most once, in a thread-safe manner.
private static string dropboxPath;
static Meterage()
{
LocateDropboxFolder();
}
If you don't want the field to be reassigned, I suggest using the readonly modifier. The code would then look like this:
private static readonly string dropboxPath;
static Meterage()
{
dropboxPath = LocateDropboxFolder();
}
LocateDropboxFolder needs to return a string in this case.
Field initializers (values assigned where the variable is declared, outside the constructor) are evaluated before the constructor body runs. The constructor can then assign the field again.
Do remember that you will end up having only one dropboxPath. If this is intended, it is okay to do so. Optionally, make LocateDropboxFolder a static method and call it from the static constructor.
If you want to prevent other constructors from overwriting the default, try this:
if (string.IsNullOrEmpty(dropboxPath))
{
LocateDropboxFolder();
}
Or, in a static constructor (called at most once):
static Meterage()
{
LocateDropboxFolder();
}
private static void LocateDropboxFolder()
{
...
}
Your example will be safe provided your code is executing synchronously. If multiple instances are created, their constructors will be called in the order they are created.
On the first run through, LocateDropboxFolder() will execute. When this completes, dropboxPath will be set.
On the second constructor execution, LocateDropboxFolder() will not execute because dropboxPath will no longer equal string.Empty (provided LocateDropboxFolder() does not return string.Empty).
However, if LocateDropboxFolder() is asynchronous or the objects are instantiated on different threads, then it is possible to create a second Meterage instance before dropboxPath has been set by the LocateDropboxFolder() function. As such, multiple calls to the function will likely be made.
If you wish to guard against multithreading errors like this, you could consider using lock statements.
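A minimal sketch of the lock-based guard, under a few assumptions: LocateDropboxFolder is static and returns the path (its real body is not shown in the question, so a placeholder stands in for it), and the LocateCallCount counter exists only to make the effect visible.

```csharp
using System.Threading;
using System.Threading.Tasks;

public class Meterage
{
    private static readonly object PathLock = new object();
    private static string dropboxPath;
    private static int locateCallCount;

    // Demo-only counter: how many times the lookup actually ran.
    public static int LocateCallCount
    {
        get { return Volatile.Read(ref locateCallCount); }
    }

    public Meterage()
    {
        // Only one thread at a time can check and set the shared field.
        lock (PathLock)
        {
            if (dropboxPath == null)
            {
                dropboxPath = LocateDropboxFolder();
            }
        }
    }

    public string DropboxPath
    {
        get { return dropboxPath; }
    }

    // Placeholder for the real lookup, which is not part of the question.
    private static string LocateDropboxFolder()
    {
        Interlocked.Increment(ref locateCallCount);
        return "placeholder-dropbox-path";
    }
}
```

Even if instances are created concurrently, for example with Parallel.For(0, 50, i => new Meterage()), the lookup runs once and Meterage.LocateCallCount ends at 1. The cost is that every construction takes the lock, which a static constructor or Lazy<T> would avoid.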
You might end up running LocateDropboxFolder multiple times if objects are constructed in close succession from multiple threads. As long as the method returns the same result every time, though, this shouldn't be a problem, since every instance will still be using the same value.
Additionally if you are setting the value of dropboxPath in the constructor then there is no point setting a default value for it. I'd just declare it (and not assign it) and then check for null in your constructor.
I have a feeling that your Meterage class is breaking the Single Responsibility Principle. What does the meterage have to do with file access? I would say you have two concerns here: your Meterage and, let's say, a FolderLocator. The second one should have some property or method like Dropbox which could use the lazy evaluation pattern. It should be instantiated once, and this single instance can be injected into each Meterage instance.
Maybe not FolderLocator but FileSystem, with some more methods than just a single property? I'm not sure what you're actually doing. Anyway, make an interface for this. That would allow unit testing without using the actual Dropbox folder.
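One way this suggestion could look, as a sketch only: the interface name IFolderLocator, the constructor taking a Func<string>, and the DropboxPath property on Meterage are all assumptions made for illustration. Lazy<T> with its default settings runs the factory at most once, even when accessed from several threads.

```csharp
using System;

public interface IFolderLocator
{
    string Dropbox { get; }
}

public sealed class FolderLocator : IFolderLocator
{
    private readonly Lazy<string> _dropbox;

    // The lookup itself is supplied by the caller; it is not run here.
    public FolderLocator(Func<string> locateDropbox)
    {
        _dropbox = new Lazy<string>(locateDropbox);
    }

    // The first access runs the lookup; later accesses reuse the result.
    public string Dropbox
    {
        get { return _dropbox.Value; }
    }
}

public class Meterage
{
    private readonly IFolderLocator _folders;

    // The shared locator is injected rather than created here.
    public Meterage(IFolderLocator folders)
    {
        _folders = folders;
    }

    public string DropboxPath
    {
        get { return _folders.Dropbox; }
    }
}
```

In a unit test, a small fake implementing IFolderLocator can return a fixed path, so no real Dropbox folder is needed.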
Static fields are accessed using the class name, like this:
public class Me
{
    public static int a = 5;
}
I can access it with Me.a, so it is attached to the class.
But when I look at:
static ThreadLocal<int> _x = new ThreadLocal<int> (() => 3);
It guarantees that each thread sees a different copy of _x.
Didn't we just see that static is per class and not per thread? How does ThreadLocal manage to give each thread a different copy of _x?
Didn't we just see that static is per class and not per thread?
Yes. So imagine that a ThreadLocal<T> instance holds a static Dictionary<Thread, T> that looks up the Value for the current thread.
That is probably not how it actually works but it's a simple explanation of how it's possible. You could write it yourself.
So you still have only one static _x. But _x.Value can be bound to anything, like the current Thread.
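To make the "you could write it yourself" idea concrete, here is a deliberately naive sketch. NaiveThreadLocal<T> is a made-up name, it keys a dictionary on the managed thread id instead of the Thread object, and it is not how the real ThreadLocal<T> is implemented.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class NaiveThreadLocal<T>
{
    // One shared dictionary; each thread only ever touches its own key.
    private readonly ConcurrentDictionary<int, T> _values =
        new ConcurrentDictionary<int, T>();
    private readonly Func<T> _factory;

    public NaiveThreadLocal(Func<T> factory)
    {
        _factory = factory;
    }

    public T Value
    {
        // Creates the value for the current thread on first access.
        get
        {
            return _values.GetOrAdd(Thread.CurrentThread.ManagedThreadId, id => _factory());
        }
        set
        {
            _values[Thread.CurrentThread.ManagedThreadId] = value;
        }
    }
}
```

A single static NaiveThreadLocal<int> can then be shared by all threads while each one reads and writes its own slot. Entries are never removed here and thread ids can be reused after a thread ends, which is one reason to prefer the real ThreadLocal<T>.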
The reference _x will indeed be one per class, as per its static specifier. However, only the reference will be shared among all threads, not the value inside its object. When you access _x.Value, ThreadLocal<T> invokes system-specific code that provides storage on the current thread, and reads or writes to that thread-specific storage.
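A short sketch of that behaviour with the real ThreadLocal<int>; the wrapper class ThreadLocalDemo and its Run method are only there for illustration.

```csharp
using System.Threading;

public static class ThreadLocalDemo
{
    // One static reference, shared by every thread.
    private static readonly ThreadLocal<int> _x = new ThreadLocal<int>(() => 3);

    // Returns { value first seen by the worker, value seen by the caller afterwards }.
    public static int[] Run()
    {
        _x.Value = 10;                  // affects only the calling thread

        int workerInitial = 0;
        var worker = new Thread(() =>
        {
            workerInitial = _x.Value;   // the worker gets its own copy from the factory
            _x.Value = 99;              // affects only the worker thread
        });
        worker.Start();
        worker.Join();

        return new[] { workerInitial, _x.Value };
    }
}
```

Run() returns { 3, 10 }: the worker starts from the factory value, and its write of 99 never reaches the calling thread.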
My C# isn't that great, so here's a C++ answer to the same effect: Imagine a hypothetical class that contains a large array:
class Foo
{
public:
    int & get() { return array[this_thread_id()]; }
private:
    int array[HUGE];
};
Now you can have one single, global (or class-static) object:
Foo tlstorage;
To access it from anywhere you say tlstorage.get() = 12;. However, the data is stored in the slot that "belongs" to your current thread. The entire storage is global, but only one slice is exposed to each thread.
Other languages like C and C++ have native support for this concept, and when you decorate a global or static variable as "thread-local", the compiler builds something that amounts to the same effect automatically. Perhaps in C# this is a library feature, though it probably also maps to something intrinsic.
When creating a class that has internal private methods, usually to reduce code duplication, that don't require the use of any instance fields, are there performance or memory advantages to declaring the method as static?
Example:
foreach (XmlElement element in xmlDoc.DocumentElement.SelectNodes("sample"))
{
string first = GetInnerXml(element, ".//first");
string second = GetInnerXml(element, ".//second");
string third = GetInnerXml(element, ".//third");
}
...
private static string GetInnerXml(XmlElement element, string nodeName)
{
return GetInnerXml(element, nodeName, null);
}
private static string GetInnerXml(XmlElement element, string nodeName, string defaultValue)
{
XmlNode node = element.SelectSingleNode(nodeName);
return node == null ? defaultValue : node.InnerXml;
}
Is there any advantage to declaring the GetInnerXml() methods as static? No opinion responses please, I have an opinion.
From the FxCop rule page on this:
After you mark the methods as static, the compiler will emit non-virtual call sites to these members. Emitting non-virtual call sites will prevent a check at runtime for each call that ensures that the current object pointer is non-null. This can result in a measurable performance gain for performance-sensitive code. In some cases, the failure to access the current object instance represents a correctness issue.
When I'm writing a class, most methods fall into two categories:
Methods that use/change the current instance's state.
Helper methods that don't use/change the current object's state, but help me compute values I need elsewhere.
Static methods are useful because, just by looking at a method's signature, you know that calling it doesn't use or modify the current instance's state.
Take this example:
public class Library
{
private static Book findBook(List<Book> books, string title)
{
// code goes here
}
}
If the state of a Library instance ever gets screwed up, and I'm trying to figure out why, I can rule out findBook as the culprit, just from its signature.
I try to communicate as much as I can with a method or function's signature, and this is an excellent way to do that.
A call to a static method generates a call instruction in Microsoft intermediate language (MSIL), whereas a call to an instance method generates a callvirt instruction, which also checks for a null object reference. However, most of the time the performance difference between the two is not significant.
Source: MSDN - https://learn.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2012/79b3xss3(v=vs.110)
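The null check is observable from C#. In the sketch below (Helper and NullCheckDemo are hypothetical names), the instance method never touches this, yet calling it through a null reference still throws, while the static method has no receiver to check.

```csharp
using System;

public class Helper
{
    // Does not use any instance state, but is still an instance method.
    public int InstanceDouble(int value) { return value * 2; }

    public static int StaticDouble(int value) { return value * 2; }
}

public static class NullCheckDemo
{
    // Returns true if the instance call on a null reference throws.
    public static bool InstanceCallThrowsOnNull()
    {
        Helper helper = null;
        try
        {
            helper.InstanceDouble(21);
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

NullCheckDemo.InstanceCallThrowsOnNull() returns true, whereas Helper.StaticDouble(21) simply returns 42 without needing any object.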
Yes, the compiler does not need to pass the implicit this pointer to static methods. Even if you don't use it in your instance method, it is still being passed.
It'll be slightly quicker as there is no this parameter passed (although the performance cost of calling the method is probably considerably more than this saving).
I'd say the best reason I can think of for private static methods is that it means you can't accidentally change the object (as there's no this pointer).
This forces you to remember to also declare any class-scoped members the function uses as static as well, which should save the memory of creating those items for each instance.
I very much prefer all private methods to be static unless they really can't be. I would much prefer the following:
public class MyClass
{
private readonly MyDependency _dependency;
public MyClass(MyDependency dependency)
{
_dependency = dependency;
}
public int CalculateHardStuff()
{
var intermediate = StepOne(_dependency);
return StepTwo(intermediate);
}
private static int StepOne(MyDependency dependency)
{
return dependency.GetFirst3Primes().Sum();
}
private static int StepTwo(int intermediate)
{
return (intermediate + 5)/4;
}
}
public class MyDependency
{
public IEnumerable<int> GetFirst3Primes()
{
yield return 2;
yield return 3;
yield return 5;
}
}
over every method accessing the instance field. Why is this? Because as this process of calculating becomes more complex and the class ends up with 15 private helper methods, then I REALLY want to be able to pull them out into a new class that encapsulates a subset of the steps in a semantically meaningful way.
When MyClass gets more dependencies because we need logging and also need to notify a web service (please excuse the cliche examples), then it's really helpful to easily see what methods have which dependencies.
Tools like R# let you extract a class from a set of private static methods in a few keystrokes. Try doing it when all private helper methods are tightly coupled to the instance field and you'll see it can be quite a headache.
As has already been stated, there are many advantages to static methods. However, keep in mind that they will live on the heap for the life of the application. I recently spent a day tracking down a memory leak in a Windows Service. The leak was caused by private static methods inside a class that implemented IDisposable and was consistently called from a using statement. Each time this class was created, memory was reserved on the heap for the static methods within the class. Unfortunately, when the class was disposed of, the memory for the static methods was not released. This caused the memory footprint of this service to grow until it consumed the available memory of the server within a couple of days, with predictable results.