I have a question about stateless singletons, and a second question about singletons with state.
Stateless singleton services are a good way to help with scalability. The programmer who architected the project I maintain said there would be no concurrency issues because the Singleton class "is just code", meaning the class has no class-level variables; it is just methods.
This is where my knowledge of C# gets a little hazy. Is there any possible issue where two users, via separate web requests, hit the stateless singleton at the same time? Could they end up in the same method at the same time? Is that even possible? If so, would they end up using the same local variables in that method? That sounds like a big mess, so I'm assuming it just can't happen and that method calls are somehow never polluted by other users.
I've asked many colleagues about this and no one knows the answer, so it seems to be a tricky issue.
My question about singletons generally is whether there is any problem with 2 or more concurrent users reading a public property of a Singleton. I'm only interested in reads. Is there a possibility of some kind of concurrency exception where a property is not inside a lock block? Or are concurrent, simultaneous reads safe? I don't really want to use the lock keyword, as that is a performance hit that I don't need.
Thanks
Singleton is an anti-pattern. A stateless singleton is even worse. If something does not hold state, there is not even the faintest reason to make it a singleton.
A stateless singleton is really just a bundle of static functions, built by someone who enjoyed applying a pattern without thinking about what the pattern would achieve; had they thought about it, they would have noticed that it achieves nothing here.
If you see a stateless singleton, you can safely remove every bit of code that makes it a singleton. Add static to the class definition. Done. Way better than before.
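As a rough sketch of that refactoring (the class and method are invented for illustration, and the two versions obviously could not share a name in one namespace):

    // Before: a stateless singleton, with ceremony that buys nothing.
    public sealed class PriceCalculator
    {
        private static readonly PriceCalculator instance = new PriceCalculator();
        public static PriceCalculator Instance { get { return instance; } }
        private PriceCalculator() { }

        public decimal ApplyDiscount(decimal price, decimal percent)
        {
            return price - (price * percent / 100m);
        }
    }

    // After: the same behaviour as a plain static class.
    public static class PriceCalculator
    {
        public static decimal ApplyDiscount(decimal price, decimal percent)
        {
            return price - (price * percent / 100m);
        }
    }

Callers change from PriceCalculator.Instance.ApplyDiscount(100m, 10m) to PriceCalculator.ApplyDiscount(100m, 10m), and nothing of value is lost.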
I think you are pretty confused about multi-threading, singleton or not. I suggest you read a good book or tutorial on this, because it's way out of scope for a simple answer here. If you have shared resources (a simple example: a variable that is not a local), then you need to take special care in multi-threaded environments.
If you are reading more often than writing, using a ReaderWriterLock instead of a simple lock might be beneficial. See here.
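For the read-mostly property in the original question, a minimal sketch using ReaderWriterLockSlim (the class, field and property names are made up for illustration) might look like this:

    using System.Threading;

    public sealed class Catalog
    {
        private static readonly ReaderWriterLockSlim rw = new ReaderWriterLockSlim();
        private static string title = "default";

        // Any number of threads may hold the read lock at once;
        // a writer waits until it can take the lock exclusively.
        public static string Title
        {
            get
            {
                rw.EnterReadLock();
                try { return title; }
                finally { rw.ExitReadLock(); }
            }
            set
            {
                rw.EnterWriteLock();
                try { title = value; }
                finally { rw.ExitWriteLock(); }
            }
        }
    }

Whether this beats a plain lock depends on how long the reads take and how contended the property is, so it is worth measuring before committing to it.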
I have a C# Windows Forms MP3 player application. I keep the audio files in my Resources folder and have a separate static class, "MyAudio", which handles all the audio-related work such as playing and increasing the volume.
From my Form, I just call the play method using:
MyAudio.Play(track);
In the MyAudio class, I have a WindowsMediaPlayer object declared as:
private static WindowsMediaPlayer obj = new WindowsMediaPlayer();
My question is: in terms of efficiency and memory usage, is it better to declare the MyAudio class as static or non-static? Is it wise to create an object of the MyAudio class in the form and call the methods on it, or to call them directly using the class name?
Also, is it good practice to declare the instance variables as static?
Your question is indeed broad, but there are a few design principles you can keep in mind while designing a class:
Do I need the object and its state throughout the application lifetime?
Do I need to maintain the state of class variables for future use?
Do I need to multi-thread or parallelize the application at any point in time?
Do I need to decouple the component in the future and use it in other scenarios, such as an Ajax-based web scenario?
The important thing in this case is whether you want to maintain the state for the application lifetime and whether the memory usage is acceptable for the application environment, since after initializing you can read all the data from memory and don't need to query a source such as a database. This is good for the scenario where you initialize once and read the information as static data for the rest of the application. If you plan to re-query the information, then part of the purpose of using a static type is lost.
Let's assume that in the future you need to parallelize your code for performance. Static state will then come back to haunt you: it is shared among threads, so it invariably needs a synchronization construct such as a lock or mutex, which serializes the threads and defeats the purpose. The same thing happens in a web/Ajax scenario: a static component cannot handle multiple parallel requests and will get corrupted unless it is synchronized. Here an instance variable per thread is a boon, since the threads can parallelize over tasks and data without requiring a lock or mutex.
In my understanding, static is a convenience that many programmers misuse by avoiding instance variables and reaching for statics at will, without understanding the implications. From the GC perspective, static variables cannot be collected, so the working set of the application grows until it stabilizes and does not shrink unless the program explicitly releases the data, which is not good for any application unless you are deliberately caching data to avoid network or database calls.
Ideal design would suggest always using an instance class that gets created, does its work, and gets released rather than lingering around. If there is information that needs to be passed from one function to another, as in your case from Play to Pause to Stop, then that data can be kept in a static variable and modified in a thread-safe manner (see the sketch below), which is a much better approach.
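A minimal sketch of that last point (the class, field and method names are invented for illustration): keep only the genuinely shared bit of state static, and guard every access to it with a lock.

    public static class PlaybackState
    {
        private static readonly object sync = new object();
        private static string currentTrack;   // shared across Play/Pause/Stop calls

        public static void SetCurrent(string track)
        {
            lock (sync) { currentTrack = track; }   // writes are serialized
        }

        public static string GetCurrent()
        {
            lock (sync) { return currentTrack; }    // reads see a fully written value
        }
    }

Everything else can stay on instances that are created, used, and released.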
If we just take your example: since it is a Windows Forms application performing operations such as Play, static would be fine, as it is an executable running on a system. But for testing, imagine a scenario where you launch multiple instances by double-clicking and press different operations in each one; they would all access the same static object and you may get a corruption issue. In fact, to resolve such a scenario you might even choose to make your class a singleton, so that at any given moment no more than one instance can exist in memory, as happens with Yahoo Messenger: no matter how many times you click, the same instance always comes up.
There are no "static instance variables"; a member is either static or belongs to an instance. However, it is best practice to define members as static if they don't have anything to do with a particular instance of the class.
(I wish I could tag this question for all class-based languages that implement threads, but here it is under Java, C++, C# and Ruby. Not that I am fluent in all of these.)
I think I have seen statements to this effect (that class constructors are thread-safe) in blog posts and tutorials. I can't trace any direct statements, but many posts and tutorials make the assumption, or do not even mention the problem of threads running in constructors and destructors. Sticking to Java, which has a long history and a formal approach to multi-threading:
Javamex
Jenkov's tutorials
Oracle tutorials
All these articles/webpages are written in a confident way and contain well-rounded discussions. They all mention Java's method synchronization feature, so you would hope they might mention how this affects the special methods of construction and destruction. But they do not.
But class constructors and destructors need to be considered like any other class methods. Here is an article on Java,
Safe construction techniques in Java
about leaking 'this' references from constructors. And here are a couple of StackOverflow posts,
Incompletely constructed objects in Java,
Java constructor needs locking
showing constructors with thread issues. I doubt threading issues in special methods are limited to Java.
So, I'm wondering,
Is the assumption of thread-safety (however defined) based on the general layout of constructors?
A tightly coded constructor without much code would be close to re-entrant code (accepting data through parameters, etc.).
Or do interpreters/compilers handle constructors/destructors with special treatment or protections?
For example, the Java memory model makes some comments on expectations at the end of construction, and I expect other language specifications will too.
Wikipedia's page on constructors has little on this. In a different context, the post Constructors in Programming languages contains some hints, but is not about thread-safety.
While there may be information in specialist books, it would be good to have general (though language-specific mentions are interesting!) explanations/discussion on StackOverflow.
In general, local variables that do not point to shared data are thread-safe. As you normally create an object in only one thread, the object under construction is effectively a thread-local data structure and thus thread-safe (mostly).
In Java, you can break this assumption in a number of ways, which include:
starting a new thread in a constructor
setting a reference to the object which is visible to another thread.
using non-final fields and adding the object to a thread-unsafe container or shared data structure.
Normally these actions are all considered bad practice, so if you avoid them, you have a thread-safe constructor without the need for locking. (The sketch below illustrates the second point.)
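A small C# sketch of how a reference can escape before construction completes (the registry type and field names are invented for illustration; the same shape exists in Java):

    using System.Collections.Concurrent;

    public class Listener
    {
        // A shared registry that other threads poll; purely illustrative.
        public static readonly ConcurrentBag<Listener> Registry = new ConcurrentBag<Listener>();

        private readonly int[] buffer;

        public Listener()
        {
            Registry.Add(this);       // BAD: 'this' escapes before the constructor finishes
            buffer = new int[1024];   // another thread may observe buffer == null
        }

        public int BufferLength
        {
            get { return buffer == null ? -1 : buffer.Length; }
        }
    }

Moving the Registry.Add(this) call out of the constructor, for example into a factory method that runs only after construction, removes the problem.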
I think the original question is based on some misunderstanding. Constructors are not considered thread-safe.
If the constructor affects anything outside of the object itself, then it is not thread-safe, just like any other member function of the class.
I think the assumption rests on the case of a constructor that affects nothing other than the object's own contents (and there are no static member variables): it is thread-safe because nothing outside the object is touched, and until the constructor has finished, nothing else knows the object exists, so there is no possibility of another thread "using" it. But this fails as soon as some global state (any global/static variable, I/O, etc.) gets involved, and at that point thread safety depends on proper locking (of some sort).
An example of a Java constructor thread-safety problem is the Double-Checked Locking pattern; see http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html. In other words,
    X x = new X();

is always safe, but

    X x;           // a field
    x = new X();   // assigned in a method

is not necessarily safe, because the field can become visible to other threads before the object is fully constructed.
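In C# terms, the usual hand-rolled fix is double-checked locking with a volatile field (type and member names here are only illustrative):

    public sealed class XHolder
    {
        private static readonly object sync = new object();
        private static volatile X instance;   // volatile prevents unsafe publication

        public static X Instance
        {
            get
            {
                if (instance == null)          // first check, without the lock
                {
                    lock (sync)
                    {
                        if (instance == null)  // second check, inside the lock
                            instance = new X();
                    }
                }
                return instance;
            }
        }
    }

    public sealed class X { }

In practice a static initializer or Lazy<T> is simpler and harder to get wrong than writing this by hand.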
John's console application calls my DLL function many times (about 15 times per second). I am thinking of making this function a static method.
I know that:
It can only access static properties and objects.
It doesn't need an instance to run the function.
But I don't know whether these are the only questions I need to ask myself.
Each of John's calls to my function is in a new thread that he creates.
If there is an error in my function, how will this affect all the other calls?
Should I make this function a regular instance method (on a class that John will create)?
What about GC?
What is the best practice answer for this question?
Sounds like there could be a problem. Calling a method that operates on static (shared) objects in a multi-threaded environment should ring some alarm bells for you.
Review your code, and if there's a chance that a shared object is accessed from two (or more) threads at the same time, make the object an instance field and make your methods instance methods.
Of course, whether or not there is a risk depends very much on the actual code (which you don't show), but making all the calls non-static lowers the potential risk.
Generally, if your static method doesn't use any static state (that is, it neither reads nor writes static fields and properties, nor calls other static methods that do), there won't be any side effects for the caller, no matter how many threads they start.
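As a rough illustration of that distinction (both methods are invented examples):

    public static class Calculations
    {
        // Touches only its parameters and locals: safe to call from any number of threads.
        public static double Average(double a, double b)
        {
            return (a + b) / 2.0;
        }

        private static int callCount;   // shared static state

        // Reads and writes a static field without synchronization: NOT thread-safe,
        // because concurrent callers can lose updates to callCount.
        public static int CountedCall()
        {
            callCount++;
            return callCount;
        }
    }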
If you're asking for best practice, though, static methods are mostly a bad idea. If they are used at all, static methods should be reserved for very generic utility functionality.
They are not recommended because you can't predict whether requirements will change and you will need some state one day. At that point you'd have to switch to a class that the caller instantiates, and you'd break all existing code that uses your function.
About garbage collection: yes, instances have some overhead, but that is currently the price of memory-managed OO. If you want more control, use unsafe code or a language like C++ or Vala.
I would agree with Wiktor (+1), but would also add that if synchronization is required inside the static method, it may be more efficient to use multiple instances. Otherwise, having many threads might be pointless, as only one can enter the critical section at a time.
I have several classes based on System.Data.Entity.DbContext. They get used several times per request in disparate ends of the web application - is it expensive to instantiate them?
I was caching a copy of them in HttpContext.Current.Items because it didn't feel right to have several copies of them per request, but I have now found out that they do not get automatically disposed from the HttpContext at the end of the request. Before I set out writing the code to dispose of them (in Application_EndRequest), I thought I'd readdress the situation, because there really is no point caching them if I should just instantiate them where I need them and dispose of them there and then.
Questions similar to this have been asked around the internet, but I can't seem to find one that answers my question exactly. Sorry if I'm repeating someone though.
Update
I've found out from this blog post that disposing of the contexts probably doesn't matter, but I'm still interested to hear whether they are expensive to instantiate in the first place. Basically, is there a lot of EF magic going on behind the scenes that I want to avoid doing too often?
Your best bet would be to use an IoC container to manage lifecycles here -- they are very, very good at it, and this is quite a common scenario. It has the added advantage of making dynamic invocation easy -- meaning requests for your stylesheet won't create a DB context because one is hardcoded in BeginRequest().
I'm answering my own question for completeness.
This answer provides more information about this issue.
In summary, it isn't that expensive to instantiate the DbContext, so don't worry.
Furthermore, you don't really need to worry about disposing the data contexts either. You might notice ScottGu doesn't in his samples (he usually has the context as a private field on the controller). This answer has some good information from the Linq to SQL team about disposing data contexts, and this blog post also expands on the subject.
Use HttpContext.Items and dispose of your context manually in EndRequest - you can even create a custom HTTP module for that (see the sketch below). That is the correct way to handle it. Disposing of the context also releases references to all tracked entities and allows the GC to collect them.
You can use multiple contexts per request if you really need them, but in most scenarios one is enough. If your server-side processing is one logical operation, you should use one context for the whole unit of work. This is especially important if you make multiple changes in a transaction, because with multiple contexts your transaction will be promoted to a distributed transaction, which has a negative performance impact.
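A rough sketch of such a module (the module name and the Items key are made up, and it assumes whatever creates the per-request context stores it under the same key):

    using System;
    using System.Web;

    public class DbContextDisposalModule : IHttpModule
    {
        private const string ContextKey = "RequestDbContext";   // illustrative key

        public void Init(HttpApplication application)
        {
            application.EndRequest += (sender, e) =>
            {
                // Dispose whatever context the request stashed in Items, if any.
                var context = HttpContext.Current.Items[ContextKey] as IDisposable;
                if (context != null)
                {
                    context.Dispose();
                    HttpContext.Current.Items.Remove(ContextKey);
                }
            };
        }

        public void Dispose() { }
    }

The module still has to be registered in web.config, and the code that creates the context at the start of the request has to agree on the key.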
We have a web project using a similar pattern to the one you've described (albeit with multiple independent L2S contexts instead of EF). Although the context is not disposed at the end of the request, we have found that because HttpContext.Current becomes unreferenced, the GC collects the context in due course, causing the dispose to happen behind the scenes. We confirmed this using a memory analyser. Although the context was persisting a bit longer than it should, that was acceptable for us.
Since noticing the behaviour we have tried a couple of alternatives, including disposing the contexts on EndRequest, and forcing a GC collect on EndRequest (that one wasn't my idea and was quickly retracted).
We're now investigating the possibility of implementing a Unit of Work pattern that encompasses our collection of contexts during a request. There are some great articles about it if you Google it, but for us, alas, the time it would take to implement outweighs the potential benefit.
As an aside, I'm now investigating the complexity of moving to a combined SOA/Unit of Work approach, but again, it's one of those things hindsight slaps you with after having built up an enterprise-sized application without that knowledge.
I'm keen to hear other peoples views on the subject.
There are a lot of articles and discussions explaining why it is good to build thread-safe classes. It is said that if multiple threads access, for example, a field at the same time, there can be bad consequences. So, what is the point of keeping non-thread-safe code? I'm focusing mostly on .NET, but I believe the main reasons are not language-dependent.
E.g. .NET static fields are not thread-safe. What would be the result if they were thread-safe by default (without a need to perform "manual" locking)? What are the benefits of using (actually, defaulting to) non-thread-safety?
One thing that comes to my mind is performance (more of a guess, though). It's rather intuitive that when a function or field doesn't need to be thread-safe, it shouldn't be. However, the question is: why? Is thread safety just an additional amount of code you always need to implement? In what scenarios can I be 100% sure that, for example, a field won't be used by two threads at once?
Writing thread-safe code:
Requires more skilled developers
Is harder and consumes more coding efforts
Is harder to test and debug
Usually has bigger performance cost
But! Thread-safe code is not always needed. If you can be sure that some piece of code will be accessed by only one thread, the list above becomes huge and unnecessary overhead. It is like renting a van to drive to the neighbouring city when there are only two of you and not much luggage.
Thread safety comes with costs - you need to lock fields that might cause problems if accessed simultaneously.
In applications that make no use of threads but need high performance, where every CPU cycle counts, there is no reason to have thread-safe classes.
So, what is the point of keeping non thread-safe code?
Cost. Like you assumed, there usually is a penalty in performance.
Also, writing thread-safe code is more difficult and time consuming.
Thread safety is not a "yes" or "no" proposition. The meaning of "thread safety" depends upon context: does it mean "concurrent-read safe, concurrent-write unsafe"? Does it mean that the application might just return stale data instead of crashing? There are many things it can mean.
The main reason not to make a class "thread safe" is the cost. If the type won't be accessed by multiple threads, there's no advantage to putting in the work and increasing the maintenance cost.
Writing thread-safe code is painfully difficult at times. For example, even simple lazy loading requires two '== null' checks and a lock. It's really easy to screw up.
[EDIT]
I didn't mean to suggest that threaded lazy loading is particularly difficult; it's the "Oh, and I didn't remember to lock that first!" moments that come fast and hard once you think you're done with the locking that are really the challenge.
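On .NET 4 and later, one way to sidestep the hand-rolled null checks entirely is Lazy<T>, which does the locking internally (the ExpensiveThing type is purely illustrative):

    using System;

    public class Cache
    {
        private static readonly Lazy<ExpensiveThing> instance =
            new Lazy<ExpensiveThing>(() => new ExpensiveThing(), isThreadSafe: true);

        // The factory runs exactly once, even with concurrent first callers.
        public static ExpensiveThing Instance
        {
            get { return instance.Value; }
        }
    }

    public class ExpensiveThing { }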
There are situations where "thread-safe" doesn't even make sense. This consideration is in addition to the higher developer skill required and the increased time (development, testing, and runtime all take hits).
For example, List<T> is a commonly-used non-thread-safe class. If we were to create a thread-safe equivalent, how would we implement GetEnumerator? Hint: there is no good solution.
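A small sketch of the difficulty: even the existing, non-thread-safe List<T> invalidates its enumerators when the list changes, so a hypothetical locked version would have to choose between blocking writers for the whole enumeration, handing out a stale snapshot, or throwing, and none of those is obviously the right answer.

    using System;
    using System.Collections.Generic;

    class Program
    {
        static void Main()
        {
            var numbers = new List<int> { 1, 2, 3 };
            try
            {
                foreach (var n in numbers)
                {
                    numbers.Add(n);   // mutating the list during enumeration
                }
            }
            catch (InvalidOperationException ex)
            {
                // "Collection was modified; enumeration operation may not execute."
                Console.WriteLine(ex.Message);
            }
        }
    }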
Turn this question on its head.
In the early days of programming there was no thread-safe code because there was no concept of threads. A program started, then proceeded step by step to the end. Events? What are those? Threads? Huh?
As hardware became more powerful, as notions of what types of problems could be solved with software became more imaginative, and as developers grew more ambitious, the software infrastructure became more sophisticated. It also became much more top-heavy. And here we are today, with a sophisticated, powerful, and in some cases unnecessarily top-heavy software ecosystem which includes threads and "thread-safety".
I realize the question is aimed more at application developers than, say, firmware developers, but looking at the whole forest does offer insights into how that one tree evolved.
So, what is the point of keeping non thread-safe code?
By allowing code that isn't thread-safe, you leave it up to the programmer to decide what the correct level of isolation is.
As others have mentioned, this allows for complexity reduction and improved performance.
Rico Mariani wrote two articles, "Putting your synchronization at the correct level" and "Putting your synchronization at the correct level -- solution", that have a nice example of this in action.
In the articles he has a method called DoWork(). In it he calls Read twice, Write twice, and then LogToSteam on other classes.
Read, Write, and LogToSteam were each thread-safe and shared a lock. This is good, except for the fact that because DoWork was also thread-safe, all the synchronization work inside each Read, Write, and LogToSteam call was a complete waste of time.
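A condensed sketch of that shape (the class and method bodies are invented for illustration and only loosely follow the article): every helper takes the lock, and then the caller takes the same lock around the whole sequence, so the inner locking never has any concurrency left to guard against.

    using System;

    public class Worker
    {
        private readonly object sync = new object();
        private int value;

        private int Read()              { lock (sync) { return value; } }
        private void Write(int v)       { lock (sync) { value = v; } }
        private void LogToStream(int v) { lock (sync) { Console.WriteLine(v); } }

        public void DoWork()
        {
            lock (sync)   // the outer lock already serializes everything below
            {
                var a = Read();
                var b = Read();
                Write(a + 1);
                Write(b + 1);
                LogToStream(a + b);
            }
        }
    }

(C#'s Monitor is re-entrant, so the nested locks don't deadlock; they are simply wasted work.)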
This is all related to the nature of imperative programming. Its side effects create the need for this.
However, if you had a development platform where applications could be expressed as pure functions with no dependencies or side effects, then it would be possible to create applications where the threading is managed without developer intervention.
So, what is the point of keeping non thread-safe code?
The rule of thumb is to avoid locking as much as possible. The ideal code is re-entrant and thread-safe without any locking, but that would be utopia.
Coming back to reality, a good programmer tries their level best to use sectional locking as opposed to locking the entire context. An example would be locking a few lines of code at a time in various routines rather than locking everything in a function.
Also, one has to refactor the code to come up with a design that minimizes the locking, if not gets rid of it entirely.
E.g. consider a foobar() function that gets new data on each call and uses a switch/case on the type of data to change a node in a tree. The locking can be mostly (if not completely) avoided, since each case statement touches a different node in the tree. This may be a rather specific example, but I think it illustrates my point.
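A rough sketch of that idea (the node type, message kinds, and update logic are all invented for illustration): lock the individual node being changed rather than the whole tree, so updates that land on different nodes can proceed in parallel.

    using System.Collections.Generic;

    public class Node
    {
        public readonly object Gate = new object();   // per-node lock
        public int Value;
    }

    public static class TreeUpdater
    {
        // Each kind of incoming data maps to its own node, so two different
        // kinds never contend for the same lock.
        public static void Foobar(string kind, IDictionary<string, Node> nodes, int payload)
        {
            Node target;
            switch (kind)
            {
                case "price":  target = nodes["price"];  break;
                case "volume": target = nodes["volume"]; break;
                default:       return;                   // unknown data: nothing to update
            }

            lock (target.Gate)   // fine-grained: only the affected node is locked
            {
                target.Value = payload;
            }
        }
    }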