I like to write an application that opens many sockets and files. Think of it as webserver (which is not true in my case, but to simplify the problem here).
If I would write it in C on Unix I would use poll/select and be quite efficient and because I don't have multiple threads, everything is easy to write, while being very efficient.
If I use multiple threads to use all cores of the CPU (given that I don't wanna use processes) I would use Unix FIFOs to transfer messages and use still poll/select on each thread (which works flawlessly with files/socket/fifos/). Things are still very simple while being quite efficient.
But when using C# it looks like there are different selects and most classes don't support that programming style at all (HttpWebListener just as one example). I don't like the BeginInvoke messiness because there are things happening in the background on which I don't have any control (ThreadPooling, Shutting down a blocking server gracefully, ...).
I wonder if there is any select/poll alike framework available for C#?
You can actually use your same approaches in C# - you just need to use the lower level Socket class, which provides Select and Poll.
That being said, the new asynchronous methods built on top of socket in the higher level classes tend to have many advantages. Once you learn and understand how they function, they can be very efficient and quite a bit nicer to develop against.
This extends all the way up the stack - with the "highest level" abstractions being frameworks like WCF, which provide huge benefits in terms of productivity, reliability, safety, and ease of development for many types of applications.
BeginInvoke (or Tasks based on the Begin/End pattern) are the standard model of async programming on .NET. They indeed force the continuation callbacks to run on the thread-pool. If you are fine with that the Begin/End model is actually very efficient and nice (as nice as callback-based code can be...).
Of the top of my head I cannot see a compelling reason why I wouldn't want to use the thread-pool for completion callbacks. Maybe you can squeeze out a little more efficiency using IOCPs.
Select/poll certainly isn't the way to become more efficient. Although .NET sockets support it.
You said
Shutting down a blocking server gracefully
would be a problem. I don't see why. Can you elaborate?
Related
Can a TCP socket that uses Socket.ReceiveAsync or even Socket.SendAsync experience the C10K problem ?
I don't think this is at all a duplicate of this other question, as has been suggested:
Async-Await vs ThreadPool vs MultiThreading on High-Performance Sockets (C10k Solutions?)
That question is more a discussion of various threading models, which don't always apply to Socket.****Async (which ALL use SocketAsyncEventArgs).
The answer to this question is HIGHLY dependent upon your runtime and your OS. I have just recently gone down this road while developing a solution for Unity, so I will share what I know even though the question is a bit old.
Let's first talk about Windows and MS .NET (anything from 3.5 when those methods were implemented to present). You can definitely break c10k when combining those socket methods on this runtime and platform. The ****Async methods that take a SocketAsyncEventArgs parameter go straight to native calls in the runtime, which map to IO Completion Ports in Windows. This is a special path through the runtime, which does more than prevent allocating IAsyncResult objects. I assume using .NET Standard on Linux is also very fast, but I cannot speak directly to that.
Next, let's talking about Mono on Linux. Specifically, I am currently building with:
Mono C# compiler version 4.6.2.0
This will also break c10k, but it works differently. From what I can see, the *****Async methods take the same path through the runtime as all other async methods, except ****Async calls take the shortest path and avoid an IAsyncResult allocation. The runtime submits your requests as an IOSelectorJob via IOSelector.Add. For this reason it appears that none of the ****Async methods will actually ever return false in Mono, though I wouldn't count on that behaviour always being true. I assume using Mono on Windows is equally fast, but I cannot speak to that.
Finally, let's talk about Unity. I am using 2018.3.2f1 which has the Mono 5.11.0 runtime, set to .NET 4.x compat. I'm not sure it is possible to break c10k in Unity. I don't know why, but I know that identical implementations that handily break c10k in Mono and .NET will not even come close. I assume this is because somehow their thread pools are set up differently ... perhaps to accomodate their job system, but that's merely a shot-in-the-dark guess. Very strange things happen, such as synchronous accept outperforming async. AcceptAsync requests that never get a callback, and so on.
Some tips to break c10k outside of Unity:
- Do as little as possible in your IO completion callbacks. Don't put calls to other Async methods inside them like the MSDN example
- If you can manage it, don't make calls to SendAsync and ReceiveAsync block each other due to your design. You can't have 2 outstanding calls with the same arg, but you can have both multiple and separate args (and therefore buffers) for send/recv. But be aware that things can get complicated for ordering, for example in the case of multiple outstanding recvs on the same socket, and concurrent queues dispatching the returned args from the callbacks
- If you have to share resources between threads then System.Collections.Concurrent is your friend. Don't roll your own as these containers are not why you aren't breaking c10k, and you will never be able to match the amount of testing these babies have undergone
Good luck and have fun :) And don't forget to tell me if you find the secret to success in Unity. I'll surely update this if I do.
We have a very high performance multitasking, near real-time C# application. This performance was achieved primarily by implementing cooperative multitasking in-house with a home grown scheduler. This is often called micro-threads. In this system all the tasks communicate with other tasks via queues.
The specific problem that we have seems to only be solvable via first class continuations which C# does not support.
Specifically the problem arises in 2 cases dealing with queues. Whenever any particular task performs some work before placing an item on a queue. What if the queue is full?
Conversely, a different task may do some work and then need to take an item off of a queue. What if that queue is empty?
We have solved this in 90% of the cases by linking queues to tasks to avoid tasks getting invoked if any of their outbound queues are full or inbound queue is empty.
Furthermore certain tasks were converted into state machines so they can handle if a queue is full/empty and continue without waiting.
The real problem arises in a few edge cases where it is impractical to do either of those solutions. The idea in that scenario would be to save the stack state at the point and switch to a different task so that it can do the work and subsequently retry the waiting task whenever it is able to continue.
In the past, we attempted to have the waiting task call back into the schedule (recursively) to allow the other tasks to and later retry the waiting task. However, that led to too many "deadlock" situations.
There was an example somewhere of a custom CLR host to make the .NET threads actually operate as "fibers" which essentially allows switching stack state between threads. But now I can't seem to find any sample code for that. Plus it seems that will take some significant complexity to get it right.
Does anyone have any other creative ideas how to switch between tasks efficiently and avoid the above problems?
Are there any other CLR hosts that offer this, commercial or otherwise? Is there any add-on native library that can offer some form of continuations for C#?
There is the C# 5 CTP, which performs a continuation-passing-style transformation over methods declared with the new async keyword, and continuation-passing based calls when using the await keyword.
This is not actually a new CLR feature but rather a set of directives for the compiler to perform the CPS transformation over your code and a handful of library routines for manipulating and scheduling continuations. Activation records for async methods are placed on the heap instead of the stack, so they're not tied to a specific thread.
Nope, not going to work. C# (and even IL) is too complex language to perform such transformations (CPS) in a general way. The best you can get is what C# 5 will offer. That said, you will probably not be able to break/resume with higher order loops/iterations, which is really want you want from general purpose reifiable continuations.
Fiber mode was removed from v2 of the CLR because of issues under stress, see:
Fiber mode is gone...
Fibers and the CLR
Question to the CLR experts : fiber mode support in hosting
To my knowledge fiber support has not yet bee re-added, although from reading the above articles it may be added again (however the fact that nothing has mentioned for 6-7 years on the topic makes me believe that its unlikely).
FYI fiber support was intended to be a way for existing applications that use fibers (such as SQL Server) to host the CLR in a way that allows them to maximise performance, not as a method to allow .Net applications to create hundereds of threads - in short fibers are not a magic bullet solution to your problem, however if you have an application that uses fibers an wishes to host the CLR then the managed hosting APIs do provide the means for the CLR to "work nicely" with your application. A good source of information on this would be the managed hosting API documentation, or to look into how SQL Server hosts the CLR, of which there are several highly informative articles around.
Also take a quick read of Threads, fibers, stacks and address space.
Actually, we decided on a direction to go with this. We're using the Observer pattern with Message Passsing. We built a home grown library to handle all communication between "Agents" which are similar to an Erlang process. Later we will consider using AppDomains to even better separate Agents from each other. Design ideas were borrowed from the Erlang programming language which has extremely reliable mult-core and distributed processing.
The solution to your problem is to use lock-free algorithms allowing for system wide progress of at least one task. You need to use inline assembler that is CPU dependent to make sure that you atomic CAS (compare-and-swap). Wikipedia has an article as well as patterns described the the book by Douglas Schmidt called "Pattern-Oriented Software Architecture, Patterns for Concurrent and Networked Objects". It is not immediately clear to me how you will do that under the dotnet framework.
Other way of solving your problem is using the publish-subscriber pattern or possible thread pools.
Hope this was helpful?
I'm working on a web application framework, which uses MSSQL for data storage, mostly just does CRUD operations (but on arbitrarly complex structures), provides a WCF interface for rich Silverlight admin and has an MVC3 display (and some basic forms like user settings, etc).
It's getting quite good at being able to load, display, edit and save any (reasonably) complex data structure, in a user-friendly way.
But, I'm looking towards the future, and want to expand my capabilities (and it would be fun to learn new things along the way as well...) - so I've decided (in the light of what's coming for C#5...) to try to get some parallel/async optimalization... Now, I haven't even learned TPL and PLinq yet, so I'm happy for any advice there as well.
So my question is, what are possible areas where parallel processing maybe of help, and where does TPL and PLinq help me on that?
My guts tell me, I could try saving branches of a data structure in a parallel way to the database (this is where I'd expect the biggest peformance optimalization), I could perform some complex operations (file upload, mail sending maybe?) in a multithreaded enviroment, etc. Can I build complex SL UI views in parallel on the client? (Creating 60 data-bound fields on a view can cause "blinking"...) Can I create partial views (menus, category trees, search forms, etc) in MVC at once?
ps: If this turns into "Tell me everything about parallel stuffs" thread, I'm happy to make it community-wiki...
Remember that an asp.net web application is intrinsically a parallel application in any case. Requests can be serviced in parallel and this will all be managed by the asp.net framework. So there are two cases:
You have lots of users all hitting the site at once. In which case the parallel processing capability of the server is probably being used to capacity in any case.
You don't have lots of users all hitting the site at once. In which case the server is probably quite capable of dealing with the responses without parallel processing in a suitable fast response time.
Any time you start thinking about optimising something just because it might be fun, or because you just think you should make stuff faster then you are almost certainly guilty of premature optimization. Your efforts could almost certainly be better spent enriching the functionality of the framework, rather than making what is probably a plenty fast enough solution a little bit faster (at the cost of significantly increase complexity).
In answer to the question of where can TPL and PLINQ really help. In my opinion the main advantage of these technologies is in places in the application where you really do have a lot of long running blocking processes. For example if you have a situation where you call out several times to an external web service - it can be a significant advantage to make these calls in parallel. I would strongly question whether writing to a local database - or even a database on a different box on a local network would count as being a long running blocking process to the extent that this kind of parallelisation is of any significant value.
Pretty much all the examples you list fall in to the category of getting the PC to do something in parallel that it was previously doing in sequence. How many CPUs are on your server - how many are really free when the website is under load. Making something parallel does not necessarily equate to making it faster unless the process involved has some measure of time when you PC is sitting around doing nothing waiting for an external event.
First question is to ask the users / testers which bits seem slow. The only way to know for sure what's slowing you down is to use a profiler like dottrace. The results are sometimes surprising.
If you do find something, parallel processing may not be the answer. You need to remember that there is an overhead in splitting tasks up, so if the task is fairly quick in the first place, it could end up being slower. You also have to consider the added complexity, e.g. what happens if half a task succeeds, and half fails? (Although TPL and PLINQ hide you from this to an extend)
Have fun, but I wondering whether this is a case of 1) solution chasing a problem, and 2) premature optimization.
Learning about threading is fascinating no doubt and there are some really good resources to do that. But, my question is threading applied explicitly either as part of design or development in real-world applications.
I have worked on some extensively used and well-architected .NET apps in C# but found no trace of explicit usage.Is there no real need due to this being managed by CLR or is there any specific reason?
Also, any example of threading coded in widely used .NET apps. in Codelplex or Gooogle Code are also welcome.
The simplest place to use threading is performing a long operation in a GUI while keeping the UI responsive.
If you perform the operation on the UI thread, the entire GUI will freeze until it finishes. (Because it won't run a message loop)
By executing it on a background thread, the UI will remain responsive.
The BackgroundWorker class is very useful here.
is threading applied explicitly either as part of design or development in real-world applications.
In order to take full advantage of modern, multi-core systems, threading must be part of the design from the start. While it's fairly easy (especially in .NET 4) to find small portions of code to thread, to get real scalability, you need to design your algorithms to handle being threaded, preferably at a "high level" in your code. The earlier this is done in the design phases, the easier it is to properly build threading into an application.
Is there no real need due to this being managed by CLR or is there any specific reason?
There is definitely a need. Threading doesn't come for free - it must be added in by the developer. The main reason this isn't found very often, especially in open source code, is really more a matter of difficulty. Even using .NET 4, properly designing algorithms to thread in a scalable, safe manner is difficult.
That entirely depends on the application.
For a client app that ever needs to do any significant work (or perform other potentially long-running tasks, such as making web service calls) I'd expect background threads to be used. This could be achieved via BackgroundWorker, explicit use of the thread pool, explicit use of Parallel Extensions, or creating new threads explicitly.
Web services and web applications are somewhat less likely to create their own threads, in my experience. You're more likely to effectively treat each request as having a separate thread (even if ASP.NET moves it around internally) and perform everything synchronously. Of course there are web applications which either execute asynchronously or start threads for other reasons - but I'd say this comes up less often than in client apps.
Definitely a +1 on the Parallel Extensions to .NET. Microsoft has done some great work here to improve the ThreadPool. You used to have one global queue which handled all tasks, even if they were spawned from a worker thread. Now they have a lock-free global queue and local queues for each worker thread. That's a very nice improvement.
I'm not as big a fan of things like Parallel.For, Parallel.Foreach, and Parallel.Invoke (regions), as I believe they should be pure language extensions rather than class libraries. Obviously, I understand why we have this intermediate step, but it's inevitable for C# to gain language improvements for concurrency and it's equally inevitable that we'll have to go back and change our code to take advantage of it :-)
Overall, if you're looking at building concurrent apps in .NET, you owe it to yourself to research the heck out of the Parallel Extensions. I also think, given that this is a pretty nascent effort from Microsoft, you should be very vocal about what works for you and what doesn't, independent of what you perceive your own skill level to be with concurrency. Microsoft is definitely listening, but I don't think there are that many people yet using the Parallel Extensions. I was at VSLive Redmond yesterday and watched a session on this topic and continue to be impressed with the team working on this.
Disclosure: I used to be the Marketing Director for Visual Studio and am now at a startup called Corensic where we're building tools to detect bugs in concurrent apps.
Most real-world usages of threading I've seen is to simply avoid blocking - UI, network, database calls, etc.
You might see it in use as BeginXXX and EndXXX method pairs, delegate.BeginInvoke calls, Control.Invoke calls.
Some systems I've seen, where threading would be a boon, actually use the isolation principle to achieve multiple "threads", in other words, split the work down into completely unrelated chunks and process them all independently of each other - "multi-threading" (or many-core utilisation) is automagically achieved by simply running all the processes at once.
I think it's fair to say you find a lot of stock-and-trade applications (data presentation) largely do not require massive parallisation, nor are they always able to be architected to be suitable for it. The examples I've seen are all very specific problems. This may attribute to why you've not seen any noticable implementations of it.
The question of whether to make use of an explicit threading implementation is normally a design consideration as others have mentioned here. Trying to implement concurrency as an afterthought usually requires a lot of radical and wholesale changes.
Keep in mind that simply throwing threads into an application doesn't inherently increase performance or speed, given that there is a cost in managing each thread, and also perhaps some memory overhead (not to mention, debugging it can be fun).
From my experience, the most common place to implement a threading design has been in Windows Services (background applications) and on applications which have had use case scenarios where a volume of work could be easily split up into smaller parcels of work (and handed off to threads to complete asynchronously).
As for examples, you could check out the Microsoft Robotics Studio (as far as I know there's a free version now) - it comes with an redistributable (I can't find it as a standalone download) of the Concurrency and Coordination Runtime, there's some coverage of it on Microsoft's Channel 9.
As mentioned by others the Parallel Extensions team (blog is here) have done some great work with thread safety and parallel execution and you can find some samples/examples on the MSDN Code site.
Threading is used in all sorts of scenarios, anything network based depends on threading, whether explicit (sockets stuff) or implicit (web services). Threading keeps UI responsive. And windows services having multiple parallel runs doing the same things in processing data working through queues that need to be processed.
Those are just the most common ones I've seen.
Most answers reference long-running tasks in a GUI application. Another very common usage scenario in my experience is Producer/Consumer queues. We have many utility applications that have to perform web requests etc. often to large number of endpoints. We use producer/consumer threading pattern (usually by integrating a custom thread pool) to allow high parallelization of these tasks.
In fact, at this very moment I am checking up on an application that uploads a 200MB file to 200 different FTP locations. We use SmartThreadPool and run up to around 50 uploads in parallel, which allows the whole batch to complete in under one hour (as opposed to over 50 hours were it all uploads to happen consecutively - so in our usage we find almost straight linear improvements in time).
As modern day programmers we love abstractions so we use threads by calling Async methods or BeginInvoke and by using things like BackgroundWorker or PFX in .Net 4.
Yet sometimes there is a need to do the threading yourself. For Example in a web app I built I have a mail queue that I add to from within the app and there is a background thread that sends the emails. If the thread notices that the queue is filling up faster that it is sending it creates another thread if it then sees that that thread is idle it kills it. This can be done with a higher level abstraction I guess but i did it manually.
I can't resist the edge case - in some applications where either a high degree of operational certainty must be achieved or a high degree of operational uncertainty must be tolerated, then threads and processes are considered from initial architecture design all the way through end delivery
Case 1 - for systems that must achieve extremely high levels of operational reliability, three completely separate subsystems using three different mechanisms may be used in a voting architecture - Spawn 3 threads/proceses across each of the voters, wait for them to conclude/die/be killed, and proceed IFF they all say the same thing - example - complex avionic susystems
Case 2 - for systems that must deal with a high degree of operational uncertainty - do the same thing, but once something/anything gets back to you, kill off the stragglers and go forth with the best answer you got - example - complex intraday trading algorithms endeavoring to destroy the business that employ them :-)
In python the yield keyword can be used in both push and pull contexts, I know how to do the pull context in c# but how would I achieve the push. I post the code I am trying to replicate in c# from python:
def coroutine(func):
def start(*args,**kwargs):
cr = func(*args,**kwargs)
cr.next()
return cr
return start
#coroutine
def grep(pattern):
print "Looking for %s" % pattern
try:
while True:
line = (yield)
if pattern in line:
print line,
except GeneratorExit:
print "Going away. Goodbye"
If what you want is an "observable collection" -- that is, a collection which pushes results at you rather than letting the consumer pull them -- then you probably want to look into the Reactive Framework extensions. Here's an article on it:
http://www.infoq.com/news/2009/07/Reactive-Framework-LINQ-Events
Now, as you note, you can build both "push" and "pull" style iterators easily if you have coroutines available. (Or, as Thomas points out, you can build them with continuations as well.) In the current version of C# we do not have true coroutines (or continuations). However, we are very concerned about the pain users feel around asynchronous programming.
Implementing fiber-based coroutines as a first-class language feature is one technique that could possibly be used to make asynchronous programming easier, but that is just one possible idea of many that we are at present researching. If you have a really solid awesome scenario where coroutines do a better job than anything else -- including the reactive framework -- then I'd love to hear more about it. The more realistic data we have about what real problems people are facing in asynchronous programming, the more likely we are to come up with a good solution. Thanks!
UPDATE: We have recently announced that we are adding coroutine-like asynchronous control flows to the next version of C# and VB. You can try it yourself with our Community Technology Preview edition, which you can download here.
C# does not have general co-routines. A general co-routine is where the co-routine has its own stack, i.e. it can invoke other methods and those methods can "yield" values. Implementation of general co-routines requires making some smart things with stacks, possibly up to and including allocating stack frames (the hidden structures which contain local variables) on the heap. This can be done, some languages do that (e.g. Scheme), but it is somewhat tricky to do it right. Also, many programmers find the feature difficult to understand.
General co-routines can be emulated with threads. Each thread has its own stack. In a co-routine setup, both threads (the initial caller, and the thread for the co-routine) will alternate control, they will never actually run simultaneously. The "yield" mechanism is then an exchange between the two threads, and as such it is expensive (synchronization, a roundtrip through the OS kernel and scheduler...). Also, there is much room for memory leaks (the co-routine must be explicitly "stopped", otherwise the waiting thread will stick forever). Thus, this is rarely done.
C# provides a bastardized-down co-routine feature called iterators. The C# compiler automatically converts the iterator code into a specific state class, with local variables becoming class fields. Yielding is then, at the VM level, a plain return. Such a thing is doable as long as the "yield" is performed from the iterator code itself, not from a method which the iterator code invokes. C# iterators already cover many use cases and the C# designers were unwilling to go further down the road to continuations. Some sarcastic people are keen to state that implementing full-featured continuations would have prevented C# from being as efficient as its arch-enemy Java (efficient continuations are feasible, but this requires quite some work with the GC and the JIT compiler).
Maybe this will help.
http://blogs.msdn.com/ericlippert/archive/2009/07/23/iterator-blocks-part-five-push-vs-pull.aspx
thanks #NickLarsen, you helped me remember the new stuff that MS have introduced, the IObservable interface.
link http://msdn.microsoft.com/en-us/library/dd783449(VS.100).aspx
Actually .NET does not make "incorrect assumptions" about thread affinity, in fact it totally decouples the notion of a .NET level thread from the OS level thread.
What you have to do is associate a logical .NET thread state with your fiber ( for that you need the CLR Hosting API's but you don'T need to write a host yourself you can use those needed from your own application directly ) and everything, lock tracking, exception handling works normally again.
An example can be found here: http://msdn.microsoft.com/en-us/magazine/cc164086.aspx
Btw Mono 2.6 contains low level Coroutine support and can be used to implement all higher level primitives easily.
I would love to see a fiber-based API for .Net.
I attempted to use the native fiber API in C# through p/invoke a while back, but because the runtime's exception handling (incorrectly) makes thread-based assumptions, things broke (badly) when exceptions happened.
One "killer app" for a fiber-based coroutine API is game programming; certain types of AI require a "lightweight" thread that you can time-slice at will. For example, game behavior trees require the ability to "pulse" the decision code every frame, allowing the AI code to cooperatively yield back to the caller when the decision slice is up. This is possible to implement with hard threads, but much, much more complicated.
So while true fiber use-cases are not mainstream, they definitely exist, and a small niche of us .Net coders would cheer mightily if the existing bugs in the fiber subsystem were worked out.
Well, i gave a try developing a full library to manage coroutines with only a single thread. The hard part was to call coroutines inside coroutines...and to return parameters, but finally i reached a pretty good result here. The only warning are that blocking I/O operations must be made through tasks and alll "return" must be replaced with "yield return".
With the application server based on this library i was able to nearly double the requests made with a standard async/await based on IIS. (Seek for Node.Cs and Node.Cs.Musicstore on github to try it at home)