I'm trying to wrap some existing APM calls (BeginX, EndX) in Tasks to get all the nice benefits of them. Unfortunately, our methods are unconventional and use out parameters and so can't use the standard FromAsync method where you give it both the begin and end delegates and let it wrap it nicely.
This article describes the alternative: an overload that takes an IAsyncResult and only requires you to implement the end callback. They take the IAsyncResult handle, then wait until it completes, then call the delegate you passed in.
This seemed fine, but then I read another article about wrapping APM calls in tasks. He also mentions that the IAsyncResult overload is not as efficient as the other methods. It seems to me like this means that the callback is not used to report completion of the method. That means they must be using the AsyncWaitHandle or polling IsCompleted. Which one do they use? How much performance penalty is it?
If it's doing polling that means the callback might not come right away and they have to busily check it during the whole call. If they've got an AsyncWaitHandle, they have another thread sitting and waiting on the result, which completely defeats the point of using an asynchronous method for me.
Does anyone know what they're doing and how severe this performance penalty is?
Does anyone know what they're doing and how severe this performance penalty is?
It's not incredibly severe, but there is more overhead. Since you aren't providing them the same information (only the IAsyncResult), the implementation has to call ThreadPool.RegisterWaitForSingleObject to trigger a callback when the IAsyncResult completes.
When that overload isn't used, and the Begin/End pair plus callback are available, the callback itself can trigger the completion of the task, eliminating the extra wait call. The IAsyncResult overload, by contrast, effectively ties up a ThreadPool thread to block (WaitOne) on the wait handle until the operation completes. If this is a rare occurrence, the performance overhead is probably negligible, but if you're doing this a lot, it could be problematic.
Looking at the code in JustDecompile, it creates a TaskCompletionSource wrapping the IAsyncResult, which creates a new Task, which probably means a thread waiting for the IAsyncResult to complete. Not ideal, but it's probably the only way to do it since there is no way to make a good one-size-fits-all polling rate.
When you pass in an IAsyncResult there is no way for TaskFactory to insert a callback. So the TaskFactory needs the IAsyncResult to spin up a WaitHandle, and needs the ThreadPool to get a thread to wait on this handle. There is no polling involved.
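To make the mechanism described in these answers concrete, here is a minimal sketch of how an IAsyncResult could be turned into a Task by registering a thread pool wait on its wait handle. It is not the framework's actual code; the extension method name ToTask and the class name AsyncResultExtensions are made up for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncResultExtensions
{
    // Hypothetical helper (not the framework's implementation): completes a Task
    // once asyncResult's wait handle is signaled, then runs endMethod for the result.
    public static Task<TResult> ToTask<TResult>(
        this IAsyncResult asyncResult, Func<IAsyncResult, TResult> endMethod)
    {
        var tcs = new TaskCompletionSource<TResult>();

        // Runs the End delegate and forwards its result or exception to the Task.
        void Complete()
        {
            try { tcs.TrySetResult(endMethod(asyncResult)); }
            catch (Exception ex) { tcs.TrySetException(ex); }
        }

        if (asyncResult.IsCompleted)
        {
            // Already done: no wait is needed at all.
            Complete();
            return tcs.Task;
        }

        // No callback can be attached after the fact, so ask the thread pool
        // to call us back when the handle is signaled. No polling is involved.
        // A production version would also keep the returned registration and
        // unregister it once the callback has run.
        ThreadPool.RegisterWaitForSingleObject(
            asyncResult.AsyncWaitHandle,
            (state, timedOut) => Complete(),
            null,
            Timeout.Infinite,
            executeOnlyOnce: true);

        return tcs.Task;
    }
}
```

Usage would look like `Task<int> t = provider.BeginX(null, null).ToTask(ar => provider.EndX(ar));`, where BeginX and EndX stand for whatever APM pair you have. The extra cost compared with the Begin/End overload is the wait handle plus the registered wait.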
Arrived here because I had a similar problem (with out parameters), but only on the End method. This is what I came up with, it might be helpful to someone:
var provider = new Provider();
return Task<Whatever>.Factory.FromAsync(provider.Begin, ar =>
{
    Whatever outparam;
    provider.End(out outparam);
    return outparam;
}, state);
It compiles :)
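If the callback can be passed to the Begin method, another option is to skip FromAsync and complete a TaskCompletionSource from the callback, which avoids the wait-handle path entirely. The following is only a sketch under assumptions: the delegate types BeginOp and EndOp<T> and the helper FromAsyncWithOut are invented here to stand in for an APM pair whose End method returns its result through an out parameter.

```csharp
using System;
using System.Threading.Tasks;

public static class ApmWrapper
{
    // Hypothetical delegate shapes for an unconventional APM pair.
    public delegate IAsyncResult BeginOp(AsyncCallback callback, object state);
    public delegate void EndOp<T>(IAsyncResult asyncResult, out T result);

    // Wraps the pair in a Task that is completed from the APM callback itself.
    public static Task<T> FromAsyncWithOut<T>(BeginOp begin, EndOp<T> end)
    {
        var tcs = new TaskCompletionSource<T>();
        try
        {
            begin(ar =>
            {
                try
                {
                    // Unpack the out parameter and hand it to the Task.
                    end(ar, out T result);
                    tcs.TrySetResult(result);
                }
                catch (Exception ex)
                {
                    tcs.TrySetException(ex);
                }
            }, null);
        }
        catch (Exception ex)
        {
            // Begin itself failed, so the callback will never run.
            tcs.TrySetException(ex);
        }
        return tcs.Task;
    }
}
```

Because completion is driven by the callback, nothing has to wait on AsyncWaitHandle.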
Generally, when implementing an asynchronous method on a class, I write something like this:
public Task<Guid> GetMyObjectIdAsync(string objectName)
{
    return Task.Run(() => GetMyObjectId(objectName));
}
private Guid GetMyObjectId(string objectName)
{
    using (var unitOfWork = _myUnitOfWorkFactory.CreateUnitOfWork())
    {
        var myObject = unitOfWork.MyObjects.Single(o => o.Name == objectName);
        return myObject.Id;
    }
}
This sort of pattern allows me to use the same logic synchronously and asynchronously, depending on the situation (most of my work is in an old code base, not a lot supports async calls yet), as I could expose the synchronous method publicly and get maximum compatibility if I need to.
Recently I've read several SO posts that suggest using Task.Run() is a bad idea, and should only be used under certain circumstances, but those circumstances did not seem very clear.
Is the pattern I've depicted above actually a bad idea? Am I losing some of the functionality/ intended purpose of async calls doing it this way? Or is this a legit implementation?
What you are doing is offloading a synchronous operation to another thread. If your thread is "special" then that's perfectly fine. One example of a "special" thread is a UI thread. In that case you may want to offload work off of it to keep the UI responsive (another example is some kind of listener).
In most cases however you're just moving work around from one thread to another. This doesn't add any value and does add unnecessary overhead.
So:
Is the pattern I've depicted above actually a bad idea?
Yes, it is. It's a bad idea to offload synchronous work to the ThreadPool and pretend as if it's asynchronous.
Am I losing some of the functionality/ intended purpose of async calls doing it this way?
There's actually nothing asynchronous about this operation to begin with. If you're executing this on a remote machine and you can benefit from doing it asynchronously, the operation itself needs to be truly asynchronous, meaning:
var myObject = await unitOfWork.MyObjects.SingleAsync(o => o.Name == objectName);
What you're currently doing is called "async over sync" and you probably shouldn't do it. More in Should I expose asynchronous wrappers for synchronous methods?
Recently I've read several SO posts that suggest using Task.Run() is a bad idea, and should only be used under certain circumstances, but those circumstances did not seem very clear.
The absolutely bare bones rules of thumb I tell people who are new to asynchrony are:
First, understand the purpose. Asynchrony is for mitigating the important inefficiencies of high-latency operations.
Is the thing you're doing low-latency? Then don't make it asynchronous in any way. Just do the work. It's fast. Using a tool to mitigate latency on low-latency tasks is just making your program unnecessarily complex.
Is the thing you're doing high-latency because it is waiting on a disk to spin or a packet to show up? Make this asynchronous but do not put it on another thread. You don't hire a worker to sit by your mailbox waiting for letters to arrive; the postal system is already running asynchronously to you. You don't need to hire people to make it more asynchronous. Read "There Is No Thread" if that's not clear.
Is the high-latency work waiting on a CPU to do some enormous computation? Like a computation that is going to take well over 10 ms? Then offload that task onto a thread so that the thread can be scheduled to an idle CPU.
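As a rough illustration of these three cases, the sketch below shows one method of each kind. The class and method names are made up for the example, and the file read merely stands in for any I/O-bound operation.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class LatencyExamples
{
    // Low latency: just do the work; no Task needed.
    public static int Add(int a, int b) => a + b;

    // High latency, I/O-bound: await the I/O itself and do not wrap it in
    // Task.Run. The stream is opened with useAsync: true to request
    // asynchronous file I/O from the operating system.
    public static async Task<string> ReadTextAsync(string path)
    {
        using (var stream = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.Read, 4096, useAsync: true))
        using (var reader = new StreamReader(stream))
        {
            return await reader.ReadToEndAsync();
        }
    }

    // High latency, CPU-bound: offload the computation to a thread pool
    // thread so it can be scheduled onto an idle CPU.
    public static Task<long> SumOfSquaresAsync(int n)
    {
        return Task.Run(() =>
        {
            long sum = 0;
            for (int i = 1; i <= n; i++) sum += (long)i * i;
            return sum;
        });
    }
}
```

Only the last method uses Task.Run, and it does so because the work really is CPU-bound.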
public Task<Guid> GetMyObjectIdAsync(string objectName)
When I see this, I expect there to be some advantage in using this method rather than just wrapping it in Task.Run() myself.
In particular, I'd expect it to release the thread when it hits some I/O or otherwise has the opportunity to do so.
Now consider if I have the code:
_resource = GetResourceForID(GetMyObjectIdAsync(SomeLongRunningWayToGetName()));
If I have a reason to need to have this done in a task, and I'm in the sort of situation where Task.Run() does actually make sense (I have a reason to offload it onto another thread) the best way to do this would be to wrap the whole thing:
Task task = Task.Run(() => _resource = GetResourceForID(GetMyObjectIdAsync(SomeLongRunningWayToGetName())));
Here Task.Run() might be a bad idea for me as the caller, or it might be good because I really am gaining from what it gives me.
However, if I see your signature I'm going to think that the best way to do this with your code would be to turn it into code that uses that method.
Task task = SomeLongRunningWayToGetName()
    .ContinueWith(t => GetMyObjectIdAsync(t.Result))
    .ContinueWith(t => _resource = GetResourceForIDAsync(t.Result));
(Or similar using async and await).
At best this chunks the work less well than the single Task.Run(). At worst I'm awaiting this just to gain from better asynchronicity that it doesn't actually offer, in a context that could make use of it if it were really there. (E.g. I might have used this in an MVC action that I'd made asynchronous because I thought the extra overhead would be repaid in better thread-pool use.)
So while Task.Run() is sometimes useful, in this case it's always bad. If you can't offer me greater asynchronicity than I can bring to the use of the class myself, don't lead me to believe you do.
Only offer a public XXXAsync() method if it really does call into asynchronous I/O.
If you really need to stub out an asynchronous method to e.g. match a signature of a shared base or interface, then it would be better as:
public Task<Guid> GetMyObjectIdAsync(string objectName)
{
    return Task.FromResult(GetMyObjectId(objectName));
}
This is bad too (the caller would still have been better off just calling GetMyObjectId() directly), but at least if code awaits it then while it operates on the same thread there's no overhead of using yet another thread to do the work, so if it's mixed in with other awaits the negative impact is reduced. It's therefore useful if you really need to return a Task but can't add anything useful in how you call it.
But if you don't really need to offer it, just don't.
(A private method calling Run() because every call site benefits from it is different, and there you're just adding convenience rather than calling Run() in several places, but that should be well-documented as such.)
A lot of APIs are moving toward exposing only asynchronous methods. How much of a performance hit is there in scenarios where you have to immediately wait on these methods? Am I wrong in assuming that it causes the current thread to wait on a spawned thread to complete? Or does the CLR perform some sort of magic in these scenarios and make it all execute in the same thread?
By "asynchronous methods", I assume you mean Task<T> based async methods.
So if you have a method that returns a Task<T> and you immediately call its Wait() method, that causes the current thread to wait on an internal WaitHandle object. The task most likely executes on a different thread and signals the WaitHandle when completed, which releases the waiting thread. There is no compiler optimization that turns this scenario into a synchronous call that I'm aware of.
This is of course more work than just calling a synchronous equivalent of the async method. However, depending on your use case, it probably won't be a significant difference.
The more important question is why you would want to lose the advantages of async by blocking the calling thread. That is generally not a good idea; you should ensure you have a very good reason to do this.
What is an asynchronous method? I think I know, but I keep confusing it with parallelism. I'm not sure what the difference between an asynchronous method and parallelism is.
Also, what is the difference between using threading classes and asynchronous classes?
EDIT
Some code demonstrating the difference between async, threading and parallelism would be useful.
What are asynchronous methods?
Asynchronous methods come into the discussion when we are talking about potentially lengthy operations. Typically we need such an operation to complete in order to meaningfully continue program execution, but we don't want to "pause" until the operation completes (because pausing might mean e.g. that the UI stops responding, which is clearly undesirable).
An asynchronous method is one that we call to start the lengthy operation. The method should do what it needs to start the operation and return "very quickly" so that there are no processing delays.
Async methods typically return a token that the caller can use to query if the operation has completed yet and what its result was. In some cases they take a callback (delegate) as an argument; when the operation is complete the callback is invoked to signal the caller that their results are ready and pass them back. This is a commonly used callback signature, although of course in general the callback can look like anything.
So who does actually run the lengthy operation?
I said above that an async method starts a lengthy operation, but what does "start" mean in this context? Since the method returns immediately, where is the actual work being done?
In the general case an execution thread needs to keep watch over the process. Since it's not the thread that called the async method that pauses, who does? The answer is, a thread picked for this purpose from the managed thread pool.
What's the connection with threading?
In this context my interpretation of "threading" is simply that you explicitly spin up a thread of your own and delegate it to execute the task in question synchronously. This thread will block for a time and presumably will signal your "main" thread (which is free to continue executing) when the operation is complete.
This designated worker thread might be pulled out of the thread pool (beware: doing very lengthy processing in a thread pool thread is not recommended!) or it might be one that you started just for this purpose.
First off, what is a method and what is a thread? A method is a unit of work that either (1) performs a useful side effect, like writing to a file, or (2) computes a result, like making a bitmap of a fractal. A thread is a worker that performs that work.
A method is synchronous if in order to use the method -- to get the side effect or the result -- your thread must do nothing else from the point where you request the work to be done until the point where it is finished.
A method is asynchronous if your thread tells the method that it needs the work to be done, and the method says "OK, I'll do that and I'll call you when it is finished".
Usually the way an asynchronous method does that is it makes another worker -- it grabs a thread from the pool. This is particularly true if the method needs to make heavy use of a CPU. But not always; there is no requirement that an asynchronous method spins up another thread.
Does that make sense?
Say you need to clean the house, cook the dinner and put the children to bed.
Synchronous:
You clean the house, then cook dinner, then put the children to bed.
Parallel:
You hire 3 people to clean the house, cook dinner and put the children to bed. But you don't trust them so keep a supervisory role, looking over them and waiting for them to finish. Only when they've all finished do they get paid.
Asynchronous:
You tell one child to clean the house and another to cook dinner. When each has finished their chores they put themselves to bed, while you put your feet up with a glass of wine in front of the TV.
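Since the question asked for code, here is a loose sketch of the same three styles. Everything in it is illustrative: Work is a made-up stand-in for a CPU-bound chore, and the asynchronous variant happens to use Task.Run, although an asynchronous method does not have to involve another thread at all.

```csharp
using System;
using System.Threading.Tasks;

public static class Chores
{
    // Hypothetical stand-in for one chore: a small CPU-bound computation.
    static int Work(int n)
    {
        int acc = 0;
        for (int i = 0; i < n; i++) acc += i % 7;
        return acc;
    }

    // Synchronous: the calling thread does each chore, one after another.
    public static int[] RunSynchronously(int[] sizes)
    {
        var results = new int[sizes.Length];
        for (int i = 0; i < sizes.Length; i++) results[i] = Work(sizes[i]);
        return results;
    }

    // Parallel: several workers run at once, but the caller still blocks
    // here until all of them have finished.
    public static int[] RunInParallel(int[] sizes)
    {
        var results = new int[sizes.Length];
        Parallel.For(0, sizes.Length, i => results[i] = Work(sizes[i]));
        return results;
    }

    // Asynchronous: the caller gets a Task back right away and is free to
    // do something else until it chooses to await the result.
    public static Task<int[]> RunAsync(int[] sizes)
    {
        var tasks = new Task<int>[sizes.Length];
        for (int i = 0; i < sizes.Length; i++)
        {
            int n = sizes[i]; // copy for the closure
            tasks[i] = Task.Run(() => Work(n));
        }
        return Task.WhenAll(tasks);
    }
}
```

All three produce the same results; what differs is who does the work and whether the caller has to stand and wait for it.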
First you have to understand that if you want parallelism, the whole structure needs to be parallel; that is, if you have an asynchronous method you need an asynchronous call.
In web services or web applications, asynchronous methods can be called (as just one of many ways) with AJAX, which is asynchronous. One method can contain multiple threads; this is the key difference between async methods and multiple threads.
The main point: if you make two calls at the same time to a standard method on the same controller from an asynchronous caller (like AJAX), the second call only begins once the first has completed. If the methods you called are asynchronous, both calls begin at the same time, and on a multi-core server this can achieve up to twice the standard speed (for two calls).
The speed of the parallelism is measured by this law.
Is there any way I can abstract away what thread a particular delegate may execute on, such that I could execute it on the calling thread initially, but move execution to a background thread if it ends up taking longer than a certain amount of time?
Assume the delegate is written to be asynchronous. I'm not trying to take synchronous blocks and move them to background threads to increase parallelism, but rather I'm looking to increase performance of asynchronous execution by avoiding the overhead of threads for simple operations.
Basically I'm wondering if there's any way the execution of a delegate or lambda can be paused, moved to another thread and resumed, if I could establish clear stack boundaries, etc.
I doubt this is possible, I'm just curious.
It is possible, but it would be awkward and difficult to get right. The best way to make this happen is to use coroutines. The only mechanism in .NET that currently fits the coroutine paradigm is C#'s iterators via the yield return keyword. You could theoretically hack something together that allows the execution of a method to transition from one thread to another [1]. This would be nothing less than a blog-worthy hack, but I do think it is possible [2].
The next best option is to go ahead and upgrade to the Async CTP. This is a feature that will be available in C# and will allow you to do exactly what you are asking for. This is accomplished elegantly with the proposed await keyword and some clever exploits that will also be included. The end result would look something like the following.
public async void SomeMethod()
{
    // Do stuff on the calling thread.
    await ThreadPool.SwitchTo(); // Switch to the ThreadPool.
    // Do stuff on a ThreadPool thread now!
    await MyForm.Dispatcher.SwitchTo(); // Switch to the UI thread.
    // Do stuff on the UI thread now!
}
This is just one of the many wicked cool tricks you can do with the new await keyword.
[1] The only way you can actually inject the execution of code onto an existing thread is if the target is specifically designed to accept the injection in the form of a work item.
[2] You can see my answer here for one such attempt at mimicking the await keyword with iterators. The MindTouch Dream framework is another, probably better, variation. The point is that it should be possible to cause the thread switching with some ingenious hacking.
Not easily.
If you structure your delegate as a state machine, you could track execution time between states and, when you reach your desired threshold, launch the next state in a new thread.
A simpler solution would be to launch it in a new thread to start with. Any reason that's not acceptable?
(posting from my phone - I'll provide some pseudocode when I'm at a real keyboard if necessary)
No I don't think that is possible. At least not directly with regular delegates. If you created some kind of IEnumerable that yielded after a little bit of work, then you could manually run a few iterations of it and then switch to running it on a background thread after so many iterations.
The ThreadPool and TPL's Task should be plenty performant, simply always run it on a background thread. Unless you have a specific benchmark showing that using a Task causes a bunch of overhead it sounds like you are trying to prematurely optimize.
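For what the iterator idea might look like, here is a minimal sketch, assuming the work can be written as a sequence that yields between small steps. The names SteppedWork and RunWithBudget are invented for the example. Steps run on the calling thread until a time budget is used up, and whatever is left is finished on the thread pool.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

public static class SteppedWork
{
    // Runs steps on the calling thread until the budget is spent, then
    // finishes the remaining steps on a thread pool thread.
    public static Task RunWithBudget<T>(IEnumerable<T> steps, TimeSpan budget)
    {
        IEnumerator<T> e = steps.GetEnumerator();
        var clock = Stopwatch.StartNew();
        try
        {
            while (clock.Elapsed < budget)
            {
                if (!e.MoveNext())
                {
                    // Finished within the budget: no thread switch needed.
                    e.Dispose();
                    return Task.CompletedTask;
                }
            }
        }
        catch
        {
            e.Dispose();
            throw;
        }

        // Budget exceeded: hand the partially consumed iterator to the pool.
        return Task.Run(() =>
        {
            using (e)
            {
                while (e.MoveNext()) { }
            }
        });
    }
}
```

The switch can only happen between steps, so a single long step still runs to completion on whichever thread started it. That limitation is why measuring first, as suggested above, is usually the better use of time.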
In the asynchronous programming model, there appear to be four ways (as stated in Calling Synchronous Methods Asynchronously) of making asynchronous method calls.
Calling the EndInvoke() method makes the calling thread wait for the method completion and returns the result.
Going through IAsyncResult.AsyncWaitHandle.WaitOne() also seems to do the same. AsyncWaitHandle gets a signal on completion (in other words, the main thread waits for the asynchronous method's completion). Then we can execute EndInvoke() to get the result.
What is the difference between calling the EndInvoke() directly and calling it after WaitOne()/WaitAll()?
In the polling technique we provide time for other threads to utilize the system resources by calling Thread.Sleep().
Does AsyncWaitHandle.WaitOne() or EndInvoke() make the main thread go to sleep while waiting?
Q1. There is no difference in the way your code or your application runs, but there might be some runtime differences (again, I'm not sure; this is a guess based on my understanding of async delegates).
IAsyncResult.AsyncWaitHandle is provided mainly as a synchronization mechanism for use with WaitAll() or WaitAny(). If you don't have this synchronization need, you shouldn't read the AsyncWaitHandle property. The reason: AsyncWaitHandle doesn't have to be implemented (created) by the delegate while it runs asynchronously until external code reads it. I'm not sure how the CLR handles async delegates and whether it creates a WaitHandle or not, but ideally, if it can run your async delegates without creating another WaitHandle, it will not. Your call to WaitOne() would create this handle, and you then have the extra responsibility of disposing (closing) it so that resources are released efficiently. The recommendation is therefore: when you have no synchronization requirement that WaitAll() or WaitAny() would support, don't read this property.
Q2. This Question answers the difference between Sleep and Wait.
Simple things first. For your second question, yes, WaitOne and EndInvoke do indeed make the current thread sleep while waiting.
For your first question, I can immediately identify 2 differences.
Using WaitOne requires the wait handle to be released, while using EndInvoke directly doesn't require any cleanup.
In return, using WaitOne allows for something to be done before EndInvoke, but after the task has been completed.
As for what that "something" might be, I don't really know. I suspect allocating resources to receive the output might be something that would need to be done before EndInvoke. If you really have no reason to do something at that moment, try not to bother yourself with WaitOne.
You can pass a timeout to WaitOne, so you could, for instance, perform some other activities on a regular basis whilst waiting for the operation to complete:
do {
    // Something else
} while (!waitHandle.WaitOne(100));
This would do something every ~100 milliseconds (plus however long the something else takes), until the operation completes.
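The same loop can be packaged as a small helper. This is only a sketch; the names WaitWithHeartbeat and WaitWhileTicking are made up, and it accepts any WaitHandle, so IAsyncResult.AsyncWaitHandle is just one possible argument.

```csharp
using System;
using System.Threading;

public static class WaitWithHeartbeat
{
    // Runs onTick, then waits up to interval for the handle, repeating until
    // the handle is signaled. Returns how many times onTick was invoked.
    public static int WaitWhileTicking(WaitHandle handle, TimeSpan interval, Action onTick)
    {
        int ticks = 0;
        do
        {
            onTick();   // the "something else"
            ticks++;
        } while (!handle.WaitOne(interval));
        return ticks;
    }
}
```

With an APM call you would pass asyncResult.AsyncWaitHandle and call EndInvoke once the helper returns. As noted earlier in the thread, reading AsyncWaitHandle may create the handle, so close it when you are done.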