WCF Async Operation + IO Operation - c#

What is the advantage of writing the following WCF service operation using Async CTP?
Won't Task.Factory.StartNew block a thread-pool thread for the duration of longRunningIOOperation anyway?
public async Task<string> SampleMethodAsync(string msg)
{
return await Task.Factory.StartNew(() =>
{
return longRunningIOOperation();
});
}
Is there a better way to write this so that we take advantage of I/O completion threads?

You'll need to make the longRunningIOOperation an asynchronous operation as well. As long as any operation in your code blocks the thread, some thread will be blocked, whether it's a threadpool one or the one in which your operation was called. If your operation is asynchronous, you can write something similar to the code below.
public Task<string> SampleMethodAsync(string msg)
{
var tcs = new TaskCompletionSource<string>();
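longRunningIOOperationAsync().ContinueWith(task =>
{
// Propagate every outcome; otherwise a fault would leave tcs.Task pending forever.
if (task.IsFaulted) tcs.SetException(task.Exception.InnerExceptions);
else if (task.IsCanceled) tcs.SetCanceled();
else tcs.SetResult(task.Result);
});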
return tcs.Task;
}
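With async/await available (C# 5 on .NET 4.5 or later), the forwarding above can usually be written without a TaskCompletionSource. The following is a minimal sketch, assuming the I/O happens to be a file read: ReadTextAsync is a hypothetical stand-in for longRunningIOOperationAsync, and the path parameter replaces msg purely for illustration. The FileStream is opened with useAsync: true so that reads can be issued as asynchronous I/O rather than blocking a thread.

```csharp
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class AsyncIoSketch
{
    // Hypothetical stand-in for longRunningIOOperationAsync: an asynchronous file read.
    public static async Task<string> ReadTextAsync(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, 4096, useAsync: true))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            // No thread is blocked while the read is in flight.
            return await reader.ReadToEndAsync();
        }
    }

    // The service operation simply awaits the asynchronous operation.
    public static async Task<string> SampleMethodAsync(string path)
    {
        return await ReadTextAsync(path);
    }
}
```

Exceptions and cancellation flow through the returned task automatically, which is what the manual continuation has to take care of by hand.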

I finally figured out how this works. I installed .NET Framework 4.5 and everything worked like a charm.
In my scenario, Service A makes a call to Service B like this.
public class ServiceA : IServiceA
{
public async Task<string> GetGreeting(string name)
{
ServiceBClient client = new ServiceBClient();
return await client.GetGreetingAsync();
}
}
client.GetGreetingAsync() takes 10 seconds to process. My understanding is that the Service A request thread will not be blocked by calling GetGreetingAsync().
Can you explain how this is implemented by WCF behind the scenes, or point me to some documentation that explains how all this works from the perspective of WCF?

Related

Task.Run() schedules tasks randomly / erratically under high loads

I am processing emails in bulk, and I use Task to process them asynchronously without affecting the primary work. (Basically, the email-sending functionality should run on a different thread than the primary thread.) Imagine processing more than 1K emails per 30 seconds in a Windows Service.
The issue I am facing is that many times the task is not executed; it behaves completely randomly. Technically, it schedules the task randomly: sometimes I receive the call in the SendEmail method and sometimes I don't. I have tried both of the approaches below.
Method 1
public void ProcessMails()
{
Task.Run(() => SendEmail(emailModel));
}
Method 2
public async void ProcessMails()
{
// here the SendEmail method is awaitable, but I have not used 'await' because
// I need non-blocking operation on main thread.
SendEmail(emailModel);
}
Would anybody please let me know what the problem could be, or whether I am missing something here?
As has been noted, it seems you're running out of resources to schedule the tasks that ultimately send your emails. Right now the sample code tries to force-feed all the work that needs to be scheduled immediately.
The other answer suggests using a blocking collection, but I think there is a cleaner, easier way. This sample should at least give you the right idea.
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
namespace ClassLibrary1
{
public class MailHandler
{
private EmailLibrary emailLibrary = new EmailLibrary();
private ExecutionDataflowBlockOptions options = new ExecutionDataflowBlockOptions() { MaxDegreeOfParallelism = Environment.ProcessorCount };
private ActionBlock<string> messageHandler;
public MailHandler() => messageHandler = new ActionBlock<string>(msg => DelegateSendEmail(msg), options);
public Task ProcessMail(string message) => messageHandler.SendAsync(message);
private Task DelegateSendEmail(string message) => emailLibrary.SendEmail(message);
}
public class EmailLibrary
{
public Task SendEmail(string message) => Task.Delay(1000);
}
}
Given the fairly high frequency of sending email, it is likely that you are scheduling too many Tasks for the Scheduler.
In Method 1, calling Task.Run will create a new task each time, each of which needs to be scheduled on a thread. It is quite likely that you are exhausting your thread pool by doing this.
Although Method 2 will be less Task hungry, even with unawaited Task invocation (fire and forget), the completion of the continuation after the async method will still need to be scheduled on the Threadpool, which will adversely affect your system.
Instead of unawaited Tasks or Task.Run, and since this is a Windows Service, I would have a long-running background thread dedicated to sending emails. This thread can work independently of your primary work, and emails can be handed to it via a queue.
If a single mail-sending thread is insufficient to keep pace with the mails, you can increase the number of EmailSender threads, but constrain this to a reasonable, finite number.
You should explore other optimizations too, which again will improve the throughput of your email sender e.g.
Can the email senders keep long lived connections to the mail server?
Does the mail server accept batches of email?
Here's an example using BlockingCollection with a backing ConcurrentQueue of your email message Model.
Creating a queue which is shared between the producer "PrimaryWork" thread and the "EmailConsumer" thread (obviously, if you have an IoC container, it's best registered there)
Enqueuing mails on the primary work thread
The consumer EmailSender runs a loop on the blocking collection queue until CompleteAdding is called
I've used a TaskCompletionSource to provide a Task which will complete once all messages have been sent, i.e. so that graceful exit is possible without losing emails still in the queue.
public class PrimaryWork
{
private readonly BlockingCollection<EmailModel> _enqueuer;
public PrimaryWork(BlockingCollection<EmailModel> enqueuer)
{
_enqueuer = enqueuer;
}
public void DoWork()
{
// ... do your work
for (var i = 0; i < 100; i++)
{
EnqueueEmail(new EmailModel {
To = $"recipient{i}@foo.com",
Message = $"Message {i}" });
}
}
// i.e. Queue work for the email sender
private void EnqueueEmail(EmailModel message)
{
_enqueuer.Add(message);
}
}
public class EmailSender
{
private readonly BlockingCollection<EmailModel> _mailQueue;
private readonly TaskCompletionSource<string> _tcsIsCompleted
= new TaskCompletionSource<string>();
public EmailSender(BlockingCollection<EmailModel> mailQueue)
{
_mailQueue = mailQueue;
}
public void Start()
{
Task.Run(() =>
{
try
{
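// GetConsumingEnumerable blocks until an item arrives and ends cleanly after CompleteAdding,
// whereas Take() throws InvalidOperationException if the queue completes while it is waiting.
foreach (var nextMessage in _mailQueue.GetConsumingEnumerable())
{
SendEmail(nextMessage).Wait();
}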
_tcsIsCompleted.SetResult("ok");
}
catch (Exception)
{
_tcsIsCompleted.SetResult("fail");
}
});
}
public async Task Stop()
{
_mailQueue.CompleteAdding();
await _tcsIsCompleted.Task;
}
private async Task SendEmail(EmailModel message)
{
// IO bound work to the email server goes here ...
}
}
Example of bootstrapping and starting the above producer / consumer classes:
public static async Task Main(string[] args)
{
var theQueue = new BlockingCollection<EmailModel>(new ConcurrentQueue<EmailModel>());
var primaryWork = new PrimaryWork(theQueue);
var mailSender = new EmailSender(theQueue);
mailSender.Start();
primaryWork.DoWork();
// Wait for all mails to be sent
await mailSender.Stop();
}
I've put a complete sample up on Bitbucket here
Other Notes
The blocking collection (and the backing ConcurrentQueue) are thread safe, so you can use more than one producer and consumer thread concurrently.
As per above, batching is encouraged, and asynchronous parallelism is possible (Since each mail sender uses a thread, Task.WaitAll(tasks) will wait for a batch of tasks). A totally asynchronous sender could obviously use await Task.WhenAll(tasks).
As per comments below, I believe the nature of your system (i.e. Windows Service, with 2k messages / minute) warrants at least one dedicated thread for email sending, despite emailing likely being inherently IO bound.
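The batching idea from the notes above can be sketched for a fully asynchronous sender as follows. This is only an illustration under stated assumptions: SendEmailAsync is a hypothetical placeholder that simulates the I/O with Task.Delay, the batch size is arbitrary, and the counter exists only so the effect can be checked.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BatchSenderSketch
{
    private static int _sent;

    public static int SentCount => Volatile.Read(ref _sent);

    // Hypothetical stand-in for the real, I/O-bound call to the mail server.
    private static async Task SendEmailAsync(string message)
    {
        await Task.Delay(10);
        Interlocked.Increment(ref _sent);
    }

    // Sends the messages batch by batch. The sends inside one batch run
    // concurrently, and the next batch starts only when the previous one is done.
    public static async Task SendAllAsync(IReadOnlyList<string> messages, int batchSize)
    {
        for (var offset = 0; offset < messages.Count; offset += batchSize)
        {
            var batch = messages.Skip(offset).Take(batchSize)
                                .Select(m => SendEmailAsync(m));
            await Task.WhenAll(batch);
        }
    }
}
```

Because each batch is awaited with Task.WhenAll, the number of sends in flight never exceeds batchSize, and no thread is held while the sends are pending.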

Task.Run to increase parallelism of IO-bound operations?

I'm getting a bit confused with Task.Run and all I read about it on the internet. So here's my case: I have some function that handles incoming socket data:
public async Task Handle(Client client)
{
while (true)
{
var data = await client.ReadAsync();
await this.ProcessData(client, data);
}
}
but this has a disadvantage that I can only read next request once I've finished processing the last one. So here's a modified version:
public async Task Handle(Client client)
{
while (true)
{
var data = await client.ReadAsync();
Task.Run(async () => {
await this.ProcessData(client, data);
});
}
}
This is a simplified version. In a more advanced one I would of course restrict the maximum number of parallel requests.
Anyway this ProcessData is mostly IO-bound (doing some calls to dbs, very light processing and sending data back to client) yet I keep reading that I should use Task.Run with CPU-bound functions.
Is that a correct usage of Task.Run for my case? If not what would be an alternative?
Conceptually, that is a fine usage of Task.Run. It's very similar to how ASP.NET dispatches requests: (asynchronously) reading a request and then dispatching (synchronous or asynchronous) work to the thread pool.
In practice, you'll want to ensure that the result of ProcessData is handled properly. In particular, you'll want to observe exceptions. As the code currently stands, any exceptions from ProcessData will be swallowed, since the task returned from Task.Run is not observed.
IMO, the cleanest way to handle per-message errors is to have your own try/catch, as such:
public async Task Handle(Client client)
{
while (true)
{
var data = await client.ReadAsync();
Task.Run(async () => {
try { await this.ProcessData(client, data); }
catch (Exception ex) {
// TODO: handle
}
});
}
}
where the // TODO: handle is the appropriate error-handling code for your application. E.g., you might send an error response on the socket, or just log-and-ignore.
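The question also mentions restricting the maximum number of parallel requests. One possible way to do that, sketched here with a SemaphoreSlim, is to wait for a free slot before starting each piece of work. ThrottledDispatcher and its members are hypothetical names introduced for illustration, and the catch block only logs to the console where real code would put its own handling.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class ThrottledDispatcher
{
    private readonly SemaphoreSlim _slots;

    public ThrottledDispatcher(int maxParallel)
    {
        _slots = new SemaphoreSlim(maxParallel);
    }

    // Waits (asynchronously) for a free slot, then starts the work without
    // waiting for it to finish. The returned inner task completes with the work.
    public async Task<Task> DispatchAsync(Func<Task> work)
    {
        await _slots.WaitAsync();
        return RunAsync(work);
    }

    private async Task RunAsync(Func<Task> work)
    {
        try
        {
            await work();
        }
        catch (Exception ex)
        {
            // Per-message error handling goes here; logging is just a placeholder.
            Console.Error.WriteLine(ex);
        }
        finally
        {
            // Free the slot so the read loop can dispatch the next message.
            _slots.Release();
        }
    }
}
```

In the read loop, `await dispatcher.DispatchAsync(() => ProcessData(client, data));` would then pause reading only while all slots are busy. The work starts on the calling thread and runs there until its first await, which is normally fine for I/O-bound processing.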

Does Task.Run scale as well as using Tasks from example WebApi?

We have a lot of requests in our system, so we use Tasks with WebApi. In some places we have high requirements on speed, so we can't wait for the Task to complete; I have created a Worker for this. It creates a nested container so that Entity Framework's DbContext won't get disposed, etc. But it looks like Task.Run spawns a new thread each time; how well will this scale?
public class BackgroundWorker<TScope> : IBusinessWorker<TScope>, IRegisteredObject where TScope : class
{
private readonly IBusinessScope<TScope> _scope;
private bool _started;
private bool _stopping;
public BackgroundWorker(IBusinessScope<TScope> scope)
{
_scope = scope;
}
public void Run(Func<TScope, Task> action)
{
if(_stopping) throw new Exception("App pool is recycling, cant queue work");
if(_started) throw new Exception("You cant call Run multiple times");
_started = true;
HostingEnvironment.RegisterObject(this);
Task.Run(() =>
action(_scope.EntryPoint).ContinueWith(t =>
{
_scope.Dispose();
HostingEnvironment.UnregisterObject(this);
}));
}
public void Stop(bool immediate)
{
_stopping = true;
if(immediate)
HostingEnvironment.UnregisterObject(this);
}
}
Used like
backgroundWorker.Run(async ctx => await ctx.AddRange(foos).Save());
If I google this, every example ends up using Task.Run, but doesn't that defeat the purpose?
Update:
Did a test
var guid = Guid.NewGuid();
_businessWorker.Run(async ctx => {
System.Diagnostics.Debug.WriteLine("{0}: {1}", guid, Thread.CurrentThread.ManagedThreadId);
await Task.Delay(1);
System.Diagnostics.Debug.WriteLine("{0}: {1}", guid, Thread.CurrentThread.ManagedThreadId);
});
This outputs
3bdbe90b-c31e-4709-95d8-f7516210b0ac: 17
3bdbe90b-c31e-4709-95d8-f7516210b0ac: 9
6548fd26-d209-4427-9a91-40fc30aa509e: 15
6548fd26-d209-4427-9a91-40fc30aa509e: 19
7411b043-4fae-44bf-b93f-4273a532afa1: 7
7411b043-4fae-44bf-b93f-4273a532afa1: 17
This indicates that Task.Run actually works the way I think it should.
With real DB code it looks like this
a939713d-d728-46c9-be33-aa57704cf242: 19 <--
a939713d-d728-46c9-be33-aa57704cf242: 19 <-- Used same for entire work
7e588a42-afd0-4ab5-ba6b-f8520c889cde: 7
7e588a42-afd0-4ab5-ba6b-f8520c889cde: 19 <-- Reused first works thread when work #2 continued
6f3b067f-f478-43f9-8411-8142b449c28b: 8
6f3b067f-f478-43f9-8411-8142b449c28b: 18
update:
I tried Luaan's approach. It seems to work with Tasks spawned from Entity Framework or the Web API HttpClient, but with manually created Tasks like the one below it does not work well: some are executed and some are not. With Task.Run, all are executed.
_businessWorkerFactory().Run(async ctx =>
{
var guid = Guid.NewGuid();
System.Diagnostics.Debug.WriteLine("{0}: {1}", guid, Thread.CurrentThread.ManagedThreadId);
var completion = new TaskCompletionSource<bool>();
ThreadPool.QueueUserWorkItem(obj =>
{
Thread.Sleep(1000);
completion.SetResult(true);
});
await completion.Task;
System.Diagnostics.Debug.WriteLine("{0}: {1}", guid, Thread.CurrentThread.ManagedThreadId);
});
Task.Run schedules the task to run on a thread pool thread. The same thread pool that handles requests.
On an ASP.NET application, sending work to the thread pool steals threads that might be necessary to handle requests.
Given your requirements, I think you would be better queuing that work to another service/process using something like MSMQ.
Task.Run doesn't spawn a new thread - it borrows one from the thread pool (assuming the thread pool task scheduler - there's different schedulers, and you can write your own as well). When you use await inside of Task.Run, it will still work as usual - freeing the thread pool thread until a callback is posted.
However, exactly for that reason, there's little point in using Task.Run for I/O work. If you have asynchronous I/O to do, just do it - it will work exactly the same, without requiring a context switch. You must make it asynchronous though - if it's just blocking code, you're taking up valuable threads from the thread pool.
Note that you don't need to wait for an asynchronous request to finish. If the asynchronous action you are performing doesn't need much time to set up (that is, it returns the Task almost immediately, even though the work isn't finished), you can just call it:
public async Task SomeAsync()
{
var request = new MyRequest();
await request.MakeRequestAsync();
...
}
public void Start()
{
var task = SomeAsync();
// Now the task is started, and we can use it for future reference. Or just wire up
// some error handling continuations etc. - though it's usually a better idea to do that
// within SomeAsync directly.
}
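The error-handling remark in the comment above can be made concrete with a small wrapper that observes the task's outcome. This is a sketch under stated assumptions: SomeAsync here is a hypothetical operation that optionally fails after its first await, and the onError callback stands in for whatever logging the application uses.

```csharp
using System;
using System.Threading.Tasks;

public static class FireAndObserveSketch
{
    // Hypothetical asynchronous operation; it returns its Task almost immediately.
    public static async Task SomeAsync(bool fail)
    {
        await Task.Delay(10);
        if (fail) throw new InvalidOperationException("request failed");
    }

    // Awaits the task only to observe its outcome, so a failure is reported
    // instead of being silently dropped.
    private static async Task ObserveAsync(Task task, Action<Exception> onError)
    {
        try { await task; }
        catch (Exception ex) { onError(ex); }
    }

    // Starts the operation without Task.Run and without blocking the caller.
    // The returned task can be ignored, or kept if the caller wants to wait later.
    public static Task Start(bool fail, Action<Exception> onError)
    {
        return ObserveAsync(SomeAsync(fail), onError);
    }
}
```

The caller gets control back as soon as SomeAsync reaches its first await, and the wrapper guarantees that any exception is seen by onError.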

TPL inside Windows Service

I need to perform a few tasks in parallel inside a Windows Service I am writing. I am using VS2013 and .NET 4.5, and the thread "Basic design pattern for using TPL inside windows service for C#" suggests that TPL is the way to go.
Below is my implementation. I was wondering if anyone can tell me if I have done it correctly!
public partial class FtpLink : ServiceBase
{
private readonly CancellationTokenSource _cancellationTokenSource = new CancellationTokenSource();
private readonly ManualResetEvent _runCompleteEvent = new ManualResetEvent(false);
public FtpLink()
{
InitializeComponent();
// Load configuration
WebEnvironment.Instance.Initialise();
}
protected override void OnStart(string[] args)
{
Trace.TraceInformation("DatabaseToFtp is running");
try
{
RunAsync(_cancellationTokenSource.Token).Wait();
}
finally
{
_runCompleteEvent.Set();
}
}
protected override void OnStop()
{
Trace.TraceInformation("DatabaseToFtp is stopping");
_cancellationTokenSource.Cancel();
_runCompleteEvent.WaitOne();
Trace.TraceInformation("DatabaseToFtp has stopped");
}
private async Task RunAsync(CancellationToken cancellationToken)
{
while (!cancellationToken.IsCancellationRequested)
{
Trace.TraceInformation("Working");
// Do the actual work
var tasks = new List<Task>
{
Task.Factory.StartNew(() => new Processor().ProcessMessageFiles(), cancellationToken),
Task.Factory.StartNew(() => new Processor().ProcessFirmware(), cancellationToken)
};
Task.WaitAll(tasks.ToArray(), cancellationToken);
// Delay the loop for a certain time
await Task.Delay(WebEnvironment.Instance.DatabasePollInterval, cancellationToken);
}
}
}
There are a few things I would do differently:
OnStart should execute in a timely fashion. Common practice is to defer work to a background thread which is in charge of doing the actual work. You're actually doing that but blocking that thread with a call to Task.Wait, which kind of makes the offloading to a background thread useless, because execution becomes synchronous again.
You're using the sync over async anti-pattern, this should be mostly avoided. Let the calling method invoke the work in parallel.
I think you might be using the ManualResetEvent the other way around. You're wrapping your RunAsync method in a try-finally block, but you're only calling WaitOne from OnStop. I'm not really sure you need a lock here at all, it doesn't seem (from your current code) that this code is being invoked in parallel. Instead, you can store the Task returned by RunAsync in a field and wait on it to complete.
You're using the blocking version, WaitAll. Instead, you could use the asynchronous version, Task.WhenAll, which can be asynchronously waited.
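Taken together, the points above suggest a shape roughly like the following. It is a sketch rather than a drop-in replacement: PollingWorker is a hypothetical plain class standing in for the ServiceBase subclass (Start and Stop play the roles of OnStart and OnStop), the two Process methods are empty placeholders, and the iteration counter exists only to make the behavior observable.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class PollingWorker
{
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();
    private readonly TimeSpan _interval;
    private Task _runTask;
    private int _iterations;

    public PollingWorker(TimeSpan interval) { _interval = interval; }

    public int Iterations => Volatile.Read(ref _iterations);

    // Role of OnStart: kick off the loop, keep the Task, and return immediately.
    public void Start()
    {
        _runTask = Task.Run(() => RunAsync(_cts.Token));
    }

    // Role of OnStop: request cancellation, then wait for the loop to exit.
    public void Stop()
    {
        _cts.Cancel();
        try { _runTask.Wait(); }
        catch (AggregateException ex) when (ex.InnerException is OperationCanceledException)
        {
            // Cancellation is the expected way for the loop to end.
        }
    }

    private async Task RunAsync(CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            // Run both jobs in parallel and await them without blocking a thread.
            await Task.WhenAll(
                Task.Run(() => ProcessMessageFiles(), token),
                Task.Run(() => ProcessFirmware(), token));
            Interlocked.Increment(ref _iterations);
            await Task.Delay(_interval, token);
        }
    }

    private void ProcessMessageFiles() { /* placeholder for the real work */ }
    private void ProcessFirmware() { /* placeholder for the real work */ }
}
```

Storing the task in a field removes the need for the ManualResetEvent, and Stop is the only place that blocks.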

WCF client blocks on async methods

I'm working on a WCF client app, and I am facing difficulties with the async/await pattern.
It seems that the line:
await client.LongOperationAsync();
always blocks. As I understand it, the thread is supposed to return to the Main() method and come back when the async method completes, but maybe I'm wrong.
The output for the code below is (always):
Test() started
Test() error
*
*
*
...
The Test() method always completes before the context returns to main. Any thoughts would be highly appreciated.
static void Main(string[] args)
{
Program p = new Program();
p.Test();
while (true)
{
Console.WriteLine("*");
Thread.Sleep(500);
}
}
private async Task Test()
{
Console.WriteLine("Test() started");
try
{
MySoapClient client = new MySoapClient(
new BasicHttpBinding(new BasicHttpSecurityMode()),
new EndpointAddress("http://badaddress"));
await client.LongOperationAsync();
Console.WriteLine("Test() success");
}
catch (Exception)
{
Console.WriteLine("Test() error");
return;
}
Console.WriteLine("Test() end successfully");
}
Async methods execute synchronously until the first await; if your LongOperationAsync method performs a blocking operation before its first await, the calling method will be blocked as well. I suspect that's what's happening in your case.
This is probably because WebRequest.BeginGetResponse performs some of its work synchronously. See Stephen Toub's answer to this question:
The Async CTP's GetRequestStreamAsync and GetResponseAsync are simple
wrappers around the existing HttpWebRequest.BeginGetRequestStream and
BeginGetResponse in .NET 4. Those Begin* methods have a lot of setup
work they do (e.g. proxy, DNS, connection pooling, etc.) before they
can submit a request, and unfortunately today that work all happens
synchronously as part of the Begin* call.
In this case, you provided a bad domain name, so I suspect it takes a while for the DNS resolution to fail.
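The effect is easy to reproduce in isolation. In the sketch below, LongOperationAsync is a hypothetical method whose Thread.Sleep stands in for the synchronous setup work (DNS, proxy and so on), and the two helpers measure how long the call takes to hand back a Task. Wrapping the call in Task.Run is shown only as one possible workaround for keeping the caller responsive; it is not something the answer above proposes, and it moves the blocking onto a thread-pool thread rather than removing it.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class SyncPrefixSketch
{
    // Hypothetical operation: blocking setup runs before the first await.
    public static async Task LongOperationAsync(int setupMs)
    {
        Thread.Sleep(setupMs);   // synchronous part, runs on the caller's thread
        await Task.Delay(10);    // the genuinely asynchronous part
    }

    // Calls the method directly: the caller is held for the whole setup phase.
    public static TimeSpan TimeDirectCall(int setupMs, out Task task)
    {
        var sw = Stopwatch.StartNew();
        task = LongOperationAsync(setupMs);
        return sw.Elapsed;
    }

    // Offloads the call: the setup phase blocks a thread-pool thread instead.
    public static TimeSpan TimeOffloadedCall(int setupMs, out Task task)
    {
        var sw = Stopwatch.StartNew();
        task = Task.Run(() => LongOperationAsync(setupMs));
        return sw.Elapsed;
    }
}
```

With a setup time of a few hundred milliseconds, the direct call should take about that long to return, while the offloaded call should return almost immediately.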
