Wrapping a rate-limiting API call - C#

I have access to an API that accepts a maximum rate of calls per second. If the rate is exceeded, an exception is thrown.
I would like to wrap this call in an abstraction that does whatever is necessary to keep the call rate under the limit. It would act like a network router: handling multiple calls and returning the results to the correct caller, while taking care of the call rate. The goal is to make the calling code as unaware of that limitation as possible. Otherwise, every part of the code making this call would have to be wrapped in a try-catch!
For example: imagine that you can call a method from an external API that can add 2 numbers. This API can be called 5 times per second. Anything more frequent than this will result in an exception.
To illustrate the problem, the external service that limits the call rate is like the one in the answer to this question:
How to build a rate-limiting API with Observables?
ADDITIONAL INFO:
Since you don't want to worry about that limit every time you call this method from any part of your code, you think about designing a wrapper method that you could call without worrying about the rate limit. On the inside it takes care of the limit, but on the outside it exposes a simple async method.
It's similar to a web server: how does it return the correct set of results to the correct client?
Multiple callers will call this method, and they will get the results as they come. This abstraction should act like a proxy.
How could I do it?
I'm sure the signature of the wrapper method should look like
public async Task<Results> MyMethod()
And inside the method it will perform the logic, maybe using Reactive Extensions (Buffer). I don't know.
But how? I mean, multiple calls to this method should return the results to the correct caller. Is this even possible?
Thank you a lot!

There are rate-limiting libraries available (see Esendex's TokenBucket on GitHub or NuGet).
Usage is very simple; this example would limit polling to 1 call per second:
// Create a token bucket with a capacity of 1 token that refills at a fixed interval of 1 token/sec.
ITokenBucket bucket = TokenBuckets.Construct()
    .WithCapacity(1)
    .WithFixedIntervalRefillStrategy(1, TimeSpan.FromSeconds(1))
    .Build();

// ...

while (true)
{
    // Consume a token from the token bucket. If a token is not available, this method will block
    // until the refill strategy adds one to the bucket.
    bucket.Consume(1);
    Poll();
}
I also needed to make it async for a project of mine, so I simply wrote an extension method:
public static class TokenBucketExtensions
{
    public static Task ConsumeAsync(this ITokenBucket tokenBucket)
    {
        // Offload the blocking Consume call to the thread pool so callers can await it.
        return Task.Factory.StartNew(tokenBucket.Consume);
    }
}
Using this, you wouldn't need to throw/catch exceptions, and writing a wrapper becomes fairly trivial.
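For instance, a minimal sketch of such a wrapper, using the question's 5 calls/second limit (externalApi and its AddAsync method are hypothetical stand-ins for the rate-limited service):

public class RateLimitedCalculator
{
    // One shared bucket for all callers: 5 tokens, refilled at 5 tokens/second.
    private readonly ITokenBucket bucket = TokenBuckets.Construct()
        .WithCapacity(5)
        .WithFixedIntervalRefillStrategy(5, TimeSpan.FromSeconds(1))
        .Build();

    public async Task<int> AddAsync(int a, int b)
    {
        // Wait for a free slot instead of catching rate-limit exceptions.
        await bucket.ConsumeAsync();
        // externalApi is whatever client exposes the rate-limited call (hypothetical).
        return await externalApi.AddAsync(a, b);
    }
}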

What exactly you should do depends on your goals and limitations. My assumptions:
you want to avoid making requests while the rate limiter is in effect
you can't predict whether a specific request would be denied or how long it will take for another request to be allowed again
you don't need to make multiple requests concurrently, and when multiple requests are waiting, it does not matter in which order they are completed
If these assumptions are valid, you could use AsyncAutoResetEvent from AsyncEx: wait for it to be set before making the request; set it after successfully making a request; and set it after a delay when the request is rate-limited.
The code can look like this:
class RateLimitedWrapper<TException> where TException : Exception
{
    // Starts set so the very first request can proceed immediately.
    private readonly AsyncAutoResetEvent autoResetEvent = new AsyncAutoResetEvent(set: true);

    public async Task<T> Execute<T>(Func<Task<T>> func)
    {
        while (true)
        {
            try
            {
                await autoResetEvent.WaitAsync();
                var result = await func();
                autoResetEvent.Set();
                return result;
            }
            catch (TException)
            {
                // Rate limited: release the next waiter only after a back-off delay.
                var ignored = Task.Delay(500).ContinueWith(_ => autoResetEvent.Set());
            }
        }
    }
}
Usage:
public static Task<int> Add(int a, int b)
{
    return rateLimitedWrapper.Execute(() => rateLimitingCalculator.Add(a, b));
}

A variant is to enforce a minimum time between calls, something like the following:
private readonly object syncLock = new object();
private readonly TimeSpan minTimeout = TimeSpan.FromSeconds(5);
private DateTime nextCallDate = DateTime.MinValue; // guarded by syncLock (volatile is not valid on DateTime)

public async Task<Result> RequestData(...) {
    DateTime possibleCallDate = DateTime.Now;
    lock (syncLock) {
        // When is it possible to make the next call?
        if (nextCallDate > possibleCallDate) {
            possibleCallDate = nextCallDate;
        }
        nextCallDate = possibleCallDate + minTimeout;
    }

    TimeSpan waitingTime = possibleCallDate - DateTime.Now;
    if (waitingTime > TimeSpan.Zero) {
        await Task.Delay(waitingTime);
    }

    return await ... /* the actual call to API */ ...;
}

Older thread, but in this context I would also like to mention the Polly library (https://github.com/App-vNext/Polly), which can be very helpful in this scenario.
It can do the same (and more) as TokenBucket (https://github.com/esendex/TokenBucket), mentioned in another answer, but that library is a bit older and doesn't support .NET Standard by default, for example.
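As a rough sketch, a Polly retry policy that backs off whenever the rate-limit exception surfaces might look like this (RateLimitException and externalApi are stand-ins for whatever the API actually throws and exposes):

using Polly;

// Retry up to 5 times, backing off a little longer on each attempt when the
// (hypothetical) RateLimitException is thrown by the wrapped call.
var policy = Policy
    .Handle<RateLimitException>()
    .WaitAndRetryAsync(5, attempt => TimeSpan.FromMilliseconds(200 * attempt));

int sum = await policy.ExecuteAsync(() => externalApi.AddAsync(2, 3));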

Related

ASP.NET Web API: One long method, how to ensure it'll be called once

I have a long-running request to a web service whose result should be cached on the server side after completion. My problem is that I don't know how to prevent it from being called concurrently/simultaneously before the first request's result has been cached.
My thought is that I should create a data-request Task and store it in a concurrent dictionary. Every other request should then check whether a Task is already running and wait for it to complete.
I've ended up with this:
private static ConcurrentDictionary<string, Task> tasksCache = new ConcurrentDictionary<string, Task>();

public static T GetFromCache<T>(this ICacheManager<object> cacheManager, string name, Func<T> func)
{
    if (cacheManager.Exists(name))
        return (T)cacheManager[name];

    if (tasksCache.ContainsKey(name))
    {
        tasksCache[name].Wait();
        return (tasksCache[name] as Task<T>).Result;
    }

    var runningTask = Task.Run(() => func.Invoke());
    tasksCache[name] = runningTask;
    runningTask.Wait();

    var data = runningTask.Result;
    cacheManager.Put(name, data);
    tasksCache.TryRemove(name, out Task t);
    return data;
}
But this looks messy. Is there a better way?
I'd consider wrapping these in a Lazy<T> for each task, which has built-in semantics for controlling concurrent initialization.
This example demonstrates the use of the Lazy<T> class to provide lazy initialization with access from multiple threads.
You'll want to specify an appropriate LazyThreadSafetyMode. From the documentation:
ExecutionAndPublication: Fully thread safe; uses locking to ensure that only one thread initializes the value.
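A minimal sketch of that idea against the question's code (same cacheManager and func; the Lazy guarantees func runs only once even if GetOrAdd races):

private static readonly ConcurrentDictionary<string, Lazy<Task<object>>> tasksCache =
    new ConcurrentDictionary<string, Lazy<Task<object>>>();

public static T GetFromCache<T>(this ICacheManager<object> cacheManager, string name, Func<T> func)
{
    if (cacheManager.Exists(name))
        return (T)cacheManager[name];

    // GetOrAdd may invoke the factory more than once under contention, but
    // ExecutionAndPublication ensures only one Lazy actually executes func.
    var lazy = tasksCache.GetOrAdd(name, _ =>
        new Lazy<Task<object>>(
            () => Task.Run(() => (object)func()),
            LazyThreadSafetyMode.ExecutionAndPublication));

    var data = (T)lazy.Value.Result;
    cacheManager.Put(name, data);
    tasksCache.TryRemove(name, out _);
    return data;
}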


Best way to call many web services?

I have 30 sub-companies and every one has implemented their own web service (with different technologies).
I need to implement a web service to aggregate them. For example, all the sub-company web services have a web method named GetUserPoint(int nationalCode), and I need to implement my web service to call all of them and collect the responses (for example, the sum of points).
This is my base class:
public abstract class BaseClass
{   // all same attributes and methods
    public abstract long GetPoint(int nationalCode);
}
For each sub-company's web service, I implement a class that inherits this base class and defines its own GetPoint method.
public class Company1 : BaseClass
{
    // implement own GetPoint method (call a web service).
}
to
public class CompanyN : BaseClass
{
    // implement own GetPoint method (call a web service).
}
So, this is my web method:
[WebMethod]
public long MyCollector(int nationalCode)
{
    BaseClass[] Clients = new BaseClass[] { new Company1(), /* ... */ new CompanyN() };
    long Result = 0;
    foreach (var item in Clients)
    {
        long ResultTemp = item.GetPoint(nationalCode);
        Result += ResultTemp;
    }
    return Result;
}
OK, it works, but it's slow, because every sub-company's web service is hosted on a different server (on the internet).
I can use parallel programming like this (is this even called parallel programming!?):
foreach (var item in Clients)
{
    Tasks.Add(Task.Run(() =>
    {
        Result.AddRange(item.GetPoint(MasterLogId, mobileNumber));
    }));
}
I think parallel programming (and threading) isn't a good fit for this solution, because my problem is I/O-bound (not CPU-intensive)!
Calling every external web service is slow, am I right? Many threads would just sit pending, waiting for a response!
I think async programming is the best way, but I am new to async and parallel programming.
What is the best way? (Parallel.ForEach - async TAP - async APM - async EAP - threading)
Please write an example for me.
It's refreshing to see someone who has done their homework.
First things first, as of .NET 4 (and this is still very much the case today) TAP is the preferred technology for async workflow in .NET. Tasks are easily composable, and for you to parallelise your web service calls is a breeze if they provide true Task<T>-returning APIs. For now you have "faked" it with Task.Run, and for the time being this may very well suffice for your purposes. Sure, your thread pool threads will spend a lot of time blocking, but if the server load isn't very high you could very well get away with it even if it's not the ideal thing to do.
You just need to fix a potential race condition in your code (more on that towards the end).
If you want to follow the best practices though, you go with true TAP. If your APIs provide Task-returning methods out of the box, that's easy. If not, it's not game over as APM and EAP can easily be converted to TAP. MSDN reference: https://msdn.microsoft.com/en-us/library/hh873178(v=vs.110).aspx
I'll also include some conversion examples here.
APM (taken from another SO question):
MessageQueue does not provide a ReceiveAsync method, but we can get it to play ball via Task.Factory.FromAsync:
public static Task<Message> ReceiveAsync(this MessageQueue messageQueue)
{
    // Note: the End* delegate must match the Begin* call: EndReceive, not EndPeek.
    return Task.Factory.FromAsync(messageQueue.BeginReceive(), messageQueue.EndReceive);
}
...
Message message = await messageQueue.ReceiveAsync().ConfigureAwait(false);
If your web service proxies have BeginXXX/EndXXX methods, this is the way to go.
EAP
Assume you have an old web service proxy derived from SoapHttpClientProtocol, with only event-based async methods. You can convert them to TAP as follows:
// Must live in a static class, since it's an extension method.
public static Task<long> GetPointAsyncTask(this PointWebService webService, int nationalCode)
{
    TaskCompletionSource<long> tcs = new TaskCompletionSource<long>();

    webService.GetPointAsyncCompleted += (s, e) =>
    {
        if (e.Cancelled)
        {
            tcs.SetCanceled();
        }
        else if (e.Error != null)
        {
            tcs.SetException(e.Error);
        }
        else
        {
            tcs.SetResult(e.Result);
        }
    };

    webService.GetPointAsync(nationalCode);
    return tcs.Task;
}
...
using (PointWebService service = new PointWebService())
{
    long point = await service.GetPointAsyncTask(123).ConfigureAwait(false);
}
Avoiding races when aggregating results
With regards to aggregating parallel results, your TAP loop code is almost right, but you need to avoid mutating shared state inside your Task bodies as they will likely execute in parallel. Shared state being Result in your case - which is some kind of collection. If this collection is not thread-safe (i.e. if it's a simple List<long>), then you have a race condition and you may get exceptions and/or dropped results on Add (I'm assuming AddRange in your code was a typo, but if not - the above still applies).
A simple async-friendly rewrite that fixes your race would look like this:
List<Task<long>> tasks = new List<Task<long>>();
foreach (BaseClass item in Clients) {
    tasks.Add(item.GetPointAsync(MasterLogId, mobileNumber));
}
long[] results = await Task.WhenAll(tasks).ConfigureAwait(false);
If you decide to be lazy and stick with the Task.Run solution for now, the corrected version will look like this:
List<Task<long>> tasks = new List<Task<long>>();
foreach (BaseClass item in Clients)
{
    Task<long> dodgyThreadPoolTask = Task.Run(
        () => item.GetPoint(MasterLogId, mobileNumber)
    );
    tasks.Add(dodgyThreadPoolTask);
}
long[] results = await Task.WhenAll(tasks).ConfigureAwait(false);
You can create an async version of the GetPoint:
public abstract class BaseClass
{   // all same attributes and methods
    public abstract long GetPoint(int nationalCode);

    public Task<long> GetPointAsync(int nationalCode)
    {
        // GetPoint is synchronous and returns a plain long, which cannot be awaited;
        // offload it to the thread pool instead.
        return Task.Run(() => GetPoint(nationalCode));
    }
}
Then, collect the tasks for each client call. After that, execute all the tasks using Task.WhenAll. This will execute them all in parallel. Also, as pointed out by Kirill, you can await the results of the tasks:
var tasks = Clients.Select(x => x.GetPointAsync(nationalCode));
long[] results = await Task.WhenAll(tasks);
If you do not want to make the aggregating method async, you can collect the results by calling .Result instead of awaiting, like so:
long[] results = Task.WhenAll(tasks).Result;
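Putting the pieces together, a hedged sketch of the aggregation (using the GetPointAsync defined above; CollectPointsAsync is an illustrative name, not from the original code, and your host must support Task-returning methods):

using System.Linq;

public async Task<long> CollectPointsAsync(int nationalCode)
{
    BaseClass[] clients = { new Company1(), /* ... */ new CompanyN() };

    // Start all service calls at once; Task.WhenAll completes when every one has answered.
    var tasks = clients.Select(c => c.GetPointAsync(nationalCode));
    long[] results = await Task.WhenAll(tasks);

    return results.Sum(); // the aggregate, e.g. the sum of points
}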

Use a Task to avoid multiple calls to expensive operation and to cache its result

I have an async method that fetches some data from a database. This operation is fairly expensive, and takes a long time to complete. As a result, I'd like to cache the method's return value. However, it's possible that the async method will be called multiple times before its initial execution has a chance to return and save its result to the cache, resulting in multiple calls to this expensive operation.
To avoid this, I'm currently reusing a Task, like so:
public class DataAccess
{
    private Task<MyData> _getDataTask;

    public async Task<MyData> GetDataAsync()
    {
        if (_getDataTask == null)
        {
            _getDataTask = Task.Run(() => synchronousDataAccessMethod());
        }
        return await _getDataTask;
    }
}
My thought is that the initial call to GetDataAsync will kick off the synchronousDataAccessMethod method in a Task, and any subsequent calls to this method before the Task has completed will simply await the already running Task, automatically avoiding calling synchronousDataAccessMethod more than once. Calls made to GetDataAsync after the private Task has completed will cause the Task to be awaited, which will immediately return the data from its initial execution.
This seems to be working, but I'm having some strange performance issues that I suspect may be tied to this approach. Specifically, awaiting _getDataTask after it has completed takes several seconds (and locks the UI thread), even though the synchronousDataAccessMethod call is not called.
Am I misusing async/await? Is there a hidden gotcha that I'm not seeing? Is there a better way to accomplish the desired behavior?
EDIT
Here's how I call this method:
var result = (await myDataAccessObject.GetDataAsync()).ToList();
Maybe it has something to do with the fact that the result is not immediately enumerated?
If you want to await it further up the call stack, I think you want this:
public class DataAccess
{
    private Task<MyData> _getDataTask;
    private readonly object lockObj = new object();

    public async Task<MyData> GetDataAsync()
    {
        lock (lockObj)
        {
            if (_getDataTask == null)
            {
                _getDataTask = Task.Run(() => synchronousDataAccessMethod());
            }
        }
        return await _getDataTask;
    }
}
Your original code has the potential for this happening:
Thread 1 sees that _getDataTask == null, and begins constructing the task
Thread 2 sees that _getDataTask == null, and begins constructing the task
Thread 1 finishes constructing the task, which starts, and Thread 1 waits on that task
Thread 2 finishes constructing a task, which starts, and Thread 2 waits on that task
You end up with two instances of the task running.
Use a lock to prevent multiple calls from reaching the database query section. A lock makes it thread-safe, so that once the result has been cached, all the other calls will use it instead of going to the database again.
lock (StaticObject) // Create a static object so there is only one lock for this routine
{
    if (_getDataTask == null)
    {
        // Get data code here
    }
    return _getDataTask;
}
Please rewrite your function as:
public Task<MyData> GetDataAsync()
{
    if (_getDataTask == null)
    {
        _getDataTask = Task.Run(() => synchronousDataAccessMethod());
    }
    return _getDataTask;
}
This should not change at all the things that can be done with this function - you can still await on the returned task!
Please tell me if that changes anything.
A bit late to answer this, but there is an open-source library called LazyCache that will do this for you in two lines of code, and it was recently updated to handle caching Tasks for just this sort of situation. It is also available on NuGet.
Example:
Func<Task<List<MyData>>> cacheableAsyncFunc = () => myDataAccessObject.GetDataAsync();

var cachedData = await cache.GetOrAddAsync("myDataAccessObject.GetData", cacheableAsyncFunc);
return cachedData;

// Or instead just do it all in one line if you prefer
// return await cache.GetOrAddAsync("myDataAccessObject.GetData", myDataAccessObject.GetDataAsync);
It has built-in locking by default, so the cacheable method will only execute once per cache miss, and it uses a lambda so you can do "get or add" in one go. It defaults to a 20-minute sliding expiration, but you can set whatever caching policy you like on it.
More info on caching tasks is in the api docs and you may find the sample app to demo caching tasks useful.
(Disclaimer: I am the author of LazyCache)

Calling async methods from a synchronous context

I'm calling a service over HTTP (ultimately using the HttpClient.SendAsync method) from within my code. This code is then called from a WebAPI controller action. Mostly it works fine (tests pass), but when I deploy to, say, IIS, I experience a deadlock, because the caller of the async method has been blocked and the continuation cannot proceed on that thread until it finishes (which it won't).
While I could make most of my methods async, I don't feel as if I have a basic understanding of when I must do this.
For example, let's say I did make most of my methods async (since they ultimately call other async service methods). How would I then invoke the first async method of my program if I built, say, a message loop where I want some control over the degree of parallelism?
Since HttpClient doesn't have any synchronous methods, what can I safely do if I have an abstraction that isn't async-aware? I've read about ConfigureAwait(false), but I don't really understand what it does. It's strange to me that it's set after the async invocation. To me that feels like a race waiting to happen... however unlikely...
WebAPI example:
public HttpResponseMessage Get()
{
    var userContext = contextService.GetUserContext(); // <-- synchronous
    return ...
}

// Some IUserContextService implementation
public IUserContext GetUserContext()
{
    var httpClient = new HttpClient();
    var result = httpClient.GetAsync(...).Result; // <-- I really don't care if this is asynchronous or not
    return new HttpUserContext(result);
}
Message loop example:
var mq = new MessageQueue();

// we then run say 8 tasks that do this
for (;;)
{
    var m = mq.Get();
    var c = GetCommand(m);
    c.InvokeAsync().Wait();
    m.Delete();
}
When you have a message loop that allows things to happen in parallel and you have asynchronous methods, there's an opportunity to minimize latency. Basically, what I want to accomplish in this instance is to minimize latency and idle time. Though I'm actually unsure how to invoke the command that's associated with the message that arrives off the queue.
To be more specific: if the command invocation needs to make service requests, there's going to be latency in that invocation which could be used to get the next message, and so on. I can totally do this by wrapping things up in queues and coordinating it myself, but I'd like to see this work with just some async/await stuff.
While I could make most of my methods async I don't feel as if I have a basic understanding of when I must do this.
Start at the lowest level. It sounds like you've already got a start, but if you're looking for more at the lowest level, then the rule of thumb is anything I/O-based should be made async (e.g., HttpClient).
Then it's a matter of repeating the async infection. You want to use async methods, so you call them with await. So that method must be async. So all of its callers must use await, so they must also be async, etc.
how would I then invoke the first async method of my program if I built say a message loop where I want some control of the degree of parallelism?
It's easiest to put the framework in charge of this. E.g., you can just return a Task<T> from a WebAPI action, and the framework understands that. Similarly, UI applications have a message loop built-in that async will work naturally with.
If you have a situation where the framework doesn't understand Task or have a built-in message loop (usually a Console application or a Win32 service), you can use the AsyncContext type in my AsyncEx library. AsyncContext just installs a "main loop" (that is compatible with async) onto the current thread.
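For illustration, a console entry point using AsyncEx might look like this (a sketch; MainAsync is whatever your real async root happens to be):

using Nito.AsyncEx;

class Program
{
    static void Main(string[] args)
    {
        // AsyncContext.Run pumps a single-threaded "main loop" until MainAsync completes.
        AsyncContext.Run(() => MainAsync(args));
    }

    static async Task MainAsync(string[] args)
    {
        // ... your actual async code ...
    }
}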
Since the HttpClient doesn't have any synchronous methods, what can I safely presume to do if I have an abstraction that isn't async aware?
The correct approach is to change the abstraction. Do not attempt to block on asynchronous code; I describe that common deadlock scenario in detail on my blog.
You change the abstraction by making it async-friendly. For example, change IUserContext IUserContextService.GetUserContext() to Task<IUserContext> IUserContextService.GetUserContextAsync().
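Sketching that change against the question's example (names carried over from the question; the endpoint URI and controller body are hypothetical placeholders):

public interface IUserContextService
{
    Task<IUserContext> GetUserContextAsync();
}

public class UserContextService : IUserContextService
{
    private static readonly HttpClient httpClient = new HttpClient();
    private const string requestUri = "https://example.com/user-context"; // hypothetical endpoint

    public async Task<IUserContext> GetUserContextAsync()
    {
        var result = await httpClient.GetAsync(requestUri).ConfigureAwait(false);
        return new HttpUserContext(result);
    }
}

// The controller then awaits instead of blocking:
public async Task<HttpResponseMessage> Get()
{
    var userContext = await contextService.GetUserContextAsync(); // no .Result, no deadlock
    return Request.CreateResponse(HttpStatusCode.OK, userContext);
}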
I've read about the ConfigureAwait(false) but I don't really understand what it does. It's strange to me that it's set after the async invocation.
You may find my async intro helpful. I won't say much more about ConfigureAwait in this answer because I think it's not directly applicable to a good solution for this question (but I'm not saying it's bad; it actually should be used unless you can't use it).
Just bear in mind that await is an operator with precedence rules and all that. It feels magical at first, but it's really not so much. This code:
var result = await httpClient.GetAsync(url).ConfigureAwait(false);
is exactly the same as this code:
var asyncOperation = httpClient.GetAsync(url).ConfigureAwait(false);
var result = await asyncOperation;
There are usually no race conditions in async code because - even though the method is asynchronous - it is also sequential. The method can be paused at an await, and it will not be resumed until that await completes.
When you have a message loop that allow things to happen in parallel and you have asynchronous methods, there's a opportunity to minimize latency.
This is the second time you've mentioned a "message loop" "in parallel", but I think what you actually want is to have multiple (asynchronous) consumers working off the same queue, correct? That's easy enough to do with async (note that there is just a single message loop on a single thread in this example; when everything is async, that's usually all you need):
await Task.WhenAll(ConsumerAsync(), ConsumerAsync(), ConsumerAsync());
async Task ConsumerAsync()
{
    for (;;) // TODO: consider a CancellationToken for orderly shutdown
    {
        var m = await mq.ReceiveAsync();
        var c = GetCommand(m);
        await c.InvokeAsync();
        m.Delete();
    }
}

// Extension method
public static Task<Message> ReceiveAsync(this MessageQueue mq)
{
    return Task<Message>.Factory.FromAsync(mq.BeginReceive, mq.EndReceive, null);
}
You'd probably also be interested in TPL Dataflow. Dataflow is a library that understands and works well with async code, and has nice parallel options built-in.
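For instance, a hedged sketch of the same consumer built on an ActionBlock (the GetCommand/InvokeAsync names and the mq.ReceiveAsync extension are carried over from above):

using System.Threading.Tasks.Dataflow;

// Process up to 8 messages concurrently; the block manages the internal queueing.
var worker = new ActionBlock<Message>(
    async m =>
    {
        var c = GetCommand(m);
        await c.InvokeAsync();
        m.Delete();
    },
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 8 });

// Producer side: pump messages into the block as they arrive.
while (true)
{
    worker.Post(await mq.ReceiveAsync());
}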
While I appreciate the insight from community members, it's always difficult to express the intent of what I'm trying to do, but it's tremendously helpful to get advice about the circumstances surrounding the problem. With that, I eventually arrived at the following code.
public class AsyncOperatingContext
{
    struct Continuation
    {
        private readonly SendOrPostCallback d;
        private readonly object state;

        public Continuation(SendOrPostCallback d, object state)
        {
            this.d = d;
            this.state = state;
        }

        public void Run()
        {
            d(state);
        }
    }

    class BlockingSynchronizationContext : SynchronizationContext
    {
        readonly BlockingCollection<Continuation> _workQueue;

        public BlockingSynchronizationContext(BlockingCollection<Continuation> workQueue)
        {
            _workQueue = workQueue;
        }

        public override void Post(SendOrPostCallback d, object state)
        {
            _workQueue.TryAdd(new Continuation(d, state));
        }
    }

    /// <summary>
    /// Gets the recommended max degree of parallelism. (Your main program message loop could use this value.)
    /// </summary>
    public static int MaxDegreeOfParallelism { get { return Environment.ProcessorCount; } }

    #region Helper methods

    /// <summary>
    /// Run an async task. This method will block execution (and use the calling thread as a worker thread) until the async task has completed.
    /// </summary>
    public static T Run<T>(Func<Task<T>> main, int degreeOfParallelism = 1)
    {
        var asyncOperatingContext = new AsyncOperatingContext();
        asyncOperatingContext.DegreeOfParallelism = degreeOfParallelism;
        return asyncOperatingContext.RunMain(main);
    }

    /// <summary>
    /// Run an async task. This method will block execution (and use the calling thread as a worker thread) until the async task has completed.
    /// </summary>
    public static void Run(Func<Task> main, int degreeOfParallelism = 1)
    {
        var asyncOperatingContext = new AsyncOperatingContext();
        asyncOperatingContext.DegreeOfParallelism = degreeOfParallelism;
        asyncOperatingContext.RunMain(main);
    }

    #endregion

    private readonly BlockingCollection<Continuation> _workQueue;

    public int DegreeOfParallelism { get; set; }

    public AsyncOperatingContext()
    {
        _workQueue = new BlockingCollection<Continuation>();
    }

    /// <summary>
    /// Initialize the current thread's SynchronizationContext so that work is scheduled to run through this AsyncOperatingContext.
    /// </summary>
    protected void InitializeSynchronizationContext()
    {
        SynchronizationContext.SetSynchronizationContext(new BlockingSynchronizationContext(_workQueue));
    }

    protected void RunMessageLoop()
    {
        while (!_workQueue.IsCompleted)
        {
            Continuation continuation;
            if (_workQueue.TryTake(out continuation, Timeout.Infinite))
            {
                continuation.Run();
            }
        }
    }

    protected T RunMain<T>(Func<Task<T>> main)
    {
        var degreeOfParallelism = DegreeOfParallelism;
        if (!((1 <= degreeOfParallelism) & (degreeOfParallelism <= 5000))) // sanity check
        {
            // Note: paramName comes first, then the message.
            throw new ArgumentOutOfRangeException("DegreeOfParallelism", "DegreeOfParallelism must be between 1 and 5000.");
        }

        var currentSynchronizationContext = SynchronizationContext.Current;
        InitializeSynchronizationContext(); // must set SynchronizationContext before main() task is scheduled

        var mainTask = main(); // schedule "main" task
        mainTask.ContinueWith(task => _workQueue.CompleteAdding());

        // for single threading we don't need worker threads so we don't use any
        // otherwise (for increased parallelism) we simply launch X worker threads
        if (degreeOfParallelism > 1)
        {
            for (int i = 1; i < degreeOfParallelism; i++)
            {
                ThreadPool.QueueUserWorkItem(_ => {
                    // do we really need to restore the SynchronizationContext here as well?
                    InitializeSynchronizationContext();
                    RunMessageLoop();
                });
            }
        }

        RunMessageLoop();

        SynchronizationContext.SetSynchronizationContext(currentSynchronizationContext); // restore
        return mainTask.Result;
    }

    protected void RunMain(Func<Task> main)
    {
        // The return value doesn't matter here
        RunMain(async () => { await main(); return 0; });
    }
}
This class is complete, and it does a couple of things that I found difficult to grasp.
As general advice, you should allow the TAP (task-based asynchronous pattern) to propagate through your code. This may imply quite a bit of refactoring (or redesign). Ideally you should be able to break this up into pieces and make progress as you work towards the overall goal of making your program more asynchronous.
Something that's inherently dangerous is to callously call asynchronous code in a synchronous fashion, i.e. by invoking the Wait or Result methods. These can lead to deadlocks. One way to work around that is to use the AsyncOperatingContext.Run method. It will use the current thread to run a message loop until the asynchronous call is complete. It will temporarily swap out whatever SynchronizationContext is associated with the current thread to do so.
Note: I don't know if this is enough, or if you are allowed to swap back the SynchronizationContext this way, assuming that you can, this should work. I've already been bitten by the ASP.NET deadlock issue and this could possibly function as a workaround.
Lastly, I found myself asking the question, what is the corresponding equivalent of Main(string[]) in an async context? Turns out that's the message loop.
What I've found is that there are two things that make up this async machinery: SynchronizationContext.Post and the message loop. In my AsyncOperatingContext I provide a very simple message loop:
protected void RunMessageLoop()
{
    while (!_workQueue.IsCompleted)
    {
        Continuation continuation;
        if (_workQueue.TryTake(out continuation, Timeout.Infinite))
        {
            continuation.Run();
        }
    }
}
My SynchronizationContext.Post thus becomes:
public override void Post(SendOrPostCallback d, object state)
{
    _workQueue.TryAdd(new Continuation(d, state));
}
And our entry point, basically the equivalent of an async main from synchronous context (simplified version from original source):
SynchronizationContext.SetSynchronizationContext(new BlockingSynchronizationContext(_workQueue));
var mainTask = main(); // schedule "main" task
mainTask.ContinueWith(task => _workQueue.CompleteAdding());
RunMessageLoop();
return mainTask.Result;
All of this is costly, and we can't just go and replace calls to async methods with this, but it does allow us to quickly create the facilities required to keep writing async code where needed without having to rework the whole program. It's also very clear from this implementation where the worker threads go and how they impact the concurrency of your program.
I look at this and think to myself, yep, that's how Node.js does it. Though JavaScript does not have the nice async/await language support that C# currently does.
As an added bonus, I have complete control of the degree of parallelism, and if I want, I can run my async tasks completely single-threaded. Though if I do so and call Wait or Result on any task, it will deadlock the program, because it will block the only message loop available.
