Problems doing async operations in C# using Mutex

I've tried this MANY ways, here is the current iteration. I think I've just implemented this all wrong. What I'm trying to accomplish is to treat this async result in such a way that until it returns AND I finish my add-thumbnail call, I will not request another call to imageProvider.BeginGetImage.
To clarify, my question is two-fold: why does my code never seem to halt at the Mutex.WaitOne() call, and what is the proper way to handle this scenario?
/// <summary>
/// re-creates a list of thumbnails from a list of TreeElementViewModels (directories)
/// </summary>
/// <param name="list">the list of TreeElementViewModels to process</param>
public void BeginLayout(List<AiTreeElementViewModel> list)
{
// *removed code for canceling and cleanup from previous calls*
// Starts the processing of all folders in parallel.
Task.Factory.StartNew(() =>
{
thumbnailRequests = Parallel.ForEach<AiTreeElementViewModel>(list, options, ProcessFolder);
});
}
/// <summary>
/// Processes a folder for all of its image paths and loads them from disk.
/// </summary>
/// <param name="element">the tree element to process</param>
private void ProcessFolder(AiTreeElementViewModel element)
{
try
{
var images = ImageCrawler.GetImagePaths(element.Path);
AsyncCallback callback = AddThumbnail;
foreach (var image in images)
{
Console.WriteLine("Attempting Enter");
synchMutex.WaitOne();
Console.WriteLine("Entered");
var result = imageProvider.BeginGetImage(callback, image);
}
}
catch (Exception exc)
{
Console.WriteLine(exc.ToString());
// TODO: Do Something here.
}
}
/// <summary>
/// Adds a thumbnail to the Browser
/// </summary>
/// <param name="result">an async result used for retrieving state data from the load task.</param>
private void AddThumbnail(IAsyncResult result)
{
lock (Thumbnails)
{
try
{
Stream image = imageProvider.EndGetImage(result);
string filename = imageProvider.GetImageName(result);
string imagePath = imageProvider.GetImagePath(result);
var imageviewmodel = new AiImageThumbnailViewModel(image, filename, imagePath);
thumbnailHash[imagePath] = imageviewmodel;
HostInvoke(() => Thumbnails.Add(imageviewmodel));
UpdateChildZoom();
//synchMutex.ReleaseMutex();
Console.WriteLine("Exited");
}
catch (Exception exc)
{
Console.WriteLine(exc.ToString());
// TODO: Do Something here.
}
}
}

To start with,
you create a Task to run a Parallel.ForEach, which runs a method that invokes a delegate:
three levels of parallelism where one would be enough.
And if I read this right, inside the delegate you want to use a Mutex to run only one instance at a time. A Mutex won't give you that here: it is reentrant and has thread affinity, so the Parallel.ForEach worker thread that already owns it sails straight through its later WaitOne() calls, and the ThreadPool thread that runs the AddThumbnail callback is not allowed to release it (presumably why the ReleaseMutex call is commented out).
Could you indicate which actions you want to happen in parallel?
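As for the proper way: a minimal sketch, assuming the names from the question (imageProvider, ImageCrawler, AddThumbnail), that swaps the Mutex for a SemaphoreSlim. A SemaphoreSlim has no thread affinity, so the completion callback is allowed to release what the request loop acquired, giving you exactly one outstanding BeginGetImage at a time:
// Gate allowing exactly one outstanding request; initial and max count of 1.
private static readonly SemaphoreSlim requestGate = new SemaphoreSlim(1, 1);

private void ProcessFolder(AiTreeElementViewModel element)
{
    foreach (var image in ImageCrawler.GetImagePaths(element.Path))
    {
        requestGate.Wait(); // blocks until the previous thumbnail is fully added
        imageProvider.BeginGetImage(AddThumbnail, image);
    }
}

private void AddThumbnail(IAsyncResult result)
{
    try
    {
        // ... EndGetImage, build the view model, add the thumbnail (as in the question) ...
    }
    finally
    {
        requestGate.Release(); // lets the next BeginGetImage go out
    }
}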

Related

RateGate class to Polly policy

I'm trying to replace the RateGate logic with a Polly policy. However, there is no status code or anything, and I'm not sure if it's possible to achieve the same idea with Polly.
Usage
// Binance allows 5 messages per second, but we still get rate limited if we send a lot of messages at that rate
// By sending 3 messages per second, evenly spaced out, we can keep sending messages without being limited
private readonly RateGate _webSocketRateLimiter = new RateGate(1, TimeSpan.FromMilliseconds(330));
private void Send(IWebSocket webSocket, object obj)
{
var json = JsonConvert.SerializeObject(obj);
_webSocketRateLimiter.WaitToProceed();
Log.Trace("Send: " + json);
webSocket.Send(json);
}
RateGate class
public class RateGate : IDisposable
{
// Semaphore used to count and limit the number of occurrences per
// unit time.
private readonly SemaphoreSlim _semaphore;
// Times (in millisecond ticks) at which the semaphore should be exited.
private readonly ConcurrentQueue<int> _exitTimes;
// Timer used to trigger exiting the semaphore.
private readonly Timer _exitTimer;
// Whether this instance is disposed.
private bool _isDisposed;
/// <summary>
/// Number of occurrences allowed per unit of time.
/// </summary>
public int Occurrences
{
get; private set;
}
/// <summary>
/// The length of the time unit, in milliseconds.
/// </summary>
public int TimeUnitMilliseconds
{
get; private set;
}
/// <summary>
/// Flag indicating we are currently being rate limited
/// </summary>
public bool IsRateLimited
{
get { return !WaitToProceed(0); }
}
/// <summary>
/// Initializes a <see cref="RateGate"/> with a rate of <paramref name="occurrences"/>
/// per <paramref name="timeUnit"/>.
/// </summary>
/// <param name="occurrences">Number of occurrences allowed per unit of time.</param>
/// <param name="timeUnit">Length of the time unit.</param>
/// <exception cref="ArgumentOutOfRangeException">
/// If <paramref name="occurrences"/> or <paramref name="timeUnit"/> is negative.
/// </exception>
public RateGate(int occurrences, TimeSpan timeUnit)
{
// Check the arguments.
if (occurrences <= 0)
throw new ArgumentOutOfRangeException(nameof(occurrences), "Number of occurrences must be a positive integer");
if (timeUnit != timeUnit.Duration())
throw new ArgumentOutOfRangeException(nameof(timeUnit), "Time unit must be a positive span of time");
if (timeUnit >= TimeSpan.FromMilliseconds(UInt32.MaxValue))
throw new ArgumentOutOfRangeException(nameof(timeUnit), "Time unit must be less than 2^32 milliseconds");
Occurrences = occurrences;
TimeUnitMilliseconds = (int)timeUnit.TotalMilliseconds;
// Create the semaphore, with the number of occurrences as the maximum count.
_semaphore = new SemaphoreSlim(Occurrences, Occurrences);
// Create a queue to hold the semaphore exit times.
_exitTimes = new ConcurrentQueue<int>();
// Create a timer to exit the semaphore. Use the time unit as the original
// interval length because that's the earliest we will need to exit the semaphore.
_exitTimer = new Timer(ExitTimerCallback, null, TimeUnitMilliseconds, -1);
}
// Callback for the exit timer that exits the semaphore based on exit times
// in the queue and then sets the timer for the next exit time.
// Credit to Jim: http://www.jackleitch.net/2010/10/better-rate-limiting-with-dot-net/#comment-3620
// for providing the code below, fixing issue #3499 - https://github.com/QuantConnect/Lean/issues/3499
private void ExitTimerCallback(object state)
{
try
{
// While there are exit times that are past due still in the queue,
// exit the semaphore and dequeue the exit time.
var exitTime = 0;
var exitTimeValid = _exitTimes.TryPeek(out exitTime);
while (exitTimeValid)
{
if (unchecked(exitTime - Environment.TickCount) > 0)
{
break;
}
_semaphore.Release();
_exitTimes.TryDequeue(out exitTime);
exitTimeValid = _exitTimes.TryPeek(out exitTime);
}
// we are already holding the next exit time from the queue, do not peek again
// although this exit time may have already passed by this statement.
var timeUntilNextCheck = exitTimeValid
? Math.Min(TimeUnitMilliseconds, Math.Max(0, exitTime - Environment.TickCount))
: TimeUnitMilliseconds;
_exitTimer.Change(timeUntilNextCheck, -1);
}
catch (Exception)
{
// can throw if called when disposing
}
}
/// <summary>
/// Blocks the current thread until allowed to proceed or until the
/// specified timeout elapses.
/// </summary>
/// <param name="millisecondsTimeout">Number of milliseconds to wait, or -1 to wait indefinitely.</param>
/// <returns>true if the thread is allowed to proceed, or false if timed out</returns>
public bool WaitToProceed(int millisecondsTimeout)
{
// Check the arguments.
if (millisecondsTimeout < -1)
throw new ArgumentOutOfRangeException(nameof(millisecondsTimeout));
CheckDisposed();
// Block until we can enter the semaphore or until the timeout expires.
var entered = _semaphore.Wait(millisecondsTimeout);
// If we entered the semaphore, compute the corresponding exit time
// and add it to the queue.
if (entered)
{
var timeToExit = unchecked(Environment.TickCount + TimeUnitMilliseconds);
_exitTimes.Enqueue(timeToExit);
}
return entered;
}
/// <summary>
/// Blocks the current thread until allowed to proceed or until the
/// specified timeout elapses.
/// </summary>
/// <param name="timeout"></param>
/// <returns>true if the thread is allowed to proceed, or false if timed out</returns>
public bool WaitToProceed(TimeSpan timeout)
{
return WaitToProceed((int)timeout.TotalMilliseconds);
}
/// <summary>
/// Blocks the current thread indefinitely until allowed to proceed.
/// </summary>
public void WaitToProceed()
{
WaitToProceed(Timeout.Infinite);
}
// Throws an ObjectDisposedException if this object is disposed.
private void CheckDisposed()
{
if (_isDisposed)
throw new ObjectDisposedException("RateGate is already disposed");
}
/// <summary>
/// Releases unmanaged resources held by an instance of this class.
/// </summary>
public void Dispose()
{
Dispose(true);
GC.SuppressFinalize(this);
}
/// <summary>
/// Releases unmanaged resources held by an instance of this class.
/// </summary>
/// <param name="isDisposing">Whether this object is being disposed.</param>
protected virtual void Dispose(bool isDisposing)
{
if (!_isDisposed)
{
if (isDisposing)
{
// The semaphore and timer both implement IDisposable and
// therefore must be disposed.
_semaphore.Dispose();
_exitTimer.Dispose();
_isDisposed = true;
}
}
}
}
GitHub source code
Rate gate
Disclaimer: I haven't used this component, so what I describe here is what I understand from the code.
It is an intrusive policy, which means it modifies/alters the execution/data flow in order to slow down a fast producer or smooth out bursts. It blocks the flow to avoid resource abuse.
Here you can specify the "sleep duration" between subsequent calls, which is enforced by the gate itself.
Polly's Rate limiter
This policy is designed to avoid resource abuse as well. That means if the consumer issues too many requests against the resource within a predefined time window, it simply short-circuits the execution by throwing a RateLimitRejectedException.
So, if you want to allow 20 executions per second
RateLimitPolicy rateLimiter = Policy
.RateLimit(20, TimeSpan.FromSeconds(1));
and you do not want to exceed the limit, you have to wait by yourself:
rateLimiter.Execute(() =>
{
//Your Action delegate which runs <1ms
Thread.Sleep(50);
});
So, the executions should be distributed evenly during the allowed period. If your manually injected delay is shorter, say 10 ms, then it will throw an exception.
Conclusion
According to my understanding, both work like proxy objects. They sit between the consumer and the producer to control the consumption rate.
The rate gate does that by injecting artificial delays, whereas the rate limiter short-circuits the execution if abuse is detected.
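If you need RateGate's blocking behaviour on top of Polly, a minimal sketch (assuming Polly v7's RateLimit policy, whose RateLimitRejectedException carries a RetryAfter hint) is to catch the rejection and sleep until the limiter permits the next execution:
// Allow 3 sends per second, blocking the caller instead of failing,
// similar to RateGate.WaitToProceed().
var rateLimiter = Policy.RateLimit(3, TimeSpan.FromSeconds(1));

void SendThrottled(IWebSocket webSocket, object obj)
{
    var json = JsonConvert.SerializeObject(obj);
    while (true)
    {
        try
        {
            rateLimiter.Execute(() => webSocket.Send(json));
            return;
        }
        catch (RateLimitRejectedException ex)
        {
            Thread.Sleep(ex.RetryAfter); // wait until the limiter would allow it
        }
    }
}
Whether the retry loop is acceptable depends on your call sites; the point is only that the rejection carries enough information to reconstruct the gate's blocking semantics.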

TPL Dataflow SendAsync task never completes when block is linked

I was hoping for a clean solution for throttling a specific type of producer while the consumer is busy processing, without writing a custom block of my own. I had hoped the code below would do exactly that, but once SendAsync blocks after the capacity limit is hit, its task never completes, suggesting that the postponed message is never consumed.
_block = new TransformBlock<int, string>(async i =>
{
// Send the next request to own input queue
// before processing this request, or block
// while pipeline is full.
// Do not start processing if pipeline is full!
await _block.SendAsync(i + 1);
// Process this request and pass it on to the
// next block in the pipeline.
return i.ToString();
},
// TransformBlock block has input and output buffers. Limit both,
// otherwise requests that cannot be passed on to the next
// block in the pipeline will be cached in this block's output
// buffer, never throttling this block.
new ExecutionDataflowBlockOptions { BoundedCapacity = 5 });
// This block is linked to the output of the
// transform block.
var action = new ActionBlock<string>(async i =>
{
// Do some very long processing on the transformed element.
await Task.Delay(1000);
},
// Limit buffer size, and consequently throttle previous blocks
// in the pipeline.
new ExecutionDataflowBlockOptions { BoundedCapacity = 5 });
_block.LinkTo(action);
// Start running.
_block.Post(0);
I was wondering if there is any reason why the linked ActionBlock does not consume the postponed message.
I faced the same problem as you. I didn't dig deep into the implementation of LinkTo, but I think it propagates a message only when the source block receives one. That is, there may be a case where the source block has messages in its input queue, but it will not process them until the next Post/SendAsync is received. And that's your case.
Here is my solution, and it's working for me.
First, declare the "engine":
/// <summary>
/// Engine-class (like a car engine) that produces a set count (or an infinite number) of actions.
/// </summary>
public class Engine
{
private BufferBlock<int> _bufferBlock;
/// <summary>
/// Creates a source block that produces stub data.
/// </summary>
/// <param name="count">Count of actions. If count = 0 then it's infinite loop.</param>
/// <param name="boundedCapacity">Bounded capacity (throttling).</param>
/// <param name="cancellationToken">Cancellation token (used to stop infinite loop).</param>
/// <returns>Source block that constantly produces 0-values.</returns>
public ISourceBlock<int> CreateEngine(int count, int boundedCapacity, CancellationToken cancellationToken)
{
_bufferBlock = new BufferBlock<int>(new DataflowBlockOptions { BoundedCapacity = boundedCapacity });
Task.Run(async () =>
{
var counter = 0;
while (count == 0 || counter < count)
{
await _bufferBlock.SendAsync(0);
if (cancellationToken.IsCancellationRequested)
return;
counter++;
}
}, cancellationToken).ContinueWith((task) =>
{
_bufferBlock.Complete();
});
return _bufferBlock;
}
}
And then the Producer that uses the engine:
/// <summary>
/// Producer that generates random byte blobs with specified size.
/// </summary>
public class Producer
{
private static Random random = new Random();
/// <summary>
/// Returns a source block that produces byte arrays.
/// </summary>
/// <param name="blobSize">Size of byte arrays.</param>
/// <param name="count">Total count of blobs (if 0 then infinite).</param>
/// <param name="boundedCapacity">Bounded capacity (throttling).</param>
/// <param name="cancellationToken">Cancellation token (used to stop infinite loop).</param>
/// <returns>Source block.</returns>
public static ISourceBlock<byte[]> BlobsSourceBlock(int blobSize, int count, int boundedCapacity, CancellationToken cancellationToken)
{
// Creating engine with specified bounded capacity.
var engine = new Engine().CreateEngine(count, boundedCapacity, cancellationToken);
// Creating a transform block that uses our engine as a source.
var block = new TransformBlock<int, byte[]>(
// Useful work.
i => CreateBlob(blobSize),
new ExecutionDataflowBlockOptions
{
// Here you can specify your own throttling.
BoundedCapacity = boundedCapacity,
MaxDegreeOfParallelism = Environment.ProcessorCount,
});
// Linking engine (and engine is already working at that time).
engine.LinkTo(block, new DataflowLinkOptions { PropagateCompletion = true });
return block;
}
/// <summary>
/// Simple random byte[] generator.
/// </summary>
/// <param name="size">Array size.</param>
/// <returns>byte[]</returns>
private static byte[] CreateBlob(int size)
{
var buffer = new byte[size];
random.NextBytes(buffer);
return buffer;
}
}
Now you can use the producer with a consumer (e.g. an ActionBlock):
var blobsProducer = Producer.BlobsSourceBlock(1024 * 1024, 0, 10, cancellationTokenSource.Token);
var md5Hash = MD5.Create();
var actionBlock = new ActionBlock<byte[]>(b =>
{
Console.WriteLine(GetMd5Hash(md5Hash, b));
},
new ExecutionDataflowBlockOptions() { BoundedCapacity = 10 });
blobsProducer.LinkTo(actionBlock);
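The snippet assumes a GetMd5Hash helper that isn't shown above; a minimal version (using System.Security.Cryptography and System.Text) might look like this:
// Hex-encodes the MD5 hash of a byte array.
private static string GetMd5Hash(MD5 md5Hash, byte[] input)
{
    byte[] hash = md5Hash.ComputeHash(input);
    var sb = new StringBuilder(hash.Length * 2);
    foreach (byte b in hash)
        sb.Append(b.ToString("x2"));
    return sb.ToString();
}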
Hope it will help you!

Why can I never empty the BlockingCollection?

I am downloading some JSON periodically, say every 10 seconds... When the data arrives, an event is fired. The event handler simply adds the JSON to a BlockingCollection<string> (to be processed).
I'm trying to process the JSON as fast as possible (as soon as it arrives...):
public class Engine
{
private BlockingCollection<string> Queue = new BlockingCollection<string>();
private DataDownloader dataDownloader;
public void Init(string url, int interval)
{
dataDownloader = new DataDownloader(url, interval);
dataDownloader.DataReceivedEvent += DataArrived;
dataDownloader.StartCollecting();
//Kick off a new task to process the incoming JSON
Task.Factory.StartNew(Process, TaskCreationOptions.LongRunning);
}
/// <summary>
/// Processes the JSON in parallel
/// </summary>
private void Process()
{
Parallel.ForEach(Queue.GetConsumingEnumerable(), ProcessJson);
}
/// <summary>
/// Deserializes JSON and adds result to database
/// </summary>
/// <param name="json"></param>
private void ProcessJson(string json)
{
using (var db = new MyDataContext())
{
var items= Extensions.DeserializeData(json);
foreach (var item in items)
{
db.Items.Add(item);
db.SaveChanges();
}
}
}
private void DataArrived(object sender, string json)
{
Queue.Add(json);
Console.WriteLine("Queue length: " + Queue.Count);
}
}
When I run the program, it works and data gets added to the Database, but if I watch the console message from Console.WriteLine("Queue length: " + Queue.Count);, I get something like this:
1
1
1
1
1
1
1
1
2
3
4
5
6
7
...
I've tried modifying my Process to look like this:
/// <summary>
/// Processes the JSON in parallel
/// </summary>
private void Process()
{
foreach (var json in Queue.GetConsumingEnumerable())
{
ProcessJson(json);
}
}
I then add multiple Task.Factory.StartNew(Process, TaskCreationOptions.LongRunning); but I get the same problem...
Does anyone have any idea of what is going wrong here?
The queue will initially fill up before processing starts, probably because the Entity Framework stuff needs to be loaded and a database connection has to be established, which takes a while.
Then GetConsumingEnumerable() starts to catch up with the downloading process, depleting the queue while downloading continues. The collection is empty, MoveNext() returns false, Parallel.ForEach() exits and Process() finishes.
Then you'll see the queue starting to fill up, because it's no longer being consumed.
You need to keep trying to read from the BlockingCollection until the download process finishes.
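Concretely, that means a dedicated consumer should block on GetConsumingEnumerable() for the lifetime of the downloader, and the collection should only be completed once downloading is finished. A sketch reusing the question's names (StopCollecting is hypothetical, standing in for however the downloader is stopped):
// The consumer loop blocks while the queue is empty instead of exiting;
// GetConsumingEnumerable() only ends after CompleteAdding() is called.
private void Process()
{
    foreach (var json in Queue.GetConsumingEnumerable())
    {
        ProcessJson(json);
    }
}

public void Shutdown()
{
    dataDownloader.StopCollecting(); // hypothetical: stop raising DataReceivedEvent
    Queue.CompleteAdding();          // lets the foreach in Process() finish cleanly
}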

.NET Throttle algorithm

I would like to implement a good throttle algorithm in .NET (C# or VB), but I cannot figure out how to do it.
The case is that my ASP.NET website should post requests to another website to fetch results.
At most 300 requests per minute should be sent.
If the requests exceed the 300 limit, the other party's API returns nothing (which is something I would not like to use as a check in my code).
P.S. I have seen solutions in languages other than .NET, but I am a newbie, so please be kind and keep your answers as simple as 123.
Thank you
You could have a simple application (or session) class and check that for hits. This is something extremely rough just to give you the idea:
public class APIHits {
public int hits { get; private set; }
private DateTime minute = DateTime.Now;
public bool AddHit()
{
if (hits < 300) {
hits++;
return true;
}
else
{
if (DateTime.Now > minute.AddSeconds(60))
{
//60 seconds later
minute = DateTime.Now;
hits = 1;
return true;
}
else
{
return false;
}
}
}
}
The simplest approach is just to time how long it is between packets and not allow them to be sent at a rate of more than one every 0.2 seconds. That is, record the time when you are called and, when you are next called, check that at least 200 ms has elapsed, or return nothing.
This approach will work, but it will only work for smooth packet flows - if you expect bursts of activity then you may want to allow 5 messages in any 200 ms period, as long as the average over 1 minute is no more than 300 calls. In this case, you could use an array of values to store the "timestamps" of the last 300 packets; then each time you receive a call you can look back to "300 calls ago" and check that at least 1 minute has elapsed (a sketch of this follows below).
For both of these schemes, the time values returned by Environment.TickCount would be adequate for your needs (spans of not less than 200 milliseconds), as it's accurate to about 15 ms.
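Here is a minimal sketch of that second scheme (all names assumed; rough, not production-hardened): a circular buffer of the last 300 send times, where each call waits until a full minute has passed since the call made 300 calls ago.
// Sliding-window throttle: at most 300 calls in any rolling 60-second window.
class SlidingWindowThrottle
{
    private readonly int[] _tickTimes = new int[300]; // TickCount of the last 300 calls
    private int _index;
    private readonly object _gate = new object();

    public void WaitToProceed()
    {
        lock (_gate)
        {
            // Time since the call 300 calls ago; unchecked handles TickCount wrap-around.
            int elapsed = unchecked(Environment.TickCount - _tickTimes[_index]);
            if (elapsed >= 0 && elapsed < 60000)
                Thread.Sleep(60000 - elapsed);
            _tickTimes[_index] = Environment.TickCount;
            _index = (_index + 1) % _tickTimes.Length;
        }
    }
}
Note that the lock intentionally serializes callers, so waiting threads are released in arrival order.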
Here's an async and a sync implementation of a throttle that can limit the number of calls to a method per duration of time. It's based on a simple comparison of the current time to a DateTimeOffset plus Task.Delay/Thread.Sleep. It should work fine for many uses that don't need a high degree of time resolution, and it should be called BEFORE the methods that you want to throttle.
This solution allows the user to specify the number of calls that are allowed per duration (with the default being 1 call per time period). That lets your throttle be as "burstable" as you need, at the cost of no control over when the callers continue, or calls can be as evenly spaced as possible.
Let's say the target is 300 calls/min: you could have a regular throttle with a duration of 200 ms that will evenly space out every call with at least 200 ms in between, or you could create a throttle that allows 5 calls every second with no regard to their spacing (first 5 calls win - might be all at once!). Both keep the rate under 300 calls/min, but the former is on the extreme end of evenly separated and the latter is more "bursty". Having things evenly spread out is nice when processing items in a loop, but may not be so good for things running in parallel (like web requests) where the call times are unpredictable and unnecessary delays might actually slow down throughput. Again, your use case and testing will have to be your guide on which is best.
This class is thread-safe and you'll need to keep a reference to an instance of it somewhere that is accessible to the object instances that need to share it. For an ASP.NET web application that would be a field on the application instance, could be a static field on a web page/controller, injected from the DI container of your choice as a singleton, or any other way you could access the shared instance in your particular scenario.
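For example, with Microsoft.Extensions.DependencyInjection (assumed here; any container works the same way), the shared instance could be registered once at startup:
// One Throttle for the whole application; every consumer that
// resolves Throttle receives this same shared instance.
services.AddSingleton(new Throttle(5, TimeSpan.FromSeconds(1)));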
EDIT: Updated to ensure the delay is never longer than the duration.
public class Throttle
{
/// <summary>
/// The maximum time to delay access.
/// </summary>
private readonly TimeSpan _duration;
/// <summary>
/// The next time to run.
/// </summary>
private DateTimeOffset _next = DateTimeOffset.MinValue;
/// <summary>
/// Synchronize access to the throttle gate.
/// </summary>
private readonly SemaphoreSlim _mutex = new SemaphoreSlim(1, 1);
/// <summary>
/// Number of allowed callers per time window.
/// </summary>
private readonly int _numAllowed = 1;
/// <summary>
/// The number of calls in the current time window.
/// </summary>
private int _count;
/// <summary>
/// The amount of time per window.
/// </summary>
public TimeSpan Duration => _duration;
/// <summary>
/// The number of calls per time period.
/// </summary>
public int Size => _numAllowed;
/// <summary>
/// Creates a Throttle that will allow one caller per duration.
/// </summary>
/// <param name="duration">The amount of time that must pass between calls.</param>
public Throttle(TimeSpan duration)
{
if (duration.Ticks <= 0)
throw new ArgumentOutOfRangeException(nameof(duration));
_duration = duration;
}
/// <summary>
/// Creates a Throttle that will allow the given number of callers per time period.
/// </summary>
/// <param name="num">The number of calls to allow per time period.</param>
/// <param name="per">The duration of the time period.</param>
public Throttle(int num, TimeSpan per)
{
if (num <= 0 || per.Ticks <= 0)
throw new ArgumentOutOfRangeException();
_numAllowed = num;
_duration = per;
}
/// <summary>
/// Returns a task that will complete when the caller may continue.
/// </summary>
/// <remarks>This method can be used to synchronize access to a resource at regular intervals
/// with no more frequency than specified by the duration,
/// and should be called BEFORE accessing the resource.</remarks>
/// <param name="cancellationToken">A cancellation token that may be used to abort the stop operation.</param>
/// <returns>The number of actors that have been allowed within the current time window.</returns>
public async Task<int> WaitAsync(CancellationToken cancellationToken = default(CancellationToken))
{
await _mutex.WaitAsync(cancellationToken)
.ConfigureAwait(false);
try
{
var delay = _next - DateTimeOffset.UtcNow;
// ensure delay is never longer than the duration
if (delay > _duration)
delay = _duration;
// continue immediately based on count
if (_count < _numAllowed)
{
_count++;
if (delay.Ticks <= 0) // past time window, reset
{
_next = DateTimeOffset.UtcNow.Add(_duration);
_count = 1;
}
return _count;
}
// over the allowed count within the window
if (delay.Ticks > 0)
{
// delay until the next window
await Task.Delay(delay, cancellationToken)
.ConfigureAwait(false);
}
_next = DateTimeOffset.UtcNow.Add(_duration);
_count = 1;
return _count;
}
finally
{
_mutex.Release();
}
}
/// <summary>
/// Returns a task that will complete when the caller may continue.
/// </summary>
/// <remarks>This method can be used to synchronize access to a resource at regular intervals
/// with no more frequency than specified by the duration,
/// and should be called BEFORE accessing the resource.</remarks>
/// <param name="cancellationToken">A cancellation token that may be used to abort the stop operation.</param>
/// <returns>The number of actors that have been allowed within the current time window.</returns>
public int Wait(CancellationToken cancellationToken = default(CancellationToken))
{
_mutex.Wait(cancellationToken);
try
{
var delay = _next - DateTimeOffset.UtcNow;
// ensure delay is never larger than the duration.
if (delay > _duration)
delay = _duration;
// continue immediately based on count
if (_count < _numAllowed)
{
_count++;
if (delay.Ticks <= 0) // past time window, reset
{
_next = DateTimeOffset.UtcNow.Add(_duration);
_count = 1;
}
return _count;
}
// over the allowed count within the window
if (delay.Ticks > 0)
{
// delay until the next window
Thread.Sleep(delay);
}
_next = DateTimeOffset.UtcNow.Add(_duration);
_count = 1;
return _count;
}
finally
{
_mutex.Release();
}
}
}
This sample shows how the throttle can be used in a loop, as well as how cancellation behaves. If you think of it like people getting in line for a ride: if the cancellation token is signaled, it's as if that person steps out of line and the other people move forward.
var t = new Throttle(5, per: TimeSpan.FromSeconds(1));
var c = new CancellationTokenSource(TimeSpan.FromSeconds(22));
foreach(var i in Enumerable.Range(1,300)) {
var ct = i > 250
? default(CancellationToken)
: c.Token;
try
{
var n = await t.WaitAsync(ct).ConfigureAwait(false);
WriteLine($"{i}: [{n}] {DateTime.Now}");
}
catch (OperationCanceledException)
{
WriteLine($"{i}: Operation Canceled");
}
}

How can I most effectively take advantage of multiple cores for short computations in .NET?

Here is the context: I am writing an interpreter in C# for a small programming language called Heron, and it has some primitive list operations which can be executed in parallel.
One of the biggest challenges I am facing is distributing the work done by the evaluator across the different cores effectively whenever a parallelizable operation is encountered. An operation can be short or long; it is hard to determine in advance.
One thing that I don't have to worry about is synchronizing data: the parallel operations are explicitly not allowed to modify data.
So the primary question I have is:
What is the most effective way to distribute the work across threads, so that I can guarantee that the computer will distribute the work across two cores?
I am also interested in a related question:
Roughly how long should an operation take before we can start overcoming the overhead of separating the work onto another thread?
If you want to do a lot with parallel operations, you're going to want to start with .NET 4.0. Here's the Parallel Programming for .NET documentation; you'll want to start there. .NET 4.0 adds a LOT in terms of multi-core utilization. Here's a quick example:
Current 3.5 Serial method:
for(int i = 0; i < 30000; i++)
{
doSomething(i);
}
New .Net 4.0 Parallel method:
Parallel.For(0, 30000, (i) => doSomething(i));
The Parallel.For method automatically scales across the number of cores available; you can see how quickly you could start taking advantage of this. There are dozens of new libraries in the framework supporting full thread/task management like your example as well (including all the plumbing for syncing, cancellation, etc.).
There are libraries for Parallel LINQ (PLINQ), task factories, task schedulers and a few others. In short, for the specific task you laid out, .NET 4.0 has huge benefits for you, and I'd go ahead and grab the free beta 2 (RC coming soon) and get started. (No, I don't work for Microsoft... but rarely do I see an upcoming release fulfill a need so perfectly, so I highly recommend .NET 4.0 for you.)
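For example, the PLINQ equivalent of the loop above (a quick sketch; doSomething is the same method assumed in the earlier example):
// AsParallel() partitions the range across the available cores;
// ForAll() runs the action on each element in parallel.
Enumerable.Range(0, 30000)
    .AsParallel()
    .ForAll(i => doSomething(i));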
Because I didn't want to develop using VS 2010, and I found that the ThreadPool didn't have optimal performance for distributing work across cores (I think because it started/stopped too many threads), I ended up rolling my own. Hope that others find this useful:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
namespace HeronEngine
{
/// <summary>
/// Represents a work item.
/// </summary>
public delegate void Task();
/// <summary>
/// This class is intended to efficiently distribute work
/// across the number of cores.
/// </summary>
public static class Parallelizer
{
/// <summary>
/// List of tasks that haven't been yet acquired by a thread
/// </summary>
static List<Task> allTasks = new List<Task>();
/// <summary>
/// List of threads. Should be one per core.
/// </summary>
static List<Thread> threads = new List<Thread>();
/// <summary>
/// When set signals that there is more work to be done
/// </summary>
static ManualResetEvent signal = new ManualResetEvent(false);
/// <summary>
/// Used to tell threads to stop working.
/// </summary>
static bool shuttingDown = false;
/// <summary>
/// Creates a number of high-priority threads for performing
/// work. The hope is that the OS will assign each thread to
/// a separate core.
/// </summary>
/// <param name="cores"></param>
public static void Initialize(int cores)
{
for (int i = 0; i < cores; ++i)
{
Thread t = new Thread(ThreadMain);
// This system is not designed to play well with others
t.Priority = ThreadPriority.Highest;
threads.Add(t);
t.Start();
}
}
/// <summary>
/// Indicates to all threads that there is work
/// to be done.
/// </summary>
public static void ReleaseThreads()
{
signal.Set();
}
/// <summary>
/// Used to indicate that there is no more work
/// to be done, by unsetting the signal. Note:
/// will not work if shutting down.
/// </summary>
public static void BlockThreads()
{
if (!shuttingDown)
signal.Reset();
}
/// <summary>
/// Returns any tasks queued up to perform,
/// or NULL if there is no work. It will reset
/// the global signal effectively blocking all threads
/// if there is no more work to be done.
/// </summary>
/// <returns></returns>
public static Task GetTask()
{
lock (allTasks)
{
if (allTasks.Count == 0)
{
BlockThreads();
return null;
}
Task t = allTasks[0];
allTasks.RemoveAt(0);
return t;
}
}
/// <summary>
/// Primary function for each thread
/// </summary>
public static void ThreadMain()
{
while (!shuttingDown)
{
// Wait until work is available
signal.WaitOne();
// Get an available task
Task task = GetTask();
// Note a task might still be null because
// another thread might have gotten to it first
while (task != null)
{
// Do the work
task();
// Get the next task
task = GetTask();
}
}
}
/// <summary>
/// Distributes work across a number of threads equivalent to the number
/// of cores. All tasks will be run on the available cores.
/// </summary>
/// <param name="localTasks"></param>
public static void DistributeWork(List<Task> localTasks)
{
// Create a list of handles indicating what the main thread should wait for
WaitHandle[] handles = new WaitHandle[localTasks.Count];
lock (allTasks)
{
// Iterate over the list of localTasks, creating a new task that
// will signal when it is done.
for (int i = 0; i < localTasks.Count; ++i)
{
Task t = localTasks[i];
// Create an event used to signal that the task is complete
ManualResetEvent e = new ManualResetEvent(false);
// Create a new signaling task and add it to the list
Task signalingTask = () => { t(); e.Set(); };
allTasks.Add(signalingTask);
// Set the corresponding wait handler
handles[i] = e;
}
}
// Signal to waiting threads that there is work
ReleaseThreads();
// Wait until all of the designated work items are completed.
WaitHandle.WaitAll(handles);
}
/// <summary>
/// Indicate to the system that the threads should terminate
/// and unblock them.
/// </summary>
public static void CleanUp()
{
shuttingDown = true;
ReleaseThreads();
}
}
}
I would go with the thread pool even though it has its problems; MS is investing in improving it, and it seems like .NET 4 will have an improved one. At this point, I think the best thing would be to use the thread pool wrapped in your own object, and hold off on deciding about your own implementation.
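A minimal sketch of that "thread pool wrapped in your own object" idea (assumed shape, .NET 3.5-compatible, reusing the custom Task delegate defined above): queue each work item on the ThreadPool and block until all of them have signaled completion.
// Queues every task on the ThreadPool and waits for the last one to finish.
public static void DistributeWorkViaThreadPool(List<Task> localTasks)
{
    if (localTasks.Count == 0)
        return;
    int remaining = localTasks.Count;
    using (ManualResetEvent done = new ManualResetEvent(false))
    {
        foreach (Task t in localTasks)
        {
            Task local = t; // copy: closures share the loop variable in C# < 5
            ThreadPool.QueueUserWorkItem(delegate
            {
                local();
                if (Interlocked.Decrement(ref remaining) == 0)
                    done.Set(); // the last task releases the waiting caller
            });
        }
        done.WaitOne();
    }
}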
