I have multiple enumerators that enumerate over flat files. I originally ran each enumerator in a Parallel.Invoke, with each Action adding to a BlockingCollection<Entity>, and that collection returned GetConsumingEnumerable():
public interface IFlatFileQuery
{
IEnumerable<Entity> Run();
}
public class FlatFile1 : IFlatFileQuery
{
public IEnumerable<Entity> Run()
{
// loop over a flat file and yield each result
yield return Entity;
}
}
public class Main
{
public IEnumerable<Entity> DoLongTask(ICollection<IFlatFileQuery> _flatFileQueries)
{
// do some other stuff that needs to be returned first:
yield return Entity;
// then enumerate and return the flat file data
foreach (var entity in GetData(_flatFileQueries))
{
yield return entity;
}
}
private IEnumerable<Entity> GetData(ICollection<IFlatFileQuery> _flatFileQueries)
{
var buffer = new BlockingCollection<Entity>(100);
var actions = _flatFileQueries.Select(fundFileQuery => (Action)(() =>
{
foreach (var entity in fundFileQuery.Run())
{
buffer.TryAdd(entity, Timeout.Infinite);
}
})).ToArray();
Task.Factory.StartNew(() =>
{
Parallel.Invoke(actions);
buffer.CompleteAdding();
});
return buffer.GetConsumingEnumerable();
}
}
However after a bit of testing it turns out that the code change below is about 20-25% faster.
private IEnumerable<Entity> GetData(ICollection<IFlatFileQuery> _flatFileQueries)
{
return _flatFileQueries.AsParallel().SelectMany(ffq => ffq.Run());
}
The trouble with the code change is that it waits till all flat file queries are enumerated before it returns the whole lot that can then be enumerated and yielded.
Would it be possible to yield in the above bit of code somehow to make it even faster?
I should add that at most the combined results of all the flat file queries might only be 1000 or so Entities.
Edit:
Changing it to the below doesn't make a difference to the run time. (R# even suggests to go back to the way it was)
private IEnumerable<Entity> GetData(ICollection<IFlatFileQuery> _flatFileQueries)
{
foreach (var entity in _flatFileQueries.AsParallel().SelectMany(ffq => ffq.Run()))
{
yield return entity;
}
}
The trouble with the code change is that it waits till all flat file queries are enumerated before it returns the whole lot that can then be enumerated and yielded.
Let's prove that it's false by a simple example. First, let's create a TestQuery class that will yield a single entity after a given time. Second, let's execute several test queries in parallel and measure how long it took to yield their result.
public class TestQuery : IFlatFileQuery {
private readonly int _sleepTime;
public IEnumerable<Entity> Run() {
Thread.Sleep(_sleepTime);
return new[] { new Entity() };
}
public TestQuery(int sleepTime) {
_sleepTime = sleepTime;
}
}
internal static class Program {
private static void Main() {
Stopwatch stopwatch = Stopwatch.StartNew();
var queries = new IFlatFileQuery[] {
new TestQuery(2000),
new TestQuery(3000),
new TestQuery(1000)
};
foreach (var entity in queries.AsParallel().SelectMany(ffq => ffq.Run()))
Console.WriteLine("Yielded after {0:N0} seconds", stopwatch.Elapsed.TotalSeconds);
Console.ReadKey();
}
}
This code prints:
Yielded after 1 seconds
Yielded after 2 seconds
Yielded after 3 seconds
You can see from this output that AsParallel() will yield each result as soon as it's available, so everything works fine. Note that you might get different timings depending on the degree of parallelism (such as "2s, 5s, 6s" with a degree of parallelism of 1, effectively making the whole operation not parallel at all). This output comes from a 4-core machine.
Your long processing will probably scale with the number of cores, if there is no common bottleneck between the threads (such as a shared locked resource). You might want to profile your algorithm to see if there are slow parts that can be improved using tools such as dotTrace.
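To see the effect of the degree of parallelism for yourself, here is a minimal sketch using PLINQ's WithDegreeOfParallelism. The SleepQuery class and the Measure helper are made up for illustration (they are not from the question), and the timings you get will depend on your machine and thread pool.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;

public class Entity { }

// Hypothetical stand-in for a flat file query: sleeps, then returns one entity.
public class SleepQuery
{
    private readonly int _sleepMs;
    public SleepQuery(int sleepMs) { _sleepMs = sleepMs; }

    public IEnumerable<Entity> Run()
    {
        Thread.Sleep(_sleepMs);
        return new[] { new Entity() };
    }
}

public static class DegreeOfParallelismDemo
{
    // Runs three queries with the requested degree of parallelism
    // and returns the elapsed wall-clock time.
    public static TimeSpan Measure(int degree)
    {
        var queries = new[] { new SleepQuery(200), new SleepQuery(300), new SleepQuery(100) };
        var stopwatch = Stopwatch.StartNew();
        var count = queries
            .AsParallel()
            .WithDegreeOfParallelism(degree)
            .SelectMany(q => q.Run())
            .Count();
        stopwatch.Stop();
        Console.WriteLine($"Degree {degree}: {count} entities in {stopwatch.ElapsedMilliseconds} ms");
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        Measure(1); // one worker: roughly the sum of the sleeps
        Measure(3); // up to three workers: ideally close to the longest sleep
    }
}
```

With a degree of 1 the three sleeps run back to back. With a degree of 3 they can overlap, provided the thread pool has enough threads available.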
I don't think there is a red flag in your code anywhere. There are no outrageous inefficiencies. I think it comes down to multiple smaller differences.
PLINQ is very good at processing streams of data. Internally, it works more efficiently than adding items to a synchronized list one-by-one. I suspect that your calls to TryAdd are a bottleneck because each call requires at least two Interlocked operations internally. Those can put enormous load on the inter-processor memory bus because all threads will compete for the same cache line.
PLINQ is cheaper because internally, it does some buffering. I'm sure it doesn't output items one-by-one. Probably it batches them and amortizes synchronization cost that way over multiple items.
A second issue would be the bounded capacity of the BlockingCollection. 100 is not a lot. This might lead to a lot of waiting. Waiting is costly because it requires a call to the kernel and a context switch.
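If you want to keep the BlockingCollection approach, one way to cut the per-item synchronization cost is to add items in chunks. The sketch below is only an illustration of that idea, not a measured improvement: the BatchedBuffer class, the batch size of 64 and the capacity of 16 batches are all assumptions of mine.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BatchedBuffer
{
    // Merges several sources through a BlockingCollection, adding items in
    // chunks of batchSize so that each synchronized Add covers many items.
    public static IEnumerable<T> Merge<T>(IReadOnlyList<IEnumerable<T>> sources, int batchSize)
    {
        var buffer = new BlockingCollection<List<T>>(boundedCapacity: 16);

        var actions = sources.Select(source => (Action)(() =>
        {
            var batch = new List<T>(batchSize);
            foreach (var item in source)
            {
                batch.Add(item);
                if (batch.Count == batchSize)
                {
                    buffer.Add(batch);
                    batch = new List<T>(batchSize);
                }
            }
            if (batch.Count > 0) buffer.Add(batch); // flush the partial batch
        })).ToArray();

        Task.Run(() =>
        {
            try { Parallel.Invoke(actions); }
            finally { buffer.CompleteAdding(); } // always unblock the consumer
        });

        foreach (var batch in buffer.GetConsumingEnumerable())
            foreach (var item in batch)
                yield return item;
    }

    public static void Main()
    {
        var sources = new[] { Enumerable.Range(0, 500), Enumerable.Range(1000, 500) };
        var merged = Merge<int>(sources, batchSize: 64).ToList();
        Console.WriteLine(merged.Count); // 1000
    }
}
```

The trade-off is latency: an item is not visible to the consumer until its batch is full or its source ends. Also note that this sketch does not surface producer exceptions, and the producers stay blocked if the consumer stops enumerating early.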
I came up with this alternative, which works well for me in any scenario:
In a Task, a Parallel.ForEach transforms each item to be processed and enqueues it in a ConcurrentQueue.
The task has a continuation that sets a flag when the task ends.
On the calling thread, while the task is still running or the queue is not empty, a while loop dequeues items and yields them.
Fast and excellent results for me:
Task.Factory.StartNew (() =>
{
Parallel.ForEach<string> (TextHelper.ReadLines(FileName), ProcessHelper.DefaultParallelOptions,
(string currentLine) =>
{
// Read the line, validate it, and enqueue it as an instance of FileLineData (custom class)
});
}).
ContinueWith
(
ic => isCompleted = true
);
while (!isCompleted || qlines.Count > 0)
{
if (qlines.TryDequeue (out returnLine))
{
yield return returnLine;
}
}
By default the ParallelQuery class, when it is working on IEnumerable<T> sources, employs a partitioning strategy known as "chunk partitioning". With this strategy each worker thread grabs a progressively larger number of items each time. This means that it has an input buffer. Then the results are accumulated into an output buffer, having a size chosen by the system, before they are available to the consumer of the query. You can disable both buffers by using the configuration options EnumerablePartitionerOptions.NoBuffering and ParallelMergeOptions.NotBuffered.
private IEnumerable<Entity> GetData(ICollection<IFlatFileQuery> flatFileQueries)
{
return Partitioner
.Create(flatFileQueries, EnumerablePartitionerOptions.NoBuffering)
.AsParallel()
.AsOrdered()
.WithMergeOptions(ParallelMergeOptions.NotBuffered)
.SelectMany(ffq => ffq.Run());
}
This way each worker thread will grab only one item at a time, and will propagate the result as soon as it is computed.
NoBuffering: Create a partitioner that takes items from the source enumerable one at a time and does not use intermediate storage that can be accessed more efficiently by multiple threads. This option provides support for low latency (items will be processed as soon as they are available from the source) and provides partial support for dependencies between items (a thread cannot deadlock waiting for an item that the thread itself is responsible for processing).
NotBuffered: Use a merge without output buffers. As soon as result elements have been computed, make that element available to the consumer of the query.
In my code I have a method such as:
void PerformWork(List<Item> items)
{
HostingEnvironment.QueueBackgroundWorkItem(async cancellationToken =>
{
foreach (var item in items)
{
await itemHandler.PerformIndividualWork(item);
}
});
}
Where Item is just a known model and itemHandler just does some work based off of the model (the ItemHandler class is defined in a separately maintained code base as nuget pkg I'd rather not modify).
The purpose of this code is to have work done for a list of items in the background but synchronously.
As part of the work, I would like to create a unit test to verify that when this method is called, the items are handled synchronously. I'm pretty sure the issue can be simplified down to this:
await MyTask(1);
await MyTask(2);
Assert.IsTrue(/* MyTask with arg 1 was completed before MyTask with arg 2 */);
The first part of this code I can easily unit test is that the sequence is maintained. For example, using NSubstitute I can check method call order on the library code:
Received.InOrder(() =>
{
itemHandler.PerformIndividualWork(Arg.Is<Item>(arg => arg.Name == "First item"));
itemHandler.PerformIndividualWork(Arg.Is<Item>(arg => arg.Name == "Second item"));
itemHandler.PerformIndividualWork(Arg.Is<Item>(arg => arg.Name == "Third item"));
});
But I'm not quite sure how to ensure that they aren't run in parallel. I've had several ideas which seem bad like mocking the library to have an artificial delay when PerformIndividualWork is called and then either checking a time elapsed on the whole background task being queued or checking the timestamps of the itemHandler received calls for a minimum time between the calls. For instance, if I have PerformIndividualWork mocked to delay 500 milliseconds and I'm expecting three items, then I could check elapsed time:
stopwatch.Start();
// I have an interface instead of directly calling HostingEnvironment, so I can access the task being queued here
backgroundTask.Invoke(...);
stopwatch.Stop();
Assert.IsTrue(stopwatch.ElapsedMilliseconds > 1500);
But that doesn't feel right and could lead to false positives. Perhaps the solution lies in modifying the code itself; however, I can't think of a way of meaningfully changing it to make this sort of unit test (testing tasks are run in order) possible. We'll definitely have system/integration testing to ensure the issue caused by asynchronous performance of the individual items doesn't happen, but I would like to hit testing here at this level as well.
Not sure if this is a good idea, but one approach could be to use an itemHandler that will detect when items are handled in parallel. Here is a quick and dirty example:
public class AssertSynchronousItemHandler : IItemHandler
{
private volatile int concurrentWork = 0;
public List<Item> Items = new List<Item>();
public Task PerformIndividualWork(Item item) =>
Task.Run(() => {
var result = Interlocked.Increment(ref concurrentWork);
if (result != 1) {
throw new Exception($"Expected 1 work item running at a time, but got {result}");
}
Items.Add(item);
var after = Interlocked.Decrement(ref concurrentWork);
if (after != 0) {
throw new Exception($"Expected 0 work items running once this item finished, but got {after}");
}
});
}
There are probably big problems with this, but the basic idea is to check how many items are already being handled when we enter the method, then decrement the counter and check there are still no other items being handled. With threading stuff I think it is very hard to make guarantees about things from tests alone, but with enough items processed this can give us a little confidence that it is working as expected:
[Fact]
public void Sample() {
var handler = new AssertSynchronousItemHandler();
var subject = new Subject(handler);
var input = Enumerable.Range(0, 100).Select(x => new Item(x.ToString())).ToList();
subject.PerformWork(input);
// With the code from the question we don't have a way of detecting
// when `PerformWork` finishes. If we can't change this we need to make
// sure we wait "long enough". Yes this is yuck. :)
Thread.Sleep(1000);
Assert.Equal(input, handler.Items);
}
If I modify PerformWork to do things in parallel I get the test failing:
public void PerformWork2(List<Item> items) {
Task.WhenAll(
items.Select(item => itemHandler.PerformIndividualWork(item))
).Wait(2000);
}
// ---- System.Exception : Expected 1 work item running at a time, but got 4
That said, if it is very important to run synchronously and it is not apparent from glancing at the implementation with async/await then maybe it is worth using a more obviously synchronous design, like a queue serviced by only one thread, so that you're guaranteed synchronous execution by design and people won't inadvertently change it to async during refactoring (i.e. it is deliberately synchronous and documented that way).
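As a rough sketch of what "a queue serviced by only one thread" could look like: SerialWorkQueue below is a hypothetical helper of mine (it is not part of the question's code or of any library), built on BlockingCollection<Action> and one dedicated thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper: runs queued actions one at a time on a single dedicated thread.
public sealed class SerialWorkQueue : IDisposable
{
    private readonly BlockingCollection<Action> _work = new BlockingCollection<Action>();
    private readonly Thread _thread;

    public SerialWorkQueue()
    {
        _thread = new Thread(() =>
        {
            // Only this thread ever executes work items, so they cannot overlap.
            foreach (var action in _work.GetConsumingEnumerable())
                action();
        }) { IsBackground = true };
        _thread.Start();
    }

    public void Enqueue(Action action) => _work.Add(action);

    // Stops accepting work and waits for the queued items to finish.
    public void Dispose()
    {
        _work.CompleteAdding();
        _thread.Join();
        _work.Dispose();
    }
}

public static class SerialWorkQueueDemo
{
    public static List<int> Run()
    {
        var handled = new List<int>();
        using (var queue = new SerialWorkQueue())
        {
            for (var i = 0; i < 5; i++)
            {
                var item = i; // capture a copy for the closure
                queue.Enqueue(() => handled.Add(item));
            }
        } // Dispose drains the queue before returning
        return handled;
    }

    public static void Main() => Console.WriteLine(string.Join(", ", Run())); // 0, 1, 2, 3, 4
}
```

For an async handler like PerformIndividualWork, I assume each queued action would have to block on the returned task (for example with GetAwaiter().GetResult()) to keep the one-at-a-time guarantee. An unhandled exception in an action would also end the worker thread, so real code needs error handling.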
Consider this situation:
class Product { }
interface IWorker
{
Task<Product> CreateProductAsync();
}
I am now given an IEnumerable<IWorker> workers and am supposed to create an IEnumerable<Product> from it that I have to pass to some other function that I cannot alter:
void CheckProducts(IEnumerable<Product> products);
This method needs to have access to the entire IEnumerable<Product>. It is not possible to subdivide it and call CheckProducts on multiple subsets.
One obvious solution is this:
CheckProducts(workers.Select(worker => worker.CreateProductAsync().Result));
But this is blocking, of course, and hence it would only be my last resort.
Syntactically, I need precisely this, just without blocking.
I cannot use await inside of the function I'm passing to Select() as I would have to mark it as async and that would require it to return a Task itself and I would have gained nothing. In the end I need an IEnumerable<Product> and not an IEnumerable<Task<Product>>.
It is important to know that the order of the workers creating their products does matter, their work must not overlap. Otherwise, I would do this:
async Task<IEnumerable<Product>> CreateProductsAsync(IEnumerable<IWorker> workers)
{
var tasks = workers.Select(worker => worker.CreateProductAsync());
return await Task.WhenAll(tasks);
}
But unfortunately, Task.WhenAll() executes some tasks in parallel while I need them executed sequentially.
Here is one possibility to implement it if I had an IReadOnlyList<IWorker> instead of an IEnumerable<IWorker>:
async Task<IEnumerable<Product>> CreateProductsAsync(IReadOnlyList<IWorker> workers)
{
var resultList = new Product[workers.Count];
for (int i = 0; i < resultList.Length; ++i)
resultList[i] = await workers[i].CreateProductAsync();
return resultList;
}
But I must deal with an IEnumerable and, even worse, it is usually quite huge, sometimes it is even unlimited, yielding workers forever. If I knew that its size was decent, I would just call ToArray() on it and use the method above.
The ultimate solution would be this:
async Task<IEnumerable<Product>> CreateProductsAsync(IEnumerable<IWorker> workers)
{
foreach (var worker in workers)
yield return await worker.CreateProductAsync();
}
But yield and await are incompatible as described in this answer. Looking at that answer, would that hypothetical IAsyncEnumerator help me here? Does something similar meanwhile exist in C#?
A summary of the issues I'm facing:
I have a potentially endless IEnumerable<IWorker>
I want to asynchronously call CreateProductAsync() on each of them in the same order as they are coming in
In the end I need an IEnumerable<Product>
A summary of what I already tried, but doesn't work:
I cannot use Task.WhenAll() because it executes tasks in parallel.
I cannot use ToArray() and process that array manually in a loop because my sequence is sometimes endless.
I cannot use yield return because it's incompatible with await.
Does anybody have a solution or workaround for me?
Otherwise I will have to use that blocking code...
IEnumerator<T> is a synchronous interface, so blocking is unavoidable if CheckProducts enumerates the next product before the next worker has finished creating the product.
Nevertheless, you can achieve parallelism by creating products on another thread, adding them to a BlockingCollection<T>, and yielding them on the main thread:
static IEnumerable<Product> CreateProducts(IEnumerable<IWorker> workers)
{
var products = new BlockingCollection<Product>(3);
Task.Run(async () => // On the thread pool...
{
foreach (IWorker worker in workers)
{
Product product = await worker.CreateProductAsync(); // Create products serially.
products.Add(product); // Enqueue the product, blocking if the queue is full.
}
products.CompleteAdding(); // Notify GetConsumingEnumerable that we're done.
});
return products.GetConsumingEnumerable();
}
To avoid unbounded memory consumption, you can optionally specify the capacity of the queue as a constructor argument to BlockingCollection<T>. I used 3 in the code above.
The Situation:
Here you're saying you need to do this synchronously, because IEnumerable doesn't support async and the requirements are you need an IEnumerable<Product>.
I am now given an IEnumerable<IWorker> workers and am supposed to
create an IEnumerable<Product> from it that I have to pass to some
other function that I cannot alter:
Here you say the entire product set needs to be processed at the same time, presumably making a single call to void CheckProducts(IEnumerable<Product> products).
This methods needs to check the entire Product set as a whole. It is
not possible to subdivide the result.
And here you say the enumerable can yield an indefinite number of items
But I must deal with an IEnumerable and, even worse, it is usually
quite huge, sometimes it is even unlimited, yielding workers forever.
If I knew that its size was decent, I would just call ToArray() on it
and use the method above.
So let's put these together. You need to do asynchronous processing of an indefinite number of items within a synchronous environment and then evaluate the entire set as a whole... synchronously.
The Underlying Problems:
1: To evaluate a set as a whole, it must be completely enumerated. To completely enumerate a set, it must be finite. Therefore it is impossible to evaluate an infinite set as a whole.
2: Switching back and forth between sync and async forces the async code to run synchronously. That might be OK from a requirements perspective, but from a technical perspective it can cause deadlocks (maybe unavoidable, I don't know. Look that up. I'm not the expert).
Possible Solutions to Problem 1:
1: Force the source to be an ICollection<T> instead of IEnumerable<T>. This enforces finiteness.
2: Alter the CheckProducts algorithm to process iteratively, potentially yielding intermediary results while still maintaining an ongoing aggregation internally.
Possible Solutions to Problem 2:
1: Make the CheckProducts method asynchronous.
2: Make the CreateProduct... method synchronous.
Bottom Line
You can't do what you're asking how you're asking, and it sounds like someone else is dictating your requirements. They need to change some of the requirements, because what they're asking for is (and I really hate using this word) impossible. Is it possible you have misinterpreted some of the requirements?
Two ideas for you OP
Multiple call solution
If you are allowed to call CheckProducts more than once, you could simply do this:
foreach (var worker in workers)
{
var product = await worker.CreateProductAsync();
CheckProducts(new [] { product } );
}
If it adds value, I'm pretty sure you could work out a way to do it in batches of, say, 100 at a time, too.
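For what it's worth, a batched version could look something like the sketch below. It only applies if calling CheckProducts on subsets is acceptable. BatchedChecker, FakeWorker and the checkProducts delegate are placeholders of mine standing in for your real types and for the real CheckProducts.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class Product { }

public interface IWorker
{
    Task<Product> CreateProductAsync();
}

// Hypothetical worker used only for the demo below.
public class FakeWorker : IWorker
{
    public Task<Product> CreateProductAsync() => Task.FromResult(new Product());
}

public static class BatchedChecker
{
    // Creates products one worker at a time and hands them to checkProducts
    // in batches of at most batchSize.
    public static async Task CheckInBatchesAsync(
        IEnumerable<IWorker> workers, int batchSize, Action<IEnumerable<Product>> checkProducts)
    {
        var batch = new List<Product>(batchSize);
        foreach (var worker in workers)
        {
            batch.Add(await worker.CreateProductAsync());
            if (batch.Count == batchSize)
            {
                checkProducts(batch);
                batch = new List<Product>(batchSize);
            }
        }
        if (batch.Count > 0) checkProducts(batch); // final partial batch
    }

    // Demo: 250 workers in batches of 100 give batch sizes 100, 100, 50.
    public static async Task<List<int>> RunDemoAsync()
    {
        var workers = Enumerable.Range(0, 250).Select(_ => (IWorker)new FakeWorker());
        var sizes = new List<int>();
        await CheckInBatchesAsync(workers, 100, products => sizes.Add(products.Count()));
        return sizes;
    }

    public static async Task Main() =>
        Console.WriteLine(string.Join(", ", await RunDemoAsync())); // 100, 100, 50
}
```

The workers are still awaited strictly one after another, so their work never overlaps. Only the calls to the checker are grouped.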
Thread pool solution
If you are not allowed to call CheckProducts more than once, and not allowed to modify CheckProducts, there is no way to force it to yield control and allow other continuations to run. So no matter what you do, you cannot force asynchronousness into the IEnumerable that you pass to it, not just because of the compiler checking, but because it would probably deadlock.
So here is a thread pool solution. The idea is to create one separate thread to process the products in series; the processor is async, so a call to CreateProductAsync() will still yield control to anything else that has been posted to the synchronization context, as needed. However it can't magically force CheckProducts to give up control, so there is still some possibility that it will block occasionally if it is able to check products faster than they are created. In my example I'm using Monitor.Wait() so the O/S won't schedule the thread until there is something waiting for it. You'll still be using up a thread resource while it blocks, but at least you won't be wasting CPU time in a busy-wait loop.
public static IEnumerable<Product> CreateProducts(IEnumerable<Worker> workers)
{
    var queue = new ConcurrentQueue<Product>();
    var task = Task.Run(() => ConvertProducts(workers.GetEnumerator(), queue));
    while (true)
    {
        // Drain everything that is currently available.
        while (queue.TryDequeue(out Product product))
        {
            yield return product;
        }
        if (task.IsCompleted && queue.IsEmpty) yield break;
        // Monitor.Wait must be called while holding the lock on the same object.
        lock (queue)
        {
            if (queue.IsEmpty && !task.IsCompleted) Monitor.Wait(queue, 1000);
        }
    }
}
private static async Task ConvertProducts(IEnumerator<Worker> input, ConcurrentQueue<Product> output)
{
    while (input.MoveNext())
    {
        var current = input.Current;
        var product = await current.CreateProductAsync();
        output.Enqueue(product);
        // Monitor.Pulse likewise requires the lock to be held.
        lock (output)
        {
            Monitor.Pulse(output);
        }
    }
}
From your requirements I can put together the following:
1) Workers processed in order
2) Open to receive new Workers at any time
We can use the fact that a Dataflow TransformBlock has a built-in queue and processes items in order. That way we can accept Workers from the producer at any time.
Next we make the result of the TransformBlock observable so that the consumer can consume Products on demand.
Made some quick changes and started the consumer portion. This simply takes the observable produced by the Transformer and maps it to an enumerable that yields each product. For background here is the ToEnumerable().
The ToEnumerator operator returns an enumerator from an observable sequence. The enumerator will yield each item in the sequence as it is produced
Source
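using System;
using System.Collections.Generic;
using System.Reactive.Linq; // ToEnumerable() on IObservable<T> (System.Reactive package)
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;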
namespace ClassLibrary1
{
public class WorkerProducer
{
public async Task ProduceWorker()
{
//await ProductTransformer_Transformer.SendAsync(new Worker())
}
}
public class ProductTransformer
{
public IObservable<Product> Products { get; private set; }
public TransformBlock<Worker, Product> Transformer { get; private set; }
private Task<Product> CreateProductAsync(Worker worker) => Task.FromResult(new Product());
public ProductTransformer()
{
Transformer = new TransformBlock<Worker, Product>(wrk => CreateProductAsync(wrk));
Products = Transformer.AsObservable();
}
}
public class ProductConsumer
{
private ThirdParty ThirdParty { get; set; } = new ThirdParty();
private ProductTransformer Transformer { get; set; }
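public ProductConsumer()
{
// Transformer was never assigned in the original; create it before use.
Transformer = new ProductTransformer();
ThirdParty.CheckProducts(Transformer.Products.ToEnumerable());
}
}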
public class Worker { }
public class Product { }
public class ThirdParty
{
public void CheckProducts(IEnumerable<Product> products)
{
}
}
}
Unless I misunderstood something, I don't see why you don't simply do it like this:
var productList = new List<Product>(workers.Count());
foreach(var worker in workers)
{
productList.Add(await worker.CreateProductAsync());
}
CheckProducts(productList);
What about if you simply keep clearing a List of size 1?
var productList = new List<Product>(1);
var checkTask = Task.CompletedTask;
foreach(var worker in workers)
{
await checkTask;
productList.Clear();
productList.Add(await worker.CreateProductAsync());
checkTask = Task.Run(() => CheckProducts(productList));
}
await checkTask;
You can use Task.WhenAll, but instead of returning result of Task.WhenAll, return collection of tasks transformed to the collection of results.
async Task<IEnumerable<Product>> CreateProductsAsync(IEnumerable<IWorker> workers)
{
var tasks = workers.Select(worker => worker.CreateProductAsync()).ToList();
await Task.WhenAll(tasks);
return tasks.Select(task => task.Result);
}
Order of tasks will be preserved.
And it seems like it should be OK to go with just return await Task.WhenAll().
From the docs of the Task.WhenAll<TResult>(IEnumerable<Task<TResult>>) method:
The Task.Result property of the returned task will be set to
an array containing all of the results of the supplied tasks in the
same order as they were provided...
If workers need to be executed one by one in the order they were created, and another function needs the whole set of worker results:
async Task<IEnumerable<Product>> CreateProductsAsync(IEnumerable<IWorker> workers)
{
var products = new List<Product>();
foreach (var worker in workers)
{
var product = await worker.CreateProductAsync();
products.Add(product);
}
return products;
}
You can do this now with async, IEnumerable and LINQ but every method in the chain after the async would be a Task<T>, and you need to use something like await Task.WhenAll at the end. You can use async lambdas in the LINQ methods, which return Task<T>. You don't need to wait synchronously in these.
The Select will start your tasks sequentially i.e. they won't even exist as tasks until the select enumerates each one, and won't keep going after you stop enumerating. You could also run your own foreach over the enumerable of tasks if you want to await them all individually.
You can break out of this like any other foreach without it starting all of them, so this will also work on an infinite enumerable.
public async Task Main()
{
// This async method call could also be an async lambda
foreach (var task in GetTasks())
{
var result = await task;
Console.WriteLine($"Result is {result}");
if (result > 5) break;
}
}
private IEnumerable<Task<int>> GetTasks()
{
return GetNumbers().Select(WaitAndDoubleAsync);
}
private async Task<int> WaitAndDoubleAsync(int i)
{
Console.WriteLine($"Waiting {i} seconds asynchronously");
await Task.Delay(TimeSpan.FromSeconds(i));
return i * 2;
}
/// Keeps yielding numbers
private IEnumerable<int> GetNumbers()
{
var i = 0;
while (true) yield return i++;
}
Outputs the following, then stops:
Waiting 0 seconds asynchronously
Result is 0
Waiting 1 seconds asynchronously
Result is 2
Waiting 2 seconds asynchronously
Result is 4
Waiting 3 seconds asynchronously
Result is 6
The important thing is that you can't mix yield and await in the same method, but you can yield Tasks returned from a method that uses await absolutely fine, so you can use them together just by splitting them into separate methods. Select is already a method that uses yield, so you may not need to write your own method for this.
In your post you were looking for a Task<IEnumerable<Product>>, but what you can actually use is an IEnumerable<Task<Product>>.
You can go even further with this e.g. if you had something like a REST API where one resource can have links to other resources, like if you just wanted to get a list of users of a group, but stop when you found the user you were interested in:
public async Task<IEnumerable<Task<User>>> GetUserTasksAsync(int groupId)
{
var group = await GetGroupAsync(groupId);
return group.UserIds.Select(GetUserAsync);
}
foreach (var task in await GetUserTasksAsync(1))
{
var user = await task;
...
}
There is no solution to your problem. You can't transform a deferred IEnumerable<Task<Product>> to a deferred IEnumerable<Product>, such that the consuming thread will not get blocked while enumerating the IEnumerable<Product>. The IEnumerable<T> is a synchronous interface. It returns an enumerator with a synchronous MoveNext method. The MoveNext returns bool, which is not an awaitable type. An asynchronous interface IAsyncEnumerable<T> exists, whose enumerator has an asynchronous MoveNextAsync method, with a return type of ValueTask<bool>. But you have explicitly said that you can't change the consuming method, so you are stuck with the IEnumerable<T> interface. No solution then.
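For completeness, this is roughly what the asynchronous version looks like when the consumer can be changed. It is a sketch that assumes C# 8 or later. CheckProductsAsync and FakeWorker are hypothetical: the first stands in for an asynchronous replacement of CheckProducts, which the question says cannot be modified, so this illustrates the interface and does not work around the restriction.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class Product { }

public interface IWorker
{
    Task<Product> CreateProductAsync();
}

// Hypothetical worker used only for the demo below.
public class FakeWorker : IWorker
{
    public Task<Product> CreateProductAsync() => Task.FromResult(new Product());
}

public static class AsyncProducts
{
    // An async iterator: await and yield return can be combined when the
    // return type is IAsyncEnumerable<T>.
    public static async IAsyncEnumerable<Product> CreateProductsAsync(IEnumerable<IWorker> workers)
    {
        foreach (var worker in workers)
            yield return await worker.CreateProductAsync();
    }

    // Hypothetical asynchronous counterpart of CheckProducts: counts the products.
    public static async Task<int> CheckProductsAsync(IAsyncEnumerable<Product> products)
    {
        var count = 0;
        await foreach (var product in products)
            count++;
        return count;
    }

    public static async Task Main()
    {
        var workers = new IWorker[] { new FakeWorker(), new FakeWorker(), new FakeWorker() };
        var count = await CheckProductsAsync(CreateProductsAsync(workers));
        Console.WriteLine(count); // 3
    }
}
```

Each worker is awaited only when the consumer asks for the next product, so the work stays sequential and also copes with an endless source, as long as the consumer eventually stops enumerating.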
Try:
workers.ForEach(async wrkr =>
{
var prdlist = await wrkr.CreateProductAsync();
// Remaining tasks...
});
I have the following use case. Multiple threads are creating data points which are collected in a ConcurrentBag. Every x ms a single consumer thread looks at the data points that came in since the last time and processes them (e.g. count them + calculate average).
The following code more or less represents the solution that I came up with:
private static ConcurrentBag<long> _bag = new ConcurrentBag<long>();
static void Main()
{
Task.Run(() => Consume());
var producerTasks = Enumerable.Range(0, 8).Select(i => Task.Run(() => Produce()));
Task.WaitAll(producerTasks.ToArray());
}
private static void Produce()
{
for (int i = 0; i < 100000000; i++)
{
_bag.Add(i);
}
}
private static void Consume()
{
while (true)
{
var oldBag = _bag;
_bag = new ConcurrentBag<long>();
var average = oldBag.DefaultIfEmpty().Average();
var count = oldBag.Count;
Console.WriteLine($"Avg = {average}, Count = {count}");
// Wait x ms
}
}
Is a ConcurrentBag the right tool for the job here?
Is switching the bags the right way to achieve clearing the list for new data points and then processing the old ones?
Is it safe to operate on oldBag or could I run into trouble when I iterate over oldBag and a thread is still adding an item?
Should I use Interlocked.Exchange() for switching the variables?
EDIT
I guess the above code was not really a good representation of what I'm trying to achieve. So here is some more code to show the problem:
public class LogCollectorTarget : TargetWithLayout, ILogCollector
{
private readonly List<string> _logMessageBuffer;
public LogCollectorTarget()
{
_logMessageBuffer = new List<string>();
}
protected override void Write(LogEventInfo logEvent)
{
var logMessage = Layout.Render(logEvent);
lock (_logMessageBuffer)
{
_logMessageBuffer.Add(logMessage);
}
}
public string GetBuffer()
{
lock (_logMessageBuffer)
{
var messages = string.Join(Environment.NewLine, _logMessageBuffer);
_logMessageBuffer.Clear();
return messages;
}
}
}
The class's purpose is to collect logs so they can be sent to a server in batches. Every x seconds GetBuffer is called. This should get the current log messages and clear the buffer for new messages. It works with locks, but as they are quite expensive I don't want to lock on every logging operation in my program. That's why I wanted to use a ConcurrentBag as a buffer. But then I still need to switch or clear it when I call GetBuffer without losing any log messages that arrive during the switch.
Since you have a single consumer, you can work your way with a simple ConcurrentQueue, without swapping collections:
public class LogCollectorTarget : TargetWithLayout, ILogCollector
{
private readonly ConcurrentQueue<string> _logMessageBuffer;
public LogCollectorTarget()
{
_logMessageBuffer = new ConcurrentQueue<string>();
}
protected override void Write(LogEventInfo logEvent)
{
var logMessage = Layout.Render(logEvent);
_logMessageBuffer.Enqueue(logMessage);
}
public string GetBuffer()
{
// How many messages should we dequeue?
var count = _logMessageBuffer.Count;
var messages = new StringBuilder();
while (count > 0 && _logMessageBuffer.TryDequeue(out var message))
{
messages.AppendLine(message);
count--;
}
return messages.ToString();
}
}
If memory allocations become an issue, you can instead dequeue them to a fixed-size array and call string.Join on it. This way, you're guaranteed to do only two allocations (whereas the StringBuilder could do many more if the initial buffer isn't properly sized):
public string GetBuffer()
{
// How many messages should we dequeue?
var count = _logMessageBuffer.Count;
var buffer = new string[count];
for (int i = 0; i < count; i++)
{
_logMessageBuffer.TryDequeue(out var message);
buffer[i] = message;
}
return string.Join(Environment.NewLine, buffer);
}
Is a ConcurrentBag the right tool for the job here?
It's the right tool for a job, but this really depends on what you are trying to do, and why. The example you have given is very simplistic without any context, so it's hard to tell.
Is switching the bags the right way to achieve clearing the list for
new data points and then processing the old ones?
The answer is no, for probably many reasons. What happens if a thread writes to it, while you are switching it?
Is it safe to operate on oldBag or could I run into trouble when I
iterate over oldBag and a thread is still adding an item?
No, you have just copied the reference, this will achieve nothing.
Should I use Interlocked.Exchange() for switching the variables?
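Interlocked methods are great things, but this will not help you with your current problem. Interlocked.Exchange makes the swap of the reference itself atomic, yet a producer that has already read the old reference can still add to the old bag after the swap. You are really confused and you need to look up more thread-safety examples.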
However Lets point you in the right direction. forget about ConcurrentBag and those fancy classes. My advice is start simple and use locking so you understand the nature of the problem.
If you want multiple tasks/threads to access a list, you can easily use the lock statement and guard access to the list/array so other nasty threads aren't modifying it.
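As a rough illustration of guarding a list with the lock statement, here is a minimal sketch. The class name GuardedSamples and its members are hypothetical, not taken from the question; the point is only that every access to the list goes through the same lock object.

```csharp
using System.Collections.Generic;

// Hypothetical container; the name and members are illustrative only.
public class GuardedSamples
{
    // Dedicated lock object: every access to _samples takes this lock.
    private readonly object _gate = new object();
    private readonly List<int> _samples = new List<int>();

    // Producers may call this from any thread.
    public void Add(int value)
    {
        lock (_gate)
        {
            _samples.Add(value);
        }
    }

    // The consumer takes everything collected so far and leaves the list empty.
    public int[] TakeAll()
    {
        lock (_gate)
        {
            int[] snapshot = _samples.ToArray();
            _samples.Clear();
            return snapshot;
        }
    }
}
```

The consumer would call TakeAll() and then do its averaging (or any other processing) on the returned array, outside the lock.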
Obviously the code you have written is a nonsensical example; I mean, you are just adding consecutive numbers to a list and getting another thread to average them. This hardly needs to be producer/consumer at all, and would make more sense as plain synchronous code.
At this point I would point you to better architectures that would allow you to implement this pattern, e.g. TPL Dataflow, but I fear this is just a learning exercise, and unfortunately you really need to do more reading on multithreading and try more examples before we can truly help you with a problem.
It works with locks, but they are quite expensive. I don't want to lock on every logging operation in my program.
Acquiring an uncontended lock is actually quite cheap. Quoting from Joseph Albahari's book:
You can expect to acquire and release a lock in as little as 20 nanoseconds on a 2010-era computer if the lock is uncontended.
Locking becomes expensive when it is contended. You can minimize the contention by reducing the work inside the critical region to the absolute minimum. In other words don't do anything inside the lock that can be done outside the lock. In your second example the method GetBuffer does a String.Join inside the lock, delaying the release of the lock and increasing the chances of blocking other threads. You can improve it like this:
public string GetBuffer()
{
string[] messages;
lock (_logMessageBuffer)
{
messages = _logMessageBuffer.ToArray();
_logMessageBuffer.Clear();
}
return String.Join(Environment.NewLine, messages);
}
But it can be optimized even further. You could use the technique of your first example, and instead of clearing the existing List<string>, just swap it with a new list:
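public string GetBuffer()
{
List<string> oldList;
// Lock on a dedicated object rather than on the list being replaced; otherwise
// two threads could end up holding locks on two different list instances.
// Assumes a field such as: private readonly object _locker = new();
// and that the logging method locks on _locker as well.
lock (_locker)
{
oldList = _logMessageBuffer;
_logMessageBuffer = new();
}
return String.Join(Environment.NewLine, oldList);
}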
Starting from .NET Core 3.0, the Monitor class has the property Monitor.LockContentionCount, that returns the number of times there was contention at the entry point of a lock. You could watch the delta of this property every second, and see if the number is concerning. If you get single-digit numbers, there is nothing to worry about.
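Watching the delta could look roughly like the sketch below. The ContentionProbe class and SampleDelta method are hypothetical names; only Monitor.LockContentionCount comes from the framework. Note that the counter is process-wide, so it includes contention on every lock in the process, not just the logging one.

```csharp
using System;
using System.Threading;

// Hypothetical helper; only Monitor.LockContentionCount is a real framework member.
public static class ContentionProbe
{
    // Blocks for the given interval and returns how many contended lock
    // acquisitions were recorded in the whole process during that time.
    public static long SampleDelta(TimeSpan interval)
    {
        long before = Monitor.LockContentionCount;
        Thread.Sleep(interval);
        return Monitor.LockContentionCount - before;
    }
}
```

Calling ContentionProbe.SampleDelta(TimeSpan.FromSeconds(1)) in a loop on a background thread and logging the result would give the per-second figure described above.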
Touching some of your questions:
Is a ConcurrentBag the right tool for the job here?
No. The ConcurrentBag<T> is a very specialized collection intended for mixed producer-consumer scenarios, mainly object pools. You don't have such a scenario here. A ConcurrentQueue<T> is preferable to a ConcurrentBag<T> in almost all scenarios.
Should I use Interlocked.Exchange() for switching the variables?
Only if the collection was immutable. If the _logMessageBuffer was an ImmutableQueue<T>, then it would be excellent to swap it with Interlocked.Exchange. With mutable types you have no idea if the old collection is still in use by another thread, and for how long. The operating system can suspend any thread at any time for a duration of 10-30 milliseconds or even more (demo). So it's not safe to use lock-free techniques. You have to lock.
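For comparison, a sketch of the immutable variant might look like this. The class name ImmutableLogBuffer is hypothetical; ImmutableQueue<T>, ImmutableInterlocked.Enqueue and Interlocked.Exchange are the real APIs. Because each queue instance never changes after it is created, the old instance returned by the exchange cannot be modified by another thread.

```csharp
using System;
using System.Collections.Immutable;
using System.Threading;

// Hypothetical class name; a sketch of the immutable-collection approach.
public class ImmutableLogBuffer
{
    private ImmutableQueue<string> _messages = ImmutableQueue<string>.Empty;

    public void Log(string message)
    {
        // Internally retries a compare-and-swap until this enqueue wins.
        ImmutableInterlocked.Enqueue(ref _messages, message);
    }

    public string GetBuffer()
    {
        // Atomically take the current queue and leave an empty one behind.
        ImmutableQueue<string> old =
            Interlocked.Exchange(ref _messages, ImmutableQueue<string>.Empty);
        return string.Join(Environment.NewLine, old);
    }
}
```

The trade-off is that every Log call allocates a new queue node and may retry under contention, so this is not automatically faster than a short lock around a List<string>.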
I have a current algorithm that goes like this.
public class Executor
{
private ParallelOptions options = new ParallelOptions();
private IList<Step> AllSteps;
public void Execute()
{
options.MaxDegreeOfParallelism = 4;
var rootSteps = AllSteps.Where(s => !s.Parents.Any());
Parallel.ForEach(rootSteps, options, RecursivelyExecuteStep);
}
private void RecursivelyExecuteStep(Step step)
{
ExecuteStep(step);
var childSteps = AllSteps.Where(s => s.Parents.Contains(step)
&& s.Parents.All(p => p.IsComplete));
Parallel.ForEach(childSteps, options, RecursivelyExecuteStep);
}
}
ParallelOptions.MaxDegreeOfParallelism will be an input variable (but left it out of the code example for brevity).
I was wondering whether thread pooling is handled for me automatically or whether this creates new threads every time. Also, what's the best way to optimize this? Is thread pooling something I want, and how do I use it? I'm fairly new to multithreading and to what's new in .NET 4.5[.1].
Won't this fail to limit the algorithm to only 4 threads, because each Parallel.ForEach has its own MaxDegreeOfParallelism of 4 and so doesn't limit all the threads in the app to 4? How do I limit all threading in the app to 4?
Edit: MaxDegreeOfParallelism
You can solve this problem with the TPL Dataflow library (you can get it via NuGet). As said in the other answer, the Parallel class uses the ThreadPool internally, and you should not need to bother with that.
With TPL Dataflow, the only thing you need to do is create a TransformManyBlock<TInput,TOutput> linked to itself (or link a BufferBlock to an ActionBlock and wrap the pair with DataflowBlock.Encapsulate), and set MaxDegreeOfParallelism = 4 or whatever constant you think it should be.
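The idea of a block that feeds itself can be sketched as follows. This is not the TransformManyBlock wiring described above: for brevity it swaps in a single ActionBlock that posts follow-up steps back to itself. The Step members (Parents, IsComplete, TryClaim) and the DataflowExecutor name are assumptions made for the example, and the sketch assumes the steps form an acyclic graph in which every step is reachable from a root; otherwise the returned task never completes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical step type; the question does not show the real one.
public class Step
{
    public List<Step> Parents { get; } = new List<Step>();
    public volatile bool IsComplete;
    private int _claimed;

    // Returns true for exactly one caller, so a step is queued only once.
    public bool TryClaim() => Interlocked.Exchange(ref _claimed, 1) == 0;
}

public static class DataflowExecutor
{
    public static Task Run(IList<Step> allSteps, Action<Step> executeStep, int maxParallelism)
    {
        ActionBlock<Step> block = null;
        int remaining = allSteps.Count;

        block = new ActionBlock<Step>(step =>
        {
            executeStep(step);
            step.IsComplete = true;

            // When the last step has run, no more input will arrive.
            if (Interlocked.Decrement(ref remaining) == 0)
            {
                block.Complete();
                return;
            }

            // Queue every child whose parents have all completed.
            foreach (var next in allSteps.Where(s =>
                s.Parents.Contains(step) && s.Parents.All(p => p.IsComplete)))
            {
                if (next.TryClaim()) block.Post(next);
            }
        },
        // One limit for the whole graph, instead of one per nested loop.
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxParallelism });

        foreach (var root in allSteps.Where(s => s.Parents.Count == 0))
        {
            if (root.TryClaim()) block.Post(root);
        }
        if (allSteps.Count == 0) block.Complete();

        return block.Completion;
    }
}
```

Because all steps pass through one block, MaxDegreeOfParallelism bounds the concurrency of the whole run rather than of each recursive call.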
Parallel.ForEach is basically a nice way to queue up work items to the .NET ThreadPool.
Your application (process) has only one instance of the ThreadPool, and it tries to be as smart as possible regarding how many concurrent threads it uses, taking things like number of available cores and virtual memory into account.
So yes, the .NET ThreadPool handles thread pooling for you, and in many cases you don't need to worry about it: use Parallel.ForEach and let it get on with it.
EDIT: As noted by others, you should be careful about overusing the ThreadPool, since it is a shared resource and overuse may disturb other parts of your application. It will also start creating new threads if your items are blocking or very long-running, which is often wasteful. A rule of thumb is that the work items should be relatively quick and preferably non-blocking. You should test and benchmark, and if it works for your use case then it is very convenient.
You can control the max number of concurrent threads used by the ThreadPool in your application if you want explicit control, by calling ThreadPool.SetMaxThreads. I'd advise against that unless you really have to, though, and know what you are doing. The ThreadPool already tries to avoid using more concurrent threads than you have cores, for example.
What you can do with ParallelOptions.MaxDegreeOfParallelism is only to further limit the number of concurrent ThreadPool threads that are used to execute that particular call to Parallel.ForEach.
If you need more explicit control of how many concurrent threads an invocation of your algorithm uses, here are some possible alternatives (in, arguably, increasing implementation complexity):
With the default ThreadPool you can't limit overall concurrency while calling Parallel.ForEach recursively. You could, for example, use Parallel.ForEach only at the top level (with a ParallelOptions.MaxDegreeOfParallelism argument) and let RecursivelyExecuteStep use a standard foreach.
Modify (or replace) the ThreadPool for your algorithm to limit concurrency by setting ParallelOptions.TaskScheduler to an instance of QueuedTaskScheduler from Parallel Extensions Extras, as described here.
As suggested by @VMAtm, you can use TPL Dataflow to get more explicit control of how your computations are performed, including concurrency (this can also be combined with a custom task scheduler if you really want to knock yourself out).
A simple straightforward implementation could look like the following:
ParallelOptions Options = new ParallelOptions{MaxDegreeOfParallelism = 4};
IList<Step> AllSteps;
public void Execute()
{
var RemainingSteps = new HashSet<Step>(AllSteps);
while(RemainingSteps.Count > 0)
{
var ExecutableSteps = RemainingSteps.Where(s => s.Parents.All(p => p.IsComplete)).ToList();
Parallel.ForEach(ExecutableSteps, Options, ExecuteStep);
RemainingSteps.ExceptWith(ExecutableSteps);
}
}
Granted, this will execute steps in phases, so you will not always have maximum concurrency. You may only be executing one step at the end of each phase, since the next steps to execute are only realized after all steps in the current phase complete.
If you want to improve concurrency, I would suggest using a BlockingCollection. You'll need to implement a custom partitioner to use Parallel.ForEach against the blocking collection in this case. You'll also want a concurrent collection of the remaining steps, so that you don't queue the same step multiple times (the race condition previously commented on).
public class Executor
{
ParallelOptions Options = new ParallelOptions() { MaxDegreeOfParallelism = 4 };
IList<Step> AllSteps;
//concurrent hashset of remaining steps (used to prevent race conditions)
ConcurrentDictionary<Step, Step> RemainingSteps = new ConcurrentDictionary<Step, Step>();
//blocking collection of steps that can execute next
BlockingCollection<Step> ExecutionQueue = new BlockingCollection<Step>();
public void Execute()
{
foreach(var step in AllSteps)
{
if(step.Parents.All(p => p.IsComplete))
{
ExecutionQueue.Add(step);
}
else
{
RemainingSteps.TryAdd(step, step);
}
}
Parallel.ForEach(
GetConsumingPartitioner(ExecutionQueue),
Options,
Execute);
}
void Execute(Step step)
{
ExecuteStep(step);
if(RemainingSteps.IsEmpty)
{
//we're done, all steps are complete
ExecutionQueue.CompleteAdding();
return;
}
//queue up the steps that can execute next (ConcurrentDictionary.Values returns a snapshot copy, so subsequent removal is safe)
foreach(var candidate in RemainingSteps.Values.Where(s => s.Parents.All(p => p.IsComplete)))
{
//note, removal only succeeds once per step, so this eliminates the race condition
Step NextStep;
if(RemainingSteps.TryRemove(candidate, out NextStep))
{
ExecutionQueue.Add(NextStep);
}
}
}
Partitioner<T> GetConsumingPartitioner<T>(BlockingCollection<T> collection)
{
return new BlockingCollectionPartitioner<T>(collection);
}
class BlockingCollectionPartitioner<T> : Partitioner<T>
{
readonly BlockingCollection<T> Collection;
public BlockingCollectionPartitioner(BlockingCollection<T> collection)
{
if (collection == null) throw new ArgumentNullException("collection");
Collection = collection;
}
public override bool SupportsDynamicPartitions { get { return true; } }
public override IList<IEnumerator<T>> GetPartitions(int partitionCount)
{
if (partitionCount < 1) throw new ArgumentOutOfRangeException("partitionCount");
// named "partitions" so the local does not hide System.Linq.Enumerable
var partitions = GetDynamicPartitions();
return Enumerable.Range(0, partitionCount)
.Select(i => partitions.GetEnumerator()).ToList();
}
public override IEnumerable<T> GetDynamicPartitions()
{
return Collection.GetConsumingEnumerable();
}
}
}
Why will Parallel.ForEach not finish executing a series of tasks until MoveNext returns false?
I have a tool that monitors a combination of MSMQ and Service Broker queues for incoming messages. When a message is found, it hands that message off to the appropriate executor.
I wrapped the check for messages in an IEnumerable, so that I could hand the Parallel.ForEach method the IEnumerable plus a delegate to run. The application is designed to run continuously w/ the IEnumerator.MoveNext processing in a loop until it's able to get work, then the IEnumerator.Current giving it the next item.
Since MoveNext never returns false until I cancel the token, this should continue to process forever. Instead, what I'm seeing is that once Parallel.ForEach has picked up all the messages and MoveNext is no longer returning "true", no more tasks are processed. It seems like the MoveNext thread is the only thread given any work while the loop waits for it to return, and the other threads (including waiting and scheduled threads) do not do any work.
Is there a way to tell the Parallel to keep working while it waits for a response from the MoveNext?
If not, is there another way to structure the MoveNext to get what I want? (having it return true and then the Current returning a null object spawns a lot of bogus Tasks)
Bonus Question: Is there a way to limit how many messages the Parallel pulls off at once? It seems to pull off and schedule a lot of messages at once (the MaxDegreeOfParallelism only seems to limit how much work it does at once, it doesn't stop it from pulling off a lot of messages to be scheduled)
Here is the IEnumerator for what I've written (w/o some extraneous code):
public class DataAccessEnumerator : IEnumerator<TransportMessage>
{
public TransportMessage Current
{ get { return _currentMessage; } }
public bool MoveNext()
{
while (_cancelToken.IsCancellationRequested == false)
{
TransportMessage current;
foreach (var task in _tasks)
{
if (task.QueueType.ToUpper() == "MSMQ")
current = _msmq.Get(task.Name);
else
current = _serviceBroker.Get(task.Name);
if (current != null)
{
_currentMessage = current;
return true;
}
}
WaitHandle.WaitAny(new [] {_cancelToken.WaitHandle}, 500);
}
return false;
}
public DataAccessEnumerator(IDataAccess<TransportMessage> serviceBroker, IDataAccess<TransportMessage> msmq, IList<JobTask> tasks, CancellationToken cancelToken)
{
_serviceBroker = serviceBroker;
_msmq = msmq;
_tasks = tasks;
_cancelToken = cancelToken;
}
private readonly IDataAccess<TransportMessage> _serviceBroker;
private readonly IDataAccess<TransportMessage> _msmq;
private readonly IList<JobTask> _tasks;
private readonly CancellationToken _cancelToken;
private TransportMessage _currentMessage;
}
Here is the Parallel.ForEach call where _queueAccess is the IEnumerable that holds the above IEnumerator and RunJob processes a TransportMessage that is returned from that IEnumerator:
var parallelOptions = new ParallelOptions
{
CancellationToken = _cancelTokenSource.Token,
MaxDegreeOfParallelism = 8
};
Parallel.ForEach(_queueAccess, parallelOptions, x => RunJob(x));
It sounds to me like Parallel.ForEach isn't really a good match for what you want to do. I suggest you use BlockingCollection<T> to create a producer/consumer queue instead - create a bunch of threads/tasks to service the blocking collection, and add work items to it as and when they arrive.
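A minimal sketch of that producer/consumer shape is shown below. The QueueWorkers class and its Start method are hypothetical names; in the scenario from the question the handler would be RunJob and the producer would be the polling loop that currently lives in MoveNext.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper; a sketch of a fixed set of consumers over a BlockingCollection.
public static class QueueWorkers
{
    // Starts workerCount consumers. Each one blocks until an item is
    // available and exits once CompleteAdding has been called and the
    // collection has been drained.
    public static Task[] Start<T>(BlockingCollection<T> queue, int workerCount, Action<T> handle)
    {
        return Enumerable.Range(0, workerCount)
            .Select(_ => Task.Factory.StartNew(() =>
            {
                foreach (var item in queue.GetConsumingEnumerable())
                {
                    handle(item);
                }
            }, TaskCreationOptions.LongRunning))
            .ToArray();
    }
}
```

The producer adds each message with queue.Add(message) as it arrives and calls queue.CompleteAdding() on shutdown. Creating the collection with a bounded capacity, for example new BlockingCollection<TransportMessage>(8), makes Add block when the workers fall behind, which also limits how many messages are pulled off the source queues ahead of time.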
Your problem might be to do with the Partitioner being used.
In your case, the TPL will choose the Chunk Partitioner, which takes multiple items from the enumerable before passing them on to be processed. The number of items taken in each chunk increases over time.
When your MoveNext method blocks, the TPL is left waiting for the next item and won't process the items that it has already taken.
You have a couple of options to fix this:
1) Write a Partitioner that always returns individual items. Not as tricky as it sounds.
2) Start a Task per item instead of using Parallel.ForEach:
foreach ( var item in _queueAccess )
{
var capturedItem = item;
Task.Factory.StartNew( () => RunJob( capturedItem ) );
}
The second solution changes the behaviour a bit. The foreach loop will complete when all the Tasks have been created, not when they have finished. If this is a problem for you, you can add a CountdownEvent:
var ce = new CountdownEvent( 1 );
foreach ( var item in _queueAccess )
{
ce.AddCount();
var capturedItem = item;
Task.Factory.StartNew( () => { RunJob( capturedItem ); ce.Signal(); } );
}
ce.Signal();
ce.Wait();
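For the first option, .NET 4.5 and later already ship a partitioner that hands out one item at a time, so a custom class may not be needed. The sketch below wraps it in a hypothetical helper (UnbufferedParallel.ForEach is not a framework method); Partitioner.Create and EnumerablePartitionerOptions.NoBuffering are the real APIs. Whether this fully resolves the stall with a blocking MoveNext is an assumption that would need testing against the actual queues.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical wrapper around Parallel.ForEach with chunk buffering disabled.
public static class UnbufferedParallel
{
    public static void ForEach<T>(IEnumerable<T> source, ParallelOptions options, Action<T> body)
    {
        // NoBuffering makes each worker take a single item at a time
        // instead of progressively larger chunks.
        var partitioner = Partitioner.Create(source, EnumerablePartitionerOptions.NoBuffering);
        Parallel.ForEach(partitioner, options, body);
    }
}
```

With the names from the question, the call would become UnbufferedParallel.ForEach(_queueAccess, parallelOptions, x => RunJob(x)).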
I haven't gone to the effort to make sure of this, but the impression I'd received from discussions of Parallel.ForEach was that it would pull all the items out of the enumerable and then make appropriate decisions about how to divide them across threads. Based on your problem, that seems correct.
So, to keep most of your current code, you should probably pull the blocking code out of the iterator and place it into a loop around the call to Parallel.ForEach (which uses the iterator).