How to process items in parallel and then merge the results? - c#

I am confronted with the following problem:
I have a data stream of Foo objects and stream those objects to several concurrent in-process tasks/threads, which in turn process the objects and output FooResult objects. Each FooResult contains, among other members, the same Foo that was used to create it. However, not every Foo necessarily produces a FooResult.
My problem is that I want to pass on from this whole process a wrapping object that contains the original Foo and potentially all, if any, FooResult objects that may have been created from a Foo within the concurrent tasks.
Note: I currently use TPL Dataflow, where each concurrent process happens within an ActionBlock<Foo> that is linked to from a BroadcastBlock<Foo>. Each ActionBlock uses SendAsync() to send any FooResult it creates to a target dataflow block. Obviously the concurrent dataflow blocks produce FooResult objects at unpredictable times, which is what I currently struggle with. I cannot figure out how many FooResult objects were created in all ActionBlock<Foo> instances together, so I cannot bundle them up with the originating Foo and pass them on as a wrapping object.
In Pseudo code it currently looks as follows:
BroadcastBlock<Foo> broadCastBlock;
ActionBlock<Foo> aBlock1;
ActionBlock<Foo> aBlock2;
ActionBlock<FooResult> targetBlock;
broadCastBlock.LinkTo(aBlock1); broadCastBlock.LinkTo(aBlock2);
aBlock1 = new ActionBlock<Foo>(foo =>
{
    // Do something here. Sometimes this creates a FooResult; if it does:
    targetBlock.SendAsync(fooResult);
});
// Similar for aBlock2
However, the problem with the current code is that the targetBlock potentially does not receive anything if a Foo did not produce a single FooResult in any of the action blocks. Also, it could be that targetBlock receives 2 FooResult objects because each action block produced a FooResult.
What I want is that the targetBlock receives a wrapping object that contains each Foo and if FooResult objects were created then also a collection of FooResult.
Any ideas what I could do to make the solution work in the way described? It does not have to use TPL Dataflow, but it would be neat if it did.
UPDATE: The following is what I got by implementing JoinBlock as suggested by svick. I am not going to use it (unless its performance can be tweaked), because it is extremely slow to run: I get about 89000 items per second, and those are only int value types.
public class Test
{
private BroadcastBlock<int> broadCastBlock;
private TransformBlock<int, int> transformBlock1;
private TransformBlock<int, int> transformBlock2;
private JoinBlock<int, int, int> joinBlock;
private ActionBlock<Tuple<int, int, int>> processorBlock;
public Test()
{
broadCastBlock = new BroadcastBlock<int>(i =>
{
return i;
});
transformBlock1 = new TransformBlock<int, int>(i =>
{
return i;
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded });
transformBlock2 = new TransformBlock<int, int>(i =>
{
return i;
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded });
joinBlock = new JoinBlock<int, int, int>();
processorBlock = new ActionBlock<Tuple<int, int, int>>(tuple =>
{
//Console.WriteLine("original value: " + tuple.Item1 + "tfb1: " + tuple.Item2 + "tfb2: " + tuple.Item3);
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded });
//Linking
broadCastBlock.LinkTo(transformBlock1, new DataflowLinkOptions { PropagateCompletion = true });
broadCastBlock.LinkTo(transformBlock2, new DataflowLinkOptions { PropagateCompletion = true });
broadCastBlock.LinkTo(joinBlock.Target1, new DataflowLinkOptions { PropagateCompletion = true });
transformBlock1.LinkTo(joinBlock.Target2, new DataflowLinkOptions { PropagateCompletion = true });
transformBlock2.LinkTo(joinBlock.Target3, new DataflowLinkOptions { PropagateCompletion = true });
joinBlock.LinkTo(processorBlock, new DataflowLinkOptions { PropagateCompletion = true });
}
public void Start()
{
Stopwatch watch = new Stopwatch();
watch.Start();
const int numElements = 1000000;
for (int i = 1; i <= numElements; i++)
{
broadCastBlock.Post(i);
}
////mark completion
broadCastBlock.Complete();
processorBlock.Completion.Wait();
watch.Stop();
// Multiply before dividing so integer division does not truncate the rate.
Console.WriteLine("Time it took: " + watch.ElapsedMilliseconds + " - items processed per second: " + numElements * 1000L / watch.ElapsedMilliseconds);
Console.ReadLine();
}
}
Update of the code to reflect suggestions:
public Test()
{
broadCastBlock = new BroadcastBlock<int>(i =>
{
return i;
});
transformBlock1 = new TransformBlock<int, int>(i =>
{
return i;
});
transformBlock2 = new TransformBlock<int, int>(i =>
{
return i;
});
joinBlock = new JoinBlock<int, int>();
processorBlock = new ActionBlock<Tuple<int, int>>(tuple =>
{
//Console.WriteLine("tfb1: " + tuple.Item1 + "tfb2: " + tuple.Item2);
});
//Linking
broadCastBlock.LinkTo(transformBlock1, new DataflowLinkOptions { PropagateCompletion = true });
broadCastBlock.LinkTo(transformBlock2, new DataflowLinkOptions { PropagateCompletion = true });
transformBlock1.LinkTo(joinBlock.Target1);
transformBlock2.LinkTo(joinBlock.Target2);
joinBlock.LinkTo(processorBlock, new DataflowLinkOptions { PropagateCompletion = true });
}
public void Start()
{
Stopwatch watch = new Stopwatch();
watch.Start();
const int numElements = 1000000;
for (int i = 1; i <= numElements; i++)
{
broadCastBlock.Post(i);
}
////mark completion
broadCastBlock.Complete();
Task.WhenAll(transformBlock1.Completion, transformBlock2.Completion).ContinueWith(_ => joinBlock.Complete());
processorBlock.Completion.Wait();
watch.Stop();
// Multiply before dividing so integer division does not truncate the rate.
Console.WriteLine("Time it took: " + watch.ElapsedMilliseconds + " - items processed per second: " + numElements * 1000L / watch.ElapsedMilliseconds);
Console.ReadLine();
}
}

I can see two ways to solve this:
1. Use JoinBlock. Your broadcast block and both worker blocks will each send to one target of the join block. If a worker block doesn't have a result for an item, it sends null (or some other special value) instead. Your worker blocks will need to change to TransformBlock<Foo, FooResult>, because using ActionBlock the way you do doesn't guarantee ordering (at least not when you set MaxDegreeOfParallelism), whereas TransformBlock does.
The result of the JoinBlock would be a Tuple<Foo, FooResult, FooResult>, where either or both of the FooResults can be null.
That said, I'm not sure I like that this solution relies heavily on correct ordering of items; that seems fragile to me.
2. Use some other object for synchronization. That object will then be responsible for sending the result forward once all blocks are done with a certain item. This is similar to the NotificationWrapper suggested by Mario in his answer.
You could use TaskCompletionSource and Task.WhenAll() to take care of synchronization in this case.
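A minimal sketch of the TaskCompletionSource and Task.WhenAll() idea, under some assumptions: the Foo and FooResult records are stand-ins for the question's types, and FooEnvelope, Report, WhenAllReported and MergeDemo are hypothetical names invented for this illustration. Plain Task.Run calls play the role of the worker blocks; in a Dataflow pipeline the envelope would be the item that is broadcast to them.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-ins for the question's Foo and FooResult types.
public sealed record Foo(int Id);
public sealed record FooResult(Foo Source, string Worker);

// Travels with one Foo and holds one TaskCompletionSource per worker.
public sealed class FooEnvelope
{
    private readonly TaskCompletionSource<FooResult?>[] _slots;

    public FooEnvelope(Foo foo, int workerCount)
    {
        Foo = foo;
        _slots = Enumerable.Range(0, workerCount)
            .Select(_ => new TaskCompletionSource<FooResult?>(
                TaskCreationOptions.RunContinuationsAsynchronously))
            .ToArray();
    }

    public Foo Foo { get; }

    // Each worker calls this exactly once; null means "no result for this Foo".
    public void Report(int workerIndex, FooResult? result) =>
        _slots[workerIndex].SetResult(result);

    // Completes once every worker has reported, with the non-null results.
    public async Task<(Foo Foo, IReadOnlyList<FooResult> Results)> WhenAllReported()
    {
        FooResult?[] all = await Task.WhenAll(_slots.Select(s => s.Task));
        IReadOnlyList<FooResult> results = all.OfType<FooResult>().ToList();
        return (Foo, results);
    }
}

public static class MergeDemo
{
    public static async Task<(Foo Foo, IReadOnlyList<FooResult> Results)> RunAsync(Foo foo)
    {
        var envelope = new FooEnvelope(foo, workerCount: 2);
        // Worker 0 only produces a result for even ids; worker 1 always does.
        _ = Task.Run(() => envelope.Report(0, foo.Id % 2 == 0 ? new FooResult(foo, "w0") : null));
        _ = Task.Run(() => envelope.Report(1, new FooResult(foo, "w1")));
        return await envelope.WhenAllReported();
    }
}
```

Because every worker reports exactly once, even when it has nothing to contribute, the wrapper is always forwarded, and it does not depend on the order in which items leave the worker blocks.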

As far as I understand the issue:
lock foo
work on foo
if foo has not triggered sending a result
and fooResult exists
send fooResult
remember in foo that result has already been sent
unlock foo
Update after comment of OP
So push foo into your BroadCastBlock
BroadCastBlock<Foo> bcb = new BroadCastBlock<Foo>(foo);
...
if ( aBlock1.HasResult )
{
bcb.Add( aBlock1.Result );
}
if ( aBlock2.HasResult )
{
bcb.Add( aBlock2.Result );
}
And now you can query bcb for what is present and send what needed (or just send the bcb).
Update (after even more discussion in comments)
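class NotificationWrapper<TSource, TResult>
{
    private readonly object gate = new object();
    private readonly TSource originalSource;
    private readonly Queue<TResult> resultsGenerated = new Queue<TResult>();
    private int workerCount;

    public NotificationWrapper(TSource originalSource, int workerCount)
    {
        this.originalSource = originalSource;
        this.workerCount = workerCount;
    }

    // Called by a worker that finished without producing a result.
    public void NotifyActionDone()
    {
        lock (gate)
        {
            --workerCount;
            if (workerCount == 0)
            {
                // All workers are done: forward the source and its results.
                Send(originalSource, resultsGenerated);
            }
        }
    }

    // Called by a worker that finished and produced a result.
    public void NotifyActionDone(TResult result)
    {
        lock (gate)
        {
            resultsGenerated.Enqueue(result);
            NotifyActionDone();
        }
    }

    private void Send(TSource source, Queue<TResult> results)
    {
        // Placeholder: pass the pair on to the next stage here.
    }
}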
And in the calling Code:
NotificationWrapper<Foo, FooResult> notificationWrapper = new NotificationWrapper<Foo, FooResult>( foo, 2 );
ActionBlock<Foo> ab1 = new ActionBlock<Foo>( foo, notificationWrapper );
ActionBlock<Foo> ab2 = new ActionBlock<Foo>( foo, notificationWrapper );
The ActionBlock then needs to be changed to call either NotifyActionDone() or NotifyActionDone( FooResult ) once it has completed its calculation.

Related

Asynchronous Task, video buffering

I am trying to understand Tasks in C# but am still having some problems. I am trying to create an application that handles video. The main purpose is to read the video from a file (I am using Emgu.CV), send it via TCP/IP to a board for processing, and then receive it back as a (real-time) stream. At first I did it serially: reading a Bitmap, sending it to and receiving it from the board, and plotting it. But reading the bitmaps and plotting them takes too much time. I would like to have transmit and receive FIFO buffers that hold the video frames, and a separate task that does the job of sending and receiving each frame, so that the work runs in parallel. I thought I should create 3 Tasks:
tasks.Add(Task.Run(() => Video_load(video_path)));
tasks.Add(Task.Run(() => Video_Send_Recv(video_path)));
tasks.Add(Task.Run(() => VideoDisp_hw(32)));
Which I would like to run "parallel". What type of object should I use? A concurrent queue? BufferBlock? or just a list?
Thanks for the advice! I would like to ask something else. I am trying to create a simple console program with 2 TPL blocks. One block would be a TransformBlock (taking a message, e.g. "start", and loading data into a List) and the other would be an ActionBlock (just reading the data from the list and printing it). Here is the code:
namespace TPL_Dataflow
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Hello World!");
Random randn = new Random();
var loadData = new TransformBlock<string, List<int>>(async sample_string =>
{
List<int> input_data = new List<int>();
int cnt = 0;
if (sample_string == "start")
{
Console.WriteLine("Inside loadData");
while (cnt < 16)
{
input_data.Add(randn.Next(1, 255));
await Task.Delay(1500);
Console.WriteLine("Cnt");
cnt++;
}
}
else
{
Console.WriteLine("Not started yet");
}
return input_data;
});
var PrintData = new ActionBlock<List<int>>(async input_data =>
{
while(input_data.Count > 0)
{
Console.WriteLine("output Data = " + input_data.First());
await Task.Delay(1000);
input_data.RemoveAt(0);
}
});
var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
loadData.LinkTo(PrintData, input_data => input_data.Count() >0 );
//loadData.LinkTo(PrintData, linkOptions);
loadData.SendAsync("start");
loadData.Complete();
PrintData.Completion.Wait();
}
}
}
But it seems to work in a serial way. What am I doing wrong? I tried to make the while loops async. I would like the two things to run in parallel, so that data is displayed as soon as it becomes available in the List.
You could use a TransformManyBlock<string, int> as the producer block, and an ActionBlock<int> as the consumer block. The TransformManyBlock would be instantiated with the constructor that accepts a Func<string, IEnumerable<int>> delegate, and passed an iterator method (the Produce method in the example below) that yields values one by one:
Random random = new Random();
var producer = new TransformManyBlock<string, int>(Produce);
IEnumerable<int> Produce(string message)
{
if (message == "start")
{
int cnt = 0;
while (cnt < 16)
{
int value;
lock (random) value = random.Next(1, 255);
Console.WriteLine($"Producing #{value}");
yield return value;
Thread.Sleep(1500);
cnt++;
}
}
else
{
yield break;
}
}
var consumer = new ActionBlock<int>(async value =>
{
Console.WriteLine($"Received: {value}");
await Task.Delay(1000);
});
producer.LinkTo(consumer, new() { PropagateCompletion = true });
producer.Post("start");
producer.Complete();
consumer.Completion.Wait();
Unfortunately the producer has to block the worker thread during the idle period between yielding each value (Thread.Sleep(1500);), because the TransformManyBlock currently does not have a constructor that accepts a Func<string, IAsyncEnumerable<int>>. This will probably be fixed in the next release of the TPL Dataflow library. You could track this GitHub issue to be informed about when this feature is released.
Alternative solution: Instead of linking explicitly the producer and the consumer, you could keep them unlinked, and send manually the values produced by the producer to the consumer. In this case both blocks would be ActionBlocks:
Random random = new Random();
var consumer = new ActionBlock<int>(async value =>
{
Console.WriteLine($"Received: {value}");
await Task.Delay(1000);
});
var producer = new ActionBlock<string>(async message =>
{
if (message == "start")
{
int cnt = 0;
while (cnt < 16)
{
int value;
lock (random) value = random.Next(1, 255);
Console.WriteLine($"Producing #{value}");
var accepted = await consumer.SendAsync(value);
if (!accepted) break; // The consumer has failed
await Task.Delay(1500);
cnt++;
}
}
});
PropagateCompletion(producer, consumer);
producer.Post("start");
producer.Complete();
consumer.Completion.Wait();
async void PropagateCompletion(IDataflowBlock source, IDataflowBlock target)
{
try { await source.Completion.ConfigureAwait(false); } catch { }
var ex = source.Completion.IsFaulted ? source.Completion.Exception : null;
if (ex != null) target.Fault(ex); else target.Complete();
}
The main difficulty with this approach is how to propagate the completion of the producer to the consumer, so that eventually both blocks are completed. Obviously you can't use the new DataflowLinkOptions { PropagateCompletion = true } configuration, since the blocks are not linked explicitly. You also can't manually Complete the consumer right after posting, because it would then stop accepting values from the producer prematurely. The solution to this problem is the PropagateCompletion method shown in the above example.

Dataflow blocks when some parallel process does a heavy job

I'm trying to understand TPL Dataflow.
I have two blocks, inputBlock and nextBlock.
inputBlock uses MaxDegreeOfParallelism = 2.
The parallel jobs can take different amounts of time to finish. I do not want the flow of data to stop because one parallel job takes a long time to finish.
I simply want each parallel job to take one item from the queue, process it, and then pass it to the next block.
However, nextBlock is never reached when one of the parallel jobs in the first block, inputBlock, goes to sleep or does heavy work.
internal class Program
{
private static bool _sleep = true;
private static void Main(string[] args)
{
var inputBlock = new TransformBlock<string, string>(
x =>
{
if (_sleep)
{
_sleep = false;
Console.WriteLine("First thread sleeping");
Thread.Sleep(5000000);
}
Console.WriteLine("Second thread running");
return x;
},
new ExecutionDataflowBlockOptions {MaxDegreeOfParallelism = 2}); //1
var nextBlock = new TransformBlock<string, string>(
x =>
{
Console.WriteLine(x);
return x;
}); //2
inputBlock.LinkTo(nextBlock, new DataflowLinkOptions {PropagateCompletion = true});
for (var i = 0; i < 100; i++)
{
inputBlock.Post(i.ToString());
}
inputBlock.Complete();
Console.ReadLine();
}
}
}
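Using EnsureOrdered = false was the answer. By default a block emits its outputs in the same order as its inputs arrived, so the sleeping first item holds back every result queued behind it.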
new ExecutionDataflowBlockOptions {MaxDegreeOfParallelism = 2, EnsureOrdered = false});
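The effect of EnsureOrdered can be seen in a small sketch like the following. OrderingDemo and its RunAsync method are hypothetical names for this illustration, and the delays are arbitrary values chosen so that item 0 is clearly slower than the rest.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class OrderingDemo
{
    // Returns the items in the order in which they left the worker block.
    public static async Task<List<int>> RunAsync(bool ensureOrdered)
    {
        var output = new List<int>();

        var worker = new TransformBlock<int, int>(async x =>
        {
            // Item 0 is the slow one; every other item finishes quickly.
            await Task.Delay(x == 0 ? 500 : 10);
            return x;
        }, new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = 2,
            EnsureOrdered = ensureOrdered
        });

        // The sink keeps the default MaxDegreeOfParallelism of 1,
        // so adding to a plain List is safe here.
        var sink = new ActionBlock<int>(x => output.Add(x));
        worker.LinkTo(sink, new DataflowLinkOptions { PropagateCompletion = true });

        for (int i = 0; i < 5; i++) worker.Post(i);
        worker.Complete();
        await sink.Completion;
        return output;
    }
}
```

With ensureOrdered set to true the sink receives the items in input order, because items 1 to 4 wait in the output reordering buffer until item 0 is done. With false they are forwarded as soon as they finish, and the slow item arrives last.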

How to span MaxDegreeOfParallelism across multiple TPL Dataflow blocks?

I want to limit the total number of queries that I submit to my database server across all Dataflow blocks to 30. In the following scenario, the throttling of 30 concurrent tasks is per block so it always hits 60 concurrent tasks during execution. Obviously I could limit my parallelism to 15 per block to achieve a system wide total of 30 but this wouldn't be optimal.
How do I make this work? Do I limit (and block) my awaits using SemaphoreSlim, etc, or is there an intrinsic Dataflow approach that works better?
public class TPLTest
{
private long AsyncCount = 0;
private long MaxAsyncCount = 0;
private long TaskId = 0;
private object MetricsLock = new object();
public async Task Start()
{
ExecutionDataflowBlockOptions execOption
= new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 30 };
DataflowLinkOptions linkOption = new DataflowLinkOptions()
{ PropagateCompletion = true };
var doFirstIOWorkAsync = new TransformBlock<Data, Data>(
async data => await DoIOBoundWorkAsync(data), execOption);
var doCPUWork = new TransformBlock<Data, Data>(
data => DoCPUBoundWork(data));
var doSecondIOWorkAsync = new TransformBlock<Data, Data>(
async data => await DoIOBoundWorkAsync(data), execOption);
var doProcess = new TransformBlock<Data, string>(
i => $"Task finished, ID = : {i.TaskId}");
var doPrint = new ActionBlock<string>(
s => Debug.WriteLine(s));
doFirstIOWorkAsync.LinkTo(doCPUWork, linkOption);
doCPUWork.LinkTo(doSecondIOWorkAsync, linkOption);
doSecondIOWorkAsync.LinkTo(doProcess, linkOption);
doProcess.LinkTo(doPrint, linkOption);
int taskCount = 150;
for (int i = 0; i < taskCount; i++)
{
await doFirstIOWorkAsync.SendAsync(new Data() { Delay = 2500 });
}
doFirstIOWorkAsync.Complete();
await doPrint.Completion;
Debug.WriteLine("Max concurrent tasks: " + MaxAsyncCount.ToString());
}
private async Task<Data> DoIOBoundWorkAsync(Data data)
{
lock(MetricsLock)
{
AsyncCount++;
if (AsyncCount > MaxAsyncCount)
MaxAsyncCount = AsyncCount;
}
if (data.TaskId <= 0)
data.TaskId = Interlocked.Increment(ref TaskId);
await Task.Delay(data.Delay);
lock (MetricsLock)
AsyncCount--;
return data;
}
private Data DoCPUBoundWork(Data data)
{
data.Step = 1;
return data;
}
}
Data Class:
public class Data
{
public int Delay { get; set; }
public long TaskId { get; set; }
public int Step { get; set; }
}
Starting point:
TPLTest tpl = new TPLTest();
await tpl.Start();
Why don't you marshal everything to an action block that has the actual limitation?
var count = 0;
var ab1 = new TransformBlock<int, string>(l => $"1:{l}");
var ab2 = new TransformBlock<int, string>(l => $"2:{l}");
var doPrint = new ActionBlock<string>(
async s =>
{
var c = Interlocked.Increment(ref count);
Console.WriteLine($"{c}:{s}");
await Task.Delay(5);
Interlocked.Decrement(ref count);
},
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 15 });
ab1.LinkTo(doPrint);
ab2.LinkTo(doPrint);
for (var i = 100; i > 0; i--)
{
if (i % 3 == 0) await ab1.SendAsync(i);
if (i % 5 == 0) await ab2.SendAsync(i);
}
ab1.Complete();
ab2.Complete();
await ab1.Completion;
await ab2.Completion;
This is the solution I ended up going with (unless I can figure out how to use a single generic DataFlow block for marshalling every type of database access):
I defined a SemaphoreSlim at the class level:
private SemaphoreSlim ThrottleDatabaseQuerySemaphore = new SemaphoreSlim(30, 30);
I modified the I/O method to call a throttling method. The operation is passed as a delegate so that it only starts once the semaphore has been acquired:
private async Task<Data> DoIOBoundWorkAsync(Data data)
{
    if (data.TaskId <= 0)
        data.TaskId = Interlocked.Increment(ref TaskId);
    await ThrottleDatabaseQueryAsync(() => Task.Delay(data.Delay));
    return data;
}
The throttling method: (I also have a generic version of the throttling routine because I couldn't figure out how to write one routine to handle both Task and Task<TResult>)
private async Task ThrottleDatabaseQueryAsync(Func<Task> operation)
{
    await ThrottleDatabaseQuerySemaphore.WaitAsync();
    try
    {
        lock (MetricsLock)
        {
            AsyncCount++;
            if (AsyncCount > MaxAsyncCount)
                MaxAsyncCount = AsyncCount;
        }
        // Start the operation only after the semaphore is held; a Task that
        // was already started would run regardless of the throttle.
        await operation();
    }
    finally
    {
        lock (MetricsLock)
            AsyncCount--;
        ThrottleDatabaseQuerySemaphore.Release();
    }
}
}
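One way to cover both Task and Task<TResult> with a single throttling routine is to implement the generic version once and let the non-generic overload delegate to it. The following is only a sketch under that assumption: QueryThrottle, RunAsync and Peak are hypothetical names, and the operation is passed as a delegate so that it cannot start before the semaphore has been acquired.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class QueryThrottle
{
    private readonly SemaphoreSlim _gate;
    private int _current;
    private int _peak;

    public QueryThrottle(int maxConcurrency) =>
        _gate = new SemaphoreSlim(maxConcurrency, maxConcurrency);

    // Highest number of operations observed running at the same time.
    public int Peak => Volatile.Read(ref _peak);

    public async Task<T> RunAsync<T>(Func<Task<T>> operation)
    {
        await _gate.WaitAsync().ConfigureAwait(false);
        try
        {
            int now = Interlocked.Increment(ref _current);
            // Lock-free update of the peak: retry until our value is stored
            // or another thread has already recorded a higher one.
            int seen;
            while (now > (seen = Volatile.Read(ref _peak)) &&
                   Interlocked.CompareExchange(ref _peak, now, seen) != seen) { }

            // The operation starts only here, while the semaphore is held.
            return await operation().ConfigureAwait(false);
        }
        finally
        {
            Interlocked.Decrement(ref _current);
            _gate.Release();
        }
    }

    // Non-generic overload implemented in terms of the generic one.
    public Task RunAsync(Func<Task> operation) =>
        RunAsync<bool>(async () =>
        {
            await operation().ConfigureAwait(false);
            return true;
        });
}
```

A single QueryThrottle instance created with a limit of 30 could then be shared by the delegates of every block that talks to the database, for example `await throttle.RunAsync(() => DoIOBoundWorkAsync(data))`, which caps the total regardless of each block's own MaxDegreeOfParallelism.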
The simplest solution to this problem is to configure all your blocks with a limited-concurrency TaskScheduler:
TaskScheduler scheduler = new ConcurrentExclusiveSchedulerPair(
TaskScheduler.Default, maxConcurrencyLevel: 30).ConcurrentScheduler;
ExecutionDataflowBlockOptions execOption = new()
{
TaskScheduler = scheduler,
MaxDegreeOfParallelism = scheduler.MaximumConcurrencyLevel,
};
TaskSchedulers can only limit the concurrency of work done on threads. They can't throttle asynchronous operations that are not running on threads. So in order to enforce the MaximumConcurrencyLevel policy, unfortunately you must pass synchronous delegates to all the Dataflow blocks. For example:
TransformBlock<Data, Data> doFirstIOWorkAsync = new(data =>
{
return DoIOBoundWorkAsync(data).GetAwaiter().GetResult();
}, execOption);
This change will increase the demand for ThreadPool threads, so you'd better increase the number of threads that the ThreadPool creates instantly on demand to a higher value than the default Environment.ProcessorCount:
ThreadPool.SetMinThreads(100, 100); // At the start of the program
I am proposing this solution not because it is optimal, but because it is easy to implement. My understanding is that wasting some RAM on ~30 threads that are going to be blocked most of the time, won't have any measurable negative effect on the type of application that you are working with.

Multicast block TPL Dataflow

I need to multicast an object into multiple paths:
     producer
        |
    multicast
    |        |
Process1  Process2
    |        |
 Writedb  WriteFile
The BroadcastBlock is not helping much: it only offers the latest message to Process1 and Process2, so if Process2 is running late it won't receive all of the messages.
The DB writer and the file writer have different data.
Here is the code snippet.
class Program
{
public static void Main()
{
var broadCastBlock = new BroadcastBlock<int>(i => i);
var transformBlock1 = new TransformBlock<int, string>(i =>
{
Console.WriteLine("1 transformblock called: {0}", i);
//Thread.Sleep(4);
return string.Format("1_ {0},", i);
});
var transformBlock2 = new TransformBlock<int, string>(i =>
{
Console.WriteLine("2 transformblock called: {0}", i);
Thread.Sleep(100);
return string.Format("2_ {0},", i);
});
var processorBlockT1 = new ActionBlock<string>(i => Console.WriteLine("processBlockT1 {0}", i));
var processorBlockT2 = new ActionBlock<string>(i => Console.WriteLine("processBlockT2 {0}", i));
//Linking
broadCastBlock.LinkTo(transformBlock1, new DataflowLinkOptions { PropagateCompletion = true });
broadCastBlock.LinkTo(transformBlock2, new DataflowLinkOptions { PropagateCompletion = true });
transformBlock1.LinkTo(processorBlockT1, new DataflowLinkOptions { PropagateCompletion = true });
transformBlock2.LinkTo(processorBlockT2, new DataflowLinkOptions { PropagateCompletion = true });
const int numElements = 100;
for (int i = 1; i <= numElements; i++)
{
broadCastBlock.SendAsync(i);
}
//completion handling
broadCastBlock.Completion.ContinueWith(x =>
{
Console.WriteLine("Broadcast block completed");
transformBlock1.Complete();
transformBlock2.Complete();
Task.WhenAll(transformBlock1.Completion, transformBlock2.Completion).ContinueWith(_ =>
{
processorBlockT1.Complete();
processorBlockT2.Complete();
});
});
transformBlock1.Completion.ContinueWith(x => Console.WriteLine("Transform1 completed"));
transformBlock2.Completion.ContinueWith(x => Console.WriteLine("Transform2 completed"));
processorBlockT1.Completion.ContinueWith(x => Console.WriteLine("processblockT1 completed"));
processorBlockT2.Completion.ContinueWith(x => Console.WriteLine("processblockT2 completed"));
//mark completion
broadCastBlock.Complete();
Task.WhenAll(processorBlockT1.Completion, processorBlockT2.Completion).ContinueWith(_ => Console.WriteLine("completed both tasks")).Wait();
Console.WriteLine("Finished");
Console.ReadLine();
}
}
What is the best way to get guaranteed delivery from a broadcast, i.e. a multicast?
Should I just put a buffer in front of each process, so that the buffers always collect whatever is coming in and each process can take its time working through them?
The BroadcastBlock guarantees that all messages will be offered to all linked blocks. So it is exactly what you need. What you should fix though is the way you feed the BroadcastBlock with messages:
for (int i = 1; i <= numElements; i++)
{
broadCastBlock.SendAsync(i); // Don't do this!
}
The SendAsync method is supposed to be awaited. You should never have more than one pending SendAsync operations targeting the same block. Doing so not only breaks all guarantees about the order of the received messages, but it is also extremely memory-inefficient. The whole point of using bounded blocks is for controlling the memory usage by limiting the size of the internal buffers of the blocks. By issuing multiple un-awaited SendAsync commands you circumvent this self-imposed limitation by creating an external dynamic buffer of incomplete Tasks, with each task weighing hundreds of bytes, for propagating messages having just a fraction of this weight. These messages could be much more efficiently buffered internally, by not making the blocks bounded in the first place.
for (int i = 1; i <= numElements; i++)
{
await broadCastBlock.SendAsync(i); // Now it's OK
}

TransformBlock never completes

I'm trying to wrap my head around "completion" in TPL Dataflow blocks. In particular, the TransformBlock doesn't seem to ever complete. Why?
Sample program
My code calculates the square of all integers from 1 to 1000. I used a BufferBlock and a TransformBlock for that. Later in my code, I await completion of the TransformBlock. The block never actually completes though, and I don't understand why.
static void Main(string[] args)
{
var bufferBlock = new BufferBlock<int>();
var calculatorBlock = new TransformBlock<int, int>(i =>
{
Console.WriteLine("Calculating {0}²", i);
return (int)Math.Pow(i, 2);
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 8 });
using (bufferBlock.LinkTo(calculatorBlock, new DataflowLinkOptions { PropagateCompletion = true }))
{
foreach (var number in Enumerable.Range(1, 1000))
{
bufferBlock.Post(number);
}
bufferBlock.Complete();
// This line never completes
calculatorBlock.Completion.Wait();
// Unreachable code
IList<int> results;
if (calculatorBlock.TryReceiveAll(out results))
{
foreach (var result in results)
{
Console.WriteLine("x² = {0}", result);
}
}
}
}
At first I thought I created a deadlock situation, but that doesn't seem to be true. When I inspected the calculatorBlock.Completion task in the debugger, its Status property was set to WaitingForActivation. That was the moment when my brain blue screened.
The reason your pipeline hangs is that both BufferBlock and TransformBlock evidently don't complete until they have emptied themselves of items (I guess that is the desired behavior of IPropagatorBlocks, although I haven't found documentation on it).
This can be verified with a more minimal example:
var bufferBlock = new BufferBlock<int>();
bufferBlock.Post(0);
bufferBlock.Complete();
bufferBlock.Completion.Wait();
This blocks indefinitely unless you add bufferBlock.Receive(); before completing.
If you remove the items from your pipeline before blocking, whether by running your TryReceiveAll code, linking another ActionBlock to the pipeline, converting your TransformBlock to an ActionBlock, or any other way, the wait will no longer block.
About your specific solution, it seems that you don't need a BufferBlock or TransformBlock at all since blocks have an input queue for themselves and you don't use the return value of the TransformBlock. This could be achieved with just an ActionBlock:
var block = new ActionBlock<int>(
i =>
{
Console.WriteLine("Calculating {0}²", i);
Console.WriteLine("x² = {0}", (int)Math.Pow(i, 2));
},
new ExecutionDataflowBlockOptions {MaxDegreeOfParallelism = 8});
foreach (var number in Enumerable.Range(1, 1000))
{
block.Post(number);
}
block.Complete();
block.Completion.Wait();
I think I understand it now. An instance of TransformBlock is not considered "complete" until the following conditions are met:
TransformBlock.Complete() has been called
InputCount == 0 – the block has applied its transformation to every incoming element
OutputCount == 0 – all transformed elements have left the output buffer
In my program, there is no target block that is linked to the source TransformBlock, so the source block never gets to flush its output buffer.
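The conditions above can be checked with a small experiment. This is only a sketch: CompletionDemo and CompletesWithin are hypothetical names, and the timeouts are arbitrary values chosen for the illustration.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class CompletionDemo
{
    public static async Task<(bool BeforeDrain, bool AfterDrain)> RunAsync()
    {
        var block = new TransformBlock<int, int>(x => x * x);
        block.Post(2);
        block.Post(3);
        block.Complete();

        // Nothing consumes the output, so Completion stays pending
        // even after both items have been transformed.
        bool before = await CompletesWithin(block.Completion, TimeSpan.FromMilliseconds(300));

        // Drain the output buffer by receiving both transformed items.
        _ = await block.ReceiveAsync();
        _ = await block.ReceiveAsync();

        // With an empty output buffer the block can now finish.
        bool after = await CompletesWithin(block.Completion, TimeSpan.FromSeconds(5));
        return (before, after);
    }

    // True if the task finishes before the timeout elapses.
    private static async Task<bool> CompletesWithin(Task task, TimeSpan timeout) =>
        await Task.WhenAny(task, Task.Delay(timeout)) == task;
}
```

BeforeDrain should come back false and AfterDrain true, which matches the observation that the output buffer has to be emptied before the block reports completion.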
As a workaround, I added a second BufferBlock that is used to store transformed elements.
static void Main(string[] args)
{
var inputBufferBlock = new BufferBlock<int>();
var calculatorBlock = new TransformBlock<int, int>(i =>
{
Console.WriteLine("Calculating {0}²", i);
return (int)Math.Pow(i, 2);
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 8 });
var outputBufferBlock = new BufferBlock<int>();
using (inputBufferBlock.LinkTo(calculatorBlock, new DataflowLinkOptions { PropagateCompletion = true }))
using (calculatorBlock.LinkTo(outputBufferBlock, new DataflowLinkOptions { PropagateCompletion = true }))
{
foreach (var number in Enumerable.Range(1, 1000))
{
inputBufferBlock.Post(number);
}
inputBufferBlock.Complete();
calculatorBlock.Completion.Wait();
IList<int> results;
if (outputBufferBlock.TryReceiveAll(out results))
{
foreach (var result in results)
{
Console.WriteLine("x² = {0}", result);
}
}
}
}
A TransformBlock needs an ITargetBlock to which it can post the transformed items.
var writeCustomerBlock = new ActionBlock<int>(c => Console.WriteLine(c));
transformBlock.LinkTo(
writeCustomerBlock, new DataflowLinkOptions
{
PropagateCompletion = true
});
After this it completes.
