How can I block only until the first thread finishes? - c#

I recently came across a case where it would be handy to be able to spawn a bunch of threads, block and wait for exactly one answer (the first one to arrive), cancelling the rest of the threads and then unblocking.
For example, suppose I have a search function that takes a seed value. Let us stipulate that the search function can be trivially parallelized. Furthermore, our search space contains many potential solutions; for some seed values the function will search indefinitely, but at least one seed value will yield a solution in a reasonable amount of time.
It would be great if I could do this search in parallel, totally naively, like:
let seeds = [|0..100|]
Array.Parallel.map(fun seed -> Search(seed)) seeds
Sadly, Array.Parallel.map will block until all of the threads have completed. Bummer. I could always set a timeout in the search function, but then I'm almost certain to wait for the longest-running thread to finish; furthermore, for some problems, the timeout might not be long enough.
In short, I'd like something sort of like the UNIX sockets select() call, only for arbitrary functions. Is this possible? It doesn't have to be in a pretty data-parallel abstraction, as above, and it doesn't have to be F# code, either. I'd even be happy to use a native library and call it via P/Invoke.

You can create a bunch of tasks and then use Task.WaitAny or Task.WhenAny to either synchronously wait for the first task to finish or create a task that will be completed when the first task finishes, respectively.
A simple synchronous example:
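var tasks = new List<Task<int>>();
var cts = new CancellationTokenSource();
for (int i = 0; i < 10; i++)
{
    int temp = i; // copy the loop variable so each lambda captures its own value
    tasks.Add(Task.Run(() =>
    {
        //placeholder for real work of variable time
        Thread.Sleep(1000 * temp);
        return temp;
    }, cts.Token));
}
// WaitAny returns the index of the first task to finish, not its result
int index = Task.WaitAny(tasks.ToArray());
var value = tasks[index].Result;
// Tasks that are already running must check cts.Token themselves to stop early
cts.Cancel();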
Or for an asynchronous version:
public static async Task<int> Foo()
{
    var tasks = new List<Task<int>>();
    var cts = new CancellationTokenSource();
    for (int i = 0; i < 10; i++)
    {
        int temp = i;
        tasks.Add(Task.Run(async () =>
        {
            await Task.Delay(1000 * temp, cts.Token);
            return temp;
        }));
    }
    var value = await await Task.WhenAny(tasks);
    cts.Cancel();
    return value;
}
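If some of the searches can fail rather than just run long, the first task to finish may be a faulted one. Here is a rough sketch of a helper that keeps waiting until one search actually succeeds and then cancels the rest. The names FirstResult, FirstSuccessAsync and the search delegate are made up for illustration, and the sketch assumes each search observes the token it is given:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class FirstResult
{
    // Starts one search per seed and returns the first result that completes
    // successfully, then asks the remaining searches to stop.
    // 'search' is a hypothetical delegate; it must observe the token itself,
    // and CPU-bound work should be wrapped in Task.Run by the caller.
    public static async Task<T> FirstSuccessAsync<T>(
        IEnumerable<int> seeds,
        Func<int, CancellationToken, Task<T>> search)
    {
        var cts = new CancellationTokenSource();
        // ToList() forces every search to start now.
        List<Task<T>> pending = seeds.Select(seed => search(seed, cts.Token)).ToList();
        try
        {
            while (pending.Count > 0)
            {
                Task<T> finished = await Task.WhenAny(pending);
                if (finished.Status == TaskStatus.RanToCompletion)
                    return finished.Result;
                pending.Remove(finished); // faulted or cancelled: keep waiting
            }
            throw new InvalidOperationException("No search produced a result.");
        }
        finally
        {
            cts.Cancel(); // cooperative: the losers stop at their next token check
        }
    }
}
```

A call might look like `await FirstResult.FirstSuccessAsync(Enumerable.Range(0, 100), (seed, token) => Task.Run(() => Search(seed, token), token))`, where Search is assumed to take a token and check it periodically.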

let rnd = System.Random()
let search seed = async {
    let t = rnd.Next(10000)
    //printfn "seed: %d ms: %d" seed t
    do! Async.Sleep t
    return sprintf "seed %d finish" seed
}
let processResult result = async {
    //Todo:
    printfn "%s" result
}
let cts = new System.Threading.CancellationTokenSource()
let ignoreFun _ = () //if you don't want handle
let tasks =
    [0..10]
    |> List.map (fun i ->
        async {
            let! result = search i
            do! processResult result
            cts.Cancel()
        }
    )
Async.StartWithContinuations(Async.Parallel tasks, ignoreFun, ignoreFun, ignoreFun, cts.Token)

Try synchronizing all threads using an event object: when a thread finds a solution it sets the event, and all other threads check the event state periodically and stop executing once it is set.
For more details, look here.
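As a rough sketch of the event idea in C#, assuming the search can be broken into small steps so that each worker gets a chance to look at the event: the names EventSearch, RunUntilFirst and SearchStep are invented for this example and are not part of any library.

```csharp
using System;
using System.Linq;
using System.Threading;

public static class EventSearch
{
    // Hypothetical unit of work: returns true and an answer once the search
    // for the given seed has succeeded.
    public delegate bool SearchStep(int seed, int iteration, out string answer);

    // Starts one thread per seed and blocks until the first answer arrives.
    // Note: this blocks forever if no seed ever produces an answer.
    public static string RunUntilFirst(int[] seeds, SearchStep tryStep)
    {
        using (var found = new ManualResetEventSlim(false))
        {
            string result = null;
            var threads = seeds.Select(seed => new Thread(() =>
            {
                // Each worker polls the event between steps.
                for (int iteration = 0; !found.IsSet; iteration++)
                {
                    if (tryStep(seed, iteration, out string answer))
                    {
                        // Only the first finder publishes its answer.
                        if (Interlocked.CompareExchange(ref result, answer, null) == null)
                            found.Set();
                        return;
                    }
                }
            }) { IsBackground = true }).ToList();

            threads.ForEach(t => t.Start());
            found.Wait();                   // block until the first answer
            threads.ForEach(t => t.Join()); // the others stop at their next check
            return result;
        }
    }
}
```

How quickly the other threads stop depends on how often they check the event, so the steps should be kept short.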

This seemed to work for me
namespace CancellParallelLoops
{
class Program
{
static void Main(string[] args)
{
int[] nums = Enumerable.Range(0, 10000000).ToArray();
CancellationTokenSource cts = new CancellationTokenSource();
// Use ParallelOptions instance to store the CancellationToken
ParallelOptions po = new ParallelOptions();
po.CancellationToken = cts.Token;
po.MaxDegreeOfParallelism = System.Environment.ProcessorCount;
Console.WriteLine("Press any key to start. Press 'c' to cancel.");
Console.ReadKey();
// Run a task so that we can cancel from another thread.
Task.Factory.StartNew(() =>
{
if (Console.ReadKey().KeyChar == 'c')
cts.Cancel();
Console.WriteLine("press any key to exit");
});
try
{
Parallel.ForEach(nums, po, (num) =>
{
double d = Math.Sqrt(num);
Console.WriteLine("{0} on {1}", d, Thread.CurrentThread.ManagedThreadId);
if (num == 1000) cts.Cancel();
po.CancellationToken.ThrowIfCancellationRequested();
});
}
catch (OperationCanceledException e)
{
Console.WriteLine(e.Message);
}
Console.ReadKey();
}
}
}

Related

Async method somehow yields control when it shouldn't

Consider the code below
static void Main(string[] args)
{
var ts = new CancellationTokenSource();
CancellationToken ct = ts.Token;
Task<string> task = Task.Run(() =>
{
ct.ThrowIfCancellationRequested();
var task2 = ActualAsyncTask();
while (!task2.IsCompleted)
{
var t = DateTime.Now;
while (DateTime.Now - t < TimeSpan.FromSeconds(1))
{
}
ct.ThrowIfCancellationRequested();
}
return task2;
}, ct);
Console.ReadLine();
ts.Cancel();
Console.ReadLine();
}
static async Task<string> ActualAsyncTask()
{
await Task.Delay(1000);
for(int i = 0; i < 100; ++i)
{
var t = DateTime.Now;
while (DateTime.Now - t < TimeSpan.FromSeconds(1))
{
}
Console.WriteLine("tick");
}
return "success";
}
It spawns a task that busy-waits for a cancellation request while an asynchronous method also busy-waits and prints some text into console.
When you let it run for a few seconds and then press enter, the task will throw the cancellation exception. While it's an interesting trick, I don't understand how the async method is able to yield control and make it possible.
Both the anonymous lambda within the task and the asynchronous method report to run on the same worker thread, which means they should be synchronous. In my understanding this setup should not throw the cancellation exception past the first 1 second await, because past that point there are no more awaits to allow the while loop of the anonymous lambda to gain control of the thread flow, yet somehow they seemingly both run in parallel using one thread.
If I remove the await command entirely, the execution becomes as expected - sending a cancellation request no longer interrupts the task. What am I missing?

Can I change an int value inside a task while it's running? c#

I am currently learning how to use Tasks in C#. I want to be able to run 2 tasks at the same time and then, when the first task ends, tell the code to stop the second one. I have tried many things but none have worked:
I tried looking for something like task.stop and have not found it. I am using task.Wait for the first task, so when the first one ends I have to do something to stop the second one.
Since the second one is infinite (it's an eternal loop), I tried making the loop condition depend on something I could change in the main code, but it's like the task is a method and the variables in it are local to it.
TL;DR: I want to know if I can change a parameter inside a task in order to stop it from outside its code. Does the task itself take any parameters? And can I change them in the main code after the task starts running?
If none of the previous things are possible, is it then possible in any way to stop an infinite task?
CODE:
Task a = new Task(() =>
{
int sd = 3;
while (sd < 20)
{
Console.Write("peanuts");
sd++; //this i can change cuz its like local to the task
}
});
a.Start();
// infinite task
Task b = new Task(() =>
{
int s = 3; // parameter i want to change to stop it
while (s < 10)
{
Console.Write(s+1);
}
});
b.Start();
a.Wait();
// Now here I want to stop task b
Console.WriteLine("peanuts");
Console.ReadKey();
Try this:
public static void Run()
{
CancellationTokenSource cts = new CancellationTokenSource();
Task1(cts);
Task2(cts.Token);
}
private static void Task2(CancellationToken token)
{
Task.Factory.StartNew(() =>
{
int s = 3; // parameter i want to change to stop it
while (!token.IsCancellationRequested)
{
Console.Write(s + 1);
}
}, token);
}
private static void Task1(CancellationTokenSource cts)
{
Task.Factory.StartNew(() =>
{
int sd = 3;
while (sd < 20)
{
Console.Write("peanuts");
sd++; //this i can change cuz its like local to the task
}
}).ContinueWith(t => cts.Cancel());
}
The CancellationTokenSource is cancelled when Task1 finishes. Task2 checks the cancellation token on each iteration and exits its infinite loop once cancellation is requested.

TPL DataFlow Workflow

I have just started reading about TPL Dataflow and it is really confusing for me. I have read many articles on this topic but I am unable to digest it easily. Maybe it is difficult, or maybe I haven't started to grasp the idea.
The reason I started looking into this is that I wanted to implement a scenario where parallel tasks could run but stay in order, and I found that TPL Dataflow can be used for this.
I am practicing both TPL and TPL Dataflow and am at a very beginner level, so I need help from experts who could guide me in the right direction. In the test method I wrote, I have done the following:
private void btnTPLDataFlow_Click(object sender, EventArgs e)
{
Stopwatch watch = new Stopwatch();
watch.Start();
txtOutput.Clear();
ExecutionDataflowBlockOptions execOptions = new ExecutionDataflowBlockOptions();
execOptions.MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded;
ActionBlock<string> actionBlock = new ActionBlock<string>(async v =>
{
await Task.Delay(200);
await Task.Factory.StartNew(
() => txtOutput.Text += v + Environment.NewLine,
CancellationToken.None,
TaskCreationOptions.None,
scheduler
);
}, execOptions);
for (int i = 1; i < 101; i++)
{
actionBlock.Post(i.ToString());
}
actionBlock.Complete();
watch.Stop();
lblTPLDataFlow.Text = Convert.ToString(watch.ElapsedMilliseconds / 1000);
}
Now the procedure is both parallel and asynchronous (not freezing my UI), but the output generated is not in order, whereas I have read that TPL Dataflow keeps the order of the elements by default. So my guess is that the Task I have created is the culprit and it is not outputting the strings in the correct order. Am I right?
If this is the case, then how do I make this both asynchronous and in order?
I have tried to separate the code and distribute it into different methods, but this attempt failed: only a string is output to the textbox and nothing else happens.
private async void btnTPLDataFlow_Click(object sender, EventArgs e)
{
Stopwatch watch = new Stopwatch();
watch.Start();
await TPLDataFlowOperation();
watch.Stop();
lblTPLDataFlow.Text = Convert.ToString(watch.ElapsedMilliseconds / 1000);
}
public async Task TPLDataFlowOperation()
{
var actionBlock = new ActionBlock<int>(async values => txtOutput.Text += await ProcessValues(values) + Environment.NewLine,
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded, TaskScheduler = scheduler });
for (int i = 1; i < 101; i++)
{
actionBlock.Post(i);
}
actionBlock.Complete();
await actionBlock.Completion;
}
private async Task<string> ProcessValues(int i)
{
await Task.Delay(200);
return "Test " + i;
}
I know I have written a bad piece of code but this is the first time I am experimenting with TPL Dataflow.
How do I make this Asynchronous and in order?
This is something of a contradiction. You can make concurrent tasks start in order, but you can't really guarantee that they will run or complete in order.
Let's examine your code and see what's happening.
First, you've selected DataflowBlockOptions.Unbounded. This tells TPL Dataflow that it shouldn't limit the number of tasks that it allows to run concurrently. Therefore, each of your tasks will start at more-or-less the same time, in order.
Your asynchronous operation begins with await Task.Delay(200). This will cause your method to be suspended and then resume after about 200 ms. However, this delay is not exact, and will vary from one invocation to the next. Also, the mechanism by which your code is resumed after the delay may take a variable amount of time. Because of this random variation in the actual delay, the next bit of code to run is no longer in order, resulting in the discrepancy you're seeing.
You might find this example interesting. It's a console application to simplify things a bit.
class Program
{
static void Main(string[] args)
{
OutputNumbersWithDataflow();
OutputNumbersWithParallelLinq();
Console.ReadLine();
}
private static async Task HandleStringAsync(string s)
{
await Task.Delay(200);
Console.WriteLine("Handled {0}.", s);
}
private static void OutputNumbersWithDataflow()
{
var block = new ActionBlock<string>(
HandleStringAsync,
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded });
for (int i = 0; i < 20; i++)
{
block.Post(i.ToString());
}
block.Complete();
block.Completion.Wait();
}
private static string HandleString(string s)
{
// Perform some computation on s...
Thread.Sleep(200);
return s;
}
private static void OutputNumbersWithParallelLinq()
{
var myNumbers = Enumerable.Range(0, 20).AsParallel()
.AsOrdered()
.WithExecutionMode(ParallelExecutionMode.ForceParallelism)
.WithMergeOptions(ParallelMergeOptions.NotBuffered);
var processed = from i in myNumbers
select HandleString(i.ToString());
foreach (var s in processed)
{
Console.WriteLine(s);
}
}
}
The first set of numbers is calculated using a method rather similar to yours—with TPL Dataflow. The numbers are out-of-order.
The second set of numbers, output by OutputNumbersWithParallelLinq(), doesn't use Dataflow at all. It relies on the Parallel LINQ features built into .NET. This runs my HandleString() method on background threads, but keeps the data in order through to the end.
The limitation here is that PLINQ doesn't let you supply an async method. (Well, you could, but it wouldn't give you the desired behavior.) HandleString() is a conventional synchronous method; it just gets executed on a background thread.
And here's a more complex Dataflow example that does preserve the correct order:
private static void OutputNumbersWithDataflowTransformBlock()
{
Random r = new Random();
var transformBlock = new TransformBlock<string, string>(
async s =>
{
// Make the delay extra random, just to be sure.
await Task.Delay(160 + r.Next(80));
return s;
},
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded });
// For a GUI application you should also set the
// scheduler here to make sure the output happens
// on the correct thread.
var outputBlock = new ActionBlock<string>(
s => Console.WriteLine("Handled {0}.", s),
new ExecutionDataflowBlockOptions
{
SingleProducerConstrained = true,
MaxDegreeOfParallelism = 1
});
transformBlock.LinkTo(outputBlock, new DataflowLinkOptions { PropagateCompletion = true });
for (int i = 0; i < 20; i++)
{
transformBlock.Post(i.ToString());
}
transformBlock.Complete();
outputBlock.Completion.Wait();
}
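For a small, fixed set of inputs, the same "run concurrently, output in order" effect can also be sketched with plain tasks and no Dataflow at all: start every operation first, then await the tasks in input order. This is only an illustration; OrderedAsync, ProcessInOrderAsync and the emit callback are names invented here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class OrderedAsync
{
    // Starts every operation immediately, then awaits the tasks in input
    // order, so results are emitted in order even if the operations
    // complete out of order.
    public static async Task<List<TOut>> ProcessInOrderAsync<TIn, TOut>(
        IEnumerable<TIn> inputs,
        Func<TIn, Task<TOut>> processAsync,
        Action<TOut> emit)
    {
        // ToList() forces all tasks to start now rather than lazily.
        List<Task<TOut>> running = inputs.Select(processAsync).ToList();
        var results = new List<TOut>();
        foreach (Task<TOut> task in running)
        {
            TOut value = await task; // waits only for the next item in line
            emit(value);
            results.Add(value);
        }
        return results;
    }
}
```

When called from a UI event handler, the code after each await resumes on the UI thread by default, so emit could append to a textbox directly. Unlike a Dataflow pipeline, this has no throttling: everything starts at once.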

creating a .net async wrapper to a sync request

I have the following situation (or a basic misunderstanding with the async await mechanism).
Assume you have a set of 1-20 web request calls that each take a long time: findItemsByProduct().
You want to wrap them in an async request that abstracts all these calls into one async call, but I can't seem to do it without using more threads.
If I'm doing:
int total = result.paginationOutput.totalPages;
for (int i = 2; i < total + 1; i++)
{
await Task.Factory.StartNew(() =>
{
result = client.findItemsByProduct(i);
});
newList.AddRange(result.searchResult.item);
}
}
return newList;
The problem here is that the calls don't run together; they run one after another.
I would like all the calls to run together and then harvest the results.
As pseudocode, I would like the code to run like this:
forEach item {
result = item.makeWebRequest();
}
foreach item {
List.addRange(item.harvestResults);
}
I have no idea how to make the code do that, though.
Ideally, you should add a findItemsByProductAsync that returns a Task<Item[]>. That way, you don't have to create unnecessary tasks using StartNew or Task.Run.
Then your code can look like this:
int total = result.paginationOutput.totalPages;
// Start all downloads; each download is represented by a task.
Task<Item[]>[] tasks = Enumerable.Range(2, total - 1)
.Select(i => client.findItemsByProductAsync(i)).ToArray();
// Wait for all downloads to complete.
Item[][] results = await Task.WhenAll(tasks);
// Flatten the results into a single collection.
return results.SelectMany(x => x).ToArray();
Given your requirements which I see as:
Process n number of non-blocking tasks
Process results after all queries have returned
I would use the CountdownEvent for this e.g.
int total = result.paginationOutput.totalPages;
var results = new ConcurrentBag<ItemType>();
// Pages 2..total are still to be fetched, so total - 1 signals are expected
using (var e = new CountdownEvent(total - 1))
{
    for (int i = 2; i <= total; i++)
    {
        int page = i; // copy the loop variable so each lambda sees its own page
        Task.Factory.StartNew(() => client.findItemsByProduct(page))
            .ContinueWith(t =>
            {
                // ConcurrentBag has no AddRange, so add the items one by one
                foreach (var item in t.Result.searchResult.item)
                    results.Add(item);
                e.Signal(); // signal task is done
            });
    }
    // Wait for all requests to complete
    e.Wait();
}
// Process results
foreach (var item in results)
{
    ...
}
This particular problem is solved easily enough without even using await. Simply create each of the tasks, put all of the tasks into a list, and then use WhenAll on that list to get a task that represents the completion of all of those tasks:
public static Task<Item[]> Foo()
{
    int total = result.paginationOutput.totalPages;
    var tasks = new List<Task<Item>>();
    for (int i = 2; i < total + 1; i++)
    {
        int page = i; // copy the loop variable; the lambda may run after i has changed
        tasks.Add(Task.Factory.StartNew(() => client.findItemsByProduct(page)));
    }
    return Task.WhenAll(tasks);
}
Also note you have a major problem in how you use result in your code. You're having each of the different tasks all using the same variable, so there are race conditions as to whether or not it works properly. You could end up adding the same call twice and having one skipped entirely. Instead you should have the call to findItemsByProduct be the result of the task, and use that task's Result.
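To make that concrete, here is a small sketch in which each task returns its own page of results instead of writing to a shared variable. PageFetcher, FetchAllAsync and fetchPage are placeholder names; fetchPage stands in for a blocking call like client.findItemsByProduct, and its signature is an assumption.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class PageFetcher
{
    // Fetches pages firstPage..lastPage concurrently and returns all items,
    // in page order. Each task's return value is its result; nothing is
    // written to shared state.
    public static async Task<TItem[]> FetchAllAsync<TItem>(
        int firstPage, int lastPage, Func<int, TItem[]> fetchPage)
    {
        var tasks = new Task<TItem[]>[Math.Max(0, lastPage - firstPage + 1)];
        for (int i = firstPage; i <= lastPage; i++)
        {
            int page = i; // per-iteration copy; capturing 'i' directly would race with i++
            tasks[page - firstPage] = Task.Run(() => fetchPage(page));
        }
        // WhenAll returns the results in the same order as the tasks array.
        TItem[][] pages = await Task.WhenAll(tasks);
        return pages.SelectMany(p => p).ToArray();
    }
}
```

This still uses one thread pool thread per in-flight call; a truly asynchronous client method would avoid that.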
If you want to use async-await properly, you have to declare your functions async, and the functions that call you also have to be async. This continues until you have one synchronous function that starts the async process.
Your function would look like this:
By the way, you didn't describe what's in the list. I assume they are objects of type T. In that case result.SearchResult.Item returns IEnumerable<T>.
private async Task<List<T>> FindItems(...)
{
    int total = result.paginationOutput.totalPages;
    var newList = new List<T>();
    for (int i = 2; i < total + 1; i++)
    {
        int page = i;
        // assumes searchResult.item is an IEnumerable<T>
        IEnumerable<T> items = await Task.Factory.StartNew<IEnumerable<T>>(() =>
        {
            return client.findItemsByProduct(page).searchResult.item;
        });
        newList.AddRange(items);
    }
    return newList;
}
If you do it this way, your function will be asynchronous, but the findItemsByProduct will be executed one after another. If you want to execute them simultaneously you should not await for the result, but start the next task before the previous one is finished. Once all tasks are started wait until all are finished. Like this:
private async Task<List<T>> FindItems(...)
{
    int total = result.paginationOutput.totalPages;
    var tasks = new List<Task<IEnumerable<T>>>();
    // start all tasks. don't wait for the result yet
    for (int i = 2; i < total + 1; i++)
    {
        int page = i; // copy the loop variable so each task captures its own page number
        Task<IEnumerable<T>> task = Task.Factory.StartNew<IEnumerable<T>>(() =>
        {
            // assumes searchResult.item is an IEnumerable<T>
            return client.findItemsByProduct(page).searchResult.item;
        });
        tasks.Add(task);
    }
    // now that all tasks are started, wait until all are finished
    await Task.WhenAll(tasks);
    // the result of each task is now in task.Result
    // the type of result is IEnumerable<T>
    // put all into one big list using some linq:
    return tasks.SelectMany(task => task.Result).ToList();
    // if you're not familiar with linq yet, use a foreach instead:
    // var newList = new List<T>();
    // foreach (var task in tasks)
    // {
    //     newList.AddRange(task.Result);
    // }
    // return newList;
}

Anonymous Parallel Task Timers?

Maybe my brain is a bit fried so I'm missing some nice way to do the following... I want to be able to launch a timer through a Task that runs at a certain interval and, on each tick, checks some condition to decide whether it should cancel itself. What's the most elegant solution?
Optimally I'd like something like:
Task.Factory.StartNew(() =>
{
    Timer.Do(TimeSpan.FromMilliseconds(200), () => ShouldCancel(), () =>
    {
        //DoStuff
    });
});
Using a while/Thread.Sleep loop doesn't seem optimal. I guess I could define and use an ordinary timer, but it seems a bit clunky...
How about something like the following? I'm sure the API could be cleaned up a bit.
Points to note:
The DoWork method must support cooperative cancellation, this is the only cancellation approach supported by the Task Parallel Library.
The timer must start inside the Task, otherwise the Task may be created and scheduled but not executed and the timer will be timing task wait time not execution time.
If you want to provide other external mechanisms for cancellation (other tokens) then you need to pass in another context and link them. See: CancellationTokenSource.CreateLinkedTokenSource
This is only approximate as System.Threading.Timer only has millisecond accuracy. It should be good enough for limiting a Task to run for a few seconds.
public static class TimeLimitedTaskFactory
{
    public static Task StartNew(Action<CancellationToken> action, int maxTime)
    {
        Task tsk = Task.Factory.StartNew(() =>
        {
            var cts = new CancellationTokenSource();
            // One-shot timer: fires once after maxTime ms and requests cancellation
            using (var timer = new System.Threading.Timer(o =>
            {
                cts.Cancel();
                Console.WriteLine("Cancelled!");
            }, null, maxTime, Timeout.Infinite))
            {
                action(cts.Token);
            }
        });
        return tsk;
    }
}
class Program
{
    static void Main(string[] args)
    {
        int maxTime = 2000;
        int maxWork = 10;
        Task tsk = TimeLimitedTaskFactory
            .StartNew((ctx) => DoWork(ctx, maxWork), maxTime);
        Console.WriteLine("Waiting on Task...");
        tsk.Wait();
        Console.WriteLine("Finished...");
        Console.ReadKey();
    }
    static void DoWork(CancellationToken ctx, int workSize)
    {
        int i = 0;
        while (!ctx.IsCancellationRequested && i < workSize)
        {
            Thread.Sleep(500);
            Console.WriteLine(" Working on {0}", ++i);
        }
    }
}
You can also use the Rx library.
var timerTask = Observable.Timer(TimeSpan.Zero, TimeSpan.FromSeconds(3));
timerTask.Subscribe(x =>
{
//Do stuff here
});
I think this is what you want:
var cancelToken = new CancellationTokenSource();
var tt = Task.Factory.StartNew(obj =>
{
var tk = (CancellationTokenSource) obj;
while (!tk.IsCancellationRequested)
{
if (condition)//your condition
{
//Do work
}
Thread.Sleep(1000);
}
}, cancelToken);
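If blocking a thread with Thread.Sleep is the concern, the loop can also be written with Task.Delay. Below is a minimal sketch of something close to the Timer.Do shape from the question; PeriodicTask and RunAsync are made-up names, not an existing API.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class PeriodicTask
{
    // Runs 'action' every 'interval' until 'shouldStop' returns true or the
    // token is cancelled. Task.Delay frees the thread between ticks.
    public static async Task RunAsync(
        TimeSpan interval,
        Func<bool> shouldStop,
        Action action,
        CancellationToken token = default(CancellationToken))
    {
        while (!shouldStop())
        {
            token.ThrowIfCancellationRequested();
            action();
            await Task.Delay(interval, token);
        }
    }
}
```

Usage would be along the lines of `PeriodicTask.RunAsync(TimeSpan.FromMilliseconds(200), () => ShouldCancel(), () => { /* DoStuff */ })`. The time between ticks is the interval plus however long the action takes, so this is not a precise timer.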
