Running multiple Tasks reuses the same object instance - C#

Here's an interesting one. I have a service creating a bunch of Tasks. At the moment only two tasks are configured in the list. However, if I put a breakpoint within the Task action and inspect the value of schedule.Name, it is hit twice with the same schedule name, even though two separate schedules are configured and present in the schedule list. Can anyone explain why the Task reuses the last schedule in the loop? Is this a scope issue?
// make sure that we can log any exceptions thrown by the tasks
TaskScheduler.UnobservedTaskException += new EventHandler<UnobservedTaskExceptionEventArgs>(TaskScheduler_UnobservedTaskException);

// kick off all enabled tasks
foreach (IJobSchedule schedule in _schedules)
{
    if (schedule.Enabled)
    {
        Task.Factory.StartNew(() =>
            {
                // breakpoint at line below. Inspecting "schedule.Name" always returns the name
                // of the last schedule in the list. List contains 2 separate schedule items.
                IJob job = _kernel.Get<JobFactory>().CreateJob(schedule.Name);
                JobRunner jobRunner = new JobRunner(job, schedule);
                jobRunner.Run();
            },
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
    }
} // next schedule

If you copy the loop variable into a temporary variable inside the foreach loop, it should solve your issue: the lambda then captures the per-iteration copy instead of the shared loop variable.
foreach (IJobSchedule schedule in _schedules)
{
    var tmpSchedule = schedule;
    if (tmpSchedule.Enabled)
    {
        Task.Factory.StartNew(() =>
            {
                // the lambda now captures tmpSchedule, a fresh variable per iteration,
                // so each task sees the schedule it was created for
                IJob job = _kernel.Get<JobFactory>().CreateJob(tmpSchedule.Name);
                JobRunner jobRunner = new JobRunner(job, tmpSchedule);
                jobRunner.Run();
            },
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
    }
} // next schedule
For further reference about closures and loop variables, see Eric Lippert's "Closing over the loop variable considered harmful".
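As a rough standalone illustration of the capture behaviour that article describes (a for loop is used here because, from C# 5 onward, the foreach variable is scoped per iteration and no longer shows the problem):
var actions = new List<Action>();
for (int i = 0; i < 2; i++)
{
    // the lambda captures the variable i itself, not its value at this point
    actions.Add(() => Console.WriteLine(i));
}

// the loop has finished by the time the delegates run, so i is 2
actions.ForEach(a => a()); // prints "2" twice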

Related

How to wait for a function execution for a specific duration

My C# application stops responding for a long time; when I break into the debugger, it is stopped inside a function.
foreach (var item in list)
{
    xmldiff.Compare(item, secondary, output);
    ...
}
I guess the running time of this function is long, or it hangs. Either way, I want to wait a certain time (e.g. 5 seconds) for the execution of this function, and if it exceeds this time, skip it and go to the next item in the loop. How can I do it? I found some similar questions, but they are mostly about processes or asynchronous methods.
You can do it the brutal way: spin up a thread to do the work, join it with a timeout, and abort it if the join times out.
Example:
var worker = new Thread(() => { xmldiff.Compare(item, secondary, output); });
worker.Start();
if (!worker.Join(TimeSpan.FromSeconds(1)))
    worker.Abort();
But be warned: aborting threads is not considered nice and can make your app unstable. If at all possible, try to modify Compare to accept a CancellationToken so the comparison can be cancelled cooperatively.
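A rough sketch of what that cooperative approach could look like, assuming Compare were changed to take a token and to check it periodically (the extra parameter shown here is hypothetical, not the real XmlDiff API):
var cts = new CancellationTokenSource(TimeSpan.FromSeconds(5)); // auto-cancels after 5 seconds
try
{
    // hypothetical overload: Compare would have to poll the token inside its main loop,
    // e.g. via token.ThrowIfCancellationRequested()
    xmldiff.Compare(item, secondary, output, cts.Token);
}
catch (OperationCanceledException)
{
    // the comparison was cancelled in time; move on to the next item
}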
I would avoid directly using threads and use Microsoft's Reactive Extensions (NuGet "Rx-Main") to abstract away the management of the threads.
I don't know the exact signature of xmldiff.Compare(item, secondary, output) but if I assume it produces an integer then I could do this with Rx:
var query =
    from item in list.ToObservable()
    from result in
        Observable
            .Start(() => xmldiff.Compare(item, secondary, output))
            .Timeout(TimeSpan.FromSeconds(5.0), Observable.Return(-1))
    select new { item, result };

var subscription =
    query
        .Subscribe(x =>
        {
            /* do something with `x.item` and/or `x.result` */
        });
This automatically iterates through each item and starts a background computation of xmldiff.Compare, but allows each computation to take at most 5.0 seconds before returning a default value of -1.
The subscription variable is an IDisposable, so if you want to abort the entire query before it completes just call .Dispose().
"I skip it and go to the next item in the loop"
By "skip it", do you mean "leave it there" or "cancel it"? The two scenarios are quite different, but in both cases I suggest you use Task.
// generate 10 example tasks
var tasks = Enumerable
    .Range(0, 10)
    .Select(n => new Task(() => DoSomething(n)))
    .ToList();

var maxExecutionTime = TimeSpan.FromSeconds(5);

foreach (var task in tasks)
{
    task.Start(); // the task was created cold, so it has to be started explicitly
    if (task.Wait(maxExecutionTime))
    {
        // the task finished in time
    }
    else
    {
        // the task is over time
        // just leave it there
        // the loop continues
        // if you want to cancel it, see
        // http://stackoverflow.com/questions/4783865/how-do-i-abort-cancel-tpl-tasks
    }
}
One thing to consider: do you really need to run your tasks one by one? If they are independent, you can run them in parallel, as in the sketch below.
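A minimal sketch of the parallel variant, assuming the work items are independent (DoSomething is the same placeholder as above):
var tasks = Enumerable
    .Range(0, 10)
    .Select(n => Task.Factory.StartNew(() => DoSomething(n)))
    .ToArray();

// one shared deadline for the whole batch; tasks that overrun keep running in the background
bool allFinishedInTime = Task.WaitAll(tasks, TimeSpan.FromSeconds(5));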

Create a list of ActionBlock<T> that will complete when any one fails

In a scenario where await may be called on an 'empty' list of tasks: how do I await a list of Task<T>, and then add new tasks to the awaited list until one of them fails or completes?
I am sure there must be an Awaiter or CancellationTokenSource solution for this problem.
public class LinkerThingBob
{
    private List<Task> ofmyactions = new List<Task>();

    public void LinkTo<T>(BufferBlock<T> messages) where T : class
    {
        var action = new ActionBlock<IMsg>(_ => this.Tx(messages, _));

        // this would not actually work, because the WhenAny
        // will not include subsequent actions.
        ofmyactions.Add(action.Completion);

        // link the new action block.
        this._inboundMessageBuffer.LinkTo(block);
    }

    // used to catch exceptions since these blocks typically don't end.
    public async Task CompletionAsync()
    {
        // how do i make the awaiting thread add a new action
        // to the list of waiting tasks without interrupting it
        // or graciously interrupting it to let it know there's one more
        // more importantly, this CompletionAsync might actually be called
        // before the first action is added to the list, so I actually need
        // WhenAny(INFINITE + ofmyactions)
        await Task.WhenAny(ofmyactions);
    }
}
My problem is that I need a mechanism where I can add each of the action instances created above to a Task<T> that will complete when there is an exception.
I am not sure how best to explain this but:
The task must not complete until at least one call to LinkTo<T> has been made, so I need to start with an infinite task
each time LinkTo<T> is called, the new action must be added to the list of tasks, which may already be awaited on in another thread.
There isn't anything built-in for this, but it's not too hard to build one using TaskCompletionSource<T>. TCS is the type to use when you want to await something and there isn't already a construct for it. (Custom awaiters are for more advanced scenarios).
In this case, something like this should suffice:
public class LinkerThingBob
{
    private readonly TaskCompletionSource<object> _tcs = new TaskCompletionSource<object>();

    private async Task ObserveAsync(Task task)
    {
        try
        {
            await task;
            _tcs.TrySetResult(null);
        }
        catch (Exception ex)
        {
            _tcs.TrySetException(ex);
        }
    }

    public void LinkTo<T>(BufferBlock<T> messages) where T : class
    {
        var action = new ActionBlock<IMsg>(_ => this.Tx(messages, _));
        var _ = ObserveAsync(action.Completion);
        this._inboundMessageBuffer.LinkTo(block);
    }

    public Task Completion { get { return _tcs.Task; } }
}
Completion starts in a non-completed state. Any number of blocks can be linked to it using ObserveAsync. As soon as one of the blocks completes, Completion also completes. I wrote ObserveAsync here in a way so that if the first completed block completes without error, then so will Completion; and if the first completed block completes with an exception, then Completion will complete with that same exception. Feel free to tweak for your specific needs. :)
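A quick usage sketch under the same assumptions as the class above (bufferA and bufferB are hypothetical BufferBlock<IMsg> instances, and the await is assumed to run inside an async method):
var linker = new LinkerThingBob();
linker.LinkTo(bufferA);
linker.LinkTo(bufferB);

// completes as soon as the first linked block completes;
// if that block faulted, this await rethrows its exception
await linker.Completion;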
This is a solution that uses exclusively tools of the TPL Dataflow library itself. You can create a TransformBlock that will "process" the ActionBlocks you want to observe. Processing a block simply means awaiting its completion. So the TransformBlock takes incomplete blocks and outputs the same blocks as completed. The TransformBlock must be configured with unlimited parallelism and capacity, and with ordering disabled, so that all blocks are observed concurrently and each one that completes is returned instantly.
var allBlocks = new TransformBlock<ActionBlock<IMsg>, ActionBlock<IMsg>>(async block =>
{
    try { await block.Completion; }
    catch { }
    return block;
}, new ExecutionDataflowBlockOptions()
{
    MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded,
    EnsureOrdered = false
});
Then inside the LinkerThingBob.LinkTo method, send the created ActionBlocks to the TransformBlock.
var actionBlock = new ActionBlock<IMsg>(_ => this.Tx(messages, _));
allBlocks.Post(actionBlock);
Now you need a target to receive the first faulted block. A WriteOnceBlock is quite suitable for this role, since it ensures that at most one faulted block will be received.
var firstFaulted = new WriteOnceBlock<ActionBlock<IMsg>>(x => x);
allBlocks.LinkTo(firstFaulted, block => block.Completion.IsFaulted);
Finally you can await at any place for the completion of the WriteOnceBlock. It will complete immediately after receiving a faulted block, or it may never complete if it never receives a faulted block.
await firstFaulted.Completion;
After the awaiting you can also get the faulted block if you want.
ActionBlock<IMsg> faultedBlock = firstFaulted.Receive();
The WriteOnceBlock is special in how it behaves when it forwards messages. Unlike most other blocks, you can call its Receive method multiple times, and you'll always get the same single item it contains (the item is not removed from its buffer after the first Receive).

C# TPL application stops running

I have an application that was working and has a loop with a variable number of iterations, with one function call inside the loop. I then tried to change the program to launch the function as a separate thread. I set up a unit test to run, and the application stops running before completing any work.
I have set the loop to one iteration and debugged the single thread. It stops running near the top of the function, not always on the same line, but in the same area, where I try to make a copy of an object that holds a data table and data rows whose selection can be changed in each thread. The following is the code; it consistently stops when debugging in this area, but the exact line varies.
// main thread called by unit test
...
for(...
{
    Task compute = Task.Factory.StartNew(() => results.Add(Compute(originalObject)));
}
...

private ReturnObject Compute(MyObject originalObject)
{
    ...
    // near top of function after some assignment statements
    // of some string and boolean variables
    MyObject myObject = originalObject.Copy();
    // never makes it to the next line
    ...
}

// MyObject class
private MyObject(DataTable dtTable)
{
    _dataService = new DataService();
    _dataTable = dtTable.Copy();
    _dataRows = _dataTable.Select();
}

public MyObject Copy()
{
    MyObject copy = new MyObject(_dataTable);
    return copy;
}

// DataService class
public DataService()
{
    _oleDbConnection = null;
}
You do not appear to Wait for the tasks that you create to complete: you must either call the Wait method or access the Result property of a generic task to block the calling thread until the work is complete. Try the following:
var tasks = new List<Task>();
for ...
{
    Task compute = Task.Factory.StartNew(() => results.Add(Compute(originalObject)));
    tasks.Add(compute);
}
Task.WaitAll(tasks.ToArray());

How to wait on all tasks (created task and subtask) without using TaskCreationOptions.AttachedToParent

I have to create concurrent software which creates several Tasks, and every Task could generate another Task (which could itself generate yet another Task, and so on).
I need the call to the method which launches the tasks to be blocking: no return BEFORE all tasks and subtasks are completed.
I know there is the TaskCreationOptions.AttachedToParent option, but I think it will not fit:
The server will have at least 8 cores, and each task will create 2-3 subtasks, so if I set the AttachedToParent option, my impression is that the second subtask will not start before the three tasks of the first subtask have ended. So I would have limited multitasking here.
So, with a process tree where A launches B and C, and B launches E, F and G:
My impression is that if I set the AttachedToParent option every time I launch a task, B will not end before E, F and G are finished, so C will not start before B finishes, and I will have only 3 active threads instead of the 8 I could have.
If I don't set the AttachedToParent option, A will finish very quickly and return.
So how can I ensure that my 8 cores are always fully used if I don't set this option?
TaskCreationOptions.AttachedToParent does not prevent the other subtasks from starting; rather, it prevents the parent task itself from being flagged as completed. So when E, F and G are started with AttachedToParent, B is not flagged as finished until all three are finished. It should do just what you want.
The source (in the accepted answer).
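A minimal sketch of that behaviour (DoChildWork is just an illustrative placeholder):
var parent = Task.Factory.StartNew(() =>
{
    // attached child: the parent is not considered complete until this child finishes,
    // but nothing stops sibling tasks from running concurrently
    Task.Factory.StartNew(() => DoChildWork(),
        CancellationToken.None,
        TaskCreationOptions.AttachedToParent,
        TaskScheduler.Default);
});

parent.Wait(); // returns only after the attached child has finished as well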
As Me.Name mentioned, AttachedToParent doesn't behave according to your impressions. I think it's a fine option in this case.
But if you don't want to use that for whatever reason, you can wait for all the child tasks to finish with Task.WaitAll(), although that means you have to have all of them in a collection.
Task.WaitAll() blocks the current thread until all the Tasks are finished. If you don't want that and you are on .NET 4.5, you can use Task.WhenAll(), which returns a single Task that finishes when all of the given Tasks finish.
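A minimal sketch of that approach, assuming the child tasks are collected as they are created (DoWorkB and DoWorkC are illustrative placeholders):
var children = new List<Task>
{
    Task.Factory.StartNew(() => DoWorkB()),
    Task.Factory.StartNew(() => DoWorkC())
};

Task.WaitAll(children.ToArray());   // blocks the calling thread until every child finishes

// or, on .NET 4.5+, without blocking the caller:
// await Task.WhenAll(children);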
You could use TaskFactory creation options, as in this example:
Task parent = new Task(() =>
{
    var cts = new CancellationTokenSource();
    var tf = new TaskFactory<Int32>(cts.Token,
                                    TaskCreationOptions.AttachedToParent,
                                    TaskContinuationOptions.ExecuteSynchronously,
                                    TaskScheduler.Default);

    // This task creates and starts 3 child tasks
    var childTasks = new[] {
        tf.StartNew(() => Sum(cts.Token, 10000)),
        tf.StartNew(() => Sum(cts.Token, 20000)),
        tf.StartNew(() => Sum(cts.Token, Int32.MaxValue)) // Too big, throws Overflow
    };

    // If any of the child tasks throw, cancel the rest of them
    for (Int32 task = 0; task < childTasks.Length; task++)
        childTasks[task].ContinueWith(
            t => cts.Cancel(), TaskContinuationOptions.OnlyOnFaulted);

    // When all children are done, get the maximum value returned from the
    // non-faulting/canceled tasks. Then pass the maximum value to another
    // task which displays the maximum result
    tf.ContinueWhenAll(
        childTasks,
        completedTasks => completedTasks.Where(
            t => !t.IsFaulted && !t.IsCanceled).Max(t => t.Result),
        CancellationToken.None)
      .ContinueWith(t => Console.WriteLine("The maximum is: " + t.Result),
        TaskContinuationOptions.ExecuteSynchronously);
});

// When the children are done, show any unhandled exceptions too
parent.ContinueWith(p =>
{
    // I put all this text in a StringBuilder and call Console.WriteLine just once
    // because this task could execute concurrently with the task above & I don't
    // want the tasks' output interspersed
    StringBuilder sb = new StringBuilder(
        "The following exception(s) occurred:" + Environment.NewLine);
    foreach (var e in p.Exception.Flatten().InnerExceptions)
        sb.AppendLine("   " + e.GetType().ToString());
    Console.WriteLine(sb.ToString());
}, TaskContinuationOptions.OnlyOnFaulted);

// Start the parent Task so it can start its children
parent.Start();

Launching multiple threads: why must you wait?

I have been playing around with Threads and Tasks (.NET 4) and noticed some odd behavior when you launch multiple threads without waiting a few milliseconds between each call to Start.
The example below, when run, does not output what I expected:
1
2
1
2
But instead only outputs:
2
2
2
2
Below is the code that I am running.
public static void Main()
{
    var items = new[] { "1", "2" };
    foreach (var item in items)
    {
        var thread = new Thread(() => Print(item));
        thread.Start();
        //var task = Task.Factory.StartNew(() => Print(item));
    }
}

static void Print(string something)
{
    while (true)
    {
        Console.WriteLine(something);
        Thread.Sleep(1000);
    }
}
Only when I call Thread.Sleep(50) after thread.Start() does the output look as expected:
1
2
1
2
My question is:
Why, when you do not wait between launching the two threads, does the first thread lose the parameter value you initially started it with?
I.e. the first thread is launched with a parameter of "1" and the second thread with a parameter of "2", yet the first thread's parameter becomes "2" as well. This makes no sense to me, especially since the Print() method's parameter is a string passed by value.
Google "access to modified closure". What's happening is that your local variable "item" has its value changed before the Print function is invoked. A solution would be to create a new variable inside the scope of the loop and assign item to it.
item is evaluated at the time the thread you created actually starts, because of C# closures. Another way to force item to be evaluated is to introduce a variable inside the loop so that the closure captures it, like so:
foreach (var item in items)
{
    var closedItem = item;
    var thread = new Thread(() => Print(closedItem));
    thread.Start();
}
Your problem is not with threads. Your problem is with the closure and the foreach. You can read here why:
http://blogs.msdn.com/b/ericlippert/archive/2009/11/12/closing-over-the-loop-variable-considered-harmful.aspx
When you play with the timing of the threads, you also reorder the timing of the main thread, so sometimes a loop iteration completes before the new thread's Print method runs, and sometimes after.
Look at the thread-starting code and you'll find that you are not passing a constant string but a captured variable, and between the calls to Start you are changing that variable.
