How does the Parallel.Invoke() method work? [duplicate] - c#

I am using the task parallel library like this in a .aspx page:
Parallel.Invoke(
    new Action[]
    {
        () => { users = service.DoAbc(); },
        () => { products = service.DoDef(); }
    });
Previously I was firing off a thread per call, and it was more responsive than it is now when I use Parallel.Invoke.
Should I assume the TPL library will do what's best or is there a way for me to tweak it so it actually does calls in parallel?
I guess it comes down to the type of hardware my website is running on, which I believe is a VM.
Each of my calls makes an HTTP request to fetch results from an API.

Parallel.Invoke will run your methods in parallel unless this is more expensive than running them sequentially, or there are no available threads in the thread pool. This is an optimization, not an issue. Under normal circumstances you shouldn't try to second-guess the framework; just let it do its job.
You should consider overriding this behavior if you want to invoke long-running IO-bound methods. Parallel.Invoke uses the default TaskScheduler, which runs on the thread pool and aims for roughly one thread per core to avoid overloading the CPU. Overloading the CPU isn't a concern, though, if your actions just wait for some IO or network call to complete.
You can specify the maximum number of threads using the Parallel.Invoke(ParallelOptions, Action[]) overload. You can also use the ParallelOptions class to pass a cancellation token or specify a custom TaskScheduler, e.g. one that allows you to use more threads than the default scheduler.
You can rewrite your code like this:
Parallel.Invoke(
    new ParallelOptions { MaxDegreeOfParallelism = 30 },
    new Action[]
    {
        () => { users = service.DoAbc(); },
        () => { products = service.DoDef(); }
    });
Still, you should not try to modify the default options unless you find an actual performance problem. You may end up oversubscribing your CPU and causing delays or thrashing.

You could fire off a couple tasks to handle the calls.
// Change to Task.Factory.StartNew depending on .NET version
var userTask = Task.Run(() => service.DoAbc());
var productsTask = Task.Run(() => service.DoDef());
Task.WaitAll(userTask, productsTask);
users = userTask.Result;
products = productsTask.Result;
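If the service could also expose awaitable versions of these calls, you could avoid blocking any thread while waiting. A minimal sketch, where the DoAbcAsync/DoDefAsync names and the stub Service class are hypothetical stand-ins, not the question's actual API:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical async counterparts to the question's DoAbc/DoDef.
class Service
{
    public async Task<string> DoAbcAsync() { await Task.Delay(50); return "users"; }
    public async Task<string> DoDefAsync() { await Task.Delay(50); return "products"; }
}

static class Demo
{
    // Start both calls, then await both; no thread is blocked while waiting.
    public static async Task<(string users, string products)> FetchAsync(Service service)
    {
        var userTask = service.DoAbcAsync();
        var productsTask = service.DoDefAsync();
        await Task.WhenAll(userTask, productsTask);
        return (await userTask, await productsTask);
    }
}
```

In an async action method you would await FetchAsync directly instead of calling Task.WaitAll, which keeps the request thread free while the HTTP calls are in flight.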

Related

Run work async on specific thread, and continue

I need to fire off some Tasks to run, but I want them to be on SPECIFIC (the same) threads, every time they run. I don't know how to get that to happen except to perhaps instantiate a SingleThreadTaskScheduler (of my own creation). I am getting frames from a capture source, and I want to split off processing work onto parallel threads, but I need the parallel threads to operate on the frames in order. And for that to happen, they have to be the same thread as I fed last time, per processing pipeline. For instance, I have parallel processing pipelines A, B, and C. I need to feed A, B, and C each time I get a frame. They operate in parallel.
I saw another example on StackOverflow about how to create a single thread task scheduler, but it doesn't explain how I would be allowed to await the result and keep chugging in my current thread.
Here's the function I sort of need to execute. Task.Run() needs to be replaced by firing off x.LongRunningAsync() on a specific thread, not just some random one from the thread pool! That is, one specific thread PER item in this.Steps. The same thread needs to be called per call of DoParallelStuff. DoParallelStuff is called many times. The caller of this function wants to go off and do other stuff while these things are executing in parallel.
public async Task<bool> DoParallelStuff()
{
    var tasks = this.Steps.Select(x => Task.Run(() => x.LongRunningAsync()));
    var results = await Task.WhenAll(tasks);
    var completed = !results.Any() || results.All(x => x == true);
    this.OnCompleted();
    return completed;
}
This problem can be solved with a custom TaskScheduler, but can also be solved with a custom SynchronizationContext. For example you could install Stephen Cleary's Nito.AsyncEx.Context package, and do this:
var tasks = this.Steps.Select(step => Task.Factory.StartNew(() =>
{
    AsyncContext.Run(async () =>
    {
        await step.LongRunningAsync();
    });
}, default, TaskCreationOptions.LongRunning, TaskScheduler.Default));
A dedicated thread is going to be launched for each step in this.Steps. Most probably this thread is going to be blocked for most of the time, while the LongRunningAsync is doing asynchronous stuff internally, but the continuations between the await points will be invoked on this thread.
It is important that all the await points inside the LongRunningAsync method are capturing the SynchronizationContext. A single ConfigureAwait(false) will cause the single-thread policy to fail.
And here is how a SingleThreadTaskScheduler can be used instead:
var tasks = this.Steps.Select(step => Task.Factory.StartNew(async () =>
{
    await step.LongRunningAsync();
}, default, TaskCreationOptions.None, new SingleThreadTaskScheduler()).Unwrap());
Pay attention to the Unwrap call, it's very important.
The previous note, regarding the required absence of ConfigureAwait(false) at the await points, applies here too.
An (untested) implementation of the SingleThreadTaskScheduler class can be found here.
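For completeness, here is a minimal sketch of what such a SingleThreadTaskScheduler could look like, built on a BlockingCollection and one dedicated worker thread. This is a simplified assumption of the linked class, not its actual code:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Minimal sketch: one dedicated thread executes every task queued to this scheduler.
public sealed class SingleThreadTaskScheduler : TaskScheduler, IDisposable
{
    private readonly BlockingCollection<Task> _queue = new BlockingCollection<Task>();
    private readonly Thread _thread;

    public SingleThreadTaskScheduler()
    {
        _thread = new Thread(() =>
        {
            // Drain the queue until CompleteAdding is called via Dispose.
            foreach (var task in _queue.GetConsumingEnumerable())
                TryExecuteTask(task);
        });
        _thread.IsBackground = true;
        _thread.Start();
    }

    public override int MaximumConcurrencyLevel => 1;

    protected override void QueueTask(Task task) => _queue.Add(task);

    // Never inline: inlining would run the task on the caller's thread,
    // breaking the single-thread guarantee.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued) => false;

    protected override IEnumerable<Task> GetScheduledTasks() => _queue.ToArray();

    public void Dispose() => _queue.CompleteAdding();
}
```

Every task started with this scheduler, and every continuation posted back to it, runs on the one dedicated thread.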
Alright, I did some work over the weekend and figured it out. I used the SingleThreadTaskScheduler mentioned before, and I create one per pipeline (thread). I modified the class so that Schedule() takes a Func and passes back a Task. Now, when I call Schedule(), I can use the returned Task to wait on it if I want, or I can ignore it and simply let the work complete on the background thread. Either way, I now have full control. What I DON'T and can't have, however, is the ability for ANY call to await within the Func<> that I send to the SingleThreadTaskScheduler. I don't know why it goes wrong, but if at any point I use an await in code running on the SingleThreadTaskScheduler, it hangs. Dunno why, but don't care at this point; it is all running.

Potential Deadlock with MongoDB 2.0 Driver and Non Async Code

We have an ASP.NET MVC website and store all our texts in MongoDB. The class LocalizationTextManager is responsible for providing these texts and caches them internally. Typically a lookup is very fast (< 5 ms), and even faster if the result is in the cache.
We have two methods: GetString and GetStringAsync. GetStringAsync is preferred, but we use the GetString method within Razor, for example, or in some rare situations where we are not in an async context.
MongoDB has an async driver, and I need to call it from synchronous code as well. Therefore we tried several approaches. I ensured that I set ConfigureAwait(false) everywhere in my code.
FindOrAddTextFromRepositoryAsync(key).Result;
Task.Run(async () => await FindOrAddTextFromRepositoryAsync(key)).Result;
Task.Run(async () => await FindOrAddTextFromRepositoryAsync(key).ConfigureAwait(false)).Result;
I know that I don't need ConfigureAwait(false) within the task (because there should be no synchronization context there).
I just deployed the website and it hung after deployment. After several restarts of the process it started working. I had taken dumps beforehand and found a lot of these method calls:
The following threads in w3wp (4).DMP are waiting in System.Threading.Monitor.Wait (~100 threads blocked):
mscorlib_ni!System.Threading.ManualResetEventSlim.Wait(Int32, System.Threading.CancellationToken)+3ec
mscorlib_ni!System.Threading.Tasks.Task.SpinThenBlockingWait(Int32, System.Threading.CancellationToken)+db
mscorlib_ni!System.Threading.Tasks.Task.InternalWait(Int32, System.Threading.CancellationToken)+24a
mscorlib_ni!System.Threading.Tasks.Task`1[[System.__Canon, mscorlib]].GetResultCore(Boolean)+36
GP.Components.Globalization.LocalizationTextManager.GetString(System.String, System.String)+2f4
GP.Components.Globalization.LocalizationTextManager.GetString(System.String, System.Globalization.CultureInfo)+8a
My question is: How do I implement it correctly? Another idea is to use a LimitedThreadsScheduler to ensure that it is not parallelized heavily.
The main issue is that your code isn't asynchronous!
For each Task you create, you explicitly access the Result property
.Result;
which blocks the current thread until the task is done.
If you need to react to task completion, use a continuation or the static wait methods of the Task class; just do not block on your tasks:
.ContinueWith( (t) => { Console.WriteLine(t.Result); },
TaskContinuationOptions.OnlyOnRanToCompletion);
or:
Task.WaitAll(tasks);
As I can see in the trace, the non-async GetString version is running and waiting for the result, so other threads can't do anything. I suggest you try to tune the performance by setting the maximum thread count for the default thread pool (which is used for Tasks), and split the sync and async code across different task schedulers so they don't block each other. Other ways of starting tasks are explained here: Task.Run vs Task.Factory.StartNew
As for your question at the end, here is a great article about How to: Create a Task Scheduler That Limits Concurrency, so you can try to start from there.
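As an alternative to a limited-concurrency scheduler, a SemaphoreSlim can cap how many lookups run at once. A sketch under assumptions: the throttle itself and the limit of 4 are mine, and the delegate parameter stands in for the question's FindOrAddTextFromRepositoryAsync:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ThrottledLookup
{
    // Allow at most 4 concurrent repository lookups (the limit is arbitrary here).
    private static readonly SemaphoreSlim _gate = new SemaphoreSlim(4);

    public static async Task<string> GetStringAsync(Func<Task<string>> findOrAdd)
    {
        await _gate.WaitAsync().ConfigureAwait(false);
        try
        {
            // Only runs while holding one of the 4 semaphore slots.
            return await findOrAdd().ConfigureAwait(false);
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

Unlike a custom scheduler, this throttles the async calls themselves without dedicating any threads, so it composes cleanly with the driver's own async IO.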

ContinueWith in fire and forget?

I have multiple async methods that all make httpclient calls to other servers. I want to run these at the end of a webapi call and immediately return. The calls need to each record the time they take and when ALL complete, I need to log those times to a file. I spent a while tinkering until I got it to work, but I don't know why it works and the other ways don't.
Could you shed some light on this for me?
Here's the basic way that didn't work. That is, the calls were made, but LogTimeTaken was not called (or at least didn't write the log file).
//inside webapi action
var tasks = new List<Task>
{
    MakeCall1Async(data, timeTaken),
    MakeCall2Async(data, timeTaken),
    MakeCall3Async(data, timeTaken)
};
Task.Run(async () =>
{
    await Task.WhenAll(tasks);
    LogTimeTaken(timeTaken);
});
//finish webapi action and return
Here's the way that did work:
//inside webapi action
Task.Run(async () =>
{
    var tasks = new List<Task>
    {
        MakeCall1Async(data, timeTaken),
        MakeCall2Async(data, timeTaken),
        MakeCall3Async(data, timeTaken)
    };
    await Task.WhenAll(tasks).ContinueWith(t => LogTimeTaken(timeTaken));
});
//finish webapi action and return
Why?
Also, I am aware of the risks of using fire and forget inside of webapi, and that it won't always run to completion (like when an app pool recycles). 95%+ is good enough in this case.
EDIT
I understand everyone's concerns regarding the technology choice. I may be changing to a pub/sub architecture or use the QueueBackgroundWorkItem. Given that I only need successful completion 95% of the time, I think running it as I am is fine, however. The real answer I am trying to get is why the first way fails and the second way succeeds to write to the log.
You should look at background job schedulers like Hangfire or Quartz.NET.
Below is a nice blog entry on these solutions.
How to run Background Tasks in ASP.Net
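Regarding the QueueBackgroundWorkItem option mentioned in the edit: on .NET 4.5.2+, HostingEnvironment.QueueBackgroundWorkItem registers the work with the ASP.NET runtime, which then tries to delay app-domain shutdown while the work runs (still not a hard guarantee, but better odds than a bare Task.Run). A sketch using the question's own methods; this fragment only compiles inside an ASP.NET project with a reference to System.Web:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Web.Hosting;

// Inside the Web API action; ASP.NET tracks this work during app-domain shutdown.
HostingEnvironment.QueueBackgroundWorkItem(async cancellationToken =>
{
    var tasks = new List<Task>
    {
        MakeCall1Async(data, timeTaken),
        MakeCall2Async(data, timeTaken),
        MakeCall3Async(data, timeTaken)
    };
    await Task.WhenAll(tasks);
    LogTimeTaken(timeTaken);
});
//finish webapi action and return
```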

Manual threads vs Parallel.Foreach in task scheduler

I have a Windows Service that processes tasks created by users. This Service runs on a server with 4 cores. The tasks mostly involve heavy database work (generating a report for example). The server also has a few other services running so I don't want to spin up too many threads (let's say a maximum of 4).
If I use a BlockingCollection<MyCustomTask>, is it a better idea to create 4 Thread objects and use these to consume from the BlockingCollection<MyCustomTask> or should I use Parallel.Foreach to accomplish this?
I'm looking at the ParallelExtensionsExtras, which contains a StaTaskScheduler that uses the former approach, like so (code slightly modified for clarity):
var threads = Enumerable.Range(0, numberOfThreads).Select(i =>
{
    var thread = new Thread(() =>
    {
        // Continually get the next task and try to execute it.
        // This will continue until the scheduler is disposed and no more tasks remain.
        foreach (var t in _tasks.GetConsumingEnumerable())
        {
            TryExecuteTask(t);
        }
    });
    thread.IsBackground = true;
    thread.SetApartmentState(ApartmentState.STA);
    return thread;
}).ToList();

// Start all of the threads
threads.ForEach(t => t.Start());
However, there's also a BlockingCollectionPartitioner in the same ParallelExtensionsExtras which would enable the use of Parallel.ForEach on a BlockingCollection<MyCustomTask>, like so:
var blockingCollection = new BlockingCollection<MyCustomTask>();
Parallel.ForEach(blockingCollection.GetConsumingEnumerable(), task =>
{
    task.DoSomething();
});
It's my understanding that the latter leverages the ThreadPool. Would using Parallel.ForEach have any benefits in this case?
This answer is relevant only if the Task class in your code has nothing to do with System.Threading.Tasks.Task.
As a simple rule, use Parallel.ForEach for work that will end eventually, like executing a batch of work in parallel with other work.
Use dedicated Threads for routines that run for the whole life of the application.
So it looks like in your case the dedicated-threads approach is the better fit.
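One caveat worth adding, not from the answer above: Parallel.ForEach's default partitioner buffers items in chunks, which can delay processing when the source is a live consuming enumerable. On .NET 4.5+ you can create a partitioner with NoBuffering instead. A self-contained sketch; the producer/consumer setup and counts are illustrative:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

static class NoBufferingDemo
{
    // Consumes a live BlockingCollection with Parallel.ForEach without chunk buffering.
    public static int ProcessAll(int itemCount)
    {
        var queue = new BlockingCollection<int>();

        // Producer: add the work items, then signal completion.
        var producer = Task.Run(() =>
        {
            for (int i = 0; i < itemCount; i++) queue.Add(i);
            queue.CompleteAdding();
        });

        // NoBuffering hands items to workers one at a time instead of in chunks,
        // so a worker never sits on buffered items while the queue is still live.
        var partitioner = Partitioner.Create(
            queue.GetConsumingEnumerable(),
            EnumerablePartitionerOptions.NoBuffering);

        int processed = 0;
        Parallel.ForEach(
            partitioner,
            new ParallelOptions { MaxDegreeOfParallelism = 4 },
            item => Interlocked.Increment(ref processed));

        producer.Wait();
        return processed;
    }
}
```

With the default partitioner the loop still completes, but items can linger in per-worker buffers; NoBuffering matters most when latency per item is important.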

Tracking progress of a multi-step Task

I am working on a simple server that exposes webservices to clients. Some of the requests may take a long time to complete, and are logically broken into multiple steps. For such requests, it is required to report progress during execution. In addition, a new request may be initiated before a previous one completes, and it is required that both execute concurrently (barring some system-specific limitations).
I was thinking of having the server return a TaskId to its clients, and having the clients track the progress of the requests using the TaskId. I think this is a good approach, and I am left with the issue of how tasks are managed.
Never having used the TPL, I was thinking it would be a good way to approach this problem. Indeed, it allows me to run multiple tasks concurrently without having to manually manage threads. I can even create multi-step tasks relatively easily using ContinueWith.
I can't come up with a good way of tracking a task's progress, though. I realize that when my requests consist of a single "step", then the step has to cooperatively report its state. This is something I would prefer to avoid at this point. However, when a request consists of multiple steps, I would like to know which step is currently executing and report progress accordingly. The only way I could come up with is extremely tiresome:
Task<double> firstTask = new Task<double>( () => { DoFirstStep(); return 3.14; } );
firstTask.
    ContinueWith<double>( task => { UpdateProgress("50%"); return task.Result; } ).
    ContinueWith<string>( task => { DoSecondStep(task.Result); return "blah"; } ).
    ContinueWith<string>( task => { UpdateProgress("100%"); return task.Result; } );
And even this is not perfect since I would like the Task to store its own progress, instead of having UpdateProgress update some known location. Plus it has the obvious downside of having to change a lot of places when adding a new step (since now the progress is 33%, 66%, 100% instead of 50%, 100%).
Does anyone have a good solution?
Thanks!
This isn't really a scenario that the Task Parallel Library fully supports.
You might consider an approach where you feed progress updates to a queue and read them on another Task:
static void Main(string[] args)
{
    Example();
}

static BlockingCollection<Tuple<int, int, string>> _progressMessages =
    new BlockingCollection<Tuple<int, int, string>>();

public static void Example()
{
    List<Task<int>> tasks = new List<Task<int>>();
    for (int i = 0; i < 10; i++)
        tasks.Add(Task.Factory.StartNew((object state) =>
        {
            int id = (int)state;
            DoFirstStep(id);
            _progressMessages.Add(new Tuple<int, int, string>(
                id, 1, "10.0%"));
            DoSecondStep(id);
            _progressMessages.Add(new Tuple<int, int, string>(
                id, 2, "50.0%"));
            // ...
            return 1;
        },
        (object)i));

    Task logger = Task.Factory.StartNew(() =>
    {
        foreach (var m in _progressMessages.GetConsumingEnumerable())
            Console.WriteLine("Task {0}: Step {1}, progress {2}.",
                m.Item1, m.Item2, m.Item3);
    });

    // Wait for the workers first, then mark the queue complete so the
    // logger's consuming loop can finish; otherwise the wait never returns.
    Task.WaitAll(tasks.ToArray());
    _progressMessages.CompleteAdding();
    logger.Wait();

    Console.ReadLine();
}

private static void DoFirstStep(int id)
{
    Console.WriteLine("{0}: First step", id);
}

private static void DoSecondStep(int id)
{
    Console.WriteLine("{0}: Second step", id);
}
This sample doesn't show cancellation, error handling or account for your requirement that your task may be long running. Long running tasks place special requirements on the scheduler. More discussion of this can be found at http://parallelpatterns.codeplex.com/, download the book draft and look at Chapter 3.
This is simply an approach for using the Task Parallel Library in a scenario like this. The TPL may well not be the best approach here.
If your web services are running inside ASP.NET (or a similar web application server) then you should also consider the likely impact of using threads from the thread pool to execute tasks, rather than service web requests:
How does Task Parallel Library scale on a terminal server or in a web application?
I don't think the solution you are looking for will involve the Task API. Or at least, not directly. It doesn't support the notion of percentage complete, and the Task/ContinueWith functions would need to participate in that logic because it's data that is only available at that level (only the final invocation of ContinueWith is in any position to know the percentage complete, and even then, doing so algorithmically will be a guess at best because it certainly doesn't know if one task is going to take a lot longer than the other). I suggest you create your own API to do this, possibly leveraging the Task API to do the actual work.
This might help: http://blog.stephencleary.com/2010/06/reporting-progress-from-tasks.html. In addition to reporting progress, this solution also enables updating form controls without getting the Cross-thread operation not valid exception.
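On .NET 4.5 and later, the IProgress<T>/Progress<T> pair covers much of this scenario without a custom API. A sketch under assumptions: the step bodies, percentages, and class names are illustrative stand-ins, not code from the question:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Synchronous IProgress implementation so the sample is self-contained;
// in a UI app you would use Progress<int>, which posts to the captured context.
class ListProgress : IProgress<int>
{
    public List<int> Reports { get; } = new List<int>();
    public void Report(int value) { lock (Reports) Reports.Add(value); }
}

static class Worker
{
    // Each step reports its completion percentage through the supplied callback.
    public static async Task RunAsync(IProgress<int> progress)
    {
        await Task.Delay(10);   // stand-in for the first step's work
        progress.Report(50);
        await Task.Delay(10);   // stand-in for the second step's work
        progress.Report(100);
    }
}
```

Progress<T> captures the SynchronizationContext current at construction, so Report callbacks raised from worker tasks run on the UI thread; that is what sidesteps the cross-thread operation exception mentioned above.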
