Strategies for automatic parallelization

I am building a node-based drag-and-drop editor, where each node represents one action (for example, reading a file or sorting some data). Outputs and inputs of nodes can be connected.
One of the features I'd like to implement is automatic parallelization, so that if a path branches off I can automatically begin a thread to handle each branch. I'm concerned about a few issues, however:
If a path branches off, but then later joins back together, I will need to synchronize them somehow.
If there are multiple start-nodes (where execution begins), their paths will have to be managed separately and then possibly dynamically joined/merged.
I want to limit how many threads are created so that I don't suddenly have 20 threads deadlocked.
Essentially, I'd like to know if any strategies for doing something like this exist (not looking for code necessarily; just theory). Could scheduling algorithms help?
Thanks for your advice! I look forward to hearing your suggestions.
Note: I'm using .NET 3.5, so none of the fun parallel-tasking abilities are available to me. If necessary, I will make the switch to .NET 4.0, but I'd like to avoid this.

The Task Parallel Library might be exactly what you're looking for.
I imagine your node-based drag-and-drop editor to look like this:
Every node is essentially a Task. A Task can be anything -- read a file from disk, download some data from the web, or compute anything.
When a Task has finished, it can ContinueWith one or more other Tasks, passing the result of the old Task to the new Tasks.
A Task can also consist of waiting for multiple Tasks to finish. WhenAll these Tasks have finished, this Task can continue with another Task, passing the result of all Tasks to the new task.
The TPL will schedule all these Tasks on a Thread Pool, so Threads can be reused and each Task doesn't need to have its own Thread. The TPL will find the optimal number of Threads for the system it is running on.
The Visual Studio Async CTP (whose async/await keywords later shipped in C# 5) adds native language support for asynchronous operations to C#, which makes working with Tasks really easy and fun.
With the TPL it is just a matter of creating Tasks and composing them according to the node layout.
Complete program code for the above example:
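// t1 is the start node; it produces the initial value.
var t1 = Task.Factory.StartNew<int>(() => 42);
// Two branches continue from t1 and may run in parallel.
var t2a = t1.ContinueWith<int>(t => t.Result + 1);
var t2b = t1.ContinueWith<int>(t => t.Result + 1);
var t3a = t2a.ContinueWith<int>(t => t.Result * 2);
var t3b = t2b.ContinueWith<int>(t => t.Result * 3);
// Join node: TaskEx.WhenAll comes from the Async CTP (Task.WhenAll in .NET 4.5 and later).
var t4 = TaskEx.WhenAll<int>(t3a, t3b)
    .ContinueWith<int>(t => t.Result[0] + t.Result[1]);
t4.ContinueWith(t => { Console.WriteLine(t.Result); });
Console.ReadKey(); // keep the console open until the result is printed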
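For readers on .NET 4.5 or later (which the asker is not), the same branch-and-join layout can be sketched with Task.WhenAll and await. This is only an illustration of the idea above; the class name DiamondGraph and the method RunAsync are made up for the example, and the arithmetic in the two branches is collapsed into one step each.

```csharp
using System;
using System.Threading.Tasks;

public static class DiamondGraph
{
    // One start node, two branches, one join node.
    public static async Task<int> RunAsync()
    {
        Task<int> start = Task.Run(() => 42);

        // Both branches hang off the same upstream task and may run in parallel.
        Task<int> branchA = start.ContinueWith<int>(t => (t.Result + 1) * 2);
        Task<int> branchB = start.ContinueWith<int>(t => (t.Result + 1) * 3);

        // Join: execution resumes here only after both branches have finished.
        int[] results = await Task.WhenAll(branchA, branchB);
        return results[0] + results[1];
    }
}
```

Calling DiamondGraph.RunAsync() from an async method and awaiting it yields the joined value; the thread pool decides how many threads actually run the branches.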

Related

Handling Parallel Jobs/Threads

I'm refactoring my project and researching the best ways to improve the application's performance.
Question 1. SpinLock vs Interlocked
To create a counter, which approach performs better?
Interlocked.Increment(ref counter);
Or
SpinLock _spinlock = new SpinLock();
bool lockTaken = false;
try
{
    _spinlock.Enter(ref lockTaken);
    counter = counter + 1;
}
finally
{
    if (lockTaken) _spinlock.Exit(false);
}
And if we need to increment another counter, like counter2, should we declare another SpinLock object, or is it enough to use another boolean?
Question 2. Handling nested tasks or better replacement
In the current version of my application I use tasks, adding each new task to an array and then calling Task.WaitAll().
After a lot of research I found that Parallel.ForEach performs better, but how can I control the number of concurrent threads? I know I can specify MaxDegreeOfParallelism in a ParallelOptions parameter, but here is the problem: every time the crawl(url) method runs, it creates another limited set of threads. I mean, if I set MaxDegreeOfParallelism to 10, then every time crawl(url) runs, another 10 will be created, am I right? If so, how can I prevent this? Should I use a semaphore and threads instead of Parallel, or is there a better way?
public void Start() {
    Parallel.Invoke(() => { crawl(url); });
}
void crawl(string url) {
    var response = getresponse(url);
    Parallel.ForEach(response.links, ParallelOption, link => {
        crawl(link);
    });
}
Question 3. Notify when all Jobs (and nested jobs) finished.
And my last question is: how can I tell when all my jobs have finished?

There are a lot of misconceptions here; I'll point out just a few.
To create a counter, which approach performs better?
Either can, depending on your exact situation.
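As a rough illustration of the two options being compared, here is a minimal sketch that increments a shared counter from a parallel loop, once with Interlocked.Increment and once with a SpinLock. The class and method names are invented for the example, and it says nothing about which one is faster in your application; that has to be measured. Note that the lockTaken flag only reports whether a particular Enter call acquired the lock; it is not the lock itself.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CounterSketch
{
    // Lock-free increment of a shared counter.
    public static int CountWithInterlocked(int iterations)
    {
        int counter = 0;
        Parallel.For(0, iterations, _ => Interlocked.Increment(ref counter));
        return counter;
    }

    // Same work, guarded by a SpinLock shared by all iterations.
    public static int CountWithSpinLock(int iterations)
    {
        int counter = 0;
        var spinLock = new SpinLock();
        Parallel.For(0, iterations, _ =>
        {
            bool lockTaken = false; // per-acquisition flag, must start as false
            try
            {
                spinLock.Enter(ref lockTaken);
                counter++;
            }
            finally
            {
                if (lockTaken) spinLock.Exit();
            }
        });
        return counter;
    }
}
```

Both methods return the number of iterations requested, so they can be timed against each other under a realistic level of contention.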
After a lot of research I just figured out that using Parallel.ForEach
has better performance
This is also very suspect, and as a general statement simply wrong. Once again, it depends on what you want to do.
I know I can specify a MaxDegreeOfParallelism in a ParallelOptions
parameter, but the problem is here, every time crawl(url) method runs, It just create another limited number of threads
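Once again, this is not accurate; it is a consequence of your own implementation. Parallel.ForEach does not create threads itself: it schedules work on the thread pool, and the limit applies to each call separately. Also, MaxDegreeOfParallelism is an upper bound, not a target; the TPL may use fewer concurrent tasks if its heuristics decide that is best.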
should I use semaphore and threads instead of Parallel? Or there is a
better way?
The answer is a resounding yes: there is a better way.
OK, let's have a look at what you are doing. You say you are making a crawler. A crawler accesses the internet, and each time you access the internet, a network resource, or the file system you are (put simplistically) waiting around for an I/O completion port callback. This is what's known as an I/O-bound workload.
With I/O-bound tasks we don't want to tie up the thread pool with threads waiting for I/O completion ports. It's inefficient: you are using up valuable resources waiting for callbacks on threads that are effectively paused.
So for I/O-bound work, we don't want to spin up new tasks, and we don't want to use Parallel.ForEach, which would use up threads waiting around for events to happen. The most appropriate modern pattern for I/O-bound tasks is the async and await pattern.
For CPU-bound work (if you want to use as much CPU as you can), smash the thread pool: use TPL Parallel or as many tasks as is effective.
The async and await pattern works well with completion ports because, instead of waiting around idly for a callback, it gives the threads back and allows them to be reused.
...
However, what I suggest is another approach, where you can take advantage of async and await and also control the degree of parallelism. This lets you be good to your thread pool, not using up resources waiting for callbacks, and allowing I/O to be I/O. I give you TPL Dataflow's ActionBlock and TransformManyBlock.
This subject is a little beyond a simple working example, but I can assure you it's an appropriate path for what you are doing. I suggest you have a look at the following links.
Stephen Cleary There Is No Thread
Stephen Cleary Introduction to Dataflow
Msdn Blogs Parallel Programming with .NET
Stephen Toub Going Deep Stephen Toub: Inside TPL Dataflow, In this he even talks about crawler examples.
Some random blog on dataflow and crawlers Tpl Dataflow walkthrough – Part 5
In summary, there are many ways to do what you want to do, and there are many technologies. But the main thing is that you have some very skewed ideas about parallel programming. You need to hit the books, hit the blogs, and start building some really solid design principles from the ground up, rather than trying to figure this all out for yourself by picking up small bits of information.
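TPL Dataflow needs an extra NuGet package, so as a self-contained illustration of the same two goals (non-blocking I/O plus a cap on concurrency), here is a sketch that swaps in plain async/await with a SemaphoreSlim instead. It is not the Dataflow approach itself. Everything here is hypothetical: the in-memory Links dictionary stands in for real pages, and GetLinksAsync stands in for a real HTTP request. The task returned by CrawlAllAsync completes only when every nested crawl has finished, which also covers the "notify when all jobs finished" question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class CrawlSketch
{
    // Hypothetical link graph standing in for real web pages.
    private static readonly Dictionary<string, string[]> Links =
        new Dictionary<string, string[]>
        {
            { "a", new[] { "b", "c" } },
            { "b", new[] { "c", "d" } },
            { "c", new[] { "a" } },
            { "d", new string[0] },
        };

    // Placeholder for an HTTP fetch; Task.Delay simulates non-blocking I/O.
    private static async Task<string[]> GetLinksAsync(string url)
    {
        await Task.Delay(10);
        string[] links;
        return Links.TryGetValue(url, out links) ? links : new string[0];
    }

    public static async Task<ICollection<string>> CrawlAllAsync(string start, int maxConcurrency)
    {
        var visited = new ConcurrentDictionary<string, bool>();
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            async Task CrawlAsync(string url)
            {
                if (!visited.TryAdd(url, true)) return; // already crawled

                string[] links;
                await gate.WaitAsync(); // at most maxConcurrency fetches in flight
                try { links = await GetLinksAsync(url); }
                finally { gate.Release(); }

                // The gate is released before recursing, so nested crawls
                // cannot starve their parent of a slot.
                await Task.WhenAll(links.Select(link => CrawlAsync(link)));
            }

            await CrawlAsync(start);
        }
        return visited.Keys;
    }
}
```

Awaiting CrawlSketch.CrawlAllAsync("a", 2) returns the set of visited pages once the whole crawl, including nested crawls, is done.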

I'd suggest looking at Microsoft's Reactive Framework for this. You can write your Crawl function like this:
public IObservable<Response> Crawl(string url)
{
return
from r in Observable.Start(() => GetResponse(url))
from l in r.Links.ToObservable()
from r2 in Crawl(l).StartWith(r)
select r2;
}
Then to call it try this:
IObservable<Response> crawls = Crawl("www.microsoft.com");
IDisposable subscription =
crawls
.Subscribe(
r => { /* process each response as it arrives */ },
() => { /* All crawls complete */ });
Done. It handles all the threading for you. Just NuGet "System.Reactive".

Parallel or async ASP.NET Core C#

I've googled this plenty but I'm afraid I don't fully understand the consequences of concurrency and parallelism.
I have about 3000 rows of database objects that each have an average of 2-4 logical data items attached to them that need to be validated as part of a search query, meaning the validation service needs to execute approx. 3*3000 times. E.g. if the user has filtered on color, then each row needs to validate the color and return the result. The loop cannot break when a match has been found, meaning all logical objects will always need to be evaluated (this is because relevance is calculated, not just whether there is a match).
This is done on-demand when the user selects various properties, meaning performance is key here.
I'm currently doing this by using Parallel.ForEach but wonder if it is smarter to use async behavior instead?
Current way
var validatorService = new LogicalGroupValidatorService();
ConcurrentBag<StandardSearchResult> results = new ConcurrentBag<StandardSearchResult>();
Parallel.ForEach(searchGroups, (group) =>
{
    var searchGroupResult = validatorService.ValidateLogicGroupRecursivly(
        propertySearchQuery, group.StandardPropertyLogicalGroup);
    results.Add(new StandardSearchResult(searchGroupResult));
});
Async example code
var validatorService = new LogicalGroupValidatorService();
List<StandardSearchResult> results = new List<StandardSearchResult>();
var tasks = new List<Task<StandardPropertyLogicalGroupSearchResult>>();
foreach (var group in searchGroups)
{
tasks.Add(validatorService.ValidateLogicGroupRecursivlyAsync(
propertySearchQuery, group.StandardPropertyLogicalGroup));
}
await Task.WhenAll(tasks);
results = tasks.Select(logicalGroupResultTask =>
new StandardSearchResult(logicalGroupResultTask.Result)).ToList();

The difference between parallel and async is this:
Parallel: Run the work on multiple threads, dividing it among them.
Async: Do the work in a non-blocking manner.
Whether this makes a difference depends on what is blocking in the async case. If you're doing work on the CPU, the CPU is the bottleneck, and you will still end up with multiple threads. If it's I/O (or anything else besides the CPU), you will reuse the same thread.
For your particular example that means the following:
Parallel.ForEach => Partition the items in the list across thread pool threads (how many are used is managed by the runtime) and execute the items concurrently on those threads.
async/await => Do this bit of work, but let me continue execution. Since you have many items, that means saying this multiple times. What happens next depends on the kind of work:
If this bit of work is on the CPU, the effect is the same.
Otherwise, you'll just use a single thread while the work is being done somewhere else.
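To make the distinction concrete, here is a small sketch with one CPU-bound and one I/O-bound method. The names are invented for the example, and Task.Delay merely stands in for a database or network call; it is not the asker's validation service.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class WorkKinds
{
    // CPU-bound: every item needs processor time, so spreading items over cores helps.
    public static long[] SquareInParallel(int[] items)
    {
        var results = new long[items.Length];
        // Each iteration writes to its own slot, so no locking is needed.
        Parallel.For(0, items.Length, i => results[i] = (long)items[i] * items[i]);
        return results;
    }

    // I/O-bound: all operations are started first, then awaited together.
    public static async Task<int[]> FetchAllAsync(int[] ids)
    {
        Task<int>[] pending = ids.Select(id => FetchAsync(id)).ToArray();
        return await Task.WhenAll(pending);
    }

    // Placeholder for real I/O; no thread is blocked during the delay.
    private static async Task<int> FetchAsync(int id)
    {
        await Task.Delay(50);
        return id * 10;
    }
}
```

If the validation in the question is pure in-memory computation, it resembles the first method, and wrapping it in async methods would not by itself make it faster.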

What does the Parallel.Foreach do behind the scenes?

So I just can't grasp the concept here.
I have a method that uses the Parallel class with the ForEach method.
But the thing I don't understand is, does it create new threads so it can run the function faster?
Let's take this as an example.
I do a normal foreach loop.
private static void DoSimpleWork()
{
foreach (var item in collection)
{
//DoWork();
}
}
What that will do is, it will take the first item in the list, assign the method DoWork(); to it and wait until it finishes. Simple, plain and works.
Now.. There are three cases I am curious about
If I do this.
Parallel.ForEach(stringList, simpleString =>
{
DoMagic(simpleString);
});
Will that split up the Foreach into let's say 4 chunks?
So what I think is happening is that it takes the first 4 lines in the list, assigns each string to each "thread" (assuming parallel creates 4 virtual threads) does the work and then starts with the next 4 in that list?
If that is wrong please correct me I really want to understand how this works.
And then we have this.
Which essentially is the same but with a new parameter
Parallel.ForEach(stringList, new ParallelOptions() { MaxDegreeOfParallelism = 32 }, simpleString =>
{
DoMagic(simpleString);
});
What I am curious about is this
new ParallelOptions() { MaxDegreeOfParallelism = 32 }
Does that mean it will take the first 32 strings from that list (if there even is that many in the list) and then do the same thing as I was talking about above?
And for the last one.
Task.Factory.StartNew(() =>
{
Parallel.ForEach(stringList, simpleString =>
{
DoMagic(simpleString);
});
});
Would that create a new task, assigning each "chunk" to its own task?

Do not mix async code with parallel. Task is for async operations: querying a DB, reading a file, or awaiting some comparatively cheap operation, so that your UI isn't blocked and unresponsive.
Parallel is different. It is designed for 1) multi-core systems and 2) computationally intensive operations. I won't go into detail on how it works; that information can be found in the MS documentation. Long story short, Parallel.For will most probably make its own decisions about what exactly to run, when, and how. It treats parameters such as MaxDegreeOfParallelism as an upper limit and may use fewer threads than you asked for. The whole idea is to provide the best possible parallelization and thus complete your operation as fast as possible.

Parallel.ForEach performs the equivalent of a C# foreach loop, but with iterations executing in parallel instead of sequentially. There is no guaranteed ordering; an iteration runs whenever a thread is available for it.
MaxDegreeOfParallelism
By default, For and ForEach will utilize however many threads the underlying scheduler provides, so changing MaxDegreeOfParallelism from the default only limits how many concurrent tasks will be used by the application.
You do not need to modify this parameter in general but may choose to change it in advanced scenarios:
When you know that a particular algorithm you're using won't scale beyond a certain number of cores. You can set the property to avoid wasting cycles on additional cores.
When you're running multiple algorithms concurrently and want to manually define how much of the system each algorithm can utilize.
When the thread pool's heuristics are unable to determine the right number of threads to use and could end up injecting too many threads. For example, in long-running loop body iterations, the thread pool might not be able to tell the difference between reasonable progress and livelock or deadlock, and might not be able to reclaim threads that were added to improve performance. You can set the property to ensure that you don't use more than a reasonable number of threads.
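One way to see the effect of MaxDegreeOfParallelism for yourself is to count how many loop bodies overlap. The following is only a sketch; DopProbe and ObserveMaxConcurrency are names made up for the example, and Thread.Sleep stands in for real work.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class DopProbe
{
    // Returns the highest number of loop bodies observed running at the same time.
    public static int ObserveMaxConcurrency(int itemCount, int maxDegree)
    {
        int running = 0;
        int peak = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.ForEach(Enumerable.Range(0, itemCount), options, _ =>
        {
            int now = Interlocked.Increment(ref running);

            // Record a new peak with a compare-and-swap retry loop.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)) &&
                   Interlocked.CompareExchange(ref peak, now, seen) != seen) { }

            Thread.Sleep(5); // simulated work
            Interlocked.Decrement(ref running);
        });

        return peak;
    }
}
```

The observed peak should never exceed the configured limit, but it can be lower, for instance on a machine with fewer available cores or a busy thread pool.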
Task.Factory.StartNew is usually used when you require fine-grained control for a long-running, compute-bound task and, as @Сергей Боголюбов mentioned, do not mix them up.
It creates a new task, and that task runs the Parallel.ForEach loop, which in turn uses thread pool threads for the iterations.
You may find this ebook useful: http://www.albahari.com/threading/#_Introduction

does the work and then starts with the next 4 in that list?
This depends on your machine's hardware and how busy the machine's cores are with other processes/apps your CPU is working on
Does that mean it will take the first 32 strings from that list (if there even are that many in the list) and then do the same thing as I was talking about above?
No, there is no guarantee that it will take the first 32; it could be fewer. It will vary each time you execute the same code.
Task.Factory.StartNew creates a new task, but it will not create a new one for each chunk as you expect.
Putting a Parallel.ForEach inside a new Task will not help you further reduce the time taken for the parallel tasks themselves.

How to efficiently make 1000s of web requests as quickly as possible

I need to make 100,000s of lightweight (i.e. small Content-Length) web requests from a C# console app. What is the fastest way I can do this (i.e. have completed all the requests in the shortest possible time) and what best practices should I follow? I can't fire and forget because I need to capture the responses.
Presumably I'd want to use the async web requests methods, however I'm wondering what the impact of the overhead of storing all the Task continuations and marshalling would be.
Memory consumption is not an overall concern, the objective is speed.
Presumably I'd also want to make use of all the cores available.
So I can do something like this:
Parallel.ForEach(iterations, async i =>
{
    var response = await MakeRequest(i);
    // do thing with response
});
but that won't give me any more concurrency than my number of cores.
I can do:
Parallel.ForEach(iterations, i =>
{
var response = MakeRequest(i);
response.GetAwaiter().OnCompleted(() =>
{
// do thing with response
});
});
but how do I keep my program running after the ForEach? Holding on to all the Tasks and WhenAll-ing them feels bloated. Are there any existing patterns or helpers to have some kind of Task queue?
Is there any way to get any better, and how should I handle throttling/error detection? For instance, if the remote endpoint is slow to respond I don't want to continue spamming it.
I understand I also need to do:
ServicePointManager.DefaultConnectionLimit = int.MaxValue;
Anything else necessary?

The Parallel class does not work with async loop bodies so you can't use it. Your loop body completes almost immediately and returns a task. There is no parallelism benefit here.
This is a very easy problem. Use one of the standard solutions for processing a series of items asynchronously with a given degree of parallelism (DOP). This one is good: http://blogs.msdn.com/b/pfxteam/archive/2012/03/05/10278165.aspx (use the last piece of code).
You need to empirically determine the right DOP. Simply try different values. There is no theoretical way to derive the best value because it is dependent on many things.
The connection limit is the only limit that's in your way.
response.GetAwaiter().OnCompleted
Not sure what you tried to accomplish there... If you comment I'll explain the misunderstanding.
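For reference, a helper for running an async body over a sequence with a bounded DOP can be sketched roughly as follows. It is written along the lines of the partitioner-based approach that the linked post describes, but treat it as an illustration under that assumption rather than a copy of the post; the class name AsyncLoops is made up.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncLoops
{
    // Runs body for every item, with at most dop bodies in flight at once.
    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        // The source is split into dop partitions; each partition is drained
        // by one worker that awaits the body for one item at a time.
        return Task.WhenAll(
            Partitioner.Create(source).GetPartitions(dop).Select(partition =>
                Task.Run(async () =>
                {
                    using (partition)
                    {
                        while (partition.MoveNext())
                        {
                            await body(partition.Current);
                        }
                    }
                })));
    }
}
```

A call such as await urls.ForEachAsync(20, url => DownloadAsync(url)) would then keep up to 20 requests outstanding, where DownloadAsync is whatever async request method you already have; the right DOP value still has to be found by experiment.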

The operation you want to perform is
Call an I/O method
Process the result
You are correct that you should use an async version of the I/O method. What's more, you only need 1 thread to start all of the I/O operations. You will not benefit from parallelism here.
You will benefit from parallelism in the second part, processing the result, as this will be a CPU-bound operation. Luckily, async/await will do all the work for you. Console applications don't have a synchronization context. This means that the part of the method after an await will run on a thread pool thread, making good use of all CPU cores.
private async Task MakeRequestAndProcessResult(int i)
{
    var result = await MakeRequestAsync(i);
    ProcessResult(result);
}
var tasks = iterations.Select(i => MakeRequestAndProcessResult(i)).ToArray();
To achieve the same behavior in an environment with a synchronization context (for example WPF or WinForms), use ConfigureAwait(false).
var result = await MakeRequestAsync().ConfigureAwait(false);
To wait for the tasks to complete, you can use await Task.WhenAll(tasks) inside an async method or Task.WaitAll(tasks) in Main().
Throwing 100k requests at a web service will probably kill it, so you will have to limit the number in flight. You can check the answers to this question for some options on how to do it.
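One common way to limit the number of outstanding requests is a SemaphoreSlim used as an async gate. The sketch below is a generic illustration; Throttle and RunThrottledAsync are hypothetical names, and the work delegate stands in for a method like MakeRequestAsync.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Throttle
{
    // Starts work for every item but lets at most maxConcurrent run at once.
    // Results come back in the same order as the input items.
    public static async Task<TResult[]> RunThrottledAsync<T, TResult>(
        IEnumerable<T> items, int maxConcurrent, Func<T, Task<TResult>> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrent))
        {
            Task<TResult>[] tasks = items.Select(async item =>
            {
                await gate.WaitAsync(); // wait for a free slot without blocking a thread
                try { return await work(item); }
                finally { gate.Release(); }
            }).ToArray();

            return await Task.WhenAll(tasks);
        }
    }
}
```

Note that this still creates one Task per item up front; for 100,000s of items that is usually acceptable, but it does cost memory.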

Parallel.ForEach should be able to use more threads than there are cores if you explicitly set the MaxDegreeOfParallelism property of the ParallelOptions parameter (in the overload of ForEach where there is that parameter) - see https://msdn.microsoft.com/en-us/library/system.threading.tasks.paralleloptions.maxdegreeofparallelism(v=vs.110).aspx
You should be able to set this to 1,000 to allow it to use 1,000 threads or even more, but that might not be efficient due to the threading overheads. You may wish to experiment (e.g. loop from 100 to 1,000 in steps of 100, submitting 1,000 requests each time and timing start to finish) or even set up some kind of self-tuning algorithm.

Parallel tasks with a long pause

I have a function which is along the lines of
private void DoSomethingToFeed(IFeed feed)
{
feed.SendData(); // Send data to remote server
Thread.Sleep(1000 * 60 * 5); // Sleep 5 minutes
feed.GetResults(); // Get data from remote server after it's processed it
}
I want to parallelize this, since I have lots of feeds that are all independent of each other. Based on this answer, leaving the Thread.Sleep() in there is not a good idea. I also want to wait after all the threads have spun up, until they've all had a chance to get their results.
What's the best way to handle a scenario like this?
Edit, because I accidentally left it out: I had originally considered calling this function as Parallel.ForEach(feeds, DoSomethingToFeed), but I was wondering if there was a better way to handle the sleeping when I found the answer I linked to.

Unless you have an awful lot of threads, you can keep it simple. Create all the threads. You'll get some thread creation overhead, but since the threads are basically sleeping the whole time, you won't get too much context switching.
It'll be easier to code than any other solution (unless you're using C# 5). So start with that, and improve it only if you actually see a performance problem.

I think you should take a look at the Task class in .NET. It is a nice abstraction on top of more low level threading / thread pool management.
In order to wait for all tasks to complete, you can use Task.WaitAll.
An example use of Tasks could look like:
IFeed feedOne = new SomeFeed();
IFeed feedTwo = new SomeFeed();
var t1 = Task.Factory.StartNew(() => { feedOne.SendData(); });
var t2 = Task.Factory.StartNew(() => { feedTwo.SendData(); });
// Waits for all provided tasks to finish execution
Task.WaitAll(t1, t2);
However, another solution would be using Parallel.ForEach, which handles all Task creation for you and does the appropriate batching of tasks as well. A good comparison of the two approaches is given here, where, among other good points, it is stated that:
Parallel.ForEach, internally, uses a Partitioner to distribute your collection into work items. It will not do one task per item, but rather batch this to lower the overhead involved.

Check WaitHandle for waiting on tasks.

private void DoSomethingToFeed(IFeed feed)
{
Task.Factory.StartNew(() => feed.SendData())
.ContinueWith(_ => Delay(1000 * 60 * 5)
.ContinueWith(__ => feed.GetResults())
);
}
//http://stevenhollidge.blogspot.com/2012/06/async-taskdelay.html
Task Delay(int milliseconds)
{
var tcs = new TaskCompletionSource<object>();
new System.Threading.Timer(_ => tcs.SetResult(null)).Change(milliseconds, -1);
return tcs.Task;
}
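On .NET 4.5 or later, Task.Delay is built in, so the same send, pause, fetch sequence can be sketched with async/await. The IFeed interface below is only a guess at the shape used in the question, RecordingFeed is a throwaway implementation for demonstration, and FeedRunner is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Assumed shape of the feed from the question.
public interface IFeed
{
    void SendData();
    void GetResults();
}

// Hypothetical implementation that only records which calls were made.
public class RecordingFeed : IFeed
{
    public bool Sent { get; private set; }
    public bool GotResults { get; private set; }
    public void SendData() { Sent = true; }
    public void GetResults() { GotResults = true; }
}

public static class FeedRunner
{
    public static async Task ProcessFeedAsync(IFeed feed, TimeSpan pause)
    {
        feed.SendData();
        await Task.Delay(pause); // no thread is held while waiting
        feed.GetResults();
    }

    // Completes once every feed has sent its data, paused, and fetched its results.
    public static Task ProcessAllAsync(IEnumerable<IFeed> feeds, TimeSpan pause)
    {
        return Task.WhenAll(feeds.Select(f => ProcessFeedAsync(f, pause)));
    }
}
```

With the five-minute pause from the question, the call would be FeedRunner.ProcessAllAsync(feeds, TimeSpan.FromMinutes(5)). In this sketch SendData runs synchronously, one feed after another, on the calling thread; if that call is slow, wrapping it in Task.Run is one option.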
