I'm investigating the Parallelism Break in a For loop.
After reading this and this I still have a question:
I'd expect this code:

Parallel.For(0, 10, (i, state) =>
{
    Console.WriteLine(i);
    if (i == 5) state.Break();
});

to yield at most 6 numbers (0..5).
Not only does it not do that, but the results vary in length:
02351486
013542
0135642
Very annoying. (Where the hell did the Break() {after 5} go here??)
So I looked at MSDN:
Break may be used to communicate to the loop that no other iterations after the current iteration need be run. If Break is called from the 100th iteration of a for loop iterating in parallel from 0 to 1000, all iterations less than 100 should still be run, but the iterations from 101 through to 1000 are not necessary.
Question #1:
Which iterations? The overall iteration counter, or per thread? I'm pretty sure it is per thread; please confirm.
Question #2:
Let's assume we are using Parallel.For with a range partitioner (since the CPU cost doesn't change between elements), so it divides the data among the threads. So if we have 4 cores (and a perfect division among them):
core #1 got 0..250
core #2 got 251..500
core #3 got 501..750
core #4 got 751..1000
so the thread on core #1 will reach value = 100 at some point and call Break().
That will be its iteration number 100.
But the thread on core #4 got more quanta and is already at 900; it is way beyond its 100th iteration.
It has no index less than 100 left to be stopped, so it will show them all.
Am I right? Is that the reason why I get more than 6 numbers in my example?
Question #3:
How can I truly break when (i == 5)?
P.S.
I mean, come on! When I call Break(), I want the loop to stop, exactly as it does in a regular for loop.
To yield at most 6 numbers (0..5).
The problem is that this won't yield at most 6 numbers.
What happens is, when you hit the iteration with index 5, you send the "break" request. Break() will cause the loop to stop starting any iterations with values > 5, but to still process all values < 5.
However, any values greater than 5 which were already started will still get processed. Since the various indices run in parallel, they are no longer ordered, so you get runs where some values > 5 (such as 8 in your example) still execute.
Which iterations? The overall iteration counter, or per thread? I'm pretty sure it is per thread; please confirm.
This is the index being passed into Parallel.For. Break() won't prevent already-started items from being processed, but it does provide a guarantee: all items with an index up to 100 get processed, while items above 100 may or may not get processed.
Am I right? Is that the reason why I get more than 6 numbers in my example?
Yes. If you use a partitioner like you've shown, then as soon as you call Break(), partitions beyond the one where you broke will no longer get scheduled. However, anything already scheduled (which here means an entire partition) will get processed fully. In your example, this means you're likely to always process all 1000 items.
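If you want to see this from the calling side, Parallel.For returns a ParallelLoopResult you can inspect after the loop; a minimal sketch based on the question's loop:

using System;
using System.Threading.Tasks;

ParallelLoopResult result = Parallel.For(0, 10, (i, state) =>
{
    Console.WriteLine(i);
    if (i == 5) state.Break();
});

// LowestBreakIteration is the lowest index from which Break() was called;
// every index below it is guaranteed to have run, while indices above it
// may or may not have run.
Console.WriteLine("IsCompleted: {0}, LowestBreakIteration: {1}",
    result.IsCompleted, result.LowestBreakIteration);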
How can I truly break when (i == 5)?
You are, but when you run in parallel, things change. What is the actual goal here? If you only want to process the first 6 items (0..5), you should restrict the items before you loop over them, via a LINQ query or similar. You can then process the 6 items with Parallel.For or Parallel.ForEach, without a Break() and without worry.
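For example, a minimal sketch of that approach, assuming the per-item work is just the WriteLine from the question:

using System;
using System.Linq;
using System.Threading.Tasks;

// Restrict to the first 6 values up front; no Break() needed.
Parallel.ForEach(Enumerable.Range(0, 10).Take(6), i =>
{
    Console.WriteLine(i);
});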
I mean, come on! When I call Break(), I want the loop to stop, exactly as it does in a regular for loop.
You should use Stop() instead of Break() if you want things to stop as quickly as possible. Stop() will not abort iterations that are already running, but it will prevent any new iterations from being scheduled (including ones at lower indices or earlier in the enumeration than your current position).
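A sketch of the Stop() variant; note that even with Stop(), iterations already in flight can still print values after 5, so the output is shorter but still not deterministic:

using System;
using System.Threading.Tasks;

Parallel.For(0, 10, (i, state) =>
{
    if (state.IsStopped) return; // optional cooperative early exit
    Console.WriteLine(i);
    if (i == 5) state.Stop();    // stop scheduling any further iterations
});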
If Break is called from the 100th iteration of a for loop iterating in parallel from 0 to 1000
The 100th iteration of the loop is not necessarily (in fact, probably not) the one with index 99.
Your threads can and will run in an indeterminate order. When the .Break() instruction is encountered, no further loop iterations will be started; exactly when that happens depends on the specifics of thread scheduling for a particular run.
I strongly recommend reading Patterns of Parallel Programming (a free PDF from Microsoft) to understand the design decisions and tradeoffs that went into the TPL.
Which iterations? The overall iteration counter, or per thread?
Of all the iterations scheduled (or yet to be scheduled).
Remember the delegate may run out of order; there is no guarantee that iteration i == 5 will be the sixth to execute. In fact, that is unlikely to be the case except in rare circumstances.
Q2: Am I right?
No, the scheduling is not that simplistic. Rather, all the tasks are queued up and then the queue is processed. Each thread uses its own queue until it is empty, and only then steals work from the other threads. This means there is no way to predict which thread will process which delegate.
If the delegates are sufficiently trivial, it might all be processed on the original calling thread (no other thread gets a chance to steal work).
Q3: How can I truly break when (i == 5)?
Don't use concurrency if you want strictly linear processing.
The Stop method is there to support speculative execution: try various approaches and stop as soon as any one completes. (Break is subtly different: it additionally guarantees that all iterations before the breaking one still run.)
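A sketch of that speculative pattern (the data array and target are hypothetical placeholders):

using System;
using System.Threading;
using System.Threading.Tasks;

int[] data = { 3, 1, 4, 1, 5, 9, 2, 6 }; // hypothetical input
int target = 5;
int foundIndex = -1;

Parallel.For(0, data.Length, (i, state) =>
{
    if (data[i] == target)
    {
        // Record the first match we see, then stop scheduling more work.
        Interlocked.CompareExchange(ref foundIndex, i, -1);
        state.Stop();
    }
});

Console.WriteLine("Found at index: {0}", foundIndex);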
I have just made a sample for multithreading using this link, like below:
Console.WriteLine("Number of Threads: {0}", System.Diagnostics.Process.GetCurrentProcess().Threads.Count);
int count = 0;
Parallel.For(0, 50000, options,(i, state) =>
{
count++;
});
Console.WriteLine("Number of Threads: {0}", System.Diagnostics.Process.GetCurrentProcess().Threads.Count);
Console.ReadKey();
It gives me 15 threads before Parallel.For, and after it gives me only 17 threads. So only 2 threads are occupied by Parallel.For.
Then I created another sample using this link, like below:
var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount * 10 };
Console.WriteLine("MaxDegreeOfParallelism : {0}", Environment.ProcessorCount * 10);
Console.WriteLine("Number of Threads: {0}", System.Diagnostics.Process.GetCurrentProcess().Threads.Count);

int count = 0;
Parallel.For(0, 50000, options, (i, state) =>
{
    count++;
});

Console.WriteLine("Number of Threads: {0}", System.Diagnostics.Process.GetCurrentProcess().Threads.Count);
Console.ReadKey();
In the above code I set MaxDegreeOfParallelism to 40, but Parallel.For is still using the same number of threads.
So how can I increase the number of threads Parallel.For uses?
I am facing a problem where some numbers are skipped inside the Parallel.For when I perform some heavy and complex functionality inside it. So I want to increase the maximum number of threads and get around the skipping issue.
What you're saying is something like: "My car is shaking when driving too fast. I'm trying to avoid this by driving even faster." That doesn't make any sense. What you need is to fix the car, not change the speed.
How exactly to do that depends on what you are actually doing in the loop. The code you showed is obviously a placeholder, but even that's wrong. So I think what you should do first is learn about thread safety.
Using a lock is one option, and it's the easiest one to get correct. But it's also hard to make efficient: you need to hold the lock for only a short amount of time in each iteration.
There are other ways to achieve thread safety, including Interlocked, the overloads of Parallel.For that use thread-local data, and approaches other than Parallel.For(), like PLINQ or TPL Dataflow.
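For the counter in the question, a sketch of two of those options (using the same 50000 iterations as the sample above):

using System;
using System.Threading;
using System.Threading.Tasks;

int count = 0;

// Option 1: atomic increment; simple and safe for a plain counter.
Parallel.For(0, 50000, i =>
{
    Interlocked.Increment(ref count);
});

// Option 2: thread-local subtotals, merged once per thread at the end,
// so there is no shared write inside the hot loop.
int total = 0;
Parallel.For(0, 50000,
    () => 0,                                      // per-thread initial value
    (i, state, local) => local + 1,               // no locking here
    local => Interlocked.Add(ref total, local));  // merge per-thread result

Console.WriteLine("{0} {1}", count, total); // both print 50000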
After you've made sure your code is thread safe, only then is it time to worry about things like the number of threads. Regarding that, I think there are two things to note:
For CPU-bound computations, it doesn't make sense to use more threads than the number of cores your CPU has. Using more threads than that will actually usually lead to slower code, since switching between threads has some overhead.
I don't think you can measure the number of threads used by Parallel.For() like that. Parallel.For() uses the thread pool and it's quite possible that there already are some threads in the pool before the loop begins.
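If you do want to see how many threads a particular loop actually used, counting distinct managed thread IDs inside the loop is a more direct measurement than Process.Threads.Count; a sketch:

using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

var threadIds = new ConcurrentDictionary<int, bool>();

Parallel.For(0, 50000, i =>
{
    // Record which worker thread ran this iteration.
    threadIds.TryAdd(Thread.CurrentThread.ManagedThreadId, true);
});

Console.WriteLine("Distinct worker threads used: {0}", threadIds.Count);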
Parallel loops use hardware CPU cores. If your CPU has 2 cores, that is the maximum degree of parallelism you can get on your machine.
Taken from MSDN:
What to Expect
By default, the degree of parallelism (that is, how many iterations run at the same time in hardware) depends on the number of available cores. In typical scenarios, the more cores you have, the faster your loop executes, until you reach the point of diminishing returns that Amdahl's Law predicts. How much faster depends on the kind of work your loop does.
Further reading:
Threading vs Parallelism, how do they differ?
Threading vs. Parallel Processing
Parallel loops will give you the wrong result for summation operations without locks, because the result of each iteration depends on the single shared variable count, and the value of count inside a parallel loop is not predictable. However, using locks in parallel loops defeats the actual parallelism. So you should test the parallel loop with something other than a summation.
I have some problems with the parallelism within and across TPL Dataflow blocks.
Within a block: I have a TransformBlock with, let's say, a MaxDOP of 4, performing work that I already know would benefit from being as parallel as possible, but the scheduler doesn't know that. So when I give it 200 items, instead of roughly 50 items on each of 4 threads, I usually see something like 150 items on one thread, about 20 on each of two more, and it doesn't bother to use a fourth thread at all. Is there some way to hint it to be more parallel?
Across blocks: I have several blocks A -> B -> C forming a pipeline. I imagined it would work so that early items finish processing at C while late items are still being processed at A. But when I gave it, say, 10k items, it processed all 10k items at A, then all 10k at B, then all 10k at C. That means the first item only exited the pipeline when the last item was finished. I guess to the task scheduler all tasks are equal, but I care about "first response time" rather than "last response time". How do I hint the blocks to behave differently? A sketch of the setup I mean is below.
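For reference, a minimal sketch of the pipeline being described (StepA/StepB/StepC are hypothetical placeholders; a small BoundedCapacity is one knob that creates backpressure so items flow downstream instead of piling up in the first block):

using System.Threading.Tasks.Dataflow; // NuGet package: System.Threading.Tasks.Dataflow

int StepA(int x) => x; // placeholder work
int StepB(int x) => x; // placeholder work
void StepC(int x) { }  // placeholder work

var options = new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = 4,
    BoundedCapacity = 16 // small buffer: upstream must wait, so items flow through
};

var a = new TransformBlock<int, int>(x => StepA(x), options);
var b = new TransformBlock<int, int>(x => StepB(x), options);
var c = new ActionBlock<int>(x => StepC(x), options);

a.LinkTo(b, new DataflowLinkOptions { PropagateCompletion = true });
b.LinkTo(c, new DataflowLinkOptions { PropagateCompletion = true });

// Feed with SendAsync so the producer awaits when block 'a' is full.
for (int i = 0; i < 10_000; i++)
    await a.SendAsync(i);
a.Complete();
await c.Completion;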
Thanks.
I have an IEnumerable of actions, sorted in descending order by the time they will take to execute. Now I want all of them executed in parallel. Is there a better solution than this one?
IEnumerable<WorkItem> workItemsOrderedByTime = myFactory.WorkItems.DecendentOrderedBy(t => t.ExecutionTime);

Parallel.ForEach(workItemsOrderedByTime,
    new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount },
    t => t.Execute());
So my idea is to first execute the expensive tasks, in terms of the time they need to complete.
EDIT: The question is whether there is a better solution to get everything done in the minimum amount of time.
To solve your XY Problem of
Because otherwise it can happen that 9 of 10 tasks are finished and the last one is executed on 1 core and all other cores are doing nothing.
What you need to do is tell Parallel.ForEach to take only one item from the source list at a time. That way, when you are down to the last items, you won't have a bunch of slow work items stuck together in a single core's queue.
This can be done by using Partitioner.Create and passing in EnumerablePartitionerOptions.NoBuffering:

Parallel.ForEach(
    Partitioner.Create(workItemsOrderedByTime, EnumerablePartitionerOptions.NoBuffering),
    new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount },
    t => t.Execute());
By default there is no execution-order guarantee in Parallel.ForEach.
That is why your call to DecendentOrderedBy does not do anything good. It might even do something bad: if the default partitioner decides to do a range partition, dividing, say, 12 WorkItems into 4 groups of 3 items in IEnumerable order, then the first core has much more work to do, creating exactly the problem you are trying to avoid.
An easy fix for that is explained in the answer by Scott: if Parallel.ForEach takes just one item at a time, you naturally get some load balancing. In most cases this will work fine.
The optimal solution (in most cases) for an ordered IEnumerable like yours would be striped partitioning, with the number of stripes equal to the number of cores. AFAIK you don't get this out of the box in .NET, but you can provide a custom OrderablePartitioner that partitions the data just that way.
I am sorry to say it but: "No free lunch"
Let me explain my situation.
I have a 1-producer-to-N-consumers pattern. I'm using blocking collections and everything is working well. While doing some tests I noticed this strange behavior:
I was testing how long my manipulation of the data took in my consumers.
Below you'll find the code, stripped of my manipulation, which still produces the strange behavior.
I have 4 consumers for 1 producer.
For most of the data the console doesn't print anything, because ts = 0 (it's under a tick), but randomly (every 1 to 5 seconds or so) it prints something like this (not in this exact order, but of the same kind):
10000
20001
10000
30002
10000
40003
10000
10000
It is on the order of 10,000 ticks, so around 1 ms. It is always a number of the form (N)000(N-1).
Note that the BlockingCollection I consume from is filled depending on network events which occur at completely random times; nothing regular there.
The timing is almost perfect: always a multiple of 10,000 ticks.
What could be behind this? Thanks!
while (IsAlive)
{
    DataToFieldMapping item;
    try
    {
        // -1 means wait indefinitely until an item is available
        _CollectionToConsume.TryTake(out item, -1);
    }
    catch
    {
        item = null;
    }

    if (item != null)
    {
        long ts = (DateTime.Now.Ticks - item.TimeStamp.Ticks);
        if (ts > 10)
            Console.WriteLine(ts);
    }
}
What's going on here is that DateTime.Now has fairly limited precision. It's not giving you the time to the nearest tick; it is only updated every 10,000 ticks or so (roughly once a millisecond), which is why you generally see multiples of 10k ticks in your prints.
If you really want a better feel for the duration of those events, use the Stopwatch class, which has much higher precision. That said, Stopwatch is simply a diagnostic tool (hence why it's in the Diagnostics namespace). You should only be using it to help you diagnose what's going on, and should not be using it in production code.
On a side note, there really isn't any need to use a timer here at all. It appears that you're creating several consumers that are polling the BlockingCollection for new content. There is no reason to do this; they can simply block until the collection has items. (Hence the name, BlockingCollection.)
The easiest way is for the consumers to simply do this:

foreach (var item in _CollectionToConsume.GetConsumingEnumerable()) // blocks while the collection is empty
{
    ProcessItem(item);
}

Then just run that code in a background thread.
If you write the following and run it, you'll see that the ticks do not advance one by one, but rather in relatively large chunks, because the clock's actual resolution is much coarser than a single tick.
for (int i = 0; i < 100; i++)
{
    Console.WriteLine(DateTime.Now.Ticks);
}
Use the Stopwatch class to measure performance, as it uses a high-resolution timer which is much more suitable for the purpose.
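A minimal sketch (DoWork is a hypothetical placeholder for the code being measured; note that Stopwatch ticks are in units of Stopwatch.Frequency, not DateTime ticks):

using System;
using System.Diagnostics;
using System.Threading;

void DoWork() => Thread.Sleep(1); // placeholder for the code being measured

var sw = Stopwatch.StartNew();
DoWork();
sw.Stop();

Console.WriteLine("Elapsed: {0} ms", sw.ElapsedMilliseconds);
Console.WriteLine("High resolution: {0}, frequency: {1} ticks/s",
    Stopwatch.IsHighResolution, Stopwatch.Frequency);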
I'm working on a console application which will be scheduled and run at set intervals, say every 30 minutes. Its only purpose is to query a Web Service to update a batch of database rows.
The Web Service API recommends polling once every 30 seconds, and timing out after a set interval. The following pseudocode is given as an example:
listId := updateList(<list of terms>)
LOOP
WHILE NOT isUpdatingComplete(listId)
END LOOP
statuses := getStatuses(“LIST_ID = {listId}”)
I have coded this roughly in C# as:

int callCount = 0;
var listId = client.updateList(options, terms, out messages);

while (callCount < 5 && !client.isUpdateComplete(listId, out messages))
{
    callCount++;
    Thread.Sleep(30000); // wait 30 seconds between polls, per the API guidance
}

// Get resulting status...
Is it OK to use Thread.Sleep() in this situation? I'm aware it is not generally considered good practice, but from reading the reasons not to use it, this seems like an acceptable usage.
Thanks.
Thread.Sleep ensures the current thread doesn't return until at least the specified number of milliseconds have passed. There are plenty of places where it's appropriate to do that, and your example seems fine, assuming it's running on a background thread.
Some examples of places you don't want to use it: on the UI thread, or where you need exact timing.
Generally speaking, Thread.Sleep is like any other tool: perfectly OK to use, except when it's terribly misused. I disagree with the "not generally good practice" part, which is the result of people abusing Thread.Sleep when they should be doing something else (i.e. blocking on a synchronization object).
In your case the program is single-threaded, it has no UI (i.e. the thread has no message loop) and you do not want to synchronize with external events. Therefore Thread.Sleep is just fine.
The general objection to Sleep() is that it wastes a thread.
In your case there is only one thread (maybe two), so that is not really a problem.
So I think it looks fine (but I would sleep for 29 seconds to cut some slack).
It's fine, except that you cannot interrupt it once it goes to sleep without aborting the thread (which is not recommended).
That's why a ManualResetEvent might be a better idea: it can be signalled (awakened) from a different thread.
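A sketch of that idea (PollWebService is a hypothetical placeholder for the periodic work):

using System;
using System.Threading;

void PollWebService() { } // placeholder for the periodic work

var stopSignal = new ManualResetEvent(false);

// Worker loop: waits up to 30 seconds per cycle, but wakes immediately
// when the event is signalled, so shutdown is responsive.
while (!stopSignal.WaitOne(TimeSpan.FromSeconds(30)))
{
    PollWebService();
}

// Elsewhere (e.g. on shutdown), wake the worker right away:
stopSignal.Set();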
You could stick with the Thread.Sleep method, but it would be more elegant to schedule the program to run every 30 minutes; that way you don't have to handle the waiting inside your application.
Thread.Sleep isn't the best choice for executing periodic logic. Thread.Sleep(n) means your thread will relinquish control for at least n milliseconds; there is no guarantee that it will regain control exactly after n milliseconds, as that depends on CPU load.
If you are blocking the thread for as long as 30 minutes, you should instead schedule a Windows task every 30 minutes, so the program executes and then exits. That way you are not tying up a thread for so long.
For shorter times, like 30 seconds or 1 minute, System.Threading.Thread.Sleep() is perfectly fine. For more than 5 minutes I would use a Windows task. (I'm Spanish; I think that's what they are called in the English version. I'm talking about the tasks you schedule from the Control Panel ;-) )