Parallel.For maximum threads and current thread C#

What I want to achieve:
Using Parallel.For (other approaches are welcome, but this one seemed the easiest), I want to increment a variable named max until it reaches 100,000,000 using multiple threads, but the program should use at most X threads at the same time.
Code snippet:
using System;
using System.Threading;
using System.Linq;
using System.Threading.Tasks;
using System.Diagnostics;
namespace pool_threading
{
class MainClass
{
static int max=0;
static int current_max_thread=0;
public static void Main (string[] args)
{
Parallel.For(0, 100000000, new ParallelOptions { MaxDegreeOfParallelism = 50 },
i =>
{
max++;
if(Thread.CurrentThread.ManagedThreadId>current_max_thread)
current_max_thread=Thread.CurrentThread.ManagedThreadId;
});
Console.WriteLine ("I got to this number {0} using most {1} threads",max,current_max_thread);
}
public static void GetPage(int i)
{
}
}
}
Result:
I got this number 38,786,886 using most 11 threads
Now, I don't know why I get 38,786,886, which is less than 100,000,000, but I guess it's because multiple threads are trying to increment it at the exact same time, so if 10 try at once, only the first one succeeds. Please correct me if I'm wrong.
The biggest "problem" is that I get 11 threads every time, even with the maximum set to 50 (see MaxDegreeOfParallelism in the code above). If I set the maximum to 1 thread, I always get 4 (maybe 4 is the minimum in this situation, but that still doesn't explain why I never get more than 11 at the same time).

This is trivial: just use i inside the loop instead of trying to increment and use max. You're not incrementing it safely, but there's no reason for you to try. The whole point of Parallel.For is that it gives you the loop index and ensures that each loop index is hit exactly once, no more, no less.

There are two things going on here, as far as I can see:
First, you are hammering away on max from multiple threads. You probably have a multicore machine. Those operations are probably overlapping with each other. To perform your incrementing in a thread safe way you need
Interlocked.Increment(ref max);
not
max++; /* not thread safe */
Second, you don't always get the number of threads you asked for in MaxDegreeOfParallelism. You'll never get more, but you might get fewer.
If you're going to use parallelism please pay close attention to the thread-safety of the code you'll run in the threads.
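As a minimal sketch of the first point, the loop from the question could be rewritten with Interlocked.Increment as shown below. The class name SafeCounterDemo and the method CountSafely are made up for this illustration, and the iteration count is reduced to keep the run short.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class; the names are not from the original question.
public static class SafeCounterDemo
{
    // Counts iterations with an atomic increment so that no update is lost.
    public static int CountSafely(int iterations, int maxThreads)
    {
        int counter = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };

        Parallel.For(0, iterations, options, i =>
        {
            // Atomic read-modify-write; a plain counter++ here would lose updates.
            Interlocked.Increment(ref counter);
        });

        return counter;
    }

    public static void Main()
    {
        Console.WriteLine(CountSafely(1000000, 4)); // prints 1000000
    }
}
```

The atomic increment is correct but still makes every iteration contend for the same variable, so it is not necessarily fast.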

I believe you are not getting 50 threads for 2 main reasons:
Hardware. There is a point at which a processor gets more done with fewer threads, because the time spent switching between threads outweighs the benefit of running more of them. This leads to the second reason.
Managed code. C# on .NET is a managed code system. The programmers of .NET know the above and probably set some limits and checks to prevent too many threads from running. What you are setting in the code might be above some internal limit, so it is effectively ignored. Also, you are setting a maximum, so Parallel.For() can use anywhere from 1 to X threads, whatever it thinks is most efficient.
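Note that the largest ManagedThreadId seen in the loop is an identifier, not a thread count. One way to observe how many iterations actually run at the same time is to track the peak number of iterations in flight, roughly as in the sketch below. The names ConcurrencyProbe and MeasurePeak are hypothetical, and the result will vary from machine to machine and run to run.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class ConcurrencyProbe
{
    // Returns the highest number of loop bodies observed running simultaneously.
    public static int MeasurePeak(int iterations, int maxThreads)
    {
        int current = 0; // iterations in flight right now
        int peak = 0;    // highest value of 'current' seen so far
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };

        Parallel.For(0, iterations, options, i =>
        {
            int now = Interlocked.Increment(ref current);

            // Raise the recorded peak if this iteration saw a higher value.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)))
            {
                if (Interlocked.CompareExchange(ref peak, now, seen) == seen)
                    break;
            }

            Interlocked.Decrement(ref current);
        });

        return peak;
    }

    public static void Main()
    {
        Console.WriteLine("Peak concurrency: {0}", MeasurePeak(1000000, 50));
    }
}
```

The peak stays at or below MaxDegreeOfParallelism, and with very short loop bodies it can come out well below it.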

Related

Why the following C# program uses limited (10) number of threads? [duplicate]

I have just written a sample for multithreading using This Link, like below:
Console.WriteLine("Number of Threads: {0}", System.Diagnostics.Process.GetCurrentProcess().Threads.Count);
int count = 0;
Parallel.For(0, 50000, options,(i, state) =>
{
count++;
});
Console.WriteLine("Number of Threads: {0}", System.Diagnostics.Process.GetCurrentProcess().Threads.Count);
Console.ReadKey();
It gives me 15 threads before Parallel.For and only 17 threads after it. So only 2 threads are occupied by Parallel.For.
Then I created another sample using This Link, like below:
var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount * 10 };
Console.WriteLine("MaxDegreeOfParallelism : {0}", Environment.ProcessorCount * 10);
Console.WriteLine("Number of Threads: {0}", System.Diagnostics.Process.GetCurrentProcess().Threads.Count);
int count = 0;
Parallel.For(0, 50000, options,(i, state) =>
{
count++;
});
Console.WriteLine("Number of Threads: {0}", System.Diagnostics.Process.GetCurrentProcess().Threads.Count);
Console.ReadKey();
In the above code, I have set MaxDegreeOfParallelism, which comes to 40, but Parallel.For still uses the same number of threads.
So how can I increase the number of running threads for Parallel.For?
I am facing a problem where some numbers are skipped inside Parallel.For when I perform some heavy and complex work inside it. So I want to increase the maximum number of threads to overcome the skipping issue.
What you're saying is something like: "My car is shaking when driving too fast. I'm trying to avoid this by driving even faster." That doesn't make any sense. What you need is to fix the car, not change the speed.
How exactly to do that depends on what you are actually doing in the loop. The code you showed is obviously a placeholder, but even that is wrong. So I think what you should do first is learn about thread safety.
Using a lock is one option, and it's the easiest one to get correct. But it's also hard to make it efficient. What you need is to hold the lock only for a short amount of time in each iteration.
There are other ways to achieve thread safety, including Interlocked, overloads of Parallel.For that use thread-local data, and approaches other than Parallel.For(), like PLINQ or TPL Dataflow.
Only after you have made sure your code is thread safe is it time to worry about things like the number of threads. And regarding that, I think there are two things to note:
For CPU-bound computations, it doesn't make sense to use more threads than the number of cores your CPU has. Using more threads than that will actually usually lead to slower code, since switching between threads has some overhead.
I don't think you can measure the number of threads used by Parallel.For() like that. Parallel.For() uses the thread pool and it's quite possible that there already are some threads in the pool before the loop begins.
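As a rough sketch of the thread-local option mentioned above, the counting loop can keep a private subtotal per task and merge it into the shared total only once per task. The names LocalStateDemo and CountWithLocals are invented for this example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class; not part of the original question or answer.
public static class LocalStateDemo
{
    // Counts iterations using the Parallel.For overload with per-task local state.
    public static long CountWithLocals(int iterations)
    {
        long total = 0;

        Parallel.For<long>(0, iterations,
            () => 0L,                                          // localInit: each task starts at 0
            (i, loopState, subtotal) => subtotal + 1,          // body: touches only the local subtotal
            subtotal => Interlocked.Add(ref total, subtotal)); // localFinally: one merge per task

        return total;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithLocals(50000)); // prints 50000
    }
}
```

The shared variable is touched only once per task rather than once per iteration, which is why this tends to scale better than a lock or an atomic increment inside the body.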
Parallel loops use hardware CPU cores. If your CPU has 2 cores, this is the maximum degree of parallelism that you can get on your machine.
Taken from MSDN:
What to Expect
By default, the degree of parallelism (that is, how many iterations run at the same time in hardware) depends on the number of available cores. In typical scenarios, the more cores you have, the faster your loop executes, until you reach the point of diminishing returns that Amdahl's Law predicts. How much faster depends on the kind of work your loop does.
Further reading:
Threading vs Parallelism, how do they differ?
Threading vs. Parallel Processing
Parallel loops will give you wrong results for summation operations without locks, because the result of each iteration depends on a single variable, 'count', whose value in a parallel loop is not predictable. However, using locks in parallel loops does not achieve actual parallelism. So you should try something other than summation for testing a parallel loop.

Program stops responding when I try to increment and print a number in a loop

I have the following code in C#:
void Update () {
for (int m=0;m<122132343243243;m++)
{
print(m);
}
}
When I try to run this, Unity stops responding. How can I get this function to finish executing?
The main problem with your code is that you're asking the computer to do a massive amount of work all at once, so it stops responding. Counting up to 122 trillion is a lengthy task, which by itself would take at least 20 days (see Counting up to one trillion for estimation) and is only compounded by the fact that you're printing every number.
What you need to do is allow the computer to spread this work out over multiple frames, through the use of coroutines. (You could use another thread instead to prevent locking, but coroutines are the simpler Unity approach to this problem.)
Your code could be rewritten as follows:
void Start() {
StartCoroutine("CountAndPrint");
}
IEnumerator CountAndPrint() {
for (long m=0; m<122132343243243; m++) {
print(m);
yield return null;
}
}
Note: I also switched long in for int, as Daniel noted that the value of m will otherwise overflow should you let it run for long enough.
The main difference in this code is that it basically allows the program to "pause" execution of the method after each time it counts up, saving the rest of the work for future frames. This will allow you to continue interacting with the program while the counter increases.
Hope this helps! Let me know if you have any questions.

Strange Behavior with Threading and Timer

Let me explain my situation.
I have a one-producer, N-consumer pattern. I'm using blocking collections and everything is working well. While doing some tests, I noticed this strange behavior:
I was testing how long my manipulation of data took in my consumers.
I noticed some strange things. Below you'll find the code, stripped of my data manipulation, which still produces the strange behavior.
I have 4 consumers for 1 producer.
For most of the data, the console doesn't print anything, because ts=0 (it's under a tick), but randomly (every 1 to 5 seconds) it prints something like this (not in this exact order, but of the same kind):
10000
20001
10000
30002
10000
40003
10000
10000
It is on the order of 10,000 ticks, so around 1 ms. It is always a number in the format (N)000(N-1).
Note that the BlockingCollection I consume is filled by network events that occur at completely random times. Nothing regular there.
The timing is almost perfect, always a multiple of 10,000 ticks.
What could be behind this? Thanks!
while(IsAlive)
{
DataToFieldMapping item;
try
{
_CollectionToConsume.TryTake(out item, -1);
}
catch
{
item = null;
}
if (item != null)
{
long ts = (DateTime.Now.Ticks - item.TimeStamp.Ticks);
if(ts>10)
Console.WriteLine(ts);
}
}
What's going on here is that DateTime.Now has fairly limited precision. It's not giving you the time to the nearest tick. It is only updated every 10,000 ticks or so, which is why you generally see multiples of 10k ticks in your prints.
If you really want to get a better feel for the duration of those events, use the Stopwatch class, which has much higher precision. That said, Stopwatch is simply a diagnostic tool (hence why it's in the Diagnostics namespace). You should only be using it to help you diagnose what's going on, and shouldn't be using it in production code.
On a side note, there really isn't any need to use a timer here at all. It appears that you're creating several consumers that are polling the BlockingCollection for new content. There is no reason to do this. They can simply block until the collection has items. (Hence the name, BlockingCollection.)
The easiest way is for the consumers to simply do this:
foreach(var item in _CollectionToConsume.GetConsumingEnumerable())
ProcessItem(item);
Then just run that code in a background thread.
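A self-contained sketch of that pattern, with one producer and several background consumer threads, might look like the following. ConsumerDemo, ProduceAndConsume and the integer payload are placeholders for illustration; the real item type and processing step would come from the asker's code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical demo; the payload and the "processing" step are placeholders.
public static class ConsumerDemo
{
    // Adds itemCount items, lets consumerCount threads drain them, and
    // returns how many items were processed in total.
    public static int ProduceAndConsume(int itemCount, int consumerCount)
    {
        var queue = new BlockingCollection<int>();
        int processed = 0;
        var consumers = new Thread[consumerCount];

        for (int c = 0; c < consumerCount; c++)
        {
            consumers[c] = new Thread(() =>
            {
                // Blocks while the collection is empty; the loop ends once
                // CompleteAdding() has been called and the queue is drained.
                foreach (int item in queue.GetConsumingEnumerable())
                    Interlocked.Increment(ref processed);
            });
            consumers[c].IsBackground = true;
            consumers[c].Start();
        }

        for (int i = 0; i < itemCount; i++)
            queue.Add(i);
        queue.CompleteAdding(); // tells the consumers that no more items will arrive

        foreach (Thread t in consumers)
            t.Join();

        return processed;
    }

    public static void Main()
    {
        Console.WriteLine(ProduceAndConsume(1000, 4)); // prints 1000
    }
}
```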
If you write the following and run it, you'll see that the tick count does not advance one by one, but rather in relatively large chunks, because the actual resolution of DateTime.Now is much coarser than one tick.
for(int i =0; i< 100; i++)
{
Console.WriteLine(DateTime.Now.Ticks);
}
Use Stopwatch class to measure performance as that one uses a high-resolution timer which is much more suitable for the purpose.
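For comparison, a minimal Stopwatch-based measurement could look like this. StopwatchDemo and Measure are illustrative names, and the Thread.Sleep call merely stands in for the work being timed.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper for illustration only.
public static class StopwatchDemo
{
    // Runs the given work once and returns the elapsed time.
    public static TimeSpan Measure(Action work)
    {
        Stopwatch sw = Stopwatch.StartNew();
        work();
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        TimeSpan elapsed = Measure(() => Thread.Sleep(20)); // stand-in workload
        Console.WriteLine("Elapsed: {0} ticks ({1} ms)", elapsed.Ticks, elapsed.TotalMilliseconds);

        // Reports whether a high-resolution counter backs the Stopwatch on this machine.
        Console.WriteLine("High resolution: {0}, frequency: {1} Hz",
            Stopwatch.IsHighResolution, Stopwatch.Frequency);
    }
}
```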

Thread.Sleep(0) doesn't work as described?

I am currently reading this excellent article on threading and read the following text:
Thread.Sleep(0) relinquishes the thread’s current time slice immediately, voluntarily handing over the CPU to other threads.
I wanted to test this and below is my test code:
static string s = "";
static void Main(string[] args)
{
//Create two threads that append string s
Thread threadPoints = new Thread(SetPoints);
Thread threadNewLines = new Thread(SetNewLines);
//Start threads
threadPoints.Start();
threadNewLines.Start();
//Wait one second for threads to manipulate string s
Thread.Sleep(1000);
//Threads have an infinite loop so we have to close them forcefully.
threadPoints.Abort();
threadNewLines.Abort();
//Print string s and wait for user-input
Console.WriteLine(s);
Console.ReadKey();
}
The functions that threadPoints and threadNewLines run:
static void SetPoints()
{
while(true)
{
s += ".";
}
}
static void SetNewLines()
{
while(true)
{
s += "\n";
Thread.Sleep(0);
}
}
If I understand Thread.Sleep(0) correctly, the output should be something like this:
............ |
.............. |
................ | <- End of console
.......... |
............. |
............... |
But I get this as output:
....................|
....................|
.... |
|
|
....................|
....................|
................. |
|
Seeing as the article mentioned in the beginning of the post is highly recommended by many programmers, I can only assume that my understanding of Thread.Sleep(0) is wrong. So if someone could clarify, I'd be much obliged.
What Thread.Sleep(0) does is free the CPU to handle other threads, but that doesn't mean the same thread can't be scheduled again right away. If you're trying to hand control to another thread, use some sort of signal.
If you have access to a machine (or perhaps a VM) with only a single core/processor, try running your code on that machine. You may be surprised with how the results vary. Just because two threads refer to the same variable "s", does not mean they actually refer to the same value at the same time, due to various levels of caching that can occur on modern multi-core (and even just parallel pipeline) CPUs. If you want to see how the yielding works irrespective of the caching issues, try wrapping each s += expression inside a lock statement.
If you expanded the width of the console to 5 times its current size, you would see what you expect: lines that don't reach the console width. The problem is that one time slice is actually very long. So, to get the expected effect with the normal console width, you'd have to slow down the Points thread, but without using Sleep. Instead of the while (true) loop, try this:
for (int i = 0;; i++)
{
if (i % 10 == 0)
s += ".";
}
To slow down the thread even more replace number 10 with bigger number.
The next thread the processor runs is arbitrary, and it could even be the same thread that just called Thread.Sleep(0). To give up the processor to a different thread, you can call Thread.Yield() and check its return value: it returns true if the OS switched execution to another thread, and false otherwise.
You should (almost) never abort threads. The best practice is to signal them to stop on their own.
This is normally accomplished by setting a boolean variable that the threads inspect to decide whether to continue executing.
You are setting a string variable named "s". You will run into race conditions. Reading and reassigning the shared variable with s += is not an atomic operation. You can wrap the operations that manipulate it in a lock or use a built-in type that is thread-safe.
Always check the documentation to find out whether the types you use are thread-safe.
Because your program is not thread-safe, you can't rely on your results.
If you run the program several times, my guess is that you'll get different outputs.
Note: When using a boolean to share some state to cancel threads, make sure it is marked as volatile. Otherwise the JIT might optimize the code so that it never looks at the changed value.
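Putting those points together, a sketch of the test program without Abort() might look like the following. It signals the threads with a volatile flag and guards the shared text with a lock; the names StopFlagDemo, RunFor and Append are made up here, and a StringBuilder is swapped in for the string concatenation.

```csharp
using System;
using System.Text;
using System.Threading;

// Hypothetical rewrite for illustration; not the original poster's code.
public static class StopFlagDemo
{
    private static volatile bool _stopRequested;                  // checked by both workers
    private static readonly object _gate = new object();          // protects _text
    private static readonly StringBuilder _text = new StringBuilder();

    // Appends 'piece' repeatedly until a stop is requested.
    private static void Append(string piece, bool yieldAfter)
    {
        while (!_stopRequested)
        {
            lock (_gate) { _text.Append(piece); }
            if (yieldAfter) Thread.Sleep(0); // give up the rest of the time slice
        }
    }

    // Runs both workers for roughly the given time and returns the text length.
    public static int RunFor(int milliseconds)
    {
        _stopRequested = false;
        lock (_gate) { _text.Clear(); }

        var points = new Thread(() => Append(".", false));
        var newLines = new Thread(() => Append("\n", true));
        points.Start();
        newLines.Start();

        Thread.Sleep(milliseconds);
        _stopRequested = true; // ask the workers to finish instead of aborting them
        points.Join();
        newLines.Join();

        lock (_gate) { return _text.Length; }
    }

    public static void Main()
    {
        Console.WriteLine("Characters written: {0}", RunFor(100));
    }
}
```

Both threads exit their loops on their own, so Join() returns without any forced termination.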

multithread performance problem for web service call

Here is my sample program for the web service server side and client side. I ran into a strange performance problem: even if I increase the number of threads calling the web service, the performance does not improve. At the same time, the CPU/memory/network consumption shown in the performance panel of Task Manager is low. I am wondering what the bottleneck is and how to improve it.
(In my tests, doubling the number of threads almost doubles the total response time.)
Client side:
class Program
{
static Service1[] clients = null;
static Thread[] threads = null;
static void ThreadJob (object index)
{
// query 100 times
for (int i = 0; i < 100; i++)
{
clients[(int)index].HelloWorld();
}
}
static void Main(string[] args)
{
Console.WriteLine("Specify number of threads: ");
int number = Int32.Parse(Console.ReadLine());
clients = new Service1[number];
threads = new Thread[number];
for (int i = 0; i < number; i++)
{
clients [i] = new Service1();
ParameterizedThreadStart starter = new ParameterizedThreadStart(ThreadJob);
threads[i] = new Thread(starter);
}
DateTime begin = DateTime.Now;
for (int i = 0; i < number; i++)
{
threads[i].Start(i);
}
for (int i = 0; i < number; i++)
{
threads[i].Join();
}
Console.WriteLine("Total elapsed time (s): " + (DateTime.Now - begin).TotalSeconds);
return;
}
}
Server side:
[WebMethod]
public double HelloWorld()
{
return new Random().NextDouble();
}
thanks in advance,
George
Although you are creating a multithreaded client, bear in mind that .NET has a configurable bottleneck of 2 simultaneous calls to a single host. This is by design.
Note that this is on the client, not the server.
Try adjusting your app.config file in the client:
<system.net>
<connectionManagement>
<add address="*" maxconnection="20" />
</connectionManagement>
</system.net>
There is some more info on this in this short article :
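On .NET Framework, the same limit can also be raised in code through ServicePointManager.DefaultConnectionLimit before the first request is made. The sketch below assumes a .NET Framework client like the one in the question; on newer .NET versions this API is marked obsolete and does not govern HttpClient. ConnectionLimitDemo and RaiseLimit are illustrative names.

```csharp
using System;
using System.Net;

// Hypothetical helper; assumes a .NET Framework client application.
public static class ConnectionLimitDemo
{
    // Sets the default number of concurrent connections per host and
    // returns the value now in effect.
    public static int RaiseLimit(int limit)
    {
        ServicePointManager.DefaultConnectionLimit = limit;
        return ServicePointManager.DefaultConnectionLimit;
    }

    public static void Main()
    {
        // Call this once at startup, before any web service proxy is used.
        Console.WriteLine(RaiseLimit(20)); // prints 20
    }
}
```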
My experience is generally that locking is the problem: I had a massively parallel server once that spent more time context switching than it did performing work.
So, check your memory and process counters in perfmon. If you look at context switches and the rate is high (more than 4000 per second), then you're in trouble.
You can also check the memory stats on the server. If it's spending all its time swapping, or just creating and freeing strings, it'll also appear to stall.
Lastly, check disk I/O, same reason as above.
The resolution is to remove your locks, or hold them for a minimum of time. Our problem was solved by removing the dependence on COM BSTRs and their global lock, you'll find that C# has plenty of similar synchronisation bottlenecks (intended to keep your code working safely). I've seen performance drop when I moved a simple C# app from a single-core to a multi-core box.
If you cannot remove the locks, the best option is not to create as many threads :) Use a thread pool instead to let the CPU finish one job before starting another.
I don't believe that you are running into a bottleneck at all actually.
Did you try what I suggested ?
Your idea is to add more threads to improve performance, because you are expecting that all of your threads will run perfectly in parallel. This is why you are assuming that doubling the number of threads should not double the total test time.
Your service takes a fraction of a second to return and your threads will not all start working at exactly the same instant in time on the client.
So your threads are not actually working completely in parallel as you have assumed, and the results you are seeing are to be expected.
You are not seeing any performance gain because there is none to be had. The one line of code in your service (below) probably executes without a context switch most of the time anyway.
return new Random().NextDouble();
The overhead involved in the web service call is higher than the work you are doing inside of it. If you have some substantial work to do inside the service (database calls, look-ups, file access, etc.) you may begin to see some performance increase.
Just parallelizing a task will not automatically make it faster.
-Jason
Of course adding Sleep will not improve performance.
But the point of the test is to test with a variable number of threads.
So, keep the Sleep in your WebMethod.
And try now with 5, 10, 20 threads.
If there are no other problems with your code, then the increase in time should not be linear as before.
You realize that in your test, when you double the amount of threads, you are doubling the amount of work that is being done. So if your threads are not truly executing in parallel, then you will, of course, see a linear increase in total time...
I ran a simple test using your client code (with a sleep on the service).
For 5 threads, I saw a total time of about 53 seconds.
And for 10 threads, 62 seconds.
So, for 2x the number of calls to the web service, it only took 17% more time. That is what you are expecting, no?
Well, in this case, you're not really balancing your work across the chosen number of threads. Each thread you create performs the same job. So if you create n threads and you have limited parallel processing capacity, the performance naturally decreases. Another thing I notice is that the required job is a relatively fast operation for 100 iterations, and even if you plan on dividing this job across multiple threads, you need to consider that the time spent on context switching and thread creation/deletion will be an important factor in the overall time.
As bruno mentioned, your webmethod is a very quick operation. As an experiment, try ensuring that your HelloWorld method takes a bit longer. Throw in a Thread.Sleep(1000) before you return the random double. This will make it more likely that your service is actually forced to process requests in parallel.
Then try your client with different amounts of threads, and see how the performance differs.
Try to use some processor consuming task instead of Thread.Sleep. Actually combined approach is the best.
Sleep will just pass thread's time frame to another thread.
IIS AppPool "Maximum Worker Processes" is set to 1 by default. For some reason, each worker process is limited to process 10 service calls at a time. My WCF async server-side function does Sleep(10*1000); only.
This is what happens when Maximum Worker Processes = 1
http://s4.postimg.org/4qc26cc65/image.png
alternatively
http://i.imgur.com/C5FPbpQ.png?1
(First post on SO, I need to combine all pictures into one picture.)
The client is making 48 async WCF WS calls in this test (using 16 processes). Ideally this should take ~10 seconds to complete (Sleep(10000)), but it takes 52 seconds. You can see 5 horizontal lines in the perfmon picture (link above), which monitors Web Service Current Connections on the server. Each horizontal line lasts 10 seconds (the duration of Sleep(10000)). There are 5 horizontal lines because the server processes 10 calls at a time and then closes those 10 connections (this happens 5 times to process 48 calls). Completion of all calls took 52 seconds.
After setting Maximum Worker Processes = 2
(in the same picture given above)
This time there are 3 horizontal lines because the server processes 20 calls at a time and then closes those 20 connections (this happens 3 times to process 48 calls). Took 33 seconds.
After setting Maximum Worker Processes = 3
(in the same picture given above)
This time there are 2 horizontal lines because the server processes 30 calls each time. (happens 2 times to process 48 calls) Took 24 seconds.
After setting Maximum Worker Processes = 8
(in the same picture given above)
This time there is 1 horizontal line because the server processes 80 calls each time. (happens once to process 48 calls) Took 14 seconds.
If you don't account for this, your parallel (async or threaded) client calls will be queued in batches of 10 on the server, so any calls beyond the first 10 won't be processed in parallel.
PS: I was using Windows 8 x64 with IIS 8.5. The 10 concurrent request limit applies to workstation Windows OSes. Server OSes don't have that limit according to another post on SO (I can't give the link due to rep < 10).
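The batch counts reported above follow from simple ceiling division, which the small sketch below reproduces. It assumes, as observed in the post, 10 concurrent calls per worker process and 10 seconds per call; BatchEstimate, Waves and MinSeconds are names invented for the illustration, and the computed times are lower bounds that ignore connection overhead.

```csharp
using System;

// Hypothetical calculation helper for illustration only.
public static class BatchEstimate
{
    // Number of sequential batches needed when only 'concurrentLimit' calls run at once.
    public static int Waves(int totalCalls, int concurrentLimit)
    {
        return (totalCalls + concurrentLimit - 1) / concurrentLimit; // ceiling division
    }

    // Lower bound on the total time, ignoring connection setup and other overhead.
    public static int MinSeconds(int totalCalls, int concurrentLimit, int secondsPerCall)
    {
        return Waves(totalCalls, concurrentLimit) * secondsPerCall;
    }

    public static void Main()
    {
        foreach (int workers in new[] { 1, 2, 3, 8 })
        {
            int limit = workers * 10; // 10 calls per worker process, as observed in the post
            Console.WriteLine("{0} worker(s): {1} batch(es), at least {2} s",
                workers, Waves(48, limit), MinSeconds(48, limit, 10));
        }
    }
}
```

For 48 calls this gives 5, 3, 2 and 1 batches, or at least 50, 30, 20 and 10 seconds, which lines up with the measured 52, 33, 24 and 14 seconds.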
