I know about tools for load testing a WCF service, so please don't suggest I use one.
If I wanted to write a tool for calling a service (or invoking any action really) X times a second for Y seconds, what things do I need to consider?
My initial approach would be to have a timer fire at the required interval and create a task when it fires, but I'm concerned that this will simply queue a lot of tasks up waiting for threads from the thread pool to execute on and they will not invoke the service at the required times.
Would creating individual threads to do the work be better? Then I'd be concerned about creating a large number of threads.
So what strategies can I use?
It depends on your scalability goals. Running a single session per thread is easier - the code for a series of synchronous I/O transactions on one thread is much simpler. You can spawn thousands of threads on Windows and Linux if the system is tuned properly. If you need to scale much further than that, you'll need to use asynchronous I/O APIs and set up pools of threads to service groups of those I/O channels. I'd suggest making the thread-to-I/O-channel ratio configurable and monitoring the idle time of the threads in those pools... perhaps even allowing the pools to add more threads when needed.
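A rough sketch of the thread-per-session approach (CallService is a hypothetical stand-in for one synchronous WCF invocation; the pacing is approximate):

    using System;
    using System.Threading;

    static class LoadSession
    {
        // Pace synchronous calls at roughly callsPerSecond for durationSeconds.
        public static void RunSession(int callsPerSecond, int durationSeconds)
        {
            TimeSpan interval = TimeSpan.FromSeconds(1.0 / callsPerSecond);
            DateTime end = DateTime.UtcNow.AddSeconds(durationSeconds);
            DateTime next = DateTime.UtcNow;

            while (DateTime.UtcNow < end)
            {
                CallService();                     // one synchronous invocation on this thread
                next += interval;
                TimeSpan sleep = next - DateTime.UtcNow;
                if (sleep > TimeSpan.Zero)         // sleep to hold the target rate,
                    Thread.Sleep(sleep);           // or skip the sleep if we're running behind
            }
        }

        static void CallService()
        {
            // hypothetical: invoke the WCF client (or any action) here
        }
    }

    // Usage: one thread per simulated session.
    // for (int i = 0; i < sessionCount; i++)
    //     new Thread(() => LoadSession.RunSession(10, 60)).Start();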
Related
I create about 5000 background workers that do intensive work in a console app. I'm also using an external library that instantiates an object, say ObjectX. At some point, say t0, ObjectX tries to obtain a thread from an OS thread pool and start it, but I have no control over how it obtains this thread. Things work fine with 100 background workers. With 1000 background workers, it takes about 10 minutes after t0 for ObjectX to obtain and start a thread.
Is there a way to set, in advance, a high priority for any threads that will be started in the future by an object?
As I think the answer to 1 is "no", is there a way to limit the priority of the background workers so as to somehow favor everything else? Even though I only want to 'favor' ObjectX.
The goal would be to always have available resources to run the thread launched by ObjectX, no matter how overloaded the machine is.
I'm using C# and .NET Framework 3.5, on a 64-bit Windows machine.
The way threads work is that the OS gives them processor time; switching the CPU from one thread to another is called a context switch. A context switch takes about 2000-8000 cycles (i.e., depending on the processor, 2000-8000 instructions). If the OS has many CPUs or cores, it may not need to take the CPU away from one thread and give it to another, avoiding a context switch. Only one thread per CPU can run at a time; when you have more threads that need CPU than you have CPUs, you force context switches. Context switches are performed no faster than the system quantum (every 20 ms for client versions of Windows and 120 ms for server).
If you have 5000 background workers, you effectively have 5000 threads, each potentially vying for CPU time. On a client version of Windows, that means up to 250,000 context switches per second, i.e. 500,000,000 to 2,000,000,000 cycles per second devoted simply to switching between threads (over and above the work your threads are performing) - if the system could even process that many context switches per second.
The recommended practice is to only have one CPU-bound thread per processor. A CPU-bound thread is one that spends very little time "waiting". The UI thread is not a CPU-bound thread. If your background workers are spending a lot of time waiting for locks, then they may not be CPU-bound either--but, in general, background worker threads are CPU-bound. (otherwise, what would be the point of using a background worker?).
Also, the OS spends a lot of time figuring out what thread needs to get the CPU next. When you start changing thread priorities you interfere with that and most of the time end up making your entire system slower (not just your application) rather than faster.
Update:
On a related note, it takes about 200,000 cycles to create a new thread and about 100,000 cycles to destroy one.
Update 2:
If the impetus of the question isn't simply "can it be done" but how to scale the workload, then, as @JoshW and @Servy mention, something like the producer/consumer pattern would allow for scalability and could even facilitate horizontal scaling across multiple computers/nodes via a queue or a service bus. Simply starting up an inordinate number of threads does not scale beyond the number of CPUs. What you say you truly want - "available resources... no matter how overloaded the machine is" - is simply impossible; what is possible is an architecture that can scale out.
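A rough single-machine sketch of that pattern, assuming .NET 4's BlockingCollection (the item type and ProcessItem are placeholders):

    using System;
    using System.Collections.Concurrent;
    using System.Threading;

    class WorkQueueDemo
    {
        static void ProcessItem(int item)
        {
            // the intensive work one "background worker" used to do
        }

        static void Main()
        {
            var queue = new BlockingCollection<int>(boundedCapacity: 1000);
            int consumerCount = Environment.ProcessorCount;   // a few threads, not 5000

            var consumers = new Thread[consumerCount];
            for (int i = 0; i < consumerCount; i++)
            {
                consumers[i] = new Thread(() =>
                {
                    foreach (var item in queue.GetConsumingEnumerable())
                        ProcessItem(item);
                });
                consumers[i].Start();
            }

            for (int i = 0; i < 5000; i++)     // the 5000 workers become queued work items
                queue.Add(i);
            queue.CompleteAdding();            // signals consumers to finish and exit

            foreach (var t in consumers)
                t.Join();
        }
    }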
Personally I think this is a bad idea, however... given the comments you have made on other answers and your request that "no matter how many background workers are created, ObjectX runs as soon as possible"... you could conceivably force your background workers to block using a ManualResetEvent.
For example, at the top of your worker code you could block on a ManualResetEvent with the WaitOne method. The event could be static or passed in as a parameter. Wherever ObjectX gets instantiated/called, you call the Reset method on the ManualResetEvent, which blocks all your workers at the WaitOne line. Then, at the bottom of the code that runs ObjectX, call ManualResetEvent.Set() to unblock the workers.
Note this is NOT an efficient way to manage your threads, but if you "just have to make it work" and have time later to improve it... I suppose it's one possible solution.
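A rough sketch of that gating, assuming a single shared static event (all names here are placeholders):

    using System.Threading;

    static class WorkerGate
    {
        // Open by default; workers only block while ObjectX is running.
        static readonly ManualResetEvent Gate = new ManualResetEvent(true);

        // At the top of each background worker's body:
        public static void WorkerBody()
        {
            Gate.WaitOne();              // new workers park here while the gate is closed
            // ... the worker's normal intensive processing ...
        }

        // Around the code that instantiates/runs ObjectX:
        public static void RunObjectX()
        {
            Gate.Reset();                // close the gate so workers stop competing for threads
            try
            {
                // create and start ObjectX here
            }
            finally
            {
                Gate.Set();              // reopen the gate and release the blocked workers
            }
        }
    }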
The goal would be to always have available resources to run the thread launched by ObjectX, no matter how overloaded the machine is.
Then thread priorities might not be the right tool. Remember: thread priorities are evil.
In general, Windows is not a real-time OS; in particular, Win32 does not even attempt to be soft real-time (IIRC, the NT kernel at some point tried to have at least some support for soft real-time subsystems, but I may be wrong). So there is no guarantee about available resources or timing.
Also, are you worried about other threads in the system? Those threads are out of your control (what if the other threads are already at the system max priority?).
If you are worried about threads in your own app, you can control and throttle them, using fewer threads/workers to do more work (batching work into bigger units and submitting it to a worker, for example, or by using the TPL or other tools that will handle and throttle thread usage for you).
That said, you could intercept when a thread is created (see, for example, this answer: https://stackoverflow.com/a/3802316/863564), check whether it was created for ObjectX (for example, by checking its name) and use SetThreadPriority to boost it.
I need to optimize a WCF service... it's quite a complex thing. My problem this time has to do with tasks (Task Parallel Library, .NET 4.0). What happens is that I launch several tasks when the service is invoked (using Task.Factory.StartNew) and then wait for them to finish:
Task.WaitAll(task1, task2, task3, task4, task5, task6);
Ok... what I see, and don't like, is that on the first call (sometimes the first 2-3 calls, if made quickly one after another), the final task starts much later than the others (I am looking at a case where it started 0.5 seconds after the others). I tried calling
ThreadPool.SetMinThreads(12*Environment.ProcessorCount, 20);
at the beginning of my service, but it doesn't seem to help.
The tasks are all database-related: I'm reading from multiple databases and it has to take as little time as possible.
Any idea why the last task is taking so long? Is there something I can do about it?
Alternatively, should I use the thread pool directly? As it happens, in one case I'm looking at, one task had already ended before the last one started - I would have saved 0.2 seconds if I had reused that thread instead of waiting for a new one to be created. However, I cannot be sure that that task will always end so quickly, so I can't put both requests in the same task.
[Edit] The OS is Windows Server 2003, so there should be no connection limit. Also, it is hosted in IIS - I don't know whether I should create regular threads or use the thread pool - which is preferred?
[Edit] I've also tried using Task.Factory.StartNew(action, TaskCreationOptions.LongRunning); - it doesn't help; the last task still starts much later (around half a second) than the rest.
[Edit] MSDN says:
The thread pool has a built-in delay (half a second in the .NET Framework version 2.0) before starting new idle threads. If your application periodically starts many tasks in a short time, a small increase in the number of idle threads can produce a significant increase in throughput. Setting the number of idle threads too high consumes system resources needlessly.
However, as I said, I'm already calling SetMinThreads and it doesn't help.
I have had problems myself with delays in thread startup when using the (.NET 4.0) Task object, so for time-critical work I now use dedicated threads (...again, as that is what I was doing before .NET 4.0).
The purpose of a thread pool is to avoid the operating-system cost of starting and stopping threads; the threads are simply reused. This is a common model found in, for example, internet servers. The advantage is that they can respond more quickly.
I've written many applications where I implement my own thread pool by having dedicated threads pick up tasks from a task queue. Note, however, that this usually requires locking, which can cause delays/bottlenecks. It depends on your design: if the tasks are small, there will be a lot of locking, and it might be faster to trade some CPU for less locking: http://www.boyet.com/Articles/LockfreeStack.html
SmartThreadPool is a replacement/extension of the .Net thread pool. As you can see in this link it has a nice GUI to do some testing: http://www.codeproject.com/KB/threads/smartthreadpool.aspx
In the end it depends on what you need, but for high performance I recommend implementing your own thread pool. If you see a lot of thread idling, it can be beneficial to increase the number of threads (beyond the recommended CPU count * 2). This is actually how Hyper-Threading works inside the CPU - it uses the "idle" time within one operation to do other operations.
Note that .NET has a built-in limit of 25 threads per process (i.e., for all WCF calls you receive simultaneously). This limit is independent of, and overrides, the ThreadPool setting. It can be increased, but it requires some magic: http://www.csharpfriends.com/Articles/getArticle.aspx?articleID=201
Following from my prior question (yep, should have been a Q against original message - apologies):
Why do you feel that creating 12 threads for each processor core in your machine will in some way speed-up your server's ability to create worker threads? All you're doing is slowing your server down!
As per the MSDN docs: "You can use the SetMinThreads method to increase the minimum number of threads. However, unnecessarily increasing these values can cause performance problems. If too many tasks start at the same time, all of them might appear to be slow. In most cases, the thread pool will perform better with its own algorithm for allocating threads. Reducing the minimum to less than the number of processors can also hurt performance."
Issues like this are usually caused by bumping into limits or contention on a shared resource.
In your case, I am guessing that your last task(s) block while they wait for a connection to the DB server to become available or for the DB to respond. Remember - if your invocation kicks off 5-6 other tasks, your machine has to create and open numerous DB connections and is going to hit the DB with, potentially, a lot of work. If your WCF server and/or your DB server are cold, your first few invocations are going to be slower until the machines' caches etc. are populated.
Have you tried adding a little tracing/logging, using a Stopwatch to time how long it takes your tasks to connect to the DB server and then execute their operations?
You may find that reducing the number of concurrent tasks you kick off actually speeds things up. Try spawning 3 tasks at a time, waiting for them to complete and then spawn the next 3.
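A rough sketch of that batching, assuming the six database operations are available as Action delegates (DoQuery1 etc. are hypothetical; requires System.Collections.Generic and System.Threading.Tasks):

    Action[] actions = { DoQuery1, DoQuery2, DoQuery3, DoQuery4, DoQuery5, DoQuery6 };

    for (int i = 0; i < actions.Length; i += 3)
    {
        var batch = new List<Task>();
        for (int j = i; j < i + 3 && j < actions.Length; j++)
            batch.Add(Task.Factory.StartNew(actions[j]));

        Task.WaitAll(batch.ToArray());   // finish this batch before starting the next three
    }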
When you call Task.Factory.StartNew, it uses a TaskScheduler to map those tasks into actual work items.
In your case, it sounds like one of your Tasks is delaying occasionally while the OS spins up a new Thread for the work item. You could, potentially, build a custom TaskScheduler which already contained six threads in a wait state, and explicitly used them for these six tasks. This would allow you to have complete control over how those initial tasks were created and started.
That being said, I suspect there is something else at play here... You mentioned that using TaskCreationOptions.LongRunning demonstrates the same behavior. This suggests that there is some other factor at play causing this half second delay. The reason I suspect this is due to the nature of TaskCreationOptions.LongRunning - when using the default TaskScheduler (LongRunning is a hint used by the TaskScheduler class), starting a task with TaskCreationOptions.LongRunning actually creates an entirely new (non-ThreadPool) thread for that Task. If creating 6 tasks, all with TaskCreationOptions.LongRunning, demonstrates the same behavior, you've pretty much guaranteed that the problem is NOT the default TaskScheduler, since this is going to always spin up 6 threads manually.
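If you want to experiment with the custom-scheduler idea, a rough (not production-ready) sketch of a scheduler with a fixed set of pre-started threads might look like this:

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Threading;
    using System.Threading.Tasks;

    // Keeps a fixed number of dedicated threads alive, so tasks queued to it
    // never wait for the ThreadPool to grow.
    public sealed class DedicatedThreadTaskScheduler : TaskScheduler, IDisposable
    {
        private readonly BlockingCollection<Task> _tasks = new BlockingCollection<Task>();

        public DedicatedThreadTaskScheduler(int threadCount)
        {
            for (int i = 0; i < threadCount; i++)
            {
                var thread = new Thread(() =>
                {
                    foreach (var task in _tasks.GetConsumingEnumerable())
                        TryExecuteTask(task);        // run each queued task on this dedicated thread
                });
                thread.IsBackground = true;
                thread.Start();
            }
        }

        protected override void QueueTask(Task task)
        {
            _tasks.Add(task);
        }

        protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
        {
            return false;                            // never run inline; always use the dedicated threads
        }

        protected override IEnumerable<Task> GetScheduledTasks()
        {
            return _tasks.ToArray();
        }

        public void Dispose()
        {
            _tasks.CompleteAdding();                 // lets the dedicated threads exit
        }
    }

    // Usage (hypothetical): pass the scheduler to StartNew so all six tasks start immediately.
    // var scheduler = new DedicatedThreadTaskScheduler(6);
    // var task1 = Task.Factory.StartNew(DoWork1, CancellationToken.None,
    //                                   TaskCreationOptions.None, scheduler);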
I'd recommend running your code through a performance profiler, and potentially the Concurrency Visualizer in VS 2010. This should help you determine exactly what is causing the half second delay.
What is the OS? If you are not running the server versions of windows, there is a connection limit. Your many threads are probably being serialized because of the connection limit.
Also, I have not used the task parallel library yet, but my limited experience is that new threads are cheap to make in the context of networking.
These articles might explain the problem you're having:
http://blogs.msdn.com/b/wenlong/archive/2010/02/11/why-are-wcf-responses-slow-and-setminthreads-does-not-work.aspx
http://blogs.msdn.com/b/wenlong/archive/2010/02/11/why-does-wcf-become-slow-after-being-idle-for-15-seconds.aspx
Seeing as you're using .NET 4, the first article probably doesn't apply, but as the second article points out, the ThreadPool terminates idle threads after 15 seconds, which might explain the problem you're having, and it offers a simple (though a little hacky) solution to get around it.
Whether or not you use the ThreadPool directly shouldn't make any difference, as I suspect the task library is using it underneath anyway.
One third-party library we have been using for a while might help you here - Smart Thread Pool. You still get the same benefits of using the task libraries, in that you can have the return values from the threads and get any exception information from them too.
Also, you can instantiate multiple thread pools, so that when you have several places each needing a pool, a low-priority process doesn't start eating into the quota of a high-priority one. And, oh yeah, you can set the priority of the threads in the pool too, which you can't do with the standard ThreadPool, where all the threads are background threads.
You can find plenty of info on the codeplex page, I've also got a post which highlights some of the key differences:
http://theburningmonk.com/2010/03/threading-introducing-smartthreadpool/
Just on a side note, for tasks like the one you've mentioned, which might take some time to return, you probably shouldn't be using the thread pool anyway. It's recommended to avoid using the thread pool for blocking tasks like that, because it hogs the thread pool, which the framework classes use for all sorts of things - handling timer events, etc. (not to mention handling incoming WCF requests!). I feel like I'm spamming here, but here's some of the info I've gathered around the use of the thread pool, with some useful links at the bottom:
http://theburningmonk.com/2010/03/threading-using-the-threadpool-vs-creating-your-own-threads/
well, hope this helps!
I'm making a multi-threaded application using delegates to handle the processing of requests in a WCF service. I want clients to be able to send a request, then disconnect and wait for a callback announcing that the work (most likely a database search) is done. I don't know how many requests may come in at once; it could be one every once in a while, or it could spike to dozens.
As far as I know, .Net's threadpool has 25 threads available to use. What happens when I spawn 25 delegates or more? Does it throw an error, does it wait, does it pause an existing operation and start working on the new delegate, or some other behavior?
Beyond that, what happens if I want to spawn up to or more than 25 delegates while other operations (such as incoming/outgoing connections) want to start, and/or when another operation is working and I want to spawn another delegate?
I want to make sure this is scalable without being too complex.
Thanks
All operations are queued (I am assuming that you are using the threadpool directly or indirectly). It is the job of the threadpool to munch through the queue and dispatch operations onto threads. Eventually all threads may become busy, which will just mean that the queue will grow until threads are free to start processing queued work items.
You're confusing delegates with threads, and number of concurrent connections.
With WCF 2-way bindings, the connection remains open while waiting for the callback.
IIS 7 or above, on modern hardware should have no difficulty maintaining a few thousand concurrent connections if they're sitting idle.
Delegates are just method pointers - you can have as many as you wish. That doesn't mean they're being invoked concurrently.
If you are using ThreadPool.QueueUserWorkItem then it just queues the extra items until a thread is available.
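For example (ProcessRequest is a hypothetical placeholder), queuing far more items than there are pool threads is fine; the extras simply wait their turn:

    // Queue 100 work items; the pool runs as many as it has threads for and
    // the rest wait in the queue until a thread frees up.
    for (int i = 0; i < 100; i++)
    {
        int requestId = i;                       // capture a copy for the closure
        ThreadPool.QueueUserWorkItem(state =>
        {
            ProcessRequest(requestId);           // hypothetical per-request work
        });
    }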
The ThreadPool's default maximum number of threads is 250, not 25! You can still set a higher limit for the ThreadPool if you need to.
If your ThreadPool runs out of threads, two things may happen: all operations are queued until the next resource becomes available, and if there are finished threads that are still nominally "in use", the GC will eventually free some of them up, providing you with new resources.
However you can also create Threads not using the ThreadPool.
I know there are many cases where multi-threading is a good fit in an application, but when is it best to multi-thread a .NET web application?
A web application is almost certainly already multi-threaded by the hosting environment (IIS etc.). If your page is CPU-bound (and you want to use multiple cores), then multiple threads are arguably a bad idea, because when your system is under load you are already using them.
The time it might help is when you are I/O-bound; for example, you have a web page that needs to talk to 3 external web services, talk to a database, and write a file (all unrelated). You can do those in parallel on different threads (ideally using the built-in async operations, to maximise completion-port usage) to reduce the overall processing time - all without much impact on local CPU (here the real delay is on the network).
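A rough sketch of that idea using tasks (the three helper methods are hypothetical; for brevity this blocks pool threads rather than using the true asynchronous Begin/End APIs mentioned above):

    // Kick off the unrelated I/O-bound operations in parallel...
    var serviceCalls = Task.Factory.StartNew(() => CallExternalServices());  // hypothetical
    var dbQuery      = Task.Factory.StartNew(() => QueryDatabase());         // hypothetical
    var fileWrite    = Task.Factory.StartNew(() => WriteReportFile());       // hypothetical

    // ...then wait for all of them; the total time is roughly the slowest call,
    // not the sum of all three.
    Task.WaitAll(serviceCalls, dbQuery, fileWrite);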
Of course, in such cases you might also do better by simply queuing the work in the web application, and having a separate service dequeue and process them - but then you can't provide an immediate response to the caller (they'd need to check back later to verify completion etc).
IMHO you should avoid the use of multithreading in a web-based application.
Maybe a multithreaded design could increase the performance of a standard app (with the right design), but in a web application you probably want to maintain high throughput rather than raw speed.
But if you only have a few concurrent connections, you may be able to use parallel threads without a global performance degradation.
Multithreading is a technique to give a single process more processing time so it runs faster. It has more threads, thus it eats more CPU cycles (from multiple CPUs, if you have any). For a desktop application this makes a lot of sense, but granting more CPU cycles to one web user would take those same cycles away from the 99 other users making requests at the same time! So, technically, it's a bad thing.
However, a web application might use other services and processes that are using multiple threads. Databases, for example, won't create a separate thread for every user that connects to them. They limit the number of threads to just a few, adding connections to a connection pool for faster usage. As long as there are connections available or pooled, the user will have database access. When the database runs out of connections, the user will have to wait.
So, basically, threads can be used in web applications to limit the number of active users of a resource at a given moment. This allows the system to share a resource among many users without overloading it; users simply have to stand in line until it's their turn.
This would not be multi-threading in the web application itself, but multi-threading in a service that is consumed by the web application. In this case, it's used as a limitation by only allowing a small amount of threads to be active.
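A rough illustration of that "stand in line" behaviour, using a semaphore to cap concurrent access to a hypothetical expensive resource:

    // At most 10 requests may use the resource at once (names are hypothetical).
    static readonly Semaphore ResourceGate = new Semaphore(10, 10);

    static string QueryExpensiveResource(string request)
    {
        ResourceGate.WaitOne();          // wait here if 10 requests are already inside
        try
        {
            return RunQuery(request);    // placeholder for the real database call
        }
        finally
        {
            ResourceGate.Release();      // let the next waiting request in
        }
    }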
In order to benefit from multithreading, your application has to do a significant amount of work that can run in parallel. If this is not the case, the overhead of multithreading may well outweigh the benefits.
In my experience most web applications consist of a number of short running methods, so apart from the parallelism already offered by the hosting environment, I would say that it is rare to benefit from multithreading within the individual parts of a web application. There are probably examples where it will offer a benefit, but my guess is that it isn't very common.
ASP.NET is already capable of spawning several threads for processing several requests in parallel, so for simple request processing there is rarely a case when you would need to manually spawn another thread. However, there are a few uncommon scenarios that I have come across which warranted the creation of another thread:
If there is some operation that might take a while and can run in parallel with the rest of the page processing, you might spawn a secondary thread there. For example, if there is a web service you have to poll as a result of the request, you might spawn another thread in Page_Init and check for results in Page_PreRender (waiting if necessary). It's still questionable whether this would be a performance benefit: spawning a thread isn't cheap, and the time between a typical Page_Init and Page_PreRender is measured in milliseconds anyway. Keeping a thread pool for this might be a little more efficient, and ASP.NET also has something called "asynchronous pages" that might be even better suited to this need.
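A rough sketch of that scenario (the web-service call and the ResultLabel control are hypothetical; ResultLabel is assumed to be a Label declared in the page markup):

    using System;
    using System.Threading.Tasks;
    using System.Web.UI;

    public partial class ReportPage : Page
    {
        private Task<string> _serviceCall;

        protected void Page_Init(object sender, EventArgs e)
        {
            // Start the slow external call as early in the page lifecycle as possible.
            _serviceCall = Task.Factory.StartNew(() => CallSlowWebService());
        }

        protected void Page_PreRender(object sender, EventArgs e)
        {
            // Block only for whatever time is still remaining, then use the result.
            ResultLabel.Text = _serviceCall.Result;
        }

        private string CallSlowWebService()
        {
            // hypothetical placeholder for the real web-service poll
            return "done";
        }
    }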
If there is a pool of resources that you wish to clean up periodically. For example, imagine that you are using some weird DBMS that comes with limited .NET bindings, but there is no pooling support (this was my case). In that case you might want to implement the DB connection pool yourself, and this would necessitate a "cleaner thread" which would wake up, say, once a minute and check if there are connections that have not been used for a long while (and thus can be closed off).
Another thing to keep in mind when implementing your own threads in ASP.NET - ASP.NET likes to kill off its processes if they have been inactive for a while. Thus you should not rely on your thread staying alive forever. It might get terminated at any moment and you better be ready for it.
When using APIs that handle asynchronous events in .NET, I find myself unable to predict how the library will scale to large numbers of objects.
For example, using the Microsoft.Office.Interop.UccApi library, when I create an endpoint it gets events when phone events happen. Now let's say I want to create 1000 endpoints. The number of events per endpoint is small, but can whatever is happening behind the scenes in the API keep up with the event flow? I don't know, because it never says how it's architected.
Let's say I want to create all 1000 objects in the main thread. Then I want to put the Login method into a large thread pool so all objects login in parallel. Then once all the objects have logged in the next phase will begin.
Are the event callbacks the API raises happening in the original creating thread? A separate threadpool? Or the same threadpool I'm accessing with ThreadPool.QueueUserWorkItem?
Would I be better off putting each object in its own thread? Grouping a few objects per thread? Or is it fine just creating all 1000 objects in the main thread and, through .NET magic, it will all be OK?
Thanks
The events from interop assemblies are just wrappers around the COM connection points. The thread on which the call from the connection point arrives depends on the threading model of the object that advised on that connection point. COM will ensure the proper thread switching for this.
If your objects are implemented on the main thread, which in .Net is usually an STA, all events should arrive on that same thread. If you want your calls to arrive on a random thread from the COM thread pool (which I think is the same as the CLR thread pool), you need to create your objects on a thread that is configured as an MTA.
I would strongly advise against creating a thread for each object: 1) if you create these threads as STA, each of them will have a message queue, wasting system resources; 2) if you create them as MTA, nothing guarantees the event call will arrive on your thread; 3) you'll have 1000 idle threads doing nothing, just waiting on an event to shut down; and 4) starting up and shutting down all these threads will have a terrible perf cost on your application.
It really depends on a lot of things, primarily how powerful your hardware is. The threadpool does have a certain number of threads (which you can increase) that it will make available for your application. So if all of your events are firing at the same time some will most likely be waiting for a few moments while your threadpool waits for threads to become free again. The tradeoff is that you don't have the performance hit of creating new threads all the time either. Probably creating 1000 threads isn't the right answer either.
It may turn out that this is ideal, both because of the performance gains from reusing threads and because having 1000 threads all running simultaneously might cost more memory/CPU than it's worth.
I just wanted to note that in .NET 2.0 and later it's possible to programmatically increase the maximum number of threads in the thread pool using ThreadPool.SetMaxThreads(). Given this, you can put a hard cap on the number of threads and so ensure the scheduler won't be brought to its knees by the overhead.
Even more useful in this sort of case, you can set the minimum number of threads with ThreadPool.SetMinThreads(). With this you can ensure that you only pay the "horrible performance price" Franci is talking about once, at application startup. You can balance this against the expected peak number of users to ensure you won't be creating tons of new threads.
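For illustration, a sketch of pinning the pool size at application startup (the numbers are placeholders, not recommendations; requires System.Threading):

    // Illustrative only: cap the pool and pre-warm a minimum at application startup.
    int workerMax = 200, ioMax = 200;                 // hard cap on pool growth
    ThreadPool.SetMaxThreads(workerMax, ioMax);

    int workerMin = Environment.ProcessorCount * 4;   // pay the thread-creation cost up front
    ThreadPool.SetMinThreads(workerMin, workerMin);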
A single new thread creation won't destroy you. What I would be worried about is the case where a lot of threads need to be created at the same time. If you can say that this will only happen at startup you would be golden.