What resources do blocked threads take up?


One of the main purposes of writing code in the asynchronous programming model (more specifically - using callbacks instead of blocking the thread) is to minimize the number of blocking threads in the system.
For running threads, this goal is obvious because of context-switch and synchronization costs.
But what about blocked threads? Why is it so important to reduce their number?
For example, when waiting for a response from a web server, a thread is blocked; it doesn't take up any CPU time and does not participate in any context switch.
So my question is:
Other than RAM (about 1MB per thread?), what other resources do blocked threads take up?
And another, more subjective question:
In what cases will this cost really justify the hassle of writing asynchronous code (the price could be, for example, splitting your nice coherent method into lots of BeginXXX and EndXXX methods, and moving parameters and local variables to class fields)?
UPDATE - additional reasons I didn't mention or didn't give enough weight to:
More threads means more locking on communal resources
More threads means more creation and disposing of threads which is expensive
The system can definitely run out of threads/RAM and then stop servicing clients (in a web server scenario this can actually bring down the service)

So my question is: other than RAM (about 1MB per thread?), what other resources do blocked threads take up?
This is one of the largest ones. That being said, there's a reason that the ThreadPool in .NET allows so many threads per core - in 3.5 the default was 250 worker threads per core in the system. (In .NET 4, it depends on system information such as virtual address size, platform, etc. - there isn't a fixed default now.) Threads, especially blocked threads, really aren't that expensive...
However, I would say, from a code management standpoint, it's worth reducing the number of blocked threads. Every blocked thread is an operation that should, at some point, return and become unblocked. Having many of these means you have quite a complicated set of code to manage. Keeping this number reduced will help keep the code base simpler - and more maintainable.
And another more subjective question: In what cases will this cost really justify the hassle of writing asynchronous code (the price could be, for example, splitting your nice coherent method to lots of beginXXX and EndXXX methods, and moving parameters and local variables to be class fields).
Right now, it's often a pain. It depends a lot on the scenario. The Task<T> class in .NET 4 drastically improves this for many scenarios, however. Using the TPL, it's much less painful than it was previously using the APM (BeginXXX/EndXXX) or even the EAP.
This is why the language designers are putting so much effort into improving this situation in the future. Their goals are to make async code much simpler to write, in order to allow it to be used more frequently.

Aside from any resources the blocked thread might hold a lock on, thread pool size is also a consideration. If you have reached the maximum thread pool size (the default maximum depends on the .NET version and platform; ThreadPool.GetMaxThreads reports the value in effect), you simply won't be able to get anything else to run on the thread pool until at least one thread is freed up.
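To see how close a process is to the pool limit discussed above, the runtime can be asked directly. This is a minimal sketch, assuming a modern .NET runtime with C# top-level statements; the variable names are illustrative, and the numbers printed will differ from machine to machine.

```csharp
using System;
using System.Threading;

// Read the limits the runtime is currently using for the thread pool.
ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIo);
ThreadPool.GetMinThreads(out int minWorkers, out int minIo);
ThreadPool.GetAvailableThreads(out int freeWorkers, out int freeIo);

// Workers currently in use = maximum minus what is still available.
int busyWorkers = maxWorkers - freeWorkers;

Console.WriteLine($"worker threads: min {minWorkers}, max {maxWorkers}, busy {busyWorkers}");
Console.WriteLine($"I/O completion threads: min {minIo}, max {maxIo}, free {freeIo}");
```

When busyWorkers approaches maxWorkers, newly queued work items wait until a blocked thread is released.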

I would like to point out that the 1MB figure for stack memory (or 256KB, or whatever it's set to) is a reserve; while it does take away from available address space, the actual memory is only committed as it's needed.
On the other hand, having a very large number of threads is bound to bog down the task scheduler somewhat as it has to keep track of them (which have become runnable since the last tick, and so on).
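The stack reservation mentioned above can be chosen per thread through the Thread constructor overload that takes a maximum stack size. The following is only a sketch under assumed values (50 threads, 256KB each, both picked for illustration); it parks every thread on an event, which is what a blocked thread looks like from the scheduler's point of view, and then releases them all.

```csharp
using System;
using System.Threading;

const int threadCount = 50;          // illustrative value
const int stackBytes = 256 * 1024;   // request a 256KB stack instead of the default

using var release = new ManualResetEventSlim(false);
int woken = 0;
var threads = new Thread[threadCount];

for (int i = 0; i < threadCount; i++)
{
    // Each thread blocks immediately and consumes no CPU time while it waits.
    threads[i] = new Thread(() =>
    {
        release.Wait();
        Interlocked.Increment(ref woken);
    }, stackBytes);
    threads[i].Start();
}

release.Set();                        // unblock every waiting thread
foreach (var t in threads) t.Join();

Console.WriteLine($"{woken} threads were blocked and then released");
```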

Related

Conditions to use async-methods in c# .net-core web-apis

I'm implementing several small services, each of which uses entity-framework to store certain (but little) data. They also have a fair bit of business-logic so it makes sense to separate them from one another.
I'm certainly aware that async-methods and the async-await pattern itself can solve many problems in regards to performance especially when it comes to any I/O or cpu-intensive operations.
I'm uncertain whether to use the async methods of Entity Framework (e.g. SaveChangesAsync or FirstOrDefaultAsync), because I can't find metrics that say "now you do it, and now you don't" apart from "Is it I/O- or CPU-intensive or not?".
What I've found when researching this topic (not limited to these, but they show the problem):
Not using it can lead to your application becoming unresponsive, because the threads (not the ones of the CPU, but the virtual threads of the OS) can run out due to the blocking I/O calls to the database.
Using it bloats your code and decreases performance because of the context switches at every method. Especially when I apply it to Entity Framework calls, it means that I have at least three context switches for one call from controller to business logic to repository to database.
What I don't know, and what I would like to know from you:
How many virtual OS threads are there? Or to be more precise: if I expect my application and server to be able to handle 100 requests to this service within five seconds (and I don't expect more; 100 is already exaggerated), should I back away from using async/await there?
What are the precise metrics that I could look at to answer this question for any of my services?
Or should I rather always use async-methods for I/O calls because they are already there and it could always happen that the load-situation on my server changes and there's so much going on that the async-methods would help me a great deal with that?

I'm certainly aware that async-methods and the async-await pattern itself can solve many problems in regards to performance especially when it comes to any I/O or cpu-intensive operations.
Sort of. The primary benefit of asynchronous code is that it frees up threads. UI apps (i.e., desktop/mobile) manifest this benefit in more responsive user interfaces. Services such as the ones you're writing manifest this benefit in better scalability - the performance benefits are only visible when under load. Also, services only receive this benefit from I/O operations; CPU-bound operations require a thread no matter what, so using await Task.Run on service applications doesn't help at all.
not using it can lead to your application becoming unresponsive, because the threads (not the ones of the CPU, but the virtual threads of the OS) can run out due to the blocking I/O calls to the database.
Yes. More specifically, the thread pool has a limited injection rate, so it can only grow so far so quickly. Asynchrony (freeing up threads) helps your service handle bursty traffic and heavy load. Quote:
Bear in mind that asynchronous code does not replace the thread pool. This isn’t thread pool or asynchronous code; it’s thread pool and asynchronous code. Asynchronous code allows your application to make optimum use of the thread pool. It takes the existing thread pool and turns it up to 11.
Next question:
using it bloats your code and decreases performance because of the context-switches at every method.
The main performance drawback to async is usually memory related. There are additional structures that need to be allocated to keep track of ongoing asynchronous work. In the synchronous world, the thread stack itself has this information.
What I don't know, and that's what I would like to know from you: [when should I use async?]
Generally speaking, you should use async for any new code doing I/O-based operations (including all EF operations). The metrics-based arguments are more about cost/benefit analysis of converting to async - i.e., given an existing old synchronous codebase, at what point is it worth investing the time to convert it to async.

TLDR: Should I use async? YES!
You seem to have fallen for the most common mistake when trying to understand async/await. Async is orthogonal to multi-threading.
To answer your question: when should you use the async method?
if (currentContext.IsAsync && method.HasAsyncVersion)
    return UseAsync.Yes;
else
    return UseAsync.No;
That above is the short version.
Async/Await actually solves a few problems
Unblock UI thread
M:N threading
Multithreaded scheduling and synchronization
Interrupt/event-based asynchronous scheduling
Given the large number of different use cases for async/await, the "assumptions" you state only apply to certain cases.
For example, context switching only happens with multi-threading. Single-threaded, interrupt-based async actually reduces context switching by reducing blocking times and keeping the OS thread well fed with work.
Finally, the premise of your question about OS threads is fundamentally wrong.
Firstly, each OS thread requires its own stack (by default typically 1MB of reserved address space on Windows, so 100 threads reserve about 100MB before any work is even done).
Secondly, unless you have 100 physical cores in your PC, your CPUs will have to context switch between the OS threads, which stalls the CPU while it loads each thread. By using M:N threading, you can keep the CPU running by reducing the number of OS threads and instead using green threads (Task in dotnet).
Thirdly, not all "await" results in "async" behavior. Tasks are able to synchronously return, short-circuiting all of the "bloat".
In short, without digging really deep, it is hard to find optimization opportunities by switching from async to sync methods.
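The point that not every await behaves asynchronously can be illustrated with a small cache. This is a hedged sketch: GetNameAsync, the cache, and the 200ms delay are hypothetical stand-ins for a real lookup, and the code assumes C# top-level statements. The first call has to wait; the second call finds the value cached and returns an already completed task, so awaiting it continues synchronously.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var cache = new Dictionary<int, string>();

// Hypothetical lookup: returns synchronously on a cache hit, waits otherwise.
async Task<string> GetNameAsync(int id)
{
    if (cache.TryGetValue(id, out var cached))
        return cached;                 // no await is reached: the task completes synchronously

    await Task.Delay(200);             // stands in for a slow I/O call
    var name = $"user-{id}";
    cache[id] = name;
    return name;
}

Task<string> first = GetNameAsync(1);
bool firstCompletedImmediately = first.IsCompleted;    // still waiting on the delay
await first;

Task<string> second = GetNameAsync(1);
bool secondCompletedImmediately = second.IsCompleted;  // served from the cache

Console.WriteLine($"first call finished synchronously:  {firstCompletedImmediately}");
Console.WriteLine($"second call finished synchronously: {secondCompletedImmediately}");
```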

Does asynchronous model really give benefits in throughput against properly configured synchronous?

Everybody knows that asynchrony gives you "better throughput" and "scalability", and is more efficient in terms of resource consumption. I also thought this (simplistic) way before doing the experiment below. It basically shows that if we take into account all the overhead of asynchronous code and compare it against properly configured synchronous code, it yields little to no performance/throughput/resource consumption advantage.
The question: Does asynchronous code actually perform so much better than synchronous code with a correctly configured thread pool? Maybe my performance tests are flawed in some dramatic way?
Test setup: Two ASP.NET Web API methods, with JMeter calling them from a thread group of 200 threads (30 seconds ramp-up time).
[HttpGet]
[Route("async")]
public async Task<string> AsyncTest()
{
    await Task.Delay(_delayMs);
    return "ok";
}

[HttpGet]
[Route("sync")]
public string SyncTest()
{
    Thread.Sleep(_delayMs);
    return "ok";
}
Here is response time (log scale). Notice how synchronous code becomes faster when Thread Pool injected enough threads. If we were to set up Thread Pool beforehand (via SetMinThreads) it would outperform async right from the start.
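Setting up the thread pool beforehand, as mentioned above, could look roughly like this. It is a sketch only: the target of 200 is an assumption taken from the number of JMeter users in this test, and the code assumes C# top-level statements rather than the original Web API host.

```csharp
using System;
using System.Threading;

const int expectedConcurrentRequests = 200;   // assumed from the 200 JMeter users

ThreadPool.GetMinThreads(out int minWorkers, out int minIo);

// Never lower the current minimum; only raise it to the expected concurrency.
int wanted = Math.Max(minWorkers, expectedConcurrentRequests);
bool accepted = ThreadPool.SetMinThreads(wanted, minIo);

ThreadPool.GetMinThreads(out int newMinWorkers, out _);
Console.WriteLine($"SetMinThreads accepted: {accepted}, minimum worker threads now {newMinWorkers}");
```

Below the minimum, the pool creates threads on demand instead of throttling their creation, which is why the synchronous endpoint no longer has to wait for thread injection.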
What about resource consumption, you may ask. "A thread has a big cost in terms of CPU time scheduling, context switching and RAM footprint." Not so fast. Thread scheduling and context switching are efficient. As far as stack usage goes, a thread does not instantly consume the RAM; it just reserves virtual address space and commits only the tiny fraction that is actually needed.
Let's look at what the data says. Even with a larger number of threads, the sync version has a smaller memory footprint (working set, which maps to physical memory).
UPDATE. I want to post the results of a follow-up experiment, which should be more representative since it avoids some biases of the first one.
First of all, the results of the first experiment were taken using IIS Express, which is basically a dev-time server, so I needed to move away from that. Also, considering the feedback, I've isolated the load generation machine from the server (two Azure VMs in the same network). I've also discovered that some IIS threading limits are hard or even impossible to change, and ended up switching to ASP.NET Web API self-hosting to eliminate IIS from the variables as well. Note that memory footprints/CPU times are radically different with this test; please do not compare numbers across the different test runs, as the setups are totally different (hosting, hardware, machine setup). Additionally, when I moved to other machines and another hosting solution, the Thread Pool strategy changed (it is dynamic) and the injection rate increased.
Settings: Delay 100ms, 200 JMeter "users", 30 sec ramp-up time.
I want to conclude these experiments with the following: Yes, under some specific (more laboratory-like) circumstances it's possible to get comparable results for sync vs. async, but in real-world cases, where the workload is uneven and cannot be 100% predictable, we will inevitably hit some kind of threading limit: either server-side limits or Thread Pool growth limits (and bear in mind that thread pool management is an automatic mechanism whose properties are not always easy to predict). Additionally, the sync version does have a bigger memory footprint (both working set and a much bigger virtual memory size). As far as CPU consumption is concerned, async also wins (CPU time per request metric).
On IIS with default settings the situation is even more dramatic: the synchronous version is order(s) of magnitude slower (with lower throughput) due to a quite tight limit on thread count: 20 per CPU.
PS. Do use asynchronous pipelines for IO! [... sigh of relief...]

Everybody knows that asynchrony gives you "better throughput", "scalability", and more efficient in terms of resources consumption.
Scalability, yes. Throughput: it depends. Each asynchronous request is slower than the equivalent synchronous request, so you would only see a throughput benefit when scalability comes into play (i.e., there are more requests than threads available).
Does asynchronous code actually perform so much better comparing to the synchronous code with correctly configured thread pool?
Well, the catch there is "correctly configured thread pool". What you're assuming is that you can 1) predict your load, and 2) have a server big enough to handle it using one thread per request. For many (most?) real-world production scenarios, either or both of these are not true.
From my article on async ASP.NET:
Why not just increase the size of the thread pool [instead of using async]? The answer is twofold: Asynchronous code scales both further and faster than blocking thread pool threads.
First, asynchronous code scales further than synchronous code. With more realistic example code, the total scalability of ASP.NET servers (stress tested) showed a multiplicative increase. In other words, an asynchronous server could handle several times the number of continuous requests as a synchronous server (with both thread pools turned up to the maximum for that hardware). However, these experiments (not done by me) were done on an expected "realistic baseline" for average ASP.NET apps. I don't know how the same results would carry over to a noop string return.
Second, asynchronous code scales faster than synchronous code. This one is pretty obvious; synchronous code scales fine up to the number of thread pool threads, but then can't scale faster than the thread injection rate. So you get that really slow response to a sudden heavy load, as shown in the beginning of your response time graph.
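The "scales faster" argument can be made concrete with a rough sketch. The figures of 1000 simulated requests and 100ms per wait are assumptions chosen for illustration, and Task.Delay merely stands in for real I/O. Because an awaited delay does not hold a thread, all waits overlap without the pool having to inject a thread per request.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

const int requests = 1000;   // illustrative number of simulated requests
const int delayMs = 100;     // simulated I/O latency per request

var stopwatch = Stopwatch.StartNew();

// Start every simulated request at once; none of them blocks a thread while waiting.
await Task.WhenAll(Enumerable.Range(0, requests).Select(_ => Task.Delay(delayMs)));

stopwatch.Stop();
long asyncElapsedMs = stopwatch.ElapsedMilliseconds;

Console.WriteLine($"{requests} overlapping waits took {asyncElapsedMs} ms");
Console.WriteLine($"thread pool threads in use: {ThreadPool.ThreadCount}");
```

Replacing Task.Delay with Thread.Sleep inside Task.Run would need one pool thread per request, so the total time would then depend on how quickly the pool injects threads.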
I think the work you've done is interesting; I am particularly surprised at the memory usage differences (or rather, lack of difference). I'd love to see you work this into a blog post. Recommendations:
Use ASP.NET Core for your tests. The old ASP.NET had only a partially-asynchronous pipeline; ASP.NET Core would be necessary for a more "pure" comparison of sync vs async.
Don't test locally; there are a lot of caveats when doing that. I'd recommend choosing a VM size (or single-instance Docker container or whatever) and testing in the cloud for repeatability.
Also try stress testing in addition to load testing. Keep increasing load until the server is totally overwhelmed, and see how both the async and sync servers respond.
As a final reminder (also from my article):
Bear in mind that asynchronous code does not replace the thread pool. This isn’t thread pool or asynchronous code; it’s thread pool and asynchronous code. Asynchronous code allows your application to make optimum use of the thread pool. It takes the existing thread pool and turns it up to 11.

Truly asynchronous code (I/O) is more scalable because it releases thread pool threads for other work instead of blocking them. So, with the same number of threads, it can handle more requests.
But it does that at the cost of more control data structures and more work. So, (other than saving thread pool threads) it consumes more resources (memory, CPU).
It's all about availability, not performance.

What is a multithreading program and how does it work?

What is a multithreading program and how does it work exactly? I read some documents but I'm confused. I know that code is executed line by line, but I can't understand how the program manages this.
A simple answer would be appreciated. A C# example, please (only animation!)

What is a multi-threading program and how does it work exactly?
The interesting part about this question is that complete books are written on the topic, but it still eludes a lot of people. I will try to explain it in the order detailed below.
Please note this is just to provide a gist; an answer like this can never do justice to the depth and detail required. Regarding videos, the best that I have come across are part of paid subscriptions (Wintellect and Pluralsight); check whether you can watch them on a trial basis, assuming you don't already have a subscription:
Wintellect by Jeffrey Richter (from his book, CLR via C#, which has a corresponding chapter on thread fundamentals)
CLR Threading by Mike Woodring
Explanation Order
What is a thread ?
Why were threads introduced, main purpose ?
Pitfalls and how to avoid them, using Synchronization constructs ?
Thread Vs ThreadPool ?
Evolution of Multi threaded programming API, like Parallel API, Task API
Concurrent Collections, usage ?
Async-Await, thread but no thread, why they are best for IO
What is a thread ?
A thread is a software construct provided by the operating system; it is the smallest unit of execution that the OS schedules. Every process on Windows has at least one thread, and every method call runs on a thread. A process can have multiple threads to do multiple things in parallel (provided the hardware supports it).
Unix-based operating systems have traditionally favored a multi-process architecture, whereas on Windows even the most complex software, such as Oracle.exe, runs as a single process with multiple threads for its critical background operations.
Why were threads introduced, main purpose ?
Contrary to the perception that concurrency is the main purpose, it was robustness that led to the introduction of threads. Imagine that every process on Windows ran on the same thread (as in the initial 16-bit version) and one of those processes crashed: in most cases that simply meant a system restart to recover. Using threads for concurrent operations, since several of them can be started in each process, came into the picture later. Today it is also important for utilizing a multi-core processor to its full ability.
Pitfalls and how to avoid them, using Synchronization constructs ?
More threads mean more work completed concurrently, but issues arise when the same memory is accessed, especially for writes, as that is when it can lead to:
Memory corruption
Race condition
Another issue is that a thread is a very costly resource: each thread has a thread environment block and a kernel memory allocation. Also, scheduling each thread on a processor core costs time for context switching. Misuse can quite possibly cause a huge performance penalty instead of an improvement.
To avoid thread-related corruption issues, it's important to use synchronization constructs such as lock, mutex, or semaphore, based on the requirement. Concurrent reads are thread-safe as long as nothing writes at the same time, but writes need appropriate synchronization.
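As a small illustration of the synchronization constructs named above, here is a sketch with a shared counter. The worker and iteration counts are arbitrary, and the code assumes C# top-level statements. Both the lock statement and Interlocked.Increment make the read-modify-write step atomic, so no increments are lost.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

const int workers = 4;
const int incrementsPerWorker = 100_000;

int lockedCounter = 0;
int interlockedCounter = 0;
object gate = new object();

Parallel.For(0, workers, _ =>
{
    for (int i = 0; i < incrementsPerWorker; i++)
    {
        // Option 1: a lock lets only one thread at a time update the counter.
        lock (gate) { lockedCounter++; }

        // Option 2: an atomic hardware increment, no lock object needed.
        Interlocked.Increment(ref interlockedCounter);
    }
});

Console.WriteLine($"lock: {lockedCounter}, Interlocked: {interlockedCounter}");
```

A plain counter++ without either construct would typically end up below 400000 because concurrent updates overwrite each other.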
Thread Vs ThreadPool ?
The threads we use in C#/.NET are not the real threads; they are managed wrappers that invoke Win32 threads. The challenge is that users can grossly misuse them, for example by starting far more threads than required or by assigning processor affinity. So isn't it better to queue work items to a standard pool and let the pool decide when a new thread is required and when an existing thread can pick up the work item? A thread is a costly resource whose usage needs to be optimized; otherwise it can be a bane, not a boon.
Evolution of Multi threaded programming, like Parallel API, Task API
From .NET 4.0 onward, a variety of new APIs, such as Parallel.For and Parallel.ForEach for data parallelization and the Task API for task parallelization, have made it very simple to introduce concurrency into a system. These APIs again use a thread pool internally. A Task is more like scheduling work for some time in the future. Introducing concurrency is now a breeze, though synchronization constructs are still required to avoid memory corruption and race conditions; alternatively, thread-safe collections can be used.
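The two API families mentioned here might be used as follows. This is a minimal sketch with made-up workloads (a sum of squares and a sum of 1 to 100), assuming C# top-level statements. Note that the shared total still needs synchronization, here via Interlocked.Add.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Data parallelism: iterations are spread over thread pool threads.
long sumOfSquares = 0;
Parallel.For(0, 1000, i =>
{
    Interlocked.Add(ref sumOfSquares, (long)i * i);   // shared state still needs protection
});

// Task parallelism: schedule a piece of work and collect its result later.
Task<int> sumTask = Task.Run(() => Enumerable.Range(1, 100).Sum());
int sum = await sumTask;

Console.WriteLine($"sum of squares 0..999 = {sumOfSquares}");   // 332833500
Console.WriteLine($"sum of 1..100 = {sum}");                    // 5050
```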
Concurrent Collections, usage ?
Implementations like ConcurrentBag, ConcurrentQueue, and ConcurrentDictionary, part of System.Collections.Concurrent, are inherently thread-safe, relying internally on fine-grained locking or lock-free techniques such as spin-waits. They are much easier and quicker to use than explicit synchronization, and easier to manage. There is another set of APIs, such as ImmutableList in System.Collections.Immutable (available via NuGet), which are thread-safe because every modification returns a new instance instead of changing the existing one.
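A brief sketch of a concurrent collection in use, with hypothetical keys and counts: several threads update a ConcurrentDictionary at once without any explicit lock, relying on AddOrUpdate to apply each change atomically per key.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var hits = new ConcurrentDictionary<string, int>();
string[] keys = { "a", "b", "c" };   // hypothetical keys

Parallel.For(0, 3000, i =>
{
    // Insert 1 for a new key, otherwise add 1 to the existing value.
    hits.AddOrUpdate(keys[i % 3], 1, (_, old) => old + 1);
});

foreach (var key in keys)
    Console.WriteLine($"{key}: {hits[key]}");   // each key ends at 1000
```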
Async-Await, thread but no thread, why they are best for IO
This is an important aspect of concurrency meant for IO calls (disk, network). The other APIs discussed so far are meant for compute-based concurrency, where threads matter and make the work faster. For IO calls, however, a thread has nothing to do except wait for the call to return; on Windows, IO completions are delivered through queues known as IO completion ports.
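For illustration, an awaited file operation might look like the sketch below. It assumes a runtime that provides File.WriteAllTextAsync and File.ReadAllTextAsync (.NET Core 2.0 or later) and uses a throwaway temp file. How the wait is serviced underneath depends on the platform and runtime version; the point is only that the calling code does not block a thread while it awaits.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

string path = Path.GetTempFileName();   // throwaway file for the demo

// While these calls are pending, the calling thread is free to do other work.
await File.WriteAllTextAsync(path, "hello from async IO");
string text = await File.ReadAllTextAsync(path);

File.Delete(path);
Console.WriteLine(text);
```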

A simple analogy might be found in the kitchen.
You've probably cooked using a recipe before -- start with the specified ingredients, follow the steps indicated in the recipe, and at the end you (hopefully) have a delicious dish ready to eat. If you do that, then you have executed a traditional (non-multithreaded) program.
But what if you have to cook a full meal, which includes a number of different dishes? The simple way to do it would be to start with the first recipe, do everything the recipe says, and when it's done, put the finished dish (and the first recipe) aside, then start on the second recipe, do everything it says, put the second dish (and second recipe) aside, and so on until you've gone through all of the recipes one after another. That will work, but you might end up spending 10 hours in the kitchen, and of course by the time the last dish is ready to eat, the first dish might be cold and unappetizing.
So instead you'd probably do what most chefs do, which is to start working on several recipes at the same time. For example, you might put the roast in the oven for 45 minutes, but instead of sitting in front of the oven waiting 45 minutes for the roast to cook, you'd spend the 45 minutes chopping the vegetables. When the oven timer rings, you put down your vegetable knife, pull the cooked roast out of the oven and let it cool, then go back to chopping vegetables, and so on. If you can do that, then you are successfully multitasking several recipes/programs. That is, you aren't literally working on multiple recipes at once (you still have only two hands!), but you are jumping back and forth from following one recipe to following another whenever necessary, and thereby making progress on several tasks rather than twiddling your thumbs a lot. Do this well and you can have the whole meal ready to eat in a much shorter amount of time, and everything will be hot and fresh at about the same time too. If you do this, you are executing a simple multithreaded program.
Then if you wanted to get really fancy, you might hire a few other chefs to work in the kitchen at the same time as you, so that you can get even more food prepared in a given amount of time. If you do this, your team is doing multiprocessing, with each chef taking one part of the total work and all of them working simultaneously. Note that each chef may well be working on multiple recipes (i.e. multitasking) as described in the previous paragraph.
As for how a computer does this sort of thing (no more analogies about chefs), it usually implements it using a list of ready-to-run threads and a timer. When the timer goes off (or when the thread that is currently executing has nothing to do for a while, because e.g. it is waiting to load data from a slow hard drive or something), the operating system does a context switch, in which it pauses the current thread (by putting it into a list somewhere and no longer executing instructions from that thread's code), then pulls another ready-to-run thread from the list of ready-to-run threads and starts executing instructions from that thread's code instead. This repeats for as long as necessary, often with context switches happening every few milliseconds, giving the illusion that multiple programs are running "at the same time" even on a single-core CPU. (On a multi-core CPU it does this same thing on each core, and in that case it's no longer just an illusion; multiple programs really are running at the same time.)

Why don't you refer to Microsoft's very own documentation of the .net class System.Threading.Thread?
It has a handful of simple example programs written in C# (at the bottom of the page), just as you asked for:
Thread Examples

Multithreading means doing multiple pieces of work at the same time, so they can complete in parallel. You can take a task off your main thread, have it executed elsewhere, and be done.

What are the scalability benefits of async (non-blocking) code?

Blocking threads is considered a bad practice for 2 main reasons:
Threads cost memory.
Threads cost processing time via context switches.
Here are my difficulties with those reasons:
Non-blocking, async code should also cost pretty much the same amount of memory, because the callstack should be saved somewhere right before executing the async call (the context is saved, after all). And if threads are significantly inefficient (memory-wise), why doesn't the OS/CLR offer a more light-weight version of threads (saving only the callstack's context and nothing else)? Wouldn't it be a much cleaner solution to the memory problem, instead of forcing us to re-architect our programs in an asynchronous fashion (which is significantly more complex, harder to understand and maintain)?
When a thread gets blocked, it is put into a waiting state by the OS. The OS won't context-switch to the sleeping thread. Since way over 95% of the thread's life cycle is spent on sleeping (assuming IO-bound apps here), the performance hit should be negligible, since the processing sections of the thread would probably not be pre-empted by the OS because they should run very fast, doing very little work. So performance-wise, I can't see a whole lot of benefit to a non-blocking approach either.
What am I missing here or why are those arguments flawed?

Non-blocking, async code should also cost pretty much the same amount of memory, because the callstack should be saved somewhere right before executing the async call (the context is saved, after all).
The entire call stack is not saved when an await occurs. Why do you believe that the entire call stack needs to be saved? The call stack is the reification of continuation and the continuation of the awaited task is not the continuation of the await. The continuation of the await is on the stack.
Now, it may well be the case that when every asynchronous method in a given call stack has awaited, information equivalent to the call stack has been stored in the continuations of each task. But the memory burden of those continuations is garbage collected heap memory, not a block of a million bytes of committed stack memory. The continuation state size is order n in the size of the number of tasks; the burden of a thread is a million bytes whether you use it or not.
if threads are significantly inefficient (memory-wise), why doesn't the OS/CLR offer a more light-weight version of threads
The OS does. It offers fibers. Of course, fibers still have a stack, so that's maybe not better. You could have a thread with a small stack I suppose.
Wouldn't it be a much cleaner solution to the memory problem, instead of forcing us to re-architect our programs in an asynchronous fashion
Suppose we made threads -- or for that matter, processes -- much cheaper. That still doesn't solve the problem of synchronizing access to shared memory.
For what it's worth, I think it would be great if processes were lighter weight. They're not.
Moreover, the question somewhat contradicts itself. You're doing work with threads, so you are already willing to take on the burden of managing asynchronous operations. A given thread must be able to tell another thread when it has produced the result that the first thread asked for. Threading already implies asynchrony, but asynchrony does not imply threading. Having an async architecture built in to the language, runtime and type system only benefits people who have the misfortune to have to write code that manages threads.
Since way over 95% of the thread's life cycle is spent on sleeping (assuming IO-bound apps here), the performance hit should be negligible, since the processing sections of the thread would probably not be pre-empted by the OS because they should run very fast, doing very little work.
Why would you hire a worker (thread) and pay their salary to sit by the mailbox (sleeping the thread) waiting for the mail to arrive (handling an IO message)? IO interrupts don't need a thread in the first place. IO interrupts exist in a world below the level of threads.
Don't hire a thread to wait on IO; let the operating system handle asynchronous IO operations. Hire threads to do insanely huge amounts of high latency CPU processing, and then assign one thread to each CPU you own.
Now we come to your question:
What are the benefits of async (non-blocking) code?
Not blocking the UI thread
Making it easier to write programs that live in a world with high latency
Making more efficient use of limited CPU resources
But let me rephrase the question using an analogy. You're running a delivery company. There are many orders coming in, many deliveries going out, and you cannot tell a customer that you will not take their delivery until every delivery before theirs is completed. Which is better:
hire fifty guys to take calls, pick up packages, schedule deliveries, and deliver packages, and then require that 46 of them be idle at all times or
hire four guys and make each of them really good at first, doing a little bit of work at a time, so that they are always responsive to customer requests, and second, really good at keeping a to-do list of jobs they need to do in the future
The latter seems like a better deal to me.

You are mixing up the concepts of multithreading and async here.
Both your "difficulties" come from the assumption that each async method gets assigned a specialized thread on which it does the work. However, the state of affairs is quite opposite: each time an async operation needs to be executed, the CLR picks an idle (thus already created) thread from the threadpool and executes that method on the selected thread.
The core concept here is that async doesn't mean always creating new threads, it means scheduling the execution on existing threads so that no thread is sitting idle.

Multithreading on a multi-core machine not maxing out CPU

I am working on maintaining someone else's code that is using multithreading, via two methods:
1: ThreadPool.QueueUserWorkItem(New WaitCallback(AddressOf ReadData), objUpdateItem)
2: Dim aThread As New Thread(AddressOf LoadCache)
aThread.Start()
However, on a dual-core machine, I am only getting 50% CPU utilization, and on a dual-core machine with hyperthreading enabled, I am only getting 25% CPU utilization.
Obviously threading is extremely complicated, but this behaviour would seem to indicate that I am not understanding some simple fundamental fact?
UPDATE
The code is too horribly complex to post here, unfortunately, but for reference, here is roughly what happens. I have approximately 500 accounts whose data is loaded from the database into an in-memory cache. Each account is loaded individually, and that process first calls a long-running stored procedure, followed by manipulation and caching of the returned data. So the point of threading in this situation is that there is indeed a bottleneck hitting the database (i.e., the thread will be idle for up to 30 seconds waiting for the query to return), so we thread to allow others to begin processing the data they have received from Oracle.
So, the main thread executes:
ThreadPool.QueueUserWorkItem(New WaitCallback(AddressOf ReadData), objUpdateItem)
Then ReadData() proceeds to execute (exactly once):
Dim aThread As New Thread(AddressOf LoadCache)
aThread.Start()
And this is occurring in a recursive function, so the QueueUserWorkItem can be executing multiple times, which in turn then executes exactly one new thread via the aThread.Start
Hopefully that gives a decent idea of how things are happening.
So, under this scenario, should this not theoretically pin both cores, rather than maxing out at 100% on one core, while the other core is essentially idle?

That code starts one thread that will go and do something. To get more than one core working, you need to start more than one thread and get them both busy. Starting a thread to do some work, and then having your main thread wait for it, won't get the task done any quicker. It is common to start a long-running task on a background thread so that the UI remains responsive, which may be what this code was intended to do, but it won't speed up the task itself.
@Judah Himango - I had assumed that those two lines of code were samples of how multithreading was being achieved in two different places in the program. Maybe the OP can clarify whether this is the case or whether these two lines really are in the one method. If they are part of one method, then we will need to see what the two methods are actually doing.
Update:
That does sound like it should max out both cores. What do you mean by recursively calling ReadData()? If each new thread is only calling ReadData at or near its end to start the next thread, then that could explain the behaviour you are seeing.
I am not sure that this is actually a good idea. If the stored proc takes 30 seconds to get the data, then presumably it is placing a fair load on the database server. Running it 500 times in parallel is just going to make things worse. Obviously I don't know your database or data, but I would look at improving the performance of the stored proc.
If multi threading does look like the way forward, then I would have a loop on the main thread that calls ThreadPool.QueueUserWorkItem once for each account that needs loading. I would also remove the explicit thread creation and only use the thread pool. That way you are less likely to starve the local machine by creating too many threads.
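A minimal sketch of that loop, written in C# rather than the question's VB.NET. AccountLoader, LoadAll and the loadAccount delegate are hypothetical stand-ins for the real ReadData/LoadCache work, and CountdownEvent assumes .NET 4 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class AccountLoader
{
    // Queues one thread-pool work item per account and blocks until every
    // item has finished. `loadAccount` is a hypothetical placeholder for the
    // per-account database call and caching.
    public static void LoadAll(IList<int> accountIds, Action<int> loadAccount)
    {
        if (accountIds.Count == 0) return;

        using (var remaining = new CountdownEvent(accountIds.Count))
        {
            foreach (int id in accountIds)
            {
                int accountId = id; // per-iteration copy for the closure
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try { loadAccount(accountId); }
                    finally { remaining.Signal(); } // count down even on failure
                });
            }
            remaining.Wait(); // main thread waits for all accounts
        }
    }
}
```

The pool decides how many items run at once, so 500 accounts do not turn into 500 threads. Note that an exception escaping loadAccount on a pool thread would still bring the process down, so real code would want to catch and record failures.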

How many threads are you spinning up? It may seem primitive (wait a few years, and you won't need to do this anymore), but your code has got to figure out an optimal number of threads to start, and spin up that many. Simply running a single thread won't make things any faster, and won't pin a physical processor, though it may be good for other reasons (a worker thread to keep your UI responsive, for instance).
In many cases, you'll want to be running a number of threads equal to the number of logical cores available to you (available from Environment.ProcessorCount, I believe), but it may have some other basis. I've spun up a few dozen threads, talking to different hosts, when I've been bound by remote process latency, for instance.
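For CPU-bound work, a sketch of sizing the worker count from Environment.ProcessorCount could look like the following. CoreSaturation and RunOnAllCores are made-up names, and the arithmetic loop is just filler work.

```csharp
using System;
using System.Threading;

public static class CoreSaturation
{
    // Starts one thread per logical processor, gives each the same CPU-bound
    // loop, and returns the per-thread results once all threads have joined.
    public static long[] RunOnAllCores(int iterationsPerThread)
    {
        int workerCount = Environment.ProcessorCount;
        var results = new long[workerCount];
        var workers = new Thread[workerCount];

        for (int i = 0; i < workerCount; i++)
        {
            int slot = i; // each thread writes only to its own slot
            workers[i] = new Thread(() =>
            {
                long sum = 0;
                for (int n = 0; n < iterationsPerThread; n++) sum += n % 7;
                results[slot] = sum;
            });
            workers[i].Start();
        }

        foreach (Thread worker in workers) worker.Join();
        return results;
    }
}
```

With a large iteration count, all logical processors should show activity while this runs. For latency-bound work (such as waiting on remote hosts), a higher thread count than the core count can make sense, as described above.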

Multi-Threaded and Multi-Core are two different things. Doing things Multi-Threaded often won't offer you an enormous increase in performance, sometimes quite the opposite. The Operating System might do a few tricks to spread your cpu cycles over multiple cores, but that's where it ends.
What you are looking for is Parallelism. The .NET 4.0 framework will add a lot of new features to support Parallelism. Have a sneak peek here:
http://www.danielmoth.com/Blog/2009/01/parallelising-loops-in-net-4.html
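As an illustration of the kind of loop parallelisation .NET 4 introduced, here is a small sketch using Parallel.For from System.Threading.Tasks. The class and method names are hypothetical and the squaring is placeholder work.

```csharp
using System.Threading.Tasks;

public static class ParallelLoopDemo
{
    // Fills an array with squares; the runtime partitions the index range
    // across the available cores. Each iteration writes to a distinct
    // element, so no locking is needed.
    public static long[] SquareAll(int count)
    {
        var squares = new long[count];
        Parallel.For(0, count, i => { squares[i] = (long)i * i; });
        return squares;
    }
}
```

This only pays off when each iteration does enough independent CPU work. For tiny bodies like this one, the scheduling overhead can outweigh the gain.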

The CPU behavior would indicate that the application is only utilizing one logical processor. 50% would be one processor out of 2 (proc + proc). 25% would be one logical processor out of 4 (proc + HT + proc + HT).

How many threads do you have in total, and do you have any locks in LoadCache? A SyncLock may make a multithreaded system act as a single thread (by design). Also, if you only spool up one thread, you will only get one worker thread.
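To show how a lock can serialise otherwise parallel threads, here is a sketch in C#, where lock is the equivalent of VB.NET's SyncLock. LockContention and its parameters are hypothetical, and Thread.Sleep stands in for the real work done inside the lock.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class LockContention
{
    private static readonly object Gate = new object();

    // Runs `threadCount` threads that each "work" for `workMs` milliseconds,
    // optionally while holding a shared lock, and returns the total elapsed time.
    public static TimeSpan Run(int threadCount, int workMs, bool holdLock)
    {
        var stopwatch = Stopwatch.StartNew();
        var threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                if (holdLock)
                {
                    lock (Gate) { Thread.Sleep(workMs); } // one thread at a time
                }
                else
                {
                    Thread.Sleep(workMs); // all threads overlap
                }
            });
            threads[i].Start();
        }

        foreach (Thread t in threads) t.Join();
        return stopwatch.Elapsed;
    }
}
```

With the lock held, the elapsed time grows roughly as threadCount times workMs. Without it, the threads overlap and the total stays close to a single workMs. If LoadCache does most of its work inside a SyncLock, the same effect would keep it to about one core's worth of CPU.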

CPU utilization suggests that you're only using one core; this may mean that you've added threading to a portion where it is not beneficial (in this case, where CPU time is not a bottleneck).
If Loading the Cache or reading data happens very quickly, multi threading won't provide a massive improvement in speed performance. Similarly, if you're encountering a different bottleneck (slow bandwidth to a server, etc), it may not show up as CPU usage.
