Time Complexity Evaluation? - c#

I have implemented cryptography and steganography algorithms in an application and would like to make a time complexity evaluation for these algorithms. I don't have any idea how these tests should be performed.
Is there any testing framework which I can apply or how to make these tests?

Assuming you're not using recursion, you should look at your loops. It's fairly simple to count by inspection the number of times each loop is iterated based on a certain input. The time complexity is given by multiplication of the number of times nested loops are run. If you have more than one set of nested loops, add them all together and this is the time complexity. As an example:
for (int i = 0; i < N; ++i)
{
    // Some constant time operation happens here
    for (int j = 0; j < N; ++j)
    {
        // Some constant time operation happens here
    }
    // Some constant time operation happens here
}
This loop will execute in O(N) + O(N^2) = O(N + N^2) = O(N^2) time, since the outer constant time portions of the loop are executed in O(N) time, while the inner portion is performed in O(N)*O(N) = O(N^2) time.
If you're using recursion, things are a bit harder to analyze, but it boils down to the same thing ... counting how many times constant time portions of your code get executed. Recommend you read a book on analysis of algorithms (This one is pretty much the standard on the subject) which will help you understand how best to analyze other algorithms as well.

You are manipulating pixels in a bitmap, so it's pretty hard to come up with an algorithm that isn't going to be O(n^2). You could run timing tests with various sizes of bitmaps, using the Stopwatch class. Bitmap manipulations that are not computationally hard, like this, are fundamentally limited by RAM bus bandwidth, and by the size of the CPU cache if you don't address the pixels the smart way (storage order). You'll know you have cache problems when the perf suddenly falls off a cliff as you increase the bitmap size.
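A minimal sketch of that kind of timing run (an addition, not from the original answer): a plain 2D byte array stands in for the bitmap data, and the sizes are arbitrary assumptions. Traversing in storage order versus against it shows the cache cliff described above.
using System;
using System.Diagnostics;

class StorageOrderTiming
{
    static void Main()
    {
        foreach (int size in new[] { 512, 1024, 2048, 4096, 8192 })
        {
            byte[,] pixels = new byte[size, size];
            long sum = 0;

            // Row-major traversal: follows the array's storage order.
            var sw = Stopwatch.StartNew();
            for (int y = 0; y < size; y++)
                for (int x = 0; x < size; x++)
                    sum += pixels[y, x];
            sw.Stop();
            long rowMajorMs = sw.ElapsedMilliseconds;

            // Column-major traversal: fights the cache once a row no longer fits.
            sw.Restart();
            for (int x = 0; x < size; x++)
                for (int y = 0; y < size; y++)
                    sum += pixels[y, x];
            sw.Stop();

            Console.WriteLine(size + "x" + size + ": row-major " + rowMajorMs + " ms, column-major "
                + sw.ElapsedMilliseconds + " ms (checksum " + sum + ")");
        }
    }
}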

You could unit test it. Assuming you have a method that takes parameters, you can time how long it takes with different input sizes n, and from that work out the time complexity with respect to n. This would be the simplest way; I'm not sure of any frameworks that could automate it for you.
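As a rough sketch of that idea (an addition, with a hypothetical Encrypt method standing in for whichever algorithm you want to characterise): double n and watch how the elapsed time grows. Roughly 2x per doubling suggests O(n), roughly 4x suggests O(n^2), and so on.
using System;
using System.Diagnostics;

class GrowthProbe
{
    static void Main()
    {
        var rng = new Random(42);
        foreach (int n in new[] { 1 << 16, 1 << 17, 1 << 18, 1 << 19, 1 << 20 })
        {
            byte[] input = new byte[n];
            rng.NextBytes(input);

            Encrypt(input);                      // warm-up run so JIT cost isn't measured
            var sw = Stopwatch.StartNew();
            Encrypt(input);
            sw.Stop();
            Console.WriteLine(n + " bytes: " + sw.Elapsed.TotalMilliseconds + " ms");
        }
    }

    // Placeholder for the real algorithm under test.
    static void Encrypt(byte[] data)
    {
        for (int i = 0; i < data.Length; i++)
            data[i] ^= 0x5A;
    }
}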

Related

Why is using a pointer for a for loop more performant in this case?

I don't have a background in C/C++ or related lower-level languages and so I've never run into pointers before. I'm a game dev working primarily in C# and I finally decided to move to an unsafe context this morning for some performance-critical sections of code (and please no "don't use unsafe" answers, as I've read so many times while doing research, as it's already yielding me around 6 times the performance in certain areas, with no issues so far, plus I love the ability to do stuff like reverse arrays with no allocation). Anyhow, there's a certain situation where I expected no difference, or even a possible decrease in speed, and I'm saving a lot of ticks in reality (I'm talking about double the speed in some instances). This benefit seems to decrease with the number of iterations, which I don't fully understand.
This is the situation:
int x = 0;
for(int i = 0; i < 100; i++)
    x++;
Takes, on average about 15 ticks.
EDIT: The following is unsafe code, though I assumed that was a given.
int x = 0, i = 0;
int* i_ptr;
for(i_ptr = &i; *i_ptr < 100; (*i_ptr)++)
    x++;
Takes about 7 ticks, on average.
As I mentioned, I don't have a low-level background and I literally just started using pointers this morning, at least directly, so I'm probably missing quite a bit of info. So my first query is: why is the pointer more performant in this case? It isn't an isolated instance, and there are a lot of other variables of course, at that specific point in time in relation to the PC, but I'm getting these results very consistently across a lot of tests.
In my head, the operations are as such:
No pointer:
Get address of i
Get value at address
Pointer:
Get address of i_ptr
Get address of i from i_ptr
Get value at address
In my head, there must surely be more overhead, however ridiculously negligible, from using a pointer here. How is it that a pointer is consistently more performant than the direct variable in this case? These are all on the stack as well, of course, so it's not dependent on where they end up being stored, from what I can tell.
As touched on earlier, the caveat is that this bonus decreases with the number of iterations, and pretty fast. I took out the extremes from the following data to account for background interference.
At 1000 iterations, they are both identical at 30 to 34 ticks.
At 10000 iterations, the pointer is slower by about 20 ticks.
Jump up to 10000000 iterations, and the pointer is slower by about 10000 ticks or so.
My assumption is that the decrease comes from the extra step I covered earlier, given that there is an additional lookup, which brings me back to wonder why it's more performant with a pointer than without at low loop counts. At the very least, I'd assume they would be more or less identical (which they are in practice, I suppose, but a difference of 8 ticks from millions of repeated tests is pretty definitive to me) up until the very rough threshold I found somewhere between 100 and 1000 iterations.
Apologies if I'm nitpicking somewhat, or if this is a poor question, but I feel as though it will be beneficial to know exactly what is going on under the hood. And if nothing else, I think it's pretty interesting!
Some users suggested that the test results were most likely due to measurement inaccuracies, and it would seem as such, at least up to a point. When averaged across ten million continuous tests, the mean of both is typically equal, though in some cases the use of pointers averages out to an extra tick. Interestingly, when testing as a single case, the use of pointers has a consistently lower execution time than without. There are of course a lot of additional variables at play at the specific points in time at which a test is tried, which makes it somewhat of a pointless pursuit to track this down any further. But the result is that I've learned some more about pointers, which was my primary goal, and so I'm pleased with the test.
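For reference, a sketch of the kind of averaged measurement described above (an addition, not from the original post): warm both variants up, then time many repetitions in a single Stopwatch window and divide. Compile with /unsafe; the repetition count is an arbitrary assumption.
using System;
using System.Diagnostics;

class PointerLoopBench
{
    const int Reps = 1000000;

    static int PlainLoop()
    {
        int x = 0;
        for (int i = 0; i < 100; i++)
            x++;
        return x;
    }

    static unsafe int PointerLoop()
    {
        int x = 0, i = 0;
        int* i_ptr;
        for (i_ptr = &i; *i_ptr < 100; (*i_ptr)++)
            x++;
        return x;
    }

    static void Main()
    {
        PlainLoop(); PointerLoop();            // warm-up so JIT compilation isn't measured

        long sink = 0;                         // consume results so the calls aren't optimised away
        var sw = Stopwatch.StartNew();
        for (int r = 0; r < Reps; r++) sink += PlainLoop();
        sw.Stop();
        Console.WriteLine("plain:   " + (double)sw.ElapsedTicks / Reps + " ticks/call");

        sw.Restart();
        for (int r = 0; r < Reps; r++) sink += PointerLoop();
        sw.Stop();
        Console.WriteLine("pointer: " + (double)sw.ElapsedTicks / Reps + " ticks/call (checksum " + sink + ")");
    }
}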

Why does a method take almost the same time whether I run it once or many times in a for loop? - c#

I made a method that does some simple operations like +, -, *, /.
I need to run this method 1513 times.
First I tried running the method only once, to check that it works correctly and to see how much time it needs to finish the operations.
Stopwatch st = new Stopwatch();
st.Start();
DiagramValue dv = new DiagramValue();
double pixel = dv.CalculateYPixel(23.46, diction);
st.Stop();
When I stop the stopwatch, it tells me the time is 0.06s.
When I run the same method 1513 times in a for loop like this:
Stopwatch st = new Stopwatch();
st.Start();
for (int i = 0; i < 1513; i++)
{
    DiagramValue dv = new DiagramValue();
    double pixel = dv.CalculateYPixel(23.46, diction);
}
st.Stop();
Then the Stopwatch tells me it takes around 0.14s, or 0.14s / 1513 calls = 0.00009s per call.
My question is: why is the method so slow when I run it only once, yet it takes almost the same total time when I run it around a thousand times in a for loop?
Writing benchmarks is hard.
First, Stopwatch isn't infinitely accurate. When you run the method just once, you're very much limited by the accuracy of the underlying stopwatch. On the other hand, running the method multiple times alleviates this - you can get arbitrary precision by using a big enough loop. Instead of 1 vs 1513, compare e.g. 1500 vs. 3000. You'll get around 100% time increase, as expected.
Second, there's usually some cost with the first call in particular (e.g. JIT compilation) or with the memory pressure at the time of the call. That's why you usually need to do "preheating" - run the method outside of the stopwatch first to isolate these, and measure (multiple invocations) later.
Third, in a garbage collected environment like .NET, the guy who ordered the beer isn't necessarily the guy who pays the bill. Most of the cost of memory allocation in .NET is in the collection, rather than the allocation itself (which is about as cheap as a stack allocation). The collection usually happens outside of the code that caused the allocations in the first place, pointing you in the entirely wrong direction when searching for performance issues. That's why most .NET memory trackers display garbage collection separately - it's important to take account of, but can easily mislead you as to the cause if you're not careful.
There's many more issues, but these should cover your particular scenario well enough.
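A minimal sketch of that "preheat, then measure many invocations" advice (an addition; the Measure helper, the iteration count and the GC.Collect calls before timing are assumptions, not part of the original answer):
using System;
using System.Diagnostics;

static class Bench
{
    public static TimeSpan Measure(Action action, int iterations)
    {
        action();                          // warm-up: JIT-compile outside the timed region

        GC.Collect();                      // start from a quiet heap so an unrelated
        GC.WaitForPendingFinalizers();     // collection doesn't land inside the measurement
        GC.Collect();

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        sw.Stop();
        return sw.Elapsed;
    }
}

// Usage with the question's method:
// var elapsed = Bench.Measure(() => new DiagramValue().CalculateYPixel(23.46, diction), 1513);
// Console.WriteLine(elapsed.TotalMilliseconds / 1513 + " ms per call");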
Some possible reasons include:
Timing resolution. You get a more accurate figure when you find the mean over a large number of iterations.
Noise. The percentage of stuff that isn't what you actually want to record, will be different.
Jitting. .NET compiles a method to native code the first time it is used. As such, the first run in a program's lifetime will take longer, by a large factor (try running it once and then measuring the second attempt).
Branch prediction. If you keep doing the same thing with the same data, the CPU's branch predictor is going to get better at predicting which branches are taken.
GC stability. Not likely in this case, but possible. Often at the start of a set of operations that requires particular objects to be created and then released the program ends up having to get more memory from the OS. When it's a bit into that set of operations it's more likely to have reached a steady state where it can just get that memory by cleaning out objects it isn't using any more, which is faster.

Sort or RemoveAll first on an IEnumerable that needs both?

When an IEnumerable needs both to be sorted and to have elements removed, are there advantages/drawbacks of performing the stages in a particular order? My performance tests appear to indicate that it's irrelevant.
A simplified (and somewhat contrived) example of what I mean is shown below:
public IEnumerable<DataItem> GetDataItems(int maximum, IComparer<DataItem> sortOrder)
{
    List<DataItem> result = this.GetDataItems().ToList();
    result.Sort(sortOrder);
    result.RemoveAll(item => !item.Display);
    return result.Take(maximum);
}
If your tests indicate it's irrelevant, then why worry about it? Don't optimize before you need to, only when it becomes a problem. If you find a problem with performance, and have used a profiler, and have found that that method is the hotspot, then you can worry more about it.
On second thought, have you considered using LINQ? Those calls could be replaced with a call to Where and OrderBy, both of which are deferred, and then calling Take, like you have in your example. The LINQ libraries should find the best way of doing this for you, and if your data size expands to the point where it takes a noticeable amount of time to process, you can use PLINQ with a simple call to AsParallel.
You might as well RemoveAll before sorting so that you'll have fewer elements to sort.
I think that Sort() method would usually have complexity of O(n*log(n)), and RemoveAll() just O(n), so in general it is probably better to remove items first.
You'd want something like this:
public IEnumerable<DataItem> GetDataItems(int maximum, IComparer<DataItem> sortOrder)
{
    IEnumerable<DataItem> result = this.GetDataItems();
    return result
        .Where(item => item.Display)
        .OrderBy(item => item, sortOrder)
        .Take(maximum);
}
There are two answers that are correct, but won't teach you anything:
It doesn't matter.
You should probably do RemoveAll first.
The first is correct because you said your performance tests showed it's irrelevant. The second is correct because it will have an effect on larger datasets.
There's a third answer that also isn't very useful: Sometimes it's faster to do removals afterwards.
Again, it doesn't actually tell you anything, but "sometimes" always means there is more to learn.
There's also only so much value in saying "profile first". What if profiling shows that 90% of the time is spent doing x.Foo(), which it does in a loop? Is the problem with Foo(), with the loop or with both? Obviously if we can make both more efficient we should, but how do we reason about that without knowledge outside of what a profiler tells us?
When something happens over multiple items (which is true of both RemoveAll and Sort) there are five things (I'm sure there are more I'm not thinking of now) that will affect the performance impact:
The per-set constant costs (both time and memory). How much it costs to do things like calling the function that we pass a collection to, etc. These are almost always negligible, but there could be some nasty high cost hidden there (often because of a mistake).
The per-item constant costs (both time and memory). How much it costs to do something that we do on some or all of the items. Because this happens multiple times, there can be an appreciable win in improving them.
The number of items. As a rule the more items, the more the performance impact. There are exceptions (next item), but unless those exceptions apply (and we need to consider the next item to know when this is the case), then this will be important.
The complexity of the operation. Again, this is a matter of both time-complexity and memory-complexity, but here there's the chance that we might choose to improve one at the cost of the other. I'll talk about this more below.
The number of simultaneous operations. This can be a big difference between "works on my machine" and "works on the live system". If a super time-efficient approach that uses .5GB of memory is tested on a machine with 2GB of memory available, it'll work wonderfully, but when you move it to a machine with 8GB of memory available and have multiple concurrent users, it'll hit a bottleneck at 16 simultaneous operations, and suddenly what was beating other approaches in your performance measurements becomes the application's hotspot.
To talk about complexity a bit more: time complexity is a measure of how the time taken to do something relates to the number of items it is done with, while memory complexity is a measure of how the memory used relates to that same number of items. Obtaining an item from a dictionary is O(1) or constant because it takes the same amount of time however large the dictionary is (not strictly true, strictly it "approaches" O(1), but it's close enough for most thinking). Finding something in an already sorted list can be O(log2 n) or logarithmic. Filtering through a list will be linear or O(n). Sorting something using a quicksort (which is what Sort uses) tends to be linearithmic or O(n log2 n), but in its worst case - against a list that is already sorted - will be quadratic O(n^2).
Considering these, with a set of 8 items, an O(1) operation will take 1k seconds to do something, where k is a constant amount of time, O(log2 n) means 3k seconds, O(n) means 8k, O(n log2 n) means 24k and O(n^2) means 64k. These are the most commonly found, though there are plenty of others like O(nm), which is affected by two different sizes, or O(n!), which would be 40320k.
Obviously, we want as low a complexity as possible, though since k will be different in each case, sometimes the best solution for a small set has a high complexity (but low k constant) though a lower-complexity case will beat it with larger input.
So. Let's go back to the cases you are considering, viz filtering followed by sorting vs. sorting followed by filtering.
Per-set constants. Since we are moving two operations around but still doing both, this will be the same either way.
Per-item constants. Again, we're still doing the same things per item in either case, so no effect.
Number of items. Filtering reduces the number of items. Therefore the sooner we filter items, the more efficient the rest of the operation. Therefore doing RemoveAll first wins in this regard.
Complexity of the operation. It's either an O(n) filter followed by a sort that is O(n log2 n) on average and O(n^2) in the worst case, or that same sort followed by the O(n) filter. Same either way.
Number of simultaneous cases. Total memory pressure will be relieved the sooner we remove some items (slight win for RemoveAll first).
So, we've got two reasons to consider RemoveAll first as likely to be more efficient and none to consider it likely to be less efficient.
We would not assume that we were 100% guaranteed to be correct here. For a start, we could simply have made a mistake in our reasoning. For another, there could be other factors we've dismissed as irrelevant that were actually pertinent. It is still true that we should profile before optimising, but reasoning about the sort of things I've mentioned above will both make us more likely to write performant code in the first place (not the same as optimising; but a matter of picking between options when readability, clarity and correctness are equal either way) and makes it easier to find likely ways to improve those things that profiling has found to be troublesome.
For a slightly different but relevant case, consider if the criteria sorted on matched those removed on. E.g. if we were to sort by date and remove all items after a given date.
In this case, if the list deallocates on all removals, it'll still be O(n), but with a much smaller constant. Alternatively, if it just moved the "last-item" pointer*, it becomes O(1). Finding the pointer is O(log2 n), so here there are reasons both to consider that filtering first will be faster (the reasons given above) and that sorting first will be faster (that removal can be made a much faster operation than it was before). With this sort of case it becomes possible to tell only by extending our profiling. It is also true that the performance will be affected by the type of data sent, so we need to profile with realistic data, rather than artificial test data, and we may even find that what was the more performant choice becomes the less performant choice months later when the dataset it is used on changes. Here the ability to reason becomes even more important, because we should note the possibility that changes in real-world use may make this change in this regard, and know that it is something we need to keep an eye on throughout the project's life.
(*Note, List<T> does not just move a last-item pointer for a RemoveRange that covers the last item, but another collection could.)
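A sketch of that "sort first, then chop the tail" case with made-up data (an addition, not from the original answer): after sorting, List<T>.BinarySearch finds the cutoff in O(log2 n) and a single RemoveRange drops the tail, instead of testing every element with RemoveAll.
using System;
using System.Collections.Generic;
using System.Linq;

class TailChop
{
    static void Main()
    {
        var dates = Enumerable.Range(0, 1000)
            .Select(i => new DateTime(2012, 1, 1).AddHours(i * 7 % 9000))
            .ToList();
        DateTime cutoff = new DateTime(2012, 6, 1);

        dates.Sort();                                   // sort by date first

        int index = dates.BinarySearch(cutoff);         // O(log2 n) search for the cutoff position
        if (index < 0) index = ~index;                  // not found: ~index is the first item after the cutoff
        dates.RemoveRange(index, dates.Count - index);  // drop the tail in one operation

        Console.WriteLine(dates.Count + " items remain before " + cutoff.ToShortDateString());
    }
}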
It would probably be better to do the RemoveAll first, although it would only make much of a difference if your sorting comparison was expensive to calculate.
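If you do want numbers rather than reasoning, a minimal sketch of such a comparison (an addition; the DataItem shape, the comparer and the data sizes are made-up assumptions, and Comparer<T>.Create needs .NET 4.5):
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class DataItem
{
    public int Value;
    public bool Display;
}

class OrderComparison
{
    static void Main()
    {
        var rng = new Random(1);
        var source = Enumerable.Range(0, 1000000)
            .Select(_ => new DataItem { Value = rng.Next(), Display = rng.Next(2) == 0 })
            .ToList();
        var byValue = Comparer<DataItem>.Create((a, b) => a.Value.CompareTo(b.Value));

        var sortFirst = new List<DataItem>(source);
        var swA = Stopwatch.StartNew();
        sortFirst.Sort(byValue);
        sortFirst.RemoveAll(item => !item.Display);
        swA.Stop();

        var removeFirst = new List<DataItem>(source);
        var swB = Stopwatch.StartNew();
        removeFirst.RemoveAll(item => !item.Display);
        removeFirst.Sort(byValue);
        swB.Stop();

        Console.WriteLine("Sort then RemoveAll: " + swA.ElapsedMilliseconds + " ms");
        Console.WriteLine("RemoveAll then Sort: " + swB.ElapsedMilliseconds + " ms");
    }
}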

While loop execution time

We were having a performance issue in a C# while loop. The loop was super slow doing only one simple math calc. Turns out that parmIn can be a huge number anywhere from 999999999 to MaxInt. We hadn't anticipated the giant value of parmIn. We have fixed our code using a different methodology.
The loop, coded for simplicity below, did one math calc. I am just curious as to what the actual execution time for a single iteration of a while loop containing one simple math calc is?
int v1 = 0;
while (v1 < parmIn) {
    v1 += parmIn2;
}
There is something else going on here. The following will complete in ~100ms for me. You say that parmIn can approach MaxInt. If this is true, and parmIn2 is > 1, you're not checking to see if your int plus the increment will overflow. If parmIn >= MaxInt - parmIn2, your loop might never complete, as it will roll over to MinInt and continue.
static void Main(string[] args)
{
    int i = 0;
    int x = int.MaxValue - 50;
    int z = 42;
    System.Diagnostics.Stopwatch st = new System.Diagnostics.Stopwatch();
    st.Start();
    while (i < x)
    {
        i += z;
    }
    st.Stop();
    Console.WriteLine(st.Elapsed.Milliseconds.ToString());
    Console.ReadLine();
}
Assuming an optimal compiler, it should be one operation to check the while condition, and one operation to do the addition.
The time, small as it is, to execute just one iteration of the loop shown in your question is ... surprise ... small.
However, it depends on the actual CPU speed and whatnot exactly how small it is.
It should be just a few machine instructions, so not many cycles to pass once through the iteration, but there could be a few cycles to loop back up, especially if branch prediction fails.
In any case, the code as shown either suffers from:
Premature optimization (in that you're asking about timing for it)
Incorrect assumptions. You can probably get much faster code if parmIn is big by just calculating how many loop iterations you would have to perform and doing a multiplication. (Note again that this might be an incorrect assumption, which is why there is only one sure way to find performance issues: measure, measure, measure.)
What is your real question?
It depends on the processor you are using and the calculation it is performing. (For example, even on some modern architectures, an add may take only one clock cycle, but a divide may take many clock cycles. There is a comparison to determine if the loop should continue, which is likely to be around one clock cycle, and then a branch back to the start of the loop, which may take any number of cycles depending on pipeline size and branch prediction)
IMHO the best way to find out more is to put the code you are interested into a very large loop (millions of iterations), time the loop, and divide by the number of iterations - this will give you an idea of how long it takes per iteration of the loop. (on your PC). You can try different operations and learn a bit about how your PC works. I prefer this "hands on" approach (at least to start with) because you can learn so much more from physically trying it than just asking someone else to tell you the answer.
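A sketch of that hands-on approach (an addition; the iteration count is arbitrary): time a very large loop and divide to get a rough per-iteration figure for this machine.
using System;
using System.Diagnostics;

class PerIteration
{
    static void Main()
    {
        const int iterations = 100000000;   // 100 million
        int v1 = 0, parmIn2 = 3;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            v1 += parmIn2;                  // the body being measured
        sw.Stop();

        double nsPerIteration = sw.Elapsed.TotalMilliseconds * 1e6 / iterations;
        Console.WriteLine(nsPerIteration + " ns per iteration (v1 = " + v1 + ")");
    }
}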
The while loop is a couple of instructions, plus one instruction for the math operation. You're really looking at a minimal execution time for one iteration; it's the sheer number of iterations you're doing that is killing you.
Note that a tight loop like this has implications on other things as well, as it bogs down one CPU and it blocks the UI thread (if it's running on it). Thus, not only it is slow due to the number of operations, it also adds a perceived perf impact due to making the whole machine look unresponsive.
If you're interested in the actual execution time, why not time it for yourself and find out?
int parmIn = 10 * 1000 * 1000; // 10 million
int parmIn2 = 1;               // the increment; declared here so the snippet compiles
int v1 = 0;
Stopwatch sw = Stopwatch.StartNew();
while (v1 < parmIn) {
    v1 += parmIn2;
}
sw.Stop();
double opsPerSec = (double)parmIn / sw.Elapsed.TotalSeconds;
And, of course, the time for one iteration is 1/opsPerSec.
Whenever someone asks how fast control structures are in any language, you know they are trying to optimize the wrong thing. If you find yourself changing all your i++ to ++i, or changing all your switch statements to if...else for speed, you are micro-optimizing. And micro-optimizations almost never give you the speed you want. Instead, think a bit more about what you are really trying to do and devise a better way to do it.
I'm not sure if the code you posted is really what you intend to do or if it is simply the loop stripped down to what you think is causing the problem. If it is the former then what you are trying to do is find the largest value of a number that is smaller than another number. If this is really what you want then you don't really need a loop:
// assuming v1, parmIn and parmIn2 are integers,
// and you want the largest number (v1) that is
// smaller than parmIn but is a multiple of parmIn2.
// AGAIN, assuming INTEGER MATH:
v1 = (parmIn/parmIn2)*parmIn2;
EDIT: I just realized that the code as originally written gives the smallest number that is a multiple of parmIn2 that is larger than parmIn. So the correct code is:
v1 = ((parmIn/parmIn2)*parmIn2)+parmIn2;
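As an aside (an addition, not from the original answer), here is a quick check comparing a closed form with the original loop. Note one edge case: when parmIn is an exact multiple of parmIn2, the original loop stops at parmIn itself, so a ceiling-style formula matches it exactly.
using System;

class ClosedFormCheck
{
    static int ByLoop(int parmIn, int parmIn2)
    {
        int v1 = 0;
        while (v1 < parmIn)
            v1 += parmIn2;
        return v1;
    }

    // Smallest multiple of parmIn2 that is >= parmIn, in one integer expression.
    static int ByFormula(int parmIn, int parmIn2)
    {
        return ((parmIn + parmIn2 - 1) / parmIn2) * parmIn2;
    }

    static void Main()
    {
        Console.WriteLine(ByLoop(10, 5) + " vs " + ByFormula(10, 5));   // 10 vs 10
        Console.WriteLine(ByLoop(11, 5) + " vs " + ByFormula(11, 5));   // 15 vs 15
        Console.WriteLine(ByLoop(999999999, 7) + " vs " + ByFormula(999999999, 7));
    }
}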
If this is not what you really want then my advice remains the same: think a bit on what you are really trying to do (or ask on Stack Overflow) instead of trying to find out whether while or for is faster. Of course, you won't always find a mathematical solution to the problem. In which case there are other strategies to lower the number of loops taken. Here's one based on your current problem: keep doubling the incrementer until it is too large and then back off until it is just right:
int v1 = 0;
int incrementer = parmIn2;
// keep doubling the incrementer to
// speed up the loop:
while (v1 < parmIn) {
    v1 += incrementer;
    incrementer = incrementer * 2;
}
// now v1 is too big, back off
// and resume normal loop:
v1 -= incrementer;
while (v1 < parmIn) {
    v1 += parmIn2;
}
Here's yet another alternative that speeds up the loop:
// First count at 100x speed
while (v1 < parmIn) {
    v1 += parmIn2 * 100;
}
// back off and count at 50x speed
v1 -= parmIn2 * 100;
while (v1 < parmIn) {
    v1 += parmIn2 * 50;
}
// back off and count at 10x speed
v1 -= parmIn2 * 50;
while (v1 < parmIn) {
    v1 += parmIn2 * 10;
}
// back off and count at normal speed
v1 -= parmIn2 * 10;
while (v1 < parmIn) {
    v1 += parmIn2;
}
In my experience, especially with graphics programming where you have millions of pixels or polygons to process, speeding up code usually involve adding even more code which translates to more processor instructions instead of trying to find the fewest instructions possible for the task at hand. The trick is to avoid processing what you don't have to.

Fastest way to calculate primes in C#?

I actually have an answer to my question but it is not parallelized so I am interested in ways to improve the algorithm. Anyway it might be useful as-is for some people.
int Until = 20000000;
BitArray PrimeBits = new BitArray(Until, true);
/*
 * Sieve of Eratosthenes
 * PrimeBits is a simple BitArray where each bit stands for an integer
 * and we mark composite numbers as false
 */
PrimeBits.Set(0, false); // You don't actually need this, just
PrimeBits.Set(1, false); // reminding you that 2 is the smallest prime
for (int P = 2; P < (int)Math.Sqrt(Until) + 1; P++)
    if (PrimeBits.Get(P))
        // These are going to be the multiples of P if it is a prime
        for (int PMultiply = P * 2; PMultiply < Until; PMultiply += P)
            PrimeBits.Set(PMultiply, false);
// We use this to store the actual prime numbers
List<int> Primes = new List<int>();
for (int i = 2; i < Until; i++)
    if (PrimeBits.Get(i))
        Primes.Add(i);
Maybe I could use multiple BitArrays and BitArray.And() them together?
You might save some time by cross-referencing your bit array with a doubly-linked list, so you can more quickly advance to the next prime.
Also, in eliminating later composites once you hit a new prime p for the first time - the first composite multiple of p remaining will be p*p, since everything before that has already been eliminated. In fact, you only need to multiply p by all the remaining potential primes that are left after it in the list, stopping as soon as your product is out of range (larger than Until).
There are also some good probabilistic algorithms out there, such as the Miller-Rabin test. The wikipedia page is a good introduction.
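As a sketch, the question's sieve with that tweak applied (same fragment style and using directives as the original; the square-root bound is also hoisted out of the loop condition, which the next answer touches on):
int Until = 20000000;
BitArray PrimeBits = new BitArray(Until, true);
PrimeBits.Set(0, false);
PrimeBits.Set(1, false);

int limit = (int)Math.Sqrt(Until) + 1;      // computed once, not per iteration
for (int P = 2; P < limit; P++)
    if (PrimeBits.Get(P))
        // Everything below P*P was already crossed out by smaller primes.
        for (int PMultiply = P * P; PMultiply < Until; PMultiply += P)
            PrimeBits.Set(PMultiply, false);

List<int> Primes = new List<int>();
for (int i = 2; i < Until; i++)
    if (PrimeBits.Get(i))
        Primes.Add(i);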
Parallelisation aside, you don't want to be calculating sqrt(Until) on every iteration. You also can assume multiples of 2, 3 and 5 and only calculate for N%6 in {1,5} or N%30 in {1,7,11,13,17,19,23,29}.
You should be able to parallelize the factoring algorithm quite easily, since the Nth stage only depends on the sqrt(n)th result, so after a while there won't be any conflicts. But that's not a good algorithm, since it requires lots of division.
You should also be able to parallelize the sieve algorithms, if you have writer work packets which are guaranteed to complete before a read. Mostly the writers shouldn't conflict with the reader - at least once you've done a few entries, they should be working at least N above the reader, so you only need a synchronized read fairly occasionally (when N exceeds the last synchronized read value). You shouldn't need to synchronize the bool array across any number of writer threads, since write conflicts don't arise (at worst, more than one thread will write a true to the same place).
The main issue would be to ensure that any worker being waited on to write has completed. In C++ you'd use a compare-and-set to switch to the worker which is being waited for at any point. I'm not a C# wonk so don't know how to do it in that language, but the Win32 InterlockedCompareExchange function should be available.
You also might try an actor-based approach, since that way you can schedule the actors working with the lowest values, which may make it easier to guarantee that you're reading valid parts of the sieve without having to lock the bus on each increment of N.
Either way, you have to ensure that all workers have got above entry N before you read it, and the cost of doing that is where the trade-off between parallel and serial is made.
Without profiling we cannot tell which bit of the program needs optimizing.
In a large system, you would use a profiler to find that the prime number generator is the part that needs optimizing.
Profiling a loop with a dozen or so instructions in it is not usually worthwhile - the overhead of the profiler is significant compared to the loop body, and about the only way to improve a loop that small is to change the algorithm to do fewer iterations. So IME, once you've eliminated any expensive functions and have a known target of a few lines of simple code, you're better off changing the algorithm and timing an end-to-end run than trying to improve the code by instruction-level profiling.
#DrPizza Profiling only really helps improve an implementation; it doesn't reveal opportunities for parallel execution or suggest better algorithms (unless you have experience to the contrary, in which case I'd really like to see your profiler).
I've only got single-core machines at home, but I ran a Java equivalent of your BitArray sieve, and a single-threaded version of the inversion of the sieve - holding the marking primes in an array, and using a wheel to reduce the search space by a factor of five, then marking a bit array in increments of the wheel using each marking prime. It also reduces storage to O(sqrt(N)) instead of O(N), which helps both in terms of the largest N, paging, and bandwidth.
For medium values of N (1e8 to 1e12), the primes up to sqrt(N) can be found quite quickly, and after that you should be able to parallelise the subsequent search on the CPU quite easily. On my single core machine, the wheel approach finds primes up to 1e9 in 28s, whereas your sieve (after moving the sqrt out of the loop) takes 86s - the improvement is due to the wheel; the inversion means you can handle N larger than 2^32 but makes it slower. Code can be found here. You could parallelise the output of the results from the naive sieve after you go past sqrt(N) too, as the bit array is not modified after that point; but once you are dealing with N large enough for it to matter the array size is too big for ints.
You also should consider a possible change of algorithms.
Consider that it may be cheaper to simply add the elements to your list, as you find them.
Perhaps preallocating space for your list will make it cheaper to build/populate.
Are you trying to find new primes? This may sound stupid, but you might be able to load up some sort of a data structure with known primes. I am sure someone out there has a list. It might be a much easier problem to look up existing numbers than to calculate new ones.
You might also look at Microsoft's Parallel FX Library for making your existing code multi-threaded to take advantage of multi-core systems. With minimal code changes you can make your for loops multi-threaded.
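A minimal sketch of what that library makes easy (an addition; it uses trial division rather than the sieve because each candidate can then be tested independently, which makes the loop trivially parallel, and the bound is an arbitrary choice):
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class ParallelPrimes
{
    static bool IsPrime(int n)
    {
        if (n < 2) return false;
        if (n % 2 == 0) return n == 2;
        for (int i = 3; (long)i * i <= n; i += 2)
            if (n % i == 0) return false;
        return true;
    }

    static void Main()
    {
        var primes = new ConcurrentBag<int>();          // thread-safe collection for the results
        Parallel.For(2, 1000000, n =>
        {
            if (IsPrime(n)) primes.Add(n);
        });
        Console.WriteLine(primes.Count + " primes below 1,000,000");   // expect 78498
    }
}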
There's a very good article about the Sieve of Eratosthenes: The Genuine Sieve of Eratosthenes
It's in a functional setting, but most of the optimizations also apply to a procedural implementation in C#.
The two most important optimizations are to start crossing out at P^2 instead of 2*P and to use a wheel for the next prime numbers.
For concurrency, you can process all numbers till P^2 in parallel to P without doing any unnecessary work.
void PrimeNumber(long number)
{
    bool isPrimeNumber = true;
    long value = Convert.ToInt64(Math.Sqrt(number));
    if (number < 2 || (number % 2 == 0 && number != 2))
    {
        MessageBox.Show("No, it is not a prime number");
        return;
    }
    for (long i = 3; i <= value; i = i + 2)
    {
        if (number % i == 0)
        {
            MessageBox.Show("It is divisible by " + i);
            isPrimeNumber = false;
            break;
        }
    }
    if (isPrimeNumber)
    {
        MessageBox.Show("Yes, it is a prime number");
    }
    else
    {
        MessageBox.Show("No, it is not a prime number");
    }
}
