Let me explain my situation.
I have a 1-producer-to-N-consumers pattern. I'm using blocking collections and everything is working well. While doing some testing I noticed this strange behavior:
I was testing how long my data manipulation took in my consumers.
Below you'll find the code stripped of my manipulation, which still reproduces the strange behavior.
I have 4 consumers for 1 producer.
For most of the data the console doesn't print anything, because ts = 0 (it's under a tick), but randomly (roughly every 1 to 5 seconds) it prints something like this (not in this exact order, but of the same kind):
10000
20001
10000
30002
10000
40003
10000
10000
It is on the order of 10,000 ticks, so around 1 ms, and always a number of the form (N)000(N-1).
Note that the BlockingCollection I consume from is filled by network events which occur at completely random times; nothing regular comes from there.
The timing is almost perfect, always a multiple of 10,000 ticks.
What could be behind this? Thanks!
while (IsAlive)
{
    DataToFieldMapping item;
    try
    {
        // -1 timeout: block indefinitely until an item is available
        _CollectionToConsume.TryTake(out item, -1);
    }
    catch
    {
        item = null;
    }

    if (item != null)
    {
        long ts = (DateTime.Now.Ticks - item.TimeStamp.Ticks);
        if (ts > 10)
            Console.WriteLine(ts);
    }
}
What's going on here is that DateTime.Now has fairly limited precision. It does not give you the time to the nearest tick; it is only updated about every 10,000 ticks or so, which is why you generally see multiples of 10k ticks in your prints.
If you really want to get a better feel for the duration of those events, use the Stopwatch class, which has much higher precision. That said, Stopwatch is primarily a diagnostic tool (hence why it's in the Diagnostics namespace). You should only be using it to help you diagnose what's going on, and should not be relying on it in production code.
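For example, if you can afford to add a field to the item (here a hypothetical long EnqueuedAt on DataToFieldMapping), you could stamp it with Stopwatch.GetTimestamp() on the producer side and convert the difference on the consumer side. A minimal sketch, not your exact code:
// Producer side (sketch): stamp the item with the high-resolution counter.
item.EnqueuedAt = Stopwatch.GetTimestamp();   // EnqueuedAt is a hypothetical long field

// Consumer side (sketch): convert counter ticks to milliseconds.
long counterTicks = Stopwatch.GetTimestamp() - item.EnqueuedAt;
double elapsedMs = counterTicks * 1000.0 / Stopwatch.Frequency;
Console.WriteLine(elapsedMs);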
On a side note, there really isn't any need for a timer or polling here at all. It appears that you're creating several consumers that poll the BlockingCollection for new content. There is no reason to do this; they can simply block until the collection has items (hence the name, BlockingCollection).
The easiest way is for the consumers to simply do this:
foreach (var item in _CollectionToConsume.GetConsumingEnumerable())
    ProcessItem(item);
Then just run that code in a background thread.
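A minimal sketch of that wiring, assuming a ProcessItem method on your side and one Task (System.Threading.Tasks) per consumer:
// Start each consumer on a background task; the foreach blocks until items
// arrive and ends once CompleteAdding() is called on the collection.
Task.Run(() =>
{
    foreach (var item in _CollectionToConsume.GetConsumingEnumerable())
        ProcessItem(item);
});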
If you write the following and run it, you'll see that the tick count does not advance one tick at a time, but rather in relatively large chunks, because the effective resolution of DateTime.Now is much coarser than a single tick.
for (int i = 0; i < 100; i++)
{
    Console.WriteLine(DateTime.Now.Ticks);
}
Use the Stopwatch class to measure performance; it uses a high-resolution timer, which is much more suitable for this purpose.
Related
I have a method that does some simple operations like +, -, *, /.
I need to run this method 1513 times.
First I tried running the method only once, to check that it works and to see how long the operations take to finish:
Stopwatch st = new Stopwatch();
st.Start();
DiagramValue dv = new DiagramValue();
double pixel = dv.CalculateYPixel(23.46, diction);
st.Stop();
When it stops, the stopwatch tells me the time is 0.06 s.
When I run the same method 1513 times in a for loop like this:
Stopwatch st = new Stopwatch();
st.Start();
for (int i = 0; i < 1513; i++)
{
    DiagramValue dv = new DiagramValue();
    double pixel = dv.CalculateYPixel(23.46, diction);
}
st.Stop();
Then the stopwatch tells me it takes around 0.14 s, or 0.14 s / 1513 calls ≈ 0.00009 s per call.
My question is: why is the method so slow when I run it only once, while running it about 1500 times in a for loop takes almost the same total time?
Writing benchmarks is hard.
First, Stopwatch isn't infinitely accurate. When you run the method just once, you're very much limited by the accuracy of the underlying stopwatch. On the other hand, running the method multiple times alleviates this - you can get arbitrary precision by using a big enough loop. Instead of 1 vs 1513, compare e.g. 1500 vs. 3000. You'll get around 100% time increase, as expected.
Second, there's usually some cost with the first call in particular (e.g. JIT compilation) or with the memory pressure at the time of the call. That's why you usually need to do "preheating" - run the method outside of the stopwatch first to isolate these, and measure (multiple invocations) later.
Third, in a garbage collected environment like .NET, the guy who ordered the beer isn't necessarily the guy who pays the bill. Most of the cost of memory allocation in .NET is in the collection, rather than the allocation itself (which is about as cheap as a stack allocation). The collection usually happens outside of the code that caused the allocations in the first place, pointing you in the entirely wrong direction when searching for performance issues. That's why most .NET memory trackers display garbage collection separately - it's important to take account of, but can easily mislead you as to the cause if you're not careful.
There are many more issues, but these should cover your particular scenario well enough.
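Putting the first two points together, here is a minimal sketch (using the DiagramValue class and diction variable from your question) that warms the method up first and then compares 1500 against 3000 iterations:
// Warm-up: run the method once outside the stopwatch so JIT cost is excluded.
new DiagramValue().CalculateYPixel(23.46, diction);

Stopwatch st = new Stopwatch();

st.Restart();
for (int i = 0; i < 1500; i++)
    new DiagramValue().CalculateYPixel(23.46, diction);
st.Stop();
Console.WriteLine("1500 calls: " + st.Elapsed.TotalMilliseconds + " ms");

st.Restart();
for (int i = 0; i < 3000; i++)
    new DiagramValue().CalculateYPixel(23.46, diction);
st.Stop();
Console.WriteLine("3000 calls: " + st.Elapsed.TotalMilliseconds + " ms");

// With the warm-up in place, the second figure should be roughly double the first.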
Some possible reasons include:
Timing resolution. You get a more accurate figure when you find the mean over a large number of iterations.
Noise. The percentage of measured time spent on things you don't actually want to record will differ between runs.
Jitting. .NET compiles a method to native code the first time it is used, so the first call in a program's lifetime takes longer, often by a large factor (try running it once and then measuring the second attempt; see the sketch after this list).
Branch prediction. If you keep doing the same thing with the same data, the CPU's branch predictor is going to get better at predicting which branches are taken.
GC stability. Not likely in this case, but possible. Often, at the start of a set of operations that requires particular objects to be created and then released, the program ends up having to get more memory from the OS. Once it is a bit into that set of operations, it is more likely to have reached a steady state where it can get that memory by cleaning out objects it isn't using any more, which is faster.
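To see the jitting effect in isolation, you can time the very first call separately from the second one (again using the DiagramValue example from the question):
Stopwatch st = new Stopwatch();

st.Start();
double first = new DiagramValue().CalculateYPixel(23.46, diction);  // includes JIT cost
st.Stop();
Console.WriteLine("First call:  " + st.Elapsed);

st.Restart();
double second = new DiagramValue().CalculateYPixel(23.46, diction); // already jitted
st.Stop();
Console.WriteLine("Second call: " + st.Elapsed);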
I am trying to capture the exact execution time of a function:
Stopwatch regularSW = new Stopwatch();
for (int i = 0; i < 10; i++) {
    regularSW.Start();
    //function();
    regularSW.Stop();
    Console.WriteLine("Measured time: " + regularSW.Elapsed);
}
I also tried with DateTime and Process.GetCurrentProcess().TotalProcessorTime
but each time I get a different value.
How can I get the same value every time?
With Stopwatch you are already using the most accurate approach, but you are not restarting it inside the loop, so it keeps accumulating from the value where it stopped. Either create a new Stopwatch each time or call Stopwatch.Restart instead of Start:
Stopwatch regularSW = new Stopwatch();
for (int i = 0; i < 10; i++) {
    regularSW.Restart();
    //function();
    regularSW.Stop();
    Console.WriteLine("Measured time: " + regularSW.Elapsed);
}
That's the reason for the different values. If you still get different values after this, then the function really does have varying execution times, which is not that unlikely (e.g. if it's a database query).
Since this question seems to be largely theoretical (judging from your comments), consider the following things if you want to measure time in .NET:
Compile and run in Release mode, Any CPU (on an x64 machine), with optimizations on.
A tick is 0.0001 milliseconds, so don't overestimate your results
They are different because you cannot control what other operations your system might need to perform in the background while your C# program is running.
If, for example, you allocate memory in the method because you fill a local list, then the garbage collector might kick in to reclaim memory.
C# code is compiled Just In Time. The first time you go through a loop can therefore be hundreds or thousands of times more expensive than every subsequent time due to the cost of the jitter analyzing the code that the loop calls. If you are intending on measuring the "warm" cost of a loop then you need to run the loop once before you start timing it. If you are intending on measuring the average cost including the jit time then you need to decide how many times makes up a reasonable number of trials, so that the average works out correctly
you are running your code in a multithreaded, multiprocessor environment where threads can be switched at will, and where the thread quantum (the amount of time the operating system will give another thread until yours might get a chance to run again) is about 16 milliseconds. 16 milliseconds is about fifty million processor cycles. Coming up with accurate timings of sub-millisecond operations can be quite difficult if the thread switch happens within one of the several million processor cycles that you are trying to measure. Take that into consideration.
The last two points were copied from this answer of Eric Lippert (worth reading).
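One way to dampen the effect of that last point (thread switching) is to raise the process and thread priority for the duration of the measurement. This only reduces, not eliminates, scheduling noise; a minimal sketch:
using System.Diagnostics;
using System.Threading;

// Reduce (not eliminate) the chance of being preempted mid-measurement.
Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.High;
Thread.CurrentThread.Priority = ThreadPriority.Highest;

Stopwatch sw = Stopwatch.StartNew();
//function();
sw.Stop();
Console.WriteLine("Measured time: " + sw.Elapsed);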
My question consists of 2 parts:
Is there any good way in C# to measure computation effort other than using timers such as Stopwatch? Below is what I have been doing, but the granularity is not great and the result varies every run. I am wondering if there is a more precise measure, such as a CPU operation count, so that the result is consistent.
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();
//do work
stopWatch.Stop();
TimeSpan ts = stopWatch.Elapsed;
Console.WriteLine(ts);
If the alternative approach in 1 is not possible, how can I make the performance test result less variable? What are some factors that can make the result change? Would closing all other running applications help? (I did try that, but there seems to be no significant effect.) How about running the test on a VM, in a sandbox, etc.?
(After typing the preceding text I realized that I have also tried the Performance Analysis feature that comes with Visual Studio. The result seems coarser because of the sampling method it uses, so I also want to rule out that option.)
Really you need a profiling tool. But you can use Stopwatch more reliably if you run your test in a loop multiple times and only keep the result of a run if the garbage collection count stays the same.
Like this:
var timespans = new List<TimeSpan>();
while (true)
{
    var count = GC.CollectionCount(0);

    var sw = Stopwatch.StartNew();
    /* run test here */
    sw.Stop();

    // Keep the result only if no gen-0 collection happened during the test.
    if (count == GC.CollectionCount(0))
    {
        timespans.Add(sw.Elapsed);
    }

    if (timespans.Count == 100)
    {
        break;
    }
}
That'll give you 100 tests where garbage collection didn't occur. The average is then pretty good to work from.
If you find that your tests never run without a garbage collection being invoked, try working out the minimum number of GCs that get triggered and collect your time spans only when that number occurs.
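To make it more likely that a timed run completes without a collection, you can also force a full collection just before starting the stopwatch (a common benchmarking pattern, sketched here under the same assumptions as the loop above):
// Clean the heap up front so a pending collection is less likely to fire
// inside the timed region.
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();

var count = GC.CollectionCount(0);
var sw = Stopwatch.StartNew();
/* run test here */
sw.Stop();
if (count == GC.CollectionCount(0))
{
    timespans.Add(sw.Elapsed);
}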
You could query a system performance counter. The MSDN documentation for the System.Diagnostics.PerformanceCounter class has some examples. With this class you could query "\Process(your_process_name)\% Processor Time", for example. It's an alternative to Stopwatch, but to be honest I think just using Stopwatch and averaging many runs is a perfectly good way to go.
If what you need is a higher-resolution stopwatch because you are trying to measure a very small slice of CPU time, then you may be interested in the high-performance counter.
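A rough sketch of the counter approach (the category and counter names are the standard Windows ones; note that the first NextValue() call returns 0, so the counter has to be sampled twice over an interval):
using System.Diagnostics;
using System.Threading;

var proc = Process.GetCurrentProcess();
using (var cpu = new PerformanceCounter("Process", "% Processor Time", proc.ProcessName))
{
    cpu.NextValue();        // prime the counter; the first sample is always 0
    Thread.Sleep(1000);     // let it accumulate over an interval
    // The counter reports per-core percentages, so normalize by the core count.
    Console.WriteLine("CPU: " + cpu.NextValue() / Environment.ProcessorCount + " %");
}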
I am currently trying to write an application that runs the same code exactly 100 times a second. I have done some testing using the built-in timers of the .NET framework. I've tested the System.Threading.Timer class, the System.Windows.Forms.Timer class and the System.Timers.Timer class. None of them seem to be accurate enough for what I am trying to do.
I've found out about PerformanceCounters and am currently trying to implement such in my application.
However, I am having a bit of trouble with my program taking up a whole core of my CPU when idling.
I only need it to be active 100 times a second. My loop looks like this:
long nextTick, nextMeasure;

QueryPerformanceCounter(out start);
nextTick = start + countsPerTick;
nextMeasure = start + performanceFrequency;

long currentCount;
while (true)
{
    QueryPerformanceCounter(out currentCount);

    if (currentCount >= nextMeasure)
    {
        Debug.Print("Ticks this second: " + tickCount);
        tickCount = 0;
        nextMeasure += performanceFrequency;
    }

    if (currentCount >= nextTick)
    {
        Calculations();
        tickCount++;
        nextTick += countsPerTick;
    }
}
As you can see, most of the time the program is just waiting to run Calculations() again by spinning through the while loop constantly. Is there a way to stop this from happening? I don't want to slow down the computers my program will be run on.
System.Threading.Thread.Sleep is unfortunately also pretty "inaccurate", but I would be okay with using it if there is no other solution.
What I am basically asking is this: Is there a way to make an infinite loop less CPU-intensive? Is there any other way of accurately waiting for a specific amount of time?
As I'm sure you're aware, Windows is not a real-time O/S, so there can never be any guarantee that your code will run as often as you want.
Having said that, the most efficient approach in terms of yielding to other threads is probably to use Thread.Sleep() as the timer. If you want higher accuracy than the default, you can call timeBeginPeriod with the desired resolution, down to one millisecond. The function must be imported via DllImport from winmm.dll.
timeBeginPeriod(1) together with a normal timer or Thread.Sleep should work decently.
Note that this has a global effect. There are claims that it increases power consumption, since it forces the Windows timer to fire more often, shortening the CPU's sleep periods, which means you should generally avoid it. But if you need highly accurate timing, it's certainly a better choice than a busy-wait.
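A minimal sketch of that combination (the P/Invoke signatures are winmm.dll's timeBeginPeriod/timeEndPeriod; the helper names are just for illustration, and Sleep(10) will still drift slightly since the time spent in Calculations() isn't subtracted):
using System;
using System.Runtime.InteropServices;
using System.Threading;

static class TimedLoop
{
    [DllImport("winmm.dll")]
    static extern uint timeBeginPeriod(uint uMilliseconds);

    [DllImport("winmm.dll")]
    static extern uint timeEndPeriod(uint uMilliseconds);

    public static void RunAtRoughly100Hz(Action calculations, Func<bool> keepRunning)
    {
        timeBeginPeriod(1);             // ask for ~1 ms timer resolution (global effect)
        try
        {
            while (keepRunning())
            {
                calculations();
                Thread.Sleep(10);       // roughly 100 iterations per second, not exact
            }
        }
        finally
        {
            timeEndPeriod(1);           // always restore the default resolution
        }
    }
}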
We were having a performance issue in a C# while loop. The loop was super slow doing only one simple math calc. Turns out that parmIn can be a huge number anywhere from 999999999 to MaxInt. We hadn't anticipated the giant value of parmIn. We have fixed our code using a different methodology.
The loop, simplified below, did one math calculation. I am just curious: what is the actual execution time of a single iteration of a while loop containing one simple math calculation?
int v1 = 0;
while (v1 < parmIn) {
    v1 += parmIn2;
}
There is something else going on here. The following completes in ~100 ms for me. You say that parmIn can approach MaxInt. If that is true, and parmIn2 is > 1, you're not checking whether your int plus the increment will overflow. If v1 gets above int.MaxValue - parmIn2, the addition wraps around to a negative value and the loop might never complete.
static void Main(string[] args)
{
    int i = 0;
    int x = int.MaxValue - 50;
    int z = 42;

    System.Diagnostics.Stopwatch st = new System.Diagnostics.Stopwatch();
    st.Start();
    while (i < x)
    {
        i += z;
    }
    st.Stop();

    // ElapsedMilliseconds, not Elapsed.Milliseconds (which is only the
    // millisecond component of the TimeSpan).
    Console.WriteLine(st.ElapsedMilliseconds.ToString());
    Console.ReadLine();
}
Assuming an optimal compiler, it should be one operation to check the while condition, and one operation to do the addition.
The time, small as it is, to execute just one iteration of the loop shown in your question is ... surprise ... small.
However, it depends on the actual CPU speed and whatnot exactly how small it is.
It should be just a few machine instructions, so not many cycles to pass once through the iteration, but there could be a few cycles to loop back up, especially if branch prediction fails.
In any case, the code as shown either suffers from:
Premature optimization (in that you're asking about timing for it)
Incorrect assumptions. You can probably get much faster code when parmIn is big by just calculating how many loop iterations you would have to perform and doing a multiplication. (Note again that this might be an incorrect assumption, which is why there is only one sure way to find performance issues: measure, measure, measure.)
What is your real question?
It depends on the processor you are using and the calculation it is performing. (For example, even on some modern architectures, an add may take only one clock cycle, but a divide may take many clock cycles. There is a comparison to determine if the loop should continue, which is likely to be around one clock cycle, and then a branch back to the start of the loop, which may take any number of cycles depending on pipeline size and branch prediction)
IMHO the best way to find out more is to put the code you are interested in into a very large loop (millions of iterations), time the loop, and divide by the number of iterations; this will give you an idea of how long it takes per iteration of the loop (on your PC). You can try different operations and learn a bit about how your PC works. I prefer this "hands on" approach (at least to start with) because you can learn so much more from physically trying it than from just asking someone else to tell you the answer.
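For instance, a minimal sketch of that approach for the add-and-compare from the question (the iteration count and increment are arbitrary example values):
const int iterations = 100000000;    // 100 million
int v1 = 0;
int parmIn2 = 3;                     // example increment

Stopwatch sw = Stopwatch.StartNew();
for (int i = 0; i < iterations; i++)
{
    v1 += parmIn2;                   // the loop body being characterized
}
sw.Stop();

Console.WriteLine("Per iteration: " +
    (sw.Elapsed.TotalMilliseconds * 1000000.0 / iterations) + " ns");
Console.WriteLine(v1);               // print v1 so the JIT can't drop the loop entirely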
The while loop is a couple of instructions, plus one instruction for the math operation. You're really looking at a minimal execution time for one iteration; it's the sheer number of iterations you're doing that is killing you.
Note that a tight loop like this has implications for other things as well, as it bogs down one CPU and blocks the UI thread (if it's running on it). Thus, not only is it slow due to the number of operations, it also adds a perceived performance impact by making the whole application look unresponsive.
If you're interested in the actual execution time, why not time it for yourself and find out?
int parmIn = 10 * 1000 * 1000; // 10 million
int parmIn2 = 1;               // increment of 1, so the iteration count equals parmIn
int v1 = 0;

Stopwatch sw = Stopwatch.StartNew();
while (v1 < parmIn) {
    v1 += parmIn2;
}
sw.Stop();

double opsPerSec = (double)parmIn / sw.Elapsed.TotalSeconds;
And, of course, the time for one iteration is 1/opsPerSec.
Whenever someone asks how fast control structures in any language are, you know they are trying to optimize the wrong thing. If you find yourself changing all your i++ to ++i, or changing all your switch statements to if...else for speed, you are micro-optimizing. And micro-optimizations almost never give you the speed you want. Instead, think a bit more about what you are really trying to do and devise a better way to do it.
I'm not sure if the code you posted is really what you intend to do or if it is simply the loop stripped down to what you think is causing the problem. If it is the former then what you are trying to do is find the largest value of a number that is smaller than another number. If this is really what you want then you don't really need a loop:
// assuming v1, parmIn and parmIn2 are integers,
// and you want the largest number (v1) that is
// smaller than parmIn but is a multiple of parmIn2.
// AGAIN, assuming INTEGER MATH:
v1 = (parmIn/parmIn2)*parmIn2;
EDIT: I just realized that the code as originally written gives the smallest number that is a multiple of parmIn2 that is larger than parmIn. So the correct code is:
v1 = ((parmIn/parmIn2)*parmIn2)+parmIn2;
If this is not what you really want then my advice remains the same: think a bit about what you are really trying to do (or ask on Stack Overflow) instead of trying to find out whether while or for is faster. Of course, you won't always find a mathematical solution to the problem, in which case there are other strategies to lower the number of loop iterations. Here's one based on your current problem: keep doubling the incrementer until it is too large, and then back off until it is just right:
int v1 = 0;
int incrementer = parmIn2;

// keep doubling the incrementer to
// speed up the loop:
while (v1 < parmIn) {
    v1 += incrementer;
    incrementer = incrementer * 2;
}

// now v1 is too big; back off by the last step actually added
// (the incrementer was doubled after the add) and resume the normal loop:
incrementer = incrementer / 2;
v1 -= incrementer;
while (v1 < parmIn) {
    v1 += parmIn2;
}
Here's yet another alternative that speeds up the loop:
// First count at 100x speed
while (v1 < parmIn) {
    v1 += parmIn2 * 100;
}

// back off and count at 50x speed
v1 -= parmIn2 * 100;
while (v1 < parmIn) {
    v1 += parmIn2 * 50;
}

// back off and count at 10x speed
v1 -= parmIn2 * 50;
while (v1 < parmIn) {
    v1 += parmIn2 * 10;
}

// back off and count at normal speed
v1 -= parmIn2 * 10;
while (v1 < parmIn) {
    v1 += parmIn2;
}
In my experience, especially with graphics programming where you have millions of pixels or polygons to process, speeding up code usually involves adding even more code, which translates to more processor instructions, rather than trying to find the fewest instructions possible for the task at hand. The trick is to avoid processing what you don't have to.