With the same code I get:
5-10% CPU usage with IsFixedTimeStep = true and TargetElapsedTime = TimeSpan.FromSeconds(1 / 60f)
50-60% CPU usage with IsFixedTimeStep = true and TargetElapsedTime = TimeSpan.FromSeconds(1 / 30f)
By decreasing the frame rate, one should expect less CPU usage, not more.
I have tried different code with similar results.
Anyone know the cause?
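For reference, those two settings live in the Game constructor; a minimal sketch of the two configurations (the rest of the game code is omitted here) looks like this:

using System;
using Microsoft.Xna.Framework;

public class Game1 : Game
{
    public Game1()
    {
        // Configuration that uses 5-10% CPU on my machine:
        IsFixedTimeStep = true;
        TargetElapsedTime = TimeSpan.FromSeconds(1 / 60f);

        // Configuration that uses 50-60% CPU:
        // IsFixedTimeStep = true;
        // TargetElapsedTime = TimeSpan.FromSeconds(1 / 30f);
    }
}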
If I had to guess (which I have to, because you've provided very little information to go on), I'd say it's an interaction between the GPU and CPU.
Take a look at this blog post.
Basically, at 60 FPS, you're probably being GPU-limited. The CPU is sitting idle waiting for the GPU to draw a frame before it starts on drawing another one. You're probably dropping frames.
At 30 FPS, the GPU is able to keep up, and so the CPU has to send out frames more frequently.
But, again, this is just a guess. You'd have to instrument your code to properly check for these things.
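For what it's worth, a minimal way to do that instrumentation in XNA is to time Update and Draw with a Stopwatch. This is just a sketch (it assumes a standard Game subclass, and writing to the window title is only for illustration):

using System.Diagnostics;
using Microsoft.Xna.Framework;

public class InstrumentedGame : Game
{
    private readonly Stopwatch updateTimer = new Stopwatch();
    private readonly Stopwatch drawTimer = new Stopwatch();

    protected override void Update(GameTime gameTime)
    {
        updateTimer.Restart();
        base.Update(gameTime);   // your update logic
        updateTimer.Stop();
    }

    protected override void Draw(GameTime gameTime)
    {
        drawTimer.Restart();
        base.Draw(gameTime);     // your rendering
        drawTimer.Stop();

        // IsRunningSlowly is XNA's own signal that it is dropping frames to catch up.
        Window.Title = string.Format("Update {0:F2} ms, Draw {1:F2} ms, dropping frames: {2}",
            updateTimer.Elapsed.TotalMilliseconds,
            drawTimer.Elapsed.TotalMilliseconds,
            gameTime.IsRunningSlowly);
    }
}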
Related
I'm looking for some kind of timer that has a higher resolution than the Windows default of ~15ms. I don't need a timer for time measurement but rather a timer that is able to wait X milliseconds (or call an event every X milliseconds). I know it's possible to change the Windows timer resolution with NtSetTimerResolution, but that affects all applications (which I don't want). I don't need much precision, so say if I'm looking for 2ms then 1.5ms and 2.5ms would be OK too.
Using spinners works, but this obviously causes too much CPU usage. Creative ideas are welcome too, as long as they can get the job done.
NtSetTimerResolution and timeBeginPeriod can increase the timer resolution, but they are system-wide. If anyone has a good idea, please tell me.
I don't recommend that you do this. Google has modified Chrome to increase the timer frequency only when necessary, which works in most cases.
The default timer resolution on Windows is 15.6 ms – the timer interrupts 64 times per second. Programs that increase the timer frequency increase power consumption and impair battery life. They also waste more computing power than I expected – they slow down your computer! Because of these problems, Microsoft has been telling developers for years not to increase the timer frequency.
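That said, if you still need to wait a couple of milliseconds without spinning the whole time and without touching the system-wide timer resolution, one compromise is a hybrid wait: Sleep through the coarse part of the interval and only spin for the last stretch. A C# sketch (note that for targets below the current timer resolution, e.g. the 2 ms case, the Sleep part contributes nothing and this degenerates into the spinner you were trying to avoid):

using System.Diagnostics;
using System.Threading;

static class HybridWait
{
    // Blocks for roughly the requested number of milliseconds:
    // coarse Sleep(1) calls first, a short spin at the end.
    public static void Wait(double milliseconds)
    {
        Stopwatch sw = Stopwatch.StartNew();

        // Sleep while there is comfortably more than one timer tick (~15.6 ms) left.
        while (sw.Elapsed.TotalMilliseconds < milliseconds - 16.0)
            Thread.Sleep(1);

        // Spin away the remainder for accuracy.
        while (sw.Elapsed.TotalMilliseconds < milliseconds)
            Thread.SpinWait(100);
    }
}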
Currently my application has a major bottleneck when it comes to GPU/CPU data sharing.
Basically I am selecting multiple items; each item becomes a buffer and then a 2D texture (all of the same size), and they all get blended together on the GPU. After that I need to know various things about the blend result, which sits on the GPU as a single-channel float texture:
Maximum & Minimum value in the texture
Average value
Sum of all values
Effectively I ended up with the very slow roundabout process of:
Put data on the GPU * N
Read data from GPU
Cycle data on CPU looking for values
Obviously a CPU profile shows the two major hot spots as the writes and the read. The textures are in the 100x100 range, not 1000x1000, but there are a lot of them.
There are 3 things I am currently considering:
Combine all the data & find the interesting values on the CPU before putting anything on the GPU (which seems to make the GPU pointless, and some of the blends are complex)
When loading the data put it all onto the GPU (as texture levels, therefore skipping the lag on item selection in favor of a slower load)
Calculate the "interesting data" on the GPU and just have the CPU read back those values
On my machine and with the data I have worked with, throwing all the data onto the GPU would barely dent the GPU memory. The highest I have seen so far is 9000 entries of 170 x 90; as it's single-channel float, by my maths that comes out at about 1/2 GB. That isn't a problem on my machine, but I could see it being a problem on the average laptop. Can I get a GPU to page from HDD? Is this even worth pursuing?
Sorry for asking such a broad question but I am looking for the most fruitful avenue to pursue and each avenue would be new ground to me. Profiling seems to highlight readback as the biggest problem at the moment. Could I improve this by changing FBO/Texture settings?
At the moment I am working in SharpGL and preferably need to stick to OpenGL 3.3. If, however, there is a route to a rapid performance improvement with a particular technique that is out of reach due to either video memory or GL version, I might be able to make a case for raising the software's system requirements.
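If option 3 looks attractive, one GL 3.3-friendly trick for the average (and therefore the sum, which is just average × texel count) is to let the driver build the mip chain of the blended result and read back only the 1x1 top level, so only a single float crosses the bus; min/max would still need a small reduction shader. The sketch below is only a rough outline against SharpGL: the GenerateMipmapEXT and GetTexImage overloads are assumed to mirror the C API and may need adjusting to your SharpGL version, and mip averaging of a non-power-of-two texture (170 x 90) is only approximately the true mean.

using System;
using System.Runtime.InteropServices;
using SharpGL;

static class TextureStats
{
    // Approximate average of a single-channel float texture, read from its 1x1 mip level.
    public static float AverageViaMipmaps(OpenGL gl, uint blendResultTexture, int width, int height)
    {
        gl.BindTexture(OpenGL.GL_TEXTURE_2D, blendResultTexture);

        // Driver-side reduction: build the full mip chain down to 1x1.
        gl.GenerateMipmapEXT(OpenGL.GL_TEXTURE_2D);

        // Index of the 1x1 level for a width x height texture.
        int topLevel = (int)Math.Floor(Math.Log(Math.Max(width, height), 2));

        // Read back one float instead of the whole texture.
        float[] average = new float[1];
        GCHandle pinned = GCHandle.Alloc(average, GCHandleType.Pinned);
        try
        {
            gl.GetTexImage(OpenGL.GL_TEXTURE_2D, topLevel, OpenGL.GL_RED,
                           OpenGL.GL_FLOAT, pinned.AddrOfPinnedObject());
        }
        finally
        {
            pinned.Free();
        }
        return average[0];
    }
}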
I just noticed today that when I compile and run a new XNA 4.0 game, one of the CPU threads runs at 100% and the framerate drops to 54 FPS.
The weird thing is that sometimes it works at 60 FPS, but then it just drops to 54 FPS.
I haven't noticed this behaviour before, so I don't know if this is normal. I uninstalled my antivirus and reinstalled XNA Game Studio, XNA Redistributable and .NET Framework 4.
If I set IsFixedTimeStep to false, the game runs at 60 FPS and CPU usage is minimal (1-2%). But as far as I know, this requires me to do velocity calculations using ElapsedGameTime, and I don't know how to do that, since I'm fairly new to XNA. Some also say that setting it to false reduces jerky animations.
I already checked this forum thread, but no one has found a good solution.
Has anyone experienced this problem?
EDIT:
I did some more research and implemented an FPS counter (until now I had measured it with Fraps), and my counter shows the game running at 60 FPS (with IsFixedTimeStep = true), so that settles the FPS issue, but the high CPU usage remains. Is it possible that this happens to everyone?
According to this discussion on the Xbox Live Indie Games forum, apparently on some processors (and OSes) XNA takes up 100% CPU time on one core when the default value of Game.IsFixedTimeStep is used.
A common solution (one that worked for me as well) is to put the following in your Game constructor:
IsFixedTimeStep = false;
What does it mean?
The Game.IsFixedTimeStep property, when true, ensures that your frame (Update(), Draw(), ...) is called at a fixed time interval specified in Game.TargetElapsedTime. This defaults to 60 calls per second.
When Game.IsFixedTimeStep = false, the next frame is processed as soon as the previous one has finished.
How does this change affect my code?
All fixed time calculations (movements, timings, etc.) will need to be modified to accommodate variable time steps. Thankfully, this is very simple.
Suppose you had
Vector3 velocity;
Vector3 position;
for some object, and you are updating the position with
position += velocity;
By default, this means that the speed of your object is 60 * velocity.Length() units per second: you add velocity 60 times each second.
When you translate that sentence into code, you get this simple modification:
position += velocity * 60f * (float)gameTime.ElapsedGameTime.TotalSeconds;
To put it simply: you're scaling the values you add based on how much time has passed.
By making these modifications wherever you perform movement (or timing, etc.), your game will behave just as it did with a fixed time step.
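Putting it together, a variable-time-step Update might look like this (a sketch that assumes the position and velocity fields from above):

protected override void Update(GameTime gameTime)
{
    // Seconds since the last Update; this varies when IsFixedTimeStep is false.
    float dt = (float)gameTime.ElapsedGameTime.TotalSeconds;

    // velocity was tuned as "units per 1/60 s" under the fixed time step,
    // so scale by 60 * dt to keep the same speed at any frame rate.
    position += velocity * 60f * dt;

    base.Update(gameTime);
}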
High CPU usage (100% on one core) is a non-problem for games. In other words, it's expected. The way you're using the CPU when you write a game demands you do this.
Open your code environment and write a simple program:
#include <stdio.h>

int main(void)
{
    while (1) puts("aaah");
}
Open the CPU monitor and see that 100% of one core is used.
Now your game is doing this:
int main(void)
{
    while (1)
    {
        update();
        draw();
        while (getTimeElapsedThisFrame() < 1.0 / 60.0)
            ;   // busy wait until ~16 ms have passed
    }
}
So basically you call update() and draw(), and together they eat into the 16 ms you have to compute a frame (and once your game has a lot of stuff in it, they will take up most of that budget).
Now, because the OS sleep resolution is not exact (if you ask to sleep for 1 ms, it may actually be 20 ms before your app wakes up again), games generally never call the OS Sleep function. Sleep would cause your game to "oversleep", as it were, and the game would appear laggy and slow.
Good CPU usage isn't worth having an unresponsive game app. So the standard is to "busy wait" and whittle away the time while hogging the CPU.
I have a complex problem that I've been working on for weeks. My program is educational software which uses the webcam for analyzing physical experiments (e.g. oscillating movement). I've experienced the following:
If the processor is busy, the time measuring is inaccurate (ISampleGrabberCB.BufferCB(SampleTime)).
If I don't use the time and just count the samples (0, 1, 2...), it looks better. I perceive this when I look at the curve of the movement.
My primary goal is to reduce the inaccuracy, which I am trying to achieve by limiting the FPS (which itself causes a busy processor).
My webcam (the Intel Classmate PC's built-in webcam) has automatic FPS and exposure time; depending on the illumination they fluctuate.
IAMStreamConfig.AvgTimePerFrame has no effect.
IAMCameraControl isn't supported by the webcam.
IKsPropertySet: I don't know how to use this, since I don't have any support for the webcam. In this example they can use it for a Logitech webcam: http://social.msdn.microsoft.com/Forums/en/windowsdirectshowdevelopment/thread/47b1317d-87e6-4121-9189-0defe1e2dd44
From the MSDN article on Time and Clocks in DirectShow:
Any object that supports the IReferenceClock interface can serve as a reference clock. A filter with access to a hardware timer can provide a clock (an example is the audio renderer), or the filter graph manager can create one that uses the system time.
I've never attempted to use the IReferenceClock from a filter, but my suspicion is that it may not provide the high-resolution clock that you need.
This SO post on high resolution timers might be what you need.
IAMStreamConfig.AvgTimePerFrame is for informational purposes, and attempting to adjust it won't have any effect. It's just a value from which you can calculate average frame rate for your video stream.
e.g.
VIDEOINFOHEADER* pVih = (VIDEOINFOHEADER*)m_MediaTypes.VideoType.pbFormat;
if (pVih)
{
    // AvgTimePerFrame is in 100 ns units, so 10,000,000 / AvgTimePerFrame gives frames per second.
    int nFrameRate = (int)(10000000.0 / pVih->AvgTimePerFrame);
}
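If the stream times from the graph clock stay unusable, a workaround on the application side is to timestamp the samples yourself with a Stopwatch (which wraps QueryPerformanceCounter). A sketch, assuming the DirectShow.NET-style ISampleGrabberCB callback mentioned in the question; note that the timestamp reflects when the callback was delivered, not when the camera captured the frame:

using System;
using System.Diagnostics;

class GrabberCallback // assumed to also implement the rest of ISampleGrabberCB
{
    private readonly Stopwatch clock = Stopwatch.StartNew();

    // Called once per frame; SampleTime is the graph's stream time,
    // which the question found to be inaccurate under load.
    public int BufferCB(double SampleTime, IntPtr pBuffer, int BufferLen)
    {
        double seconds = clock.Elapsed.TotalSeconds;   // high-resolution timestamp

        // ... hand (seconds, pBuffer, BufferLen) to the analysis code ...
        return 0;
    }
}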
Is there a simple way to determine how many milliseconds I need to "Sleep" for in order to "emulate" a 2 MHz speed? In other words, I want to execute an instruction, then call the System.Threading.Thread.Sleep() function for X milliseconds in order to emulate 2 MHz. This doesn't need to be exact to the millisecond, but is there a ballpark I can get? Some formula that divides the PC clock speed by the 2 MHz, or something?
Thanks
A 2 MHz clock has a 500 ns period. Sleep's argument is in milliseconds, so even if you used Sleep(1), you would miss 2,000 cycles.
Worse, Sleep does not promise that it will return after X milliseconds, only that it will return after at least X milliseconds.
Your best bet would be to use some kind of Timer with an event that keeps the program from consuming or producing data too quickly.
For the user, a pause of less than 100 ms or so will generally be imperceptible. Based on that, instead of attempting to sleep after each instruction, you'd be much better off executing for something like 50 ms, then sleeping for an appropriate length of time, then executing for another 50 ms.
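A sketch of that batching approach in C# (the 2,000,000 Hz target and the ~50 ms slice come from the discussion above; executeOneInstruction is a hypothetical delegate that runs one emulated instruction and returns the cycles it took):

using System;
using System.Diagnostics;
using System.Threading;

class PacedEmulator
{
    const double ClockHz = 2000000.0;   // 2 MHz target
    const double SliceMs = 50.0;        // execute in ~50 ms batches, then sleep

    public void Run(Func<int> executeOneInstruction)
    {
        Stopwatch wall = Stopwatch.StartNew();
        long cyclesDone = 0;

        while (true)
        {
            // Run a batch of instructions worth roughly one slice of emulated time.
            long targetCycles = cyclesDone + (long)(ClockHz * SliceMs / 1000.0);
            while (cyclesDone < targetCycles)
                cyclesDone += executeOneInstruction();

            // Sleep off however far ahead of real time the emulation has gotten.
            double emulatedMs = cyclesDone / ClockHz * 1000.0;
            double aheadMs = emulatedMs - wall.Elapsed.TotalMilliseconds;
            if (aheadMs > 1.0)
                Thread.Sleep((int)aheadMs);
        }
    }
}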
Also note, however, that most processors with a 2 MHz clock (e.g. a Z80) did not actually execute 2 million instructions per second. A 2 MHz Z80 took a minimum of four processor clocks to fetch one instruction, giving a maximum instruction rate of 500 kHz.
Note that sleeping is not at all a good proxy for running code on a less capable CPU. There are many things that affect computational performance other than clock rate. In many cases, clock rate is a second- or third- (or tenth-) order determinant of computational performance.
Also note that QueryPerformanceCounter(), while high resolution, is expensive on most systems (3,000 to 5,000 CPU clocks in many cases). The reason is that it requires a system call and several reads from the HPET in the system's south bridge (note that this varies by system).
Could you help us better understand what you are trying to do?
As I mentioned in my comment on James Black's answer: do not poll a timer call (like QPC or the DirectX stuff). Your thread will simply consume massive amounts of CPU cycles and not let ANY thread at a lower priority run, and will eat up most of the time at its priority. Note that the NT scheduler does adjust thread priorities. This is called 'boosting'. If your thread is boosted and hits one of your polling loops, then it will almost assuredly cause perf problems. This is very bad behavior from a system perspective. Avoid it if at all possible.
Said another way: Windows is a multi-tasking OS and users run lots of things. Be aware that your app is running in a larger context and its behavior can have system-wide implications.
The problem you will have is that the minimum sleep on Windows seems to be about 20-50 ms, so even though you may ask to sleep for 1 ms, it will wake up later, because other processes will be running and the time slice is quite large.
If you must have a small delay such as 500 ns (1/2e06 seconds) then you will want to use DirectX, as it has a high-resolution timer, so you can just loop until the pause is done; but you will effectively need to take over the computer and not allow other processes to interrupt what is going on.
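For what it's worth, on .NET you do not strictly need DirectX for the timer itself: Stopwatch wraps QueryPerformanceCounter, so the same spin-until-done loop can be written like this (with all of the CPU-hogging and scheduling caveats from the earlier answers still applying):

using System.Diagnostics;

static class PreciseDelay
{
    // Spin-waits for the requested number of microseconds.
    // Accurate to a few microseconds, but burns a full core while it waits.
    public static void SpinFor(double microseconds)
    {
        long targetTicks = (long)(microseconds * Stopwatch.Frequency / 1000000.0);
        Stopwatch sw = Stopwatch.StartNew();
        while (sw.ElapsedTicks < targetTicks)
        {
            // busy wait
        }
    }
}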