I'm reading C# 5.0 in a Nutshell (O'Reilly), and in the first chapter there is a section about memory management. It explains that pointers are usually unnecessary in C#, because the language eliminates the problem of incorrect pointers found in languages like C++. Finally, it mentions that pointers still have a critical use in performance-critical hotspots.
So, what is a performance-critical hotspot, and what is its purpose?
Thanks in advance for your help.
A "performance critical hotspot" refers to a piece of code which is a performance bottleneck. This could be many things, but a good example of this is image processing.
Let's say I have a rather large bitmap and I need to perform some operation on each pixel. This is going to be a loop with many iterations and perhaps a lot going on. Saving a bit of CPU and/or IO time during each iteration of this loop (this "hotspot") will result in a large overall performance gain.
So, GetPixel and SetPixel are out the window. They're slow and, from experience, I know they will not perform well on large images. In this case I can use LockBits to pin the image to its current memory location and obtain a pointer to the raw image bits.
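A minimal sketch of what that might look like, assuming a 32bpp ARGB bitmap and a project compiled with /unsafe (the per-pixel operation here, inverting each channel, is just a stand-in):

using System.Drawing;
using System.Drawing.Imaging;

static unsafe void InvertPixels(Bitmap bmp)
{
    var rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
    // Pin the bitmap in memory and get a pointer to the raw pixel data.
    BitmapData data = bmp.LockBits(rect, ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);
    try
    {
        byte* scan0 = (byte*)data.Scan0;
        for (int y = 0; y < data.Height; y++)
        {
            byte* row = scan0 + y * data.Stride; // Stride may include padding bytes
            for (int x = 0; x < data.Width * 4; x += 4)
            {
                row[x]     = (byte)(255 - row[x]);     // blue
                row[x + 1] = (byte)(255 - row[x + 1]); // green
                row[x + 2] = (byte)(255 - row[x + 2]); // red
                // row[x + 3] is alpha, left as-is
            }
        }
    }
    finally
    {
        bmp.UnlockBits(data); // always unlock, even if the loop throws
    }
}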
This sort of traversal results in far faster code, and I have now optimized a "performance-critical hotspot".
I'm developing a benchmarking application using the Universal Windows Platform that evaluates the CPU and RAM performance of a Windows 10 system.
Although I found various algorithms to benchmark a CPU, I still haven't found any solid algorithm or solution to evaluate the read and write speeds of memory.
How can I achieve this in C#?
Thanks in advance :)
I don't see why this would not be possible from managed code. Array access code turns into normal x86 memory instructions. It's a thin abstraction. In particular I don't see why you would need a customized OS.
You should be able to test sequential memory speed by performing memcpy on big arrays. They must be bigger than the last level cache size.
You can test random access by randomly indexing into a big array. The index calculation must be cheap, unpredictable and there must be a dependency chain that serializes the memory instructions so that the CPU cannot parallelize them.
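A rough sketch of both tests, with sizes and iteration counts you would want to tune for your machine (JIT warm-up and repeated runs are left out for brevity; Buffer.BlockCopy stands in for memcpy):

using System;
using System.Diagnostics;

class MemoryBench
{
    static void Main()
    {
        const int size = 256 * 1024 * 1024; // 256 MB: far larger than any last-level cache
        byte[] src = new byte[size];
        byte[] dst = new byte[size];

        // Sequential bandwidth: one big copy, counting bytes read plus bytes written.
        var sw = Stopwatch.StartNew();
        Buffer.BlockCopy(src, 0, dst, 0, size);
        sw.Stop();
        Console.WriteLine($"Sequential: {2.0 * size / sw.Elapsed.TotalSeconds / 1e9:F2} GB/s");

        // Random access: a pointer chase. Every load depends on the previous one,
        // so the CPU cannot overlap the memory accesses.
        int count = size / sizeof(int);
        var next = new int[count];
        var rng = new Random(42);
        for (int i = 0; i < count; i++)
            next[i] = rng.Next(count); // note: a serious benchmark would build one big random
                                       // cycle so the chain cannot settle into a short,
                                       // cache-resident loop

        const int hops = 10_000_000;
        sw.Restart();
        int idx = 0;
        for (int i = 0; i < hops; i++)
            idx = next[idx];
        sw.Stop();
        // Printing idx stops the JIT from eliminating the loop as dead code.
        Console.WriteLine($"Random: {sw.Elapsed.TotalMilliseconds * 1_000_000.0 / hops:F1} ns/access (idx={idx})");
    }
}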
Honestly, I don't think it's possible. RAM benchmarks usually run on dedicated OSes.
RAM testing is different from RAM benchmarking.
C# doesn't give you that kind of control over RAM
Of course, just new up a big array and access it. Also, understand the overheads that are present: for plain array access, the only overhead is a range check.
The GC has no impact during the benchmark itself; it would only be triggered by an allocation.
I am going to be doing a project soon for my degree that requires brute-force text crunching and analysis. This will obviously mean a lot of reading from and writing to RAM.
What is the most efficient method of memory management in C#? Last semester I was introduced to the Marshal class and found it to be a very efficient way of reading and writing large amounts of data to RAM, but maybe that was just my experience. I'm hoping that someone can give me some advice or suggestions on alternatives or best practices for memory management in C#.
Thanks
The most efficient memory management system varies wildly with what you try to do in practice.
As a rule of thumb, try to stay clear of unmanaged code in C#: managed memory is more than enough for the immense majority of problems, and unless you know exactly what to do you're very unlikely to be more efficient than managed memory.
So my advice would be the following. Try a fully managed implementation, with a few good practices to prevent using too much memory:
always dispose your disposable objects
try pooling heavy assets, byte buffers for instance: instead of creating a new buffer every time you need one, use a buffer pool (see the sketch below)
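For instance, a minimal sketch using ArrayPool<T> from System.Buffers (available in .NET Core and as the System.Buffers NuGet package):

using System.Buffers;

// Rent a buffer from the shared pool instead of allocating a fresh one.
byte[] buffer = ArrayPool<byte>.Shared.Rent(64 * 1024); // may return a larger array than requested
try
{
    // ... fill and process the buffer here ...
}
finally
{
    ArrayPool<byte>.Shared.Return(buffer); // hand it back so the next caller can reuse it
}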
If you gain empirical evidence that you need to do manual marshalling, then learn about it and use it. But not before.
Remember that a lot of people have worked on C# memory management, and most C# developers don't need more (to the point that a lot of them don't even know how memory management works behind the scenes, because they just don't need to). Managed memory in C# is pretty good; give it a shot first.
Alright, so I wanted to ask if it's actually possible to make a parser/translator from C# to C++, so that code written in C# would be able to run as fast as code written in C++.
Is it actually possible to do? I'm not asking how hard is it going to be.
What makes you think that translating your C# code to C++ would magically make it faster?
Languages don't have a speed. Assuming that C# code is slower (I'll get back to that), it is because of what that code does (including the implicit requirements placed by C#, such as bounds checking on arrays), and not because of the language it is written in.
If you converted your C# code to C++, it would still need to do bounds checking on arrays, because the original source code expected this to happen, so it would have to do just as much work.
Moreover, C# often isn't slower than C++. There are plenty of benchmarks floating around on the internet, generally showing that for the most part, C# is as fast as (or faster than) C++. Only when you spend a lot of time optimizing your code does C++ become faster.
If you want faster code, you need to write code that requires less work to execute, not try to change the source language. That's just cargo-cult programming at its worst. You once saw some efficient code, and that was written in C++, so now you try to make things C++, in the hope of attracting any efficiency that might be passing by.
It just doesn't work that way.
Although you could translate C# code to C++, there is the issue that C# depends on the .NET Framework libraries, which are not native, so a simple source-to-source translation would not be enough.
Update
Also, C# code depends on the runtime for things such as memory management, i.e. garbage collection. If you translated the C# code to C++, where would the memory-management code be? Parsing and translating is not going to fix issues like that.
The Mono project has invested quite a lot of energy in turning LLVM into a native machine-code compiler for the C# runtime, although there are some problems with specific language constructs like shared generics. Check it out and take it for a spin.
You can use NGen to compile IL to native code
Performance-related tweaks:
Platform-independent
use a profiler to spot the bottlenecks;
prevent unnecessary garbage (spot it using the generation #0 collection count and the Large Object Heap)
prevent unnecessary copying (use struct wisely)
prevent unwarranted generics (code-sharing has unexpected performance side effects)
prefer old-fashioned loops over enumerator blocks when performance is an issue
when using LINQ, watch closely where you maintain/break deferred evaluation; both can be enormous boosts to performance
use Reflection.Emit/expression trees to precompile dynamic logic that is a performance bottleneck (see the sketch after this list)
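To illustrate the last point, a hedged sketch of precompiling a property getter with an expression tree, so the reflection cost is paid once rather than on every call (the Customer type and Id property in the usage note are made up):

using System;
using System.Linq.Expressions;

static Func<T, object> CompileGetter<T>(string propertyName)
{
    // Build the expression obj => (object)obj.Property once...
    ParameterExpression obj = Expression.Parameter(typeof(T), "obj");
    MemberExpression prop = Expression.Property(obj, propertyName);
    UnaryExpression boxed = Expression.Convert(prop, typeof(object));
    // ...and compile it to a reusable delegate.
    return Expression.Lambda<Func<T, object>>(boxed, obj).Compile();
}

// Usage: var getId = CompileGetter<Customer>("Id");
// getId(customer) is then roughly as fast as a direct property access,
// whereas PropertyInfo.GetValue pays the reflection cost on every call.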
Mono
use Mono --gc=sgen --optimize=inline,... (the SGEN garbage collector can make orders of magnitude difference). See also man mono for a lot of tuning/optimization options
use MONO_GENERIC_SHARING=none to disable sharing of generics (making particular tasks a lot quicker especially when supporting both valuetypes and reftypes) (not recommended for regular production use)
use the -optimize+ compile flag (this optimizes the emitted IL independently of what the JITter may do with it)
Less mainstream:
use the LLVM backend:
This allows Mono to benefit from all of the compiler optimizations done in LLVM. For example the SciMark score goes from 482 to 610.
use mkbundle to create a statically linked native binary image (fully compiled ahead of time (AOT), so nothing is left for the JIT)
MS .NET
Most of the above have direct Microsoft counterparts (NGen, /optimize etc.).
Of course MS doesn't have a switchable/tunable garbage collector, and I don't think a fully compiled native binary can be achieved the way it can with Mono.
As always the answer to making code run faster is:
Find the bottleneck and optimize that
Most of the time the bottleneck is either:
time spent in a critical loop
Review your algorithm and data structures; do not change the language. Changing languages might give a 10% speedup, while a better algorithm can give you a 1000x speedup.
If you're stuck on the best algorithm, you can always ask a specific, short and detailed question on SO.
time spent waiting for data from a slow source
Reduce the amount of stuff you're requesting from the source
instead of:
SELECT * FROM bigtable
do
SELECT TOP 10 * FROM bigtable ORDER BY xxx
The latter will return instantly and you cannot show a million records in a meaningful way anyhow.
Or you can have the server at the other end reduce the data so that it doesn't take 100 years to cross the network.
Alternatively, you can execute the slow data-fetch routine in a separate thread, so the rest of your program can do meaningful work instead of waiting (see the sketch after this list).
time spent because you are overflowing memory with gigabytes of data
Use a different algorithm that works on a smaller dataset at a time.
Try to optimize cache usage.
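As promised above, a minimal sketch of pushing a slow fetch onto a background thread with Task.Run (FetchRecords, DoOtherMeaningfulWork and ShowRecords are hypothetical placeholders):

using System.Collections.Generic;
using System.Threading.Tasks;

static async Task LoadAsync()
{
    // Kick the slow fetch off to the thread pool...
    Task<List<string>> fetchTask = Task.Run(() => FetchRecords());

    // ...and keep doing useful work while it runs.
    DoOtherMeaningfulWork();

    // Wait only at the point where the result is actually needed.
    List<string> records = await fetchTask;
    ShowRecords(records);
}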
The answer to efficient coding is to measure where the execution time goes
Use a profiler.
see: http://csharp-source.net/open-source/profilers
And optimize those parts that eat more than 50% of your CPU time.
Do this for a number of iterations, and soon your 10-hour running time will be down to a manageable 3 minutes, instead of the 9.5 hours you would get from switching to this or that "better" language.
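If a full profiler feels like overkill for a first pass, a Stopwatch around the suspect region is a cheap way to start measuring (SuspectRoutine is a hypothetical placeholder):

using System;
using System.Diagnostics;

var sw = Stopwatch.StartNew();
SuspectRoutine();   // the code you think is the bottleneck
sw.Stop();
Console.WriteLine($"SuspectRoutine took {sw.ElapsedMilliseconds} ms");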
I have written a web service to resize user uploaded images and all works correctly from a functional point of view, but it causes CPU usage to spike every time it is used. It is running on Windows Server 2008 64 bit. I have tried compiling to 32 and 64 bit and get about the same results.
The heart of the service is this function:
private Image CreateReducedImage(Image imgOrig, Size newSize)
{
    // Create the destination bitmap at the target size.
    var newBM = new Bitmap(newSize.Width, newSize.Height);
    using (var newGraphics = Graphics.FromImage(newBM))
    {
        // Favour speed for compositing and smoothing, but use the
        // highest-quality (and most CPU-intensive) resampling filter.
        newGraphics.CompositingQuality = CompositingQuality.HighSpeed;
        newGraphics.SmoothingMode = SmoothingMode.HighSpeed;
        newGraphics.InterpolationMode = InterpolationMode.HighQualityBicubic;

        // Draw the original, scaled to fill the new bitmap.
        newGraphics.DrawImage(imgOrig, new Rectangle(0, 0, newSize.Width, newSize.Height));
    }
    return newBM;
}
I put a profiler on the service and it seemed to indicate the vast majority of the time is spent in the GDI+ library itself and there is not much to be gained in my code.
Questions:
Am I doing something glaringly inefficient in my code here? It seems to conform to the examples I have seen.
Are there gains to be had in using libraries other than GDI+? The benchmarks I have seen seem to indicate that GDI+ compares well with other libraries, but I didn't find enough of them to be confident.
Are there gains to be had by using "unsafe code" blocks?
Please let me know if I have not included enough of the code...I am happy to put as much up as requested but don't want to be obnoxious in the post.
Image processing is usually an expensive operation. You have to remember that a 32-bit color image is expanded in memory to 4 bytes * pixel width * pixel height before your app even starts any kind of processing. A spike is definitely to be expected, especially when doing any kind of pixel processing.
That being said, the only place I can see to speed up the process or lower the impact on your processor is to try a lower-quality interpolation mode.
You could try
newGraphics.InterpolationMode = InterpolationMode.Low;
as HighQualityBicubic will be the most processor-intensive of the resampling operations, but of course you will then lose image quality.
Apart from that, I can't really see anything that can be done to speed up your code. GDI+ will almost certainly be the fastest on a Windows machine (no code written in C# is going to surpass a pure C library), and using other image libraries carries the potential risk of unsafe and/or buggy code.
The bottom line is, resizing an image is an expensive operation no matter what you do. The simplest solution in your case might simply be to replace your server's CPU with a faster model.
I know that the DirectX being released with Windows 7 is said to provide 2D hardware acceleration. Whether this implies it will beat out GDI+ on this kind of operation, I don't know. MS has a pretty unflattering description of GDI here which implies it is slower than it should be, among other things.
If you really want to try to do this kind of stuff yourself, there is a great GDI Tutorial that shows it. The author makes use of both SetPixel and "unsafe blocks," in different parts of his tutorials.
As an aside, multi-threading will probably help you here, assuming your server has more than one CPU. That is, you can process more than one image at once and probably get faster results.
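For instance, a hedged sketch with Parallel.ForEach (uploadedFiles and ResizeOne are hypothetical; note that GDI+ objects are not thread-safe, so each call must create and dispose its own Bitmap and Graphics):

using System.Threading.Tasks;

// Resize several images concurrently, roughly one per worker thread.
Parallel.ForEach(uploadedFiles, file => ResizeOne(file));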
When you write
"I have written a web service to resize user uploaded images"
it sounds to me like the user uploads an image to a (web?) server, and the server then calls a web service to do the scaling?
If that is the case, I would simply move the scaling directly to the server. IMHO, scaling an image doesn't justify its own web service. And you get quite a bit of unnecessary traffic going from the server to the web service and back, in particular because the image is probably Base64-encoded, which makes the data traffic even bigger.
But I'm just guessing here.
P.S. Unsafe blocks by themselves don't give any gain; they just allow unsafe code to be compiled. So unless you write your own scaling routine, an unsafe block isn't going to help.
You may want to try ImageMagick. It's free, and there are .NET wrappers for it as well.
Or you can send a command to a DOS Shell.
We have used ImageMagick on Windows Servers now and then, for batch processing and sometimes for a more flexible image conversion.
Of course, there are commercial components as well, like those by Leadtools and Atalasoft. We have never tried those.
I suspect the spike is because you have the interpolation mode cranked right up. All interpolation modes work per pixel, and HighQualityBicubic is about as high as you can go with GDI+, so I suspect the per-pixel calculations are chewing up your CPU.
As a test, try dropping the interpolation mode down to InterpolationMode.NearestNeighbor and see if the CPU spike drops; if so, then that's your culprit.
If it is, do some trial and error to balance cost against quality; chances are you don't need HighQualityBicubic to get decent results.
When it is said that some code needs optimization, or can somehow be optimized, what does that mean? What kind of code needs optimization? How do you apply optimization to code in C#? What are the benefits of doing so?
Optimization is a very broad term. In general it implies modifying a system to make some aspect of it work more efficiently, use fewer resources, or be more robust. For example, a computer program may be optimized so that it executes faster, uses less memory or disk storage, or is more responsive in terms of UI.
Although "optimization" has the same root as "optimal", the process of optimization does not produce a totally optimal system: there's always a trade-off, so only attributes of greatest interest are optimized.
And remember:
The First Rule of Program Optimization: Don't do it. The Second Rule of Program Optimization (for experts only!): Don't do it yet. (Michael A. Jackson)
Optimization is the process of modifying a system to make some aspect of it work more efficiently or use fewer resources.
In your case it refers mainly to two levels:
Design level
At the highest level, the design may be optimized to make best use of the available resources. The implementation of this design will benefit from a good choice of efficient algorithms and the implementation of these algorithms will benefit from writing good quality code. The architectural design of a system overwhelmingly affects its performance. The choice of algorithm affects efficiency more than any other item of the design. In some cases, however, optimization relies on using fancier algorithms, making use of special cases and special tricks and performing complex trade-offs; thus, a fully optimized program can sometimes, if insufficiently commented, be more difficult for less experienced programmers to comprehend and hence may contain more faults than unoptimized versions.
Source code level
Avoiding bad-quality coding can also improve performance by avoiding obvious slowdowns. After that, however, some optimizations are possible which actually decrease maintainability; some, but not all, of them can nowadays be performed by optimizing compilers. For instance, more indirection is often needed to simplify or improve software, but that indirection has a cost.
Code optimization is making code run faster. There are two primary ways of doing this:
1) Squeezing more work into fewer cycles. Figure out where the code is doing an extra copy or whether there is a branch in a tight loop. This is optimizing in the small.
2) Making your algorithms scale better. You may have heard of "Big O" notation. This is making an algorithm degrade much less quickly with large sets of data.
For instance, if you naively search a phone book for a name you will start on page 1 and read all the names until you find the one you are looking for. This will take a number of instructions scaled by the number of names in the phone book. We call this O(n). Now think about how you really search the phone book. You open to some place toward the middle and see which side the name you are looking for is on. This is called a binary search and scales at the logarithm of the number of names. We call this O(logn). It's much faster.
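A sketch of both searches over a sorted array of names, to make the O(n) versus O(log n) difference concrete (the BCL already offers Array.BinarySearch, so you rarely need to hand-roll this):

// Linear scan: O(n) comparisons in the worst case.
static int LinearSearch(string[] names, string target)
{
    for (int i = 0; i < names.Length; i++)
        if (names[i] == target)
            return i;
    return -1;
}

// Binary search: O(log n) comparisons, but the array must be sorted.
static int BinarySearch(string[] names, string target)
{
    int lo = 0, hi = names.Length - 1;
    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2; // written this way to avoid overflow on huge arrays
        int cmp = string.CompareOrdinal(names[mid], target);
        if (cmp == 0) return mid;
        if (cmp < 0) lo = mid + 1;
        else hi = mid - 1;
    }
    return -1;
}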
Remember the first rule of optimization: Measure first. Many man years have been spent optimizing code that wasn't run very much.
When doing code optimization, you take a metric on your code and try to make it more efficient. The metric usually refers to a scarce resource.
Here are common metrics
Execution speed (usually the first that comes to mind when saying optimization)
Memory consumption
Executable size (on embedded systems it can be important)
Database access
Remote service access (Make it less chatty, caching..)
Simplicity, readability, maintainability of the code
After optimization the code should give the same result.
The problem is that you have to make choices: execution speed often comes with more memory consumption...
You should also always consider optimization globally: gaining 10 ms in a loop is totally useless if you then spend 1000 ms waiting for a web service.
To add to Anton Gogolev's answer, when a piece of code needs optimisation, it is because a particular performance requirement is not met. We develop programs to meet users' requirements, right? Most programmers tend to think largely in terms of functional requirements, i.e. what the program does, but users also have performance requirements: what is the resource cost (network bandwidth, CPU cycles, memory, disk space, etc.) of providing the functionality? Optimization is the process of changing a piece of code to meet a specific performance requirement. IMHO this should happen at design time, but you will sometimes write a piece of code only to discover it underperforms. To optimize the code, you first have to find out which resource you are overusing. If it is CPU cycles or memory, a profiler might help. If it is network bandwidth, which is a very common one these days, you will need to do some load testing and comms profiling.
My advice would be to always understand your current and probable future performance requirements before writing code, and to optimize at the design stage. Late optimization is expensive, difficult, and often either fails or results in ugly code.
Optimization has two main purposes:
making your software use fewer resources, e.g. run faster, be smaller, use less RAM, take less hard disk space both when running and when storing documents, need less network access, ...
making your software more maintainable, by refactoring it.
You don't need to optimize as long as no related issue has been raised: It is far more difficult to debug optimized code than to optimize correct code.
It might be, for example, that the code has a duplicated block that could/should be put into a method, you might be using deprecated methods/classes, there might be simpler ways to do what the code is doing, or there might be some cleaning up to do (e.g. removing hard-coded values), etc.