I have been writing applications lately in C# that use a ton of memory or overflow the stack because they process extremely large amounts of data in fun ways. Is there a language better suited for this type of thing? Would I benefit from learning a different language (other than C++) to do this?
C# isn't the problem. You may need to reconsider the "fun ways" you're handling memory and data. Provide specific scenarios and questions here to get specific answers and alternatives to potentially-problematic methods and strategies you may be using in your application(s).
If running on a 32-bit system, .NET will typically start giving you out-of-memory exceptions when you consume ~800 MB. This is because it needs to allocate contiguous blocks of memory. If you have an array or list which needs to be expanded, it will copy the old content to a new, larger one, so two instances are allocated at the same time.
If you can run 64-bit, then you will hit your exceptions at anything from ~2 GB and above, depending on how your application works and what else is running.
For data larger than your physical memory, I would recommend either memory-mapped files or doing some disk/memory swapping.
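As a rough illustration of the memory-mapped file suggestion, here is a minimal sketch that walks a file one window at a time, so only a small slice needs to be addressable at once. The names MappedFileSketch and SumBytes and the default window size are made up for the example; the calls themselves come from System.IO.MemoryMappedFiles.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

public static class MappedFileSketch
{
    // Sums every byte of a file by mapping one window at a time,
    // so at most windowSize bytes of the file are mapped at once.
    // (Hypothetical helper; the window size is an arbitrary assumption.)
    public static long SumBytes(string path, int windowSize = 1 << 20)
    {
        long length = new FileInfo(path).Length;
        if (length == 0) return 0; // an empty file cannot be mapped

        long total = 0;
        byte[] buffer = new byte[windowSize];
        using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
        {
            for (long offset = 0; offset < length; offset += windowSize)
            {
                int count = (int)Math.Min(windowSize, length - offset);
                using (var view = mmf.CreateViewAccessor(offset, count, MemoryMappedFileAccess.Read))
                {
                    // Copy the window into a managed buffer and process it.
                    view.ReadArray(0, buffer, 0, count);
                    for (int i = 0; i < count; i++) total += buffer[i];
                }
            }
        }
        return total;
    }
}
```

Calling MappedFileSketch.SumBytes(path) works the same way for a file of a few kilobytes or many gigabytes; whether it beats plain FileStream reads for your access pattern is something to measure.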
If you are working with large data sets and doing functional manipulation, you might consider looking into a functional language like F# or Haskell.
They will not suffer as readily from deep-recursion issues.
However, these languages won't substitute for a good design and attention to how you are doing your operations. It's possible that C# is perfectly well suited to your problem and you just need to refactor how you are handling the problem space.
IDL (Interactive Data Language) is specially suited for large, matrix-like sets of data. You must, however, pay attention to using matrix or vector operations and not sequential loops.
If licensing is a problem you can try the free clone GDL, although it may not be as fast as IDL.
How large is your data?
Related
I am going to be doing a project soon for my degree that requires brute force text crunching and analysis. This will obviously mean a lot of reading and writing to RAM.
What is the most efficient method of memory management in C#? Last semester I was introduced to the memory marshal class and found this to be a very efficient method of reading and writing large amounts of data to RAM, however maybe that was just my experience. I'm hoping that someone can give me some advice or suggestions on alternatives or best practices for memory management in C#.
Thanks
The most efficient memory management system varies wildly with what you try to do in practice.
As a rule of thumb, try to stay clear of unmanaged code in C#: managed memory is more than enough for the immense majority of problems, and unless you know exactly what to do you're very unlikely to be more efficient than managed memory.
So my advice would be the following. Try a fully managed implementation, with a few good practices to prevent using too much memory:
always dispose your disposable objects
try pooling heavy assets such as byte buffers: instead of creating a new buffer every time you need one, use a buffer pool
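To make the buffer pool idea concrete, here is a small sketch using ArrayPool<byte> from System.Buffers, assuming a runtime recent enough to ship it. BufferPoolSketch, CountBytes and the 81920-byte request are invented for the example.

```csharp
using System;
using System.Buffers;
using System.IO;

public static class BufferPoolSketch
{
    // Counts the bytes in a stream using a pooled buffer instead of
    // allocating a fresh byte[] on every call. (Hypothetical helper.)
    public static long CountBytes(Stream input)
    {
        // Rent may hand back an array larger than requested.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(81920);
        try
        {
            long total = 0;
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                total += read;
            return total;
        }
        finally
        {
            // Hand the buffer back so the next caller can reuse it.
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```

The try/finally matters: a rented buffer that is never returned is just a regular allocation with extra steps.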
If you gain empirical evidence that you need to do manual marshalling, then learn about it and use it. But not before.
Remember that a lot of people have worked on C# memory management, and that most C# developers don't need more (to the point that a lot of them don't even know how memory management works behind the scenes, because they just don't need to). Managed memory in C# is pretty good; give it a shot first.
I know a similar question has already been asked here What features should a C#/.NET profiler have? but this thread is not only about the wishlist but also about how to go about implementing that wishlist.
So let me ask you this just once more. I am in the process of building a generic performance profiler. I know I can take in a DLL as input and take the usual Stopwatch approach to profile the response times of methods in that DLL. But this is very basic stuff. I am willing to use a third-party API (or write some code of my own too) to extract whatever useful information I can lay my hands on from that DLL. I want to know everything that makes it slow. I want to know about its memory leaks. Anything at all that would help me find bottlenecks of the application. I'd want a similar approach to find expensive DB operations. But all this, under one application.
So what approach do you suggest? Which tools can I bring under my umbrella so that I can use them in my project?
I want to make a 'single' application that will take generic inputs like DLLs, can also take a source code tree (solution, projects, .cs files) as input, and emit results in the form of response times, identified bottlenecks, memory leaks, etc.
Anything at all that would help me find bottlenecks of the application.
Be careful of the universal profiling assumption that measurements unerringly lead to where the bottlenecks are, because some of them can be found that way, but only some.
Then the remaining bottlenecks sit there consuming time needlessly, but the developer is oblivious to them because profiler measurements did not isolate them.
A simple example could be some sort of dictionary lookup that appears to be optimal, except that the words being looked up are highly non-random. If certain words are looked up much more frequently, that represents an opportunity for optimization, but to detect that you need to know something about the data. Measuring profilers don't look at the program's data.
A more extreme example is any sort of interpreter, whose data is the "instruction set" for another language. The bottlenecks could easily be in that other language, but since it is data the measuring profiler would not see them.
What does reveal problems of this sort is not measurement but a small number of samples of the program's state, where the developer can fully examine and characterize the content of each sample (call stack and data). This leads to a much better understanding of how and why the program is spending its time than putting measurements on methods or browsing a call graph.
The following article discusses an alternative heap structure that takes into consideration that most servers are virtualized and therefore most memory is paged to disk.
http://queue.acm.org/detail.cfm?id=1814327
Can (or should) a .NET developer implement a B-Heap data structure so that parent-child relationships are maintained within the same Virtual Memory Page? How or where would this be implemented?
Clarification
In other words, is this type of data structure needed within .NET as a primitive type? True, it should be implemented either natively in the CLR or via P/Invoke.
When a server administrator deploys my .NET app within a virtual machine, does this binary heap optimization make sense? If so, when does it make sense? (number of objects, etc)
To at least a certain extent, BCL collections do seem to take paging concerns into account. They also take CPU cache concerns into account (which overlaps in some regard, as locality of memory can affect both, though in different ways).
Consider that Queue<T> uses arrays for internal storage. In purely random-access terms (that is to say, where there is never any cost for paging or CPU cache flushing) this is a poor choice; the queue will almost always be solely added to at one point and removed from at another, and hence an internal implementation as a singly linked list would win in almost every way (for that matter, in terms of iterating through the queue, which it also supports, a linked list shouldn't do much worse than an array in a pure-random-access situation). Where an array-based implementation fares better than a singly linked list is precisely when paging and CPU cache are considered. That MS went for a solution that is worse in the pure-random-access situation but better in the real-world case where paging matters suggests that they are paying attention to the effects of paging.
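To show what "arrays for internal storage" means in practice, here is a simplified ring-buffer queue. It is not the BCL source, just a sketch of the general technique under my own naming (RingQueue<T>, starting capacity 4); the point is that all elements sit in one contiguous block rather than in separately allocated nodes.

```csharp
using System;

// Simplified array-backed queue (illustration only, not the BCL implementation).
public sealed class RingQueue<T>
{
    private T[] _items = new T[4]; // arbitrary starting capacity
    private int _head;             // index of the oldest element
    private int _count;

    public int Count => _count;

    public void Enqueue(T item)
    {
        if (_count == _items.Length) Grow();
        // The tail wraps around to the start of the array when it reaches the end.
        _items[(_head + _count) % _items.Length] = item;
        _count++;
    }

    public T Dequeue()
    {
        if (_count == 0) throw new InvalidOperationException("Queue is empty.");
        T item = _items[_head];
        _items[_head] = default(T); // drop the reference so the GC can collect it
        _head = (_head + 1) % _items.Length;
        _count--;
        return item;
    }

    // Doubles the backing array and copies the elements in queue order.
    private void Grow()
    {
        var bigger = new T[_items.Length * 2];
        for (int i = 0; i < _count; i++)
            bigger[i] = _items[(_head + i) % _items.Length];
        _items = bigger;
        _head = 0;
    }
}
```

A linked-list version would avoid the copy in Grow, but every node would be its own heap object, possibly on a different page or cache line from its neighbours.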
Of course, from the outside that isn't obvious - and shouldn't be. From the outside we want something that works like a queue; making the inside efficient is a different concern.
These concerns are also met in other ways. The way the GC works, for example, minimises the amount of paging necessary, as its moving of objects not only makes for less fragmentation but also makes for fewer page faults. Other collections are also implemented in ways to make paging less frequent than the most immediate solution would suggest.
That's just a few things that stand out to me from things I have looked at. I'd bet good money such concerns are also considered at many other places in the .NET team's work. Likewise with other frameworks. Consider that the one big performance concern Cliff Click mentions repeatedly in terms of his Java lock-free hashtable (I really must finish checking my C# implementation), apart from those of lock-free concurrency (the whole point of the exercise), is cache lines; and it's also the one other performance concern he doesn't dismiss!
Consider also, that most uses of most collections are going to fit in one page anyway!
If you are implementing your own collections, or putting a standard collection into particularly heavy use, then these are things you need to think about (sometimes "nah, not an issue" is enough thinking, sometimes it isn't) but that doesn't mean they aren't already thought about in terms of what we get from the BCL.
If you have an especially special-case scenario and algorithm, then you might benefit from that kind of optimization.
But generally speaking, when reimplementing core parts of the CLR framework (on top of the CLR I might add, i.e. in managed code) your chances of doing it more efficiently than the CLR team did are incredibly slim. So I wouldn't recommend it unless you have already profiled the heck out of your current implementation and have positively identified issues related to locality of data in memory. And even then, you will get more bang for your buck by tweaking your algorithm to work better with the CLR memory management scheme than by trying to bypass or work around it.
It seems like optimization is a lost art these days. Wasn't there a time when all programmers squeezed every ounce of efficiency from their code? Often doing so while walking five miles in the snow?
In the spirit of bringing back a lost art, what are some tips that you know of for simple (or perhaps complex) changes to optimize C#/.NET code? Since it's such a broad thing that depends on what one is trying to accomplish it'd help to provide context with your tip. For instance:
When concatenating many strings together use StringBuilder instead. See link at the bottom for caveats on this.
Use string.Compare to compare two strings instead of doing something like string1.ToLower() == string2.ToLower()
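The two tips above could look roughly like this. StringTipsSketch, JoinLines and SameIgnoringCase are names invented for the illustration, and the ordinal, case-insensitive comparison is my assumption about what the ToLower() version was trying to achieve.

```csharp
using System;
using System.Text;

public static class StringTipsSketch
{
    // Builds one string from many pieces without creating a new
    // intermediate string for every concatenation.
    public static string JoinLines(string[] lines)
    {
        var sb = new StringBuilder();
        foreach (string line in lines)
            sb.Append(line).Append('\n');
        return sb.ToString();
    }

    // Compares without allocating two lower-cased copies first.
    public static bool SameIgnoringCase(string a, string b)
    {
        return string.Compare(a, b, StringComparison.OrdinalIgnoreCase) == 0;
    }
}
```

For a handful of concatenations the plain + operator is fine; the StringBuilder version starts to pay off when the number of pieces grows with the input.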
The general consensus so far seems to be measuring is key. This kind of misses the point: measuring doesn't tell you what's wrong, or what to do about it if you run into a bottleneck. I ran into the string concatenation bottleneck once and had no idea what to do about it, so these tips are useful.
My point for even posting this is to have a place for common bottlenecks and how they can be avoided before even running into them. It's not even necessarily about plug and play code that anyone should blindly follow, but more about gaining an understanding that performance should be thought about, at least somewhat, and that there's some common pitfalls to look out for.
I can see though that it might be useful to also know why a tip is useful and where it should be applied. For the StringBuilder tip, the help I found long ago was on Jon Skeet's site.
It seems like optimization is a lost art these days.
There was once a day when manufacture of, say, microscopes was practiced as an art. The optical principles were poorly understood. There was no standardization of parts. The tubes and gears and lenses had to be made by hand, by highly skilled workers.
These days microscopes are produced as an engineering discipline. The underlying principles of physics are extremely well understood, off-the-shelf parts are widely available, and microscope-building engineers can make informed choices as to how to best optimize their instrument to the tasks it is designed to perform.
That performance analysis is a "lost art" is a very, very good thing. That art was practiced as an art. Optimization should be approached for what it is: an engineering problem solvable through careful application of solid engineering principles.
I have been asked dozens of times over the years for my list of "tips and tricks" that people can use to optimize their VBScript / their JScript / their Active Server Pages / their VB / their C# code. I always resist this. Emphasizing "tips and tricks" is exactly the wrong way to approach performance. That way leads to code which is hard to understand, hard to reason about, hard to maintain, and typically not noticeably faster than the corresponding straightforward code.
The right way to approach performance is to approach it as an engineering problem like any other problem:
Set meaningful, measurable, customer-focused goals.
Build test suites to test your performance against these goals under realistic but controlled and repeatable conditions.
If those suites show that you are not meeting your goals, use tools such as profilers to figure out why.
Optimize the heck out of what the profiler identifies as the worst-performing subsystem. Keep profiling on every change so that you clearly understand the performance impact of each.
Repeat until one of three things happens (1) you meet your goals and ship the software, (2) you revise your goals downwards to something you can achieve, or (3) your project is cancelled because you could not meet your goals.
This is the same as you'd solve any other engineering problem, like adding a feature -- set customer focused goals for the feature, track progress on making a solid implementation, fix problems as you find them through careful debugging analysis, keep iterating until you ship or fail. Performance is a feature.
Performance analysis on complex modern systems requires discipline and focus on solid engineering principles, not on a bag full of tricks that are narrowly applicable to trivial or unrealistic situations. I have never once solved a real-world performance problem through application of tips and tricks.
Get a good profiler.
Don't bother even trying to optimize C# (really, any code) without a good profiler. It actually helps dramatically to have both a sampling and a tracing profiler on hand.
Without a good profiler, you're likely to create false optimizations, and, most importantly, optimize routines that aren't a performance problem in the first place.
The first three steps to profiling should always be 1) Measure, 2) measure, and then 3) measure....
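A real profiler is the better tool, but the crudest possible form of "measure" is a Stopwatch around the code in question. Here is a minimal sketch; TimingSketch and BestOf are hypothetical names, and taking the fastest of several runs after a warm-up call is just one reasonable convention.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Runs the action several times and returns the fastest wall-clock time.
    public static TimeSpan BestOf(int runs, Action action)
    {
        if (runs < 1) throw new ArgumentOutOfRangeException(nameof(runs));

        action(); // warm-up call so JIT compilation is not counted

        TimeSpan best = TimeSpan.MaxValue;
        for (int i = 0; i < runs; i++)
        {
            var sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            if (sw.Elapsed < best) best = sw.Elapsed;
        }
        return best;
    }
}
```

Something like TimingSketch.BestOf(5, () => DoWork()) (DoWork standing in for your own method) gives a number to compare before and after a change, but it says nothing about where inside the method the time goes, which is what the profiler is for.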
Optimization guidelines:
Don't do it unless you need to
Don't do it if it's cheaper to throw new hardware at the problem instead of a developer
Don't do it unless you can measure the changes in a production-equivalent environment
Don't do it unless you know how to use a CPU and a Memory profiler
Don't do it if it's going to make your code unreadable or unmaintainable
As processors continue to get faster the main bottleneck in most applications isn't CPU, it's bandwidth: bandwidth to off-chip memory, bandwidth to disk and bandwidth to net.
Start at the far end: use YSlow to see why your web site is slow for end-users, then move back and fix your database accesses to be not too wide (columns) and not too deep (rows).
In the very rare cases where it's worth doing anything to optimize CPU usage be careful that you aren't negatively impacting memory usage: I've seen 'optimizations' where developers have tried to use memory to cache results to save CPU cycles. The net effect was to reduce the available memory to cache pages and database results which made the application run far slower! (See rule about measuring.)
I've also seen cases where a 'dumb' un-optimized algorithm has beaten a 'clever' optimized algorithm. Never underestimate how good compiler-writers and chip-designers have become at turning 'inefficient' looping code into super efficient code that can run entirely in on-chip memory with pipelining. Your 'clever' tree-based algorithm with an unwrapped inner loop counting backwards that you thought was 'efficient' can be beaten simply because it failed to stay in on-chip memory during execution. (See rule about measuring.)
When working with ORMs be aware of N+1 Selects.
List<Order> _orders = _repository.GetOrders(DateTime.Now);
foreach (var order in _orders)
{
    Print(order.Customer.Name);
}
If the customers are not eagerly loaded, this results in one query for the orders plus one additional round trip to the database for each order's customer.
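To make the round-trip count visible without a real ORM, here is a sketch with a fake in-memory "database" that simply counts calls. FakeDb, NPlusOneSketch and all their members are hypothetical; with a real ORM you would use its own eager-loading or batching feature instead.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a database; every public call counts as one round trip.
public sealed class FakeDb
{
    private readonly Dictionary<int, string> _customers =
        new Dictionary<int, string> { { 1, "Ann" }, { 2, "Bob" }, { 3, "Cy" } };
    private readonly int[] _orderCustomerIds = { 1, 2, 3, 1, 2 };

    public int RoundTrips { get; private set; }

    public int[] GetOrderCustomerIds()
    {
        RoundTrips++;
        return _orderCustomerIds;
    }

    public string GetCustomerName(int customerId)
    {
        RoundTrips++;
        return _customers[customerId];
    }

    public Dictionary<int, string> GetCustomerNames(IEnumerable<int> customerIds)
    {
        RoundTrips++;
        return customerIds.Distinct().ToDictionary(id => id, id => _customers[id]);
    }
}

public static class NPlusOneSketch
{
    // One query for the orders, then one more per order: N+1 round trips.
    public static List<string> NamesOneByOne(FakeDb db)
    {
        var names = new List<string>();
        foreach (int customerId in db.GetOrderCustomerIds())
            names.Add(db.GetCustomerName(customerId));
        return names;
    }

    // One query for the orders, one for all their customers: 2 round trips.
    public static List<string> NamesBatched(FakeDb db)
    {
        int[] ids = db.GetOrderCustomerIds();
        Dictionary<int, string> byId = db.GetCustomerNames(ids);
        return ids.Select(id => byId[id]).ToList();
    }
}
```

With the five fake orders above, the one-by-one version makes six round trips and the batched version two, for the same output.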
Don't use magic numbers, use enumerations
Don't hard-code values
Use generics where possible since it's typesafe & avoids boxing & unboxing
Use an error handler where it's absolutely needed
Dispose, dispose, dispose. The CLR won't know how to close your database connections, so close them after use and dispose of unmanaged resources
Use common-sense!
OK, I have got to throw in my favorite: If the task is long enough for human interaction, use a manual break in the debugger.
Vs. a profiler, this gives you a call stack and variable values you can use to really understand what's going on.
Do this 10-20 times and you get a good idea of what optimization might really make a difference.
If you identify a method as a bottleneck, but you don't know what to do about it, you are essentially stuck.
So I'll list a few things. None of these are silver bullets and you will still have to profile your code. I'm just making suggestions for things you could do that can sometimes help. The first three are especially important.
Try solving the problem using just (or: mainly) low-level types or arrays of them.
Problems are often small - using a smart but complex algorithm does not always make you win, especially if the less-smart algorithm can be expressed in code that only uses (arrays of) low level types. Take for example InsertionSort vs MergeSort for n<=100 or Tarjan's Dominator finding algorithm vs using bitvectors to naively solve the data-flow form of the problem for n<=100. (the 100 is of course just to give you some idea - profile!)
Consider writing a special case that can be solved using just low-level types (often problem instances of size < 64), even if you have to keep the other code around for larger problem instances.
Learn bitwise arithmetic to help you with the two ideas above.
BitArray can be your friend, compared to Dictionary, or worse, List. But beware that the implementation is not optimal; You can write a faster version yourself. Instead of testing that your arguments are out of range etc., you can often structure your algorithm so that the index can not go out of range anyway - but you can not remove the check from the standard BitArray and it is not free.
As an example of what you can do with just arrays of low level types, the BitMatrix is a rather powerful structure that can be implemented as just an array of ulongs and you can even traverse it using a ulong as "front" because you can take the lowest order bit in constant time (compared with the Queue in Breadth First Search - but obviously the order is different and depends on the index of the items rather than purely the order in which you find them).
Division and modulo are really slow unless the right hand side is a constant.
Floating point math is not in general slower than integer math anymore (not "something you can do", but "something you can skip doing")
Branching is not free. If you can avoid it using simple arithmetic (anything but division or modulo) you can sometimes gain some performance. Moving a branch to outside a loop is almost always a good idea.
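Here is a sketch of the BitMatrix idea from the list above: an adjacency matrix for up to 64 nodes stored as an array of ulongs, traversed with a ulong as the "front". The class and method names are mine, and BitOperations.TrailingZeroCount assumes .NET Core 3.0 or later; on older frameworks you would need your own lowest-set-bit routine.

```csharp
using System;
using System.Numerics;

// Adjacency matrix for at most 64 nodes; row i holds the successors of node i.
public sealed class BitMatrix64
{
    private readonly ulong[] _rows;

    public BitMatrix64(int size)
    {
        if (size < 1 || size > 64) throw new ArgumentOutOfRangeException(nameof(size));
        _rows = new ulong[size];
    }

    public void Set(int row, int col) => _rows[row] |= 1UL << col;

    public bool Get(int row, int col) => (_rows[row] & (1UL << col)) != 0;

    // Returns a bit set of all nodes reachable from start (including start).
    // Nodes are visited in index order, not in discovery order as with a Queue.
    public ulong Reachable(int start)
    {
        ulong visited = 1UL << start;
        ulong front = visited;
        while (front != 0)
        {
            int node = BitOperations.TrailingZeroCount(front); // lowest set bit
            front &= front - 1;                                // remove it from the front
            ulong fresh = _rows[node] & ~visited;              // successors not seen yet
            visited |= fresh;
            front |= fresh;
        }
        return visited;
    }
}
```

The whole traversal state is two ulongs, and a node's unseen successors are found with one AND and one NOT instead of a loop over a list.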
People have funny ideas about what actually matters. Stack Overflow is full of questions about, for example, is ++i more "performant" than i++. Here's an example of real performance tuning, and it's basically the same procedure for any language. If code is simply written a certain way "because it's faster", that's guessing.
Sure, you don't purposely write stupid code, but if guessing worked, there would be no need for profilers and profiling techniques.
The truth is that there is no such thing as the perfect optimised code. You can, however, optimise for a specific portion of code, on a known system (or set of systems) on a known CPU type (and count), a known platform (Microsoft? Mono?), a known framework / BCL version, a known CLI version, a known compiler version (bugs, specification changes, tweaks), a known amount of total and available memory, a known assembly origin (GAC? disk? remote?), with known background system activity from other processes.
In the real world, use a profiler, and look at the important bits; usually the obvious things are anything involving I/O, anything involving threading (again, this changes hugely between versions), and anything involving loops and lookups, but you might be surprised at what "obviously bad" code isn't actually a problem, and what "obviously good" code is a huge culprit.
Tell the compiler what to do, not how to do it. As an example, foreach (var item in list) is better than for (int i = 0; i < list.Count; i++) and m = list.Max(i => i.value); is better than list.Sort((a, b) => a.value.CompareTo(b.value)); m = list[list.Count - 1].value;.
By telling the system what you want to do it can figure out the best way to do it. LINQ is good because its results aren't computed until you need them. If you only ever use the first result, it doesn't have to compute the rest.
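The "not computed until you need them" part can be observed directly by counting how often the selector runs. This is only a sketch; DeferredLinqSketch and FirstSquareAbove are names made up for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredLinqSketch
{
    // Returns the first square above the threshold and reports, via the out
    // parameter, how many source items were actually squared to find it.
    public static int FirstSquareAbove(IEnumerable<int> source, int threshold, out int evaluated)
    {
        int count = 0;

        // Nothing is computed here; Select only describes the work.
        IEnumerable<int> squares = source.Select(x => { count++; return x * x; });

        // First pulls items one at a time and stops at the first match.
        int result = squares.First(sq => sq > threshold);

        evaluated = count;
        return result;
    }
}
```

Calling it with Enumerable.Range(1, 1000) and a threshold of 50 squares only the numbers 1 through 8 before stopping at 64; the other 992 items are never touched.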
Ultimately (and this applies to all programming) minimize loops and minimize what you do in loops. Even more important is to minimize the number of loops inside your loops. What's the difference between an O(n) algorithm and an O(n^2) algorithm? The O(n^2) algorithm has a loop inside of a loop.
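A small illustration of the loop-inside-a-loop point: the same question ("does this array contain a duplicate?") answered with a nested loop and with a single pass over a HashSet. The names are invented for the example, and for very small arrays the nested version may well be faster, so measure.

```csharp
using System.Collections.Generic;

public static class DuplicateSketch
{
    // O(n^2): every element is compared with every later element.
    public static bool HasDuplicateQuadratic(int[] items)
    {
        for (int i = 0; i < items.Length; i++)
            for (int j = i + 1; j < items.Length; j++)
                if (items[i] == items[j]) return true;
        return false;
    }

    // O(n) on average: one pass, with a hash set remembering what was seen.
    public static bool HasDuplicateLinear(int[] items)
    {
        var seen = new HashSet<int>();
        foreach (int item in items)
            if (!seen.Add(item)) return true; // Add returns false if already present
        return false;
    }
}
```

The linear version trades memory for time, which ties back to the earlier warning about caching: fewer loops is not automatically a win if the extra memory hurts elsewhere.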
I don't really try to optimize my code, but at times I will go through and use something like Reflector to turn my programs back into source. It is interesting to then compare what I wrote with what Reflector outputs. Sometimes I find that what I did in a more complicated form was simplified. It may not optimize things, but it helps me to see simpler solutions to problems.
I want to create a simple HTTP proxy server that does some very basic processing on the HTTP headers (i.e. if header x == y, do z). The server may need to support hundreds of users. I can write the server in C# (pretty easy) or C++ (much harder). However, would a C# version perform as well as a C++ version? If not, would the difference in performance be big enough that it would not make sense to write it in C#?
You can use unsafe C# code and pointers in critical bottleneck points to make it run faster. Those behave much like C++ code and I believe they execute as fast.
But most of the time, C# is already JIT-compiled to very fast code. As everyone else has said, I don't believe there will be much difference.
But one thing you might want to consider is: managed code (C#) string operations are rather slow compared to using pointers effectively in C++. There are more optimization tricks with C++ pointers than with CLR strings.
I think I have done some benchmarks before, but can't remember where I've put them.
Why do you expect a much higher performance from the C++ application?
There is no inherent slowdown added by a C# application when you are doing it right. (not too many dropped references, frequent object creation/dropping per call, etc.)
The only time a C++ application really outperforms an equivalent C# application is when you can do (very) low level operations. E.g. casting raw memory pointers, inline assembler, etc.
The C++ compiler may be better at creating fast code, but in most applications this advantage is wasted. If you do really have a part of your application that must be blindingly fast, try writing a C call for that hot spot.
Only if most of the system behaves too slowly should you consider writing it in C/C++. But there are many pitfalls that may kill your performance in your C++ code.
(TLDR: A C++ expert may create 'faster' code than a C# expert, but a mediocre C++ programmer may create slower code than a mediocre C# one)
I would expect the C# version to be nearly as fast as the C++ one but with smaller memory footprint.
In some cases managed code is actually a LOT faster and uses less memory compared to non-optimized C++. C++ code can be faster if written by an expert, but it rarely justifies the effort.
As a side note, I can recall a performance "competition" in the blogosphere between Rico Mariani (C#) and Raymond Chen (C++) to write a program that does exactly the same thing. Raymond Chen, who is considered one of the best programmers in the world (Joel), succeeded in writing faster C++ only after a long struggle and rewriting most of the code.
The proxy server you describe would deal mostly with string data and I think it's reasonable to implement in C#. In your example,
if header x == y, do z
the slowest part might actually be doing whatever 'z' is and you'll have to do that work regardless of the language.
In my experience, the design and implementation have much more to do with performance than does the choice of language/framework (however, the usual caveats apply: e.g., don't write a device driver in C# or Java).
I wouldn't think twice about writing the type of program you describe in a managed language (be it Java, C#, etc). These days, the performance gains you get from using a lower level language (in terms of closeness to hardware) are often easily offset by the runtime abilities of a managed environment. Of course this is coming from a C#/Python developer so I'm not exactly unbiased...
If you need a fast and reliable proxy server, it might make sense to try some of those that already exist. But if you have custom features that are required, then you may have to build your own. You may want to collect some more information on the expected load: hundreds of users might be a few requests a minute or a hundred requests a second.
Assuming you need to serve under or around 200 qps on a single machine, C# should easily meet your needs -- even languages known for being slow (e.g. Ruby) can easily pump out a few hundred requests a second.
Aside from performance, there are other reasons to choose C#, e.g. it's much easier to write buffer overflows in C++ than C#.
Is your http server going to run on a dedicated machine? If yes, I would say go with C# if it is easier for you. If you need to run other applications on the same machine, you'll need to take into account the memory footprint of your application and the fact that GC will run at "random" times.