I wrote some queries in C# using LINQ. After a while, I started using Haskell a little bit, which is a functional programming language (not an especially popular one), and to me it seems that the two are almost the same thing. But I am unsure about this. If someone has used them more than I have, could they tell me whether they are almost the same thing with regard to programming principles?
Also, could LINQ be considered functional programming?
Thanks.
to me it seems that the two are almost the same thing. But I am unsure about this. If someone has used them more than I have, could they tell me whether they are almost the same thing with regard to programming principles?
Yes, the design of LINQ query comprehensions was heavily influenced by the design of Haskell. Haskell expert Erik Meijer was on the C# language design committee when we designed LINQ; his insights were very valuable. (I joined the design team at the end of this process, so unfortunately I did not get to participate in all the interesting twists and turns the design went through over the years; it started being much more traditional OO than it ended up!)
If you've recently done a serious exploration into Haskell then you're probably becoming familiar with the idea of a monad. The LINQ syntax is designed specifically to make operations on the sequence monad feel natural, but in fact the implementation is more general; what C# calls "SelectMany" is a slightly modified form of the "Bind" operation on an arbitrary monad. You can actually use query comprehension with any monad, as my colleague Wes describes here, but doing so looks pretty weird and I recommend against it in production code.
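To make that concrete, here is a minimal sketch of a Maybe monad (a hypothetical type, not part of the BCL) whose SelectMany has exactly the shape the C# query translation looks for, so query syntax works over it:

using System;

public readonly struct Maybe<T>
{
    public readonly bool HasValue;
    public readonly T Value;

    private Maybe(T value) { HasValue = true; Value = value; }

    public static Maybe<T> Some(T value) => new Maybe<T>(value);
    public static readonly Maybe<T> None = default(Maybe<T>);

    // Haskell's Bind, reshaped into the signature the query translation expects.
    public Maybe<TResult> SelectMany<TMid, TResult>(
        Func<T, Maybe<TMid>> bind, Func<T, TMid, TResult> project)
    {
        if (!HasValue) return Maybe<TResult>.None;
        var mid = bind(Value);
        return mid.HasValue
            ? Maybe<TResult>.Some(project(Value, mid.Value))
            : Maybe<TResult>.None;
    }
}

With that in place, the compiler rewrites a two-from query into a single SelectMany call:

var answer = from x in Maybe<int>.Some(6)
             from y in Maybe<int>.Some(7)
             select x * y; // Maybe<int>.Some(42)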
Also, could LINQ be considered functional programming?
Yes, LINQ is heavily influenced by ideas from functional programming. It is designed to treat functions as first-class objects, to emphasize calculation over side effects, and so on.
It may be worthwhile to take a look at Erik Meijer's C9 lecture. Here's the description for it.
We kick off C9 Lectures with a journey into the world of Functional Programming with functional language purist and high priest of the lambda calculus, Dr. Erik Meijer (you can thank Erik for many of the functional constructs that have shown up in languages like C# and VB.NET. When you use LINQ, thank Erik in addition to Anders).
It is true that C# has gained some aspects of functional programming. First and foremost is the lambda expression, implemented as an anonymous delegate. You no longer need to define a method as belonging to an object, and you can define a method in much the same way as any other variable. This allows for many functional-style constructs:
Func<int, int, int> mult = (a, b) => a * b;
Func<int, double> square = a => Math.Pow(a, 2);
Func<int, int, double> multAndSquare = (a, b) => square(mult(a, b));
// Unlike Haskell, C# needs the delegate types spelled out, but the
// composition itself works the same way:
multAndSquare(5, 3); // == 225
The basic paradigm of functional programming, namely that anything you can tell a computer to do can be expressed in terms of higher-order functions, can be applied to C# programs, especially now that C# actually has higher-order functions. The compiler will force you to have at least one class with a static Main method (that's just how the runtime finds its entry point), but from that point you can define pretty much everything purely in terms of functions (here meaning the subset of "methods" that take N parameters and produce one output without side effects) instantiated as lambdas (a sketch follows).
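A minimal, self-contained sketch of a higher-order function in C# (the names are illustrative only): a function that takes a function and returns a new function.

using System;

class Demo
{
    static void Main()
    {
        // 'twice' takes a function f and returns a new function applying f twice.
        Func<Func<int, int>, Func<int, int>> twice = f => x => f(f(x));
        Func<int, int> addTwo = twice(x => x + 1);
        Console.WriteLine(addTwo(5)); // prints 7
    }
}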
LINQ does have some functional structure. Its method-chain paradigm is essentially monadic processing, which is how functional languages structure sequences of operations through operation encapsulation (and now C# does too). Calling a LINQ method on an IEnumerable returns another IEnumerable, which is really a different concrete class holding a handle to the source enumerable and a lambda to apply. You can replicate this in a functional language very simply: it is a pair of the source and a function that transforms the source into the result, one element at a time. Nothing is evaluated until you call a method that must produce an actual answer (a concretely typed result), at which point the chain of functions is evaluated just far enough to produce the desired result (see the sketch below).
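A rough sketch (not the real BCL implementation) of how such an operator pairs a source with a lambda and defers all work until enumeration:

using System;
using System.Collections.Generic;

static class MyLinq
{
    // Iterator method: nothing in the body runs until someone enumerates.
    public static IEnumerable<TResult> MySelect<TSource, TResult>(
        this IEnumerable<TSource> source, Func<TSource, TResult> projection)
    {
        foreach (var item in source)
            yield return projection(item); // one element at a time, on demand
    }
}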
Also, could LINQ be considered functional programming?
LINQ implies a functional style in that it resembles, for instance, a SQL select, which is read-only in principle. However, because the LINQ clauses in .NET are implemented by plain-old CLR routines, there is nothing to prevent LINQ expressions from modifying state.
Haskell, on the other hand, is a "pure" functional language, so expressions cannot modify global state. Although it is possible to perform IO and graphics operations, these are carried out using a very different idiom from anything in .NET — including F#, which lets you drop into a procedural style more or less transparently.
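For contrast, nothing stops a C# LINQ lambda from mutating state it has captured; a small illustrative snippet (names made up):

using System;
using System.Linq;

class Impure
{
    static void Main()
    {
        var names = new[] { "ann", "bob" };
        int evaluations = 0; // mutable state captured by the lambda

        var lengths = names.Select(n => { evaluations++; return n.Length; }).ToList();

        Console.WriteLine(evaluations); // 2 -- the "query" had a side effect
    }
}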
A friend and I have written an encryption module and we want to port it to multiple languages so that it's not platform-specific. Originally written in C#, I've ported it into C++ and Java. C# and Java will both encrypt at about 40 MB/s, but C++ will only encrypt at about 20 MB/s. Why is C++ running this much slower? Is it because I'm using Visual C++?
What can I do to speed up my code? Is there a different compiler that will optimize C++ better?
I've already tried optimizing the code itself, such as using x >> 3 instead of x / 8 (integer division), or y & 63 instead of y % 64, among other techniques. How can I build the project differently so that it is more performant in C++?
EDIT:
I must admit that I have not looked into how the compiler optimizes code. I have classes that I will be taking in college that are dedicated to compilers and interpreters.
As for my code in C++, it's not very complicated. There are NO includes; there is "basic" math along with something we call "state jumping" to produce pseudo-random results. The most complicated things we do are the bitwise operations that actually do the encryption and unchecked multiplication during an initial hashing phase. There are dynamically allocated 2D arrays which stay alive through the lifetime of the Encryption object (and are properly released in the destructor). The whole thing is only 180 lines. OK, so my micro-optimizations aren't necessary, but I have to believe they aren't what's slowing it down either. To really drill the point in, here is the most complicated line of code in the program:
input[L + offset] ^= state[state[SIndex ^ 255] & 63];
I'm not moving arrays, or working with objects.
Syntactically the entire set of code runs perfectly, and it works seamlessly if I encrypt something with C# and decrypt it with C++ or Java; all three languages interoperate as you'd expect.
I don't necessarily expect C++ to run faster than C# or Java (which are within 1 MB/s of each other), but I'm sure there's a way to make C++ run just as fast, or at least faster than it does now. I admit I'm not a C++ expert, and I'm certainly not as seasoned in it as many of you seem to be, but if I can cut and paste 99% of the code from C# to C++ and get it working in 5 minutes, then I'm a little put out that it takes twice as long to execute.
RE-EDIT:
I found an optimization setting in Visual Studio I had forgotten to enable. Now C++ is running 50% faster than C#. Thanks for all the tips; I've learned a lot about compilers in my research.
Without source code it's difficult to say anything about the performance of your encryption algorithm/program.
I reckon, though, that you made a "mistake" while porting it to C++, meaning that you used it in an inefficient way (e.g. lots of copying of objects). Maybe you also used VC6, whereas VC9 would/could produce much better code.
As for the "x >> 3" optimization: modern compilers convert integer division by a power of two to bit shifts by themselves. Needless to say, this optimization may not be the bottleneck of your program at all. You should profile first to find out where you're spending most of your time. :)
The question is extremely broad. Something that's efficient in C# may not be efficient in C++, and vice versa.
You're making micro-optimisations, but you need to examine the overall design of your solution to make sure that it makes sense in C++. It may be a good idea to re-design large parts of your solution so that it works better in C++.
As with all things performance related, profile the code first, then modify, then profile again. Repeat until you've got to an acceptable level of performance.
Things that are 'relatively' fast in C# may be extremely slow in C++.
You can write faster code in C++, but you can also write much slower code. Debug builds in particular may be extremely slow in C++. So look at the kind of optimizations your compiler performs.
When porting applications, C# programmers tend to use the "create a million newed objects" approach, which really makes C++ programs slow. You would rewrite these algorithms to use pre-allocated arrays and run tight loops over them.
With pre-allocated memory you leverage C++'s strength in using raw pointers to memory by casting them to the right POD-structured data.
But it really depends on what you have written in your code.
So measure your code, see where the implementation burns the most CPU, and then structure your code to use the right algorithms.
Your timing results are definitely not what I'd expect with well-written C++ and well-written C#. You're almost certainly writing inefficient C++. (Either that, or you're not compiling with the same sort of options. Make sure you're testing the release build, and check the optimization options.)
However, micro-optimizations like the ones you mention are going to do effectively nothing to improve the performance. You're wasting your time doing things that the compiler will do for you.
Usually you start by looking at the algorithm, but in this case we know the algorithm isn't causing the performance issue. I'd advise using a profiler to see if you can find a big time sink, but it may not find anything different from what you'd see in C# or Java.
I'd suggest looking at how C++ differs from Java and C#. One big thing is objects. In Java and C#, objects are represented in the same way as C++ pointers to objects, although it isn't obvious from the syntax.
If you're moving objects about in Java and C++, you're moving pointers in Java, which is quick, and objects in C++, which can be slow. Look for where you use medium or large objects. Are you putting them in container classes? Those classes move objects around. Change those to pointers (preferably smart pointers, like std::tr1::shared_ptr<>).
If you're not experienced in C++ (and an experienced and competent C++ programmer would be highly unlikely to be micro-optimizing), try to find somebody who is. C++ is not a really simple language, having a lot more legacy baggage than Java or C#, and you could be missing quite a few things.
Free C++ profilers:
What's the best free C++ profiler for Windows?
"Porting" performance-critical code from one language to another is usually a bad idea. You tend not to use the target language (C++ in this case) to its full potential.
Some of the worst C++ code I've seen was ported from Java. There was "new" for almost everything - normal for Java, but a sure performance killer for C++.
You're usually better off not porting, but reimplementing the critical parts.
The main reason C#/Java programs do not translate well (assuming everything else is correct) is that C#/Java developers have not fully grokked the concept of objects and references. Note that in C#/Java all objects are passed by (the equivalent of) a pointer.
class Message
{
    char buffer[10000];
};
Message Encrypt(Message message)  // Here you are making a copy of the message
{
    for (int loop = 0; loop < 10000; ++loop)
    {
        plop(message.buffer[loop]);
    }
    return message;               // Here you are making another copy of the message
}
To re-write this in a (more) C++ style you should probably be using references:
Message& Encrypt(Message& message) // pass a reference to the message
{
...
return message; // return the same reference.
}
The second thing C#/Java programmers have a hard time with is the lack of garbage collection. If you are not releasing memory correctly, the C++ version could start running low on memory and thrashing. In C++ we generally allocate objects on the stack (i.e. no new). If the lifetime of the object extends beyond the current scope of the method/function, then we use new, but we always wrap the returned pointer in a smart pointer (so that it will be correctly deleted).
void myFunc()
{
Message m;
// read message into m
Encrypt(m);
}
void alternative()
{
boost::shared_ptr<Message> m(new Message);
EncryptUsingPointer(m);
}
Show your code. We can't tell you how to optimize your code if we don't know what it looks like.
You're absolutely wasting your time converting divisions by constants into shift operations. Those kinds of braindead transformations can be made even by the dumbest compiler.
Where you can gain performance is in optimizations that require information the compiler doesn't have. The compiler knows that division by a power of two is equivalent to a right-shift.
Apart from this, there is little reason to expect C++ to be faster. C++ is much more dependent on you writing good code. C# and Java will produce pretty efficient code almost no matter what you do. But in C++, just one or two missteps will cripple performance.
And honestly, if you expected C++ to be faster because it's "native" or "closer to the metal", you're about a decade too late. JIT'ed languages can be very efficient, and with one or two exceptions, there's no reason why they must be slower than a native language.
You might find these posts enlightening.
They show, in short, that yes, ultimately, C++ has the potential to be faster, but for the most part, unless you go to extremes to optimize your code, C# will be just as fast, or faster.
If you want your C++ code to compete with the C# version, then a few suggestions:
Enable optimizations (you've hopefully already done this)
Think carefully about how you do disk I/O (iostreams isn't exactly an ideal library to use)
Profile your code to see what needs optimizing.
Understand your code. Study the assembler output, and see what can be done more efficiently.
Many common operations in C++ are surprisingly slow. Dynamic memory allocation is a prime example. It is almost free in C# or Java, but very costly in C++. Stack-allocation is your friend.
Understand your code's cache behavior. Is your data scattered all over the place? It shouldn't be a surprise then that your code is inefficient.
Totally off topic, but...
I found some info on the encryption module on the homepage you link to from your profile http://www.coreyogburn.com/bigproject.html
(quote)
Put together by my buddy Karl Wessels and I, we believe we have quite a powerful new algorithm.
What separates our encryption from the many existing encryptions is that ours is both fast AND secure. Currently, it takes 5 seconds to encrypt 100 MB. It is estimated that it would take 4.25 * 10^143 years to decrypt it!
[...]
We're also looking into getting a copyright and eventual commercial release.
I don't want to discourage you, but getting encryption right is hard. Very hard.
I'm not saying it's impossible for a twenty-year-old web developer to design an encryption algorithm that outshines all existing ones, but it's extremely unlikely, and I'm very skeptical; I think most people would be.
Nobody who cares about encryption would use an algorithm that's unpublished. I'm not saying you have to open up your source code, but the workings of the algorithm must be public and scrutinized if you want to be taken seriously...
There are areas where a language running on a VM outperforms C/C++, for example heap allocation of new objects. You can find more details here.
There is a somewhat old article in Dr. Dobb's Journal named "Microbenchmarking C++, C#, and Java" where you can see some actual benchmarks, and you will find that C# is sometimes faster than C++. One of the more extreme examples is the single hash map benchmark, where .NET 1.1 is a clear winner at 126 and VC++ is far behind at 537.
Some people will not believe you if you claim that a language like C# can be faster than C++, but it actually can. However, using a profiler and the very high level of fine-grained control that C++ offers should enable you to rewrite your application to be very performant.
When serious about performance you might want to be serious about profiling.
Separately, the "string" object implementation used in C#, Java, and C++ is noticeably slower in C++.
There are some cases where VM-based languages such as C# or Java can be faster than a C++ version, at least if you don't put much work into optimization and have a good knowledge of what is going on in the background. One reason is that the VM can optimize bytecode at runtime, figure out which parts of the program are used often, and change its optimization strategy accordingly. An old-fashioned compiler, on the other hand, has to decide how to optimize the program at compile time and may not find the best solution.
The C# JIT probably noticed at run time that the CPU is capable of running some advanced instructions, and is compiling to something better than what the C++ compiler produced.
You can probably (surely, with enough effort) outperform this by compiling with the most sophisticated instructions available for the target CPU and using knowledge of the algorithm to tell the compiler to use SIMD instructions at specific stages.
But before making any fancy changes to your code, make sure your C++ is being compiled for your actual CPU, not for something much more primitive (a Pentium?).
Edit:
If your C++ program does a lot of unwise allocations and deallocations this will also explain it.
In another thread, I pointed out that doing a direct translation from one language to another will almost always end up in the version in the new language running more poorly.
Different languages take different techniques.
Try the Intel compiler. It's much better than VC or gcc. As for the original question, I would be skeptical. Try to avoid using any containers and minimize the memory allocations in the offending function.
[Joke]There is an error in line 13[/Joke]
Now, seriously, no one can answer the question without the source code.
But as a rule of thumb, the fact that the C++ version is that much slower than the managed ones most likely points to memory-management and object-ownership issues.
For instance, if your algorithm does any dynamic memory allocation inside the processing loop, this will affect performance. If you pass heavy structures by value, this will affect performance. If you make unnecessary copies of objects, this will affect performance. Exception abuse will cause performance to go south. And the list goes on.
I know of cases where a forgotten "&" after a parameter name resulted in weeks of profiling/debugging:
void DoSomething(const HeavyStructure param); // Heavy structure will be copied
void DoSomething(const HeavyStructure& param); // No copy here
So, check your code to find possible bottlenecks.
C++ is not a language where you must use classes. In my opinion it's not logical to use OOP methodologies where they don't really help, and for an encrypter/decrypter it's best not to use classes; use arrays and pointers, and use as few functions/classes/files as possible. The best encryption systems consist of a single file containing a few functions. After your functions work nicely you can wrap them in classes if you wish. Also check the release build; there is a huge speed difference.
Nothing is faster than good machine/assembly code, so my goal when writing C/C++ is to write my code in such a way that the compiler understands my intentions to generate good machine code. Inlining is my favorite way to do this.
First, here's an aside. Good machine code:
uses registers more often than memory
rarely branches (if/else, for, and while)
uses memory accesses more often than function calls
rarely dynamically allocates any more memory (from the heap) than it already has
If you have a small class with very little code, then implement its methods in the body of the class definition and declare instances locally (on the stack) when you use them. If the class is simple enough, the compiler will often generate only a few instructions to effect its behavior, without any function calls or memory allocation to slow things down, just as if you had written the code all verbose and non-object-oriented. I usually have assembly output turned on (the /FA switches in Visual C++) so I can check the output.
It's nice to have a language that allows you to write high-level, encapsulated object-oriented code and still translate into simple, pure, lightning fast machine code.
Here's my 2 cents.
I wrote a Blowfish cipher in C (and C#). The C# code was almost identical to the C.
How they compared (I can't remember the exact numbers now, so these are just recalled ratios):
C native: 50
C managed: 15
C#: 10
As you can see, the native compilation outperforms any managed version. Why?
I am not 100% sure, but my C version compiled to very optimized assembly; the assembler output looked almost the same as a hand-written assembly implementation I found.
As a follow-up to the question "What are the advantages of built-in immutability of F# over C#?": am I correct in assuming that the F# compiler can make certain optimizations knowing that it's dealing with largely immutable code? I mean, even if a developer writes "functional C#", the compiler wouldn't know all of the immutability that the developer had tried to code in, and so couldn't make the same optimizations, right?
In general, would the compiler of a functional language be able to make optimizations that would not be possible with an imperative language, even one written with as much immutability as possible?
Am I correct in assuming that the F# compiler can make certain optimizations knowing that it's dealing with largely immutable code?
Unfortunately not. To a compiler writer, there's a huge difference between "largely immutable" and "immutable". Even guaranteed immutability is not that important to the optimizer; the main thing it buys you is that you can write a very aggressive inliner.
In general, would the compiler of a functional language be able to make optimizations that would not be possible with an imperative language, even one written with as much immutability as possible?
Yes, but it's mostly a question of being able to apply the classic optimizations more easily, in more places. For example, immutability makes it much easier to apply common-subexpression elimination, because immutability guarantees that the contents of certain memory cells are not changed.
On the other hand, if your functional language is not just immutable but pure (no side effects like I/O), then you enable a new class of optimizations that involve rewriting source-level expressions to more efficient expressions. One of the most important and more interesting to read about is short-cut deforestation, which is a way to avoid allocating memory space for intermediate results. A good example to read about is stream fusion.
If you are compiling a statically typed, functional language for high performance, here are some of the main points of emphasis:
Use memory effectively. When you can, work with "unboxed" values, avoiding allocation and an extra level of indirection to the heap. Stream fusion in particular and other deforestation techniques are all very effective because they eliminate allocations.
Have a super-fast allocator, and amortize heap-exhaustion checks over multiple allocations.
Inline functions effectively. Especially, inline small functions across module boundaries.
Represent first-class functions efficiently, usually through closure conversion. Handle partially applied functions efficiently.
Don't overlook the classic scalar and loop optimizations. They made a huge difference to compilers like TIL and Objective Caml.
If you have a lazy functional language like Haskell or Clean, there are also a lot of specialized things to do with thunks.
Footnotes:
One interesting option you get with total immutability is more ability to execute very fine-grained parallelism. The end of this story has yet to be told.
Writing a good compiler for F# is harder than writing a typical compiler (if there is such a thing) because F# is so heavily constrained: it must do the functional things well, but it must also work effectively within the .NET framework, which was not designed with functional languages in mind. We owe a tip of the hat to Don Syme and his team for doing such a great job on a heavily constrained problem.
No.
The F# compiler makes no attempt to analyze the referential transparency of a method or lambda. The .NET BCL is simply not designed for this.
The F# language specification does reserve the keyword 'pure', so manually marking a method as pure may be possible in vNext, allowing more aggressive graph reduction of lambda-expressions.
However, if you use either record or algebraic data types, F# will create default comparison and equality operators and provide copy semantics. Among many other benefits (pattern matching, the closed-world assumption), this removes a significant burden!
Yes, if you don't consider F#, but consider Haskell for instance. The fact that there are no side effects really opens up a lot of possibilities for optimization.
For instance consider in a C like language:
int factorial(int n) {
    if (n <= 0) return 1;
    return n * factorial(n - 1);
}
int factorialuser(int m) {
    return factorial(m) * factorial(m);
}
If a corresponding function were written in Haskell, there would be no second call to factorial when you call factorialuser. It might be possible to do this in C#, but I doubt the current compilers do it, even for an example as simple as this. As things get more complicated, it would be hard for C# compilers to optimize to the level Haskell can.
Note that F# is not really a "pure" functional language at present, which is why I brought in Haskell (which is great!).
Unfortunately, because F# is only mostly pure, there aren't really that many opportunities for aggressive optimization. In fact, there are some places where F# "pessimizes" code compared to C# (e.g. making defensive copies of structs to prevent observable mutation). On the bright side, the compiler does a good job overall despite this, providing performance comparable to C# in most places while simultaneously making programs easier to reason about.
I would say largely 'no'.
The main 'optimization' advantages you get from immutability or referential transparency are things like the ability to do 'common subexpression elimination' when you see code like ...f(x)...f(x).... But such analysis is hard to do without very precise information, and since F# runs on the .Net runtime and .Net has no way to mark methods as pure (effect-free), it requires a ton of built-in information and analysis to even try to do any of this.
On the other hand, in a language like Haskell (which mostly means 'Haskell', as there are few languages 'like Haskell' that anyone has heard of or uses :)) that is lazy and pure, the analysis is simpler (everything is pure, go nuts).
That said, such 'optimizations' can often interact badly with other useful aspects of the system (performance predictability, debugging, ...).
There are often stories of "a sufficiently smart compiler could do X", but my opinion is that the "sufficiently smart compiler" is, and always will be, a myth. If you want fast code, then write fast code; the compiler is not going to save you. If you want common subexpression elimination, then create a local variable (do it yourself).
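A hedged sketch of that do-it-yourself CSE (ExpensiveLookup is a hypothetical effect-free method, not anything from the question):

static class CseByHand
{
    // Stand-in for some costly but effect-free computation (hypothetical).
    static int ExpensiveLookup(int x)
    {
        System.Threading.Thread.Sleep(10);
        return x * x;
    }

    // Hoping the compiler merges the two identical calls (it won't,
    // since it can't prove ExpensiveLookup is pure):
    static int Slow(int x) => ExpensiveLookup(x) + ExpensiveLookup(x);

    // Doing the common-subexpression elimination yourself:
    static int Fast(int x)
    {
        int v = ExpensiveLookup(x);
        return v + v;
    }
}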
This is mostly my opinion, and you're welcome to downvote or disagree (indeed I've heard 'multicore' suggested as a rising reason that potentially 'optimization may get sexy again', which sounds plausible on the face of it). But if you're ever hopeful about any compiler doing any non-trivial optimization (that is not supported by annotations in the source code), then be prepared to wait a long, long time for your hopes to be fulfilled.
Don't get me wrong - immutability is good, and is likely to help you write 'fast' code in many situations. But not because the compiler optimizes it - rather, because the code is easy to write, debug, get correct, parallelize, profile, and decide which are the most important bottlenecks to spend time on (possibly rewriting them mutably). If you want efficient code, use a development process that lets you develop, test, and profile quickly.
Additional optimizations for functional languages are sometimes possible, but not necessarily because of immutability. Internally, many compilers will convert code into SSA (static single assignment) form, where each local variable inside a function can only be assigned once. This can be done for both imperative and functional languages. For instance:
x := x + 1
y := x + 4
can become
x_1 := x_0 + 1
y := x_1 + 4
where x_0 and x_1 are different variable names. This vastly simplifies many transformations, since you can move bits of code around without worrying about what value they have at specific points in the program. This doesn't work for values stored in memory, though (e.g., globals, heap values, arrays). Again, this is done for both functional and imperative languages.
One benefit most functional languages provide is a strong type system. This allows the compiler to make assumptions that it wouldn't be able to otherwise. For instance, if you have two references of different types, the compiler knows that they cannot alias (point to the same thing). This is not an assumption a C compiler could ever make.
What in C#.NET makes it more suitable for some projects than VB.NET?
Performance? Capabilities? Libraries/components? Reputation? Reliability? Maintainability? Ease?
Basically anything C# can do, that is impossible using VB, or vice versa.
Things you just have to consider when choosing C#/VB for a project.
C# and VB are basically the same; however, there are some minor differences. Aside from the obvious grammar differences, you have the following:
C# can call unsafe code
VB has optional parameters (coming in C# 4.0)
VB is easier to use when making late-bound calls (coming in C# 4.0). This and number 2 make VB much cleaner for Office automation.
VB has a bunch of "helper" functions and classes, like the My namespace; however, all of these are accessible from C#
VB is case insensitive
C#'s syntax follows a grammar similar to C and Java, which makes the transition from those languages much more comfortable, whereas VB can be more comfortable to VB users. As far as performance and libraries or components go, they are nearly identical.
As for which one to choose: unless you need to do unsafe operations, choose the language that is most natural to you. After years of being a VB developer, I love not having to write If yadada Then ... End If; typing if (yadada) { ... } saves my carpal tunnel a few extra keystrokes (which can then be used on answering SO questions).
Edit
Just learned one more difference between C# and VB: VB supports filtered exceptions, so you could do something like this pseudocode:
try
{
//do something that fails
}
catch(Exception ex when ArgumentException,
ArgumentNullException, FormatException)
{
//Only handle these three types
}
This should not be confused with the ability to do:
try
{
//something that fails
}
catch(ArgumentException)
{
//Log Error
}
catch(ArgumentNullException)
{
//Log Error
}
In this case you are going to handle the exceptions differently; in the VB world you could define one piece of code to handle multiple exception types.
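For what it's worth, exception filters did later arrive in C# (version 6) via the when keyword; a sketch of the equivalent of the VB behavior above (DoSomethingThatFails is a hypothetical method):

try
{
    DoSomethingThatFails(); // hypothetical method
}
catch (Exception ex) when (ex is ArgumentException
                        || ex is ArgumentNullException
                        || ex is FormatException)
{
    // one handler shared by all three exception types
}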
Edit
Some more differences.
VB's Is operator compares two references to determine whether they point to the same object; it compiles to the ceq IL instruction. C#'s is, by contrast, is a type test and compiles to isinst. So the following are equivalent statements:
C#:
if (foo is FooObject) {}
VB:
If TypeOf foo Is FooObject Then
Also, as mentioned in the comments (I wish I could see them to give you credit), C# doesn't have a Like operator; you need to use the Regex class.
I think this blog post by Kathleen Dollard provides an excellent overview to the question:
What a C# Coder Should Know Before They Write VB
and her first advice is:
1) Get over the respect thing or quit before you start. VB is a great language.
Street cred among geeks.
(And don't pretend that's not important!)
Others have covered a lot of the differences - as has been said several times, they're nearly equivalent languages. A few differences which haven't been covered as far as I've seen:
VB9 has:
XML literals
Mutable anonymous types (urgh)
More LINQ support in the language (C# only covers a few operators)
A whole bunch of extra bits in the language which get compiled down to calls to the Microsoft.VisualBasic assembly. (C# prefers to be a small language with the weight of the .NET framework behind it.)
DateTime literals
C# 3 has:
Better support for lambda expressions: IIRC, you can't write a lambda expression with a block body in VB.
Iterator blocks.
Syntax for extension methods (instead of decorating the method with an attribute)
I suspect there is more, but I thought I'd just throw those into the mix.
When you hit a problem, you'll often be able to Google for a code sample that shows how to solve it in minutes. This is a significant factor in improving productivity. When I was working with Delphi, I had to convert code samples from C into Object Pascal - doable, but tedious, i.e. lots of friction. So don't underestimate the fact that...
The vast majority of .Net code samples are in C# !
In early versions of VB.NET the differences were more obvious, but with the current version there are no significant differences.
VB supports XML literals directly in code, which C# does not. C# supports unsafe code and VB doesn't. Technically those two are the biggest differences. There are many small variations, but they are not important. I like C# more because I think the syntax is much less bloated; it's easier to recognize blocks of code than to parse a bunch of keywords, but that's purely personal preference.
Choose the one you and your team are more familiar with.
VB.Net has a root namespace and C# has a default namespace, and they are not the same: when you have a root namespace in VB.Net, it is always prepended to any namespace you declare.
For example, if you have a root namespace in VB.Net named namespace1 and then you add this to your file:
Namespace namespace1
Public Class class1
End Class
End Namespace
then you will have to refer to it as namespace1.namespace1.class1
In C#, if you have a default namespace called namespace1 and you put this in your file:
namespace namespace1{
public class class1{}
}
Then you can still just refer to it as namespace1.class1
My favorite feature of C# that VB doesn't have is the yield statement. It lets you easily return a lazily evaluated IEnumerable from a method.
Here is an article that covers it: http://msdn.microsoft.com/en-us/magazine/cc163970.aspx
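A minimal sketch of what that looks like (illustrative names only):

using System.Collections.Generic;

class LazySequences
{
    // Nothing in the body runs until the caller starts enumerating.
    static IEnumerable<int> Evens(int max)
    {
        for (int i = 0; i <= max; i += 2)
            yield return i; // produce one value per MoveNext()
    }
}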
VB gives better feedback on errors. In C# you have to compile more often to surface all the errors in your syntax.
The big advantage I see with C# is that the vast majority of open-source projects, sample code, and blog snippets are written in C#. Although it's easy to convert these to VB.NET with tools (or in your head), it is still a monotonous task for VB devs (like me).
Even if you write in VB.NET day to day, you will still need to be able to read C#
Technically, they share the same framework, with the same performance and memory characteristics and the same type system, so I find it hard to split the two.
Most good developers should be able to shift between the two with a few days adjustment time.
My core frustration lies with the hundreds of times I have written:
String lastName as String
and wondered why it never compiles!
In C# you can have more fine-tuned control over events/delegates, but you rarely need this.
There are a lot more code examples for C#.
In VB.Net it is (a lot) easier to use late binding, for example to COM objects (in C# only from version 4.0).
I use C# for 90% in my projects
I use VB.Net for interop to Excel, etc
As soon as I know F# better, I will probably use it for the parts that F# is better suited for.
Performance?
No difference, though VB's For loops historically use an inclusive upper bound, which means that you have to subtract 1 from the length most of the time:
For i = 0 To someArrayOrString.Length - 1 …
Though I doubt this impacts performance in any measurable way.
On the other hand, due to background compilation, VB actually compiles seemingly faster. Some people claim that this makes the IDE react sluggishly but I've never noticed that myself.
Capabilities?
In a few instances, C#'s yield statement is really useful. VB requires more manual work here. Also, lambdas are much better implemented in C#, especially syntactically. Consider these two statements:
Parallel.For(1, 10000, i => {
// Do something
});
versus
Parallel.For(1, 10000, Sub() _
' Do something '
End Sub)
Besides the fact that VB can't do this yet, and that a comment at this place is forbidden, it's just more clutter and generally a no-go.
Libraries/Components?
Identical.
Reputation?
Not important. “Street cred”? Sorry. Not a factor. Get over it, Nick.
Reliability?
Maintainability?
Ease?
More or less identical. I claim that VB is easier, but that may be biased, and in any case the difference is only marginal.
As I understand it, there are differences between the languages, though they are minimal. I would suggest working with the language that you/your developers feel most comfortable with. If they already have VB experience then I'd suggest VB.Net, and vice versa.
Though I prefer the terse syntax of C# personally. :)
Since C# and VB.Net both compile to MSIL, both have nearly identical performance, capabilities, libraries, and components. .NET Reflector can disassemble MSIL code into either C# or VB.NET (or a number of other languages).
This basically leaves us with: C# looks a lot like Java and C++, which gives it more credibility.
I think Josh gave a good rollup on the language differences.
Tooling Support
There are however also differences in how Visual Studio handles these languages.
Code Snippets are easier to use from the C# editor and the refactoring is better also.
The VB editor simplifies IntelliSense by not showing all options at all times.
I'm sure there are more things, but those are the ones that I as a C#'er (doing very little VB) have noticed.
Here is another point not yet covered in this thread:
Higher Degree of Employability + Better (more) Dev Resources
Now, I don't entirely agree with this point of view; however:
There are more C# dev shops than VB shops.
Some backward C# employers will put you at a serious disadvantage if you are stronger with VB, so maybe C# is the better choice there.
On the flip side, when you go to hire people for your C# project, you will be able to attract more candidates to interview and get better skills as a result; I don't think this detail should be overlooked.
I have to say it's been my experience, when using Google to search for examples or documentation, that C# examples are of better quality than VB examples. This isn't to say that there aren't bad C# examples, but if you search for something like "dictionary drop down", adding C# will surface a better-quality answer higher on the list.
This doesn't apply to examples that display both C# and VB code side by side.
For the moment, VB.Net has a very poor implementation of lambda expressions, meaning that you can't do the neat things that you can in C#. Well, you can, but it needs a very ugly workaround in most cases.
This is solved in VB.Net 10.0.
Free from old cruft, array declarations in C# are not padded with an extra element. A VB.NET array is padded with an extra element (Dim a(10) allocates 11 slots, indices 0 through 10) so that legacy VB6 code is easier to migrate.
C# is more consistent than VB.NET
Button1.Color() = Color.Red
Button1.Color = Color.Red
I couldn't give an answer when a student asked me when to use parentheses on properties in VB.NET (both lines above are legal). It's hard to give insightful reasoning for this kind of misfeature.
Consistency with instance and static members: VB.NET allows accessing static (Shared) members on an instance, e.g. yarn.Sleep(1000) on a Thread instance, which is a misfeature. https://stackoverflow.com/questions/312419/language-features-you-should-never-use
I have a number of data classes representing various entities.
Which is better: writing a generic class (say, to print or output XML) using generics and interfaces, or writing a separate class to deal with each data class?
Is there a performance benefit or any other benefit (other than it saving me the time of writing separate classes)?
There's a significant performance benefit to using generics: you do away with boxing and unboxing. Compared with developing your own classes, it's a coin toss (with one side of the coin weighted more than the other). Roll your own only if you think you can outperform the authors of the framework.
Not only yes, but HECK YES. I didn't believe how big a difference they could make. We did testing in VistaDB after rewriting a small percentage of core code from ArrayLists and Hashtables to generics, and saw a speed improvement of 250% or more.
Read my blog post about the testing we did on generics vs. weakly typed collections. The results blew our minds.
I have started rewriting lots of old code that used the weakly typed collections into strongly typed ones. One of my biggest gripes with the ADO.NET interface is that it doesn't expose more strongly typed ways of getting data in and out. The casting time from an object and back is an absolute killer in high-volume applications.
Another side effect of strong typing is that you often find weakly typed reference problems in your code. We found that by implementing structs in some cases to avoid putting pressure on the GC we could further speed up our code. Combine this with strong typing for your best speed increase.
Sometimes you have to use weakly typed interfaces within the .NET runtime. Whenever possible, though, look for ways to stay strongly typed. It really does make a huge difference in performance for non-trivial applications.
Generics in C# are truly generic types from the CLR's perspective. There should not be any fundamental difference between the performance of a generic class and a specific class that does the exact same thing. This is different from Java generics, which are more of an automated type cast where needed, or C++ templates, which expand at compile time.
Here's a good paper, somewhat old, that explains the basic design:
"Design and Implementation of Generics for the
.NET Common Language Runtime".
If you hand-write classes for specific tasks, chances are you can optimize some aspects where you would otherwise need additional detours through the interface of a generic type.
In summary, there may be a performance benefit but I would recommend the generic solution first, then optimize if needed. This is especially true if you expect to instantiate the generic with many different types.
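To tie this back to the original question, a hedged sketch of the generic-plus-interface approach (IXmlWritable and XmlPrinter are made-up names, not BCL types):

using System;

// One generic class plus an interface replaces a hand-written
// output class per data class.
public interface IXmlWritable
{
    string ToXml();
}

public class XmlPrinter<T> where T : IXmlWritable
{
    public void Print(T item) => Console.WriteLine(item.ToXml());
}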
I did some simple benchmarking of ArrayLists vs. generic Lists for a different question: Generics vs. Array Lists. Your mileage will vary, but the generic List was 4.7 times faster than the ArrayList.
So yes, boxing / unboxing are critical if you are doing a lot of operations. If you are doing simple CRUD stuff, I wouldn't worry about it.
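A small sketch of where that boxing cost comes from (illustrative only):

using System.Collections;
using System.Collections.Generic;

class BoxingDemo
{
    static void Main()
    {
        var weak = new ArrayList();
        weak.Add(42);          // boxes the int into a heap object
        int a = (int)weak[0];  // unbox, plus a runtime type check

        var strong = new List<int>();
        strong.Add(42);        // stored as a raw int: no box
        int b = strong[0];     // no cast, no check
    }
}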
Generics are one of the ways to parameterize code and avoid repetition. Looking at your program description and your thought of writing a separate class to deal with each and every data object, I would lean toward generics. Having a single class take care of many data objects, instead of many classes that do the same thing, increases your performance. And of course your performance, measured as the ability to change your code, is usually more important than the computer's performance. :-)
According to Microsoft, generics are faster than casting (boxing/unboxing primitives), which is true.
They also claim generics provide better performance than casting between reference types, which seems to be untrue (no one can quite prove it).
Tony Northrup - co-author of MCTS 70-536: Application Development Foundation - states in the same book the following:
I haven't been able to reproduce the performance benefits of generics; however, according to Microsoft, generics are faster than using casting. In practice, casting proved to be several times faster than using a generic. However, you probably won't notice performance differences in your applications. (My tests over 100,000 iterations took only a few seconds.) So you should still use generics because they are type-safe.
I haven't been able to reproduce such performance benefits with generics compared to casting between reference types, so I'd say the performance gain is more "supposed" than "significant".
If you compare a generic list (for example) to a specific list for exactly the type you use, then the difference is minimal; the results from the JIT compiler are almost the same.
If you compare a generic list to a list of objects, then there are significant benefits to the generic list: no boxing/unboxing for value types and no type checks for reference types.
Also, the generic collection classes in the .NET library were heavily optimized, and you are unlikely to do better yourself.
In the case of generic collections vs. boxing et al., with older collections like ArrayList, generics are a performance win. But in the vast majority of cases this is not the most important benefit of generics. I think there are two things of much greater benefit:
Type safety.
Self-documenting, a.k.a. more readable.
Generics promote type safety, forcing a more homogeneous collection. Imagine stumbling across a string when you expect an int. Ouch.
Generic collections are also more self documenting. Consider the two collections below:
ArrayList listOfNames = new ArrayList();
List<NameType> listOfNames = new List<NameType>();
Reading the first line you might think listOfNames is a list of strings. Wrong! It is actually storing objects of type NameType. The second example not only enforces that the type must be NameType (or a descendant), but the code is more readable: I know right away that I need to go find NameType and learn how to use it just by looking at the code.
I have seen a lot of these "does x perform better than y" questions on StackOverflow. The question here was very fair, and as it turns out generics are a win any way you skin the cat. But at the end of the day the point is to provide the user with something useful. Sure your application needs to be able to perform, but it also needs to not crash, and you need to be able to quickly respond to bugs and feature requests. I think you can see how these last two points tie in with the type safety and code readability of generic collections. If it were the opposite case, if ArrayList outperformed List<>, I would probably still take the List<> implementation unless the performance difference was significant.
As far as performance goes (in general), I would be willing to bet that you will find the majority of your performance bottlenecks in these areas over the course of your career:
Poor design of database or database queries (including indexing, etc),
Poor memory management (forgetting to call dispose, deep stacks, holding onto objects too long, etc),
Improper thread management (too many threads, not calling IO on a background thread in desktop apps, etc),
Poor IO design.
None of these are fixed with single-line solutions. We as programmers, engineers and geeks want to know all the cool little performance tricks. But it is important that we keep our eyes on the ball. I believe focusing on good design and programming practices in the four areas I mentioned above will further that cause far more than worrying about small performance gains.
Generics are faster!
I also discovered that Tony Northrup wrote incorrect things about the performance of generics vs. non-generics in his book.
I wrote about this on my blog:
http://andriybuday.blogspot.com/2010/01/generics-performance-vs-non-generics.html
Here is a great article where the author compares the performance of generics and non-generics:
nayyeri.net/use-generics-to-improve-performance
If you're thinking of a generic class that calls methods on some interface to do its work, that will be slower than specific classes using known types, because calling an interface method is slower than a (non-virtual) function call.
Of course, unless the code is the slow part of a performance-critical process, you should focus on clarity.
See Rico Mariani's Blog at MSDN too:
http://blogs.msdn.com/ricom/archive/2005/08/26/456879.aspx
Q1: Which is faster?
The Generics version is considerably faster; see below.
The article is a little old, but gives the details.
Not only can you do away with boxing, but the generic implementations are somewhat faster than the non-generic counterparts with reference types, due to a change in the underlying implementation.
The originals were designed with a particular extension model in mind. This model was never really used (and would have been a bad idea anyway), but the design decision forced a couple of methods to be virtual and thus uninlineable (based on current and past JIT optimizations in this regard).
This decision was rectified in the newer classes but cannot be altered in the older ones without it being a potential binary breaking change.
In addition, iteration via foreach on a List<> (rather than an IList<>) is faster because List<>'s enumerator is a struct and needs no heap allocation, whereas enumerating through the interface does. Admittedly, this did lead to an obscure bug.