I've just disassembled a project to debug it using Reflector, but it seems to balk at decoding the 'compile results' of automatic properties, e.g. the next line gives me a syntax error. I've tried fixing these manually, but every time I fix one, more appear.
private string <GLDescription>k__BackingField;
Is there anything I can do about this?
Ha! Stupid me: all I had to do was set the disassembler optimization in Reflector's options to .NET 3.5. Mine was on 2.0.
The compiler generates fields with "unspeakable names" - i.e. ones which are illegal in C# itself, but are valid IL.
There's no exactly accurate translation of the IL into "normal" C# (without automatic properties). You can replace < and > with _ which will give legal code, but then of course it won't be exactly the same code any more. If you're only after the ability to debug, however, that won't be a problem.
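The renaming suggested above can be scripted rather than done by hand. Here is a minimal sketch, assuming the decompiled source is available as text and that the generated names follow the `<Name>suffix__rest` shape seen in the question; the `make_speakable` helper and its pattern are illustrative, not part of Reflector:

```python
import re

# Compiler-generated ("unspeakable") identifiers look like
# <GLDescription>k__BackingField. The pattern is deliberately narrow
# (it requires a double underscore right after the '>') so that generic
# type arguments such as List<string> are left untouched.
UNSPEAKABLE = re.compile(r"<(\w*)>(\w+__\w+)")

def make_speakable(source):
    """Rewrite <Name>suffix identifiers as _Name_suffix, which is legal C#."""
    return UNSPEAKABLE.sub(r"_\1_\2", source)

line = "private string <GLDescription>k__BackingField;"
print(make_speakable(line))
# prints: private string _GLDescription_k__BackingField;
```

As noted, the result is no longer exactly the same code, and the new names could in principle clash with existing identifiers, so treat this as a debugging aid only.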
If you decompile iterators (i.e. methods using yield statements) you'll find more of the same, including the use of fault blocks, which are like finally blocks but they only run when an exception has occurred (but without catching the exception). Various other constructs generate unspeakable names too, including anonymous methods, lambda expressions and anonymous types.
On a broader note, do you have permission to decompile this code? If the author doesn't mind you doing so, they're likely to be willing to give you the source code to start with which would make your life easier. If they don't want you debugging their source code to start with, you should consider the ethical (and potentially legal) ramifications of decompiling the code. This may vary by location: consult a real lawyer for more definitive guidance.
EDIT: Having seen your own answer, that makes a lot of sense. I'll leave this here for background material.
Related
You load a foreign code example with libraries attached to it in Visual Studio. Now there is a method that you want to reuse in your code. Is there a function in VS that lets you strip the code from all unnecessary code to only have code left that is necessary for your current method to run?
It is not about the library. Loading a .sln or .csproj and having classes upon classes when you just want one method out of it is a waste of performance, RAM and space. It is about code you can easily omit, or references (what I call libraries) you can easily omit. A sub-question of this is: which "using" statements do you need for your current method and for the methods that pass parameters to it? In short, showing relevant code only: code that is tied together.
Let's use an example: you go to GitHub and download source code in C#. Let's call the solution S. You open S in Visual Studio. You don't disassemble; you just load the source code of S, which is there in plain text. Then you find a method M, in plain text, that you want to use. M contains some objects whose classes were defined somewhere in the project. The goal is to recreate the surroundings of this method only, so that I can copy and paste it into my own solution without having red underlined words in almost every line within the method.
After reading the question and the comments, I think I have a vague idea of what you are referring to.
If we ignore the context of the method you are referring to, you can extract any piece of code from a "library" by using a .NET decompiler and assembly browser.
There are many of them for free, such as:
dotPeek,
ILSpy
...
This will allow you to see the method's code. From there on, you can proceed as you like. If you copy the method into your code base, you might still have to change it a bit to adapt it to your objects and context. If you don't, it will still give you insight into how the method works and might help you understand the logic, so that you can write your own.
Disclaimer: With this post, I am pointing out that it is possible to extract code from an assembly. I am not discussing the ethics or legal perspective behind such actions.
Hope this helps,
Happy Coding!
If it's just one method, look at the source code and copy it into your library. Make sure you add a comment saying where you obtained the code and who holds the copyright! Don't forget to include the licence, which you should have done with a library reference anyway.
That said, it is currently not (officially) possible to automatically remove unused, publicly declared code from a library (assembly). This process is called tree shaking, by the way. Exception: .NET Native.
But .NET Native is only available for Windows Store Apps. You can read more about it here.
That said, we have the JIT (just-in-time) compiler, which is really smart. I wouldn't worry about a few KB of library code. Spend your time optimizing your SQL queries and other bottlenecks. The classes are only loaded when you actually use them.
Using unstable solutions, or maintaining a fork of a library from which you use more than one method (with no documentation and no expertise, since it is your own fork), isn't worth the headache you will have!
If you really want to go the route of removing everything you do not want, you can open the solution, declare everything as internal (search and replace is your friend) and restore to public the parts that give you a build-time error or a run-time error (reflection). Then remove everything that is still internal. There are several design-time tools, like ReSharper, that can remove dead code.
But as I said, it's not worth it!
For .NET Core users: in 6-8 weeks we will have the .NET IL Linker, as spender has commented; it looks promising. What does this mean? The .NET framework evolves from time to time. Let it evolve and look at your productivity in the meantime.
I know it might not be worth it but just for education purposes I want to know if there is a way to inject your own keywords to .NET languages.
For example, I thought it would be good to have the C++ asm keyword in C#.
Remember, I'm not talking about how to implement the asm keyword, but about a general way to add a keyword to C#.
My imagined code:
asm{
mov ax,1
add ax,4
}
So is there a way to achieve this?
Answers that cover implementing keyword{ } are sufficient for this question.
This isn't possible at the moment. However, there's a Microsoft project in development called Roslyn that can be summarised as "the compiler as a service." It allows you, amongst other things, to extend or modify the behaviour of the compiler through an API.
When Roslyn becomes available, I believe this should be something that (with caution!) is quite doable.
You can use whatever tools you would like to pre-process your code before sending it to the C# compiler. For example, you might use VS macros to do the pre-processing, mapping a given syntax that you invented into something that does compile into C# code, possibly generating an error if there is a problem. If VS macros aren't powerful enough for you then you can always use your own IDE that does whatever you code it to do to the text before sending it to the compiler.
There is no built in support in the compiler for specifying your own keywords/syntax; you would need to handle it entirely independent of the compiler.
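To make the pre-processing idea concrete, here is a minimal sketch of such a step, run over the source text before it reaches the C# compiler. Everything in it is an assumption for illustration: the `unless (...)` keyword is invented (it is not C# and not the asm block from the question), and `preprocess` is a hypothetical helper. It only handles conditions written on a single line, and it does not know about string literals or comments.

```python
import re

# Hypothetical invented syntax: "unless (condition)" is not C#.
# The preprocessor rewrites it to "if (!(condition))" before compilation.
UNLESS = re.compile(r"\bunless\s*\((.*)\)\s*$")

def preprocess(source):
    """Map the invented 'unless' keyword onto plain C#, line by line."""
    out = []
    for number, line in enumerate(source.splitlines(), start=1):
        if re.search(r"\bunless\b", line):
            match = UNLESS.search(line)
            if match is None:
                # Report a problem instead of emitting code that will not compile.
                raise SyntaxError(f"line {number}: malformed 'unless'")
            line = line[:match.start()] + f"if (!({match.group(1)}))"
        out.append(line)
    return "\n".join(out)

print(preprocess("unless (x > 0)\n{\n    Log();\n}"))
```

The compiler only ever sees the rewritten text, so its error messages and line positions refer to the generated code, not to the invented syntax.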
Unfortunately this is not possible. You can't extend or alter the languages in any way.
You could, in some obscure way, use PostSharp to read and parse strings and transform them into custom code at compile time (a preprocessor). But you would not be very happy with that, as it is very error-prone and you won't get any kind of IntelliSense or code completion for your magic strings.
According to MSDN, keywords are predefined and cannot be altered. You can't add any, because you would need to tell the compiler how to handle them. So, no, you can't.
I've been using Reflector to decompile a couple of simple C# apps, but I notice that although the code is being decompiled, I still can't see things as they were written in VS. I assume this is just the way it is, since the compiler replaces human instructions with machine code. However, I thought I would give it a try and ask here. Maybe there is a decompiler that can decompile and show the code almost identically to the original.
That is impossible, since there are lots of ways to get the same IL from different code. For example, there is no way to know if an extension method was called fluent-style vs explicit on the declaring type. There is no way to know if LINQ vs regular code was used. All manner of implicit operations may or may not be there. Removed code may or may not have been there. Many primitives (including enums) up-to-and-including 4 bytes are indistinguishable once they are IL.
If you want the actual code, legally obtain the original code.
Existing .Net decompilers generally decompile to the best of their ability.
You appear to be asking for variable names and line formatting, which for obvious reasons are not compiled to IL.
There are several. I currently use JustDecompile found here http://www.telerik.com/products/decompiler.aspx?utm_source=twitter&utm_medium=sm&utm_campaign=ad
[Edit]
An alternative is .NET Reflector found here: http://www.reflector.net/
I believe there is a free version of it, but didn't take time to look.
Basically, no. There are often many ways to arrive at the same IL code, and there's no way at all for a decompiler to know which was used.
No, nor should there ever be. Things like comments and unreachable code would just add bloat with absolutely zero benefit. The very best you can ever do is approximate the compiled code.
Has anyone come across a tool to report on commented-out code in a .NET app? I'm talking about patterns like:
//var foo = "This is dead";
And
/*
var foo = "This is dead";
*/
This won't be found by tools like ReSharper or FxCop which look for unreferenced code. There are obvious implications around distinguishing commented code from commented text but it doesn't seem like too great a task.
Is there any existing tool out there which can pick this up? Even if it was just reporting of occurrences by file rather than full IDE integration.
Edit 1: I've also logged this as a StyleCop feature request. Seems like a good fit for the tool.
Edit 2: Yes, there's a good reason why I'd like to do this and it relates to code quality. See my comment below.
You can get an approximate answer by using a regexp that recognizes comments that end with ";" or "}".
For a more precise scheme, see this answer:
Tool to find commented out VHDL code
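The approximate regexp approach could look like the sketch below. It is a heuristic under stated assumptions: it only looks at whole-line // comments, it skips /// documentation comments (my choice, not something this answer prescribes), and `find_commented_code` is a hypothetical helper name.

```python
import re

# A whole-line comment whose text ends in ';' or '}' (ignoring trailing
# spaces). The (?!/) lookahead skips '///' documentation comments.
COMMENTED_CODE = re.compile(r"^\s*//(?!/).*[;}]\s*$")

def find_commented_code(source):
    """Return (line number, text) pairs for comments that look like code."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if COMMENTED_CODE.match(line):
            hits.append((number, line.strip()))
    return hits

sample = "\n".join([
    "// Load the configuration first.",
    '//var foo = "This is dead";',
    "var bar = 1; // trailing note",
    "/// Example: Foo();",
])
for number, text in find_commented_code(sample):
    print(number, text)
```

Prose comments that happen to end in a semicolon will be reported too, so the output is a list of candidates to review rather than a verdict.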
I've been down this road for the same reason. I more or less did what Ira Baxter suggested, though I focused on the pattern variable_type variable = value and specifically looked for lines that consisted of zero or more whitespace characters at the beginning, followed by //, followed by code. To handle /* */, I wrote a preprocessor that converted those comments into //'s. I tweaked the regexp to cut down on false positives and also did a manual inspection just to be safe; fortunately, there were very few cases where the comment was doing pseudo-code-like things, as drachenstern suggests above; YMMV. I'd love to find a tool that could do this, but some false positives from valid but possibly overly detailed pseudo-code are going to be really hard to rule out, especially if the authors are using literate programming techniques to make the code as "readable" as pseudo-code.
(I also did this for VB6 code; on the one hand, the lack of ;'s made it harder to write an "easy" regexp; on the other hand, the code used far fewer classes, so it was easier to match on variable types, which tend not to appear in pseudo-code.)
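A rough reconstruction of that two-step scheme for C#, as a sketch rather than the script I actually used: first turn /* */ blocks into // lines, then look for whole-line comments shaped like a declaration with an initializer. The function names and both patterns are illustrative assumptions, and the block-comment pattern is naive (it does not know about string literals).

```python
import re

BLOCK_COMMENT = re.compile(r"/\*(.*?)\*/", re.DOTALL)

# Optional whitespace, '//', then something shaped like: type name = value;
DECLARATION = re.compile(r"^\s*//\s*[\w<>\[\].]+\s+\w+\s*=\s*.+;\s*$")

def block_to_line_comments(source):
    """Rewrite each /* ... */ block as a run of // lines (line count is kept)."""
    def convert(match):
        return "\n".join("//" + line for line in match.group(1).split("\n"))
    return BLOCK_COMMENT.sub(convert, source)

def commented_declarations(source):
    """Return (line number, text) pairs for commented-out declarations."""
    normalized = block_to_line_comments(source)
    return [
        (number, line.strip())
        for number, line in enumerate(normalized.splitlines(), start=1)
        if DECLARATION.match(line)
    ]

sample = 'int x = 1;\n/*\nvar foo = "This is dead";\n*/\n// explain the loop\n'
print(commented_declarations(sample))
```

Because the conversion keeps one output line per input line, the reported line numbers still point at the right place in the original file.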
Another option I looked at but didn't have available was to look at the revision control logs for lines that were code in version X and then //same line in version Y. This of course assumes that A) they are using revision control, B) you can view it, and C) they didn't write code and comment it out in the same revision. (It also gets a little trickier if they use /* */ comments.)
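Since I never got to try the revision-control idea, the following is only a sketch of how the comparison itself might work, assuming you can get the text of the same file at two revisions out of your revision control system. `newly_commented_out` is a hypothetical helper, and it handles // comments only.

```python
def newly_commented_out(old_source, new_source):
    """Find // lines in the new version whose text was live code in the old one."""
    # Every non-blank, non-comment line of the old version, whitespace-trimmed.
    old_code = {
        line.strip()
        for line in old_source.splitlines()
        if line.strip() and not line.strip().startswith("//")
    }
    hits = []
    for number, line in enumerate(new_source.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("//"):
            # Drop the comment marker and compare with the old code lines.
            body = stripped.lstrip("/").strip()
            if body in old_code:
                hits.append((number, stripped))
    return hits

old = "var foo = Load();\nProcess(foo);\n"
new = "//var foo = Load();\n// now uses the cache\nProcess(cache);\n"
print(newly_commented_out(old, new))
```

Matching against real earlier code avoids most of the pseudo-code false positives, at the cost of missing code that was written and commented out within a single revision.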
There is another option for this, Sonar. It is actually a Java centric app but there is a plugin that can handle C# code. Currently it can report on:
StyleCop errors
FxCop errors
Gendarme errors
Duplicated code
Commented code
Unit test results
Code coverage (using NCover, OpenCover)
It does take a while to scan, mostly due to the duplication checks (AFAIK it uses text matching rather than C# syntax trees), and if you use the internal default Derby database it can fail on large code bases. However, it is very useful for the code-base metrics you gain, and it has snapshot features that let you see how things have changed and (hopefully) got better over time.
Since StyleCop is not actively maintained (see https://github.com/StyleCop/StyleCop#considerations), I decided to roll out my own, dead-csharp:
https://github.com/mristin/dead-csharp
Dead-csharp uses heuristics to find code patterns in the comments. The comments starting with /// are intentionally ignored (so that you can write code in structured comments).
StyleCop will catch the first pattern. It suggests you use //// as a comment for code so that it will ignore the rule.
Seeing as you mentioned NDepend, it can also tell you the percentage of comments: http://www.ndepend.com/Metrics.aspx#PercentageComment. This metric is defined for applications, assemblies, namespaces, types and methods.
Do class, method and variable names get included in the MSIL after compiling a Windows App project into an EXE?
For obfuscation - fewer names, harder to reverse engineer.
And for performance - shorter names, faster access.
e.g. So if methods ARE called via name:
Keep names short, better performance for named-lookup.
Keep names cryptic, harder to decompile.
Yes, they're in the IL - fire up Reflector and you'll see them. If they didn't end up in the IL, you couldn't build against them as libraries. (And yes, you can reference .exe files as if they were class libraries.)
However, this is all resolved once in JIT.
Keep names readable so that you'll be able to maintain the code in the future. The performance issue is unlikely to make any measurable difference, and if you want to obfuscate your code, don't do it at the source code level (where you're the one to read the code) - do it with a purpose-built obfuscator.
EDIT: As for what's included - why not just launch Reflector or ildasm and find out? From memory, you lose local variable names (which are in the pdb file if you build it) but that's about it. Private method names and private variable names are still there.
Yes, they do. I do not think that there will be notable performance gain by using shorter names. There is no way that gain overcomes the loss of readability.
Local variables are not included in MSIL. Fields, methods, classes etc are.
Variables are index based.
Member names do get included in the IL whether they are private or public. In fact all of your code gets included too, and if you'd use Reflector, you can practically read all the source code of the application. What's left is debugging the app, and I think there might be tools for that.
You must ABSOLUTELY (and I can't emphasize it more) obfuscate your code if you're making packaged applications that have a number of clients and competition. Luckily there are a number of obfuscators available.
This is a major gripe that I have with .NET. Since MS is doing so much hard work on this, why not develop (or acquire) a professional obfuscator and make it a part of VS? Dotfuscator just doesn't cut it, at least not the community version.
Keep names short, better performance for named-lookup.
How could this make any difference? I'm not sure how identifiers are looked up by the VM, but I'm pretty sure it's not doing a straight string comparison lookup. This would be the worst possible way to do it.
Keep names cryptic, harder to decompile.
To be honest, I don't think code obfuscation helps that much. Most competent developers out there have already developed a "sixth sense" to figure out things quickly even if identifiers like method names are totally unhelpful since very often the source code they need to maintain or improve already has these problems (I am talking about method names like "DoAllStuff()").
Anyway, security through obscurity is usually a bad idea.
If you are concerned about obfuscation check out .NET Reactor. I tested 8 different obfuscators and Reactor was not only the cheapest commercial one, it was the second best of the bunch (the best was the most expensive one, Dotfuscator Gold).
[EDIT]
Actually now that I think of it, if all you care about is obfuscating method names then the one that comes with VS.NET, Dotfuscator Community Edition, should work fine.
I think they're added, but the length of the name isn't going to affect anything, because of the way the function names are looked up. As for obfuscation, I think there are tools (Dotfuscator or something like that) that basically do exactly what you're saying.