How to reverse obfuscation in .NET? - c#

Is obfuscation only about garbling the names of non-public variables/members? If so, would it not be possible to write an application that would at least change these names to more readable ones like "variable1", etc., and then extract the whole code in a form that can still be compiled?

No, it is about a lot more, especially with more sophisticated obfuscators. They can produce IL that cannot be expressed in most languages, and where the logic flow is horribly tangled to befuddle the best of tools. With lots of time you can do it (probably lots by hand), and there is certainly an arms race between the obfuscators and deobfuscators - but you vastly underestimate the technology here.
Also, note that many obfuscators look at an entire application (not just one assembly), so they can change the public API too.

That is certainly the start of an obfuscator. Some obfuscators will also encrypt strings and use other such tricks to make it very difficult to reverse engineer the assembly.
Of course, since the runtime needs to run the assembly after all of this, it is possible for a determined hacker to reverse engineer it :)
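To make the string-encryption trick concrete, here is a minimal sketch of the idea. It is not any particular obfuscator's scheme: the names StringHider, Hide and Reveal and the single-byte XOR key are made up for illustration, and real tools use much stronger transforms. It also hints at why the trick is reversible, since the decoding routine has to ship inside the assembly.

```csharp
using System;
using System.Text;

// Hypothetical illustration of string hiding; not a real obfuscator's algorithm.
public static class StringHider
{
    private const byte Key = 0x5A; // assumed single-byte key, trivially weak

    // Build-time step: turn the literal into scrambled bytes.
    public static byte[] Hide(string plain)
    {
        byte[] data = Encoding.UTF8.GetBytes(plain);
        for (int i = 0; i < data.Length; i++)
        {
            data[i] ^= Key;
        }
        return data;
    }

    // Run-time step: undo the scrambling just before the string is used.
    public static string Reveal(byte[] hidden)
    {
        byte[] data = (byte[])hidden.Clone();
        for (int i = 0; i < data.Length; i++)
        {
            data[i] ^= Key;
        }
        return Encoding.UTF8.GetString(data);
    }
}
```

Anyone who can call or copy Reveal gets the original text back, which is essentially what the string-decrypting deobfuscators automate.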

There are 'deobfuscator' tools that undo several obfuscation techniques: they decrypt strings, remove proxy methods, devirtualize virtualized code, remove anti-debug code, remove junk classes, restore the types of method parameters and fields, and more.
One very powerful tool is de4dot.
But there are more.

Obfuscation is about changing meaningful names like accountBalance to meaningless ones like a1.
The application will obviously still work, but it will be more difficult to understand the algorithms inside it.
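As a rough illustration of pure renaming, the sketch below shows the same class twice: once as written, and once as a renamer might leave it. The second class is a hand-written imitation, not the output of a real tool, and the names Account, a, a1, b, c and d are all invented.

```csharp
// Before: names carry the meaning.
public class Account
{
    private decimal accountBalance;

    public void Deposit(decimal amount)
    {
        accountBalance += amount;
    }

    public decimal GetBalance()
    {
        return accountBalance;
    }
}

// After (hand-written imitation of a renamer's output): same behaviour, no hints.
public class a
{
    private decimal a1;

    public void b(decimal c)
    {
        a1 += c;
    }

    public decimal d()
    {
        return a1;
    }
}
```

Both classes behave identically; only the reader's ability to guess what the code is for has changed.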

It depends on the obfuscation technology used. Obfuscating variable names is only one part of the issue. A lot of obfuscation tools perform some kind of program flow obfuscation at the same time, which makes the code even harder to comprehend. In the end, the obfuscated IL won't be easily expressible (if at all) in most programming languages.
Renaming the variables and fields won't help you much either, as having a lot of variable1, variable2.. won't help you to understand what you read.

Related

Tool for 'Flattening' (simplifying) C# Source

I need to provide a copy of the source code to a third party, but given it's a nifty extensible framework that could be easily repurposed, I'd rather provide a less OO version (a 'procedural' version for want of a better term) that would allow minor tweaks to values etc but not reimplementation using the full flexibility of how it is currently structured.
The code makes use of the usual stuff: classes, constructors, etc. Is there a tool or method for 'simplifying' this into what is still the 'source' but using only plain variables etc.
For example, if I had a class instance 'myclass' which initialised this.blah in the constructor, the same could be done with a variable called myclass_blah which would then be manipulated in a more 'flat' way. I realise some things like polymorphism would probably not be possible in such a situation. Perhaps an obfuscator, set to a 'super mild' setting would achieve it?
Thanks
My experience with nifty extensible frameworks has been that most shops have their own nifty extensible frameworks (usually more than one) and are not likely to steal them from vendor-provided source code. If you are under obligation to provide source code (due to some business relationship), then, at least in my mind, there's an ethical obligation to provide the actual source code, in a maintainable form. How you protect the source code is a legal matter and I can't offer legal advice, but really you should be including some license with your release and dealing with clients who are not going to outright steal your IP (assuming it's actually yours under the terms you're developing it.)
As has already been said, if this is a requirement based on restrictions of contracts then don't do it. In short, providing a version of the source that differs from what they're actually running becomes a liability and I doubt that it is one that your company should be willing to take. Proving that the code provided matches the code they are running is simple. This is also true if you're trying to avoid license restrictions of libraries your application uses (e.g. GPL).
If that isn't the case then why not provide a limited version of your extensibility framework that only works with internal types and statically compile any required extensions in your application? This will allow the application to continue to function as what they currently run while remaining maintainable without giving up your sacred framework. I've never done it myself but this sounds like something ILMerge could help with.
If you don't want to give out the framework - just don't. Provide only the source you think is required. Otherwise you'll most likely need to either support both versions in the future OR never work/interact with these people (and people they know) again.
Don't forget that non-obfuscated .NET assemblies have IL in an easily decompilable form. It is often easier to use ILSpy/Reflector to read someone else's code than to look at the sources.
If the reason to provide code is some sort of inspection (even simply looking at the code) you'd better have semi-decent code. I would seriously consider throwing away a tool if its code looks like it was written FORTRAN-style in C# ( http://www.nikhef.nl/~templon/fortran/fortran_style ).
Side note: I believe "nifty extensible frameworks" are one of the roots of "not invented here" syndrome - I'd be more worried about comments on the framework (like "this code is ##### because it does not use YYY pattern and spacing is wrong") than reuse.

Checking efficiency of obfuscation of C# code

I'm evaluating several obfuscators for protecting code in a WPF application.
For checking results of job done by each obfuscator on a given assembly I use Red Gate's .Net Reflector. Just after each obfuscation I open the assembly with .NET Reflector and see what it looks like.
Is it enough? Can .NET Reflector's results be treated as an indicator of quality of obfuscation, or should I try some additional tools? (not any possible instrument of such a kind, but from a point of view of practical common sense).
The results from Reflector should be enough as an indication of how any casual attempt at decompiling would fare. Some obfuscators will obfuscate code to the extent that the assembly will not even open in Reflector.
Anyone willing to dig deeper than that will not be easily deterred, however advanced the obfuscation is.
It would be best if Reflector and ILSpy outright refused to decompile the resulting assembly. I know that there are obfuscators capable of that.
My opinion is that whether it is "enough" depends on your target app. Obfuscation never makes code 100% secure; it only makes disassembly difficult enough for a potential attacker, and it all depends on how much effort that attacker is willing to put into disassembling your app. Also, .NET Reflector is a viewer, as you mentioned, so you can judge how secure the result is by looking at, for example:
if strings are encrypted
if parameters are encrypted
if class names and fields like (PWD_USER) are encrypted
...
Regards.

Restricting using strong named dlls functionality

I'm trying to think of a way that prevents others from using your published dlls. For example let's say you create a cool lightweight WinUI photo processing tool that's separated into several assemblies. One of them is your precious filters.dll assembly that basically does all of the core filtering work. Once you publish your application, how can you prevent others from taking this filters.dll and using it in other projects?
I've already tried to look at the StrongNameIdentityPermissionAttribute which has a good example here but it doesn't seem to work for me, the code just works without throwing any security exceptions..
Any ideas?
Strong names have nothing to do with preventing or inhibiting reverse engineering. They only serve to stop people substituting assemblies with hacked versions - and only if people haven't turned off strong name verification. There's nothing to stop people taking your code, ILDASMing or Reflectoring and re-ILASMing as they see fit.
InternalsVisibleTo and friends are on an honour system at the compiler level too, so not much use for what you're looking for (although for some obfuscators, internals get more aggressively obfuscated than publics by default - though this can generally be overcome). My main concern here is to point out that just because something is 'internal' doesn't bestow on it any magic code protection pixie dust that stops reverse engineering.
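A small sketch of why accessibility is not protection: reflection can find and invoke a private method on an internal type by name. SecretFilter and Sharpen are hypothetical stand-ins for something inside filters.dll, and for brevity the caller sits in the same file here; the same reflection calls are what a caller in another assembly would use, assuming it runs with enough trust.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for a type inside filters.dll.
internal class SecretFilter
{
    private int Sharpen(int pixel)
    {
        return pixel * 2;
    }
}

public static class ReflectionDemo
{
    // Invokes the private method by name, ignoring its accessibility.
    public static int CallPrivateSharpen(int pixel)
    {
        Type type = typeof(SecretFilter);
        object instance = Activator.CreateInstance(type, nonPublic: true);
        MethodInfo method = type.GetMethod(
            "Sharpen", BindingFlags.Instance | BindingFlags.NonPublic);
        return (int)method.Invoke(instance, new object[] { pixel });
    }
}
```

The compiler would reject a direct call to Sharpen, but the runtime metadata still describes it, so the call goes through.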
Most of the reasons why these sorts of approaches aren't a solution for code protection are summarised very well in this article
There are also code protection products on the market that go beyond obfuscation which sound like the tool for the job you describe.
One method that may work for you is to declare the methods and classes in the filter assembly as internal and explicitly specify the assemblies that can access them as "friends".
You do this with an assembly declaration (usually in AssemblyInfo) like:
[assembly:InternalsVisibleTo("cs_friend_assemblies_2")]
See Friend Assemblies for more info.
Also make sure you obfuscate the assembly or people can dig into the code with Reflector.
Don't bother worrying too much about protecting your .NET code. If you deploy it to someone else's computer, and that person wants to use or read your code, they will.
If your code is valuable enough you need to keep it on a computer you control (such as a web server) and guard against unauthorised access.
Obfuscation will only slow determined people down. Strong naming and signing is not used to protect your code, but instead to ensure that the user can confirm the code originates from who they expect it to come from (ie ensure it hasn't been tampered with).

Do method names get compiled into the EXE?

Do class, method and variable names get included in the MSIL after compiling a Windows App project into an EXE?
For obfuscation - less names, harder to reverse engineer.
And for performance - shorter names, faster access.
e.g. So if methods ARE called via name:
Keep names short, better performance for named-lookup.
Keep names cryptic, harder to decompile.
Yes, they're in the IL - fire up Reflector and you'll see them. If they didn't end up in the IL, you couldn't build against them as libraries. (And yes, you can reference .exe files as if they were class libraries.)
However, this is all resolved once in JIT.
Keep names readable so that you'll be able to maintain the code in the future. The performance issue is unlikely to make any measurable difference, and if you want to obfuscate your code, don't do it at the source code level (where you're the one to read the code) - do it with a purpose-built obfuscator.
EDIT: As for what's included - why not just launch Reflector or ildasm and find out? From memory, you lose local variable names (which are in the pdb file if you build it) but that's about it. Private method names and private variable names are still there.
Yes, they do. I do not think that there will be notable performance gain by using shorter names. There is no way that gain overcomes the loss of readability.
Local variables are not included in MSIL. Fields, methods, classes etc are.
Variables are index based.
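One way to check this without a decompiler is plain reflection, sketched below. Sample, secretCounter, AddOne and localTotal are invented names. Member names come back from metadata even when private, while a method body exposes its locals only as a slot index and a type. How many local slots survive depends on compiler settings, so the sketch makes no claim about their count.

```csharp
using System;
using System.Linq;
using System.Reflection;

public class Sample
{
    private int secretCounter;

    private int AddOne(int input)
    {
        int localTotal = input + 1; // this name is kept only in the .pdb, if one is built
        secretCounter = localTotal;
        return secretCounter;
    }
}

public static class MetadataDemo
{
    private const BindingFlags All =
        BindingFlags.Instance | BindingFlags.Public |
        BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

    // Names of the members declared on Sample, private ones included.
    public static string[] MemberNames()
    {
        return typeof(Sample).GetMembers(All).Select(m => m.Name).ToArray();
    }

    // Locals are described by slot index and type only; there is no name to read.
    public static string[] LocalSlots()
    {
        MethodBody body = typeof(Sample).GetMethod("AddOne", All).GetMethodBody();
        return body.LocalVariables
            .Select(v => v.LocalIndex + ": " + v.LocalType)
            .ToArray();
    }
}
```

MemberNames() includes "secretCounter" and "AddOne", whereas nothing returned by LocalSlots() mentions "localTotal".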
Member names do get included in the IL whether they are private or public. In fact all of your code gets included too, and if you'd use Reflector, you can practically read all the source code of the application. What's left is debugging the app, and I think there might be tools for that.
You must ABSOLUTELY (and I can't emphasize it more) obfuscate your code if you're making packaged applications that have a number of clients and competition. Luckily there are a number of obfuscators available.
This is a major gripe that I have with .NET. Since MS is doing so much hard work on this, why not develop (or acquire) a professional obfuscator and make it a part of VS? Dotfuscator just doesn't cut it, at least not the Community version.
Keep names short, better performance for named-lookup.
How could this make any difference? I'm not sure how identifiers are looked up by the VM, but I'm pretty sure it's not doing a straight string comparison lookup. This would be the worst possible way to do it.
Keep names cryptic, harder to decompile.
To be honest, I don't think code obfuscation helps that much. Most competent developers out there have already developed a "sixth sense" to figure out things quickly even if identifiers like method names are totally unhelpful since very often the source code they need to maintain or improve already has these problems (I am talking about method names like "DoAllStuff()").
Anyway, security through obscurity is usually a bad idea.
If you are concerned about obfuscation check out .NET Reactor. I tested 8 different obfuscators and Reactor was not only the cheapest commercial one, it was the second best of the bunch (the best was the most expensive one, Dotfuscator Gold).
[EDIT]
Actually now that I think of it, if all you care about is obfuscating method names then the one that comes with VS.NET, Dotfuscator Community Edition, should work fine.
I think they're added, but the length of the name isn't going to affect anything, because of the way the function names are looked up. As for obfuscation, I think there are tools (Dotfuscator or something like that) that basically do exactly what you're saying.

In C# (or any language) what is/are your favourite way of removing repetition?

I've just coded a 700-line class. Awful. I hang my head in shame. It's as opposite to DRY as a British summer.
It's full of cut and paste with minor tweaks here and there. This makes it a prime candidate for refactoring. Before I embark on this, I thought I'd ask: when you have lots of repetition, what are the first refactoring opportunities you look for?
For the record, mine are probably using:
Generic classes and methods
Method overloading/chaining.
What are yours?
I like to start refactoring when I need to, rather than at the first opportunity that I get. You might say this is somewhat of an agile approach to refactoring. When do I feel I need to? Usually when I feel that the ugly parts of my code are starting to spread. I think ugliness is okay as long as it is contained, but the moment it starts to spread, that's when you need to take care of business.
The techniques you use for refactoring should start with the simplest. I would strongly recommend Martin Fowler's book. Combining common code into functions, removing unneeded variables, and other simple techniques get you a lot of mileage. For list operations, I prefer using functional programming idioms. That is to say, I use internal iterators, map, filter and reduce (in Python speak; there are corresponding things in Ruby, Lisp and Haskell) whenever I can. This makes code a lot shorter and more self-contained.
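Since the question is about C#, here is a small sketch of the same idioms using LINQ, where Where, Select and Aggregate play the roles of filter, map and reduce. The class ListIdioms and the method SumOfEvenSquares are made-up names for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ListIdioms
{
    // Sum of the squares of the even numbers, written as filter -> map -> reduce.
    public static int SumOfEvenSquares(IEnumerable<int> numbers)
    {
        return numbers
            .Where(n => n % 2 == 0)                // filter
            .Select(n => n * n)                    // map
            .Aggregate(0, (sum, n) => sum + n);    // reduce
    }
}
```

For the input 1, 2, 3, 4 this keeps 2 and 4, squares them to 4 and 16, and adds them up to 20, with no explicit loop or temporary list.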
#region
I made a 1,000-line class only one line long with it!
In all seriousness, the best way to avoid repetition is to use the things covered in your list, as well as to fully utilize polymorphism: examine your class and work out what would best be done in a base class, and how different components of it can be broken away as subclasses.
Sometimes by the time you "complete functionality" using copy and paste code, you've come to a point that it is maimed and mangled enough that any attempt at refactoring will actually take much, much longer than refactoring it at the point where it was obvious.
In my personal experience my favorite "way of removing repetition" has been the "Extract Method" functionality of Resharper (although this is also available in vanilla Visual Studio).
Many times I would see repeated code (some legacy app I'm maintaining) not as whole methods but in chunks within completely separate methods. That gives a perfect opportunity to turn those chunks into methods.
Monster classes also tend to reveal that they contain more than one functionality. That in turn becomes an opportunity to separate each distinct functionality into its own (hopefully smaller) class.
I have to reiterate that doing all of these is not a pleasurable experience at all (for me), so I really would rather do it right while it's a small ball of mud, rather than let the big ball of mud roll and then try to fix that.
First of all, I would recommend refactoring much sooner than when you are done with the first version of the class. Anytime you see duplication, eliminate it ASAP. This may take a little longer initially, but I think the results end up being a lot cleaner, and it helps you rethink your code as you go to ensure you are doing things right.
As for my favorite way of removing duplication.... Closures, especially in my favorite language (Ruby). They tend to be a really concise way of taking 2 pieces of code and merging the similarities. Of course (like any "best practice" or tip), this can not be blindly done... I just find them really fun to use when I can use them.
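The answer above is about Ruby, but the same idea can be sketched in C# with lambdas. In this made-up example (Closures and InRange are hypothetical names), two near-identical range checks collapse into one function that captures the values that differ.

```csharp
using System;

public static class Closures
{
    // Returns a function that remembers 'min' and 'max' (the captured variables).
    public static Func<int, bool> InRange(int min, int max)
    {
        return value => value >= min && value <= max;
    }
}
```

A caller might write var isValidPercent = Closures.InRange(0, 100); and var isValidAge = Closures.InRange(0, 130); instead of maintaining two hand-written copies of the comparison.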
One of the things I do, is try to make small and simple methods that I can see on a single page in my editor (visual studio).
I've learnt from experience that making code simple makes it easier for the compiler to optimise it. The larger the method, the harder the compiler has to work!
I've also recently seen a problem where large methods have caused a memory leak. Basically I had a loop very much like the following:
while (true)
{
    var smallObject = WaitForSomethingToTurnUp();
    var largeObject = DoSomethingWithSmallObject(smallObject);
}
I was finding that my application was keeping a large amount of data in memory: even though 'largeObject' was no longer needed while the loop waited for the next smallObject, the garbage collector could still see it.
I easily solved this by moving the 'DoSomethingWithSmallObject()' and other associated code to another method.
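A sketch of what that extraction might look like is below. Everything beyond the two method names from the loop above is assumed: the Worker class, the Handle method, the Processed counter, and a bounded loop in place of while (true) so the example terminates. Whether the original loop really keeps the object alive depends on the JIT and build settings, so treat this as an illustration of the refactoring rather than a guaranteed fix.

```csharp
using System;

public class Worker
{
    // Stand-in for the blocking WaitForSomethingToTurnUp() call.
    private readonly Func<string> waitForSomethingToTurnUp;

    public int Processed { get; private set; }

    public Worker(Func<string> waitForSomethingToTurnUp)
    {
        this.waitForSomethingToTurnUp = waitForSomethingToTurnUp;
    }

    // The loop now holds only the small object while it waits.
    public void Run(int iterations)
    {
        for (int i = 0; i < iterations; i++)
        {
            string smallObject = waitForSomethingToTurnUp();
            Handle(smallObject);
        }
    }

    // The large object is a local of this method, so nothing in Run
    // refers to it once Handle returns.
    private void Handle(string smallObject)
    {
        byte[] largeObject = DoSomethingWithSmallObject(smallObject);
        Processed += largeObject.Length > 0 ? 1 : 0;
    }

    private static byte[] DoSomethingWithSmallObject(string smallObject)
    {
        return new byte[1024 * (smallObject.Length + 1)];
    }
}
```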
Also, if you make small methods, your reuse within a class will become significantly higher. I generally try to make sure that none of my methods look like any others!
Hope this helps.
Nick
"cut and paste with minor tweaks here and there" is the kind of code repetition I usually solve with an entirely non-exotic approach: take the similar chunk of code and extract it into a separate method. The little bit that is different in every instance of that block of code becomes a parameter.
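A tiny made-up example of that approach (the Labels class and its methods are hypothetical): two pasted methods differ only in one word, so the shared chunk becomes one method and the word becomes a parameter.

```csharp
public static class Labels
{
    // Before: two copies that differ only in the dimension name.
    public static string WidthLabelBefore(int value)
    {
        return "Width: " + value + " px";
    }

    public static string HeightLabelBefore(int value)
    {
        return "Height: " + value + " px";
    }

    // After: the shared chunk is one method; the part that varied is a parameter.
    public static string DimensionLabel(string name, int value)
    {
        return name + ": " + value + " px";
    }
}
```

Once callers use DimensionLabel("Width", w) and DimensionLabel("Height", h), the two Before methods can be deleted.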
There's also some easy techniques for removing repetitive-looking if/else if and switch blocks, courtesy of Scott Hanselman:
http://www.hanselman.com/blog/CategoryView.aspx?category=Source+Code&page=2
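One common technique of this kind, which may or may not be exactly what the linked posts show, is to replace a long switch with a lookup table of delegates. The Shipping class, its method names and its rates below are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class Shipping
{
    // Each case of the former switch becomes one entry in the table.
    private static readonly Dictionary<string, Func<decimal, decimal>> CostByMethod =
        new Dictionary<string, Func<decimal, decimal>>
        {
            { "standard", weight => 5m },
            { "express",  weight => 5m + weight * 2m },
            { "pickup",   weight => 0m },
        };

    public static decimal Cost(string method, decimal weight)
    {
        Func<decimal, decimal> rule;
        if (!CostByMethod.TryGetValue(method, out rule))
        {
            throw new ArgumentException("Unknown shipping method: " + method);
        }
        return rule(weight);
    }
}
```

Adding a new case is then a one-line change to the table rather than another copied branch.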
I might go something like this:
Create custom (private) types for data structures and put all the related logic in there. Dictionary<string, List<int>> etc.
Make inner functions or properties that guarantee behaviour. If you’re continually checking conditions from a publicly accessible property then create a private getter method with all of the checking baked in.
Split apart methods that have too much going on. If you can’t describe a method succinctly or give it a good name, then start breaking the function apart until you can (even if these “child” functions aren’t used anywhere else).
If all else fails, slap a [SuppressMessage("Microsoft.Maintainability", "CA1502:AvoidExcessiveComplexity")] on it and comment why.
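For that last resort, a sketch of how the attribute might be applied is below. LegacyParser, Classify and the justification text are hypothetical; the attribute itself lives in System.Diagnostics.CodeAnalysis, and its Justification property is a convenient place for the "comment why" part.

```csharp
using System.Diagnostics.CodeAnalysis;

public static class LegacyParser
{
    [SuppressMessage("Microsoft.Maintainability", "CA1502:AvoidExcessiveComplexity",
        Justification = "Hypothetical example: mirrors a vendor state table; splitting it would hide the mapping.")]
    public static int Classify(int code)
    {
        // Imagine many more branches here.
        if (code < 0) return -1;
        if (code == 0) return 0;
        return 1;
    }
}
```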
