Are there similar tools to Clone Detective for other languages/IDEs? - c#

I just saw Clone Detective linked on YCombinator news, and the idea heavily appeals to me. It seems like it would be useful for many languages, not just C#, but I haven't seen anything similar elsewhere.
Edit: For those who don't want to follow the link, Clone Detective scans the codebase for duplicate code that may warrant refactoring to minimize duplication.

Java has a few - some of the most popular static analysis tools have this built in along with many other useful rules.
Ones I have used, in the (purely subjective) order that I was happiest with:
PMD - comes with CPD - their copy and paste detector
Checkstyle - specific rules to look for duplicate code
Findbugs - the daddy of all Java static analysis tools. Includes duplicate code detection, along with just about anything else that you can think of, but quite resource intensive
There are some nice IDE plugins for all of these and many other reporting tools (for example, you can see results on a Hudson continuous build server, or your project's Maven site).
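To make the idea behind copy-and-paste detectors concrete, here is a minimal sketch in Python. It is a line-based simplification written for illustration, not how CPD or any of the tools above actually work; the function name `find_duplicate_blocks` and the `window` parameter are assumptions of this example.

```python
from collections import defaultdict

def find_duplicate_blocks(files, window=3):
    """Return {block: [(file, start_line), ...]} for blocks seen more than once.

    `files` maps a file name to its source text. A block is `window`
    consecutive non-blank lines, compared after stripping whitespace.
    """
    seen = defaultdict(list)
    for name, text in files.items():
        # Keep (line_number, normalized_line) pairs, skipping blank lines.
        lines = [(i, ln.strip())
                 for i, ln in enumerate(text.splitlines(), 1)
                 if ln.strip()]
        # Slide a fixed-size window over the normalized lines.
        for k in range(len(lines) - window + 1):
            block = tuple(ln for _, ln in lines[k:k + window])
            seen[block].append((name, lines[k][0]))
    # Only blocks that occur at more than one location are duplicates.
    return {b: locs for b, locs in seen.items() if len(locs) > 1}
```

Real detectors usually compare token streams rather than raw lines, so that renamed variables or reformatted code still match; this sketch only catches copies that are identical up to whitespace.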

The IntelliJ IDE (Java, Scala, Ruby, ...) has a Locate Duplicate... tool. Useful indeed!

Related

Track Data Input Through Application Code and System Libraries

I am a security dude, and I have done extensive research on this one, and at this point I am looking for guidance on where to go next.
Also, sorry for the long post, I bolded the important parts.
What I am trying to do at a high level is simple:
I am trying to input some data into a program, and "follow" this data, and track how it's processed, and where it ends up.
For example, if I input my login credentials to FileZilla, I want to track every memory reference that accesses them, and initiate traces to follow where that data went, which libraries it was sent to, and bonus points if I can even correlate it down to the network packet.
Right now I am focusing on the Windows platform, and I think my main question comes down to this:
Are there any good APIs to remote control a debugger that understand Windows forms and system libraries?
Here are the key attributes I have found so far:
The name of this analysis technique is "Dynamic Taint Analysis"
It's going to require a debugger or a profiler
Inspect.exe is a useful tool to find Windows UI elements that take input
The Windows automation framework in general may be useful
Automating debuggers seems to be a pain. The IDebugClient interface allows for richer data, but debuggers like IDA Pro or even CheatEngine have better memory analysis utilities
I am going to need to place memory break points, and track the references and registers that are associated with the input.
Here is a collection of tools I have tried:
I have played with all the following tools: WinDBG (awesome tool), IDA Pro, CheatEngine, x64dbg, vdb (python debugger), Intel's PIN, Valgrind, etc...
Next, I tried a few Dynamic Taint Analysis tools, but they don't support detecting .NET components or the other conveniences that the Windows debugging framework provides natively through utilities like Inspect.exe:
https://github.com/wmkhoo/taintgrind
http://bitblaze.cs.berkeley.edu/temu.html
I then tried writing my own C# program using the IDebugClient interface, but it's poorly documented, and the best project I could find was from this fellow, and is 3 years old:
C# app to act like WINDBG's "step into" feature
I am willing to contribute code to an existing project that fits this use case, but at this point I don't even know where to start.
I feel like, as a whole, dynamic program analysis and debugging tools could use some love... I feel kind of stuck, and don't know where to move from here. There are so many different tools and approaches to solving this problem, and all of them are lacking in some manner or another.
Anyway, I appreciate any direction or guidance. If you made it this far thanks!!
-Dave
If you insist on doing this at runtime, Valgrind or Pin might be your best bet. As I understand it (having never used it), you can configure these tools to interpret each machine instruction in an arbitrary way. You want to trace dataflows through machine instructions to track tainted data (reads of such data, followed by writes to registers or condition code bits). A complication will likely be tracing the origin of an offending instruction back to a program element (DLL? Link module? Named subroutine?) so that you can complain appropriately.
This is a task you might succeed at doing as an individual in terms of effort.
This should work for applications.
I suspect one of your problems will be tracing where the data goes in the OS. That's a lot harder, although the same principle applies; your difficulty will be getting the OS supplier to let you track instructions executed in the OS.
Doing this as runtime analysis has the downside that if a malicious application doesn't do anything bad on your particular execution, you won't find any problems. That's the classic shortcoming of dynamic analysis.
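The propagation rule described above (a write becomes tainted when any value it reads is tainted) can be sketched as a toy interpreter in Python. This is an illustration only: the instruction format and the op names `input`, `const`, `mov` and `add` are made up for this example and are not the Pin or Valgrind API.

```python
def run_with_taint(program, tainted_inputs):
    """Interpret a toy program and return the set of tainted locations.

    Each instruction is a tuple (op, dest, *sources). Hypothetical ops:
      "input": dest receives external data; tainted if dest is in tainted_inputs
      "const": dest receives a constant, which clears any taint on dest
      "mov" / "add": dest becomes tainted if any source is tainted
    """
    tainted = set()
    for op, dest, *srcs in program:
        if op == "input":
            is_tainted = dest in tainted_inputs
        elif op == "const":
            is_tainted = False
        elif op in ("mov", "add"):
            is_tainted = any(s in tainted for s in srcs)
        else:
            raise ValueError(f"unknown op: {op}")
        # Overwriting a location replaces its previous taint status.
        if is_tainted:
            tainted.add(dest)
        else:
            tainted.discard(dest)
    return tainted
```

A real tool has to apply the same rule to every machine instruction, including memory operands and condition flags, which is where most of the engineering effort goes.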
You could consider tracking the data at the source code level using classic compiler techniques. This requires that you have access to all the source code that might be involved (that's actually really hard if your application depends on a wide variety of libraries), that you have tools that can parse and track dataflows through source modules, and that these tools talk to each other for different languages (assembler, C, Java, SQL, HTML, even CSS...).
As static analysis, this has the chance of detecting an undesired dataflow no matter which execution occurs. Turing limitations mean that you likely cannot detect all such issues. That's the shortcoming of static analysis.
Building your own tools, or even integrating individual ones, to do this is likely outside what you can reasonably do as an individual. You'll need to find a uniform framework for building such tools. [Check my bio for one].

Semantic analysis in C#

What tools, if any, are there for writing C# code that does semantic analysis? I am interested in detecting synonyms; for example, if there is a sentence with the word K9 in it, the tool would recognize that K9 means dog.
What you're looking for is a Natural Language Processing (NLP) tool. There are a few tools around that could be of some help, such as SharpNLP, but I'm not aware of a specific tool for detecting and replacing synonyms.
A better route these days would be to use ikvm along with something like opennlp. SharpNLP, mentioned already, is from 2006, is grossly outdated and is dead. There was one effort to revive it, but that stalled. You likely will not see any more pure .net solutions because one can use ikvm to access the already existing, mature NLP projects.
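Independent of which NLP library is chosen, the K9-to-dog lookup from the question boils down to mapping words onto a canonical form. A minimal sketch in Python, assuming a small hand-written table (the `SYNONYMS` entries and the function name are hypothetical; a real system would load a thesaurus rather than hard-code one):

```python
import re

# Hypothetical hand-written table, for illustration only.
SYNONYMS = {"k9": "dog", "canine": "dog", "hound": "dog", "feline": "cat"}

def normalize_synonyms(sentence, table=SYNONYMS):
    """Replace each known synonym with its canonical word; keep other text as-is."""
    def swap(match):
        word = match.group(0)
        # Case-insensitive lookup; unknown words are returned unchanged.
        return table.get(word.lower(), word)
    return re.sub(r"[A-Za-z0-9]+", swap, sentence)
```

A table lookup like this cannot tell word senses apart (for example, "hound" used as a verb), which is the kind of problem a full NLP toolkit is meant to address.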

Does Visual Studio for C# have these Eclipse for Java features?

So far I have developed in Java. Java is multi-platform (now works on Android!), has a very powerful VM and is open, well behaved, etc. But it is also old and seems to be frozen in time in terms of language features. Scala and Gosu are nice replacements, but I don't like Scala's syntax, and Gosu is very immature and unlikely to win out over Scala. All this makes me think about moving to C#, at least for web development! Phew!
One thing that is quite important to me is IDE support. Right now I use Eclipse for Java, and my favorite features are these (most important first, somewhat):
Full code navigation (call hierarchy, show variable reads & writes, inherited members).
Incremental compilation (which means fast compilation).
Many kinds of errors are detected and underlined before compilation.
Many intelligent quick-fixes (can fix/write a lot of code for you and quickly rename elements and refactor references).
Intelligent and configurable code completion. Display hints even for unimported packages/classes.
Over 15 kinds of refactorings, all of them very useful.
Over 15 options for source generation (add unimplemented methods, generate getters and setters, generate delegates).
Configurable code formatter, even for code fragments (select code then format).
Debugger supports hot code replacement and "Drop to Frame" so I can go back and check other things without a full program startup.
Code cleanups (remove unnecessary parentheses, remove unnecessary "this" references, etc).
Very decent, autonomous and seamless CVS integration, with integrated file comparison and computer-aided merge.
Very nice tools for web development (server deployment, JavaScript and HTML editor with formatter).
Tons of plugins (code coverage analyser, memory dump analyser, eGIT).
Which of these features are available in Visual Studio for C#/ASP.NET? If I can get some by adding a cheap plugin, please tell.
Well, I like Visual Studio a lot more than Eclipse. I have only used Eclipse for minimal Java programming and ActionScript. Visual Studio can do everything you listed, and if you combine it with a paid plugin like ReSharper or CodeRush, you get a lot more. Why don't you download Visual Studio Express and play around with it? That would be much better than getting an answer from a very biased C# dev.
VS2010 with the addition of ReSharper has most of these things. I can't speak to CVS, but it does have fine SVN integration.
I started using VS about six months ago after a few years of Eclipse, and it works pretty well.
Nothing is cheap in the VS world compared to Eclipse.

ReSharper 5.0 VS CodeRush 10.1 - Specific Feature Comparison

I'm deciding between ReSharper 5.0 and CodeRush 10.1. I've seen a lot of questions that target which one is better/faster than the other. For example, ReSharper may be a little slower than CodeRush when working with large projects. What I am looking for is a list of which features are completely, outright missing in the opponent.
An example would be that ReSharper has an IntelliSense replacement, which CodeRush leaves to Visual Studio.
What features does one have that the other is missing, assuming performance/speed and learning curve are non-issues?
(I'm the ReSharper Product Manager, so take this with care.)
For me, specific features are not what matters most here: you can make up any numbers on comparison charts by how you categorize things, e.g. you can count formatting actions as refactorings. Also, some missing features here and there can be complemented with other plugins, whether from the tool's own ecosystem or the VS ecosystem.
What I really find important for any tool of this kind is deep and accurate code understanding. Some tools, unfortunately, are not that accurate. Every product has bugs and issues, of course, but not being able to accurately resolve symbols is a show-stopper for me. Maybe not for other people, who value fast typing over accurate analysis and refactoring. Other tools in this area sometimes cannot even parse C# code, not to mention resolving generic overloaded methods with lambdas and correctly supporting LINQ patterns.
As for a complete, direct and independent feature list comparison, I don't think there is one. I believe every product has a comparison chart with Visual Studio (ReSharper does), so you can combine them, and then clarify specific points with the community and users of the corresponding products.
I am personally using and loving both, DevExpress CodeRush Express for VS (free) and ReSharper (open source license).
http://www.devexpress.com/Products/Visual_Studio_Add-in/CodeRushX/
http://www.jetbrains.com/resharper/buy/opensource_license.html
Honestly speaking, I want both, though they do have some overlaps, such as refactoring and code analysis. But each has a lot of unique features, such as structure highlighting (CR). And to some extent, installing both consumes more system resources. However, the benefits from both products are huge.
Always use a powerful machine (or machines) as your development environment, and it can overcome a lot of pains in the near future :)
Cards on Table: I'm a huge CodeRush fan and a member of DX-Squad (Which means I help out on the DX forums)
As you might imagine, I use CodeRush quite a bit and I have a reasonably good knowledge of what is, and what isn't, possible using the current DX toolset. I think, however, that there are few who are particularly well versed in both products.
As for what might be missing from each. Typically this doesn't bother me... If I need something not supplied with CodeRush, I tend to build it myself :)
The DXCore (free framework on which CodeRush and CodeRush XPress are built) makes this very easy. Feel free to head over to our community site and take a look.

C# Call Graph Generation Tool

I just got a heaping pile of (mostly undocumented) C# code and I'd like to visualize its structure before I dive in and start refactoring. I've done this in the past (in other languages) with tools that generate call graphs.
Can you recommend a good tool for facilitating the discovery of structure in C#?
UPDATE
In addition to the tools mentioned here I've seen (through the tubes) people say that .NET Reflector and CLR Profiler have this functionality. Any experience with these?
NDepend is pretty good at this. Additionally, Visual Studio 2008 Team System has a bunch of features that allow you to keep track of cyclomatic complexity, but it's much more basic than NDepend. (Run code analysis.)
Concerning NDepend, it can produce a usable call graph.
The call graph can be made clearer by grouping its methods by parent classes, namespaces or projects.
Find more explanations about NDepend call graph here.
It's a bit late, but http://sequenceviz.codeplex.com/ is an awesome tool that shows the caller graph/sequence diagram. The diagrams are generated by reverse engineering .NET assemblies.
As of today (June 2017), the best tool in class is ReSharper's Inspect feature. It allows you to find all incoming calls, outgoing calls, value origin/destination, etc.
The best part of ReSharper, compared to other tools mentioned above: it's less buggy.
I've used doxygen with some success. It's a little confusing, but free and it works.
Visual Studio 2010.
Plus, on a method-by-method basis - Reflector (Analyzer (Ctrl+R); "Depends On" and "Used By")
SequenceViz and DependencyStructureMatrix for Reflector might help you out: http://www.codeplex.com/reflectoraddins
I'm not sure if it will do it over just source code, but ANTS Profiler will produce a call graph for a running application (may be more useful anyway).
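For readers unfamiliar with what these tools compute: a call graph maps each function to the functions it calls. The sketch below builds one from source text using Python's standard `ast` module. It analyzes Python rather than C# purely to keep the example short and self-contained, and it is not how NDepend, Reflector or the other tools mentioned here are implemented; the name `build_call_graph` is an assumption of this example.

```python
import ast
from collections import defaultdict

def build_call_graph(source):
    """Map each top-level function name to the sorted names it calls directly.

    Only simple calls of the form `name(...)` are recorded; method calls and
    calls through variables are ignored in this sketch.
    """
    graph = defaultdict(set)
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            graph[node.name]  # make sure functions with no calls still appear
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    graph[node.name].add(inner.func.id)
    return {caller: sorted(callees) for caller, callees in graph.items()}
```

A static graph like this only sees calls written out in the source; a profiler-based graph, as with ANTS, instead records the calls that actually happen during one run.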
