How to include keywords and aliases in Roslyn recommended symbols?

I am using Roslyn to create a C# scripting control with IntelliSense.
I am generally very happy with the results. However, the recommended symbols don't include keywords such as for and if, and they don't include type aliases such as int, even though Int32 is included.
More specifically, I am using Microsoft.CodeAnalysis.Recommendations, that is:
Recommender.GetRecommendedSymbolsAtPositionAsync(mySemanticModel, scriptPosition, myAdhocWorkspace);
My SemanticModel object is obtained from a C# compilation which always has a reference to mscorlib.dll at the very least.
At all positions in my script, the recommended completions are always correct. However, I would argue that they are incomplete if they are missing keywords such as if, else and for etc.
I can see that it would be easy for me to include common type aliases in my IntelliSense manually. That is, if Int32 is a possible completion, then I could manually add int.
However, it is less obvious when an if statement or a for statement or even is/as would be appropriate in the given scope.
Is there a way to include these keywords when getting the recommended symbols this way?
Is there also a way to automatically include type aliases?

It seems that Recommender.GetRecommendedSymbolsAtPositionAsync provides only symbol completions, that is, methods, types and so on (ISymbol implementations).
If you want keyword or snippet completions, you can use Microsoft.CodeAnalysis.Completion.CompletionService:
void CompletionExample()
{
var code = @"using System;
namespace NewConsoleApp
{
class NewClass
{
void Method()
{
fo // I want to get 'for' completion for this
}
}
}";
var completionIndex = code.LastIndexOf("fo") + 2;
// Assume you have a method that creates a workspace for you
var workspace = CreateWorkspace("newSln", "newProj", code);
var doc = workspace.CurrentSolution.Projects.First().Documents.First();
var service = CompletionService.GetService(doc);
var completionItems = service.GetCompletionsAsync(doc, completionIndex).Result.Items;
foreach (var result in completionItems)
{
Console.WriteLine(result.DisplayText);
Console.WriteLine(string.Join(",", result.Tags));
Console.WriteLine();
}
}
You can experiment with it to work out how to customize it for your needs (rules, filters).
Notice that each result comes from a specific completion provider (item.Properties["Provider"]), and you can create a custom CompletionProvider (or at least you should be able to).
You can also take a look at C# for VS Code (which is powered by OmniSharp) to see how they did it.
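On the type-alias half of the question, one possible approach (my assumption, not something the answer above covers) is to let Roslyn's symbol display produce the alias instead of keeping a manual Int32-to-int table. SymbolDisplayFormat.MinimallyQualifiedFormat enables the UseSpecialTypes option, so special types should be rendered with their C# keyword. A minimal sketch, where the AliasDemo class and the throwaway compilation are hypothetical:

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

static class AliasDemo
{
    static void Main()
    {
        // Throwaway compilation that references only the core library (hypothetical setup).
        var compilation = CSharpCompilation.Create(
            "AliasDemo",
            references: new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });

        var specialTypes = new[]
        {
            SpecialType.System_Int32,
            SpecialType.System_String,
            SpecialType.System_Boolean
        };

        foreach (var specialType in specialTypes)
        {
            INamedTypeSymbol symbol = compilation.GetSpecialType(specialType);

            // symbol.Name is the metadata name; the display string uses the C# keyword form.
            string displayName = symbol.ToDisplayString(SymbolDisplayFormat.MinimallyQualifiedFormat);
            Console.WriteLine($"{symbol.Name} -> {displayName}");
        }
    }
}
```

In an IntelliSense list, the same ToDisplayString call could be applied to each recommended ITypeSymbol, adding the result as an extra entry whenever it differs from symbol.Name.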

Related

Find all method calls for a specific method using Roslyn

I am working on a code analyser using Roslyn and my current task is to find all internal methods which are unused in the assembly.
I start with a MethodDeclarationSyntax and get the symbol from that. I then use the FindCallersAsync method in SymbolFinder, but it returns an empty collection even when I am making a call to the method in question somewhere in the assembly. See the code below.
protected override void Analyze(SyntaxNodeAnalysisContext context)
{
NodeToAnalyze = context.Node;
var methodDeclaration = NodeToAnalyze as MethodDeclarationSyntax;
if (methodDeclaration == null)
return;
var methodSymbol = context.SemanticModel.GetDeclaredSymbol(methodDeclaration) as ISymbol;
if (methodSymbol.DeclaredAccessibility != Accessibility.Internal)
return;
var solutionPath = GetSolutionPath();
var msWorkspace = MSBuildWorkspace.Create();
var solution = msWorkspace.OpenSolutionAsync(solutionPath).Result;
var callers = SymbolFinder.FindCallersAsync(methodSymbol, solution).Result; // Returns empty collection.
...
}
I have seen similar code here, but in that example the method symbol is obtained using GetSymbolInfo on an InvocationExpressionSyntax:
//Get the syntax node for the first invocation to M()
var methodInvocation = doc.GetSyntaxRootAsync().Result.DescendantNodes().OfType<InvocationExpressionSyntax>().First();
var methodSymbol = model.GetSymbolInfo(methodInvocation).Symbol;
//Finds all references to M()
var referencesToM = SymbolFinder.FindReferencesAsync(methodSymbol, doc.Project.Solution).Result;
However, in my case, I need to find the invocations (if any) from a declaration. If I do get the invocation first and pass in the symbol from GetSymbolInfo the calls to the method are returned correctly - so the issue seems to be with the symbol parameter and not solution.
Since I am trying to get the underlying symbol of a declaration, I cannot use GetSymbolInfo, but use GetDeclaredSymbol instead (as suggested here).
My understanding from this article is that the symbols returned from GetDeclaredSymbol and GetSymbolInfo should be the same. However, a simple comparison using Equals returns false.
Does anyone have any idea of what the difference is between the two symbols returned and how I can get the 'correct' one which works? Or perhaps there is a better approach entirely? All my research seems to point to FindCallersAsync, but I just can't get it to work.

My understanding from this article is that the symbols returned from GetDeclaredSymbol and GetSymbolInfo should be the same. However, a simple comparison using Equals returns false.
This is because they're not the same symbol; they come from two entirely separate compilations, which may or may not have the same contents. One comes from the compiler that is actively compiling; the other comes from MSBuildWorkspace.
Fundamentally, using MSBuildWorkspace in an analyzer is unsupported. Completely. Don't do that. Not only would that be really slow, but it also has various correctness issues, especially if you're running your analyzer in Visual Studio. If your goal is to find unused methods anywhere in a solution, that's something we don't really support implementing as an analyzer either, since that involves cross-project analysis.

Can Roslyn be used to generate dynamic method similar to DynamicMethod IL generation

I have been using DynamicMethod to generate the IL using
method.GetILGenerator();
This works well but is of course very hard to use, since you generally don't want to work with low-level IL in a high-level language like C#. Now that there is Roslyn, I thought I could use that instead. I have tried to figure out how to use Roslyn to do a similar thing: generate a dynamic method and then create a delegate for it. The only way I was able to do that is to have a full class like this:
SyntaxTree syntaxTree = CSharpSyntaxTree.ParseText(@"
using System;
namespace RoslynCompileSample
{
public class Writer
{
public void Write(string message)
{
Console.WriteLine(message);
}
}
}");
Then, instead of the Write method, I can insert my own method using string concatenation. After that, a dynamic assembly is generated in memory and loaded, and reflection is used to get the required method and generate the delegate.
This approach seems to work fine but is a bit of an overkill for my case, as I will need to use multiple independent methods, possibly leading to lots of assemblies being loaded.
So the question is: is there an easy way to do something similar to a dynamic method with Roslyn, so that I only define the body of a method attached to a type? If not, is there any big drawback to compiling many dynamic assemblies (for example, a limit on how many can be loaded)?

You can use the CSharpScript class. await CSharpScript.EvaluateAsync("1 + 2") just evaluates the expression. You can find it in the Microsoft.CodeAnalysis.Scripting.CSharp package (currently only a prerelease version). Add usings and assembly references using ScriptOptions (the second parameter).
Compile an expression to a delegate:
var func = CSharpScript.Create<int>("1 + 3").CreateDelegate();
Passing something to the function using globals object:
await CSharpScript.Create<int>("1 + x",
ScriptOptions.Default.AddReferences(typeof(Program).Assembly),
globalsType: typeof(ScriptGlobals))
.CreateDelegate()
.Invoke(new ScriptGlobals() { x = 4 });
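The snippet above does not show the ScriptGlobals type. The following is a minimal, self-contained sketch of the same idea, assuming a released version of the Roslyn C# scripting package; ScriptGlobals and ScriptDemo are hypothetical names, and the delegate is kept in a variable so it can be invoked more than once without recompiling the script.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;

// Hypothetical globals type; its public members are visible to the script as plain names.
public class ScriptGlobals
{
    public int x;
}

public static class ScriptDemo
{
    public static async Task Main()
    {
        // Create the script once, telling Roslyn which type provides the globals.
        var script = CSharpScript.Create<int>(
            "1 + x",
            ScriptOptions.Default.AddReferences(typeof(ScriptGlobals).Assembly),
            globalsType: typeof(ScriptGlobals));

        // Compile to a reusable delegate.
        ScriptRunner<int> runner = script.CreateDelegate();

        // Each invocation takes its own globals instance.
        Console.WriteLine(await runner(new ScriptGlobals { x = 4 }));
        Console.WriteLine(await runner(new ScriptGlobals { x = 10 }));
    }
}
```

Reusing one runner avoids paying the compilation cost per call, which seems relevant to the concern about many independent dynamic methods.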

I have one more idea for solving your problem that does not use Roslyn at all. You said it's annoying to emit IL using ILGenerator. However, the .NET Framework has built-in semantic trees that can be compiled to dynamic methods. They live in the System.Linq.Expressions namespace and are also used in LINQ providers.
var parameter = Expression.Parameter(typeof(int), "a"); // define parameter
var body = Expression.Add(parameter, Expression.Constant(42)); // sum parameter and number
var lambdaExpression = Expression.Lambda<Func<int, int>>(body, parameter); // define method: body first, then parameters
var add42Delegate = lambdaExpression.Compile(); // compile to dynamic method
You can do almost anything with it. It's much more comfortable than ILGenerator and is included in the standard library.
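To illustrate that this scales beyond a single parameter, here is a small sketch that builds (a, b) => a * b + 1 by hand and compiles it. The class name ExpressionDemo is hypothetical; only System.Linq.Expressions from the base class library is used.

```csharp
using System;
using System.Linq.Expressions;

static class ExpressionDemo
{
    static void Main()
    {
        // Two int parameters of the generated method.
        ParameterExpression a = Expression.Parameter(typeof(int), "a");
        ParameterExpression b = Expression.Parameter(typeof(int), "b");

        // Body: (a * b) + 1
        BinaryExpression body = Expression.Add(
            Expression.Multiply(a, b),
            Expression.Constant(1));

        // Expression.Lambda takes the body first, then the parameters.
        Expression<Func<int, int, int>> lambda =
            Expression.Lambda<Func<int, int, int>>(body, a, b);

        // Compile() emits a dynamic method behind the scenes and wraps it in a delegate.
        Func<int, int, int> compiled = lambda.Compile();

        Console.WriteLine(compiled(6, 7)); // 43
    }
}
```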

I'd like to comment on exyi's answer with Expression and Func<int,int>, but I don't have enough reputation. So here comes my "answer" instead.
If all you need is a first-class piece of code that you can execute with parameters, you can simply create the lambda like this:
Func<int, int> add42 = number => number + 42;
// Called like this:
int theNumber46 = add42.Invoke(4);
If you need to have the actual expression tree, there is a neat shortcut as well:
Expression<Func<int, int>> add42 = number => number + 42;
// Called like this:
int theNumber46 = add42.Compile().Invoke(4);
The only difference in code is that you wrapped the Func<int,int> in an Expression<..>. The conceptual difference is that a lambda (a Func<> in this example, but there are other delegate types as well) can be executed as is, whereas an Expression<> needs to be compiled first with the Compile() method. But the Expression holds information about the syntax tree and can therefore be used by IQueryable data providers such as the one in Entity Framework.
So it all depends on what you want to do with your dynamic method/lambda/delegate.

Is there a way to implement custom language features in C#?

I've been puzzling about this for a while and I've looked around a bit, unable to find any discussion about the subject.
Let's assume I wanted to implement a trivial example, like a new looping construct: do..until
Written very similarly to do..while
do {
//Things happen here
} until (i == 15)
This could be transformed into valid csharp by doing so:
do {
//Things happen here
} while (!(i == 15))
This is obviously a simple example, but is there any way to add something of this nature? Ideally as a Visual Studio extension to enable syntax highlighting etc.

Microsoft offers the Roslyn API as an implementation of the C# compiler with a public API. It contains individual APIs for each of the compiler pipeline stages: syntax analysis, symbol creation, binding, MSIL emission. You can provide your own implementation of the syntax parser or extend the existing one in order to get a C# compiler w/ any features you would like.
Roslyn CTP
Let's extend C# language using Roslyn! In my example I'm replacing do-until statement w/ corresponding do-while:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Roslyn.Compilers.CSharp;
namespace RoslynTest
{
class Program
{
static void Main(string[] args)
{
var code = @"
using System;
class Program {
public void My() {
var i = 5;
do {
Console.WriteLine(""hello world"");
i++;
}
until (i > 10);
}
}
";
//Parsing input code into a SyntaxTree object.
var syntaxTree = SyntaxTree.ParseCompilationUnit(code);
var syntaxRoot = syntaxTree.GetRoot();
//Here we will keep all nodes to replace
var replaceDictionary = new Dictionary<DoStatementSyntax, DoStatementSyntax>();
//Looking for do-until statements in all descendant nodes
foreach (var doStatement in syntaxRoot.DescendantNodes().OfType<DoStatementSyntax>())
{
//Until token is treated as an identifier by C# compiler. It doesn't know that in our case it is a keyword.
var untilNode = doStatement.Condition.ChildNodes().OfType<IdentifierNameSyntax>().FirstOrDefault((_node =>
{
return _node.Identifier.ValueText == "until";
}));
//Condition is treated as an argument list
var conditionNode = doStatement.Condition.ChildNodes().OfType<ArgumentListSyntax>().FirstOrDefault();
if (untilNode != null && conditionNode != null)
{
//Let's replace identifier w/ correct while keyword and condition
var whileNode = Syntax.ParseToken("while");
var condition = Syntax.ParseExpression("(!" + conditionNode.GetFullText() + ")");
var newDoStatement = doStatement.WithWhileKeyword(whileNode).WithCondition(condition);
//Accumulating all replacements
replaceDictionary.Add(doStatement, newDoStatement);
}
}
syntaxRoot = syntaxRoot.ReplaceNodes(replaceDictionary.Keys, (node1, node2) => replaceDictionary[node1]);
//Output preprocessed code
Console.WriteLine(syntaxRoot.GetFullText());
}
}
}
///////////
//OUTPUT://
///////////
// using System;
// class Program {
// public void My() {
// var i = 5;
// do {
// Console.WriteLine("hello world");
// i++;
// }
//while(!(i > 10));
// }
// }
Now we can compile updated syntax tree using Roslyn API or save syntaxRoot.GetFullText() to text file and pass it to csc.exe.

The big missing piece is hooking into the pipeline; otherwise you're not much further along than what .Emit provided. Don't misunderstand, Roslyn brings a lot of great things, but for those of us who want to implement preprocessors and metaprogramming, it seems for now that was not on the plate. You can implement "code suggestions", or what they call "issues"/"actions", as an extension, but this is basically a one-off transformation of code that acts as a suggested inline replacement and is not the way you would implement a new language feature. This is something you could always do with extensions, but Roslyn makes the code analysis/transformation tremendously easier.
From what I've read of comments from Roslyn developers on the CodePlex forums, providing hooks into the pipeline has not been an initial goal. All of the new C# language features they've provided in the C# 6 preview involved modifying Roslyn itself. So you'd essentially need to fork Roslyn. They have documentation on how to build Roslyn and test it with Visual Studio. Forking Roslyn and having Visual Studio use it would be a heavy-handed approach. I say heavy-handed because anyone who wants to use your new language features must then replace the default compiler with yours. You can see where this would begin to get messy.
Building Roslyn and replacing Visual Studio 2015 Preview's compiler with your own build
Another approach would be to build a compiler that acts as a proxy to Roslyn. There are standard APIs for building compilers that VS can leverage. It's not a trivial task though. You'd read in the code files, call upon the Roslyn APIs to transform the syntax trees and emit the results.
The other challenge with the proxy approach is going to be getting intellisense to play nicely with any new language features you implement. You'd probably have to have your "new" variant of C#, use a different file extension, and implement all the APIs that Visual Studio requires for intellisense to work.
Lastly, consider the C# ecosystem, and what an extensible compiler would mean. Let's say Roslyn did support these hooks, and it was as easy as providing a NuGet package or a VS extension to support a new language feature. All of your C# leveraging the new do-until feature is essentially invalid C#, and will not compile without the use of your custom extension. If you go far enough down this road with enough people implementing new features, very quickly you will find incompatible language features. Maybe someone implements a preprocessor macro syntax, but it can't be used alongside someone else's new syntax because they happened to use similar syntax to delineate the beginning of the macro. If you leverage a lot of open source projects and find yourself digging into their code, you would encounter a lot of strange syntax that would require you to sidetrack and research the particular language extensions that project is leveraging. It could be madness. I don't mean to sound like a naysayer, as I have a lot of ideas for language features and am very interested in this, but one should consider the implications of this, and how maintainable it would be. Imagine if you got hired to work somewhere and they had implemented all kinds of new syntax that you had to learn; without those features having been vetted the same way C#'s features have, you can bet some of them would not be well designed or implemented.

You can check www.metaprogramming.ninja (I am the developer), it provides an easy way to accomplish language extensions (I provide examples for constructors, properties, even js-style functions) as well as full-blown grammar based DSLs.
The project is open source as well. You can find documentation, examples, etc. on GitHub.
Hope it helps.

You can't create your own syntactic abstractions in C#, so the best you can do is to create your own higher-order function. You could create an Action extension method:
public static void DoUntil(this Action act, Func<bool> condition)
{
do
{
act();
} while (!condition());
}
Which you can use as:
int i = 1;
new Action(() => { Console.WriteLine(i); i++; }).DoUntil(() => i == 15);
although it's questionable whether this is preferable to using a do..while directly.

I found the easiest way to extend the C# language is to use the T4 text processor to preprocess my source. The T4 Script would read my C# and then call a Roslyn based parser, which would generate a new source with custom generated code.
During build time, all my T4 scripts would be executed, thus effectively working as an extended preprocessor.
In your case, the non-compliant C# code could be entered as follows:
#if ExtendedCSharp
do
#endif
{
Console.WriteLine("hello world");
i++;
}
#if ExtendedCSharp
until (i > 10);
#endif
This would allow syntax checking the rest of your (C# compliant) code during development of your program.

No, there is no way to achieve what you're talking about.
What you're asking for is defining a new language construct, which means new lexical analysis, a new language parser, a new semantic analyzer, and compilation and optimization of the generated IL.
What you can do in such cases is use some macros/functions:
public bool Until(int val, int check)
{
return !(val == check);
}
and use it like
do {
//Things happen here
} while (Until(i, 15))

Is there a programmatic way to identify C# reserved words?

I'm looking for a function like
public bool IsAReservedWord(string TestWord)
I know I could roll my own by grabbing a reserved word list from MSDN. However, I was hoping there was something built into either the language or .NET reflection that could be relied upon, so I wouldn't have to revisit the function when I move to newer versions of C#/.NET.
The reason I'm looking for this is I'm looking for a safeguard in .tt file code generation.

CSharpCodeProvider cs = new CSharpCodeProvider();
var test = cs.IsValidIdentifier("new"); // returns false
var test2 = cs.IsValidIdentifier("new1"); // returns true

The Microsoft.CSharp.CSharpCodeGenerator has an IsKeyword(string) method that does exactly that. However, the class is internal, so you have to use reflection to access it and there's no guarantee it will be available in future versions of the .NET framework. Please note that IsKeyword doesn't take care of different versions of C#.
The public method System.CodeDom.Compiler.ICodeGenerator.IsValidIdentifier(string) rejects keywords as well. The drawback is this method does some other validations as well, so other non-keyword strings are also rejected.
Update: If you just need to produce a valid identifier rather than decide if a particular string is a keyword, you can use ICodeGenerator.CreateValidIdentifier(string). This method takes care of strings with two leading underscores as well by prefixing them with one more underscore. The same holds for keywords. Note that ICodeGenerator.CreateEscapedIdentifier(string) prefixes such strings with the @ sign.
Identifiers starting with two leading underscores are reserved for the implementation (i.e. the C# compiler and associated code generators etc.), so avoiding such identifiers in your code is generally a good idea.
Update 2: The reason to prefer ICodeGenerator.CreateValidIdentifier over ICodeGenerator.CreateEscapedIdentifier is that __x and @__x are essentially the same identifier. The following won't compile:
int __x = 10;
int @__x = 20;
If the compiler generated and used a __x identifier, and the user used @__x as the result of a call to CreateEscapedIdentifier, a compilation error would occur. When using CreateValidIdentifier this situation is prevented, because the custom identifier is turned into ___x (three underscores).

However I was hoping there was something built into either the language or .NET reflection that could be relied upon so I wouldn't have to revisit the function when I move to newer versions of C#/.NET.
Note that C# has never added a new reserved keyword since v1.0. Every new keyword has been an unreserved contextual keyword.
Though it is of course possible that we might add a new reserved keyword in the future, we have tried hard to avoid doing so.
For a list of all the reserved and contextual keywords up to C# 5, see
http://ericlippert.com/2009/05/11/reserved-and-contextual-keywords/
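Since the other threads on this page already use Roslyn, another option worth noting is its SyntaxFacts helper, which distinguishes reserved from contextual keywords. This is a different technique from the CodeDom-based answers here, offered as a sketch only: it assumes the Microsoft.CodeAnalysis.CSharp package is referenced, and the KeywordCheck class and its method names are hypothetical.

```csharp
using System;
using Microsoft.CodeAnalysis.CSharp;

static class KeywordCheck
{
    // True when the word maps to a reserved keyword kind.
    public static bool IsReservedWord(string word) =>
        SyntaxFacts.GetKeywordKind(word) != SyntaxKind.None;

    // True when the word is a keyword only in certain contexts.
    public static bool IsContextualWord(string word) =>
        SyntaxFacts.GetContextualKeywordKind(word) != SyntaxKind.None;

    static void Main()
    {
        foreach (var word in new[] { "new", "new1", "var", "async" })
        {
            Console.WriteLine(
                $"{word}: reserved={IsReservedWord(word)}, contextual={IsContextualWord(word)}");
        }
    }
}
```

Because the keyword tables ship with the compiler package, the result tracks whichever Roslyn version is referenced rather than a hand-maintained list.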

static System.CodeDom.Compiler.CodeDomProvider CSprovider =
Microsoft.CSharp.CSharpCodeProvider.CreateProvider("C#");
public static string QuoteName(string name)
{
return CSprovider.CreateEscapedIdentifier(name);
}
public static bool IsAReservedWord(string TestWord)
{
return QuoteName(TestWord) != TestWord;
}
Since the definition of CreateEscapedIdentifier is:
public string CreateEscapedIdentifier(string name)
{
if (!IsKeyword(name) && !IsPrefixTwoUnderscore(name))
{
return name;
}
return ("@" + name);
}
it will properly identify __ identifiers as reserved.

Finding methods in source code using regular expressions

I have a program which looks in source code, locates methods, and performs some calculations on the code inside of each method. I am trying to use regular expressions to do this, but this is my first time using them in C# and I am having difficulty testing the results.
If I use this regular expression to find the method signature:
((private)|(public)|(sealed)|(protected)|(virtual)|(internal))+([a-z]|[A-Z]|[0-9]|[\s])*([\()([a-z]|[A-Z]|[0-9]|[\s])*([\)|\{]+)
and then split the source code by this method, storing the results in an array of strings:
string[] MethodSignatureCollection = regularExpression.Split(SourceAsString);
would this get me what I want, ie a list of methods including the code inside of them?

I would strongly suggest using Reflection (if it is appropriate) or CSharpCodeProvider.Parse(...) (as recommended by rstevens)
It can be very difficult to write a regular expression that works in all cases.
Here are some cases you'd have to handle:
public /* comment */ void Foo(...) // Comments can be everywhere
string foo = "public void Foo(...){}"; // Don't match signatures in strings
private __fooClass _Foo() // Underscores are ugly, but legal
private void @while() // Identifier escaping
public override void Foo(...) // Have to recognize overrides
void Foo(); // Defaults to private
void IDisposable.Dispose() // Explicit implementation
public // More comments // Signatures can span lines
void Foo(...)
private void // Attributes
Foo([Description("Foo")] string foo)
#if(DEBUG) // Don't forget the pre-processor
private
#else
public
#endif
int Foo() { }
Notes:
The Split approach will throw away everything that it matches, so you will in fact lose all the "signatures" that you are splitting on.
Don't forget that signatures can have commas in them
{...} can be nested, your current regexp could consume more { than it should
There is a lot of other stuff (preprocessor commands, using statements, properties, comments, enum definitions, attributes) that can show up in code, so just because something is between two method signatures does not make it part of a method body.
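To make the first note concrete, here is a small sketch of how Regex.Split treats matched text, using a deliberately tiny pattern rather than the signature regex from the question. One nuance: text inside capturing groups is returned as extra array elements, so the question's heavily parenthesized pattern would scatter fragments of each signature through the result instead of keeping them attached to a body. Regex.Matches is usually the better fit when the matched text itself is needed. The class name SplitDemo is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class SplitDemo
{
    static void Main()
    {
        const string input = "a1b2c";

        // Without capture groups, the matched separators are discarded.
        Console.WriteLine(string.Join("|", Regex.Split(input, @"\d")));   // a|b|c

        // With a capture group, the captured text is interleaved into the result.
        Console.WriteLine(string.Join("|", Regex.Split(input, @"(\d)"))); // a|1|b|2|c

        // Matches keeps each match together with its position in the input.
        foreach (Match match in Regex.Matches(input, @"\d"))
        {
            Console.WriteLine($"{match.Value} at {match.Index}"); // "1 at 1", then "2 at 3"
        }
    }
}
```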

Maybe a better approach is to use CSharpCodeProvider.Parse(), which can "compile" C# source code into a CodeCompileUnit.
You can then walk through the namespaces, types, classes and methods in that compile unit.

using ICSharpCode.NRefactory.CSharp;
PM> install-package ICSharpCode.NRefactory
var parser = new CSharpParser();
var syntaxTree = parser.Parse(File.ReadAllText(sourceFilePath));
var result = syntaxTree.Descendants.OfType<MethodDeclaration>()
.FirstOrDefault(y => y.NameToken.Name == methodName);
if (result != null)
{
return result.ToString(FormattingOptionsFactory.CreateSharpDevelop()).Trim();
}

It is feasible, I guess, to get something working using regexes. However, this requires looking very carefully at the C# language specification and a deep understanding of the C# grammar; it is not a simple problem. I know you've said you want to store the methods as arrays of strings, but presumably there is something beyond that. Reflection has already been suggested; if that does not do what you want, you should consider ANTLR (ANother Tool for Language Recognition). ANTLR does have C# grammars available.
http://www.antlr.org/about.html

No, those access modifiers can also be used for internal classes and fields, among other things. You'd need to write a full C# parser to get it right.
You can do what you want using reflection. Try something like the following:
var methods = typeof (Foo).GetMethods();
foreach (var info in methods)
{
var body = info.GetMethodBody();
}
That probably has what you need for your calculations.
If you need the original C# source code you can't get it with reflection. Don't write your own parser. Use an existing one, listed here.
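As one example of using an existing parser, the sketch below lists methods and their bodies with Roslyn. Roslyn is my choice for illustration and is not necessarily among the parsers the answer links to. It assumes the Microsoft.CodeAnalysis.CSharp package; the MethodLister class and the sample source are hypothetical.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

static class MethodLister
{
    static void Main()
    {
        // Sample input containing some of the cases that trip up a regex.
        const string source = @"
class Foo
{
    public /* comment */ void Bar() { if (true) { } }
    string text = ""public void Fake() {}"";
    void @while() { }
}";

        var root = CSharpSyntaxTree.ParseText(source).GetRoot();

        // Only real method declarations are returned; signatures inside
        // string literals or comments are not syntax nodes of this type.
        foreach (var method in root.DescendantNodes().OfType<MethodDeclarationSyntax>())
        {
            Console.WriteLine(method.Identifier.ValueText);

            // Body is null for expression-bodied or abstract methods.
            Console.WriteLine(method.Body?.ToString() ?? "(no block body)");
        }
    }
}
```

Each MethodDeclarationSyntax also exposes its modifiers, parameter list and source span, which should cover per-method calculations without any string splitting.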
