Extracting code as pure string from Roslyn API

Our goal is to build a toy abstract syntax tree for C# classes using Roslyn. We just want to show the basic structure of a class instead of walking through the entire AST. For example (Taken from MSDN):
class TimePeriod
{
    private double seconds;
    public double Hours
    {
        get { return seconds / 3600; }
        set { seconds = value * 3600; }
    }
}
Consider only the property Hours: we are interested in extracting the tokens for the modifier (public), the return type (double), and the identifier (Hours). As for the bodies of the two accessors, we want to extract each one directly as a string.
However, as we walked through the Roslyn syntax tree (shown in the screen dump), when we got to the get accessor's body we did not find a field representing the entire string. What is the correct way to achieve this?

The obvious way is to call ToString:
Returns the string representation of this node, not including its leading and trailing trivia.
If you want the leading and trailing trivia (whitespace, comments, ...), there's ToFullString:
Returns full string representation of this node including its leading and trailing trivia.
For efficiency purposes, you may also be interested in the WriteTo method, which writes what ToFullString would produce to a TextWriter, avoiding intermediate string allocations:
Writes the full text of this node to the specified TextWriter.
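To tie this back to the question, here is a minimal sketch of pulling the modifier, type, identifier, and accessor bodies out of the first property in a source string. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced; the PropertyDump class and Describe method are names made up for this illustration.

```csharp
using System;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper, not part of Roslyn.
public static class PropertyDump
{
    public static string Describe(string source)
    {
        // Parse the text and locate the first property declaration.
        var root = CSharpSyntaxTree.ParseText(source).GetRoot();
        var property = root.DescendantNodes()
                           .OfType<PropertyDeclarationSyntax>()
                           .First();

        var output = new StringWriter();
        output.WriteLine("modifiers: " + property.Modifiers.ToString());
        output.WriteLine("type: " + property.Type.ToString());
        output.WriteLine("identifier: " + property.Identifier.Text);

        foreach (var accessor in property.AccessorList.Accessors)
        {
            // Body is null for auto-properties and expression-bodied
            // accessors; ToString() drops leading/trailing trivia.
            output.WriteLine(accessor.Keyword.Text + ": " + accessor.Body?.ToString());
        }
        return output.ToString();
    }
}
```

For the TimePeriod class from the question, the get line should contain the block text "{ return seconds / 3600; }". Swapping ToString() for ToFullString() would also keep the whitespace and comments around the block.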

Related

Is there a way to convert C# code to JSON

Is there a way to convert C# code to any block representation and back?
Something like this:
int foo(int a){return a+1;}
to
{function:{name:"foo", return:"int", args:[{type:"int", name:"a"}], operations:[{type:"return", operations:[{type:"add", args:[{type:"variable", value:"a"},{type:"const", value: 1}]}]}]}}
Does not have to be JSON, but I need it to be split to smallest parts.
UPDATE:
Let's say I generate a function that fills a structure based on data from a database:
public Person GetPerson(int id)
{
    try { // <-- entire block added by user
        using (var query = db.GetPerson(id))
        {
            return new Person() {
                /*0*/name = query["name"], // /*#*/ is my mark of generated line
                /*1*/age = query["age"]
            };
        }
    }
    ...
}
Assume that a user changed the line:
/*1*/age = query["age"]
to /*1*/age = 10 - query["age"] for some reason.
Now the database column age is renamed to years.
The new line should be /*1*/years = 10 - query["years"]. The problem is that I need to keep the 10 - entered by the user.
If I had this code as JSON (or any graph), I could find the part that needs to be changed and only affect the nodes I generated before, keeping the rest.
This example is trivial, but it can get complicated very quickly, especially with quotes and brackets. This is the only approach I can see working right now. I just hoped that tools for it already exist.

Is there a way to convert C# code to JSON
Sure. You can either make your JSON contain a string with your C# code and use runtime code generation to execute it, or encode DLLs as base64 data and put that into your JSON. If you want to dig deeper, you could probably extract the CIL code and make some custom JSON encoding of it. There are also expression trees, but I think they only allow encoding of expressions, not arbitrary code.
But in any case it is probably not a good idea to let the user customize code at that level. If you want to allow customization, you should probably go for an actual plugin architecture. Or, if you just want customization of some simple mathematical expression, store it as a string and write a simple parser to validate and evaluate it.
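As a small illustration of the expression-tree option, the sketch below takes the body of the foo function from the question (a + 1) as a lambda and inspects its nodes. The ExpressionTreeDemo class is a name invented for this example; the types come from System.Linq.Expressions. Statements such as try or using cannot be written as a lambda expression tree this way, which is the limitation mentioned above.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical demo class.
public static class ExpressionTreeDemo
{
    public static string Describe()
    {
        // The compiler turns the lambda into a tree of Expression nodes.
        Expression<Func<int, int>> foo = a => a + 1;

        var body = (BinaryExpression)foo.Body;       // the "add" node
        var left = (ParameterExpression)body.Left;   // the variable a
        var right = (ConstantExpression)body.Right;  // the constant 1

        return body.NodeType + "(" + left.Name + ", " + right.Value + ")";
    }
}
```

Calling ExpressionTreeDemo.Describe() returns "Add(a, 1)", which is roughly the add / variable / const structure sketched in the question's JSON.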

C# double with zero after comma

I have a helper with this code, which I use everywhere in the system where I need to round a double:
public static double Round(double source)
{
    return Math.Round(source, Program.AppSettings.DigitAfterComma);
}
The idea is to round any input double to a double with a number of digits after the comma that is read from a file. I use it in my services for calculations, and in my ViewModels and .cshtml files for rendering results.
The calculations work fine, but when I need to render doubles like 15.0% I get only 15% as output. It would be hard to write two methods, one for rendering and one for calculations, because this method has numerous references all over the system.
Is there any way to get xx.0 every time I call the method without formatting it to a string? I need a double as output for the calculations.
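A double stores only a numeric value, so 15 and 15.0 are the same double; the trailing zero exists only in the text produced when the value is rendered. One possible arrangement, sketched below, keeps Round for calculations and adds a thin Render wrapper for views. The Rounding class, the Render method, and passing digits as a parameter (instead of reading Program.AppSettings.DigitAfterComma) are assumptions made to keep the example self-contained.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; "digits" stands in for the configured setting.
public static class Rounding
{
    // Used by calculations: returns a double, which has no trailing zeros.
    public static double Round(double source, int digits)
    {
        return Math.Round(source, digits);
    }

    // Used by views: the fixed-point format "F<digits>" pads with zeros.
    public static string Render(double source, int digits)
    {
        return Round(source, digits)
            .ToString("F" + digits, CultureInfo.InvariantCulture);
    }
}
```

With one digit, Rounding.Render(15.0, 1) gives "15.0", while Rounding.Round(15.0, 1) is still the plain double 15 for further arithmetic.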

Update values of array when a specific element is changed

Let us assume that I have defined a function named computeValue (double x):
whose input is a double value
that returns a value obtained by performing a certain set of operations using the elements of an array, to be described below.
Also, we have the above-mentioned array, for which:
1) it is some class's member
2) it contains only 4 positions
3) the 1st position contains a certain value, and the 4th contains another value that will be the input to our function
4) (here comes the tricky bit) the 2nd and 3rd values of the array should be the result of a linear interpolation between positions 1 and 4 of the array. That is, if we modify position 1 or 4 of the array, then positions 2 and 3 should change their values automatically according to the interpolation method.
My aim is to invoke a root-finding algorithm (such as Newton-Raphson, Secant Method, etc) that will aim to minimize the following expression:
f = CONSTANT - computeValue(array[4])
As you may have already observed, the problem is that every time my root-finding routine modifies the 4th element of my array to obtain a new solution, positions 2 and 3 of my array should be modified accordingly, given that they are the result of an interpolation (as mentioned in point 4 above), thus changing the result of computeValue.
What is a possible way of making the values of the array change dynamically as the root finding algorithm works its way towards the root? Maybe something to do with an array storing lambda expressions defining the interpolation?

It cannot be done with a classic array, but you can implement your own type that solves the problem. This type internally uses an array of length four and offers access through an indexer:
public double this[int index]
{
    // get and set accessors
}
Here you can write your own getters and setters, so that the dependent values are recalculated as soon as the others change.
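A minimal sketch of such a type is shown below. The InterpolatedArray name is invented, and the interpolation assumes the four positions are evenly spaced; instead of storing positions 2 and 3, it computes them on every read, so they can never be stale.

```csharp
using System;

// Hypothetical four-slot container (0-based): slots 0 and 3 are stored,
// slots 1 and 2 are linearly interpolated between them on every read.
public class InterpolatedArray
{
    private double first;
    private double last;

    public InterpolatedArray(double first, double last)
    {
        this.first = first;
        this.last = last;
    }

    public int Length { get { return 4; } }

    public double this[int index]
    {
        get
        {
            if (index < 0 || index > 3) throw new IndexOutOfRangeException();
            // Assumes evenly spaced positions: index / 3 of the way along.
            return first + (last - first) * index / 3.0;
        }
        set
        {
            if (index == 0) first = value;
            else if (index == 3) last = value;
            else throw new InvalidOperationException(
                "Slots 1 and 2 are computed and cannot be assigned.");
        }
    }
}
```

A root finder can then assign to values[3] on each iteration and pass the whole object to computeValue; the interior slots follow automatically.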

Instead of confusing yourself with array indices, this seems like an excellent time to create your own object. You can use JF Meier's approach and manually create or augment an array class, but I suggest just creating a new object entirely. You can make your interpolated points get-only properties that return the proper interpolation. Your object might look like this:
public class Interpolator
{
    public double Constant { get; set; } // same as your array[0]
    public double Value { get; set; }    // same as your array[3]
    public double Interpolation1 { get { return CalculateInterpolation1(); } }
    public double Interpolation2 { get { return CalculateInterpolation2(); } }

    private double CalculateInterpolation1()
    {
        // Define your interpolation here.
        // Assumed for illustration: evenly spaced linear interpolation.
        return Constant + (Value - Constant) / 3.0;
    }

    private double CalculateInterpolation2()
    {
        // Define your interpolation here (same assumption as above).
        return Constant + 2.0 * (Value - Constant) / 3.0;
    }
}

solving a math expression

I want to evaluate a math expression which the user enters in a textbox. I have done this so far:
string equation, finalString;
equation = textBox1.Text;
StringBuilder stringEvaluate = new StringBuilder(equation);
stringEvaluate.Replace("sin", "math.sin");
stringEvaluate.Replace("cos", "math.cos");
stringEvaluate.Replace("tan", "math.tan");
stringEvaluate.Replace("log", "math.log10");
stringEvaluate.Replace("e^", "math.exp");
finalString = stringEvaluate.ToString();
StringBuilder replaceI = new StringBuilder(finalString);
replaceI.Replace("x", "i");
double a;
for (int i = 0; i < 5; i++)
{
    a = double.Parse(finalStringI);
    if (a < 0)
        break;
}
When I run this program it gives the error "Input string was not in a correct format." and highlights a=double.Parse(finalStringI);
I used a predefined expression, a=i*math.log10(i)-1.2, and it works, but when I enter the same thing in the textbox it doesn't.
I did some searching and found something about compiling the code at runtime.
Any ideas how to do this?
I'm an absolute beginner.
Thanks :)

The issue is within your stringEvaluate StringBuilder. When you replace "sin" with "math.sin", the content of stringEvaluate is still just a string. You've got the right idea, but that is why you get the error.
Math.Sin is a method of the Math class, so it cannot be invoked by passing its name inside a string to double.Parse, as your a = double.Parse(finalStringI); call tries to do.
It would be a pretty big undertaking to accomplish your goal, but I would go about it this way:
1) Create a class (perhaps call it Expression).
2) Members of the Expression class could include Lists of operators and operands, and perhaps a double called solution.
3) Pass this class the string at instantiation, and tear it apart using the StringBuilder class. For example, if you encounter a "sin", add Math.Sin to the operator collection (for which I'd use type object).
4) Place each operator and operand found in the string into the corresponding collection.
5) Create a method that evaluates the elements in the operator and operand collections accordingly. This could get sticky for complex calculations with more than 2 operators, as you would have to implement a PEMDAS-esque algorithm to re-order the collections to obey the order of operations (and thus achieve correct solutions).
Hope this helps :)

The .Parse methods (int.Parse, double.Parse, etc.) will only take a string such as "25" or "3.141" and convert it to the matching value type (int 25, or double 3.141). They will not evaluate math expressions!
You'll pretty much have to write your own text parser and parse-tree evaluator, or explore run-time code generation or MSIL code emission.
None of these topics can really be covered in the Q&A format of StackOverflow answers.
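For a sense of scale, the "write your own parser" route can be sketched as a small recursive-descent evaluator. Everything here is an illustrative assumption: the ExpressionEvaluator name, the supported grammar (+, -, *, /, parentheses, unary minus, the variable x, and sin/cos/tan/log/exp), and the choice to evaluate while parsing instead of building a tree. It does not handle the e^ or ^ operators.

```csharp
using System;
using System.Globalization;

// Grammar (illustrative):
//   expr   := term (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := number | 'x' | name '(' expr ')' | '(' expr ')' | '-' factor
public class ExpressionEvaluator
{
    private readonly string text;
    private readonly double x;
    private int pos;

    private ExpressionEvaluator(string text, double x)
    {
        this.text = text.Replace(" ", "");
        this.x = x;
    }

    public static double Evaluate(string text, double x = 0)
    {
        var e = new ExpressionEvaluator(text, x);
        double result = e.ParseExpr();
        if (e.pos != e.text.Length)
            throw new FormatException("Unexpected character at position " + e.pos);
        return result;
    }

    private char Peek() { return pos < text.Length ? text[pos] : '\0'; }

    private void Expect(char c)
    {
        if (Peek() != c) throw new FormatException("Expected '" + c + "' at position " + pos);
        pos++;
    }

    private double ParseExpr()
    {
        double value = ParseTerm();
        while (Peek() == '+' || Peek() == '-')
        {
            char op = text[pos++];
            double rhs = ParseTerm();
            value = op == '+' ? value + rhs : value - rhs;
        }
        return value;
    }

    private double ParseTerm()
    {
        double value = ParseFactor();
        while (Peek() == '*' || Peek() == '/')
        {
            char op = text[pos++];
            double rhs = ParseFactor();
            value = op == '*' ? value * rhs : value / rhs;
        }
        return value;
    }

    private double ParseFactor()
    {
        if (Peek() == '-') { pos++; return -ParseFactor(); }
        if (Peek() == '(')
        {
            pos++;
            double inner = ParseExpr();
            Expect(')');
            return inner;
        }
        if (char.IsLetter(Peek()))
        {
            int start = pos;
            while (char.IsLetter(Peek())) pos++;
            string name = text.Substring(start, pos - start);
            if (name == "x") return x;          // the variable
            Expect('(');
            double arg = ParseExpr();
            Expect(')');
            switch (name)
            {
                case "sin": return Math.Sin(arg);
                case "cos": return Math.Cos(arg);
                case "tan": return Math.Tan(arg);
                case "log": return Math.Log10(arg);
                case "exp": return Math.Exp(arg);
                default: throw new FormatException("Unknown function: " + name);
            }
        }
        int numStart = pos;
        while (char.IsDigit(Peek()) || Peek() == '.') pos++;
        if (numStart == pos) throw new FormatException("Number expected at position " + pos);
        return double.Parse(text.Substring(numStart, pos - numStart), CultureInfo.InvariantCulture);
    }
}
```

With this in place, the loop from the question could call ExpressionEvaluator.Evaluate(textBox1.Text, i) instead of double.Parse. Precedence falls out of the grammar: ParseTerm binds tighter than ParseExpr, so "1+2*3" evaluates to 7.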

Take a look at this blog post:
http://www.c-sharpcorner.com/UploadFile/mgold/CodeDomCalculator08082005003253AM/CodeDomCalculator.aspx
It sounds like it does pretty much what you're trying to do. Evaluating math expressions is not as simple as just parsing a double (which is really only going to work for strings like "1.234", not "1 + 2.34"), but apparently it is possible.

You can use the eval function that the framework includes for JScript.NET code.
More details: http://odetocode.com/code/80.aspx
Or, if you're not scared to use classes marked "deprecated", it's really easy:
static string EvalExpression(string s)
{
    return Microsoft.JScript.Eval.JScriptEvaluate(s, null, Microsoft.JScript.Vsa.VsaEngine.CreateEngine()).ToString();
}
For example, input "Math.cos(Math.PI / 3)" and the result is "0.5" (which is the correct cosine of 60 degrees)

Performance issue: comparing to String.Format

A while back a post by Jon Skeet planted the idea in my head of building a CompiledFormatter class, for using in a loop instead of String.Format().
The idea is the portion of a call to String.Format() spent parsing the format string is overhead; we should be able to improve performance by moving that code outside of the loop. The trick, of course, is the new code should exactly match the String.Format() behavior.
This week I finally did it. I went through the .Net framework source provided by Microsoft and made a direct adaptation of their parser (it turns out String.Format() actually farms the work out to StringBuilder.AppendFormat()). The code I came up with works, in that my results are accurate within my (admittedly limited) test data.
Unfortunately, I still have one problem: performance. In my initial tests the performance of my code closely matches that of the normal String.Format(). There's no improvement at all; it's even consistently a few milliseconds slower. At least it's still in the same order (ie: the amount slower doesn't increase; it stays within a few milliseconds even as the test set grows), but I was hoping for something better.
It's possible that the internal calls to StringBuilder.Append() are what actually drive the performance, but I'd like to see if the smart people here can help improve things.
Here is the relevant portion:
private class FormatItem
{
    public int index;     // index of item in the argument list. -1 means it's a literal from the original format string
    public char[] value;  // literal data from original format string
    public string format; // simple format to use with supplied argument (ie: {0:X} for Hex)
    // for fixed-width format (examples below)
    public int width;     // {0,7} means it should be at least 7 characters
    public bool justify;  // {0,-7} would use opposite alignment
}
//this data is all populated by the constructor
private List<FormatItem> parts = new List<FormatItem>();
private int baseSize = 0;
private string format;
private IFormatProvider formatProvider = null;
private ICustomFormatter customFormatter = null;

// the code in here very closely matches the code in the String.Format/StringBuilder.AppendFormat methods.
// Could it be faster?
public String Format(params Object[] args)
{
    if (format == null || args == null)
        throw new ArgumentNullException((format == null) ? "format" : "args");

    var sb = new StringBuilder(baseSize);
    foreach (FormatItem fi in parts)
    {
        if (fi.index < 0)
            sb.Append(fi.value);
        else
        {
            //if (fi.index >= args.Length) throw new FormatException(Environment.GetResourceString("Format_IndexOutOfRange"));
            if (fi.index >= args.Length) throw new FormatException("Format_IndexOutOfRange");

            object arg = args[fi.index];
            string s = null;
            if (customFormatter != null)
            {
                s = customFormatter.Format(fi.format, arg, formatProvider);
            }
            if (s == null)
            {
                if (arg is IFormattable)
                {
                    s = ((IFormattable)arg).ToString(fi.format, formatProvider);
                }
                else if (arg != null)
                {
                    s = arg.ToString();
                }
            }
            if (s == null) s = String.Empty;

            int pad = fi.width - s.Length;
            if (!fi.justify && pad > 0) sb.Append(' ', pad);
            sb.Append(s);
            if (fi.justify && pad > 0) sb.Append(' ', pad);
        }
    }
    return sb.ToString();
}

//alternate implementation (for comparative testing)
// my own test call String.Format() separately: I don't use this. But it's useful to see
// how my format method fits.
public string OriginalFormat(params Object[] args)
{
    return String.Format(formatProvider, format, args);
}
Additional notes:
I'm wary of providing the source code for my constructor, because I'm not sure of the licensing implications from my reliance on the original .Net implementation. However, anyone who wants to test this can just make the relevant private data public and assign values that mimic a particular format string.
Also, I'm very open to changing the FormatItem class and even the parts List if anyone has a suggestion that could improve the build time. Since my primary concern is sequential iteration time from front to end, maybe a LinkedList would fare better?
[Update]:
Hmm... something else I can try is adjusting my tests. My benchmarks were fairly simple: composing names in a "{lastname}, {firstname}" format and composing formatted phone numbers from the area code, prefix, number, and extension components. Neither of those has much in the way of literal segments within the string. As I think about how the original state machine parser worked, I think those literal segments are exactly where my code has the best chance to do well, because I no longer have to examine each character in the string.
Another thought:
This class is still useful, even if I can't make it go faster. As long as performance is no worse than the base String.Format(), I've still created a strongly-typed interface which allows a program to assemble its own "format string" at run time. All I need to do is provide public access to the parts list.
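Since the constructor is not shown, here is a much-simplified sketch of the "parse once, format many times" idea for anyone who wants something to experiment with. It is not the author's code and not the framework's parser: SimpleCompiledFormat is a made-up name, and it only understands plain {n} placeholders (no width, no format specifiers, no escaped braces, no format provider).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical, heavily reduced stand-in for a compiled formatter.
public sealed class SimpleCompiledFormat
{
    // Index == -1 marks a literal segment; otherwise it is an argument index.
    private readonly List<(int Index, string Literal)> parts =
        new List<(int Index, string Literal)>();

    // All scanning of the format string happens once, here.
    public SimpleCompiledFormat(string format)
    {
        int pos = 0;
        while (pos < format.Length)
        {
            int open = format.IndexOf('{', pos);
            if (open < 0)
            {
                parts.Add((-1, format.Substring(pos)));
                break;
            }
            if (open > pos)
                parts.Add((-1, format.Substring(pos, open - pos)));

            int close = format.IndexOf('}', open);
            if (close < 0) throw new FormatException("Unclosed placeholder.");

            int index = int.Parse(format.Substring(open + 1, close - open - 1));
            parts.Add((index, (string)null));
            pos = close + 1;
        }
    }

    // The per-call work is a walk over the pre-split parts.
    public string Format(params object[] args)
    {
        var sb = new StringBuilder();
        foreach (var part in parts)
        {
            if (part.Index < 0) sb.Append(part.Literal);
            else sb.Append(args[part.Index]);
        }
        return sb.ToString();
    }
}
```

Literal segments are appended as whole strings instead of being scanned character by character, which is where a format string with long literals would be expected to benefit.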

Here's the final result:
I changed the format string in a benchmark trial to something that should favor my code a little more:
The quick brown {0} jumped over the lazy {1}.
As I expected, this fares much better compared to the original: 2 million iterations in 5.3 seconds for this code vs 6.1 seconds for String.Format. This is an undeniable improvement. You might even be tempted to start using this as a no-brainer replacement for many String.Format situations. After all, you'll do no worse and you might even get a small performance boost: as much as 14%, and that's nothing to sneeze at.
Except that it is. Keep in mind, we're still talking about less than a second of difference over 2 million attempts, in a situation specifically designed to favor this code. Not even busy ASP.Net pages are likely to create that much load, unless you're lucky enough to work on a top 100 web site.
Most of all, this omits one important alternative: you can create a new StringBuilder each time and manually handle your own formatting using raw Append() calls. With that technique my benchmark finished in only 3.9 seconds. That's a much greater improvement.
In summary, if performance doesn't matter as much, you should stick with the clarity and simplicity of the built-in option. But when in a situation where profiling shows this really is driving your performance, there is a better alternative available via StringBuilder.Append().
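For reference, the two approaches compared above look roughly like this for the benchmark sentence. The SentenceBuilder class and its method names are made up for the illustration, and the initial capacity of 64 is an arbitrary guess, not a tuned value.

```csharp
using System;
using System.Text;

// Hypothetical side-by-side of the built-in call and raw Append() calls.
public static class SentenceBuilder
{
    // Built-in option: the format string is parsed on every call.
    public static string WithFormat(string animal, string target)
    {
        return string.Format(
            "The quick brown {0} jumped over the lazy {1}.", animal, target);
    }

    // Manual option: no format string to parse, only appends.
    public static string WithAppend(string animal, string target)
    {
        var sb = new StringBuilder(64);
        sb.Append("The quick brown ")
          .Append(animal)
          .Append(" jumped over the lazy ")
          .Append(target)
          .Append('.');
        return sb.ToString();
    }
}
```

Both methods return the same string; the manual version trades the readability of a single format string for skipping the parsing step.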

Don't stop now!
Your custom formatter might only be slightly more efficient than the built-in API, but you can add more features to your own implementation that would make it more useful.
I did a similar thing in Java, and here are some of the features I added (besides just pre-compiled format strings):
1) The format() method accepts either a varargs array or a Map (in .NET, it'd be a dictionary). So my format strings can look like this:
StringFormatter f = StringFormatter.parse(
"the quick brown {animal} jumped over the {attitude} dog"
);
Then, if I already have my objects in a map (which is pretty common), I can call the format method like this:
String s = f.format(myMap);
2) I have a special syntax for performing regular expression replacements on strings during the formatting process:
// After calling obj.toString(), all space characters in the formatted
// object string are converted to underscores.
StringFormatter f = StringFormatter.parse(
"blah blah blah {0:/\\s+/_/} blah blah blah"
);
3) I have a special syntax that allows the formatter to check the argument for null-ness, applying a different formatter depending on whether the object is null or non-null.
StringFormatter f = StringFormatter.parse(
"blah blah blah {0:?'NULL'|'NOT NULL'} blah blah blah"
);
There are a zillion other things you can do. One of the tasks on my todo list is to add a new syntax where you can automatically format Lists, Sets, and other Collections by specifying a formatter to apply to each element as well as a string to insert between all elements. Something like this...
// Wraps each element in single-quote characters, separating
// adjacent elements with a comma.
StringFormatter f = StringFormatter.parse(
"blah blah blah {0:#['$'][,]} blah blah blah"
);
But the syntax is a little awkward and I'm not in love with it yet.
Anyhow, the point is that your existing class might not be much more efficient than the framework API, but if you extend it to satisfy all of your personal string-formatting needs, you might end up with a very convenient library in the end. Personally, I use my own version of this library for dynamically constructing all SQL strings, error messages, and localization strings. It's enormously useful.

It seems to me that in order to get actual performance improvement, you'd need to factor out any format analysis done by your customFormatter and formattable arguments into a function that returns some data structure that tells a later formatting call what to do. Then you pull those data structures in your constructor and store them for later use. Presumably this would involve extending ICustomFormatter and IFormattable. Seems kinda unlikely.

Have you accounted for the time to do the JIT compile as well? After all, the framework will be ngen'd which could account for the differences?

The framework provides explicit overloads of the format methods that take fixed-size parameter lists instead of the params object[] approach, to remove the overhead of allocating and collecting all of the temporary object arrays. You might want to consider that for your code as well. Also, providing strongly-typed overloads for common value types would reduce boxing overhead.

I gotta believe that spending as much time optimizing data IO would earn exponentially bigger returns!
This is surely a kissin' cousin to YAGNI for this. Avoid Premature Optimization. APO.
