C# Preprocessor


While the C# spec does include a pre-processor and basic directives (#define, #if, etc), the language does not have the same flexible pre-processor found in languages such as C/C++. I believe the lack of such a flexible pre-processor was a design decision made by Anders Hejlsberg (although, unfortunately, I can't find reference to this now). From experience, this is certainly a good decision, as there were some really terrible un-maintainable macros created back when I was doing a lot of C/C++.
That said, there are a number of scenarios where I could find a slightly more flexible pre-processor to be useful. Code such as the following could be improved by some simple pre-processor directives:
public string MyProperty
{
get { return _myProperty; }
set
{
if (value != _myProperty)
{
_myProperty = value;
NotifyPropertyChanged("MyProperty");
// This line above could be improved by replacing the literal string with
// a pre-processor directive like "#Property", which could be translated
// to the string value "MyProperty" This new notify call would be as follows:
// NotifyPropertyChanged(#Property);
}
}
}
Would it be a good idea to write a pre-processor to handle extremely simple cases like this? Steve McConnell wrote in Code Complete (p208):
Write your own preprocessor If a language doesn't include a preprocessor, it's fairly easy to write one...
I am torn. It was a design decision to leave such a flexible pre-processor out of C#. However, an author I highly respect mentions it may be ok in some circumstances.
Should I build a C# pre-processor? Is there one available that does the simple things I want to do?

Consider taking a look at an aspect-oriented solution like PostSharp, which injects code after the fact based on custom attributes. It's the opposite of a precompiler but can give you the sort of functionality you're looking for (PropertyChanged notifications etc).

Should I build a C# pre-processor? Is there one available that does the simple things I want to do?
You can always use the C pre-processor -- C# is close enough, syntax-wise. M4 is also an option.

I know a lot of people think short code equals elegant code but that isn't true.
The example you propose is already perfectly solved in code, as you have shown, so what do you need a preprocessor directive for? You don't want to "preprocess" your code; you want the compiler to insert some code for you in your properties. It's common code, but that's not the purpose of the preprocessor.
With your example, where do you draw the line? Clearly it satisfies an observer pattern, and there's no doubt it would be useful, but there are a lot of useful things that are done in code because code provides flexibility whereas the preprocessor does not. If you try to implement common patterns through preprocessor directives, you'll end up with a preprocessor that needs to be as powerful as the language itself. If you want to process your code in a different way, then use a preprocessor directive, but if you just want a code snippet, find another way, because the preprocessor wasn't meant to do that.

Using a C++-style preprocessor, the OP's code could be reduced to this one line:
OBSERVABLE_PROPERTY(string, MyProperty)
OBSERVABLE_PROPERTY would look more or less like this:
#define OBSERVABLE_PROPERTY(propType, propName) \
private propType _##propName; \
public propType propName \
{ \
get { return _##propName; } \
set \
{ \
if (value != _##propName) \
{ \
_##propName = value; \
NotifyPropertyChanged(#propName); \
} \
} \
}
If you have 100 properties to deal with, that's ~1,200 lines of code vs. ~100. Which is easier to read and understand? Which is easier to write?
With C#, assuming you cut-and-paste to create each property, that's 8 pastes per property, 800 total. With the macro, no pasting at all. Which is more likely to contain coding errors? Which is easier to change if you have to add e.g. an IsDirty flag?
Macros are not as helpful when there are likely to be custom variations in a significant number of cases.
Like any tool, macros can be abused, and may even be dangerous in the wrong hands. For some programmers, this is a religious issue, and the merits of one approach over another are irrelevant; if that's you, you should avoid macros. For those of us who regularly, skillfully, and safely use extremely sharp tools, macros can offer not only an immediate productivity gain while coding, but downstream as well during debugging and maintenance.

The main argument against building a pre-processor for C# is integration in Visual Studio: it would take a lot of effort (if it is possible at all) to get IntelliSense and the new background compiling to work seamlessly.
Alternatives are to use a Visual Studio productivity plugin like ReSharper or CodeRush.
The latter has -to the best of my knowledge- an unmatched templating system and comes with an excellent refactoring tool.
Another thing that could be helpful in solving the exact types of problems you are referring to is an AOP framework like PostSharp.
You can then use custom attributes to add common functionality.

To get the name of the currently executed method, you can look at the stack trace:
public static string GetNameOfCurrentMethod()
{
// Skip 1 frame (this method call)
var trace = new System.Diagnostics.StackTrace( 1 );
var frame = trace.GetFrame( 0 );
return frame.GetMethod().Name;
}
When you are in a property set method, the name is set_Property.
Using the same technique, you can also query the source file and line/column info.
However, I did not benchmark this; creating the StackTrace object once for every property set might be too time-consuming.

I think you're possibly missing one important part of the problem when implementing the INotifyPropertyChanged. Your consumer needs a way of determining the property name. For this reason you should have your property names defined as constants or static readonly strings, this way the consumer does not have to `guess' the property names. If you used a preprocessor, how would the consumer know what the string name of the property is?
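// Must be a const (not static readonly) to be usable as a case label in the consumer.
public const string MyPropertyPropertyName = "MyProperty";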
public string MyProperty {
get { return _myProperty; }
set {
if (!String.Equals(value, _myProperty)) {
_myProperty = value;
NotifyPropertyChanged(MyPropertyPropertyName);
}
}
}
// in the consumer.
private void MyPropertyChangedHandler(object sender,
PropertyChangedEventArgs args) {
switch (args.PropertyName) {
case MyClass.MyPropertyPropertyName:
// Handle property change.
break;
}
}

If I were designing the next version of C#, I'd think about each function having an automatically included local variable holding the name of the class and the name of the function. In most cases, the compiler's optimizer would take it out.
I'm not sure there's much of a demand for that sort of thing though.

@Jorge wrote: If you want to process your code in a different way, then use a preprocessor directive, but if you just want a code snippet, find another way, because the preprocessor wasn't meant to do that.
Interesting. I don't really consider a preprocessor to necessarily work this way. In the example provided, I am doing a simple text substitution, which is in-line with the definition of a preprocessor on Wikipedia.
If this isn't the proper use of a preprocessor, what should we call a simple text replacement, which generally needs to occur before a compilation?

At least for the provided scenario, there's a cleaner, type-safe solution than building a pre-processor:
Use generics. Like so:
public static class ObjectExtensions
{
public static string PropertyName<TModel, TProperty>( this TModel @this, Expression<Func<TModel, TProperty>> expr )
{
Type source = typeof(TModel);
MemberExpression member = expr.Body as MemberExpression;
if (member == null)
throw new ArgumentException(String.Format(
"Expression '{0}' refers to a method, not a property",
expr.ToString( )));
PropertyInfo property = member.Member as PropertyInfo;
if (property == null)
throw new ArgumentException(String.Format(
"Expression '{0}' refers to a field, not a property",
expr.ToString( )));
// Reject only properties declared on a type unrelated to TModel.
if (source != property.ReflectedType &&
!source.IsSubclassOf(property.ReflectedType) &&
!property.ReflectedType.IsAssignableFrom(source))
throw new ArgumentException(String.Format(
"Expression '{0}' refers to a property that is not a member of type '{1}'.",
expr.ToString( ),
source));
return property.Name;
}
}
This can easily be extended to return a PropertyInfo instead, allowing you to get way more stuff than just the name of the property.
Since it's an Extension method, you can use this method on virtually every object.
Also, this is type-safe.
Can't stress that enough.
(I know it's an old question, but I found it lacking a practical solution.)
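For illustration, here is a trimmed-down sketch of the same expression-tree idea together with a usage example. The names PropertyNames, Of and Customer are made up for this sketch and are not part of the answer above; it also drops the extension-method form so the model type can be inferred from the lambda.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper: resolves a property name from a lambda such as c => c.Name.
public static class PropertyNames
{
    public static string Of<TModel, TProperty>(Expression<Func<TModel, TProperty>> expr)
    {
        // A plain property access compiles to a MemberExpression whose Member is a PropertyInfo.
        var member = expr.Body as MemberExpression;
        var property = member?.Member as PropertyInfo;
        if (property == null)
            throw new ArgumentException("Expression must be a simple property access.", nameof(expr));
        return property.Name;
    }
}

// Hypothetical model type used only for the demonstration.
public class Customer
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class PropertyNamesDemo
{
    public static void Main()
    {
        // Typing the lambda parameter lets the compiler infer both type arguments.
        Console.WriteLine(PropertyNames.Of((Customer c) => c.Name));
        Console.WriteLine(PropertyNames.Of((Customer c) => c.Age));
    }
}
```

Renaming the property with a refactoring tool updates the lambda too, which is the point of the approach. On C# 6 and later, nameof(Customer.Name) gives the same string at compile time without building an expression tree.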

While there are plenty of good reflection-based answers here, the most obvious answer is missing and that is to use the compiler, at compile time.
Note that the following method has been supported in C# and .NET since .NET 4.5 and C# 5.
The compiler does in fact have some support for obtaining this information, just in a slightly roundabout way, and that is through the CallerMemberNameAttribute attribute. This allows you to get the compiler to inject the name of the member that is calling a method. There are two sibling attributes as well, but I think an example is easier to understand:
Given this simple class:
public static class Code
{
[MethodImplAttribute(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
public static string MemberName([CallerMemberName] string name = null) => name;
[MethodImplAttribute(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
public static string FilePath([CallerFilePathAttribute] string filePath = null) => filePath;
[MethodImplAttribute(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
public static int LineNumber([CallerLineNumberAttribute] int lineNumber = 0) => lineNumber;
}
of which in the context of this question you actually only need the first method, you can use it like this:
public class Test : INotifyPropertyChanged
{
private string _myProperty;
public string MyProperty
{
get => _myProperty;
set
{
PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(Code.MemberName()));
_myProperty = value;
}
}
public event PropertyChangedEventHandler PropertyChanged;
}
Now, since this method is only returning the argument back to the caller, chances are that it will be inlined completely which means the actual code at runtime will just grab the string that contains the name of the property.
Example usage:
void Main()
{
var t = new Test();
t.PropertyChanged += (s, e) => Console.WriteLine(e.PropertyName);
t.MyProperty = "Test";
}
output:
MyProperty
The property code actually looks like this when decompiled:
IL_0000 ldarg.0
IL_0001 ldfld Test.PropertyChanged
IL_0006 dup
IL_0007 brtrue.s IL_000C
IL_0009 pop
IL_000A br.s IL_0021
IL_000C ldarg.0
// important bit here
IL_000D ldstr "MyProperty"
IL_0012 call Code.MemberName (String)
// important bit here
IL_0017 newobj PropertyChangedEventArgs..ctor
IL_001C callvirt PropertyChangedEventHandler.Invoke (Object, PropertyChangedEventArgs)
IL_0021 ldarg.0
IL_0022 ldarg.1
IL_0023 stfld Test._myProperty
IL_0028 ret
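As a possible variation on the above, the attribute can be put directly on a setter helper so that each property needs only one line. This is a sketch under assumptions: ObservableBase, SetField and Person are names invented here, not something from the answer. Unlike the Test class above, it assigns the field before raising the event and skips the notification when the value is unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Runtime.CompilerServices;

// Hypothetical base class; the names ObservableBase and SetField are illustrative only.
public abstract class ObservableBase : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    // The compiler fills in propertyName with the name of the calling member.
    protected bool SetField<T>(ref T field, T value, [CallerMemberName] string propertyName = null)
    {
        if (EqualityComparer<T>.Default.Equals(field, value))
            return false;                       // unchanged: no notification
        field = value;                          // assign first, then notify
        PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(propertyName));
        return true;
    }
}

// Hypothetical model class showing the one-line setter.
public class Person : ObservableBase
{
    private string _name;
    public string Name
    {
        get => _name;
        set => SetField(ref _name, value);
    }
}

public static class ObservableDemo
{
    public static void Main()
    {
        var p = new Person();
        p.PropertyChanged += (s, e) => Console.WriteLine(e.PropertyName);
        p.Name = "Ada";   // raises PropertyChanged for "Name"
        p.Name = "Ada";   // same value again: no event
    }
}
```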

Under VS2019, you do get enhanced ability to precompile, without losing intellisense, when using a generator (see https://devblogs.microsoft.com/dotnet/introducing-c-source-generators/).
For example, if you need to remove readonly keywords (useful when manipulating constructors), your generator could act as a precompiler that removes these keywords at compile time and generates the actual source to be compiled instead.
Your original source would then look like the following (the §RegexReplace macro is to be executed by the Generator and subsequently commented out in the generated source):
#if Precompiled || DEBUG
#if Precompiled
§RegexReplace("((private|internal|public|protected)( static)?) readonly","$1")
#endif
#if !Precompiled && DEBUG
namespace NotPrecompiled
{
#endif
... // your code
#if !Precompiled && DEBUG
}
#endif
#endif // Precompiled || DEBUG
The generated source would then have:
#define Precompiled
at the top and the Generator would have executed the other required changes to the source.
During development, you could thus still have intellisense, but the release version would only have the generated code. Care should be taken to never reference the NotPrecompiled namespace anywhere.

If you are ready to ditch C# you might want to check out the Boo language which has incredibly flexible macro support through AST (Abstract Syntax Tree) manipulations. It really is great stuff if you can ditch the C# language.
For more information on Boo see these related questions:
Non-C++ languages for generative programming?
https://stackoverflow.com/questions/595593/who-is-using-boo-programming-language
Boo vs. IronPython
Good dynamic programming language for .net recommendation
What can Boo do for you?

Related

Why can't I give a default value as optional parameter except null?

I want to have a optional parameter and set it to default value that I determine, when I do this:
private void Process(Foo f = new Foo())
{
}
I'm getting the following error (Foo is a class):
'f' is type of Foo, A default parameter of a reference type other than string can only be initialized with null.
If I change Foo to struct then it works but with only default parameterless constructor.
I read the documentation and it clearly states that I cannot do this, but it doesn't mention why. Why does this restriction exist, and why is string excluded from it? Why does the value of an optional parameter have to be a compile-time constant? If it weren't a constant, what would the side effects be?

A starting point is that the CLR has no support for this. It must be implemented by the compiler. Something you can see from a little test program:
class Program {
static void Main(string[] args) {
Test();
Test(42);
}
static void Test(int value = 42) {
}
}
Which decompiles to:
.method private hidebysig static void Main(string[] args) cil managed
{
.entrypoint
// Code size 15 (0xf)
.maxstack 8
IL_0000: ldc.i4.s 42
IL_0002: call void Program::Test(int32)
IL_0007: ldc.i4.s 42
IL_0009: call void Program::Test(int32)
IL_000e: ret
} // end of method Program::Main
.method private hidebysig static void Test([opt] int32 'value') cil managed
{
.param [1] = int32(0x0000002A)
// Code size 1 (0x1)
.maxstack 8
IL_0000: ret
} // end of method Program::Test
Note how there is no difference whatsoever between the two call statements after the compiler is done with it. It was the compiler that applied the default value and did so at the call site.
Also note that this still needs to work when the Test() method actually lives in another assembly. Which implies that the default value needs to be encoded in the metadata. Note how the .param directive did this. The CLI spec (Ecma-335) documents it in section II.15.4.1.4
This directive stores in the metadata a constant value associated with method parameter number Int32,
see §II.22.9. While the CLI requires that a value be supplied for the parameter, some tools can use the
presence of this attribute to indicate that the tool rather than the user is intended to supply the value of
the parameter. Unlike CIL instructions, .param uses index 0 to specify the return value of the method,
index 1 to specify the first parameter of the method, index 2 to specify the second parameter of the
method, and so on.
[Note: The CLI attaches no semantic whatsoever to these values—it is entirely up to compilers to
implement any semantic they wish (e.g., so-called default argument values). end note]
The quoted section II.22.9 goes into the detail of what a constant value means. The most relevant part:
Type shall be exactly one of: ELEMENT_TYPE_BOOLEAN, ELEMENT_TYPE_CHAR,
ELEMENT_TYPE_I1, ELEMENT_TYPE_U1, ELEMENT_TYPE_I2, ELEMENT_TYPE_U2,
ELEMENT_TYPE_I4, ELEMENT_TYPE_U4, ELEMENT_TYPE_I8, ELEMENT_TYPE_U8,
ELEMENT_TYPE_R4, ELEMENT_TYPE_R8, or ELEMENT_TYPE_STRING; or
ELEMENT_TYPE_CLASS with a Value of zero
So that's where the buck stops, no good way to even reference an anonymous helper method so some kind of code hoisting trick cannot work either.
Notable is that it just isn't a problem, you can always implement an arbitrary default value for an argument of a reference type. For example:
private void Process(Foo f = null)
{
if (f == null) f = new Foo();
}
Which is quite reasonable. And the kind of code you want in the method instead of the call site.
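A self-contained sketch of that null-sentinel pattern follows. The Label property and the Processor class are assumptions added here only so that there is something to observe; they are not from the question.

```csharp
using System;

// Hypothetical stand-in for the question's Foo class.
public class Foo
{
    public string Label { get; set; } = "default";
}

public static class Processor
{
    // null is the only constant allowed as a default for a class-typed parameter,
    // so it serves as the "argument not supplied" marker.
    public static string Process(Foo f = null)
    {
        f = f ?? new Foo();   // the real default is created inside the method, on each call
        return f.Label;
    }
}

public static class ProcessorDemo
{
    public static void Main()
    {
        Console.WriteLine(Processor.Process());                              // uses a fresh Foo
        Console.WriteLine(Processor.Process(new Foo { Label = "custom" }));  // uses the caller's Foo
    }
}
```

One trade-off of this design is that a caller can no longer pass null to mean something other than "use the default".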

Because there's no other compile-time constant than null. For strings, string literals are such compile-time constants.
I think that some of the design decisions behind it may have been:
Simplicity of implementation
Elimination of hidden / unexpected behavior
Clarity of method contract, esp. in cross-assembly scenarios
Let's elaborate on these three a bit more to get some insight under the hood of the problem:
1. Simplicity of implementation
When limited to constant values, both the compiler's and CLR's jobs are pretty easy. Constant values can be easily stored in assembly metadata, and the compiler can easily insert them at the call site. How this is done was outlined in Hans Passant's answer.
But what could the CLR and compiler do to implement non-constant default values? There are two options:
(1) Store the initialization expressions themselves, and compile them there:
// seen by the developer in the source code
Process();
// actually done by the compiler
Process(new Foo());
(2) Generate thunks:
// seen by the developer in the source code
Process();
…
void Process(Foo arg = new Foo())
{
…
}
// actually done by the compiler
Process_Thunk();
…
void Process_Thunk()
{
Process(new Foo());
}
void Process()
{
…
}
Both solutions introduce a lot more new metadata into assemblies and require complex handling by the compiler. Also, while solution (2) can be seen as a hidden technicality (as well as (1)), it has consequences in respect to the perceived behavior. The developer expects that arguments are evaluated at call site, not somewhere else. This may impose extra problems to be solved (see part related to method contract).
2. Elimination of hidden / unexpected behavior
The initialization expression could have been arbitrarily complex. Hence a simple call like this:
Process();
would unroll into a complex calculation performed at call site. For example:
Process(new Foo(HorriblyComplexCalculation(SomeStaticVar) * Math.Pow(GetCoefficient(), 17)));
That can be rather unexpected from the point of view of a reader who does not inspect the declaration of 'Process' thoroughly. It clutters the code and makes it less readable.
3. Clarity of method contract, esp. in cross-assembly scenarios
The signature of a method together with default values imposes a contract. This contract lives in a particular context. If the initialization expression required bindings to some other assemblies, what would that require from the caller? How about this example, where the method 'CalculateInput' is from 'Other.Assembly':
void Process(Foo arg = new Foo(Other.Assembly.Namespace.CalculateInput()))
This is the point where the way this would be implemented plays a critical role in deciding whether this is a problem or not. In the "simplicity" section I outlined implementation methods (1) and (2). If (1) were chosen, it would require the caller to bind to 'Other.Assembly'. On the other hand, if (2) were chosen, there is far less need, from the implementation point of view, for such a rule, because the compiler-generated Process_Thunk is declared in the same place as Process and hence naturally has a reference to Other.Assembly. However, a sane language designer would impose such a rule anyway, because multiple implementations of the same thing are possible, and for the sake of stability and clarity of the method contract.
Nevertheless, these cross-assembly scenarios would impose assembly references that are not clearly visible from the plain source code at the call site. And that's a usability and readability problem, again.

It is just the way the language works, I can't say why they do it (and this site is not a site for discussions like that, if you want to discuss it take it to chat).
I can show you how to work around it, just make two methods and overload it (modified your example slightly to show how you would return results too).
private Bar Process()
{
return Process(new Foo());
}
private Bar Process(Foo f)
{
//Whatever.
}

Default parameters affect the caller: when you supply a default value, the compiler bakes it into each call site at compile time. Because of that you need to supply a constant value, which "new Foo()" in your case is not.
That is why you need a constant.

Is there an object that contains the parameters of a function?

I have a method that is so long that it scrolls off the screen. To make life easier as I program, if I want to refer to the variables of a class I can use the Me or this objects, depending on which language I am using.
eg. Me.var1 = "Hello"
Is there an object (like Me) that would allow easy reference to the parameters of a function?
eg. params.par1 = "World"

There's no such feature in the language. Local variables and method arguments are treated specially by the .NET jitter, they are heavily optimized at runtime. Anything .NET would do, or you would do, to capture those variables would defeat such optimizations.
A very simple solution is to use Window + Split, it gives you two views on your code. Scroll the top one to the method header, write your code in the bottom one. You can adjust the splitter to give you more room in the bottom window.
Taking advantage of IntelliSense would be another way. Prefix the argument names with a little string, like "par". Then typing "par" in your code automatically gives you the list of argument names in the IntelliSense popup window.
These are however but band-aids for the real problem. As soon as you find yourself reaching like this, your first thought should be to split up the code in the function to make it smaller. There are some hard truths I discovered after thirty years of coding:
Long methods have more bugs. There's a metric for this, called "cyclomatic complexity". The higher the number, the more likely that the code is broken. Well supported by Visual Studio, this blog post is useful.
Code should never be indented more than 3 levels deep. By far the simplest way to discover that your cyclomatic complexity is getting out of hand without running a tool.
A method should never be larger than what fits on the screen. Any code that doesn't fit is a cognitive tax that produces compile errors and bugs. There's a corollary to this, programmers with big monitors create more bugs. The hard rule I use is one inspired by using DOS editors, a method should not have more than 25 lines of code.
Wide code produces a special kind of bug, the nasty kind that you can't see. Anything that's off the screen to the right is code that may have a bug that can take you a long time to discover. VB.NET is especially prone to this kind of bug since it uses end-of-line as a statement terminator. Much improved in VS2010 btw, the underscore is now optional in many cases. Always break your line to avoid this kind of bug.
Plan ahead and write maintainable code. Maintained code is never smaller than the original. If you already have trouble writing the original code then by definition you cannot maintain it. You have to start out small.
Always design first, code later. Long methods are a strong indicator of not thinking about code long enough before you start coding. In itself a strong bug inducer, in addition to writing correct code that just doesn't do the job.

The short answer is no. It seems like you are hoping to use this to distinguish between parameter scope and class scope for function parameters and fields with the same name, unfortunately you can't. Either use different naming schemes, or do the following:
public class MyClass {
private string myString;
private int myInt;
public MyClass(string myString) {
this.myString = myString;
}
public int DoStuff(int myInt) {
this.myInt += myInt;
return this.myInt;
}
}
To be really clear and avoid problems, you could change the names:
public class MyClass {
private string m_myString;
private int m_myInt;
public MyClass(string myString) {
m_myString = myString;
}
public int DoStuff(int myInt) {
m_myInt += myInt;
return m_myInt;
}
}
And you should really start by writing a test before the code, then you can check that you haven't accidentally mixed things up in your code.
Footnote
I include this as people coming to the title of this question may be looking for the following information.
While you say
Just for ease of programming - if I am a long way down in a function I would like to see what parameters there are without having to scroll up
In case you really want to look at your parameters from inside your code for other reasons, then you need reflection. This is slow, and its typical use would be to find a method, then reflect the parameters in that method. For a very comprehensive sample, see MSDN - ParameterInfo Class. The pertinent part of the code is:
foreach (MemberInfo mi in typeof(MyClass).GetMembers() )
{
// If the member is a method, display information about its parameters.
if (mi.MemberType==MemberTypes.Method)
{
foreach ( ParameterInfo pi in ((MethodInfo) mi).GetParameters() )
{
Console.WriteLine("Parameter: Type={0}, Name={1}", pi.ParameterType, pi.Name);
}
}
}

You should be able to use GetParameters() reflection method
MethodInfo barMI = bar.GetMethod("Foo");
ParameterInfo[] pars = barMI.GetParameters();
foreach (ParameterInfo p in pars)
{
Console.WriteLine(p.Name);
}
You can use this in run time. But for your aim, I would try to refactor the number of functions and their names. I try to keep code length under 80 symbols per line and the number of lines in a class under 100. Which is not always possible, but it's a good objective to decouple stuff and keep classes simple.
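If the goal is to list the parameters of the method you are currently inside, a minimal sketch could use MethodBase.GetCurrentMethod() instead of looking the method up by name. The class and method names below are invented for the example. Note that reflection yields parameter names and types only, not the argument values.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class ParameterDemo
{
    // Hypothetical method; its own parameters are what gets listed.
    public static string[] DescribeParameters(string first, int second)
    {
        // GetCurrentMethod returns the MethodBase of the method that is executing right now.
        MethodBase current = MethodBase.GetCurrentMethod();
        return current.GetParameters()
                      .Select(p => p.ParameterType.Name + " " + p.Name)
                      .ToArray();
    }

    public static void Main()
    {
        foreach (string line in DescribeParameters("World", 42))
            Console.WriteLine(line);   // one line per parameter: type name, then parameter name
    }
}
```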

A simple way would be to encapsulate your parameters in an object so you can just refer to that, and intellitype (or whatever predictive feature) would show you what properties you have available without having to scroll back up. Like this
public class MyParamObject{
public string FirstParam {get;set;}
public string SecondParam {get;set;}
}
Then you could change your method from
public void MyReallyOvergrownMethod(string firstParam, string secondParam){...
to
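public void MyReallyOvergrownMethod(MyParamObject parameters){...
then you can use the parameters argument like this in the method (params itself is a reserved keyword in C#, so it cannot be used as a name without the @ prefix)
//Deep inside the method
if(parameters.FirstParam == "SomeValue"){//Do something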
This is a numpty solution to a problem that would be best solved by refactoring your method. Look at loops and conditionals, and get them out into separate private methods that are named after what they do. There is loads of material on this if you search for cleancoders.

In light of your comment "Just for ease of programming - if I am a long way down in a function I would like to see what parameters there are without having to scroll up": in Visual Studio, with code showing, just above the scrollbar there is a little bit you can grab and pull down to split the window. You can then have your function declaration visible in one pane and scroll as much as you like in the other. Or you can use Window menu->Split.

FxCop rule around ensuring a certain method accepting a lambda is called first in a test

Using a custom FXCop rule, I want to ensure that a method is called at the top of each unit test and that all unit test code is part of an Action passed into that method. Essentially I want this:
[TestMethod]
public void SomeTest()
{
Run(() => {
// ALL unit test code has to be inside a Run call
});
}
It's not hard to ensure that Run is indeed called:
public override void VisitMethod(Method member)
{
var method = member as Method;
if (method == null || method.Attributes == null)
return;
if (method.Attributes.Any(attr => attr.Type.Name.Name == "TestMethodAttribute") &&
method.Instructions != null)
{
// Report a problem unless at least one instruction is a call whose target is Run.
if (!method.Instructions.Any(i => i.OpCode == OpCode.Call && i.Value.ToString() == "MyNamespace.Run"))
{
this.Problems.Add(new Problem(this.GetResolution(), method.GetUnmangledNameWithoutTypeParameters()));
}
}
base.VisitMethod(method);
}
The trick is to ensure there isn't something at the top of the test that is called BEFORE the Run statement. I've spent the past few hours hacking at the Instructions collection for patterns and trying to understand how to use the Body.Statements collection in code effectively.
This could also be posed as a simple IL question. I want to know a specific pattern I can validate that will accept this:
public void SomeTest()
{
Run(() => {
// Potentially lots of code
});
}
But will reject either of these:
public void SomeTest()
{
String blah = "no code allowed before Run";
Run(() => {
// Potentially lots of code
});
}
public void SomeTest()
{
Run(() => {
// Potentially lots of code
});
String blah = "no code allowed after Run";
}

Although you can access an expression tree like structure using Method.Body I would probably examine the instructions anyway as I have found it to get confused by common situations in the past (eg. inline array initialisation).
How C# generates the lambda expression depends on what the lambda expression accesses:
If the lambda expression accesses any locals/parameters, it will create an object to hold those values where they can be accessed from the lambda expression, and then create a delegate to an instance method on the object.
If the lambda expression accesses any instance members via this, then it will simply create the delegate to the instance method on this.
Otherwise if the lambda expression doesn't access locals or fields:
Prior to the Roslyn compiler that came with VS2015, it would create a delegate to a static method and cache it in a static field.
As of the Roslyn compiler, it creates an instance method on a nested class and caches the delegate in a static field. (This change was made for performance reasons: calling a delegate to an instance method is faster than calling a delegate to a static method, as it doesn't have to shuffle arguments across.)
You could probably create your own evaluation stack and trace the values through the method (I have had to do this in the past, non-trivial and a fair bit of code but not particularly difficult), but I suspect you could achieve "good enough" just by enforcing the following rules:
All locals for the method must be compiler generated.
Static fields may be accessed only if they are compiler generated.
Only System.Action and compiler generated types may be created.
Only Run may be called, and it must be called exactly once.
The call to Run must be followed by a ret instruction (ignoring any number of nop instructions that may appear between them).
No branch instruction may jump to a location after the call to Run.
Prohibit all instructions other than the following:
branch instructions (conditional and unconditional)
argument, local and field accessor instructions.
newobj, call, callvirt, ldftn, nop, ret, ldnull
_Locals (this is just a pseudo instruction that FxCop inserts for local variables.)
FxCop provides RuleUtilities.IsCompilerGenerated to determine if a local is compiler generated, but it won't help for fields, and I suspect it works for locals only if FxCop can find the pdb file. You might find it easier to say "the local/field/type is compiler generated if its type name is not a valid identifier in C#".
Having said all that, insisting that all tests run entirely via the Run method seems a little arbitrary. If the goal is for the Run method to provide common setup/tear-down logic, there are better ways provided via nunit. Enforcing that people apply your action attribute is much easier than enforcing that they write their test in a particular way.
Alternatively, you could write a Roslyn analyzer instead; analyzers have access to the syntax tree describing how the code was written, without having to reverse engineer the structure from the IL & metadata.

Is there a way to implement custom language features in C#?

I've been puzzling about this for a while and I've looked around a bit, unable to find any discussion about the subject.
Lets assume I wanted to implement a trivial example, like a new looping construct: do..until
Written very similarly to do..while
do {
//Things happen here
} until (i == 15)
This could be transformed into valid csharp by doing so:
do {
//Things happen here
} while (!(i == 15));
This is obviously a simple example, but is there any way to add something of this nature? Ideally as a Visual Studio extension to enable syntax highlighting etc.

Microsoft offers the Roslyn API, an implementation of the C# compiler with a public API. It contains individual APIs for each of the compiler pipeline stages: syntax analysis, symbol creation, binding, MSIL emission. You can provide your own implementation of the syntax parser, or extend the existing one, in order to get a C# compiler with any features you would like.
Roslyn CTP
Let's extend C# language using Roslyn! In my example I'm replacing do-until statement w/ corresponding do-while:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Roslyn.Compilers.CSharp;
namespace RoslynTest
{
class Program
{
static void Main(string[] args)
{
var code = @"
using System;
class Program {
public void My() {
var i = 5;
do {
Console.WriteLine(""hello world"");
i++;
}
until (i > 10);
}
}
";
//Parsing input code into a SyntaxTree object.
var syntaxTree = SyntaxTree.ParseCompilationUnit(code);
var syntaxRoot = syntaxTree.GetRoot();
//Here we will keep all nodes to replace
var replaceDictionary = new Dictionary<DoStatementSyntax, DoStatementSyntax>();
//Looking for do-until statements in all descendant nodes
foreach (var doStatement in syntaxRoot.DescendantNodes().OfType<DoStatementSyntax>())
{
//Until token is treated as an identifier by C# compiler. It doesn't know that in our case it is a keyword.
var untilNode = doStatement.Condition.ChildNodes().OfType<IdentifierNameSyntax>().FirstOrDefault((_node =>
{
return _node.Identifier.ValueText == "until";
}));
//Condition is treated as an argument list
var conditionNode = doStatement.Condition.ChildNodes().OfType<ArgumentListSyntax>().FirstOrDefault();
if (untilNode != null && conditionNode != null)
{
//Let's replace identifier w/ correct while keyword and condition
var whileNode = Syntax.ParseToken("while");
var condition = Syntax.ParseExpression("(!" + conditionNode.GetFullText() + ")");
var newDoStatement = doStatement.WithWhileKeyword(whileNode).WithCondition(condition);
//Accumulating all replacements
replaceDictionary.Add(doStatement, newDoStatement);
}
}
syntaxRoot = syntaxRoot.ReplaceNodes(replaceDictionary.Keys, (node1, node2) => replaceDictionary[node1]);
//Output preprocessed code
Console.WriteLine(syntaxRoot.GetFullText());
}
}
}
///////////
//OUTPUT://
///////////
// using System;
// class Program {
// public void My() {
// var i = 5;
// do {
// Console.WriteLine("hello world");
// i++;
// }
//while(!(i > 10));
// }
// }
Now we can compile updated syntax tree using Roslyn API or save syntaxRoot.GetFullText() to text file and pass it to csc.exe.

The big missing piece is hooking into the pipeline; otherwise you're not much further along than what .Emit provided. Don't misunderstand, Roslyn brings a lot of great things, but for those of us who want to implement preprocessors and metaprogramming, it seems that for now this was not on the plate. You can implement "code suggestions", or what they call "issues"/"actions", as an extension, but this is basically a one-off transformation of code that acts as a suggested inline replacement, and is not the way you would implement a new language feature. This is something you could always do with extensions, but Roslyn makes the code analysis/transformation tremendously easier.
From what I've read of comments from Roslyn developers on the codeplex forums, providing hooks into the pipeline has not been an initial goal. All of the new C# language features they've provided in C# 6 preview involved modifying Roslyn itself. So you'd essentially need to fork Roslyn. They have documentation on how to build Roslyn and test it with Visual Studio. This would be a heavy handed way to fork Roslyn and have Visual Studio use it. I say heavy-handed because now anyone who wants to use your new language features must replace the default compiler with yours. You could see where this would begin to get messy.
Building Roslyn and replacing Visual Studio 2015 Preview's compiler with your own build
Another approach would be to build a compiler that acts as a proxy to Roslyn. There are standard APIs for building compilers that VS can leverage. It's not a trivial task though. You'd read in the code files, call upon the Roslyn APIs to transform the syntax trees and emit the results.
The other challenge with the proxy approach is going to be getting intellisense to play nicely with any new language features you implement. You'd probably have to have your "new" variant of C#, use a different file extension, and implement all the APIs that Visual Studio requires for intellisense to work.
Lastly, consider the C# ecosystem, and what an extensible compiler would mean. Let's say Roslyn did support these hooks, and it was as easy as providing a NuGet package or a VS extension to support a new language feature. All of your C# leveraging the new Do-Until feature is essentially invalid C#, and will not compile without your custom extension. If you go far enough down this road, with enough people implementing new features, you will very quickly find incompatible language features. Maybe someone implements a preprocessor macro syntax, but it can't be used alongside someone else's new syntax because they happened to use similar syntax to delineate the beginning of the macro. If you leverage a lot of open source projects and find yourself digging into their code, you would encounter a lot of strange syntax that would require you to sidetrack and research the particular language extensions each project is leveraging. It could be madness. I don't mean to sound like a naysayer, as I have a lot of ideas for language features and am very interested in this, but one should consider the implications, and how maintainable it would be. Imagine you got hired to work somewhere that had implemented all kinds of new syntax you had to learn; without those features having been vetted the same way C#'s features have, you can bet some of them would not be well designed or implemented.

You can check www.metaprogramming.ninja (I am the developer), it provides an easy way to accomplish language extensions (I provide examples for constructors, properties, even js-style functions) as well as full-blown grammar based DSLs.
The project is open source as well. You can find documentation, examples, etc. on GitHub.
Hope it helps.

You can't create your own syntactic abstractions in C#, so the best you can do is to create your own higher-order function. You could create an Action extension method:
public static void DoUntil(this Action act, Func<bool> condition)
{
do
{
act();
} while (!condition());
}
Which you can use as:
int i = 1;
new Action(() => { Console.WriteLine(i); i++; }).DoUntil(() => i == 15);
although it's questionable whether this is preferable to using a do..while directly.

I found the easiest way to extend the C# language is to use the T4 text processor to preprocess my source. The T4 Script would read my C# and then call a Roslyn based parser, which would generate a new source with custom generated code.
During build time, all my T4 scripts would be executed, thus effectively working as an extended preprocessor.
In your case, the non-compliant C# code could be entered as follows:
#if ExtendedCSharp
do
#endif
{
Console.WriteLine("hello world");
i++;
}
#if ExtendedCSharp
until (i > 10);
#endif
This would allow syntax checking the rest of your (C# compliant) code during development of your program.
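For a sense of what the transformation step itself might look like, here is a deliberately naive sketch that rewrites a do..until tail into a do..while tail with a regular expression. It is not the T4/Roslyn setup described above: DoUntilRewriter is a hypothetical name, the sketch assumes the #if ExtendedCSharp guards have already been stripped, it expects the condition to sit on one line, and it will happily rewrite matching text inside strings or comments. A parser-based rewrite avoids those problems.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DoUntilRewriter
{
    // Matches "} until ( <condition> ) ;" where the condition is on a single line.
    private static readonly Regex UntilClause =
        new Regex(@"\}\s*until\s*\((.*)\)\s*;");

    // Replaces each until clause with the equivalent negated while clause.
    public static string Rewrite(string source)
    {
        return UntilClause.Replace(source, "} while (!($1));");
    }
}
```

For example, DoUntilRewriter.Rewrite("do { i++; } until (i > 10);") returns "do { i++; } while (!(i > 10));".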

No, there is no way to achieve what you're talking about.
What you're asking for is a new language construct, which means new lexical analysis, a new language parser, a new semantic analyzer, and compilation and optimization of the generated IL.
What you can do in such cases is use a helper function instead:
public bool Until(int val, int check)
{
return !(val == check);
}
and use it like
do {
//Things happen here
} while (Until(i, 15));

Finding methods in source code using regular expressions

I have a program which looks in source code, locates methods, and performs some calculations on the code inside of each method. I am trying to use regular expressions to do this, but this is my first time using them in C# and I am having difficulty testing the results.
If I use this regular expression to find the method signature:
((private)|(public)|(sealed)|(protected)|(virtual)|(internal))+([a-z]|[A-Z]|[0-9]|[\s])*([\()([a-z]|[A-Z]|[0-9]|[\s])*([\)|\{]+)
and then split the source code by this method, storing the results in an array of strings:
string[] MethodSignatureCollection = regularExpression.Split(SourceAsString);
would this get me what I want, ie a list of methods including the code inside of them?

I would strongly suggest using Reflection (if it is appropriate) or CSharpCodeProvider.Parse(...) (as recommended by rstevens)
It can be very difficult to write a regular expression that works in all cases.
Here are some cases you'd have to handle:
public /* comment */ void Foo(...) // Comments can be everywhere
string foo = "public void Foo(...){}"; // Don't match signatures in strings
private __fooClass _Foo() // Underscores are ugly, but legal
private void @while() // Identifier escaping
public override void Foo(...) // Have to recognize overrides
void Foo(); // Defaults to private
void IDisposable.Dispose() // Explicit implementation
public // More comments // Signatures can span lines
void Foo(...)
private void // Attributes
Foo([Description("Foo")] string foo)
#if(DEBUG) // Don't forget the pre-processor
private
#else
public
#endif
int Foo() { }
Notes:
The Split approach will throw away the text that it matches (only text captured by a group is returned, and then as separate array elements), so you will in fact lose the intact "signatures" that you are splitting on.
Don't forget that signatures can have commas in them
{...} can be nested, your current regexp could consume more { than it should
There is a lot of other stuff (preprocessor commands, using statements, properties, comments, enum definitions, attributes) that can show up in code, so just because something is between two method signatures does not make it part of a method body.
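The first note, about Split, is easy to check on a toy input. The sketch below (SplitDemo is a made-up name) splits the same string with and without a capturing group around the delimiter.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Delimiter is matched but not captured: the digits vanish from the result.
    public static string[] WithoutCapture(string input)
    {
        return Regex.Split(input, @"\d");
    }

    // Delimiter is wrapped in a capturing group: the digits come back as
    // separate elements between the surrounding pieces.
    public static string[] WithCapture(string input)
    {
        return Regex.Split(input, @"(\d)");
    }
}
```

SplitDemo.WithoutCapture("a1b2c") yields a, b, c, while SplitDemo.WithCapture("a1b2c") yields a, 1, b, 2, c. Either way, the matched signature does not stay attached to the body that follows it.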

Maybe it is a better approach to use the CSharpCodeProvider.Parse() which can "compile" C# source code into a CompileUnit.
You can then walk through the namespaces, types, classes and methods in that CompileUnit.

PM> install-package ICSharpCode.NRefactory
using ICSharpCode.NRefactory.CSharp;
var parser = new CSharpParser();
var syntaxTree = parser.Parse(File.ReadAllText(sourceFilePath));
var result = syntaxTree.Descendants.OfType<MethodDeclaration>()
.FirstOrDefault(y => y.NameToken.Name == methodName);
if (result != null)
{
return result.ToString(FormattingOptionsFactory.CreateSharpDevelop()).Trim();
}

It is feasible, I guess, to get something working using regexes. However, this requires looking very carefully at the C# language specification and a deep understanding of the C# grammar; it is not a simple problem. I know you've said you want to store the methods as arrays of strings, but presumably there is something beyond that. Reflection has already been suggested; if that does not do what you want, you should consider ANTLR (ANother Tool for Language Recognition). ANTLR does have C# grammars available.
http://www.antlr.org/about.html

No, those access modifiers can also be used for internal classes and fields, among other things. You'd need to write a full C# parser to get it right.
You can do what you want using reflection. Try something like the following:
var methods = typeof (Foo).GetMethods();
foreach (var info in methods)
{
var body = info.GetMethodBody();
}
That probably has what you need for your calculations.
If you need the original C# source code you can't get it with reflection. Don't write your own parser. Use an existing one, listed here.
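As one example of a calculation that works on compiled code only, the sketch below records the IL size in bytes of each method declared on a type. Calculator and MethodBodyStats are hypothetical names used for illustration; overloads share a name here, so a later overload overwrites an earlier one in the dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical sample type to inspect.
public class Calculator
{
    public int Add(int a, int b) { return a + b; }
    private int Twice(int a) { return Add(a, a); }
}

public static class MethodBodyStats
{
    // Maps method name to the length of its IL stream, for methods declared
    // directly on the given type (inherited members are skipped).
    public static Dictionary<string, int> IlSizes(Type type)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.NonPublic
            | BindingFlags.Instance | BindingFlags.Static | BindingFlags.DeclaredOnly;

        var sizes = new Dictionary<string, int>();
        foreach (MethodInfo method in type.GetMethods(flags))
        {
            MethodBody body = method.GetMethodBody();
            if (body == null)
            {
                continue; // abstract or extern methods have no IL body
            }
            byte[] il = body.GetILAsByteArray();
            sizes[method.Name] = il != null ? il.Length : 0;
        }
        return sizes;
    }
}
```

MethodBodyStats.IlSizes(typeof(Calculator)) returns entries for Add and Twice but not for inherited methods such as ToString.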
