As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
We are trying to use the IDesign C# Coding standard. Unfortunately, I found no comprehensive document to explain all the rules that it gives, and also his book does not always help.
Here are the open questions that remain for me (from chapter 2, Coding Practices):
No. 26: Avoid providing explicit values for enums unless they are integer powers of 2
No. 34: Always explicitly initialize an array of reference types using a for loop
No. 50: Avoid events as interface members
No. 52: Expose interfaces on class hierarchies
No. 73: Do not define method-specific constraints in interfaces
No. 74: Do not define constraints in delegates
Here's what I think about those:
I thought that providing explicit values would be especially useful when adding new enum members at a later point in time. If these members are added between other already existing members, I would provide explicit values to make sure the integer representation of existing members does not change.
No idea why I would want to do this. I'd say this totally depends on the logic of my program.
I see that there is alternative option of providing "Sink interfaces" (simply providing already all "OnXxxHappened" methods), but what is the reason to prefer one over the other?
Unsure what he means here: Could this mean "When implementing an interface explicitly in a non-sealed class, consider providing the implementation in a protected virtual method that can be overridden"? (see Programming .NET Components 2nd Edition, end of chapter “Interfaces and Class Hierarchies”).
I suppose this is about providing a "where" clause when using generics, but why is this bad on an interface?
I suppose this is about providing a "where" clause when using generics, but why is this bad on a delegate?
Ok so I will basically chip in with the little rep I have on stackoverflow: You can all kill me for opening a religious debate.
The hint to working with Juval's code standard is really in it's preface:
"I believe that while fully understanding every insight that goes into a particular programming decision may require reading books and even years of experience, applying the standard should not."
Next hint is to read his work.
Next would be doing what he advices.
Maybe on the way understand why he is recognised as a "microsoft software legend".
The next might be that it is delivered in a read-only pdf.
That didnt satisfy you?
The key here is realizing this is not a religious narrative, it is indeed a pragmatic one.
My personal experience with his standard is pretty much:
If you follow the advices as best you can, you will get a productive outcome.
Your needs of refactoring will be less, you productivity in the later cycles of the project will be higher.
This is an experience based evaluation.
Basically what you are looking for are the answers to "why" certain portions of the coding standard is in there, specifically these: 26, 34, 50, 52, 73, 74
You are asking these questions from a pragmatic perspective: You want to incorporate the standard and you "somehow" know this will give you a benefit in the longer run.
By examining these rules, through experience, you will perhaps come to understand why they are indeed there. Experience is here doing and not doing your code in accordance with the principles.
But really that isnt the point. The point is that by working from the basis of a well considered, mature and reliable standard, your work now becomes work of quality.
On how to follow a standard
Really read the rules as advices, they are already very strictly worded: "Avoid, do not, do and so on" really means what they say. Avoid means "think really hard before breaking the principle". "Do not" really means "do not".
Juval is all about decoupling and most of his harder-to-grok advices really comes from not thinking "separation" into your code design.
Sometimes this is hard to do with a team that works in less abstract terms or in more feature oriented environments, and you can therefore sometimes find it necessary to break rules, "to get the team moving", but you really should refactor this to the standard when you can do so.
After a few years, a few projects more, you perhaps will come to understand the rationale for each and every rule in the simple guide. If you are a bright student. Me, I'm not so bright, so I base it on experience: Stuff that is in the non-standardized parts of the code often fails during integration testing. It is often harder to come back to. It often ties poorly to the environment and so on.
It really is a matter of trust. If you cant find it in yourself to trust this (not adopt it - trust it), standard, I will propose to you that you will find it hard to ever really write comprehensible c# code that is generally of a acceptable quality.
This of course not considering the few really smart and really experienced out there that managed to build up their own set of internalizable rule sets on how to write code that is:
Stable, readable, maintainable, extendable and generally mega-ble.
This not saying that this is the only standard, just that it do indeed work. Also over time.
You will be able to find other standards, most if not all, though sub-standard to this for working with code you "really want to work" in the long run, involving a team of developers and changing real time situations.
So consider your standard, it is really setting the bar for the quality you provide for the $ you earn.
Do not reconsider it, unless you have "real good reasons to".
But do learn from it.
You reached here and you are still not satisfied. You want answers!
Darn it. Ok let me give you my clouded and sub-par experience based view on the rules your team can't grok.
No. 26: Avoid providing explicit values for enums unless they are integer powers of 2
Its about simplicity and working within that simplicity.
Enum's in c# are already numbered from "i=0" and injecting "++i" for every new field you define. The simplest way to [read code with/] think about enums are therefore either you flag them or you enumerate them.
When you dont do this, you are probably doing something wrong.
If you use the enum to "map" some other domain model, this rule can be broken, but the enum should be visibily separated from normal enums through placement/namespace/etc - to indicate you are "doing something not ordinary".
Now look at the enums you have created out-of-standard. Really look at them. Probably they are mapping something that really do not belong in a hard-coded enum at all. You wanted the efficiency of a switch, but you have in actuality now begun to hardcode out-of-domain properties. Sometimes this is ok and can be refactored later, sometimes it is not.
So avoid it.
The argument of "inserting into the middle" in the development process, is not really an issue that should break the standard. Why? Well if you are using or in any way storing the "int value" of the enumeration, then you are already diverging into a usage pattern where you need to stay very focused indeed. Not using 0, 1, 2 is not the answer to problems in that particular domain, i.e you are probably "doing it wrong".
The EF code-first argument is, probably not an issue to circumvent the rule here. If you feel you have such a need, please do describe your situation in a separate thread and I will try to guide you how to both follow the standard and work with your persistance framework.
My initial take would be: If they are code-first poco entities, then let them abide by the code standard, After all your team have decided to work with a code-first view on the data model. Allow them to do just that, following the same semantics as the rest of their code base.
If you run into specific problems related to the mapping to database tables, then solve these problems while maintaining the standard. For EF 4.3 I would say use the get/set with int pattern and replace in 5.0.
The real gritty stuff here is how to maintain this entity as the database evolves. Be sure to provide clear migration paths for your production database when your enum containing entities change design. Be very sure if the entities themselves change values. This go for add/remove/insert of new definitions. Not just add.
No. 34: Always explicitly initialize an array of reference types using a for loop
My guess is this is an "experience based" rule. In such a way that he looked through a lot of code and this point seemed to fail often for those that did not do it.
No. 50: Avoid events as interface members
Basically a question of separation - your interfaces are now coupled "both" ways into the interface and out of it. Read his book for clarification on why that is not "best practice" in the c# world.
In my limited view this is probably argumentation along the lines of: "Call-backs are very different in nature from function-calls. they make assumptions on the state of the receiver at an undefined "later time" of execution. If you want to provide callbacks, by all means do provide them, but make them into separate interfaces and also define an interface to negotiate these callbacks, in effect establishing all the hard things you wanted to abstract away, but really just circumvented. Or probably just use a recognized pattern. aka read the book."
No. 52: Expose interfaces on class hierarchies
Separation. I cant really explain it any further than: The alternative is hard to factor into a larger context/solution.
No. 73: Do not define method-specific constraints in interfaces
Separation. same as #52
No. 74: Do not define constraints in delegates
Separation.
Conclusion
......
Another view on #50 is that it is one of those rules where:
If you dont get it, its probably not important for you.
Importance here is an estimate on how critical the code is - and how critically you want that code to always work as intended.
This again leads into a broader context on how to verify your code actually is quality.
Some would say this can be done with tests only.
They however fail to understand the interconnectedness between defensively writing quality software and aggressively trying to prove it wrong.
In the real world, those two should be parts of a greater totality with the actual requirements of the software being the third.
Doing a quick mock-up? who cares. Doing a little visual thingy that you understand the runtime constraints of? who cares - well probably you should, but perhaps acceptable. and so on.
In the end compliance to a coding standard is not something you take on lightly.
There are reasons for doing it that goes beyond just the code:
Collaboration and communication primarily.
Agreement on shared views.
Assertions of assumptions.
etc.
No. 26: Power of two means you want to use the enum as a bitmask (flags). Thats the only reason to specify enum values. For adding new members later on, you can still append them to the enum definition without changing existing values. No reason to put them between existing members.
No. 34: I think he wants to avoid the situation where you have an array wich contains (partially) uninitialized pointers (null references). As the consumer of an array its tempting
not to check for null entries in a valid array variable.
No. 26: He's quite wrong, at least for public enums. The problem crops up when you remove an item, not when you add one (for which adding to the end of the list would be equivalent to adding an item with the next available value).
For the others, I'm really not sure why he's making these recommendations, although I have to admit that I find a few (or maybe most) of them rather dubious...
Related
I need to provide a copy of the source code to a third party, but given it's a nifty extensible framework that could be easily repurposed, I'd rather provide a less OO version (a 'procedural' version for want of a better term) that would allow minor tweaks to values etc but not reimplementation using the full flexibility of how it is currently structured.
The code makes use of the usual stuff: classes, constructors, etc. Is there a tool or method for 'simplifying' this into what is still the 'source' but using only plain variables etc.
For example, if I had a class instance 'myclass' which initialised this.blah in the constructor, the same could be done with a variable called myclass_blah which would then be manipulated in a more 'flat' way. I realise some things like polymorphism would probably not be possible in such a situation. Perhaps an obfuscator, set to a 'super mild' setting would achieve it?
Thanks
My experience with nifty extensible frameworks has been that most shops have their own nifty extensible frameworks (usually more than one) and are not likely to steal them from vendor-provided source code. If you are under obligation to provide source code (due to some business relationship), then, at least in my mind, there's an ethical obligation to provide the actual source code, in a maintainable form. How you protect the source code is a legal matter and I can't offer legal advice, but really you should be including some license with your release and dealing with clients who are not going to outright steal your IP (assuming it's actually yours under the terms you're developing it.)
As had already been said, if this is a requirement based on restrictions of contracts then don't do it. In short, providing a version of the source that differs from what they're actually running becomes a liability and I doubt that it is one that your company should be willing to take. Proving that the code provided matches the code they are running is simple. This is also true if you're trying to avoid license restrictions of libraries your application uses (e.g. GPL).
If that isn't the case then why not provide a limited version of your extensibility framework that only works with internal types and statically compile any required extensions in your application? This will allow the application to continue to function as what they currently run while remaining maintainable without giving up your sacred framework. I've never done it myself but this sounds like something ILMerge could help with.
If you don't want to give out framework - just don't. Provide only source you think is required. Otherwise most likely you'll need to either support both versions in the future OR never work/interact with these people (and people they know) again.
Don't forget that non-obfuscated .Net assemblies have IL in easily de-compilable form. It is often easier to use ILSpy/Reflector to read someone else code than looking at sources.
If the reason to provide code is some sort of inspection (even simply looking at the code) you'd better have semi-decent code. I would seriously consider throwing away tool if its code looks written in FORTRAN-style using C# ( http://www.nikhef.nl/~templon/fortran/fortran_style ).
Side note: I believe "nifty extensible frameworks" are one of the roots of "not invented here" syndrome - I'd be more worried about comments on the framework (like "this code is ##### because it does not use YYY pattern and spacing is wrong") than reuse.
I was reading some entries in Eric Lippert's blog about immutable data structures and I got to thinking, why doesn't C# have this built into the standard libraries? It seems strange for something with obvious reuse to not be already implemented out of the box.
EDIT: I feel I might be misunderstood on my question. I'm not asking how to implement one in C#, I'm asking why some of the basic data structures (Stack, Queue, etc.) aren't already available as immutable variants.
It does now.
.NET just shipped their first immutable collections, which I suggest you try out.
Any framework, language, or combination thereof that is not a purely experimental exercise has a market. Some purely experimental ones go on to develop a market.
In this sense, "market" does not necessarily refer to market economics, it's as true whether the producers of the framework/language/both are commercially or non-commercially oriented and distributing the framework/language/both (I'm just going to say "framework" for now on) at a cost or for free. Indeed, free-as-in-both-beer-and-speech projects can be even more heavily dependent on their markets than commercial projects in this way, because their producers are a subset of their market. The market is anyone and everyone who uses it.
The nature of this market will affect the framework in several ways both by organic processes (some parts prove more popular than others and get a bigger share of the mindspace within the community that educates its own members about them) and by fiat (someone decides a feature will serve the market and therefore adds it).
When .NET was developed, it was developed to serve its future market. Ideas about what would serve them therefore influenced decisions as to what should and should not be included in the FCL, the runtime, and the languages that work with it.
Someone decided that we'd quite likely need System.Collections.ArrayList. Someone decided we'd quite likely need System.IO.DirectoryInfo to have a Delete() method. Nobody decided we'd be likely to need a System.Collections.ImmutableStack.
Maybe nobody thought of it at all. Maybe someone did and even implemented it and then it was decided not to be of enough general use. Maybe it was debated at length within Microsoft. I've never worked for MS, so I don't have a clue.
What I can consider though, is the question as to what the people who were using the .NET framework in 2002 using in 2001.
Well, COM, ActiveX, ("Classic") ASP, and VB6 & VBScript is now much less used than it was, so it can be said to have been replaced by .NET. Indeed, that could be said to have been an intention.
As well as VB6 & VBScript, a considerable number who were writing in C++ and Java with Windows as a sole or primary target platform are now at least partly using .NET instead. Again, I think that could be said to be an intention, or at the very least I don't think MS were surprised that it went that way.
In COM we had an enumerator-object based foreach approach to iteration that had direct support in some languages (the VB family*), and .NET we have an enumerator-object based foreach approach to iteration that has direct support in some languages (C#, VB.NET, and others)†.
In C++ we had a rich set of collection types from the STL, and in .NET we have a rich set of collection types from the FCL (and typesafe generic types from .NET2.0 on).
In Java we had a strong everything-is-an-object style of OOP with a small set of methods provided by a common base-type and a boxing mechanism to allow for simple efficient primitives while keeping to this style. In .NET we have a strong everything-is-an-object style of OOP with a small set of methods provided by a common base-type and a (different) boxing mechanism to allow for simple efficient primitives while keeping to this style.
These cases show choices that are unsurprising considering who was likely to end up being the market for .NET (though such broad statements above shouldn't be read in a way that underestimates the amount of work and subtlety of issues within each of them). Another thing that relates to this is when .NET differs from COM or classic VB or C++ or Java, there may well be a bit of an explanation given in the documentation. When .NET differs from Haskell or Lisp, nobody feels the need to point it out!
Now, of course there are things done differently in .NET than to any of the above (or there'd have been no point and we could have stayed with COM etc.)
However, my point is that out of the near-infinite range of possible things that could end up in a framework like .NET, there are some complete no-brainers ("they might need some sort of string type..."), some close-to-obvious ("this is really easy to do in COM, so it must be easy to do in .NET"), some harder calls ("this will be more complicated than in VB6, but the benefits are worth it"), some improvements ("locale support could really be made a lot easier for developers, so lets build a new approach to the old issue") and some that were less related to the above.
At the other extreme, we can probably all imagine something that would be so out there as to be bizarre ("hey, all coders like Conway's Life - let's put a Conway's Life right into the framework") and hence there's no surprise at not finding it supported.
So far I've quickly brushed over a lot of hard work and difficult design balances in a way that makes the design they came up with seem simpler than it no doubt was. Most likely, the more "obvious" it seems to an outsider after the fact, the more difficult it was for the designers.
Immutable collection types falls into the large range of possible components to the FCL that while not as bizarre as a built-in-conway-support idea, was not as strongly called for by examining the market as a mutable list or a way to encapsulate locale information nicely. It would have been novel to much of the initial market, and therefore at risk of ending up not being used. In an alternate universe there's a .NET1.0 with immutable collections, but it's not very surprising that there wasn't one here.
*At least for consuming. Producing IEnumVARIANT implementations in VB6 wasn't simple, and could involve writing pointer values straight into v-tables in a rather nasty way that it suddenly occurs to me, is possibly not even allowed by today's DEP.
†With a sometimes impossible to implement .Reset() method. Is there any reason for this other than it was in IEnumVARIANT? Was it even ever much used in IEnumVARIANT?
I'll quote from that Eric Lippert blog that you've been reading:
because no one ever designed, specified, implemented, tested, documented and shipped that feature.
In other words, there is no reason other than it hasn't been high enough value or priority to get done ahead of all the other things they're working on.
Why can't you make an immutable struct? Consider the following:
public struct Coordinate
{
public int X
{
get { return _x; }
}
private int _x;
public int Y
{
get { return _y; }
}
private int _y;
public Coordinate(int x, int y)
{
_x = x;
_y = y;
}
}
It's an immutable value type.
It's hard to work with immutable data structures unless you have some functional programming constructs. Suppose you wanted to create an immutable vector containing every other capital letter. How would you do it unless you
A) had functions that did things like range(65, 91), filter(only even) and map(int -> char) to create the sequence in one shot and then turn it into an array
B) created the vector as mutable, added the characters in a loop, then then "froze" it, making it immutable?
By the way, C# does have the B option to some extent -- ReadOnlyCollection can wrap a mutable collection and prevent people from mutating it. However, it's a pain in the ass to do that all the time (and obviously it's hard to support sharing structure between copies when you don't know if something is going to become immutable or not.) A is a better option.
Remember, when C# 1.0 existed, it didn't have anonymous functions, it didn't have language support for generators or other laziness, it didn't have any functional APIs like LINQ -- not even map or filter -- it didn't have concise array initialization syntax (you couldn't write new int[] { 1, 2, 5 }) and it didn't have generic types; just putting stuff into and getting stuff out of collections normally was a pain. So I don't think it would have been a great choice to spend time on making robust immutable collections with such poor language support for using them.
It would be nice if .net had some really solid support for immutable data holders (classes and structures). One difficulty with adding really good support for such things, though, is that taking maximum advantage of mutable and immutable data structures would require some fundamental changes to the way inheritance works. While I would like to see such support in the next major object-oriented framework, I don't know that it can be efficiently worked into existing frameworks like .net or Java.
To see the problem, imagine that there are two basic data types: basicItem and deluxeItem (which is a basicItem with a few extra fields added). Each can exist in two concrete forms: mutable and immutable. Each can also be described in an abstract form: readable. Thus, there should be six data types; all but ReadableBasicItem should be substitutable for at least one other:
ReadableBasicItem: Not substitutable for anything
MutableBasicItem: ReadableBasicItem
ImmutableBasicItem: ReadableBasicItem
ReadableDeluxeItem: ReadableBasicItem
MutableDeluxeItem: ReadableDeluxeItem, MutableBasicItem (also ReadableBasicItem)
ImmutableDeluxeItem: ReadableDeluxeItem, ImmutableBasicItem (also ReadableBasicItem)
Even thought the underlying data type has just one base and one derived type, there inheritance graph has two "diamonds" since both "MutableDeluxeItem" and "ImmutableDeluxeItem" have two parents (MutableBasicItem and ReadableDeluxeItem), both of which inherit from ReadableBasicItem. Existing class architectures cannot effectively deal with that. Note that it wouldn't be necessary to support generalized multiple inheritance; merely to allow some specific variants such as those above (which, despite having "diamonds" in the inheritance graph, have an internal structure such that both ReadableDeluxeItem and MutableBasicItem would inherit from "the same" ReadableBasicItem).
Also, while support for that style of inheritance of mutable and immutable types might be nice, the biggest payoff wouldn't happen unless the system had a means of distinguishing heap-stored objects that should expose value semantics from those which should expose reference semantics, could distinguish mutable objects from immutable ones, and could allow objects to start out in an "uncommitted" state (neither mutable nor guaranteed immutable). Copying a reference to a heap object with mutable value semantics should perform a memberwise clone on that object and any nested objects with mutable value semantics, except in cases where the original reference would be guaranteed destroyed; the clones should start life as uncommitted, but be CompareExchange'd to mutable or immutable as needed.
Adding framework support for such features would allow copy-on-write value semantics to be implemented much more efficiently than would be possible without framework support, but such support would really have to be built into the framework from the ground up. I don't think it could very well be overlaid onto an existing framework.
So yeah, the question basically says it all. What do you gain when you ensure that private members / methods / whatever are marked private (or protected, or public, or internal, etc) appropriately?
I mean, of course I could just go and mark all my methods as public and everything should still work fine. Of course, if we'd talk about good programming practice (which I am a solid advocate of, by the way ), I'd mark a method as private if it should be marked as such, no questions asked.
But let's set aside good programming practice, and just look at this in terms of actual quantitative gain. What do I get for proper scoping of my methods, members, classes, etc.?
I'm thinking that this would most generally translate to performance gains, but I'd appreciate it if someone could provide more detail about it.
(For purposes of this question, I'm thinking more along C#.NET, but hey, feel free to provide answers on whatever language / framework you deem fit.)
EDIT: Most pointed out that this doesn't lead to performance gain, and yeah, thinking back, I don't even know why I thought that. Lack of coffee probably.
In any case, any good programmer should know about how proper scopes (1) help your code maintenance / (2) control the proper use of your library / app / package; I was kinda curious as to whether or not there was any other benefit you get from it that's not apparently obvious outright. Based on the answers below, it looks like it basically sums up to just those two things most importantly.
Performance has absolutely nothing to do with the visibility of methods. Virtual methods have some overhead, but that's not why we scope. It has to do with maintenance of code. Your public methods are the API to your class or library. You as a class designer want to provide some guarantee to the outside world that future changes aren't going to break other peoples code. By marking some methods private, you take away the ability for users to depend on certain implementations which allows you freedom to change that implementation at will.
Even languages that don't have visibility modifiers, like python, have conventions for marking methods as internal and subject to change. By prefixing the method with an _underscore(), you're signalling to the outside world that if you use that method, you do so at your own risk, as it can change at any time.
On the other hand, public methods are an explicit entry way into your code. Every effort should go towards making public methods backward compatible to avoid the pitfalls I described above.
By better encapsulation, you provide a better API. Only methods / properties that are of interest of the user of your class are available : visible.
Next to that, you ensure that certain variables that should not be called / modified, cannot be called/modified.
That's the most important thing. Why do you think this would lead to performance gains ?
As I see you gain two important features from proper scoping. You API is reduced in size and clearly focused on the task at hand.
Second, you get a less brittle implementation as you are free to change implementation details without altering the exposed API.
I cannot see how accessibility modifiers would affect performance in any way.
There are mainly two types of methods/properties.
That are helpful to perform a task to whoever consumes it. (Recommended Scope: Public)
That are helpful to the above methods to get their task done. (Recommended Scope: Private or Protected)
Type 1 methods are the only methods that any client code requires and does not need any other method. This avoids confusion, keeps things simple and prevents client code to do something wrong.
Type 2 methods are methods into which Type 1 methods are divided. They help Type 1 methods to complete their task and still allow them to be simple, concise, less complex and more readable. They are not really needed for client code but just the class/module itself.
A fair example would be of a car. What you have is a gas pedal, brakes, gearbox, etc. You don't have an interface to minor details for what is under the hood. That is for the mechanic.
In C# programming, it helps to make sure that your API/classes/methods/members are "easy to use correctly and difficult to use incorrectly".
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 6 years ago.
Improve this question
I currently support an application at work that was originally written by a team of four but has now reduced to just me. We recently got a contractor in to look at some performance issues while I'm occupied with other things.
While the contractor has appeared to do a good job with the performance, they have also gone through large amounts of the code replacing the pre-existing style with their personal preference.
Unfortunately we don't have a coding standards doc, just a general rule to adhere to c# general rules.
As an example of what they've done, it includes:
removing nearly all the uses of the 'var' keyword
Anywhere with an if statement and a single line, they've added curly braces
Removing most of the lambdas and replacing it with more verbose code
Changing method signatures so every parameter is on a separate line rather than one line
We also operate a TDD policy but the test coverage, especially on the performance specific parts, is very low leaving very little documentation on what they've changed and making it even harder as their checkin comments aren't particularly helpful and the actual functional changes are lost amongst the swathe of 'tweaks'.
How do I talk to the contractor about this? Obviously there's not much impetus on them to change it given they have no responsibility to support the project and they don't seem particularly receptive to change.
Or should I just put up with it for the short duration of the contract then change everything back to the code formatting we used before?
Made community-wiki 'cos there's probably not one right answer here.
Anywhere with an if statement and a single line, they've added curly braces
This one and the only one may be beneficial.
removing nearly all the uses of the 'var' keyword
Removing most of the lambdas and replacing it with more verbose code
Changing method signatures so every parameter is on a separate line rather than one line
These ones make little sense to change.
Tell him he's not authorized to restyle code. You won't be paying for the time wasted for these activities and they'll have to use their own time to put things back. That should provide a refreshment.
These things should be discussed in advance. You should state clearly what activities are allowed and what not. Not a long ago there was another similar question here about a contractor who would put his initials all over the code including database entities. It was some perverse kind of self-promotion for which there is no place in someone else's code.
P.S. There may also be a possibility that by doing all these things your contractor is artificially creating extra workload to bill you more hours.
I'm a contractor (sometimes) and if I did this I would expect to be shown the door with great speed and no payment. Seriously, this person is hired by you and should be doing exactly what he is told to do, no more and no less. And don't worry about being "nice" - contractors don't expect that from permies.
How do I talk to the contractor about this?
Politely: explain why you want to minimize changes to the source code.
ALternatively, have a code inspection of the changes before check-in: and don't allow check-in of changes that you don't understand/don't want/haven't been tested.
Implement FxCop - this should be your first line of defense. Also if you use source control (if you don't then implement one ASAP), make sure to use dev labelling (only build on file that have been labelled for the build), and don't give him rights to move labels on the files. This way you can scrutinize his changes, and refuse to dev label his code until it meets your standards. Whatever he codes won't make it into QA until you move the dev label to the revision in question, so he's pretty much at your mercy there. Note that some shops don't use a single label for their sandbox builds, they like to apply new labels even to the sandbox, so you may be inclined to do that as well.
The problem has happened now, and as the other said it's an unjustifiable waste of your money and it's outright impolite (as correct as the curly braces thing may be).
Certainly to help prevent future problems, and maybe helpful to resolve this, I'd advise you set up a stylecop implementation - at the very least they can't fail to be unaware of when they are breaking your rules.
I think we all know the temptation of seeing coded we think is "not the way I'd do it". But we resist it.
I would have a chat about it with your boss first to get their take on it. But the first thing that springs to mind is that unless you specifically asked the contractor to do the work, he was not doing what he was hired to do, regardless of any benefit he thinks he may have been adding. So there needs to be a discussion about that.
The next thing that sprung to mind is that regardless of how good they may be or well intentioned, people who make bulk changes without discussing it with the owners of the code are bad news. They will piss people off, or worse introduce bugs and unforeseen behavior that you will have to clean up. He needs to be set straight that doing this sort of thing without permission on other peoples code is not acceptable.
When I see things I don't like in others code which are serious enough to warrant attention, I check with the owners of the code first. Even if there are obvious bugs, it
s their code and their decision about cleaning it up, not mine.
As others have said, these changes are simply for coding style. If he is there to improve performance, he is wasting time with these changes. If he can't cite how these changes will improve performance, then his OCD is just running up the bill.
I would say, "I appreciate your changes to the coding style, but lets focus on non-style changes to areas of the code that are causing the slowdown."
If a contractor did wholesale reformatting of code without authorization, I'd give him one and only one change to put things back the way they were -- and on his own time.
In addition to the valid points others make, consider the version-control nightmare this causes. Instead of the clean progression of a few lines added here, a few lines changed here, you now have this "rift" in your source control database, so that any comparisons between versions before and after this contractor's "improvements" will be meaningless.
Have the contractor back out all of his changes. Today. And on his own time.
This is quite common my experience, that people can't resist making 'improvements' and suddenly you find you're billed for stuff you didn't want. Sometimes I'm sure it's done deliberately to get more paying work, but mostly I think it's a developer getting side-tracked and unable to deal with leaving 'wrong' code.
It might require a bit of a battle, but you basically have to keep reiterating "don't change anything you're not asked to work on". Depending on his personality, you might just have to ask once nicely, or get someone higher to force him.
First, as others have said. You are paying the bill. He is not an employee. His job is to do what you ask him to do, and only what you ask him to do, otherwise you can show him the door. Always remember this. You are driving the boat, not him. You can try to not pay him, but that will be hard to do if you have a legal contract and there is nothing in it about leaving code as-is. But, you can simply let him go at any time.
Second, if you can't get him to stop and revert, and you can't get rid of him, you can tell him that if he plans to do style changes, then he should do all style changes in one check-in with absolutely NO code changes. This allows you to move forward from a base set of code that can be diffed to see code changes.
Third, make him explain the justification for the changes he's made. Removing var has no performance benefit.
Fourth, and this may suck a great deal, but youc an always use ReSharper to put the code back to your accepted style after the fact. It's more work, and you still have borked diffs, but oh well. The lambdas are harder, and that's the one you should really get on his case about.
Fifth, to drive home your point, force him to back out every change he's made and re-implement only the code changes, and not the style changes. That should open his eyes as to the mess he's created when he can't figure it out himself.
Finally, you may just have the bite the bullet and PAY him to revet back. Yes, it sucks, but since you made the mistake of not policing him, not specifying up front what you wanted, and what he's not allowed to do... You will pay the ultimate price. You can either pay him to do it, pay someone else to do it, pay you to do it, or live with it (and pay the price of the borked diffs). Any way you cut it, it will cost you money.
Well, smells like a solution wide code reformatting to me, that could be automated/enforced by settings in a tool like Resharper. I would think it very impolite and would ask him to refrain from pressing the "Reformat all code according to my personal taste" button.
To avoid the situation happening in the first place, introduce code review, particularly for any new developers joining who may not know your standards.
I'm a big fan of using git, feature branches and a service that supports pull requests (github or bitbucket). TFS isn't really up to the job, but thankfully Visual Studio supports git now. Doing code review before merging to master ensures it doesn't get forgotton. If you're paranoid you don't even need to give contractors write access to your primary repository.
Alternate point of view:
Your make two statements: "While the contractor has appeared to do a good job with the performance" and "they have also gone through large amounts of the code replacing the pre-existing style with their personal preference."
This raises many questions such as: Whenever you can "drop in" a contractor for a short period of time and gain performance enhancements. This indicates that there must have been very major flaws in the application in the first place. Anytime you need to bring in a contractor to "fix performance" this is a sign of very poorly written code or a very complex problem that requires high end expertise.
Next: When you complain that they have changed the code style even though you did not have any stated code style are you just making a pointless argument about your mojo being better than someone else's mojo. Maybe you should ask the person why they made changes which appear syntactical such that you have a complete picture.
I'm looking at the long list of one sided answers on this post and wondering what happened to the other side. Folks take the emotion out of it and look at it objectively. It's often amazing how many people will look past a beautiful algorithm solution to a complex problem just to notice that the variable naming convention has been altered from camel case to pascal case. I generally put this type of reaction down to justification of self worth by finding immaterial flaws.
Key question I have to ask is: Does the newly formatted code make the application any less readable. If you had budget constraints why did you not make it explicit that you wanted very specific fixes and nothing else. If you wanted to maintain a specific coding style then why not have that explicitly stated?
The project I'm working on has just hit 4200 lines in the main C# file, which is causing IntelliSense to take a few seconds (sometimes up to 6 or so) to respond, during which Visual Studio locks up. I'm wondering how everyone else splits their files and whether there's a consensus.
I tried to look for some guides and found Google's C++ guide, but I couldn't see anything about semantics such as function sizes and file sizes; maybe it's there - I haven't looked at it for a while.
So how do you split your files? Do you group your methods by the functions they serve? By types (event handlers, private/public)? And at what size limit do you split functions?
To clarify, the application in question handles data - so its interface is a big-ass grid, and everything revolves around the grid. It has a few dialogs forms for management, but it's all about the data. The reason why it's so big is that there is a lot of error checking, event handling, and also the grid set up as master-detail with three more grids for each row (but these load on master row expanded). I hope this helps to clarify what I'm on about.
I think your problem is summed up with the term you use: "Main C# file".
Unless you mean main (as in the method main()) there is no place for that concept.
If you have a catch-all utility class or other common methods you should break them into similar functional parts.
Typically my files are just one-to-one mappings of classes.
Sometimes classes that are very related are in the same file.
If your file is too large it is an indication your class is too big and too general.
I try to keep my methods to half a screen or less. (When it is code I write from scratch it is usually 12 lines or fewer, but lately I have been working in existing code from other developers and having to refactor 100 line functions...)
Sometimes it is a screen, but that is getting very large.
EDIT:
To address your size limit question about functions - for me it is less about size (though that is a good indicator of a problem) and more about doing only one thing and keeping each one SIMPLE.
In the classic book "Structured Programming" Dijkstra once wrote a section entitled: "On our inability to do much." His point was simple. Humans aren't very smart. We can't juggle more than a few concepts in our minds at one time.
It is very important to keep your classes and methods small. When a method gets above a dozen lines or so, it should be broken apart. When a class gets above a couple of hundred lines, it should be broken apart. This is the only way to keep code well organized and manageable. I've been programming for nearly 40 years, and with every year that has gone by, I realize just how important the word "small" is when writing software.
As to how you do this, this is a very large topic that has been written about many different times. It's all about dependency management, information hiding, and object-oriented design in general. Here is a reading list.
Clean Code
SOLID
Agile Principles, Patterns, and Pratices in C#
Split your types where it's natural to split them - but watch out for types that are doing too much. At about 500 lines (of Java or C#) I get concerned. At about 1000 lines I start looking hard at whether the type should be split up... but sometimes it just can't/shouldn't be.
As for methods: I don't like it when I can't see the whole method on the screen at a time. Obviously that depends on size of monitor etc, but it's a reasonable rule of thumb. I prefer them to be shorter though. Again, there are exceptions - some logic is really hard to disentangle, particularly if there are lots of local variables which don't naturally want to be encapsulated together.
Sometimes it makes sense for a single type to have a lot of methods - such as System.Linq.Enumerable but partial classes can help in such cases, if you can break the type up into logical groups (in the case of Enumerable, grouping by aggregation / set operations / filtering etc would seem natural). Such cases are rare in my experience though.
Martin Fowler's book Refactoring I think gives you a good starting point for this. It instructs on how to identify "code smells" and how to refactor your code to fix these "smells." The natural result (although it's not the primary goal) is that you end up with smaller more maintainable classes.
EDIT
In light of your edit, I have always insisted that good coding practice for back-end code is the same in the presentation tier. Some very useful patterns to consider for UI refactorings are Command, Strategy, Specification, and State.
In brief, your view should only have code in it to handle events and assign values. All logic should be separated into another class. Once you do this, you'll find that it becomes more obvious where you can refactor. Grids make this a little more difficult because they make it too easy to split your presentation state between the presentation logic and the view, but with some work, you can put in indirection to minimize the pain caused by this.
Don't code procedurally, and you won't end up with 4,200 lines in one file.
In C# it's a good idea to adhere to some SOLID object-oriented design principles. Every class should have one and only one reason to change. The main method should simply launch the starting point for the application (and configure your dependency injection container, if you're using something like StructureMap).
I generally don't have files with more than 200 lines of code, and I prefer them if they're under 100.
There are no hard and fast rules, but there's a general agreement that more, shorter functions are better than a single big function, and more smaller classes are better than 1 big class.
Functions bigger than 40 lines or so should make you consider how you can break it up. Especially look at nested loops, which are confusing and often easy to translate to function calls with nice descriptive names.
I break up classes when I feel like they do more than 1 thing, like mix presentation and logic. A big class is less of a problem than a big method, as long as the class does 1 thing.
The consensus in style guides I've seen is to group methods by access, with constructors and public methods on the top. Anything consistent is great.
You should read up on C# style and refactoring to really understand the issues you're addressing.
Refactoring is an excellent book that has tips for rewriting code so that behavior is preserved but the code is more clear and easier to work with.
Elements of C# Style is a good dead tree C# style guide, and this blog post has a number of links to good online style guides.
Finally, consider using FxCop and StyleCop. These won't help with the questions you asked, but can detect other stylistic issues with your code. Since you've dipped your toe in the water you might as well jump in.
That's a lot, but developing taste, style and clarity is a major difference between good developers and bad ones.
Each class should do one small thing and do it well. Is your class a Form? Then it should not have ANY business logic in it.
Does it represent a single concept, like a user or a state? Then it shouldn't have any drawing, load/save, etc...
Every programmer goes through stages and levels. You're recognizing a problem with your current level and you are ready to approach the next.
From what you said, it sounds like your current level is "Solving a problem", most likely using procedural code, and you need to start to look more at new ways to approach it.
I recommend looking into how to really do OO design. There are many theories that you've probably heard that don't make sense. The reason they don't is that they don't apply to the way you are currently programming.
Lemme find a good post... Look through these to start:
how-do-i-break-my-procedural-coding-habits
are-there-any-rules-for-oop
object-oriented-best-practices-inheritance-v-composition-v-interfaces
There are also posts that will refer you to good OO design books. A "Refactoring" book is probably one of the very best places you could start.
You're at a good point right now, but you wouldn't believe how far you have to go. I hope you're excited about it because some of this stuff in your near future is some of the best programming "Learning Experiences" you'll ever have.
Good luck.
You can look for small things to change and change each slowly over time.
Are all the methods used in that class only? Look for support methods, such as validation, string manipulation, that can be moved out into helper/util classes.
Are you using any #region sections? Logical groupings of related methods in a #region often lend themselves to being split into separate classes.
Is the class a form? Consider using User Controls for form controls or groups of form controls.
Sometimes large classes evolve over time due to lots of developers doing quick fixes / new features without considering the overall design. Revisit some of the design theory links others have provided here and consider on-going support to enforce these such as code reviews and team workshops to review design.
Well, I'm afraid to say that you may have a bigger issue at hand than a slow load time. You're going to hit issues of tightly coupled code and maintainability/readability problems.
There are very good reasons to split class files into smaller files (and equally good reasons to move files to different projects/assemblies).
Think about what the purpose that your class is supposed to achieve. Each file should really only have a single purpose. If it's too generalized in its goal, for example, "Contain Shopping Basket Logic", then you're headed down the wrong path.
Also, as mentioned, the term you use: "Main C# file" just reeks that you have a very procedural mindset. My advise would be to stop, step back, and have a quick read up on some of the following topics:
General OOP principles
Domain-driven design
Unit testing
IoC Containers
Good luck with your searches.
Use Partial classes. You can basically break a single class into multiple files.
Perhaps the OP can respond: is your project using Object-Oriented Programming? The fact that you use the word "file" suggests that it is not.
Until you understand object orientation, there is no hope for improving your code in any important way. You'd do better to not split up the file at all, wait until it grows to be unbearably slow and buggy, then instead of bearing it any more, go learn OO.
The Intellisense parser in Visual Studio 2008 seems to be considerably faster than the one 2005 (I know they specifically did a lot of work in this area), so although you should definitely look into splitting the file up at some point as others have mentioned, Visual Studio 2008 may resolve your immediate performance problem. I've used it to open a 100K+ line Linq to SQL file without much issue.
Split the code so that each class/file/function/etc. does only One Thing™. The Single Responsibility Principle is a good guideline for splitting functionality into classes.
Nowadays, the largest classes that I write are about 200 lines long, and the methods are mostly 1-10 lines long.
If you have regions of code within a class, a simple method is to use the partial keyword and breakout that definition of the class into that file. I typically do this for large classes.
The convention I use is to have the ClassName_RegionName.cs. For example, if I want to break out a class that manages the schema for a database connection, and I called the class DatabaseConnection I would create a file called DatabaseConnection.cs for the main class and then DatabaseConnection_Schema.cs for the schema functionality.
Some classes just have to be large. That's not bad design; they are just implementation-heavy.