I am trying to rewrite an extremely ugly class in one application at work. In one of our classes, there are hundreds of lines of code that handle initialization and re-initialization of some classes. Currently this is done in an awful brute-force way: you write your init code and manually copy it into the re-init part (as the two are very similar).
Because of this, I started to rewrite it into a list of delegates which are then called with a parameter (bool isReinit) in both places. Then I noticed that most of the delegates are also identical, as the initialization process of 90 percent of the classes is identical. This means I should be able to create some default initialization function and simplify the code drastically. Currently I have created something like this:
https://dotnetfiddle.net/RVS5UT
I also created a class CustomInitializer, which implements IInitializer, takes a single Func as a parameter and runs it in Initialize, for the cases where the initialization differs a lot.
Now, this is a simplified and anonymized piece of the working code, and it works. The problem is that the whole approach is very awkward and the constructor signature is ugly as hell. Is there some way to simplify this? I can't find any pattern or approach that would help me. Any step towards better code is welcome, and maybe I am just missing something.
There is also another catch. One solution I figured out would be to store the property pairs (var1a + var1b, var2a + var2b, ...) in an object and pass it directly to the Initialize method. But this would mean moving the properties, which is sadly not possible at the moment, because the file has over 18k lines and code reviewers would kill me for changing a third of them to refactor one method (even if it's a long one). I need to leave the target properties (var1a, var1b, var2a, ...) where they are now. This could also mean that there is no elegant way to solve this.
I am using .NET 4.0, C# 5.0
EDIT: I have no access to the initialized types (another stupid catch)
Thanks for your help.
the file has over 18k lines
Wow, looks like a lot of fun.
It is absolutely good to try to improve it. And believe me, whatever your co-workers may think, there is nothing to do here but refactor, unless this code never needs to evolve.
But it seems to me you are heading down the path of complexity, trying to be DRY instead of trying to be expressive. The idea of having StandardInitializer and CustomInitializer manage lambdas is extremely complex. The initialization logic of a class should live in the class responsible for it. If some behaviors are really shared, they may share a base class or a collaborating class.
I recommend this discussion on Working Effectively With Legacy Code. As you'll see, and probably already know, the first key point is to have tests.
Please don't try to refactor such a class without a test harness. Otherwise you'll introduce regressions, you'll be frustrated, and your co-workers will be confirmed in their belief that nothing can be done here without breaking everything.
And don't forget: if tests are hard to create, it's because of bad code, not because tests are expensive. Bad code is expensive.
Once some tests protect you, try to think in terms of responsibility and life cycle. For example, in a WPF application it is a common issue to have "initializable" ViewModels, because they do some async web service calls to initialize themselves.
In this case, the object responsible for the lifecycle of a given ViewModel is also responsible for initializing it. If it manages several initializable ViewModels, then this kind of code is fine:
foreach (var initializable in initializables)
{
    initializable.Initialize();
}
But please, whatever solution you choose, keep a clear separation between Initialize and Reinitialize (if they have things in common, have them call a shared internal function). It is a very bad idea to write stuff like:
init.Initialize(true);
It clearly states that the behavior of your Initialize function will change depending on a boolean value. If you have two behaviors, you should have two functions with clear names.
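To illustrate, here is a minimal sketch of that separation (the class and method names are made up, not taken from the question's code):

public class ConnectionManager
{
    // Two clearly named entry points instead of Initialize(bool isReinit).
    public void Initialize()
    {
        SetUpCore();
    }

    public void Reinitialize()
    {
        TearDownPreviousState();  // the re-init-only part
        SetUpCore();              // the genuinely shared part
    }

    private void SetUpCore()
    {
        // common initialization shared by both paths...
    }

    private void TearDownPreviousState()
    {
        // cleanup that only makes sense when re-initializing...
    }
}

Each public method reads as one behavior, and the shared code is still written only once.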
Related
I'm writing unit tests for the implementation of an API I wrote myself in my company's application. I'm still new to this whole thing. When looking for answers on how to unit test certain things, I come across a certain pattern. It goes something like this:
Question:
I have this private method I need to unit test.
Top voted answer:
Don't.
I also came across this article arguing against unit testing private methods as well.
Basically, the way I implement a given API is that I write the code first, then write unit tests to "break it the worst way possible" (as my superior puts it). Once I notice something broke, I fix it in the code. To me this seems like a mash-up of OOD and TDD. Is that a legit approach?
The reason I have so many private methods in the first place is that I'm required to break larger chunks of code up into methods. Since these methods are only supposed to be used within the scope of this API implementation, I set them to private. And since the file structure established by my team requires me to write all the code in a single file corresponding to one API, I can't separate these private methods into a new class and make them public.
My superior expects me to test these private methods as well. But I'm beginning to doubt whether this is even necessary if the asserts on the public methods all run successfully.
From my point of view, if my tests on the public methods return the values I expected, I infer that my private methods also work like I intended.
Or am I missing something?
The core point is: unit tests exist to guarantee that your class under tests behaves as expected.
The behavior of your classes manifests itself via those methods that can be called from "outside" of your classes.
Therefore there is neither need nor sense in trying to directly test private methods.
Of course, it is fair to measure coverage while running unit tests, in order to understand which paths in your code are taken. This information can be used either to enhance the test cases (to gain more coverage) or to delete production code (code that turns out not to be needed).
And to align with your question: you do not use TDD to implement private methods.
You use TDD to create a special form of your "contract" that can be executed automatically. You verify what needs to be done, not how it is done in detail. That is especially true since the TDD methodology includes continuous refactoring. You write your tests, you turn them green (by writing production code), and then, at some point, you look into improving the quality of your code. Meaning: you start reworking internal aspects of your class under test, like creating more private methods, moving content around, maybe even creating internal-only helper classes, and so on. But you keep running your existing tests, which should all still pass, because, as said, you wrote them to check the externally observable behavior (as far as possible).
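For instance, here is a sketch of what that looks like in practice (a hypothetical class with NUnit tests, purely illustrative):

using NUnit.Framework;

public class PriceCalculator
{
    public decimal Total(decimal net, string countryCode)
    {
        return net + Vat(net, countryCode);
    }

    // Private detail: covered indirectly through Total, never called by a test.
    private decimal Vat(decimal net, string countryCode)
    {
        return countryCode == "DE" ? net * 0.19m : 0m;
    }
}

[TestFixture]
public class PriceCalculatorTests
{
    [Test]
    public void Total_AddsGermanVat()
    {
        Assert.AreEqual(119m, new PriceCalculator().Total(100m, "DE"));
    }

    [Test]
    public void Total_AddsNoVatForUnknownCountry()
    {
        Assert.AreEqual(100m, new PriceCalculator().Total(100m, "XX"));
    }
}

You can later rename, split, or inline Vat without touching either test.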
And beyond that: rather than worrying about private methods, you should look into "fuzzing" the test data that your unit tests drive into your code.
What I mean: instead of trying to manually find the test data that makes your production code break, look into concepts like QuickCheck, which try to do exactly that automatically.
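In .NET, FsCheck is the usual QuickCheck port; a minimal property-based test might look like this (the reverse-a-list property is just a stock example, and the exact API can differ between FsCheck versions):

using System.Linq;
using FsCheck; // assumed: the FsCheck NuGet package

public static class ListProperties
{
    public static void Run()
    {
        // Property: reversing a sequence twice yields the original sequence.
        // FsCheck generates ~100 random arrays and shrinks any failing input.
        Check.Quick(Prop.ForAll<int[]>(xs =>
            xs.Reverse().Reverse().SequenceEqual(xs)));
    }
}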
Final words: if your management keeps hammering on "test private methods", then it is your responsibility as an engineer to convince them that they are wrong about this. And there is plenty of material out there to back that up.
The way you are splitting your code at the moment is out of necessity. You are delegating some work to a private method because other public methods need to re-use it and you don't want to copy-paste that code. Of course, since these methods don't make sense as standalone methods, you keep them private.
Good, at least you're true to the DRY (Don't Repeat Yourself) principle.
Now, another way to look at it is that you want to separate your private methods from the rest of the code because you want a Separation of Concerns. If you do this, you will see that these private methods, although they can't be used on their own, don't really belong in the class containing your public methods, because they don't address the same concern. This is the Single Responsibility Principle: the S in SOLID.
Instead of keeping the private method within your class, move it to another class (a service, as I call them), inject that service into the class the method used to live in, and call the service's methods instead of the private ones.
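A hedged sketch of that move (all names are hypothetical): the former private method becomes the one public method of a small service, which is injected where it used to live:

public interface IDiscountCalculator
{
    decimal Apply(decimal price);
}

// The former private method, promoted to a class with one concern.
public class DiscountCalculator : IDiscountCalculator
{
    public decimal Apply(decimal price)
    {
        return price * 0.9m;
    }
}

public class OrderProcessor
{
    private readonly IDiscountCalculator discounts;

    // Injected, so tests can pass a fake and assert how it was called.
    public OrderProcessor(IDiscountCalculator discounts)
    {
        this.discounts = discounts;
    }

    public decimal Process(decimal price)
    {
        return this.discounts.Apply(price);
    }
}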
Why should you do this?
Because it will be so much easier to test: you delegate a big part of the code, that you will not have to test under a big combination of scenarios.
Because you can then inject an alternative implementation (think maintainability: it's easier to replace a brick than a part of a brick)
Because you can delegate the implementation (and the testing) of this service to someone else (you can have 2 developers in parallel working on a very small area of the code)
Sometimes, it makes even more sense, because these service classes will then be re-used by other completely different classes that will have the same needs, if they really take care of one single concern.
This last point doesn't always apply, but quite often it does. I have found that it is easier to re-use existing data services when they are self-documenting: properly-named services with properly-named methods (your co-workers will discover them more easily).
Now, you don't need to test a private method... because it's public.
You may think it's cheating, because you just made it public, but this comes from a very legitimate approach: Separation of Concerns.
Final notes:
I am convinced your superior is right to ask you to test this code. One thing he could have added is to do that separation into different classes. Also, make sure you inject these classes using Dependency Injection and an Inversion of Control container; don't instantiate them with the new statement, otherwise you will not be able to assert that the right method was called with the right arguments!
There's a lot of code like this in the company's application I'm working on:
var something = new Lazy<ISomething>(() =>
    (ISomething)SomethingFactory
        .GetSomething<ISomething>(args));
ISomething sth = something.Value;
From my understanding of Lazy this is totally meaningless, but I'm new at the company and I don't want to argue without reason.
So: does this code make any sense?
Code that is being actively developed is never static, so one possibility is that they code it this way in case they need to move the assignment somewhere else later on. However, it sounds as if this occurs within a method, and I would normally expect lazy initialization for class fields or properties, where it makes more sense (because you may not know which method in the class will use the value first).
Unfortunately, it could just as likely be a lack of knowledge of how the Lazy feature works in C# (or lazy initialization in general); maybe they are just trying to use the latest "cool feature" they found out about.
I have seen weird or odd things proliferate in code at a company, simply because people saw it coded one way, and then just copied it, because they thought the original person knew what they were doing and it made sense. The best thing to do is to ask why it was done that way. Worst case, you'll learn something about your company's procedures or coding practices. Best case, you may wind up educating them if they say "gee, I don't know".
Well, in this case it is meaningless, of course, because you are getting the value right after creating the object, but maybe this is done to follow a standard or something like that.
At my company we do similar things, registering the objects in the Unity container and calling Unity to create the instance just after registering it.
Unless they are using the value multiple times in the method, it seems pretty useless, and slightly less efficient than just performing the action immediately. Otherwise, Lazy<T> goes through the Value getter, checks whether the value has been materialized yet, and performs a Func call. Useful for deferred loading, but pointless if the value is used once in a method, immediately.
Lazy<T>, however, is usually really helpful for properties on a class.
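For example, here is a sketch of the kind of property where Lazy<T> pulls its weight (the names, including ReportData, are invented for illustration):

using System;

public class ReportViewModel
{
    // The expensive value is computed only on first access to Data,
    // and every later access returns the same cached instance.
    private readonly Lazy<ReportData> data =
        new Lazy<ReportData>(() => ReportData.LoadFromDatabase());

    public ReportData Data
    {
        get { return this.data.Value; }
    }
}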
It can be useful if the Lazy.Value is going to be moved out of the method in the future, but even then it can be considered over-engineering, and it is not the best implementation, as the Lazy declaration should apparently have been extracted to a property in that case.
So, in short: yes, it's useless.
Coming from a .NET/C# Background and having solid exposure to PRISM, I really like the idea of having a CompositionContainer to get just this one instance of a class whenever it is needed.
As this instance is also globally accessible through the ServiceLocator, this pretty much amounts to the Singleton pattern.
Now, my current project is in C++, and I'm at the point of deciding how to manage plugins (external DLL loading and stuff like that) for the program.
In C# I'd create a PluginService, export it as shared, and channel everything through that one instance (its members would basically amount to one list holding the plugins and a bunch of methods). In C++, obviously, I don't have a CompositionContainer or a ServiceLocator.
I could probably implement a basic version of this, but whatever I imagine involves using singletons or global variables. The general advice about those, though, seems to be: DON'T EVER DO GLOBALS, AND MUCH LESS SINGLETONS.
So what am I to do?
(and what I'm also interested in: is Microsoft here giving us a bad example of how to code, or is this an actual case of where singletons are the right choice?)
There's really no difference between C# and C++ in terms of whether globals and singletons are "good" or "bad".
The solution you outline is equally bad (or good) in both C# and C++.
What you seem to have discovered is simply that different people have different opinions. Some C# developers like to use singletons for something like this. And some C++ programmers feel the same way.
Some C++ programmers think a singleton is a terrible idea, and... some C# programmers feel the same way. :)
Microsoft has given many bad examples of how to code. Never ever accept their sample code as "good practices" just because it says Microsoft on the box. What matters is the code, not the name behind it.
Now, my main beef with singletons is not the global aspect of them.
Like most people, I generally dislike and distrust globals, but I won't say they should never be used. There are situations where it's just more convenient to make something globally accessible. They're not common (and I think most people still overuse globals), but they exist.
But the real problem with singletons is that they enforce an unnecessary and often harmful constraint on your code: they prevent you from creating multiple instances of an object, as though you, when you write the class, know how it's going to be used better than the actual user does.
When you write a class, say a PluginService as you mentioned in a comment, you certainly have some idea of how you plan for it to be used. You probably think "an instance of it should be globally accessible" (which is debatable, because many classes should not access the PluginService, but let's assume we do want it to be global for now). And you probably think "I can't imagine why I'd want to have two instances".
But the problem is when you take this assumption and actively prevent the creation of two instances.
What if, two months from now, you find a need for creating two PluginServices? If you'd taken the easy route when you wrote the class, and had not built unnecessary constraints into it, then you could also take the easy route now, and simply create two instances.
But if you took the difficult path of writing extra code to prevent multiple instances from being created, then you now again have to take the difficult path: now you have to go back and change your class.
Don't build limitations into your code unless you have a reason: if a constraint makes your job easier, go ahead and add it. And if it prevents harmful misuse of the class, go ahead and add it.
But in the singleton case it does neither of those: you create extra work for yourself, in order to prevent uses that might be perfectly legitimate.
You may be interested in reading this blog post I wrote to answer the question of singletons.
But to answer the specific question of how to handle your specific situation, I would recommend one of two approaches:
the "purist" approach would be to create a ServiceLocator which is not global. Pass it to those who need to locate services. In my experience, you'll probably find that this is much easier than it sounds. You tend to find out that it's not actually needed in as many different places as you thought it'd be. And it gives you a motivation to decouple the code, to minimize dependencies, to ensure that only those who really have a genuine need for the ServiceLocator get access to it. That's healthy.
or there's the pragmatic approach: create a single global instance of the ServiceLocator. Anyone who needs it can use it, and there's never any doubt about how to find it -- it's global, after all. But don't make it a singleton. Let it be possible to create other instances. If you never need to create another instance, then simply don't do it. But this leaves the door open so that if you do end up needing another instance, you can create it.
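A minimal sketch of that pragmatic option, written in C# to match the rest of this page (the C++ version is the same idea: an ordinary public constructor plus one global instance, rather than an enforced singleton):

using System;
using System.Collections.Generic;

public class ServiceLocator
{
    // A conventional, globally reachable default instance...
    public static readonly ServiceLocator Default = new ServiceLocator();

    // ...but no private constructor: anyone may create more instances.
    private readonly Dictionary<Type, object> services =
        new Dictionary<Type, object>();

    public void Register<T>(T service)
    {
        this.services[typeof(T)] = service;
    }

    public T Resolve<T>()
    {
        return (T)this.services[typeof(T)];
    }
}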
There are many situations where you end up needing multiple instances of a class that you thought would only ever need one instance. Configuration/settings objects, loggers or wrappers around some piece of hardware are all things people often call out as "this should obviously be a singleton, it makes no sense to have multiple instances", and in each of these cases, they're wrong. There are many cases where you want multiple instances of just such classes.
But the most universally applicable scenario is simply: testing.
You want to ensure that your ServiceLocator works. So you want to test it.
If it's a singleton, that's really hard to do. A good test should run in a pristine, isolated environment, unaffected by previous tests. But a singleton lives for the duration of the application, so if you have multiple tests of the ServiceLocator, they'll all run on the same "dirty" instance, and each test might affect the state seen by the next test.
Instead, the tests should each create a new, clean ServiceLocator, so they can control exactly which state it is in. And to do that, you need to be able to create instances of the class.
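Continuing the sketch above, each test then gets its own pristine instance (NUnit syntax, illustrative):

using NUnit.Framework;

[TestFixture]
public class ServiceLocatorTests
{
    [Test]
    public void Resolve_ReturnsWhatWasRegistered()
    {
        var locator = new ServiceLocator();   // fresh, isolated state
        locator.Register<string>("hello");
        Assert.AreEqual("hello", locator.Resolve<string>());
    }
}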
So don't make it a singleton. :)
There's absolutely nothing wrong with singletons when they're appropriate. I have my doubts concerning CompositionContainer (but I'm not sure I understand what it is actually supposed to do), but ServiceLocator is the sort of thing that will generally be a singleton in any well designed application. Having two or more ServiceLocators will result in the program not functioning as it should (because a service will be registered in one of them, and you'll be looking it up in another); enforcing this programmatically is positive, at least if you favor robust programming. In addition, in C++, the singleton idiom is used to control the order of initialization; unless you make ServiceLocator a singleton, you can't use it in the constructor of any object with static lifetime.
While there is a small group of very vocal anti-singleton fanatics, within the larger C++ community you'll find that the consensus favors singletons in certain very restricted cases. They're easily abused (but then, so are templates, dynamic allocation and polymorphism), but they do solve one particular problem very nicely, and it would be silly to forgo them for some arbitrary dogmatic reason when they're the best solution for the problem.
I'm writing an XNA engine, and I am storing all of the models in a List. To be able to use this throughout the engine, I've made it a public static List<Model> so I can access it from any new class that I develop. It certainly makes obtaining the list of models really easy, but is this the right usage? Or would I be better off passing the list around as a method parameter?
In OOP it's generally advisable to avoid static methods and properties unless you have a very good reason to use them. One of the reasons is that in the future you may want two or more instances of this list for some reason, and then you'll be stuck with static calls.
Static methods and properties are too rigid. As Stevey states it:
Static methods are as flexible as granite. Every time you use one, you're casting part of your program in concrete. Just make sure you don't have your foot jammed in there as you're watching it harden. Someday you will be amazed that, by gosh, you really DO need another implementation of that dang PrintSpooler class, and it should have been an interface, a factory, and a set of implementation classes. D'oh!
For game development I advocate "Doing The Simplest Thing That Could Possibly Work". That includes using global variables (public static in C#), if that is an easy solution. You can always turn it into something more formal later. The "find all references" tool in Visual Studio makes this really easy.
That being said, there are very few cases where a global variable is actually the "correct" way to do something. So if you are going to use it, you should be aware of and understand the correct solution, so you can make the best tradeoff between "being lazy" and "writing good code".
If you are going to make something global, you need to fully understand why you are doing so.
In this particular case, it sounds like you're trying to get at content. You should be aware that ContentManager will automatically return the same content object if you ask for it multiple times. So rather than loading models into a global list, consider making your Game class's built-in ContentManager available via a public static property on your Game class.
Or, better still, there's the method that I prefer: I explain it in the answer to another question. Basically you make the content references private static in the classes that use them and pass the ContentManager into public static LoadContent functions. This compartmentalises your use of static to individual classes, rather than using a global that is accessed from all over your program (which would be difficult to extricate later). It also correctly handles loading content at the correct time.
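A rough sketch of that pattern (XNA 4.0 style; the Enemy class and the asset name are made up):

using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Content;
using Microsoft.Xna.Framework.Graphics;

public class Enemy
{
    // Shared by all Enemy instances, but visible only inside this class.
    private static Model model;

    // Called once from your Game's LoadContent, before any Enemy is drawn.
    public static void LoadContent(ContentManager content)
    {
        model = content.Load<Model>("Models/enemy");
    }

    public void Draw(Matrix world, Matrix view, Matrix projection)
    {
        model.Draw(world, view, projection);
    }
}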
I'd avoid using static as much as possible; over time you'll just end up with spaghetti code.
If you pass the list in through the constructor instead, you eliminate an unnecessary hidden dependency; low coupling is good, and the fewer dependencies there are, the better.
I would suggest implementing a Singleton object which encapsulates the model list.
Have a look at the MSDN singleton implementation.
This is a matter of balance and trade-offs.
Of course, OOP purists will say to avoid such global variables at all costs, since they break code compartmentalization by introducing something that goes "out of the box" for any module, thus making the code hard to maintain, change, debug, etc.
However, my personal experience has been that it should be avoided only if you are part of a very large enterprise solutions team, maintaining a very large enterprise-class application.
For other cases, encapsulating globally-accessible data in a "global" object (or a static object, same thing) simplifies OOP coding to a great extent.
You may get the middle ground by writing a global GetModels() function that returns the list of models, or by using DI to automatically inject the list of models.
In the past I have used a few different methods for doing dirty checking on my entities. I have been entertaining the idea of using AOP to accomplish this on a new project. This would require me to add an attribute on every property in my classes where I want to invoke the dirty-flag logic when the property is set. If I have to add an extra line of code to each property for the attribute anyway, what is the benefit over just calling a SetDirty() method in the setters? I guess I am asking what the advantage, if any, of the AOP approach would be.
I'd say that not only is there no advantage in this case: there's a bit of a disadvantage. You're using the same number of lines of code whether you call dirty() or use AOP, but just calling dirty() is simpler and clearer as far as intent goes.
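For comparison, here is the straightforward version being defended (an illustrative entity, not code from the question):

public class Customer
{
    private string name;

    public bool IsDirty { get; private set; }

    public string Name
    {
        get { return this.name; }
        set
        {
            this.name = value;
            SetDirty();   // one explicit line; an AOP attribute costs a line too
        }
    }

    private void SetDirty()
    {
        IsDirty = true;
    }
}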
AOP, honestly, is a bit oversold, I think. It adds another level of indirection, in terms of reading the code, that often it doesn't pay back.
The key thing to think about here is: does it help the next person reading this code (which may be you a few months down the road) understand more quickly and clearly what I'm trying to do? If you have trouble figuring out what's better about the less straightforward approach, you probably shouldn't be using it. (And I say this as a Haskell programmer, which means I'm far from averse to non-straightforward approaches myself.)
The advantage is that should you decide to change the implementation of how to invoke the dirty flag logic, you'll only need to make one change (in the AOP method's body), not N changes (replacing all your SetDirty calls with something else).
I don't see any benefit if you have to decorate your entities with an attribute, especially if all you're doing is calling a single method. If the logic were more complex, then I could make an argument for using AOP.
If, let's say, each time you modify a property you wanted to track that change as a version, that might be complex enough behavior that injecting it, abstracted out of the property, could be beneficial. At the same time, you would probably want to version changes to several properties at once, so again I come back to there not being much value.
The use of AOP is for cross-cutting concerns. This means you want a feature such as logging, security, etc. whose logic does not really belong in your class. This could apply to the dirty-flag logic, as the domain object should not care that it has been changed; that is up to your DirtyLogicUtility, or whatever name it has.
For example, say you want to log every time a method gets called: you could place this logic in every function, but later on you might want to change it so that only every other call is logged.
AOP keeps your classes clean doing what they are supposed to do while leaving the other pieces alone.
Some AOP implementations, specifically PostSharp, allow you to apply the attribute at an Assembly level with wildcards as to which classes it applies to.
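As a rough sketch of what that can look like with PostSharp (treat the details as assumptions: the exact aspect API and multicast options vary between PostSharp versions):

using System;
using PostSharp.Aspects;

// Applied once at assembly level to every type under the given namespace,
// instead of decorating each property by hand (wildcard targeting is a
// PostSharp multicast feature).
[assembly: MyApp.Aspects.DirtyTracking(AttributeTargetTypes = "MyApp.Entities.*")]

namespace MyApp.Aspects
{
    // Hypothetical interface the tracked entities are assumed to implement.
    public interface ITrackDirty
    {
        void SetDirty();
    }

    [Serializable]
    public sealed class DirtyTrackingAttribute : LocationInterceptionAspect
    {
        // Runs around every intercepted property setter.
        public override void OnSetValue(LocationInterceptionArgs args)
        {
            args.ProceedSetValue();

            var entity = args.Instance as ITrackDirty;
            if (entity != null)
            {
                entity.SetDirty();
            }
        }
    }
}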
Why do you want the dirty check to be the responsibility of the entities? You can manage this somewhere else; the pattern is called Unit of Work.
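A hedged sketch of the idea (names invented): the entity stays oblivious, and the unit of work tracks what changed:

using System.Collections.Generic;

public class UnitOfWork
{
    private readonly HashSet<object> dirty = new HashSet<object>();

    // Callers (or a repository layer) report changes here,
    // so entities never carry an IsDirty flag themselves.
    public void RegisterDirty(object entity)
    {
        this.dirty.Add(entity);
    }

    public void Commit()
    {
        foreach (var entity in this.dirty)
        {
            // persist the entity...
        }
        this.dirty.Clear();
    }
}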