When would I actually be required to unit test private methods? - c#

I'm writing unit tests for the implementation of an API I wrote myself in my company's application. I'm still new to this whole thing. When looking for answers on how to unit test certain things, I keep coming across a certain pattern. It goes something like this:
Question:
I have this private method I need to unit test.
Top voted answer:
Don't.
I also came across this article arguing against unit testing private methods as well.
Basically, the way I implement a given API is to write the code first, then write unit tests to "break it the worst way possible" (as my superior puts it). Once I notice something broke, I fix it in the code. To me this seems like a mash-up of OOD and TDD. Is that a legit approach?
The reason I have so many private methods in the first place is that I'm required to break up larger chunks of code into methods. Since these methods are only supposed to be used within the scope of this API implementation, I set them to private. And since the file structure established by my team requires me to write all the code in a single file corresponding to an API, I can't separate these private methods into a new class and make them public.
My superior expects me to test these private methods as well. But I'm beginning to doubt if this is even really necessary if the Asserts on the public methods all run successfully?
From my point of view, if my tests on the public methods return the values I expected, I infer that my private methods also work like I intended.
Or am I missing something?

The core point is: unit tests exist to guarantee that your class under test behaves as expected.
The behavior of your classes manifests itself via those methods that can be called from "outside" of your classes.
Therefore there is neither need nor sense in trying to directly test private methods.
Of course, it is fair to measure coverage while running unit tests; in order to understand which paths in your code are taken. This information can be used to either enhance test cases (to gain more coverage); or to delete production code (which isn't required).
And to address your question directly: you do not use TDD to implement private methods.
You use TDD to create a special form of your "contract" that can be executed automatically. You verify what needs to be done; not how it is actually done in detail. That is especially true since the TDD methodology includes continuous refactoring. You write your tests, you turn them green (by writing production code); and then, at some point, you look into improving the quality of your code. Meaning: you start reworking internal aspects of your class under test. Like: creating more private methods, moving content around; maybe even creating internal-only helper classes and so on. But you keep running your existing tests ... which should still all work; because as said: you write them to check the externally observable behavior (as far as possible).
And beyond that: you should rather look into "fuzzing" the test data that your unit tests drive into your code instead of worrying about private methods.
What I mean: instead of trying to manually find that test data that makes your production code break, look into concepts like QuickCheck that try to do exactly that automatically.
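A minimal sketch of that idea, using FsCheck (a .NET port of QuickCheck). The PayrollApi class and its two methods are hypothetical stand-ins for whatever public surface your API exposes; the property is a round-trip check over randomly generated inputs:

using System.Globalization;
using FsCheck;

// Hypothetical public API under test.
public static class PayrollApi
{
    public static string FormatAmount(decimal amount) =>
        amount.ToString(CultureInfo.InvariantCulture);

    public static decimal ParseAmount(string text) =>
        decimal.Parse(text, CultureInfo.InvariantCulture);
}

public static class PayrollProperties
{
    // FsCheck generates many decimals and hunts for a counterexample.
    public static void AmountRoundTrips() =>
        Prop.ForAll<decimal>(amount =>
                PayrollApi.ParseAmount(PayrollApi.FormatAmount(amount)) == amount)
            .QuickCheckThrowOnFailure();
}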
Final words: if your management keeps hammering on "test private methods", then it is your responsibility as an engineer to convince them that they are wrong about this. And there is plenty of material out there to back that up.

The way you are splitting your code at the moment is out of necessity. You are delegating some work to a private method because, well, other public methods need to re-use it, and you don't want to copy-paste that code. Of course, since these methods don't make sense as standalone methods, you keep them private.
Good, at least you're true to the DRY (Don't Repeat Yourself) principle.
Now, another way to look at it is that you want to separate your private methods from the rest of the code because you want a Separation of Concerns. If you do this, you will see that these private methods, although they can't be used on their own, don't really belong in the class containing your public methods, because they don't solve the same concern. This is the Single Responsibility Principle: the S in SOLID.
Instead of keeping your private method within your class, what you can do is move it to another class (a service, as I call them), inject that class into the one it came from, and call its methods instead of the private ones. A sketch follows.
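Something like this, where FormatHeader plays the role of the former private method (all names are illustrative):

public class Paycheck
{
    public string EmployeeId { get; set; }
}

// The code that used to be a private method is now the public
// concern of a small, separately testable service.
public interface IHeaderFormatter
{
    string FormatHeader(Paycheck pay);
}

public class HeaderFormatter : IHeaderFormatter
{
    public string FormatHeader(Paycheck pay)
    {
        // ...the body of the former private method...
        return "HDR|" + pay.EmployeeId;
    }
}

public class PaycheckApi
{
    private readonly IHeaderFormatter _formatter;

    // The dependency is injected instead of created internally.
    public PaycheckApi(IHeaderFormatter formatter) => _formatter = formatter;

    public string Export(Paycheck pay) => _formatter.FormatHeader(pay);
}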
Why should you do this?
Because it will be so much easier to test: you delegate a big part of the code, that you will not have to test under a big combination of scenarios.
Because you can then inject an alternative implementation (think maintainability: it's easier to replace a brick than a part of a brick)
Because you can delegate the implementation (and the testing) of this service to someone else (you can have 2 developers in parallel working on a very small area of the code)
Sometimes it makes even more sense: if these service classes really take care of one single concern, they will end up re-used by completely different classes that have the same needs.
This last point doesn't always happen, but quite often it does. I have found it easier to re-use existing data services when they are self-documented: properly-named services with properly-named methods (your co-workers will discover them more easily).
Now, you don't need to test a private method... because it's public.
You may think it's cheating, because you just made it public, but this comes from a very legitimate approach: Separation of Concerns.
Final notes:
I am convinced your superior is right to ask you to test this code. One thing he could have added was to do that separation into different classes. Also, make sure you inject these classes using Dependency Injection and Inversion of Control containers. Don't instantiate them using the new keyword; otherwise, you will not be able to assert that the right method was called with the right arguments!
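Continuing the sketch above, this is roughly what that assertion looks like with Moq and MSTest; it only works because PaycheckApi receives its formatter from the outside:

using Microsoft.VisualStudio.TestTools.UnitTesting;
using Moq;

[TestClass]
public class PaycheckApiTests
{
    [TestMethod]
    public void Export_FormatsHeaderForTheGivenPaycheck()
    {
        var formatter = new Mock<IHeaderFormatter>();
        var api = new PaycheckApi(formatter.Object);   // injected, not new'ed inside
        var pay = new Paycheck { EmployeeId = "E42" };

        api.Export(pay);

        // Assert that the right method was called with the right argument.
        formatter.Verify(f => f.FormatHeader(pay), Times.Once());
    }
}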

Related

Why could this particular test be useful?

I recently joined a company where they use TDD, but it's still not clear to me. For example, we have this test:
[TestMethod]
public void ShouldReturnGameIsRunning()
{
    var game = new Mock<IGame>();
    game.Setup(g => g.IsRunning).Returns(true);
    Assert.IsTrue(game.Object.IsRunning);
}
What's the purpose of it? As far as I understand, this is testing nothing! I say that because it's mocking an interface and telling it to return true for IsRunning; it will never return a different value...
I'm probably thinking wrong, as everybody says it's a good practice etc etc... or is this test wrong?
This test is exercising an interface and verifying that the interface exposes a Boolean getter named IsRunning.
This test was probably written before the interface existed, and probably well before any concrete implementations of IGame existed either. I would imagine, if the project is of any maturity, there are other tests that exercise the actual implementations and verify behavior at that level, and that probably use mocking in the more traditional isolationist sense rather than stubbing theoretical shapes.
The value of these kinds of tests is that it forces one to think about how a shape or object is going to be used. Something along the lines of "I don't know exactly what a game is going to do yet, but I do know that I'm going to want to check if it's running...". So this style of testing is more to drive design decisions, as well as validate behavior.
Not the most valuable test on the planet, but serves its purpose in my opinion.
Edit: I was on the train last night when I was typing this, but I wanted to address Andrew Whitaker's comment directly in the answer.
One could argue that the signature in the interface itself is enough of an enforcement of the desired contract. However, this really depends on how you treat your interfaces and ultimately how you validate the system that you are trying to test.
There may be tangible value in explicitly stating this functionality as a test. You are explicitly verifying that this is a desired functionality for an IGame.
Let's take a use case. Say that as a designer you know you want to expose a public getter for IsRunning, so you add it to the interface. But say another developer comes along, sees a property with no usages anywhere else in the code, and assumes it is eligible for deletion (or code-rot removal, which is normally a good thing) without harm. The existence of this test, however, explicitly states that this property should exist.
Without this test, a developer could modify the interface, drop the implementation, and the project would still compile and run seemingly correctly. And it might be much later before someone notices that the interface has changed.
With this test, even if it appears as if no one is using the property, your test suite will fail. The self-documenting name of the failing test, ShouldReturnGameIsRunning(), should at the very least spawn a conversation about whether IGame should really expose an IsRunning getter.
The test is invalid. Moq is a framework used to "mock" objects used by the method under test. If you were testing IsRunning, you might use Moq to provide a certain implementation for the methods that IsRunning depends on, for example to force certain execution paths to run.
People can tell you all day that testing first is better; you won't realize that it actually is until you start doing it yourself and notice that you have fewer bugs in code you wrote tests for first, which saves you and your co-workers time and will probably raise the quality of your code.
It could be that the test has been written (before) the code which is good practice.
It might equally be that this is a pointless test created by a tool.
Another possibility is that the code was written by somebody who, at the time, didn't understand how to use that particular mocking framework - and wrote one or more simple tests to learn. Kent Beck talked about that sort of "learning test" in his original TDD book.
This test could have had a point in the past. It could have been testing a concrete class. It could have been testing an abstract one, as it sometimes makes sense to mock an abstract class that you want to test. I can see how a less informed follower of the TDD way of doing things (like myself) could have left a small mess like this. Cleanup of dead code and dead test code does not really fit into the natural workflow of red, green, refactor (since it doesn't break any tests!).
However, history is not justification for a test that does nothing but confuse the reader. So, be a good boyscout and remove it.
In my honest opinion this is just rubbish; it only proves that Moq works as expected.

Why should I use a mocking framework instead of fakes?

There are some other variations of this question here at SO, but please read the entire question.
By just using fakes, we look at the constructor to see what dependencies a class has, and then create fakes for them accordingly.
Then we write a test for a method by just looking at its contract (the method signature). If we can't figure out how to test the method that way, shouldn't we rather refactor the method (most likely break it up into smaller pieces) than look inside it to figure out how to test it? In other words, the approach doubles as a quality check.
Aren't mocks a bad thing, since they require us to look inside the method we are going to test, and therefore skip the whole "critique the signature" step?
Update to answer the comment
Say a stub then (just a dummy class providing the requested objects).
A framework like Moq makes sure that method A gets called with arguments X and Y. And to be able to set up those checks, one needs to look inside the tested method.
Isn't the important thing (the method contract) forgotten when setting up all those checks, as the focus shifts from the method signature/contract to looking inside the method and creating the checks?
Isn't it better to try to test the method by just looking at the contract? After all, when we use the method we'll just look at the contract. So it's quite important that its contract is easy to follow and understand.
This is a bit of a grey area and I think there is some overlap. On the whole I prefer using mock objects.
I guess some of it depends on how you go about testing code - test or code first?
If you follow a test driven design plan with objects implementing interfaces then you effectively produce a mock object as you go.
Each test treats the tested object / method as a black box.
It focuses you on writing simpler method code, in that you know what answer you want.
But above all else it allows you to have runtime code that uses mock objects for unwritten areas of the code.
On the macro level it also allows for major areas of the code to be switched at runtime to use mock objects e.g. a mock data access layer rather than one with actual database access.
Fakes are just stupid dummy objects. Mocks enable you to verify that the control flow of the unit is correct (e.g. that it calls the correct functions with the expected arguments). Doing so is very often a good way to test things. An example: a saveProject() function probably wants to call something like saveToProject() on the objects to be saved. I consider this a lot better than saving the project to a temporary buffer and then loading it to verify that everything was fine (that tests more than it should: it also verifies that the saveToProject() implementation(s) are correct).
As for mocks vs. stubs, I usually (not always) find that mocks provide clearer tests and (optionally) more fine-grained control over the expectations. Mocks can be too powerful, though: they let you tie a test so closely to the implementation that changing the implementation under test, while leaving the results unchanged, still makes the test fail.
By just looking at a method/function signature you can test only the output, providing some input (stubs that are only able to feed you the needed data). While this is OK in some cases, sometimes you do need to test what's happening inside that method; you need to test whether it behaves correctly.
string readDoc(string name, IFileManager fileManager) { return fileManager.Read(name).ToString(); }
You can directly test returned value here, so stub works just fine.
void saveDoc(Document doc, IFileManager fileManager) { fileManager.Save(doc); }
Here you would much rather test whether the method Save got called with the proper argument (doc). The doc content is not changing, and the fileManager does not output anything. This is because the method under test depends on functionality provided by the interface. And the interface is the contract, so you not only want to test whether your method gives correct results; you also test whether it uses the provided contract in the correct way.
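A minimal MSTest/Moq sketch of that second case. DocumentStore is an invented host for the saveDoc method above, and the types are reduced to the bare minimum:

using Microsoft.VisualStudio.TestTools.UnitTesting;
using Moq;

public class Document { }

public interface IFileManager
{
    Document Read(string name);
    void Save(Document doc);
}

public static class DocumentStore
{
    public static void SaveDoc(Document doc, IFileManager fileManager) =>
        fileManager.Save(doc);
}

[TestClass]
public class SaveDocTests
{
    [TestMethod]
    public void SaveDoc_HandsTheDocumentToTheFileManager()
    {
        var doc = new Document();
        var fileManager = new Mock<IFileManager>();

        DocumentStore.SaveDoc(doc, fileManager.Object);

        // Nothing comes back to assert on, so verify the interaction instead.
        fileManager.Verify(fm => fm.Save(doc), Times.Once());
    }
}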
I see it a little differently. Let me explain my view:
I use a mocking framework. When I try to test a class, to ensure it will work as intended, I have to test all the situations that may happen. When my class under test uses other classes, I have to ensure in certain test situations that a specific exception is raised by a used class, or a certain value is returned, and so on... This is hard to simulate with the real implementations of those classes, so I have to write fakes of them. But I think that when I use fakes, tests are not as easy to understand. In my tests I use the Moq framework and set up the mocks in my test method. When I have to analyse my test method, I can easily see how the mocks are configured and don't have to switch to the code of the fakes to understand the test.
Hope that helps you find your answer...

Is it ok to change method visibility for the sake of unit testing?

Many times I find myself torn between making a method private to prevent someone from calling it in a context that doesn't make sense (or would screw up the internal state of the object involved), or making the method public (or typically internal) in order to expose it to the unit test assembly. I was just wondering what the Stack Overflow community thought of this dilemma?
So I guess the question truly is, is it better to focus on testability or on maintaining proper encapsulation?
Lately I've been leaning towards testability, as most of the code is only going to be leveraged by a small group of developers, but I thought I would see what everyone else thought?
It's NOT OK to change method visibility on methods that customers or users can see. Doing this is ugly, a hack; it exposes methods that any user could try to call and blow up your app... it's a liability you do not need.
You are using C#, yes? Check out the InternalsVisibleTo attribute.
You can declare your testable methods as internal, and allow your unit testing assembly access to your internals.
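A minimal sketch of that setup; the assembly name and the helper are placeholders for your own:

// In the production assembly (AssemblyInfo.cs or any source file):
using System.Runtime.CompilerServices;

[assembly: InternalsVisibleTo("MyCompany.Api.Tests")]   // your test assembly's name

// Invisible outside the assembly, but fully testable from MyCompany.Api.Tests:
internal static class PayrollMath
{
    internal static decimal RoundCents(decimal value) => decimal.Round(value, 2);
}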
It depends on whether the method is part of a public API or not. If a method does not belong to part of a public API, but is called publicly from other types within the same assembly, use internal, friend your unit test assembly, and unit test it.
However, if the method is not/should not be part of a public API, and it is not called by other types internal to the assembly, DO NOT test it directly. It should be protected or private, and it should only be tested indirectly by unit testing your public API. If you write unit tests for non-public (or what should be non-public) members of your types, you are binding test code to internal implementation details.
That's a bad kind of coupling. It increases the number of unit tests you need and increases workload both in the short term (more unit tests) and in the long term (more test maintenance and modification in response to refactoring internal implementation details). Another problem with testing non-public members is that you test code that may not actually be needed or used. A GREAT way to find dead code is when it is not covered by any of your unit tests while your public API is covered 100%. Removing dead code is a great way to keep your code base lean and mean, and it is impossible if you are not careful about what you put into your public API and what parts of your code you unit test.
EDIT:
As a quick additional note... with a properly designed public API, you can very effectively use a tool like Microsoft Pex to automatically generate full-coverage unit tests that exercise every execution path of your code. Combined with a few manually written tests that cover critical behavior, anything not covered can be considered dead code and removed, and you can greatly shortcut your unit testing process.
This is a common thought.
It's generally best to test the private methods by testing the public methods that call them (so you don't explicitly test the private methods). However, I understand that there are times when you really do want to test those private methods.
The answers to this question (Java) and this question (.NET) should be helpful.
To answer the question: no, you shouldn't change method visibility for the sake of testing. You generally shouldn't be testing private methods, and when you do, there are better ways to do it.
In general I agree with @jrista. But, as usual, it depends.
When trying to work with legacy code, the key is to get it under test. After that, you can add tests for new features and existing bugs, refactor to improve design, etc. This is risky without tests. Legacy code tends to be rife with dependencies, and is often extremely difficult to get under test.
In Working Effectively with Legacy Code, Michael Feathers suggests multiple techniques for getting code under test. Many of these techniques involve breaking encapsulation or complicating the design, and the author is up front about this. Once tests are in place, the code can be improved safely.
So for legacy code, do what you have to do.
In .NET you can use accessors for unit testing, rather than the InternalsVisibleTo attribute. Accessors let you get access to any method in the class, even private ones. They even let you test abstract classes, using an empty mock derived object (see the "PrivateObject" class).
Basically, in your test project you use the accessor class rather than the actual class with the methods you want to test. The accessor class is the same as the "real" class, except everything is public to your test project. Visual Studio can generate accessors for you.
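A sketch of the PrivateObject flavour of this; PayrollCalculator and its private FormatHeader method are invented for illustration:

using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class PayrollCalculatorAccessorTests
{
    [TestMethod]
    public void FormatHeader_CanBeCalledThroughAnAccessor()
    {
        var calculator = new PayrollCalculator();      // hypothetical class under test
        var accessor = new PrivateObject(calculator);  // wraps it, exposing non-publics

        // Invokes the private method by name, passing its arguments.
        object result = accessor.Invoke("FormatHeader", "E42");

        Assert.AreEqual("HDR|E42", result);
    }
}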
NEVER make a type more visible to facilitate unit testing.
IMO it is WRONG to say that you should not unit test private methods. Unit tests are of exceptional value for regression testing, and there is no reason why private methods should not be regression tested with granular unit tests.

Unit testing large blocks of code (mappings, translation, etc)

We unit test most of our business logic, but are stuck on how best to test some of our large service tasks and import/export routines. For example, consider the export of payroll data from one system to a 3rd party system. To export the data in the format the company needs, we need to hit ~40 tables, which creates a nightmare situation for creating test data and mocking out dependencies.
For example, consider the following (a subset of ~3500 lines of export code):
public void ExportPaychecks()
{
    var pays = _pays.GetPaysForCurrentDate();
    foreach (PayObject pay in pays)
    {
        WriteHeaderRow(pay);
        if (pay.IsFirstCheck)
        {
            WriteDetailRowType1(pay);
        }
    }
}

private void WriteHeaderRow(PayObject pay)
{
    //do lots more stuff
}

private void WriteDetailRowType1(PayObject pay)
{
    //do lots more stuff
}
We only have the one public method in this particular export class, ExportPaychecks(). That's really the only action that makes any sense to someone calling this class... everything else is private (~80 private functions). We could make them public for testing, but then we'd need to mock them to test each one separately (i.e. you can't test ExportPaychecks in a vacuum without mocking the WriteHeaderRow function). This is a huge pain too.
Since this is a single export, for a single vendor, moving logic into the Domain doesn't make sense. The logic has no domain significance outside of this particular class. As a test, we built out unit tests which had close to 100% code coverage ... but this required an insane amount of test data typed into stub/mock objects, plus over 7000 lines of code due to stubbing/mocking our many dependencies.
As a maker of HRIS software, we have hundreds of exports and imports. Do other companies REALLY unit test this type of thing? If so, are there any shortcuts to make it less painful? I'm half tempted to say "no unit testing the import/export routines" and just implement integration testing later.
Update - thanks for the answers all. One thing I'd love to see is an example, as I'm still not seeing how someone can turn something like a large file export into an easily testable block of code without turning the code into a mess.
This style of (attempted) unit testing, where you try to cover an entire huge code base through a single public method, always reminds me of surgeons, dentists or gynaecologists who have to perform complex operations through small openings. Possible, but not easy.
Encapsulation is an old concept in object-oriented design, but some people take it to such extremes that testability suffers. There's another OO principle called the Open/Closed Principle that fits much better with testability. Encapsulation is still valuable, but not at the expense of extensibility - in fact, testability is really just another word for the Open/Closed Principle.
I'm not saying that you should make your private methods public, but what I am saying is that you should consider refactoring your application into composable parts - many small classes that collaborate instead of one big Transaction Script. You may think it doesn't make much sense to do this for a solution to a single vendor, but right now you are suffering, and this is one way out.
What will often happen when you split up a single method in a complex API is that you also gain a lot of extra flexibility. What started out as a one-off project may turn into a reusable library.
Here are some thoughts on how to perform a refactoring for the problem at hand: Every ETL application must perform at least these three steps:
Extract data from the source
Transform the data
Load the data into the destination
(hence the name ETL). As a start for refactoring, this gives us at least three classes with distinct responsibilities: Extractor, Transformer and Loader. Now, instead of one big class, you have three with more targeted responsibilities. Nothing messy about that, and already a bit more testable. A sketch of those shapes follows.
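For instance, assuming simple in-memory row types (every name here is illustrative):

using System.Collections.Generic;

public class SourceRow { }         // models a row of source data
public class DestinationRow { }    // models a row in the 3rd-party format

public interface IExtractor   { IEnumerable<SourceRow> Extract(); }
public interface ITransformer { IEnumerable<DestinationRow> Transform(IEnumerable<SourceRow> rows); }
public interface ILoader      { void Load(IEnumerable<DestinationRow> rows); }

// The export shrinks to a composition that is trivial to exercise with fakes.
public class PaycheckExport
{
    private readonly IExtractor _extractor;
    private readonly ITransformer _transformer;
    private readonly ILoader _loader;

    public PaycheckExport(IExtractor extractor, ITransformer transformer, ILoader loader)
    {
        _extractor = extractor;
        _transformer = transformer;
        _loader = loader;
    }

    public void Run() => _loader.Load(_transformer.Transform(_extractor.Extract()));
}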
Now zoom in on each of these three areas and see where you can split up responsibilities even more.
At the very least, you will need a good in-memory representation of each 'row' of source data. If the source is a relational database, you may want to use an ORM, but if not, such classes need to be modeled so that they correctly protect the invariants of each row (e.g. if a field is non-nullable, the class should guarantee this by throwing an exception if a null value is attempted). Such classes have a well-defined purpose and can be tested in isolation.
The same holds true for the destination: You need a good object model for that.
If there's advanced application-side filtering going on at the source, you could consider implementing these using the Specification design pattern. Those tend to be very testable as well.
The Transform step is where a lot of the action happens, but now that you have good object models of both source and destination, transformation can be performed by Mappers - again testable classes.
If you have many 'rows' of source and destination data, you can further split this up in Mappers for each logical 'row', etc.
It never needs to become messy, and the added benefit (besides automated testing) is that the object model is now far more flexible. If you ever need to write another ETL application involving one of the two sides, you already have at least one third of the code written.
Something general that came to my mind about refactoring:
Refactoring does not mean you take your 3.5k LOC and divide it into n parts. I would not recommend to make some of your 80 methods public or stuff like this. It's more like vertically slicing your code:
Try to factor out self-standing algorithms and data structures like parsers, renderers, search operations, converters, special-purpose data structures ...
Try to figure out if your data is processed in several steps and can be built into a kind of pipes-and-filters mechanism, or a tiered architecture. Try to find as many layers as possible.
Separate technical (files, database) parts from logical parts.
If you have many of these import/export monsters, see what they have in common, factor those parts out, and reuse them.
Expect in general that your code is too dense, i.e. it contains too many different functionalities next to each other in too few LOC. Visit the different "inventions" in your code and think about whether they are in fact tricky facilities that are worth having their own class(es).
Both LOC and number of classes are likely to increase when you refactor.
Try to make your code real simple ('baby code') inside classes and complex in the relations between the classes.
As a result, you won't have to write unit tests that cover the whole 3.5k LOC at all. Only small fractions of it are covered in a single test, and you'll have many small tests that are independent from each other.
EDIT
Here's a nice list of refactoring patterns. Among those, one shows quite nicely my intention: Decompose Conditional.
In the example, certain expressions are factored out to methods. Not only does the code become easier to read, but you also gain the opportunity to unit test those methods.
Even better, you can lift this pattern to a higher level and factor out those expressions, algorithms, values etc. not only to methods but also to their own classes.
What you should have initially are integration tests. These will test that the functions perform as expected and you could hit the actual database for this.
Once you have that safety net, you can start refactoring the code to be more maintainable and introducing unit tests.
As mentioned by serbrech, Working Effectively with Legacy Code will help you to no end; I would strongly advise reading it even for greenfield projects.
http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052
The main question I would ask is: how often does the code change? If it changes infrequently, is it really worth the effort of introducing unit tests? If it changes frequently, then I would definitely consider cleaning it up a bit.
It sounds like integration tests may be sufficient, especially if these export routines don't change once they're done, or are only used for a limited time. Just get some sample input data with variations, and have a test that verifies the final result is as expected.
A concern with your tests was the amount of fake data you had to create. You may be able to reduce this by creating a shared fixture (http://xunitpatterns.com/Shared%20Fixture.html). For unit tests, the fixture may be an in-memory representation of the business objects to export; for integration tests, it may be the actual databases initialized with known data. The point is that however you generate the shared fixture, it is the same in each test, so creating new tests is just a matter of making minor tweaks to the existing fixture to trigger the code you want to test.
So should you use integration tests? One barrier is how to set up the shared fixture. If you can duplicate the databases somewhere, you could use something like DbUnit to prepare the shared fixture. It might be easier to break the code into pieces (extract, transform, load). Then use the DbUnit-based tests to test the extract and load steps, and use regular unit tests to verify the transform step. If you do that, you don't need DbUnit to set up a shared fixture for the transform step. And if you can break the code into those three steps, at least you can focus your testing efforts on the part that's likely to have bugs or change later.
I have nothing to do with C#, but I have some idea you could try here. If you split your code a bit, you'll notice that what you have is basically a chain of operations performed on sequences.
First one gets pays for current date:
var pays = _pays.GetPaysForCurrentDate();
Second one unconditionally processes the result
foreach (PayObject pay in pays)
{
    WriteHeaderRow(pay);
}
Third one performs conditional processing:
foreach (PayObject pay in pays)
{
    if (pay.IsFirstCheck)
    {
        WriteDetailRowType1(pay);
    }
}
Now, you could make those stages more generic (sorry for pseudocode, I don't know C#):
var all_pays = _pays.GetAll();
var pwcdate = filter_pays(all_pays, current_date()) // filter_pays could also be made more generic, able to filter any sequence
var pwcdate_ann = annotate_with_header_row(pwcdate);
var pwcdate_ann_fc = filter_first_check_only(pwcdate_ann);
var pwcdate_ann_fc_ann = annotate_with_detail_row(pwcdate_ann_fc); // this could be made more generic, able to annotate with arbitrary row passed as parameter
(Etc.)
As you can see, you now have a set of unconnected stages that can be tested separately and then connected together in arbitrary order. Such a connection, or composition, can also be tested separately. And so on (i.e. you can choose what to test). A hedged C# rendering follows.
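Here is roughly what that staging could look like in C# with LINQ. IPayRepository, GetAll() and the PayDate property are assumptions standing in for the question's repository; the point is that every Where-lambda is now a small expression you can extract and test on its own:

using System;
using System.Collections.Generic;
using System.Linq;

public class PayObject
{
    public DateTime PayDate { get; set; }
    public bool IsFirstCheck { get; set; }
}

public interface IPayRepository { IEnumerable<PayObject> GetAll(); }

public class PaycheckExporter
{
    private readonly IPayRepository _pays;

    public PaycheckExporter(IPayRepository pays) => _pays = pays;

    public void ExportPaychecks()
    {
        // Stage 1: the unfiltered source sequence.
        var allPays = _pays.GetAll();

        // Stage 2: a separately testable filter.
        var paysForToday = allPays.Where(p => p.PayDate == DateTime.Today).ToList();

        // Stage 3: unconditional processing.
        foreach (var pay in paysForToday)
            WriteHeaderRow(pay);

        // Stage 4: conditional processing, again a separable filter.
        foreach (var pay in paysForToday.Where(p => p.IsFirstCheck))
            WriteDetailRowType1(pay);
    }

    private void WriteHeaderRow(PayObject pay) { /* ... */ }
    private void WriteDetailRowType1(PayObject pay) { /* ... */ }
}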
This is one of those areas where the concept of mocking everything falls over. Certainly testing each method in isolation would be a "better" way of doing things, but compare the effort of making test versions of all your methods to that of pointing the code at a test database (reset at the start of each test run if necessary).
That is the approach I'm using with code that has a lot of complex interactions between components, and it works well enough. As each test will run more code, you are more likely to need to step through with the debugger to find exactly where something went wrong, but you get the primary benefit of unit tests (knowing that something went wrong) without putting in significant additional effort.
I think Tomasz Zielinski has a piece of the answer. But if you have 3,500 lines of procedural code, then the problem is bigger than that.
Cutting it into more functions will not by itself make it testable. However, it's a first step toward identifying responsibilities that could be extracted into other classes (if you have good names for the methods, that can be obvious in some cases).
I guess a class like this has an incredible list of dependencies to tackle just to be able to instantiate it. It then becomes really hard to create an instance of that class in a test...
Michael Feathers' book "Working Effectively with Legacy Code" answers such questions very well.
The first goal, to be able to test that code well, should be to identify the roles of the class and break it into smaller classes. Of course that's easy to say; the irony is that it's risky to do without tests to secure your modifications...
You say you have only one public method in that class. That should ease the refactoring, as you don't need to worry about callers of all the private methods. Encapsulation is nice, but if you have that much stuff private in one class, it probably means it doesn't belong there, and you should extract different classes from that monster, which you will eventually be able to test. Piece by piece, the design should look cleaner, and you will be able to test more of that big piece of code.
Your best friend, if you start this, will be a refactoring tool; it will help you avoid breaking logic while extracting classes and methods.
Again the book from Michael Feathers seems to be a must read for you :)
http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052
ADDED EXAMPLE:
This example comes from Michael Feathers' book and illustrates your problem well, I think:
RuleParser
    public  evaluate(string)
    private branchingExpression
    private causalExpression
    private variableExpression
    private valueExpression
    private nextTerm()
    private hasMoreTerms()
    public  addVariables()
Obviously, it doesn't make sense here to make the methods nextTerm and hasMoreTerms public. Nobody should see these methods; the way we move to the next item is definitely internal to the class. So how do we test this logic?
Well, if you see that this is a separate responsibility and extract a class (a Tokenizer, for example), these methods suddenly become public within the new class, because that's its purpose! It then becomes easy to test that behaviour... (see the sketch below)
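For instance (the parsing details here are invented; the point is only that the iteration logic becomes a public, testable surface):

public class Tokenizer
{
    private readonly string[] _terms;
    private int _position;

    public Tokenizer(string expression) => _terms = expression.Split(' ');

    public bool HasMoreTerms() => _position < _terms.Length;

    public string NextTerm() => _terms[_position++];
}

RuleParser then holds a Tokenizer privately and delegates to it; nextTerm() and hasMoreTerms() are tested directly on the new class.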
So if you apply that to your huge piece of code, extracting pieces of it into other classes with fewer responsibilities where it feels more natural to make these methods public, you will also be able to test them easily.
You said you are accessing about 40 different tables to map them. Why not break that into classes for each part of the mapping?
It's a bit hard to reason about code I can't read. You may have other issues that prevent you from doing this, but that's my best try at it.
Hope this helps
Good luck :)
I really find it hard to accept that you've got multiple ~3.5 kline data-export functions with no common functionality at all between them. If that's in fact the case, then maybe unit testing is not what you need to be looking at here. If there really is only one thing that each export module does, and it's essentially indivisible, then maybe a snapshot-comparison, data-driven integration test suite is what's called for.
If there are common bits of functionality, then extract each of them out (as separate classes) and test them individually. Those little helper classes will naturally have different public interfaces, which should reduce the problem of private APIs that can't be tested.
You don't give any details about what the actual output formats look like, but if they're generally tabular, fixed-width or delimited text, then you ought at least to be able to split the exporters up into structural and formatting code. By which I mean, instead of your example code up above, you'd have something like:
public void ExportPaychecks(HeaderFormatter h, CheckRowFormatter f)
{
    var pays = _pays.GetPaysForCurrentDate();
    foreach (PayObject pay in pays)
    {
        h.FormatHeader(pay);
        f.WriteDetailRow(pay);
    }
}
The HeaderFormatter and CheckRowFormatter abstract classes would define a common interface for those types of report elements, and the individual concrete subclasses (for the various reports) would contain logic for removing duplicate rows, for example (or whatever a particular vendor requires).
Another way to slice this is to separate data extraction and formatting from each other. Write code that extracts all the records from the various databases into an intermediate representation that's a super-set of the needed representations, then write relatively simple-minded filter routines that convert from the uber-format down to the required format for each vendor.
After thinking about this a little more, I realize you've identified this as an ETL application, but your example seems to combine all three steps together. That suggests that a first step would be to split things up such that all the data is extracted first, then translated, then stored. You can certainly test at least those steps separately.
I maintain some reports similar to what you describe, but not as many of them and with fewer database tables. I use a 3-fold strategy that might scale well enough to be useful to you:
At the method level, I unit test anything I subjectively deem to be 'complicated'. This includes 100% of bug fixes, plus anything that just makes me feel nervous.
At the module level, I unit test the main use cases. As you have encountered, this is fairly painful, since it requires somehow mocking the data. I have accomplished this by abstracting the database interfaces (i.e. no direct SQL connections within my reporting module). For some simple tests I have typed the test data by hand; for others I have written a database interface that records and/or plays back queries, so that I can bootstrap my tests with real data. In other words, I run once in record mode and it not only fetches real data but also saves a snapshot for me in a file; when I run in playback mode, it consults this file instead of the real database tables. (I'm sure there are mocking frameworks that can do this, but since every SQL interaction in my world has the signature Stored Procedure Call -> Recordset, it was quite simple just to write it myself. A sketch of the idea follows this list.)
I'm fortunate to have access to a staging environment with a full copy of production data, so I can perform integration tests with full regression against previous software versions.
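A hedged sketch of that record/playback wrapper; the IReportData interface and the key scheme are invented for illustration, and the snapshot persistence is omitted:

using System.Collections.Generic;
using System.Data;

public interface IReportData
{
    DataTable Execute(string procedureName, params object[] args);
}

public class RecordPlaybackReportData : IReportData
{
    private readonly IReportData _real;                      // null in playback mode
    private readonly Dictionary<string, DataTable> _snapshot;

    public RecordPlaybackReportData(IReportData real, Dictionary<string, DataTable> snapshot)
    {
        _real = real;
        _snapshot = snapshot;
    }

    public DataTable Execute(string procedureName, params object[] args)
    {
        string key = procedureName + "|" + string.Join(",", args);

        if (!_snapshot.ContainsKey(key) && _real != null)
            _snapshot[key] = _real.Execute(procedureName, args);   // record mode

        return _snapshot[key];                                     // playback mode
    }
}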
Have you looked into Moq?
Quote from the site:
Moq (pronounced "Mock-you" or just "Mock") is the only mocking library for .NET developed from scratch to take full advantage of .NET 3.5 (i.e. Linq expression trees) and C# 3.0 features (i.e. lambda expressions) that make it the most productive, type-safe and refactoring-friendly mocking library available.

Unit testing private code [duplicate]

This question already has answers here:
Unit testing private methods in C#
(17 answers)
Closed 6 years ago.
I am currently involved in developing with C# - Here is some background:
We implement MVP with our client application and we have a cyclomatic rule which states that no method should have a cyclomatic complexity greater than 5.
This leads to a lot of small private methods which are generally responsible for one thing.
My question is about unit testing a class:
Testing the private implementation through the public methods is all fine... I don't have a problem implementing this.
But... what about the following cases:
Example 1. Handling the result of an async data retrieval request (the callback method shouldn't be made public purely for testing)
Example 2. An event handler which performs an operation (such as updating a View label's text - silly example, I know...)
Example 3. You are using a third-party framework which allows you to extend it by overriding protected virtual methods (the path from the public methods to these virtual methods is generally treated as black-box programming and has all sorts of dependencies, provided by the framework, that you don't want to know about)
The examples above don't appear to me to be the result of poor design.
They also do not appear to be candidates for moving to a separate class for testing in isolation, as such methods would lose their context.
Does anyone have any thoughts about this?
Cheers,
Jason
EDIT:
I don't think I was clear enough in my original question. I can test private methods using accessors, and mock out calls/methods using TypeMock. That isn't the problem. The problem is testing things which don't need to be public, or can't be public.
I don't want to make code public for the sake of testing as it can introduce security loopholes (only publishing an interface to hide this is not an option because anyone can just cast the object back to its original type and get access to stuff I wouldn't want them to)
Code that gets refactored out to another class for testing is fine, but it can lose context. I've always thought it bad practice to have 'helper' classes that contain a pot of code with no specific context (thinking SRP here). I really don't think this works for event handlers either.
I am happy to be proven wrong - I just am unsure how to test this functionality! I have always been of the mind that if it can break or be changed - test it.
Cheers, Jason
As Chris has stated, it is standard practice to only unit test public methods. This is because, as a consumer of that object, you are only concerned with what is publicly available to you. And, in theory, proper unit tests with edge cases will fully exercise all the private methods they depend on.
That being said, I find there are a few times when writing unit tests directly against private methods can be extremely useful, and the most succinct way of expressing, through your unit tests, some of the more complex scenarios or edge cases that might be encountered.
If that is the case, you can still invoke private methods using reflection.
// requires: using System.Reflection;
MyClass obj = new MyClass();
MethodInfo methodInfo = obj.GetType().GetMethod("MethodName",
    BindingFlags.Instance | BindingFlags.NonPublic);
object result = methodInfo.Invoke(obj, new object[] { "asdf", 1, 2 });
// assert your expected result against the one above
we have a cyclomatic rule which states that no method should have a cyclomatic complexity greater than 5
I like that rule.
The point is that the private methods are implementation details. They are subject to change/refactoring. You want to test the public interface.
If you have private methods with complex logic, consider refactoring them out into a separate class. That can also help keep cyclomatic complexity down. Another option is to make the method internal and use InternalsVisibleTo (mentioned in one of the links in Chris's answer).
The catches tend to come in when you have external dependencies referenced in private methods. In most cases you can use techniques such as Dependency Injection to decouple your classes. For your example with the third-party framework, that might be difficult. I'd try first to refactor the design to separate the third-party dependencies. If that's not possible, consider using Typemock Isolator. I haven't used it, but its key feature is being able to "mock" out private, static, etc. methods.
Classes are black boxes. Test them that way.
EDIT: I'll try to respond to Jason's comment on my answer and the edit to the original question. First, I think SRP pushes towards more classes, not away from them. Yes, Swiss Army helper classes are best avoided. But what about a class designed to handle async operations? Or a data retrieval class? Are these part of the responsibility of the original class, or can they be separated?
For example, say you move this logic to another class (which could be internal). That class implements an Asynchronous Design Pattern that permits the caller to choose whether the method is called synchronously or asynchronously. Unit tests or integration tests are written against the synchronous method. The asynchronous calls use a standard pattern and have low complexity; we don't test those (except in acceptance tests). If the async class is internal, use InternalsVisibleTo to test it.
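A minimal sketch of that shape; Query and Result are placeholders:

using System.Threading.Tasks;

public class Query { }
public class Result { }

public class DataRetriever
{
    // All the interesting logic lives here; unit test this directly.
    public Result Retrieve(Query query)
    {
        // ...retrieval and mapping logic...
        return new Result();
    }

    // Thin, low-complexity async wrapper; left to acceptance tests.
    public Task<Result> RetrieveAsync(Query query) => Task.Run(() => Retrieve(query));
}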
There are really only two cases you need to consider:
the private code is called, directly or indirectly from public code and
the private code is not called from public code.
In the first case, the private code is automatically being tested by the tests which exercise the public code that calls the private code, so there is no need to test the private code. And in the second case, the private code cannot be called at all, therefore it should be deleted, not tested.
Ergo: there is no need to explicitly test the private code.
Note that when you do TDD, it is impossible for untested private code to even exist. Because when you do TDD, the only way private code can appear is via an Extract {Method|Class|...} refactoring from public code. And refactorings are, by definition, behavior-preserving and therefore test-coverage-preserving. And the only way public code can appear is as the result of a failing test. If public code can only appear, already tested, as the result of a failing test, and private code can only appear by being extracted from public code via a behavior-preserving refactoring, it follows that untested private code can never appear.
In all of my unit testing, I've never bothered testing private functions. I typically just tested public functions. This goes along with the Black Box Testing methodology.
You are correct that you really can't test the private functions without exposing them in some way.
If your "seperate class for testing" is in the same assembly, you can choose to use internal instead of private. This exposes the internal methods to your code, but they methods will not be accessible to code not in your assembly.
EDIT: searching SO for this topic I came across this question. The most voted answer is similar to my response.
A few points from a TDD guy who has been banging around in C#:
1) If you program to interfaces then any method of a class that is not in the interface is effectively private. You might find this a better way to promote testability and a better way to use interfaces as well. Test these as public members.
2) Those small helper methods may more properly belong to some other class. Look for feature envy. What may not be reasonable as a private member of the original class (in which you found it) may be a reasonable public method of the class it envies. Test these in the new class as public members.
3) If you examine a number of small private methods, you might find that they have cohesion. They may represent a smaller class of interest separate from the original class. If so, that class can have all public methods, but be either held as a private member of the original class or perhaps created and destroyed in functions. Test these in the new class as public members.
4) You can derive a "Testable" class from the original, in which it is a trivial task to create a new public method that does nothing but call the old, private method. The testable class is part of the test framework, and not part of the production code, so it is cool for it to have special access. Test it in the test framework as if it were public.
All of these make it pretty trivial to have tests on the methods that are currently private helper methods, without messing up the way intellisense works.
There are some great answers here, and I basically agree with the repeated advice to sprout new classes. For your Example 3, however, there's a sneaky, simple technique:
Example 3. You are using a third-party framework which allows you to extend it by overriding protected virtual methods (the path from the public methods to these virtual methods is generally treated as black-box programming and has all sorts of dependencies, provided by the framework, that you don't want to know about)
Let's say MyClass extends FrameworkClass. Have MyTestableClass extend MyClass, and then provide public methods in MyTestableClass that expose the protected methods of MyClass that you need. Not a great practice - it's kind of an enabler for bad design - but useful on occasion, and very simple.
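Something like this, where FrameworkClass stands in for the vendor's base class and OnRender for the protected virtual method you need to reach:

// Stand-in for the third-party base class:
public abstract class FrameworkClass
{
    protected abstract string OnRender(string input);
}

public class MyClass : FrameworkClass
{
    // The logic you actually want to test lives in the protected override.
    protected override string OnRender(string input) => input.ToUpperInvariant();
}

// Lives in the test project, not in production code:
public class MyTestableClass : MyClass
{
    // Test-only passthrough that promotes the protected member to public.
    public string CallOnRender(string input) => OnRender(input);
}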
Would accessor files work? http://msdn.microsoft.com/en-us/library/bb514191.aspx I've never directly worked with them, but I know a coworker used them to test private methods on some Windows Forms.
Several people have responded that private methods shouldn't be tested directly, or that they should be moved to another class. While I agree with this in principle, I've found it's one of those rules that can be broken to save time without negative repercussions. If the function is small and simple, the overhead of creating another class and test class is overkill. In that case I will make these private methods public, but not add them to the interface. This way consumers of the class (who should be getting the interface only through my IoC library) won't accidentally use them, but they're available for testing.
Now in the case of callbacks, this is a great example of where making a private property public can make tests a lot easier to write and maintain. For instance, if class A passes a callback to class B, I'll make that callback a public property of class A. One test for class A uses a stub implementation of B that records the callback passed in. The test then verifies that the callback is passed to B under the appropriate conditions. A separate test for class A can then call the callback directly, verifying it has the appropriate side effects.
I think this approach works great for verifying async behaviors; I've been doing it in some JavaScript tests and some Lua tests. The benefit is that I have two small, simple tests (one that verifies the callback is set up, one that verifies it behaves as expected). If you try to keep the callback private, then the test verifying the callback behavior has a lot more setup to do, and that setup will overlap with behavior that should be in other tests. Bad coupling.
I know, it's not pretty, but I think it works well. A sketch of the idea:
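All names here are invented, and the callback's side effect is reduced to a single field:

using System;

public interface IB
{
    void Begin(Action<int> onDone);
}

public class A
{
    public int LastResult { get; private set; }

    // Public so a test can (1) verify it is handed to B and
    // (2) invoke it directly to check its side effect.
    public Action<int> Callback { get; }

    public A() => Callback = result => LastResult = result;

    public void Start(IB b) => b.Begin(Callback);
}

Test one stubs IB, captures the Action passed to Begin, and asserts it is a.Callback; test two calls a.Callback(42) directly and asserts a.LastResult == 42.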
I will admit that when recently writing unit tests for C#, I discovered that many of the tricks I knew from Java did not really apply (in my case it was testing internal classes).
For example 1, if you can fake/mock the data retrieval handler you can get access to the callback through the fake. (Most other languages I know that use callbacks also tend not to make them private).
For example 2 I would look into firing the event to test the handler.
Example 3 is an example of the Template Pattern which does exist in other languages. I have seen two ways to do this:
Test the entire class anyway (or at least relevant pieces of it). This particularly works in cases where the abstract base class comes with its own tests, or the overall class is not too complex. In Java I might do this if I were writing an extension of AbstractList, for example. This may also be the case if the template pattern was generated by refactoring.
Extend the class again with extra hooks that allow calling the protected methods directly.
Don't test private code, or you'll be sorry later when it's time to refactor. Then, you'll do like Joel and blog about how TDD is too much work because you constantly have to refactor your tests with your code.
There are techniques (mocks, stub) to do proper black box testing. Look them up.
This is a question that comes up pretty early when introducing testing. The best technique for solving this problem is to black-box test (as mentioned above) and follow the single responsibility principle. If each of your classes has only one reason to change, it should be pretty easy to test its behavior without getting at its private methods.
SRP - wikipedia / pdf
This also leads to more robust and adaptable code as the single responsibility principle is really just saying that your class should have high cohesion.
In C# you can use the attribute in AssemblyInfo.cs:
[assembly: InternalsVisibleTo("Worker.Tests")]
Simply mark your formerly private methods as internal, and the test project will still see them. Simple! You get to keep encapsulation AND have testing, without all the TDD nonsense.
