I'm trying to write a unit test for a class that generates distinct strings. My initial reaction was the following:
public void GeneratedStringsShouldBeDistinct()
{
    UniqueStringCreator stringCreator = new UniqueStringCreator();
    HashSet<string> generatedStrings = new HashSet<string>();
    string str;
    for (int i = 0; i < 10000; i++)
    {
        str = stringCreator.GetNext();
        if (!generatedStrings.Add(str))
        {
            Assert.Fail("Generated {0} twice", str);
        }
    }
}
I liked this approach because I knew the underlying algorithm wasn't using any randomness, so I'm not in a situation where the test might fail one time but succeed the next - though that implementation could be swapped out underneath by someone in the future. OTOH, testing any randomized algorithm would cause that type of test inconsistency, so why not do it this way?
Should I just get 2 elements out and check distinctness (using a 0/1/Many philosophy)?
Any other opinions or suggestions?
I would keep using your approach; it's probably the most reliable option.
By the way, you don't need the if statement:
Assert.IsTrue(generatedStrings.Add(str), "Generated {0} twice", str);
If I wanted to test code that relied on random input, I would try to stub out the random generation (say, ITestableRandomGenerator) so that it could be mocked for testing. You can then inject different 'random' sequences that appropriately trigger the different execution pathways of your code under test and guarantee the necessary code coverage.
The particular test you've shown is basically a black box test, as you're just generating outputs and verifying that it works for at least N cycles. Since the code does not have any inputs, this is a reasonable test, though it might be better if you know what different conditions may impact your algorithm so that you can test those particular inputs. This may mean somehow running the algorithm with different 'seed' values, choosing the seeds so that it will exercise the code in different ways.
If you passed the algorithm into UniqueStringCreator's constructor, you could use a stub object in your unit testing to generate pseudo-random (predictable) data. See also the strategy pattern.
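For instance, a minimal sketch of that injection (the interface and member names here are hypothetical, not taken from the original class):

public interface IStringGenerationStrategy
{
    // Produces the string for the given sequence index.
    string Next(int index);
}

public class UniqueStringCreator
{
    private readonly IStringGenerationStrategy _strategy;
    private int _index;

    public UniqueStringCreator(IStringGenerationStrategy strategy)
    {
        _strategy = strategy;
    }

    public string GetNext()
    {
        return _strategy.Next(_index++);
    }
}

A test can then inject a stub strategy that returns a known, repeatable sequence, while production code injects the real (possibly randomized) algorithm.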
If there's some kind of check inside the class, you can always separate out the part which checks for distinctness with the part that generates the strings.
Then you mock out the checker, and test the behaviour in each of the two contexts; the one in which the checker thinks a string has been created, and the one in which it doesn't.
You may find similar ways to split up the responsibilities, whatever the underlying implementation logic.
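A hypothetical sketch of that split (none of these seams exist in the original class, and the candidate generation below is just a stand-in):

public interface IDistinctnessChecker
{
    // Returns true if this candidate has not been produced before.
    bool IsNew(string candidate);
}

public class UniqueStringCreator
{
    private readonly IDistinctnessChecker _checker;

    public UniqueStringCreator(IDistinctnessChecker checker)
    {
        _checker = checker;
    }

    public string GetNext()
    {
        string candidate;
        do
        {
            candidate = GenerateCandidate();
        } while (!_checker.IsNew(candidate));
        return candidate;
    }

    private string GenerateCandidate()
    {
        // Stand-in for whatever generation logic the real class uses.
        return Guid.NewGuid().ToString();
    }
}

You could then hand the constructor a mocked IDistinctnessChecker: in one test it reports the candidate as already seen and you verify GetNext keeps asking for another; in the other it reports it as new and you verify the candidate comes straight back.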
Otherwise, I agree with SLaks - stick with what you have. The main reason for having tests is so that the code stays easy to change, so as long as people can read it and think, "Oh, that's what it's doing!" you're probably good.
I would like to do a very simple test for the constructor of my class:
[Test]
public void InitLensShadingPluginTest()
{
    _lensShadingStory.WithScenario("Init Lens Shading plug-in")
        .Given(InitLensShadingPlugin)
        .When(Nothing)
        .Then(PluginIsCreated)
        .Execute();
}
This could go in either Given() or When()... I think it should be in When(), but it doesn't really matter.
private void InitLensShadingPlugin()
{
    _plugin = new LSCPlugin(_imagesDatabaseProvider, n_iExternalToolImageViewerControl);
}
Since the constructor is the thing being tested, I do not have anything to do inside the When() step, and in Then() I assert that the plugin was created.
private void PluginIsCreated()
{
    Assert.NotNull(_plugin);
}
My question is about StoryQ: since I do not want to do anything inside When(), I tried to use When(() => {}), but this is not supported by StoryQ.
This means I need to implement something like
private void Nothing()
{
}
and call When(Nothing).
Is there a better practice?
It's strange that StoryQ doesn't support missing steps; your scenario is actually pretty typical of other examples I've used for starting up applications, games, etc.:
Given the chess program is running
Then the pieces should be in the starting positions
for instance. So your desire to use a condition followed by an outcome is perfectly valid.
Looking at StoryQ's API, it doesn't look as if it supports these empty steps. You could always make your own method and call both the Given and When steps inside it, returning the operation from the When:
.GivenIStartedWith(InitLensShadingPlugin)
.Then(PluginIsCreated)
If that seems too clunky, I'd do as you suggested and move the Given to a When, initializing the Given with an empty method with a more meaningful name instead:
Given(NothingIsInitializedYet)
.When(InitLensShadingPlugin)
.Then(PluginIsCreated)
Either of these will solve your problem.
However, if all you're testing is a class, rather than an entire application, using StoryQ is probably overkill. The natural-language BDD frameworks like StoryQ, Cucumber, JBehave etc. are intended to help business and development teams collaborate in their exploration of requirements. They incur significant setup and maintenance overhead, so if the audience of your class-level scenarios / examples is technical, there may be an easier way.
For class-level examples of behaviour I would just go with a plain unit testing tool like NUnit or MSpec. I like using NUnit and putting my "Given / When / Then" in comments:
// Given I initialized the lens shading plugin on startup
_plugin = new LSCPlugin(_imagesDatabaseProvider, n_iExternalToolImageViewerControl);
// Then the plugin should have been created
Assert.NotNull(_plugin);
Steps at a class level aren't reused in the same way they are in full-system scenarios, because classes have much smaller, more encapsulated responsibilities; and developers benefit from reading the code rather than having it hidden away in the step definitions.
Your Given/When/Then comments here might still echo scenarios at a higher level, if the class is directly driving the functionality that the user sees.
Normally for full-system scenarios we would derive the steps from conversations with the "3 amigos":
a business representative (PO, SME, someone who has a problem to be solved)
a tester (who spots scenarios we might otherwise miss)
the dev (who's going to solve the problem).
There might be a pair of devs. UI designers can get involved if they want to. Matt Wynne says it's "3 amigos, where 3 is any number between 3 and 7". The best time to have the conversations is right before the devs pick up the work to begin coding it.
However, if you're working on your own, whether it's a toy or a real application, you might benefit just from having imaginary conversations. I use a pixie called Thistle for mine.
I'm writing unit tests for a simple IsBoolean(x) function to test if a value is boolean. There are 16 different values I want to test.
Will I be burnt in hell, or mocked ruthlessly by the .NET programming community (which would be worse?), if I don't break them up into individual unit tests, and run them together as follows:
[TestMethod]
public void IsBoolean_VariousValues_ReturnsCorrectly()
{
    //These should all be considered Boolean values
    Assert.IsTrue(General.IsBoolean(true));
    Assert.IsTrue(General.IsBoolean(false));
    Assert.IsTrue(General.IsBoolean("true"));
    Assert.IsTrue(General.IsBoolean("false"));
    Assert.IsTrue(General.IsBoolean("tRuE"));
    Assert.IsTrue(General.IsBoolean("fAlSe"));
    Assert.IsTrue(General.IsBoolean(1));
    Assert.IsTrue(General.IsBoolean(0));
    Assert.IsTrue(General.IsBoolean(-1));
    //These should all be considered NOT boolean values
    Assert.IsFalse(General.IsBoolean(null));
    Assert.IsFalse(General.IsBoolean(""));
    Assert.IsFalse(General.IsBoolean("asdf"));
    Assert.IsFalse(General.IsBoolean(DateTime.MaxValue));
    Assert.IsFalse(General.IsBoolean(2));
    Assert.IsFalse(General.IsBoolean(-2));
    Assert.IsFalse(General.IsBoolean(int.MaxValue));
}
I ask this because the "best practice" I keep reading about would demand I do the following:
[TestMethod]
public void IsBoolean_TrueValue_ReturnsTrue()
{
    //Arrange
    var value = true;
    //Act
    var returnValue = General.IsBoolean(value);
    //Assert
    Assert.IsTrue(returnValue);
}

[TestMethod]
public void IsBoolean_FalseValue_ReturnsTrue()
{
    //Arrange
    var value = false;
    //Act
    var returnValue = General.IsBoolean(value);
    //Assert
    Assert.IsTrue(returnValue);
}
//Fell asleep at this point
For the 50+ functions and 500+ values I'll be testing against, this seems like a total waste of time... but it's best practice!!!!!
-Brendan
I would not worry about it. This sort of thing isn't the point. JB Rainsberger talked about this briefly in his talk Integration Tests are a Scam. He said something like, "If you have never forced yourself to use one assert per test, I recommend you try it for a month. It will give you a new perspective on testing, and teach you when it matters to have one assert per test, and when it doesn't". IMO, this falls into the "doesn't matter" category.
Incidentally, if you use NUnit, you can use the TestCaseAttribute, which is a little nicer:
[TestCase(true)]
[TestCase("tRuE")]
[TestCase(false)]
public void IsBoolean_ValidBoolRepresentations_ReturnsTrue(object candidate)
{
    Assert.That(BooleanService.IsBoolean(candidate), Is.True);
}

[TestCase("-3.14")]
[TestCase("something else")]
[TestCase(7)]
public void IsBoolean_InvalidBoolRepresentations_ReturnsFalse(object candidate)
{
    Assert.That(BooleanService.IsBoolean(candidate), Is.False);
}
EDIT: I wrote the tests in a slightly different way that I think communicates intent a little better.
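Note that attribute arguments have to be compile-time constants, so values like DateTime.MaxValue from the original test can't go directly into a TestCase attribute. If you need cases like that, NUnit's TestCaseSource can pull them from a field instead - a rough sketch, reusing the fixture above:

private static readonly object[] NonBooleanCases =
{
    new object[] { DateTime.MaxValue },
    new object[] { 2 },
    new object[] { int.MaxValue }
};

[TestCaseSource(nameof(NonBooleanCases))]
public void IsBoolean_NonConstantInvalidValues_ReturnsFalse(object candidate)
{
    Assert.That(BooleanService.IsBoolean(candidate), Is.False);
}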
Although I agree it's best practice to separate the values in order to more easily identify the error, I think one still has to use common sense and follow such rules as guidelines rather than absolutes. You want to minimize assertion counts in a unit test, but what's generally most important is to ensure a single concept per test.
In your specific case, given the simplicity of the function, I think that the one unit test you provided is fine. It's easy to read, simple, and clear. It also tests the function thoroughly and if ever it were to break somewhere down the line, you would be able to quickly identify the source and debug it.
As an extra note, in order to maintain good unit tests, you'll want to always keep them up to date and treat them with the same care as you do the actual production code. That's in many ways the greatest challenge. Probably the best reason to do Test Driven Development is how it actually allows you to program faster in the long run because you stop worrying about breaking the code that exists.
It's best practice to split each of the values you want to test into separate unit tests. Each unit test should be named specifically to the value you're passing and the expected result. If you were changing code and broke just one of your tests, then that test alone would fail and the other 15 would pass. This buys you the ability to instantly know what you broke without then having to debug the one unit test and find out which of the Asserts failed.
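For example, instead of the combined test above, you'd have individually named tests like these (the names are just an illustration):

[TestMethod]
public void IsBoolean_MixedCaseTrueString_ReturnsTrue()
{
    Assert.IsTrue(General.IsBoolean("tRuE"));
}

[TestMethod]
public void IsBoolean_ArbitraryString_ReturnsFalse()
{
    Assert.IsFalse(General.IsBoolean("asdf"));
}

If a change breaks only the mixed-case handling, only the first test fails, and its name alone tells you which input and expectation broke.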
Hope this helps.
I can't comment on "Best Practice" because there is no such thing.
I agree with what Ayende Rahien says in his blog:
At the end, it boils down to the fact that I don’t consider tests to be, by themselves, a value to the product. Their only value is their binary ability to tell me whatever the product is okay or not. Spending a lot of extra time on the tests distract from creating real value, shippable software.
If you put them all in one test and this test fails "somewhere", then what do you do? Either your test framework will tell you exactly which line it failed on, or, failing that, you step through it with a debugger. The extra effort required because it's all in one function is negligible.
The extra value of knowing exactly which subset of tests failed in this particular instance is small, and overshadowed by the ponderous amount of code you had to write and maintain.
Think for a minute about the reasons for breaking them up into individual tests: it's to isolate different functionality and to accurately identify everything that went wrong when a test breaks. It looks like you might be testing two things - Boolean and Not Boolean - so consider two tests if your code follows two different paths. The bigger point, though, is that if none of the tests break, there are no errors to pinpoint.
If you keep running them, and later have one of these tests fail, that would be the time to refactor them into individual tests, and leave them that way.
I have the following test:
[Test]
public void VerifyThat_WhenProvidingAServiceOrderWithALinkedAccountGetSerivceProcessWithStatusReasonOfEndOfEntitlementToUpdateStatusAndStopReasonForAccountGetServiceProcessesAndServiceOrders_TheProcessIsUpdatedWithAStatusReasonOfEndOfEntitlement()
{
    IFixture fixture = new Fixture()
        .Customize(new AutoMoqCustomization());
    Mock<ICrmService> crmService = new Mock<ICrmService>();
    fixture.Inject(crmService);
    var followupHandler = fixture.CreateAnonymous<FollowupForEndOfEntitlementHandler>();
    var accountGetService = fixture.Build<EndOfEntitlementAccountGetService>()
        .With(handler => handler.ServiceOrders, new HashedSet<EndOfEntitlementServiceOrder>
        {
            {
                fixture.Build<EndOfEntitlementServiceOrder>()
                    .With(order => order.AccountGetServiceProcess, fixture.Build<EndOfEntitlementAccountGetServiceProcess>()
                        .With(process => process.StatusReason, fixture.Build<StatusReason>()
                            .With(statusReason => statusReason.Id == MashlatReasonStatus.Worthiness)
                            .CreateAnonymous())
                        .CreateAnonymous())
                    .CreateAnonymous()
            }
        })
        .CreateAnonymous();
    followupHandler.UpdateStatusAndStopReasonForAccountGetServiceProcessesAndServiceOrders(accountGetService);
    crmService.Verify(svc => svc.Update(It.IsAny<DynamicEntity>()), Times.Never());
}
My problem is that it will never fail on the first run, like TDD specifies that it should.
What it should test is that whenever there is a certain value to a status for a process of a service order, perform no updates.
Is this test checking what it should?
I'm struggling a bit to understand the question here...
Is your problem that this test passes on the first try?
If yes, that means one of two things:
your test has an error
you have already met this spec/requirement
Since the first has been ruled out, Green it is. Off you go to the next one on the list.
Somewhere down the line I assume, you will implement more functionality that results in the expected method being called. i.e. when the status value is different, perform an update.
The fix for that test must ensure that both tests pass.
If not, give me more information to help me understand.
Following TDD methodology, we only write new tests for functionality that doesn't exist. If a test passes on the first run, it is important to understand why.
One of my favorite things about TDD is its subtle ability to challenge our assumptions and knock our egos flat. The practice of "Calling your Shots" is not only a great way to work through tests, but it's also a lot of fun. I love when a test fails when I expect it to pass - many great learning opportunities come from this. Time after time, evidence of working software trumps developer ego.
When a test passes when I think it shouldn't, the next step is to make it fail.
For example, your test, which expects that something doesn't happen, is guaranteed to pass if the implementation is commented out. Tamper with the logic that you think you are implementing by commenting it out or by altering the conditions of the implementation and verify if you get the same results.
If, after doing this, you're confident that the functionality is correct, write another test that proves the opposite. Will Update get called with different state or inputs?
With both sets in place, you should be able to comment out that feature and have the ability to know in advance which test will be impacted. (8-ball, corner pocket)
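For example, the opposite test might look roughly like this - CreateHandlerWith and BuildAccountGetServiceWithStatusOtherThan are hypothetical helpers you'd write for your own fixture setup, not existing code:

[Test]
public void VerifyThat_WhenStatusReasonIsDifferent_TheProcessIsUpdated()
{
    Mock<ICrmService> crmService = new Mock<ICrmService>();

    // Hypothetical helpers: build the handler and an account-get service whose
    // process has any status other than the one that should suppress updates.
    var followupHandler = CreateHandlerWith(crmService);
    var accountGetService = BuildAccountGetServiceWithStatusOtherThan(MashlatReasonStatus.Worthiness);

    followupHandler.UpdateStatusAndStopReasonForAccountGetServiceProcessesAndServiceOrders(accountGetService);

    // The opposite expectation: this time the CRM service should be asked to update.
    crmService.Verify(svc => svc.Update(It.IsAny<DynamicEntity>()), Times.AtLeastOnce());
}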
I would also suggest that you add another assertion to the above test to ensure that the subject and functionality under test are actually being invoked.
Change the Times.Never() to Times.AtLeastOnce() and you've got a good start for TDD.
Trying to find nothing in nothing - well, that's a good test, but not the way to start TDD. First go with the simple specification, the naive operation the user could do (from your viewpoint, of course).
Once you've done some work, keep this test for later, for when it fails.
Say I'm trying to test a simple Set class
public class IntSet : IEnumerable<int>
{
    public void Add(int i) {...}
    //IEnumerable implementation...
}
And suppose I'm trying to test that no duplicate values can exist in the set. My first option is to insert some sample data into the set, and test for duplicates using my knowledge of the data I used, for example:
//OPTION 1
void InsertDuplicateValues_OnlyOneInstancePerValueShouldBeInTheSet()
{
    var set = new IntSet();
    //3 will be added 3 times
    var values = new List<int> {1, 2, 3, 3, 3, 4, 5};
    foreach (int i in values)
        set.Add(i);
    //I know 3 is the only candidate to appear multiple times
    int counter = 0;
    foreach (int i in set)
        if (i == 3) counter++;
    Assert.AreEqual(1, counter);
}
My second option is to test for my condition generically:
//OPTION 2
void InsertDuplicateValues_OnlyOneInstancePerValueShouldBeInTheSet()
{
    var set = new IntSet();
    //The following could even be a list of random numbers with a duplicate
    var values = new List<int> { 1, 2, 3, 3, 3, 4, 5};
    foreach (int i in values)
        set.Add(i);
    //I am not using my prior knowledge of the sample data
    //the following line would work for any data
    CollectionAssert.AreEquivalent(new HashSet<int>(values), set);
}
Of course, in this example, I conveniently have a set implementation to check against, as well as code to compare collections (CollectionAssert). But what if I didn't have either? This code would definitely be more complicated than that of the previous option! And this is the situation when you are testing your real-life custom business logic.
Granted, testing for expected conditions generically covers more cases - but it becomes very similar to implementing the logic again (which is both tedious and useless - you can't use the same code to check itself!). Basically I'm asking whether my tests should look like "insert 1, 2, 3 then check something about 3" or "insert 1, 2, 3 and check for something in general"
EDIT - To help me understand, please state in your answer if you prefer OPTION 1 or OPTION 2 (or neither, or that it depends on the case, etc). Just to clarify, it's pretty clear that in this case (IntSet), option 2 is better in all aspects. However, my question pertains to the cases where you don't have an alternative implementation to check against, so the code in option 2 would be definitely more complicated than option 1.
I usually prefer to test use cases one by one - this works nicely in the TDD manner: "code a little, test a little". Of course, after a while my test cases start to contain duplicated code, so I refactor. The actual method of verifying the results does not matter to me as long as it is working for sure and doesn't get in the way of testing itself. So if there is a "reference implementation" to test against, all the better.
An important thing, however, is that the tests should be reproducible, and it should be clear what each test method is actually testing. To me, inserting random values into a collection is neither - of course, if there is a huge amount of data or use cases involved, every tool or approach is welcome which helps handle the situation better without lulling me into a false sense of security.
If you have an alternative implementation, then definitely use it.
In some situations, you can avoid writing an alternative implementation but still test the functionality in general. For instance, in your example, you could first generate a set of unique values and then randomly duplicate elements before passing them to your implementation. You can then test that the output is equivalent to your starting set of unique values, without having to reimplement the logic.
I try to take this approach whenever it's feasible.
Update: Essentially, I'm advocating the OP's "Option #2". With this approach, there's precisely one output vector that will allow the test to pass. With "Option #1", there's an infinite number of acceptable output vectors (it's testing an invariant, but it's not testing for any relationship to the input data).
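A sketch of that idea applied to the IntSet example (this assumes System.Linq and NUnit's CollectionAssert are available; the test name is made up):

[Test]
public void InsertDuplicateValues_SetContainsExactlyTheOriginalUniqueValues()
{
    // Values that are unique by construction.
    var uniqueValues = Enumerable.Range(1, 100).ToList();

    // Duplicate some of them before feeding the set.
    var input = uniqueValues.Concat(uniqueValues.Take(10)).ToList();

    var set = new IntSet();
    foreach (int i in input)
        set.Add(i);

    // The output should be equivalent to the starting (unique) values,
    // with no reference implementation involved.
    CollectionAssert.AreEquivalent(uniqueValues, set.ToList());
}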
Basically I'm asking whether my tests should look like "insert 1, 2, 3 then check something about 3" or "insert 1, 2, 3 and check for something in general"
I am not a TDD purist, but it seems people are saying that the test should break only if the condition that you are trying to test is broken, i.e. if you implement a test which checks a general condition, your test will break in more than a few cases, so it is not optimal.
If I am testing for not being able to add duplicates, then I would only test that. So in this case, I would go with the first.
(Update)
OK, now you have updated the code and I need to update my answer.
Which one would I choose? It depends on the implementation of CollectionAssert.AreEquivalent(new HashSet<int>(values), set). For example, IEnumerable<T> preserves order while HashSet<T> does not, so even this could break the test when it should not. For me, the first is still superior.
According to xUnit Test Patterns, it's usually more favorable to test the state of the system under test. If you want to test its behavior and the way in which the algorithm operates, you can use Mock Object Testing.
That being said, both of your tests are known as Data Driven Tests. What is usually acceptable is to use as much knowledge as the API provides. Remember, those tests also serve as documentation for your software. Therefore it's critical to keep them as simple as possible - whatever that means for your specific case.
The first step should be demonstrating the correctness of the Add method using an activity diagram/flowchart. The next step would be to formally prove the correctness of the Add method, if you have the time. Then test with specific sets of data where you expect duplications and non-duplications (i.e. some sets of data contain duplicates and some don't), and check that the data structure performs correctly in both cases - it's important to have cases that should succeed (no duplicates) and to check that the values were added correctly to the set, rather than only testing failure cases (cases in which duplicates should be found). And finally, check generically. Even though it is now somewhat deprecated, I would suggest constructing data to fully exercise every execution path in the method being tested. At any point you make a code change, begin all over, applying regression testing.
I would opt for the algorithmic approach, but preferably without relying on an alternate implementation such as HashSet. You're actually testing for more than just "no duplicates" with the HashSet match. For example, the test will fail if any items didn't make it into the resulting set, and you presumably have other tests that check for this.
A cleaner verification of the "no duplicates" expectation might be something like the following:
Assert.AreEqual(values.Distinct().Count(), set.Count());
I don't think that this is specific to a language or framework, but I am using xUnit.net and C#.
I have a function that returns a random date in a certain range. I pass in a date, and the returned date is always in the range of 1 to 40 years before the given date.
Now I just wonder if there is a good way to unit test this. The best approach seems to be to create a loop and let the function run e.g. 100 times and assert that each of these 100 results is in the desired range, which is my current approach.
I also realize that unless I am able to control my Random generator, there will not be a perfect solution (after all, the result IS random), but I wonder what approaches you take when you have to test functionality that returns a random result in a certain range?
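For reference, that loop looks roughly like this (RandomDateGenerator and GetDateBefore are placeholders for my actual class and method):

[Fact]
public void GeneratedDate_IsAlwaysBetweenOneAndFortyYearsBeforeTheInput()
{
    var generator = new RandomDateGenerator();               // placeholder for my actual class
    var reference = new DateTime(2010, 6, 15);

    for (int i = 0; i < 100; i++)
    {
        DateTime result = generator.GetDateBefore(reference); // placeholder method name

        Assert.True(result <= reference.AddYears(-1));
        Assert.True(result >= reference.AddYears(-40));
    }
}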
Mock or fake out the random number generator
Do something like this... I didn't compile it so there might be a few syntax errors.
public interface IRandomGenerator
{
    double Generate(double max);
}

public class SomethingThatUsesRandom
{
    private readonly IRandomGenerator _generator;

    private class DefaultRandom : IRandomGenerator
    {
        public double Generate(double max)
        {
            // Random.Next only works with ints, so scale NextDouble instead.
            return new Random().NextDouble() * max;
        }
    }

    public SomethingThatUsesRandom(IRandomGenerator generator)
    {
        _generator = generator;
    }

    public SomethingThatUsesRandom() : this(new DefaultRandom())
    {}

    public double MethodThatUsesRandom()
    {
        return _generator.Generate(40.0);
    }
}
In your test, just fake or mock out the IRandomGenerator to return something canned.
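For example, with a hand-rolled fake (FixedRandomGenerator is just a test helper, not part of the code above):

private class FixedRandomGenerator : IRandomGenerator
{
    private readonly double _value;

    public FixedRandomGenerator(double value)
    {
        _value = value;
    }

    public double Generate(double max)
    {
        // Always return the canned value, ignoring max.
        return _value;
    }
}

[Fact]
public void MethodThatUsesRandom_ReturnsWhateverTheGeneratorProduces()
{
    var sut = new SomethingThatUsesRandom(new FixedRandomGenerator(12.5));

    Assert.Equal(12.5, sut.MethodThatUsesRandom());
}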
In addition to testing that the function returns a date in the desired range, you want to ensure that the result is well-distributed. The test you describe would pass a function that simply returned the date you sent in!
So in addition to calling the function multiple times and testing that the result stays in the desired range, I would also try to assess the distribution, perhaps by putting the results in buckets and checking that the buckets have roughly equal numbers of results after you are done. You may need more than 100 calls to get stable results, but this doesn't sound like an expensive (run-time wise) function, so you can easily run it for a few K iterations.
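A rough sketch of that bucketing idea, again with placeholder names for the generator and an arbitrary tolerance:

[Fact]
public void GeneratedDates_AreRoughlySpreadAcrossTheFortyYearRange()
{
    var generator = new RandomDateGenerator();                 // placeholder for the class under test
    var reference = new DateTime(2010, 6, 15);
    const int iterations = 10000;

    // One bucket per year in the 1-40 year range.
    var buckets = new int[40];
    for (int i = 0; i < iterations; i++)
    {
        DateTime result = generator.GetDateBefore(reference);  // placeholder method name
        int yearsBack = reference.Year - result.Year;
        int index = Math.Min(Math.Max(yearsBack - 1, 0), buckets.Length - 1);
        buckets[index]++;
    }

    // Very crude uniformity check: no bucket should be wildly over- or under-represented.
    double expected = iterations / (double)buckets.Length;
    foreach (int count in buckets)
    {
        Assert.InRange((double)count, expected * 0.5, expected * 1.5);
    }
}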
I've had problems before with non-uniform "random" functions... they can be a real pain, and it's worth testing for this early.
I think there are three different aspects of this problem that you test.
The first one: is my algorithm the right one? That is, given a properly-functioning random-number generator, will it produce dates that are randomly distributed across the range?
The second one: does the algorithm handle edge cases properly? That is, when the random number generator produces the highest or lowest allowable values, does anything break?
The third one: is my implementation of the algorithm working? That is, given a known list of pseudo-random inputs, is it producing the expected list of pseudo-random dates?
The first two things aren't something I'd build into the unit-testing suite. They're something I'd prove out while designing the system. I'd probably do this by writing a test harness that generated a zillion dates and performed a chi-square test, as daniel.rikowski suggested. I'd also make sure this test harness didn't terminate until it handled both of the edge cases (assuming that my range of random numbers is small enough that I can get away with this). And I'd document this, so that anyone coming along and trying to improve the algorithm would know that that's a breaking change.
The last one is something I'd make a unit test for. I need to know that nothing has crept into the code that breaks its implementation of this algorithm. The first sign I'll get when that happens is that the test will fail. Then I'll go back to the code and find out that someone else thought that they were fixing something and broke it instead. If someone did fix the algorithm, it'd be on them to fix this test too.
You don't need to control the system to make the results deterministic. You're on the right approach: decide what is important about the output of the function and test for that. In this case, it is important that the result be within 1 to 40 years before the input date, and you are testing for that. It's also important that it not always return the same result, so test for that too. If you want to be fancier, you can test that the results pass some kind of randomness test.
Normally I use exactly your suggested approach: control the random generator.
Initialize it for tests with a default seed (or replace it with a proxy returning numbers which fit my test cases), so I have deterministic/testable behaviour.
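In C# terms, that might look like the following - this assumes the date generator can be handed a Random instance, which is an assumption about its design, not a given:

[Fact]
public void GeneratedDates_AreReproducibleWithAFixedSeed()
{
    var reference = new DateTime(2010, 6, 15);

    // Same seed means the same sequence of "random" dates.
    var first = new RandomDateGenerator(new Random(12345));   // hypothetical constructor taking a Random
    var second = new RandomDateGenerator(new Random(12345));

    for (int i = 0; i < 100; i++)
    {
        Assert.Equal(first.GetDateBefore(reference), second.GetDateBefore(reference));
    }
}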
If you want to check the quality of the random numbers (in terms of independence), there are several ways to do this. One good way is the chi-square test.
Sure, using a fixed seed random number generator will work just fine, but even then you're simply trying to test for that which you cannot predict. Which is ok. It's equivalent to having a bunch of fixed tests. However, remember--test what is important, but don't try to test everything. I believe random tests are a way to try to test everything, and it's not efficient (or fast). You could potentially have to run a great many randomized tests before hitting a bug.
What I'm trying to get at here is that you should simply write a test for each bug you find in your system. You test out edge cases to make sure your function is running even in the extreme conditions, but really that's the best you can do without either spending too much time or making the unit tests slow to run, or simply wasting processor cycles.
Depending on how your function creates the random date, you may also want to check for illegal dates: impossible leap years, or the 31st day of a 30-day month.
Methods that do not exhibit deterministic behavior cannot be properly unit-tested, as the results will differ from one execution to another. One way to get around this is to seed the random number generator with a fixed value for the unit test. You can also extract the randomness of the date generation into its own class (thus applying the Single Responsibility Principle) and inject known values for the unit tests.
I would recommend overriding the random function. I am unit testing in PHP so I write this code:
// If we are unit testing, then...
if (defined('UNIT_TESTING') && UNIT_TESTING)
{
    // ...make our my_rand() function deterministic to aid testing.
    function my_rand($min, $max)
    {
        return $GLOBALS['random_table'][$min][$max];
    }
}
else
{
    // ...else make our my_rand() function truly random.
    function my_rand($min = 0, $max = PHP_INT_MAX)
    {
        if ($max === PHP_INT_MAX)
        {
            $max = getrandmax();
        }
        return rand($min, $max);
    }
}
I then set the random_table as I require it per test.
Testing the true randomness of a random function is a separate test altogether. I would avoid testing the randomness in unit tests and would instead do separate tests and research how random the random function in your programming language really is. Non-deterministic tests (if any at all) should be left out of unit tests. Maybe have a separate suite for those tests that requires human input or much longer running times, to minimise the chances of a failure that is really a pass.
I don't think unit testing is meant for this. You can use unit testing for functions that return a stochastic value if you use a fixed seed, in which case, in a way, they are not stochastic, so to speak.
For a random seed, I don't think unit testing is what you want. For RNGs, what you really want is a system test, in which you run the RNG many, many times and look at its distribution or moments.