I am working on a large test project consisting of thousands of integration tests.
It is a bit messy, with lots of code duplication. Most tests are composed of several steps, and a lot of "dummy" objects are created. By "dummy" I mean something like this:
new Address {
    Name = "fake address",
    Email = "some@email.com",
    // ... and so on
}
where it often really doesn't matter what the data is. This kind of code is spread out and duplicated in tests, test base classes, and helpers.
What I want is a single "test data builder" with a single responsibility: generating the test data that is consumed by the tests.
One approach is to have a class with a bunch of static methods like the following:
Something CreateSomething()
{
    return new Something
    {
        // set default dummy values
    };
}
and an overload:
Something CreateSomething(/* parameters for the non-default values */)
{
    return new Something
    {
        // create the Something from the params
    };
}
Another approach is to have XML files containing the data, but I am afraid that would then be too far away from the tests.
The goal is to move this kind of code out of the tests, because right now the tests are big and not readable. In a typical case, 20-30 out of 50 lines of test code are this kind of setup.
Are there any patterns for accomplishing this? Or any example of big open source codebase with something similar that I can have a look at?
For code, use a simple method chaining builder pattern:
public class TestOrderBuilder
{
    private readonly Order order = new Order();

    // These methods represent sentences / grammar that describe the sort
    // of test data you are building
    public TestOrderBuilder AddHighQuantityOrderLine()
    {
        // ... code to add a high quantity order line
        return this; // for method chaining
    }

    public TestOrderBuilder MakeDescriptionInvalid()
    {
        // ... code to set descriptions etc...
        return this; // for method chaining
    }

    public Order Order
    {
        get { return this.order; }
    }
}
// then using it:
var order = new TestOrderBuilder()
    .AddHighQuantityOrderLine()
    .MakeDescriptionInvalid()
    .Order;
I would shy away from XML files that specify test dependencies.
My reasoning is that XML files cannot take advantage of the refactoring tools in the Visual Studio ecosystem.
Instead, I would create a TestAPI.
This API will be responsible for serving dependency data to test clients.
Note that the dependency data being requested will already be initialized with general data, ready to go for the clients requesting it.
Any values that serve as required test inputs for a particular test would be assigned within the test itself. This is because I want the test to self document its intent or objective and abstract away the dependencies that are not being tested.
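A minimal sketch of that idea, reusing the Address example from the question above; the TestData name and values are illustrative, not from the original post:
public static class TestData
{
    // Served objects arrive fully initialized with general-purpose dummy values.
    public static Address CreateAddress()
    {
        return new Address
        {
            Name = "fake address",
            Email = "some@email.com"
        };
    }
}

// In a test, only the values the test actually cares about are assigned,
// so the test documents its own intent:
// var address = TestData.CreateAddress();
// address.Email = "invalid-email"; // the required input for this particular test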
The book xUnit Test Patterns provided me a lot of insight for writing tests.
Related
I want to use TPL Dataflow for my .NET Core application and have followed the example from the docs.
Instead of having all the logic in one file, I would like to separate each TransformBlock and ActionBlock (I don't need the other ones yet) into their own files. A small TransformBlock example converting integers to strings:
class IntToStringTransformer : TransformBlock<int, string>
{
public IntToStringTransformer() : base(number => number.ToString()) { }
}
and a small ActionBlock example writing strings to the console:
class StringWriter : ActionBlock<string>
{
public StringWriter() : base(Console.WriteLine) { }
}
Unfortunately this won't work because the block classes are sealed. Is there a way I can organize those blocks into their own files?
Dataflow steps/blocks/goroutines are fundamentally functional in nature and best organized as modules of factory functions, not separate classes. A TPL DataFlow pipeline is quite similar to a pipeline of function calls in F#, or any other language. In fact, one could look at it as a PowerShell pipeline, except it's easier to write.
There's no need to create a class or implement an interface to add a new function to that pipeline, you just add it and redirect the output to the next function.
TPL Dataflow blocks provide the primitives to construct a pipeline already and only require a transformation function. That's why they are sealed, to prevent misuse.
The natural way to organize dataflows is similar to F# too - create libraries with the functions that perform each job, putting them in modules of related functions. Those functions are stateless, so they can easily go into a static library, just like extension methods.
For example, there could be one module for database related functions that perform bulk inserts or read data, another to handle exports to various file formats, separate classes to call external web services, another to parse specific message formats.
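A minimal sketch of that organization, reusing the blocks from the question; the MessagePipeline class and its method names are mine, purely illustrative:
using System;
using System.Threading.Tasks.Dataflow;

// A "module" of stateless factory methods instead of subclasses
// of the sealed block types - each module can live in its own file.
public static class MessagePipeline
{
    public static TransformBlock<int, string> CreateIntToString()
    {
        return new TransformBlock<int, string>(number => number.ToString());
    }

    public static ActionBlock<string> CreateConsoleWriter()
    {
        return new ActionBlock<string>(Console.WriteLine);
    }
}

// Wiring the pipeline:
// var toString = MessagePipeline.CreateIntToString();
// var writer = MessagePipeline.CreateConsoleWriter();
// toString.LinkTo(writer, new DataflowLinkOptions { PropagateCompletion = true });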
A real example
For the last 7 years I've been working with several complex pipelines for an Online Travel Agency (OTA). One of them calls several GDSs (the intermediaries between OTAs and airlines) to retrieve transaction information - ticket issues, refunds, cancellations etc. The next step retrieves the ticket records, i.e. the detailed ticket information. Finally, the records are inserted into the database.
GDSs are too big to bother with standards, so their "SOAP" web services aren't even SOAP-compliant, much less do they follow WS-* standards. So each GDS needs a separate class library to call its services and parse the outputs. No dataflows there yet; the project is already complex enough.
Writing the data to the database is pretty much the same always, so there's a separate project with methods that take e.g. an IEnumerable<T> and write it to the database with SqlBulkCopy.
It's not enough to load new data though; things often go wrong, so I also need to be able to reload already stored ticket information.
Organisation
To preserve sanity:
Each pipeline gets its own file:
A Daily pipeline to load new data,
A Reload pipeline to load all stored data
A "Rerun" pipeline to use the existing data and ask again for any missing data.
Static classes are used to hold the worker functions and, separately, factory methods that produce Dataflow blocks based on configuration. E.g., a CreateLogger(path, level) creates an ActionBlock<Message> that logs specific messages.
Common dataflow extension methods - since Dataflow blocks follow the same basic patterns, it's easy to create a logged block by combining e.g. a Func<TIn, TOut> and a logger block, or to create a LinkTo overload that redirects bad records to a logger or database. Those are common enough to become extension methods (a sketch follows below).
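A rough sketch of such a LinkTo helper; the Result<T> shape and all names here are my own assumptions, not from the original code:
using System.Threading.Tasks.Dataflow;

// Minimal success/failure envelope, assumed for illustration
public class Result<T>
{
    public T Value { get; set; }
    public bool IsSuccess { get; set; }
    public string Error { get; set; }
}

public static class DataflowExtensions
{
    // Link a source to the next step, diverting failed records to a logger block.
    public static void LinkToWithErrors<T>(
        this ISourceBlock<Result<T>> source,
        ITargetBlock<Result<T>> target,
        ITargetBlock<Result<T>> errorLogger)
    {
        var options = new DataflowLinkOptions { PropagateCompletion = true };
        source.LinkTo(target, options, r => r.IsSuccess);
        source.LinkTo(errorLogger, options, r => !r.IsSuccess);
    }
}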
If those were in the same file, it would be very hard to edit one pipeline without affecting another. Besides, there's a lot more to a pipeline than the core tasks, e.g.:
Logging
Handling bad records and partial results (can't stop a 100K import for 10 errors)
Error handling (which isn't the same as handling bad records)
Monitoring - what has this monster been doing for the last 15 minutes? Did DOP=10 improve performance at all?
Don't create a parent pipeline class.
Some of the steps are common, so at first I created a parent class with common steps that were overridden or simply replaced in child classes. VERY BAD IDEA. Each pipeline is similar, but not quite the same, and inheritance means that modifying one step or one connection risks breaking everything. After about a year things became unbearable, so I split the parent class into separate classes.
As @Panagiotis explained, I think you have to put the OOP mindset aside a little.
What you have with Dataflow are building blocks that you configure to execute what you need. I'll try to create a little example of what I mean by that:
// Interface and impl. are in separate files. Actually, they could
// even be in a different project ...
public interface IMyComplicatedTransform
{
Task<string> TransformFunction(int input);
}
public class MyComplicatedTransform : IMyComplicatedTransform
{
    public Task<string> TransformFunction(int input)
    {
        // Some complex logic; placeholder result so the sketch compiles
        return Task.FromResult(input.ToString());
    }
}
class DataFlowUsingClass
{
    private readonly IMyComplicatedTransform myTransformer;
    private TransformBlock<int, string> myTransform;
    // ... some more blocks ...

    public DataFlowUsingClass()
    {
        myTransformer = new MyComplicatedTransform(); // maybe use ctor injection?
        CreatePipeline();
    }

    private void CreatePipeline()
    {
        // create blocks
        myTransform = new TransformBlock<int, string>(myTransformer.TransformFunction);
        // ... init some more blocks
        // TODO link blocks
    }
}
I think this is the closest to what you are looking to do.
What you end up with is a set of interfaces and implementations which can be tested independently. The client basically boils down to "gluecode".
Edit: As @Panagiotis correctly states, the interfaces are even superfluous. You could do without them.
I am still new to DI and unit tests. I have been tasked with adding unit tests to some of our legacy code. I am working on a WCF web service. A lot of refactoring has had to be done: monster classes split into separate classes that make sense, monster methods split into multiple methods, and lastly, interfaces created for external dependencies. This was done initially to facilitate mocking those dependencies for unit tests.
As I have gone about this, the list of dependencies keeps growing and growing. I have other web services to call, SQL Server and DB2 databases to interact with, a config file to read, a log to write to, and Sharepoint data to read. So far I have 10 dependencies, and every time I add one it breaks all my unit tests, since there is a new parameter in the constructor.
If it helps, I am using .Net 4.5, Castle Windsor as my IOC, MSTest, and Moq for testing.
I have looked here How to avoid Dependency Injection constructor madness? but this doesn't provide any real solution. Only to say "your class may be doing too much." I looked into Facade and Aggregate services but that seemed to just move where things were.
So I need some help on how to make this class do "less" but still provide the same output.
public AccountServices(ISomeWebServiceProvider someWebServiceProvider,
ISomeOtherWebProvider someOtherWebProvider,
IConfigurationSettings configurationSettings,
IDB2Connect dB2Connect,
IDB2SomeOtherData dB2SomeOtherData,
IDB2DatabaseData dB2DatabaseData,
ISharepointServiceProvider sharepointServiceProvider,
ILoggingProvider loggingProvider,
IAnotherProvider AnotherProvider,
ISQLConnect SQLConnect)
{
_configurationSettings = configurationSettings;
_someWebServiceProvider = someWebServiceProvider;
_someOtherWebProvider = someOtherWebProvider;
_dB2Connect = dB2Connect;
_dB2SomeOtherData = dB2SomeOtherData;
_dB2DatabaseData = dB2DatabaseData;
_sharepointServiceProvider = sharepointServiceProvider;
_loggingProvider = loggingProvider;
_AnotherProvider = AnotherProvider;
_SQLConnect = SQLConnect;
}
Almost all of these are in other components, but I need to be able to use them in the main application and mock them in unit tests.
Here is how one of the methods is laid out.
public ExpectedResponse GetAccountData(string AccountNumber)
{
// Get Needed Config Settings
...
// Do some data validation before processing data
...
// Try to retrieve data for DB2
...
// Try to retrieve data for Sharepoint
...
// Map data to response.
...
// If error, handle it and write error to log
}
Other methods are very similar but they may be reaching out to SQL Server or one or more web services.
Ideally what I would like to have is an example of an application that needs a lot of dependencies, that has unit tests, and avoids having to keep adding a new dependency to the constructor causing you to update all your unit tests just to add the new parameter.
Thanks
Not sure if this helps, but I came up with a pattern I called GetTester that wraps the constructor and makes handling the parameters a little easier. Here's an example:
private SmartCache GetTester(out Mock<IMemoryCache> memory, out Mock<IRedisCache> redis)
{
memory = new Mock<IMemoryCache>();
redis = new Mock<IRedisCache>();
return new SmartCache(memory.Object, redis.Object);
}
Callers look like this if they need all the mocks:
SmartCache cache = GetTester(out Mock<IMemoryCache> memory, out Mock<IRedisCache> redis);
Or like this if they don't:
SmartCache cache = GetTester(out _, out _);
These still break if you have constructor changes, but you can create overloads to minimize the changes to tests. It's a hassle but easier than it would otherwise be.
So possibly your classes might be doing too much. If you find that you're constantly increasing the work a class does, and as a result need to provide additional objects to assist with those additional tasks, then that is probably the issue, and you need to consider breaking up the work.
However, if this isn't the case, another option is to have the class take a reference to a single dependency class that provides access either to the instantiated concrete objects implementing your various interfaces, or to factory objects which can construct them. Then, instead of constantly adding new constructor parameters, you pass the single object and pull or create objects from it as necessary. A sketch follows below.
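A rough sketch of that idea against the constructor from the question; the IAccountServiceDependencies name is mine, not from the original code:
public interface IAccountServiceDependencies
{
    IDB2Connect DB2Connect { get; }
    ISharepointServiceProvider Sharepoint { get; }
    ILoggingProvider Logging { get; }
    // ... the remaining dependencies
}

public class AccountServices
{
    private readonly IAccountServiceDependencies deps;

    // Adding a dependency now changes only IAccountServiceDependencies,
    // not this signature, so existing test setups keep compiling.
    public AccountServices(IAccountServiceDependencies deps)
    {
        this.deps = deps;
    }
}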
I'm trying to get into the habit of writing unit tests. I've written a few before, but they've usually been quite basic. I'd like to start making a move to TDD, as I want to improve the quality of my code (design and structure), reducing coupling while hopefully also reducing the number of regressions that slip through to a testable build.
I have taken a relatively simple project I work on to begin with. The resulting program watches a folder and then acts on files within this folder.
Here is a typical example of some code extracted from the project:
private string RestoreExtension(String file)
{
var unknownFile = Path.GetFileName(file);
var ignoreDir = Path.GetDirectoryName(file) + "\\Unknown";
string newFile;
//We need this library for determining mime
if (CheckLibrary("urlmon.dll"))
{
AddLogLine("Attempting to restore file extension");
var mime = GetMimeType(BufferFile(file, 256));
var extension = FileExtensions.FindExtension(mime);
if (!String.IsNullOrEmpty(extension))
{
AddLogLine("Found extension: " + extension);
newFile = file + "." + extension;
File.Move(file, newFile);
return newFile;
}
}
else
{
AddLogLine("Unable to load urlmon.dll");
}
AddLogLine("Unable to restore file extension");
if (!Directory.Exists(ignoreDir))
{
Directory.CreateDirectory(ignoreDir);
}
var time = DateTime.Now;
const string format = "dd-MM-yyyy HH-mm-ss";
newFile = ignoreDir + "\\" + unknownFile + "-" + time.ToString(format);
File.Move(file, newFile);
return String.Empty;
}
Questions:
How can I test a method using IO?
I don't really want to be using a real file as this would be slow (and more of an integration test?), but I can't really see another way. I could add a switchable IO abstraction layer, but this sounds to me like it may unnecessarily complicate the code...
Is a project like this worth unit testing?
By that I mean, is it too simple? Once you strip out the .NET and 3rd party library calls, there is not much left... So is it a case where the amount of work to make it testable means it's not a good candidate for testing?
A lot of the methods in this project are private, as this project happens to be a Windows service. I've read you should only really test methods which are externally visible (public or exposed through an interface); it is unlikely in this case, however, that I want to expose any of my own methods. So is this project worth testing (as I'd probably need to either change method access modifiers to public or add some attributes so that NUnit can see them)?
Regarding how to test File I/O:
A common way of doing this is encapsulating File I/O in a wrapper class. For example:
public class FileHandler : IFileHandler
{
    public string GetFilename(string name)
    {
        return System.IO.Path.GetFileName(name);
    }

    public string GetDirectoryName(string directory)
    {
        return System.IO.Path.GetDirectoryName(directory);
    }
}
public interface IFileHandler
{
string GetFilename(string name);
string GetDirectoryName(string directory);
}
In your method under test, you only program against the interface. When you run your unit test, you would use a mock object which returns predefined values. For this, you could use Moq.
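For example, the mock setup might look like this (a sketch of Moq usage; the return values are arbitrary):
var fileHandler = new Mock<IFileHandler>();
fileHandler.Setup(f => f.GetFilename(It.IsAny<string>()))
           .Returns("somefile.dat");
fileHandler.Setup(f => f.GetDirectoryName(It.IsAny<string>()))
           .Returns(@"c:\watched");

// Pass fileHandler.Object to the code under test.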
This is a bit of work, but then you do not need to test the file I/O itself.
Best
Jan
How can I test a method using IO?
By introducing an abstraction over .NET's IO libraries and then using the real implementation (BCL) in the application and a fake (mocked) one in unit tests. Such an abstraction is fairly simple to write on your own, but such projects already exist (e.g. System.IO.Abstractions), so reinventing the wheel is pointless, especially in such a trivial matter.
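A small sketch of how that package is typically used; the FileProcessor class is illustrative, not part of the library:
using System.Collections.Generic;
using System.IO.Abstractions;
using System.IO.Abstractions.TestingHelpers;

public class FileProcessor
{
    private readonly IFileSystem fileSystem;

    public FileProcessor(IFileSystem fileSystem)
    {
        this.fileSystem = fileSystem;
    }

    public bool HasContent(string path)
    {
        return fileSystem.File.Exists(path) &&
               fileSystem.File.ReadAllText(path).Length > 0;
    }
}

// In the application: new FileProcessor(new FileSystem());
// In a unit test, no disk involved:
// var fs = new MockFileSystem(new Dictionary<string, MockFileData>
// {
//     { @"c:\in\data.txt", new MockFileData("hello") }
// });
// var processor = new FileProcessor(fs);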
I don't really want to be using a real file as this would be slow
This is correct; in tests you want to use fake (mock) objects.
Is a project like this worth unit testing?
Every project is worth testing, and you will test it in one way or another (in the most extreme case, by manually executing the application and doing good old click-and-wait testing).
Note that it is almost always too time-consuming to manually test, say, 10 edge cases of a complex parser method when each run requires you to load a large file and wait for results. This is where automated tests in an isolated environment help greatly. It is the same with your code - the file system and web services all introduce large overhead before you can get to the actual business logic. You must isolate them.
A lot of the methods in this project are private (...) I've read you should only really test methods which are externally visible
Correct again; you should be testing the public contract. However, if private stuff gets big enough to be worth testing, most of the time that's a good indicator it is also big enough to be put in its own class. Note that changing member access modifiers only for unit test usage is not the best idea; it's almost always better to do an extract-class/extract-method refactor.
You can do that using Microsoft Fakes. First generate a fakes assembly for System.dll - or any other package - and then mock the expected returns, as in:
using Microsoft.QualityTools.Testing.Fakes;
...
using (ShimsContext.Create())
{
System.IO.Fakes.ShimDirectory.ExistsString = (p) => true;
System.IO.Fakes.ShimFile.MoveStringString = (p,n) => {};
// .. and other methods you'd like to test
var result = RestoreExtension(@"C:\myfile.ext");
// then you can assert with this result
}
I'm pretty new to TDD too, but I think I can give some suggestions:
About IO testing: you can use mock objects. Create a mock for the dependency used by the method you want to test, and compare the expected value with the real passed-in value. In the Moq framework it looks like this:
var mock = new Mock<IRepository<PersonalCabinetAccount>>();
mock.Setup(m => m.RemoveByID(expectedID))
.Callback((Int32 a) => Assert.AreEqual(expectedID, a));
var account = new PersonalCabinetAccount { ID = expectedID };
var repository = mock.Object;
repository.RemoveByID(account.ID);
Absolutely, yes. If you want to build the habit, you need to test everything, I suppose. That way knowledge comes ;)
You can use the InternalsVisibleTo attribute to avoid visibility issues.
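For example, in the project under test (the test assembly name here is illustrative):
// AssemblyInfo.cs - exposes internal members to the named test assembly
[assembly: System.Runtime.CompilerServices.InternalsVisibleTo("MyService.Tests")]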
test methods which are externally visible
I think that advice is aimed at enterprise programming; with your own trusted code, testing only what is externally visible is simply a way to spare your time.
Hope my points can help you.
I have methods like this:
public static bool ChangeCaseEstimatedHours(int caseID, decimal? time)
{
Case c = Cases.Get(caseID);
if (c != null)
{
c.EstimatedHours = time;
return Cases.Update(c);
}
return false;
}
public static bool RemoveCase(int caseID)
{
return Cases.Remove(caseID);
}
which internally uses LINQ to do the queries.
I am wondering how I should go about testing these. They do not have state, so they are static. They also modify the database.
Thus I would have to create a case and then remove it in the same test, but a unit test should only be doing one thing. What is usually done in these situations?
How do I test database queries, updates and deletes?
Thanks
A test that requires database access is probably the hardest test to maintain. If you can avoid touching the database in your unit test, I would try to create an interface for the Cases class.
interface ICases
{
    Case Get(int caseID);
    bool Update(Case c);
    bool Remove(int caseID);
}
And then use RhinoMocks or Moq to verify that Get(), Update() and Remove() were called as expected. For example:
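A sketch of that verification with Moq; CaseService is a hypothetical instance-based refactoring of the static methods from the question, and the ID property on Case is assumed:
var cases = new Mock<ICases>();
cases.Setup(c => c.Get(42)).Returns(new Case { ID = 42 });
cases.Setup(c => c.Update(It.IsAny<Case>())).Returns(true);

var service = new CaseService(cases.Object); // hypothetical class under test
service.ChangeCaseEstimatedHours(42, 7.5m);

cases.Verify(c => c.Update(It.Is<Case>(x => x.EstimatedHours == 7.5m)), Times.Once());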
If you insist you need to test against a database, then you will need to spend time setting up a testing database and doing proper data cleanup.
There is a difference between "Knowing the path and walking the path". What you want to test would class as an integration test not a unit test. A unit test is something that is carried out in isolation without any external dependencies and therefore makes the test runnable even if the machine you are running it on does not have a physical database instance present on it. See more differences here.
That's why patterns like Repository are really popular for persistence-based applications: they promote unit-testability of the data layer via mocking. See a sample here.
So, you have 2 options (the blue pill Or the red pill):
Blue Pill: Buy a tool like TypeMock Isolator essential or JustMock and you'll be able to mock anything in your code "and the story ends".
Red Pill: Refactor existing design of the data layer to one of the interface based patterns and use free Mocking frameworks like Moq or RhinoMocks and "see how deep the rabbit hole goes"
All the best.
Maybe you could consider an approach based on the Rails framework:
Initialize database table content via presets (fixtures in Rails)
Open transaction
Run test
Rollback transaction
Go to point 2. with another test
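In .NET, steps 2-4 map naturally onto a TransactionScope per test. A minimal MSTest sketch, assuming the data access enlists in the ambient transaction (all names are illustrative):
using System.Transactions;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class CaseDatabaseTests
{
    private TransactionScope scope;

    [TestInitialize]
    public void OpenTransaction()
    {
        scope = new TransactionScope();
    }

    [TestCleanup]
    public void RollbackTransaction()
    {
        scope.Dispose(); // disposed without Complete() => rolled back
    }

    [TestMethod]
    public void RemoveCase_ExistingCase_ReturnsTrue()
    {
        // Arrange: insert fixture data; Act: call RemoveCase;
        // Assert: query the result. All changes roll back after the test.
    }
}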
I'm trying to grasp test driven development, and I'm wondering if these unit tests are fine. I have an interface which looks like this:
public interface IEntryRepository
{
IEnumerable<Entry> FetchAll();
Entry Fetch(int id);
void Add(Entry entry);
void Delete(Entry entry);
}
And then this class which implements that interface:
public class EntryRepository : IEntryRepository
{
public List<Entry> Entries {get; set; }
public EntryRepository()
{
Entries = new List<Entry>();
}
public IEnumerable<Entry> FetchAll()
{
throw new NotImplementedException();
}
public Entry Fetch(int id)
{
return Entries.SingleOrDefault(e => e.ID == id);
}
public void Add(Entry entry)
{
Entries.Add(entry);
}
public void Delete(Entry entry)
{
Entries.Remove(entry);
}
}
These are the unit tests I have written so far. Are they fine, or should I do something different? Should I be mocking the EntryRepository?
[TestClass]
public class EntryRepositoryTests
{
private EntryRepository rep;
public EntryRepositoryTests()
{
rep = new EntryRepository();
}
[TestMethod]
public void TestAddEntry()
{
Entry e = new Entry { ID = 1, Date = DateTime.Now, Task = "Testing" };
rep.Add(e);
Assert.AreEqual(1, rep.Entries.Count, "Add entry failed");
}
[TestMethod]
public void TestRemoveEntry()
{
Entry e = new Entry { ID = 1, Date = DateTime.Now, Task = "Testing" };
rep.Add(e);
rep.Delete(e);
Assert.AreEqual(null, rep.Entries.SingleOrDefault(i => i.ID == 1), "Delete entry failed");
}
[TestMethod]
public void TestFetchEntry()
{
Entry e = new Entry { ID = 2, Date = DateTime.Now, Task = "Testing" };
rep.Add(e);
Assert.AreEqual(2, rep.Fetch(2).ID, "Fetch entry failed");
}
}
Thanks!
Just off the top of my head...
Note that your testing of Add really only tests the framework:
You've got adding 1 item, that's good
what about adding LOTS of items
(I mean, ridiculous amounts - for what value of n entries does the container add fail?)
what about adding no items? (null entry)
if you add items to the list, are they in a particular order?
should they be?
likewise with your fetch:
what happens in your fetch(x) if x > rep.Count ?
what happens if x < 0?
what happens if the rep is empty?
does fetch meet performance requirements? (what's its algorithmic complexity? is it within range when there's just one entry and when there's a ridiculously large number of entries?)
There's a good checklist in the book Pragmatic Unit Testing (good book, highly recommended)
Are the results right?
Are all the boundary conditions CORRECT:
Conform to an expected format
Ordered correctly
In a reasonable range
Does it Reference any external dependencies
Is the Cardinality correct? (right number of values)
does it complete in the correct amount of Time (real or relative)
Can you check inverse relationships
Can you cross check the results with another proven method
Can you force error conditions
Are performance characteristics within bounds
Here's some thoughts:
Positive
You're Unit Testing!
You're following the convention Arrange, Act, Assert
Negative
Where's the test to remove an entry when there's no entry?
Where's the test to fetch an entry when there's no entry?
What is supposed to happen when you add two entries and remove one? Which one should be left?
Should Entries be public? The fact that one of your asserts calls rep.Entries.SingleOrDefault suggests to me you're not constructing the class correctly.
Your test naming is a bit vague; a good pattern to follow is {MethodName}_{Context}_{ExpectedBehavior} (e.g. Add_SingleEntry_IncreasesCount), which removes the "test" redundancy.
As a beginner to TDD I found the book Test-Driven Development By Example to be a huge help. Second, Roy Osherove has some good Test Review video tutorials, check those out.
Before I answer, let me state that I am fairly new to unit testing and by no means an expert, so take everything I state with a grain of salt.
But I feel that your unit tests are largely redundant. Many of your methods are simple pass-throughs; your Add method, for example, is simply a call to the underlying List<T>.Add. You're not testing your code, you're testing the .NET base class library.
I would recommend only unit testing methods that contain logic that you write. Avoid testing obvious methods like getters and setters, because they operate at the most basic level. That's my philosophy, but I know some people do believe in testing obvious methods, I just happen to think it is pointless.
Seems fine like this. I personally like to give my tests a bit more descriptive names but that's more of a personal preference.
You can use mocking for dependencies of the class you're testing, EntryRepository is the class under test so no need to mock that, else you would end up testing the mock implementation instead of the class.
Just to give a quick example: if your EntryRepository used a backend database to store the Entries instead of a List, you could inject a mock implementation for the data-access stuff instead of calling a real database.
This looks like a fine start, but you should try to test the 'borderline' cases as much as possible. Think about what might cause your methods to fail - Would passing a null Entry to Add or Delete be valid? Try to write tests that exercise every possible code path. Writing tests in this manner will make your test suite more useful in the future, should you make any changes to the code.
Also, it is useful for each test method to leave the test object in the same state as when it was invoked. I noticed your TestFetchEntry method adds an element to the EntryRepository but never removes it. Having each method leave the test object's state untouched makes it easier to run a series of tests.
You should not be mocking the IEntryRepository, since the implementing class is the class under test. You might want to mock the List<Entry> and inject it, then just test that the methods you invoke via your public interface are called appropriately. This would just be an alternative to the way you have it implemented and isn't necessarily better -- unless you want the class to have its backing list injected, in which case writing the tests that way would force that behavior.
You might want some more tests to make sure that when you insert an entry, the right entry is inserted, and likewise with delete -- insert a couple of entries, then delete one and make sure the proper one has been deleted. Once you come up with tests that get the code to do what you want, keep thinking of ways you could mess up writing the code, and write tests to make sure those don't happen. Granted, this class is pretty simple, and you may be able to convince yourself that the behavior-driving tests you have are sufficient. It doesn't take much complexity, though, to make it worth testing edge cases and unexpected behavior.
For a TDD beginner and for this particular class, your tests are fine. +1 for making the effort.
Post another question once you get to more complex scenarios involving dependency injection and mocking. This is where things get really interesting ;).
Looks good overall. You should use transactions (or create a new instance of the repository in TestInitialize) to make sure the tests are truly isolated. For example:
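A quick sketch of the TestInitialize variant:
[TestClass]
public class EntryRepositoryTests
{
    private EntryRepository rep;

    // A fresh repository per test means no state leaks between tests.
    [TestInitialize]
    public void Setup()
    {
        rep = new EntryRepository();
    }
}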
Also use more descriptive test method names, like When_a_new_entity_is_added_to_an_EntryRepository_the_total_count_of_objects_should_get_incremented.