How can I unit test Entity Framework Code First Mappings? - c#

I'm using Code First to map classes to an existing database. I need a way to unit test these mappings, which are a mix of convention-based, attribute-based, and fluent-api.
To unit test, I need to confirm that properties of the classes map to the correct table and column names in the database. This test needs to be performed against the context, and should cover all configuration options for code first.
At a very high level, I'd be looking to assert something like (pseudo-code):
Assert.IsTrue(context.TableFor<Widget>().IsNamed("tbl_Widget"));
Assert.IsTrue(context.ColumnFor<Widget>(w => w.Property).IsNamed("WidgetProperty"));

Another idea to consider is using Linq and ToString().
For example, this:
context.Widget.Select(c => c.Property).ToString()
will result in the following SQL for the SQL Server provider:
"SELECT [Var_3].[WidgetProperty] AS [WidgetProperty] FROM [dbo].[Widget]..."
Now we could hide it all in an extension method that parses the resulting SQL; it would look almost like your pseudo-code:
Assert.IsTrue(context.Widgets.GetSqlColumnNameFor(w => w.Property).IsNamed("WidgetProperty"));
Draft for the extension (it must live in a static class and uses System.Text.RegularExpressions; the regex assumes the single-column SELECT shape shown above):
public static string GetSqlColumnNameFor<TSource, TResult>(this DbSet<TSource> source, Expression<Func<TSource, TResult>> selector)
    where TSource : class
{
    // ToString() on the query yields the SQL the provider would execute.
    var sql = source.Select(selector).ToString();

    // Extract the column name from the "[Alias].[ColumnName] AS" fragment.
    var columnName = Regex.Match(sql, @"\.\[(\w+)\] AS").Groups[1].Value;
    return columnName;
}
Similarly, we could create GetSqlTableNameFor().
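A matching sketch, under the same assumptions (for the SQL Server provider the generated SQL contains a "FROM [dbo].[TableName]" fragment, and DbSet.ToString() yields the SQL for the unfiltered set):
public static string GetSqlTableNameFor<TSource>(this DbSet<TSource> source)
    where TSource : class
{
    // Grab the table name out of the FROM clause of the generated SQL.
    var sql = source.ToString();
    return Regex.Match(sql, @"FROM \[\w+\]\.\[(\w+)\]").Groups[1].Value;
}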
UPDATE: I decided to look for a dedicated SQL parser so this solution would be more generic; it turns out such a thing exists for .NET:
http://www.dpriver.com/blog/list-of-demos-illustrate-how-to-use-general-sql-parser/generate-internal-query-parse-tree-in-xml-for-further-processing/

The only way I can think of to cover every possible option would be to use the Entity Framework Power Tools to pre-compile the views of your DbContext, and probably use a combination of reflection on that generated type and RegEx on the generated code itself to verify everything maps the way you want it to. Sounds pretty painful to me.
Another thing that comes to mind is creating a facade around DbModelBuilder to intercept and check everything that passes through it, but I don't know if that would handle the convention-based stuff. Also sounds painful.
As a less-complete, but much easier alternative, you can probably knock out a large portion of this by switching to attribute-based mapping wherever possible. This would allow you to create a base test class, say, ModelTesting<TEntity>, which includes a few test methods that use reflection to verify that TEntity has:
A single TableAttribute.
Each property has a single ColumnAttribute or NotMappedAttribute.
At least one property with a KeyAttribute.
Each property type maps to a compatible database type.
You could even go so far as to enforce a naming convention based on the names of the properties and class (with a caveat for table-per-hierarchy types). It would also be possible to check the foreign key mappings. That's a write-once base class you can derive from once for each of your model types and catch the majority of your mistakes (well, it catches the majority of mine, anyway).
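A minimal sketch of such a base class, assuming NUnit and the System.ComponentModel.DataAnnotations attributes; it covers the first three checks in the list above:
using System;
using System.ComponentModel.DataAnnotations;
using System.ComponentModel.DataAnnotations.Schema;
using System.Linq;
using NUnit.Framework;

public abstract class ModelTesting<TEntity> where TEntity : class
{
    [Test]
    public void HasASingleTableAttribute()
    {
        var attributes = typeof(TEntity).GetCustomAttributes(typeof(TableAttribute), false);
        Assert.AreEqual(1, attributes.Length);
    }

    [Test]
    public void EveryPropertyIsMappedOrExcluded()
    {
        foreach (var property in typeof(TEntity).GetProperties())
        {
            // Each property must carry exactly one of the two attributes.
            var count = property.GetCustomAttributes(typeof(ColumnAttribute), false).Length
                      + property.GetCustomAttributes(typeof(NotMappedAttribute), false).Length;
            Assert.AreEqual(1, count, "Unmapped property: " + property.Name);
        }
    }

    [Test]
    public void HasAtLeastOneKeyProperty()
    {
        Assert.IsTrue(typeof(TEntity).GetProperties()
            .Any(p => p.GetCustomAttributes(typeof(KeyAttribute), false).Any()));
    }
}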
Anything that can't be represented by attributes, like TPH inheritance and such, becomes a little harder. An integration test that fires up the DbContext and does a FirstOrDefault on Set<TEntity>() would probably cover most of those bases, assuming your DbContext isn't generating your database for you.
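Such a test could live in the same ModelTesting<TEntity> base class; MyContext is an assumed stand-in for your actual context type:
[Test]
public void EntityCanBeMaterializedFromTheDatabase()
{
    using (var context = new MyContext())
    {
        // If any table, column, or inheritance mapping is wrong, materializing
        // a single row throws, so this one query smoke-tests the configuration.
        var unused = context.Set<TEntity>().FirstOrDefault();
    }
}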

If you wrote a method
public static string ToMappingString(this Widget obj)
Then you could easily test this via approval tests (www.approvaltests.com or NuGet).
There's a video here: http://www.youtube.com/watch?v=vKLUycNLhgc
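A hedged sketch of how that could look, assuming a hand-written ToMappingString that dumps the table name and column mappings as plain text:
using ApprovalTests;
using ApprovalTests.Reporters;
using NUnit.Framework;

[Test]
[UseReporter(typeof(DiffReporter))]
public void WidgetMappingIsApproved()
{
    // ApprovalTests stores the first result in an .approved.txt file and
    // fails (showing a diff) whenever the mapping output changes.
    Approvals.Verify(new Widget().ToMappingString());
}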
However, if you are looking to test "my objects save and retrieve themselves",
then this is a perfect place for "Theory Based Testing".
Theory based testing
Most unit tests take the form of
Given A,B expect C
Theory based testing is
Given A,B expect Theory
The beauty of this is there is no need to worry about which particular form A & B take since you don't need to know C, so any random generator will work.
Example 1: Testing Add and Subtract methods
Normally you would have stuff like
Assert.AreEqual(5, Add(2,3));
Assert.AreEqual(9, Add(10,-1));
Assert.AreEqual(10, Add(5,5));
Assert.AreEqual(7, Subtract(10,3));
However, if you wrote a Theory Test it would look like
for (int i = 1; i < 100; i++)
{
    int a = random.Next();
    int b = random.Next();
    Assert.AreEqual(a, Subtract(Add(a, b), b), string.Format("Failed for [a,b] = [{0},{1}]", a, b));
}
Now that you understand Theory based testing, the theory you are trying to test is
Given Model A
When A is stored to the database and then retrieved, the resulting object is equal to A
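A minimal sketch of that theory against an EF context. CreateRandomWidget and MyContext are assumptions standing in for your own randomized builder and context type, and the assertion assumes Widget implements value equality:
[Test]
public void ModelsRoundTripThroughTheDatabase()
{
    for (int i = 0; i < 100; i++)
    {
        var original = CreateRandomWidget(); // hypothetical randomized builder

        using (var context = new MyContext())
        {
            context.Widgets.Add(original);
            context.SaveChanges();
        }

        // Use a fresh context so the entity is actually re-read, not cached.
        using (var context = new MyContext())
        {
            var loaded = context.Widgets.Find(original.Id);
            Assert.AreEqual(original, loaded, "Round trip failed for id " + original.Id);
        }
    }
}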

Related

Can I define custom type mapping for parameters and fields in PetaPoco?

I'm currently trying to pick a C# ORM to use with my PostgreSQL database, and I'm interested in the micro-ORMs, since they allow me to better utilize the power of Postgres (and since full-blown ORMs are hard to configure. While Dapper simply works, trying to deal with NHibernate has left a forehead-shaped dent in my screen...)
Anyways, currently PetaPoco has the lead, but there is one feature I need and can't figure out if it has (to be fair, I couldn't find it in the other ORMs either): mapping of custom types.
My PostgreSQL database uses the hstore and PostGIS extensions, which define custom types. I don't expect any ORM to support those types (it's hard enough to find one that supports PostgreSQL!), but I want to be able to provide my own mappers for them, so when I get them as columns or send them as parameters PetaPoco will automatically use my mappers.
Is this even possible? The closest I could find is IDbParameter support, but those are built-in types and I need to write mappers for extension types that are not part of the list...
Based on Schotime's comment, I came up with half a solution - how to parse the hstore from the query results into the object. I'm leaving this question open in case someone wants to provide the other half.
I need to define my own mapper. Obviously I want to use PetaPoco's default mapping for regular types, so it's only natural to inherit PetaPoco.StandardMapper - but that won't work, because StandardMapper implements PetaPoco.IMapper's members without the virtual keyword, so I can't override them (I can only shadow them, which doesn't really help).
What I did instead was to implement IMapper directly, and delegate regular types to an instance of PetaPoco.IMapper:
public class MyMapper : PetaPoco.IMapper
{
    // Delegate everything PetaPoco already handles to the standard mapper.
    private PetaPoco.StandardMapper standardMapper = new PetaPoco.StandardMapper();

    public PetaPoco.TableInfo GetTableInfo(Type pocoType)
    {
        return standardMapper.GetTableInfo(pocoType);
    }

    public PetaPoco.ColumnInfo GetColumnInfo(PropertyInfo pocoProperty)
    {
        return standardMapper.GetColumnInfo(pocoProperty);
    }

    public Func<object, object> GetFromDbConverter(PropertyInfo targetProperty, Type sourceType)
    {
        // Intercept hstore columns and build an HStore object from the raw string.
        if (targetProperty.PropertyType == typeof(HStore))
        {
            return x => HStore.Create((string)x);
        }
        return standardMapper.GetFromDbConverter(targetProperty, sourceType);
    }

    public Func<object, object> GetToDbConverter(PropertyInfo sourceProperty)
    {
        // Serialize HStore properties back to their SQL representation.
        if (sourceProperty.PropertyType == typeof(HStore))
        {
            return x => ((HStore)x).ToSqlString();
        }
        return standardMapper.GetToDbConverter(sourceProperty);
    }
}
The HStore object is constructed similarly to the one in Schotime's gist.
I also need to register the mapper:
PetaPoco.Mappers.Register(Assembly.GetAssembly(typeof(MainClass)),new MyMapper());
PetaPoco.Mappers.Register(typeof(HStore),new MyMapper());
Now, all of this works perfectly when I read from a query - but not when I write query parameters (even though I defined GetToDbConverter). It seems my mapper simply isn't called when I'm writing query parameters. Any idea how to do that?

Which of the following methods should I write unit tests for?

I'm looking at unit tests for the first time.
As I'm in Visual Studio 2008 I started with the built-in testing framework.
I've hit the button and started looking at filling in the blanks; it all seems fairly simple.
Except, I can see two problems.
1) A lot of the blank unit tests seem to be redundant. Is there a rule of thumb for choosing which methods not to write unit tests for?
2) Is there a best practice for writing tests for methods that read/write a database (SQL Server in this case)?
I'll give an example for (1).
I'm writing unit tests for a WCF web service. We use wscf.blue to write our web service WSDL/XSD first.
Here's the path through the (heavily simplified) code for the method that consumes a list of Users and writes them to the Users table in the database.
Entry Point
|
|
V
void PutOperators(PutOperatorsRequest request) (This method is auto generated code)
|
|
V
void PutOperatorsImplementation(PutOperatorsRequest input) (Creates a data context and a transaction, top level exception handling)
|
|
V
void PutEntities<T>(IEnumerable<T> input) (Generic method for putting a set of entities into the database, just a for loop, T is Operator in this case)
|
|
V
U PutEntity<T, U>(T entity) (Generic Method for converting the input to what the database expects and adding it to the DataContext ready for submission, T is Operator, U is the data layer entity, called User)
|
|
V
(This method calls 3 methods; the first 2 belong to the "entity" passed into this method, and the 3rd is an abstract method that, when overridden, knows how to consume a BL entity and flatten it to a database row)
void EnsureIDPresent() (Ensures that the incoming entity has a unique ID, or creates one)
void ValidateForInsert(AllOperators) (Does this ID already exist? etc.)
User ToDataEntity(Operator entity) (simple field mapping exercise, User.Name = Operator.Name, etc.)
So, as far as I can tell I have 3 methods that do something obviously testable:
EnsureIDPresent() - This method takes an input and modifies it in an easily testable way
ValidateForInsert() - This method takes the input and throws exceptions if criteria are not met
ToDataEntity() - This method takes an input, creates a data row entity and populates the values. Should be very easy to test.
There is also:
PutOperatorsImplementation() - It's here that DataContext.SubmitChanges() and TransactionScope.Complete() are called. Should I write tests to verify what is written to the database? And then what? Delete the records afterwards? Not sure what to do here.
I think I should delete the tests for:
PutOperators() - Auto generated code, one line, calls PutOperatorsImplementation()
PutEntities() - Just a for loop calling PutEntity(), and it's a generic method on a base class
PutEntity() - Calls three methods that already have unit tests and then calls DataContext.InsertOnSubmit().
I have a similar path for getting the data as well:
GetOperatorsResponse GetOperators(GetOperatorsRequest request) - Auto generated
GetOperatorsResponse GetOperatorsImplementation(GetOperatorsRequest input) - Set up DataContext
List<Operator> GetEntities() - Linq Query
Operator ToOperator(User) - Flattens one data entity into its equivalent BL entity.
I think I should just be testing ToOperator() and GetEntities()
Should I just have a dedicated test database with known good test data in it?
Is that the correct way to approach this?
There is no "hard and fast" rule as to what you should and should not test.
Unit tests are there to test that any implementation you're writing works, and to give you confidence when refactoring that you haven't broken anything. You need to consider whether the tests you write will give you value or not.
The main things you need to consider when deciding what code to cover are:
How likely is the code I'm about to write / have written to change and need refactoring?
Is the code mainly custom code or autogenerated? If autogenerated, then there is little value in writing tests, as you're simply testing that the autogenerator does its job properly (and you should be able to trust it).
You should not use databases or anything else that can change outside of the test environment to test data access code. Instead, consider writing "mocks" to mock the response from the data layer for your tests. This will ensure your tests are consistent. Consider looking at mocking frameworks such as Rhino Mocks or Moq.
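For example, a minimal sketch using Moq; the repository interface and the service under test are hypothetical stand-ins for your own data layer:
using System.Collections.Generic;
using Moq;

public interface IOperatorRepository
{
    IEnumerable<Operator> GetAll();
}

public class OperatorServiceTests
{
    public void GetOperatorsUsesTheRepository()
    {
        // The mock replaces the real database, so the test is repeatable
        // and runs without any external dependencies.
        var repository = new Mock<IOperatorRepository>();
        repository.Setup(r => r.GetAll())
                  .Returns(new List<Operator> { new Operator { Name = "Mike" } });

        var service = new OperatorService(repository.Object); // hypothetical class under test
        // ...assert on what the service does with the returned entities.
    }
}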
Always remember that you are writing tests for a reason, not for the sake of writing tests. If you are not going to gain any value from the tests you write (e.g. if your codebase is not going to change) then simply don't write them.

"Selecting" or "Wrapping" an IQueryable so that it is still queryable

I have a class / API that uses an IQueryable<FirstClass> data source; however, I wish to expose an IQueryable<SecondClass>, where SecondClass is a wrapper class for FirstClass that exposes nearly identical properties but, for various reasons, needs to inherit from an unrelated base class. For example:
// My API
public IQueryable<SecondClass> GetCurrentRecords()
{
    return from row in dataSource
           /* Linq query */
           select new SecondClass(row);
}
// User of my API
var results = GetCurrentRecords().Where(row => row.Owner == "Mike");
Now, I can make the above compile simply by using AsQueryable; however, I want to expose a "true" IQueryable that efficiently queries the database based on the API user's query.
I know that this isn't trivial (my wrapper IQueryable implementation needs to understand the relationship between the properties of SecondClass and FirstClass), and that it has nothing to do with the Select function, but it seems like it should be possible.
How do I do this?
Note: I know that instead my API could just expose FirstClass along with a helper method to convert FirstClass to SecondClass for when the API user is "done" creating their query, but it feels messy and I don't like the idea of exposing my generated classes in this way. Also I'd like to know how to do the above anyway just from a purely academic standpoint.
Probably you should return not an IQueryable but an Expression. Then you will be able to modify the expression and let LINQ generate a query from the final Expression object. An example is here: http://msdn.microsoft.com/en-us/library/bb882637.aspx
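As a rough sketch of the idea, assuming SecondClass exposes settable properties: a member-init projection stays inside the expression tree, so later operators compose into one translated query (whereas a constructor call like new SecondClass(row) generally cannot be translated by LINQ-to-SQL or EF):
// Expose the projection as an expression tree rather than compiled code.
Expression<Func<FirstClass, SecondClass>> projection =
    row => new SecondClass { Owner = row.Owner /* , other properties */ };

// Select() keeps the result queryable; the Where() below is merged into
// the same expression tree and translated to SQL by the provider.
IQueryable<SecondClass> records = dataSource.Select(projection);
var results = records.Where(s => s.Owner == "Mike");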

.NET refactoring, DRY, dual inheritance, data access and separation of concerns

Back story:
So I've been stuck on an architecture problem for the past couple of nights on a refactor I've been toying with. Nothing important, but it's been bothering me. It's an exercise in DRY, and an attempt to take it to such an extreme that the DAL architecture is completely DRY. It's a completely philosophical/theoretical exercise.
The code is based in part on one of #JohnMacIntyre's refactorings which I recently convinced him to blog about at http://whileicompile.wordpress.com/2010/08/24/my-clean-code-experience-no-1/. I've modified the code slightly, as I tend to, in order to take the code one level further - usually, just to see what extra mileage I can get out of a concept... anyway, my reasons are largely irrelevant.
Part of my data access layer is based on the following architecture:
abstract public class AppCommandBase : IDisposable { }
This contains basic stuff, like creation of a command object and cleanup after the AppCommand is disposed of. All of my command base objects derive from this.
abstract public class ReadCommandBase<T, ResultT> : AppCommandBase { }
This contains basic stuff that affects all read-commands - specifically in this case, reading data from tables and views. No editing, no updating, no saving.
abstract public class ReadItemCommandBase<T, FilterT> : ReadCommandBase<T, T> { }
This contains some more basic generic stuff - like definition of methods that will be required to read a single item from a table in the database, where the table name, key field name and field list names are defined as required abstract properties (to be defined by the derived class).
public class MyTableReadItemCommand : ReadItemCommandBase<MyTableClass, int?> { }
This contains specific properties that define my table name, the list of fields from the table or view, the name of the key field, a method to parse the data out of the IDataReader row into my business object and a method that initiates the whole process.
Now, I also have this structure for my ReadList...
abstract public class ReadListCommandBase<T> : ReadCommandBase<T, IEnumerable<T>> { }
public class MyTableReadListCommand : ReadListCommandBase<MyTableClass> { }
The difference being that the List classes contain properties that pertain to list generation (i.e. PageStart, PageSize, Sort) and return an IEnumerable, vs. the return of a single data object (which just requires a filter that identifies a unique record).
Problem:
I'm hating that I've got a bunch of properties in my MyTableReadListCommand class that are identical in my MyTableReadItemCommand class. I've thought about moving them to a helper class, but while that may centralize the member contents in one place, I'll still have identical members in each of the classes, that instead point to the helper class, which I still dislike.
My first thought was dual inheritance would solve this nicely, even though I agree that dual inheritance is usually a code smell - but it would solve this issue very elegantly. So, given that .NET doesn't support dual inheritance, where do I go from here?
Perhaps a different refactor would be more suitable... but I'm having trouble wrapping my head around how to sidestep this problem.
If anyone needs a full code base to see what I'm harping on about, I've got a prototype solution on my DropBox at http://dl.dropbox.com/u/3029830/Prototypes/Prototype%20-%20DAL%20Refactor.zip. The code in question is in the DataAccessLayer project.
P.S. This isn't part of an ongoing active project, it's more a refactor puzzle for my own amusement.
Thanks in advance folks, I appreciate it.
Separate the result processing from the data retrieval. Your inheritance hierarchy is already more than deep enough at ReadCommandBase.
Define an interface IDatabaseResultParser. Implement ItemDatabaseResultParser and ListDatabaseResultParser, both with a constructor parameter of type ReadCommandBase (and maybe convert that to an interface too).
When you call IDatabaseResultParser.Value() it executes the command, parses the results and returns a result of type T.
Your commands focus on retrieving the data from the database and returning it as tuples of some description (actual Tuples or an array of arrays, etc.); your parser focuses on converting the tuples into objects of whatever type you need. See NHibernate's IResultTransformer for an idea of how this can work (and it's probably a better name than IDatabaseResultParser too).
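A rough sketch of that separation; all of the names here are assumptions, and the actual tuple-to-object mapping is elided:
// The command retrieves raw rows; the parser converts them to typed results.
public interface IDatabaseResultParser<T>
{
    T Value();
}

public class ListDatabaseResultParser<T> : IDatabaseResultParser<IEnumerable<T>>
{
    private readonly ReadCommandBase<T, IEnumerable<T>> command;

    public ListDatabaseResultParser(ReadCommandBase<T, IEnumerable<T>> command)
    {
        this.command = command;
    }

    public IEnumerable<T> Value()
    {
        // Execute the command, then map each raw tuple to a T here,
        // keeping all result processing out of the command itself.
        throw new NotImplementedException(); // mapping elided in this sketch
    }
}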
Favor composition over inheritance.
Having looked at the sample I'll go even further...
Throw away AppCommandBase - it adds no value to your inheritance hierarchy as all it does is check that the connection is not null and open and creates a command.
Separate query building from query execution and result parsing - now you can greatly simplify the query execution implementation as it is either a read operation that returns an enumeration of tuples or a write operation that returns the number of rows affected.
Your query builder could all be wrapped up in one class to include paging / sorting / filtering, however it may be easier to build some form of limited structure around these so you can separate paging and sorting and filtering. If I was doing this I wouldn't bother building the queries, I would simply write the sql inside an object that allowed me to pass in some parameters ( effectively stored procedures in c# ).
So now you have IDatabaseQuery / IDatabaseCommand / IResultTransformer and almost no inheritance =)
I think the short answer is that, in a system where multiple inheritance has been outlawed "for your protection", strategy/delegation is the direct substitute. Yes, you still end up with some parallel structure, such as the property for the delegate object. But it is minimized as much as possible within the confines of the language.
But let's step back from the simple answer and take a wide view...
Another big alternative is to refactor the larger design structure such that you inherently avoid this situation where a given class consists of the composite of behaviors of multiple "sibling" or "cousin" classes above it in the inheritance tree. To put it more concisely, refactor to an inheritance chain rather than an inheritance tree. This is easier said than done. It usually requires abstracting very different pieces of functionality.
The challenge you'll have in taking this tack that I'm recommending is that you've already made a concession in your design: You're optimizing for different SQL in the "item" and "list" cases. Preserving this as is will get in your way no matter what, because you've given them equal billing, so they must by necessity be siblings. So I would say that your first step in trying to get out of this "local maximum" of design elegance would be to roll back that optimization and treat the single item as what it truly is: a special case of a list, with just one element. You can always try to re-introduce an optimization for single items again later. But wait till you've addressed the elegance issue that is vexing you at the moment.
But you have to acknowledge that any optimization for anything other than the elegance of your C# code is going to put a roadblock in the way of design elegance for the C# code. This trade-off, just like the "memory-space" conjugate of algorithm design, is fundamental to the very nature of programming.
As is mentioned by Kirk, this is the delegation pattern. When I do this, I usually construct an interface that is implemented by the delegator and the delegated class. This reduces the perceived code smell, at least for me.
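A minimal sketch of that arrangement; the shared list-generation properties and all names here are assumptions:
// The shared contract, implemented by both the delegated class and the delegator.
public interface IPagingOptions
{
    int PageStart { get; set; }
    int PageSize { get; set; }
}

public class PagingOptions : IPagingOptions
{
    public int PageStart { get; set; }
    public int PageSize { get; set; }
}

// The command delegates the shared members to a contained instance, so the
// member bodies live in exactly one place.
public class MyTableReadListCommand : IPagingOptions
{
    private readonly IPagingOptions paging = new PagingOptions();

    public int PageStart
    {
        get { return paging.PageStart; }
        set { paging.PageStart = value; }
    }

    public int PageSize
    {
        get { return paging.PageSize; }
        set { paging.PageSize = value; }
    }
}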
I think the simple answer is... since .NET doesn't support multiple inheritance, there is always going to be some repetition when creating objects of a similar type. .NET simply does not give you the tools to re-use some classes in a way that would facilitate perfect DRY.
The not-so-simple answer is that you could use code generation tools, instrumentation, CodeDOM, and other techniques to inject the objects you want into the classes you want. It still creates duplication in memory, but it would simplify the source code (at the cost of added complexity in your code injection framework).
This may seem unsatisfying like the other solutions, however if you think about it, that's really what languages that support MI are doing behind the scenes, hooking up delegation systems that you can't see in source code.
The question comes down to, how much effort are you willing to put into making your source code simple. Think about that, it's rather profound.
I haven't looked deeply at your scenario, but I have some thoughts on the dual-hierarchy problem in C#. To share code in a dual hierarchy, we need a different construct in the language: either a mixin, a trait (pdf) (C# research -pdf) or a role (as in Perl 6). C# makes it very easy to share code with inheritance (which is not the right operator for code reuse), and very laborious to share code via composition (you know, you have to write all that delegation code by hand).
There are ways to get a kind of mixin in C#, but it's not ideal.
The Oxygene (download) language (an Object Pascal for .NET) also has an interesting feature for interface delegation that can be used to create all that delegating code for you.

Linq based generic alternate to Predicate<T>?

I have an interface called ICatalog as shown below where each ICatalog has a name and a method that will return items based on a Predicate<Item> function.
public interface ICatalog
{
    string Name { get; }
    IEnumerable<Item> GetItems(Predicate<Item> predicate);
}
A specific implementation of a catalog may be linked to catalogs in various format such as XML, or a SQL database.
With an XML catalog I end up deserializing the entire XML file into memory, so testing each item with the predicate function does not add a whole lot more overhead, as it's already in memory.
Yet with the SQL implementation I'd rather not retrieve the entire contents of the database into memory, and then filter the items with the predicate function. Instead I'd want to find a way to somehow pass the predicate to the SQL server, or somehow convert it to a SQL query.
This seems like a problem that can be solved with Linq, but I'm pretty new to it. Should my interface return IQueryable instead? I'm not concerned right now with how to actually implement a SQL version of my ICatalog. I just want to make sure my interface will allow for it in the future.
Rob has indicated how you might do this (although a more classic LINQ approach might take Expression<Func<Item,bool>>, and possibly return IQueryable<IFamily>).
The good news is that if you want to use the predicate with LINQ-to-Objects (for your xml scenario) you can then just use:
Predicate<Item> func = predicate.Compile();
or (for the other signature):
Func<Item,bool> func = predicate.Compile();
and you have a delegate (func) to test your objects with.
The problem, though, is that this is a nightmare to unit test - you can only really integration test it.
The problem is that you can't reliably mock (with LINQ-to-Objects) anything involving complex data-stores; for example, the following will work fine in your unit tests but won't work "for real" against a database:
var foo = GetItems(x => SomeMagicFunction(x.Name));
static bool SomeMagicFunction(string name) { return name.Length > 3; } // why not
The problem is that only some operations can be translated to TSQL. You get the same problem with IQueryable<T> - for example, EF and LINQ-to-SQL support different operations on a query; even just First() behaves differently (EF demands you explicitly order it first, LINQ-to-SQL doesn't).
So in summary:
it can work
but think carefully whether you want to do this; a more classic black box repository / service interface may be more testable
You don't need to go all the way and create an IQueryable implementation
If you declare your GetItems method as:
IEnumerable<IFamily> GetItems(Expression<Predicate<Item>> predicate);
Then your implementing class can inspect the Expression to determine what is being asked.
Have a read of the IQueryable article though, because it explains how to build an expression tree visitor, which you'll need to build a simple version of.
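For a sense of scale, here is a minimal sketch using the framework's ExpressionVisitor base class; what you record in VisitBinary, and how you turn it into SQL, is up to your implementation:
using System.Linq.Expressions;

class PredicateInspector : ExpressionVisitor
{
    protected override Expression VisitBinary(BinaryExpression node)
    {
        if (node.NodeType == ExpressionType.Equal)
        {
            // node.Left / node.Right hold the member access and the constant,
            // e.g. item.Name == "foo"; record them to build a WHERE clause.
        }
        return base.VisitBinary(node);
    }
}

// Usage: new PredicateInspector().Visit(predicate) walks the expression tree.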
