What DAL strategy do you use or suggest? (C#)

My situation is that I essentially screwed up. I inherited my code base about 1.5 years ago when I took this position, and rather than reinventing the wheel (even though I now know I should have), I kept the DAL in pretty much the same structure the previous developer left it in.
Essentially there is one file (now at 15k lines of code) that serves as a go-between to a bunch of DAOs that use DataSets and TableAdapters to retrieve data. My .xsd files have grown so large that they cause R# to crash Visual Studio every time it opens, and the intermediary class, now 15k lines itself, takes forever for R# to analyze. Not to mention it is ugly; it works, but not well, and it is an absolute nightmare to debug.
What I have tried thus far is switching to NHibernate. NHibernate is a great library, but unfortunately it was not adaptable enough to work with my application; according to its lead developer (Fabio Maulo), it is pretty much a combination of my application's requirements and the restrictions NHibernate imposes when using identity as the database PK strategy.
So now I am back to essentially designing my own DAL. I am looking at a few different patterns for this, but would like to hear your DAL design strategies. There are so many ways and reasons to implement a DAL in a particular manner, so if you could explain your strategy and why it was the best fit for you, I would greatly appreciate it.
Thanks in advance!
Edit: Let me explain why NHibernate did not work, since that seems to be the immediate response. My users create a "job" that is actually just a transient representation of my Job class. Within this job they give it one or a list of weight factors, which are also transient at the time of creation. Finally, they provide a list of job details that have a particular weight factor associated with them. Because weight factors are unique in the DB, when I go to persist the job and the save cascades down to the weight factors, it dies when it finds a duplicate weight factor. I tried running a check before assigning the weight factor to the detail (which I didn't want to do, because I don't want the extra calls to the DB), but calling CreateCriteria in NH also causes a session flush, according to Fabio, which destroys my cache and thus kills the entire in-memory representation of the job. Folks on the NH mailing list said I should switch to GUIDs, but that is not a feasible option, as the conversion process would be a nightmare.

My experience with NHibernate is that, while it is packed with features and very high-performance, you will eventually need to become an NHibernate expert in order to fix some unexpected behavior. Reading through the pro-NHibernate answers and seeing

    Hmm, perhaps he uses long-running Sessions (Session per Business Transaction model), and in such an approach, using identity is discouraged, since it breaks your unit of work (it needs to flush directly after inserting a new entity). A solution could be to drop the identity and use the HiLo identity generator.

illustrates exactly what I mean.
What I've done is create a base class modeled somewhat on the ActiveRecord pattern. I inherit from it and mark up the inherited class with attributes that attach it to a stored procedure for each of Select, Insert, Update, and Delete. The base class uses reflection to read the attributes and assign the class's property values to SP parameters, and in the case of Select(), to assign the resulting SqlDataReader's column values to the properties of a list of generics.
This is what the IDataObjectBase interface looks like:
interface IDataObjectBase<T>
{
    void Delete();
    void Insert();
    System.Collections.Generic.List<T> Select();
    void Update();
}
This is an example of a data class deriving from it:
[StoredProcedure("usp_refund_CustRefundDetailInsert", OperationType.Insert)]
[StoredProcedure("usp_refund_CustRefundDetailSelect", OperationType.Select)]
[StoredProcedure("usp_refund_CustRefundDetailUpdate", OperationType.Update)]
public class RefundDetail : DataObjectBase<RefundDetail>
{
[StoredProcedureParameter(null, OperationType.Update, ParameterDirection.Input)]
[StoredProcedureParameter(null, OperationType.Insert, ParameterDirection.Output)]
[StoredProcedureParameter(null, OperationType.Select, ParameterDirection.Input)]
[ResultColumn(null)]
public int? RefundDetailId
{ get; set; }
[StoredProcedureParameter(null, OperationType.Update, ParameterDirection.Input)]
[StoredProcedureParameter(null, OperationType.Insert, ParameterDirection.Input)]
[StoredProcedureParameter(null, OperationType.Select, ParameterDirection.Input)]
[ResultColumn(null)]
public int? RefundId
{ get; set; }
[StoredProcedureParameter(null, OperationType.Update, ParameterDirection.Input)]
[StoredProcedureParameter(null, OperationType.Insert, ParameterDirection.Input)]
[ResultColumn(null)]
public int RefundTypeId
{ get; set; }
[StoredProcedureParameter(null, OperationType.Update, ParameterDirection.Input)]
[StoredProcedureParameter(null, OperationType.Insert, ParameterDirection.Input)]
[ResultColumn(null)]
public decimal? RefundAmount
{ get; set; }
[StoredProcedureParameter(null, OperationType.Update, ParameterDirection.Input)]
[StoredProcedureParameter(null, OperationType.Insert, ParameterDirection.Input)]
[ResultColumn(null)]
public string ARTranId
{ get; set; }
}
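For a sense of the plumbing, here is a heavily simplified sketch of the parameter-mapping step the base class performs. The attribute shape is inferred from the example above; the bodies here are illustrative, not the actual library code.

using System;
using System.Data;
using System.Data.SqlClient;
using System.Linq;
using System.Reflection;

// Minimal stand-in for the attribute type used above (shape assumed).
public enum OperationType { Select, Insert, Update, Delete }

[AttributeUsage(AttributeTargets.Property, AllowMultiple = true)]
public class StoredProcedureParameterAttribute : Attribute
{
    public string Name { get; private set; }
    public OperationType Operation { get; private set; }
    public ParameterDirection Direction { get; private set; }

    public StoredProcedureParameterAttribute(
        string name, OperationType operation, ParameterDirection direction)
    {
        Name = name;
        Operation = operation;
        Direction = direction;
    }
}

public abstract class DataObjectBase<T> where T : DataObjectBase<T>
{
    // Reflect over attributed properties and turn each one into an SP parameter.
    protected void AddParameters(SqlCommand command, OperationType operation)
    {
        foreach (PropertyInfo property in GetType().GetProperties())
        {
            var attributes = property
                .GetCustomAttributes<StoredProcedureParameterAttribute>()
                .Where(a => a.Operation == operation);

            foreach (var attribute in attributes)
            {
                // A null attribute name means "use the property name".
                string name = "@" + (attribute.Name ?? property.Name);
                var parameter = command.Parameters.AddWithValue(
                    name, property.GetValue(this, null) ?? DBNull.Value);
                parameter.Direction = attribute.Direction;
            }
        }
    }
}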
I know it seems like I'm reinventing the wheel, but all of the libraries I found either had too much dependence on other libraries (ActiveRecord + NHibernate, for instance, which was a close second) or were too complicated to use and administer.
The library I made is very lightweight (maybe a couple of hundred lines of C#) and doesn't do anything more than assign values to parameters and execute the SP. It also lends itself very well to code generation, so eventually I expect to write no data access code. I also like that it uses a class instance instead of a static class, so that I can pass data to queries without some awkward criteria collection or HQL. Select() means "get more like me".

For me the best fit was a pretty simple concept: use DAO class definitions and, with reflection, create all the SQL necessary to populate and save them. This way there is no mapping file, only simple classes. My DAOs require an Entity base class, so they are not POCOs, but that doesn't bother me. It does support any type of primary key, be it a single identity column or multiple columns.

If your DAL is written to an interface, it would be much easier to switch to NHibernate or something comparable (I would prefer Fluent NHibernate, but I digress). So why not spend the time refactoring the DAL to use an interface instead, and then write a new implementation using NH or your ORM of choice?
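As a rough sketch of what that could look like (hypothetical names, borrowing the RefundDetail class from the earlier answer as the example entity):

using System;

public interface IRefundDetailRepository
{
    RefundDetail GetById(int refundDetailId);
    void Insert(RefundDetail detail);
    void Update(RefundDetail detail);
    void Delete(int refundDetailId);
}

// The business layer depends only on the interface. The concrete class
// (stored-procedure based today, NHibernate or another ORM tomorrow) is
// chosen in one place, so swapping implementations never touches callers.
public class StoredProcRefundDetailRepository : IRefundDetailRepository
{
    public RefundDetail GetById(int refundDetailId) { /* existing DAL code */ throw new NotImplementedException(); }
    public void Insert(RefundDetail detail) { /* existing DAL code */ }
    public void Update(RefundDetail detail) { /* existing DAL code */ }
    public void Delete(int refundDetailId) { /* existing DAL code */ }
}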

In recent projects we have stopped programming a separate DAL.
Instead we use an Object Relational Mapper (in our case Entity Framework). We then let the business layer program directly against the ORM.
This has saved us over 90% of development effort in some cases.

My first step would be to break up that 15 KLOC monster, then come up with a strategy for creating a new DAL.

LINQ to SQL is nice if you are using SQL Server. There is source out there for a LINQ to SQL provider for Access and MySQL, although I haven't tested it. LINQ to SQL follows the unit-of-work model, which is similar to the way ADO.NET functions: you make a series of changes to a local copy of the data, then commit all the changes with one update call. It's pretty clean, I think.
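A minimal sketch of that flow, assuming a designer-generated MyDataContext with a Customers table (the context and entity are assumptions, not shown here):

using System.Linq;

public static class CustomerMaintenance
{
    public static void Rename(int customerId, string newName)
    {
        using (var db = new MyDataContext())
        {
            // Changes are tracked locally against this DataContext...
            var customer = db.Customers.Single(c => c.CustomerId == customerId);
            customer.Name = newName;

            // ...and committed together in one call (the unit of work).
            db.SubmitChanges();
        }
    }
}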
You can also extend the DataRow class yourself to provide strongly typed access to your fields. I used XSLT to generate the DataRow descendants based on the metadata of each table, and I have a generic DataTable descendant, MyDataTable<T>, where T is my derived row. I know that MS's strongly typed DataSets do a similar thing, but I wanted a lightweight generic version that I have complete control of. Once you have this, you can write static access methods that query the DB and fill the DataTable.
You would be in charge of writing the changes from the DataTable back to the data source. I would write a generic class or method that creates the updates, inserts, and deletes.
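The shape of that generic typed-row idea might look like this sketch (CustomerRow stands in for what the XSLT code generation would emit):

using System;
using System.Data;

// Generic DataTable descendant whose rows are all of type TRow.
public class MyDataTable<TRow> : DataTable where TRow : DataRow
{
    protected override Type GetRowType()
    {
        return typeof(TRow);
    }

    protected override DataRow NewRowFromBuilder(DataRowBuilder builder)
    {
        // Typed rows are created through the standard DataRowBuilder hook.
        return (DataRow)Activator.CreateInstance(typeof(TRow), builder);
    }
}

// Example generated row: strongly typed access over the underlying columns.
public class CustomerRow : DataRow
{
    public CustomerRow(DataRowBuilder builder) : base(builder) { }

    public int CustomerId
    {
        get { return (int)this["CustomerId"]; }
        set { this["CustomerId"] = value; }
    }

    public string Name
    {
        get { return (string)this["Name"]; }
        set { this["Name"] = value; }
    }
}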
Good Luck!

I use my own wrapper around SPs for the fastest data retrieval, and L2S when performance is not a goal. My DAL uses the repository pattern and encapsulates logic to support TDD.

Entity Framework 6: easiest way to denormalize a column to avoid frequent joins

Let's assume I have two entities.
class Author
{
    public int Id { get; set; }
    public string Name { get; set; }
    //.....
}

class Article
{
    public int Id { get; set; }
    public int AuthorId { get; set; }
    public string Text { get; set; }
}
Now I want to add an AuthorName property to Article, duplicating the existing Author.Name, to simplify the resulting LINQ queries and cut execution time. I'm sure that my database will only ever be used by this one ASP.NET MVC project. What is the common way to implement such a column using EF (without database triggers)?
There is also a slightly more difficult case: let's say I want a TotalWordCountInAllArticles column in the Author entity, calculated from the Text property of each Article.
You can add the AuthorName property to Article and just maintain the integrity manually, by making sure that any code that creates Articles or updates Author.Name also updates all of the Articles. Same thing with TotalWordCount: any time Article.Text changes, re-sum the counts from all of the author's Articles.
There are a few patterns you could look at to make this more automatic, such as the Domain Events pattern (https://lostechies.com/jimmybogard/2014/05/13/a-better-domain-events-pattern/), but it definitely isn't just plug and play. It really depends on whether this involves just a couple of items or is going to happen frequently.
If you are frequently denormalizing data for performance, you may want to look at an architecture where there is a normalized DB and a separate process that generates denormalized views of the data and puts them into a document store.
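A hedged EF6 sketch of the manual approach, funneling author renames through one method so both tables change in a single SaveChanges (the BlogContext name and the added Article.AuthorName property are assumptions):

using System.Data.Entity;
using System.Linq;

public class BlogContext : DbContext // hypothetical context
{
    public DbSet<Author> Authors { get; set; }
    public DbSet<Article> Articles { get; set; }
}

public static class AuthorService
{
    // The one choke point allowed to change an author's name.
    public static void Rename(BlogContext db, int authorId, string newName)
    {
        var author = db.Authors.Single(a => a.Id == authorId);
        author.Name = newName;

        // Keep the denormalized copy in step with the source of truth.
        foreach (var article in db.Articles.Where(a => a.AuthorId == authorId))
        {
            article.AuthorName = newName; // assumes the added property
        }

        db.SaveChanges(); // both updates commit together
    }
}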
NOTE: This might not answer the EF part of your question but it does offer an alternative solution to your problem.
Not sure how far along you are in the development of your project, but you may want to consider having a look at Drapper, which would make this trivial and fast, and offers a number of other benefits.
Let's assume a small change to your Article model to include the Author model.
public class Article
{
    public int ArticleId { get; set; }
    public string Text { get; set; }

    // using Author model
    public Author Author { get; set; }
}
And assuming that the SQL you'd expect to execute would be something conceptually similar to:
select article.[Id]
      ,article.[Text]
      ,article.[AuthorId]
      ,author.Name
from [Article] article
join [Author] author on author.AuthorId = article.AuthorId;
Implementing a repository to retrieve them with Drapper would be really trivial. It might look something like:
public class ArticleRepository : IArticleRepository
{
    // IDbCommander is a Drapper construct
    private readonly IDbCommander _commander;

    /// <summary>
    /// Initializes a new instance of the <see cref="ArticleRepository"/> class,
    /// injecting an instance of the IDbCommander using your IoC framework of
    /// choice.
    /// </summary>
    public ArticleRepository(IDbCommander commander)
    {
        _commander = commander;
    }

    /// <summary>
    /// Retrieves all article instances.
    /// </summary>
    public IEnumerable<Article> RetrieveAll()
    {
        // pass the query method a reference to a
        // mapping function (Func<T1, T2, TResult>)
        // although you *could* pass the predicate
        // in right here, the code is more readable
        // when it's separated out.
        return _commander.Query(Map.AuthorToArticle);
    }

    private static class Map
    {
        // simple mapping function which allows you
        // to map out exactly what you want, exactly
        // how you want it. no hoop jumping!
        internal static Func<Article, Author, Article>
            AuthorToArticle = (article, author) =>
            {
                article.Author = author;
                return article;
            };
    }
}
You'd wire the SQL to the repository using the configuration available in Drapper. It supports both JSON and XML config files, or you could configure it all in code if you wanted to.
I've thrown a quick sample together for you over on Github.
Why should you consider this?
There are a number of benefits to going this route:
You indicated a performance concern (execution time). Drapper is an abstraction layer built on top of Dapper - the king of high-performance micro-ORMs.
You control the mapping of your objects explicitly - no weird semantics or framework quirks (like the one you're facing).
No auto-generated SQL. You decide exactly what SQL will be executed.
Your SQL is separated from your C# - if your schema changes (perhaps to improve performance) there's no need to recompile your project, change your entity mapping, or alter any of your domain code or repository logic. You simply update the SQL code in your configuration.
Along the same lines, you can design your service/repository layers to be more domain friendly without having to worry about data access concerns polluting your service layer (or vice versa).
Fully testable - you can easily mock the results from the IDbCommander.
Less coding - no need for both entities and DTOs (unless you want them), no overriding OnModelCreating methods or deriving from DbContext, no special attributes on your POCOs.
And that's just the tip of the iceberg.

Static vs. Instance Write Methods in Data Access Layer

I am creating a Data Access Layer in C# for an SQL Server database table. The data access layer contains a property for each column in the table, as well as methods to read and write the data from the database. It seems to make sense to have the read methods be instance based. The question I have is regarding handling the database generated primary key property getter/setter and the write method. As far as I know I have three options...
Option 1: Using a static write method while only allowing a getter on the primary key would let me enforce writing all of the correct values into the database, but it is unwieldy for a developer to use.
Option 2: Using an instance-based write method would be more maintainable, but I am not sure how I would handle the get/set on the primary key, and I would probably have to implement some kind of validation of the instance prior to writing to the database.
Option 3: Something else, but I am wary of LINQ and drag-and-drop stuff; they have burned me before.
Is there a standard practice here? Maybe I just need a link to a solid tutorial?
You might want to read up on active record patterns and some examples of them, and then implement your own class/classes.
Here's a rough sketch of a simple class that contains some basic concepts (below).
Following this approach you can expand on the pattern to meet your needs. You might be OK with retrieving a record from the DB as an object, altering its values, then updating the record (Option 2), or, if that is too much overhead, with a static method that directly updates the record in the database (Option 1). For an insert, the database (SP/query) should validate the natural/unique key on the table if you need it to, and probably return a specific value/code indicating a unique-constraint error. For updates, the same check would need to be performed if natural key fields are allowed to be updated.
A lot of this depends on what functionality your application will allow for the specific table.
I tend to prefer retrieving an object from the DB, altering its values, and saving, over static methods. For me, it's easier to use from calling code, and arcane business logic is easier to handle inside the class.
public class MyEntityClass
{
    private bool _isNew;
    private bool _isDirty;
    private int _pkValue;
    private string _colValue;

    public MyEntityClass()
    {
        _isNew = true;
    }

    public int PKValue
    {
        get { return _pkValue; }
    }

    public string ColValue
    {
        get { return _colValue; }
        set
        {
            if (value != _colValue)
            {
                _colValue = value;
                _isDirty = true;
            }
        }
    }

    public void Load(int pkValue)
    {
        _pkValue = pkValue;
        //TODO: query database and set member vars based on results (_colValue)
        // if data found
        _isNew = false;
        _isDirty = false;
    }

    public void Save()
    {
        if (_isNew)
        {
            //TODO: insert record into DB
            //TODO: return DB-generated PK ID value from SP/query, and set to _pkValue
        }
        else if (_isDirty)
        {
            //TODO: update record in DB
        }
    }
}
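Calling code would then look something like this (a sketch; the Load/Save bodies above are still TODOs):

// Update path: load, mutate, save. Save() takes the update branch
// because _isDirty was set by the ColValue setter.
var entity = new MyEntityClass();
entity.Load(42); // hypothetical PK value
entity.ColValue = "updated value";
entity.Save();

// Insert path: a fresh instance is marked new by the constructor,
// so Save() takes the insert branch and captures the generated PK.
var fresh = new MyEntityClass();
fresh.ColValue = "new row";
fresh.Save();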
Have you had a look at the Entity Framework? I know you said you are wary of LINQ, but EF4 takes care of a lot of the things you mentioned and is fairly standard practice for DALs.
I would stick with an ORM tool (EF, OpenAccess by Telerik, etc.) unless you need (not want) a customized DAL that you have total control over. For side projects I use an ORM; at work, however, we have our own custom DAL with provider abstractions and custom mappings between objects and the database.
NHibernate is also a very solid, tried-and-true ORM with a large community backing it.
Entity Framework is the way to go for your initial DAL; then optimize where you need to. Our company actually did some benchmarking comparing EF against a raw SQL reader, and found that for querying the database for one or two tables' worth of information, the speed is about even (neither is appreciably faster than the other). After two tables there is a performance hit, but it's not terribly significant. The one place where writing your own SQL statements became worthwhile was in batch commit operations, and at that point EF allows you to write the SQL queries directly. So save yourself some time and use EF for the basic heavy lifting, then use its direct connection for the more complicated operations. (It's the best of both worlds.)
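For illustration, that split might look like the following EF6 sketch; MyDbContext, Products, and the column names are assumptions, though Database.ExecuteSqlCommand is the standard EF6 entry point for raw SQL:

using System.Data.Entity;
using System.Linq;

public class Product
{
    public int Id { get; set; }
    public int CategoryId { get; set; }
    public decimal Price { get; set; }
}

public class MyDbContext : DbContext // hypothetical context
{
    public DbSet<Product> Products { get; set; }
}

public static class PricingMaintenance
{
    public static void ApplyIncrease(MyDbContext context, int categoryId)
    {
        // Ordinary querying: let EF do the heavy lifting.
        var affected = context.Products
            .Where(p => p.CategoryId == categoryId)
            .Count();

        // Batch commit: execute raw SQL over the same connection instead of
        // loading every row into the change tracker just to update it.
        context.Database.ExecuteSqlCommand(
            "UPDATE Products SET Price = Price * 1.1 WHERE CategoryId = @p0",
            categoryId);
    }
}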

Can I dynamically/on the fly create a class from an interface, and will NHibernate support this practice?

I’ve done some Googling but I have yet to find a solution, or even a definitive answer to my problem.
The problem is simple. I want to dynamically create a table per instance of a dynamically named/created object. Each table would then contain records that are specific to the object. I am aware that this is essentially an anti-pattern, but these tables could theoretically become quite large, so having all of the data in one table could lead to performance issues.
A more concrete example:
I have a base class/interface ACCOUNT which contains a collection of transactions. For each company that uses my software I create a new concrete version of the class, BOBS_SUB_SHOP_ACCOUNT or SAMS_GARAGE_ACCOUNT, etc. So the identifying value for the class is the class name, not a field within the class.
I am using C# and Fluent NHibernate.
So my questions are:
Does this make sense, or do I need to clarify more? (Or am I trying to do something I REALLY shouldn't?)
Does this pattern have a name?
Does NHibernate support this?
Do you know of any documentation on the pattern I could read?
Edit: I thought about this a bit more and I realized that I don't REALLY need dynamic objects. All I need is a way to tie objects with some identifier to a table through NHibernate. For example:
//begin - just a brain dump
public class Account
{
    public virtual string AccountName { get; set; }
    public virtual IList Stuff { get; set; }
}

// ... somewhere else in code ...

//gets mapped to a table BobsGarageAccount (or something similar)
var BobsGarage = new Account { AccountName = "BobsGarage" };

//gets mapped to a table StevesSubShop (or something similar)
var StevesSubShop = new Account { AccountName = "StevesSubShop" };
//end
That should suffice for what I need, assuming NHibernate would allow it. I am trying to avoid a situation where one giant table gets the heck beaten out of it if high volume occurs on the account tables. If all accounts were in one table... it could be ugly.
Thank you in advance.
Rather than creating a class on the fly, I would recommend a dynamic object. If you implement the right interfaces (one example is here, and in any case you can get there by inheriting from DynamicObject), you can write
dynamic bobsSubShopAccount = new DynamicAccount("BOBS_SUB_SHOP_ACCOUNT");
Console.WriteLine("Balance = {0}", bobsSubShopAccount.Balance);
in your client code. If you use the DLR to implement DynamicAccount, all these calls get intercepted at runtime and passed to your class. So, you could have the method
public override bool TryGetMember(GetMemberBinder binder, out object result)
{
    if (DatabaseConnection.TryGetField(binder.Name, out result))
        return true;

    // Log the database failure here
    result = null;
    return false; // The attempt to get the member fails at runtime
}
to read the data from the database using the name of the member requested by client code.
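For context, a minimal DynamicAccount skeleton might look like the sketch below. DatabaseConnection is a placeholder for whatever data access you already have (not a real API); this variant also passes the table name through.

using System.Dynamic;

public class DynamicAccount : DynamicObject
{
    private readonly string _accountTableName;

    public DynamicAccount(string accountTableName)
    {
        // e.g. "BOBS_SUB_SHOP_ACCOUNT"; used to pick the backing table.
        _accountTableName = accountTableName;
    }

    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        // Resolve the requested member name as a field in the account's table.
        if (DatabaseConnection.TryGetField(_accountTableName, binder.Name, out result))
            return true;

        result = null;
        return false;
    }
}

// Placeholder stand-in so the sketch compiles; real code would query the DB.
internal static class DatabaseConnection
{
    public static bool TryGetField(string table, string field, out object value)
    {
        value = null; // TODO: look up `field` in `table`
        return false;
    }
}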
I haven't used NHibernate, so I can't comment with any authority on how NHibernate will play with dynamic objects.
Those classes seem awfully smelly to me, and they attempt to solve what amounts to a storage-layer issue, not a domain issue. Sharding is essentially the term you are looking for.
If you are truly worried about the performance of the DB, and your loads will be that large, perhaps you should look at partitioning the table instead. Your domain objects could easily handle creating the partition key, and you don't have to do crazy voodoo with NHibernate. This will also make it easier to avoid doing nutty domain-level things if you change your persistence mechanism later. You can create collection filters in your maps, or map read-only objects to a view. The latter option would be a bit smelly in the domain, though.
If you absolutely insist on doing some voodoo, you might want to look at NHibernate.Shards; it was intended for easy database sharding. I can't say what its current dev state and compatibility are, but it's an option.

Entity Framework Decorator Pattern

In my line of business we have Products. These products can be modified by a user by adding Modifications to them. Modifications can do things such as alter the price and alter properties of the Product. This, to me, seems to fit the Decorator pattern perfectly.
Now, envision a database in which Products exist in one table and Modifications exist in another table and the database is hooked up to my app through the Entity Framework. How would I go about getting the Product objects and the Modification objects to implement the same interface so that I could use them interchangeably?
For instance, the kind of things I would like to be able to do:
Given a Modification object, call .GetNumThings(), which would then return the number of things in the original object, plus or minus the number of things added by the modification.
This question may be stemming from a pretty serious lack of exposure to the nitty-gritty of EF (all of my experience so far has been pretty straight-forward LOB Silverlight apps), and if that's the case, please feel free to tell me to RTFM.
Thanks in advance!
Edit:
It would also be nice if, given a third table linking Products to Modifications (one-to-many), it could reconstruct the decorated object (I realize that this is likely way out of bounds for EF to do automatically). How would you recommend going about this, and where would that code reside? Would it be part of the EF classes, or would every entity I receive from the DB need to be passed through some sort of "builder" to construct a decorated object from a Product and its list of Modifications?
I am not entirely sure I understood your question correctly, but here goes: you can create partial classes alongside those defined in your EF model. You could define a common interface and use the partial classes to implement it.
For example:
public interface IProduct
{
    int GetNumThings();
}

public partial class Product : IProduct
{
    public int GetNumThings()
    {
        ...
    }
}

public partial class Modification : IProduct
{
    public int GetNumThings()
    {
        ...
    }
}
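To the edit about reconstructing the decorated object: EF won't assemble the chain for you, so one option is a small builder that runs after loading and wraps a Product in its Modifications. The Decorated property here is an assumption; you would add it to the Modification partial class yourself.

using System.Collections.Generic;

public static class DecoratedProductBuilder
{
    // Wraps the product in each modification in turn, so the outermost
    // IProduct's GetNumThings() can delegate inward and apply its delta.
    public static IProduct Build(Product product, IEnumerable<Modification> modifications)
    {
        IProduct current = product;
        foreach (var modification in modifications)
        {
            modification.Decorated = current; // hypothetical property on the partial class
            current = modification;
        }
        return current;
    }
}

Modification.GetNumThings() would then return Decorated.GetNumThings() plus or minus its own adjustment.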

Designing an OO and Unit Test Friendly Query System

I'm working on an application that allows dentists to capture information about certain clinical activities. While the application is not highly customizable (no custom workflows or forms), it does offer some rudimentary customization; clients can choose to augment the predefined form fields with their own custom ones. There are about half a dozen different field types that admins can create (e.g., Text, Date, Numeric, DropDown). We're using Entity-Attribute-Value (EAV) on the persistence side to model this functionality.
One of the other key features of the application is the ability to create custom queries against these custom fields. This is accomplished via a UI in which any number of rules (Date <= (Now - 5 Days), Text Like '444', DropDown == 'ICU') can be created. All rules are AND'ed together to produce a query.
The current implementation (which I "inherited") is neither object-oriented nor unit-testable. Essentially, there is a single "God" class that compiles all the myriad rule types directly into one complex dynamic SQL statement (inner joins, outer joins, and subselects). This approach is troublesome for several reasons:
Unit testing individual rules in isolation is nearly impossible.
That last point also means adding additional rule types in the future will most definitely violate the Open/Closed Principle.
Business logic and persistence concerns are being co-mingled.
Unit tests run slowly, since a real database is required (SQLite can't parse T-SQL, and mocking out a parser would be, uhh... hard).
I'm trying to come up with a replacement design that is flexible, maintainable, and testable, while still keeping query performance fairly snappy. This last point is key, since I imagine an OOAD-based implementation will move at least some of the data-filtering logic from the database server to the (.NET) application server.
I'm considering a combination of the Command and Chain-of-Responsibility patterns:
The Query class contains a collection of abstract Rule classes (DateRule, TextRule, etc.) and holds a reference to a DataSet class that contains an unfiltered set of data. DataSet is modeled in a persistence-agnostic fashion (no references or hooks into database types).
Rule has a single Filter() method which takes in a DataSet, filters it appropriately, and then returns it to the caller. The Query class then simply iterates over each Rule, allowing each Rule to filter the DataSet as it sees fit. Execution stops once all rules have been executed or once the DataSet has been filtered down to nothing.
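A minimal sketch of that shape (class names follow the description above; the row representation inside DataSet is an assumption):

using System.Collections.Generic;
using System.Linq;

// Persistence-agnostic record set (not System.Data.DataSet):
// each row is just a map of field name to value.
public class DataSet
{
    public List<Dictionary<string, object>> Rows { get; private set; }
    public DataSet() { Rows = new List<Dictionary<string, object>>(); }
}

public abstract class Rule
{
    public abstract DataSet Filter(DataSet input);
}

// One concrete rule: Text LIKE '%fragment%'.
public class TextRule : Rule
{
    private readonly string _field;
    private readonly string _fragment;

    public TextRule(string field, string fragment)
    {
        _field = field;
        _fragment = fragment;
    }

    public override DataSet Filter(DataSet input)
    {
        var output = new DataSet();
        output.Rows.AddRange(input.Rows.Where(row =>
            row.ContainsKey(_field) &&
            row[_field] is string &&
            ((string)row[_field]).Contains(_fragment)));
        return output;
    }
}

public class Query
{
    private readonly List<Rule> _rules = new List<Rule>();

    public void Add(Rule rule) { _rules.Add(rule); }

    // AND semantics: each rule narrows what the previous ones left.
    public DataSet Execute(DataSet unfiltered)
    {
        DataSet current = unfiltered;
        foreach (var rule in _rules)
        {
            current = rule.Filter(current);
            if (current.Rows.Count == 0) break; // nothing left to filter
        }
        return current;
    }
}

Each rule can then be unit tested in isolation by handing Filter() a small in-memory DataSet.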
The one thing that worries me about this approach is the performance implication of filtering a potentially large unfiltered data set in .NET. Surely there are some tried-and-true approaches to solving just this kind of problem that offer a good balance between maintainability and performance?
One final note: management won't allow the use of NHibernate. LINQ to SQL might be possible, but I'm not sure how applicable that technology would be to the task at hand.
Many thanks and I look forward to everyone's feedback!
Update: Still looking for a solution on this.
I think that LINQ to SQL would be an ideal solution coupled, perhaps, with Dynamic LINQ from the VS2008 samples. Using LINQ, particularly with extension methods on IEnumerable/IQueryable, you can build up your queries using your standard and custom logic depending on the inputs that you get. I use this technique heavily to implement filters on many of my MVC actions to great effect. Since it actually builds an expression tree then uses it to generate the SQL at the point where the query needs to be materialized, I think it would be ideal for your scenario since most of the heavy lifting is still done by the SQL server. In cases where LINQ proves to generate non-optimal queries you can always use table-valued functions or stored procedures added to your LINQ data context as methods to take advantage of optimized queries.
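For illustration, composing rules as IQueryable extension methods might look like the sketch below; the entity and field names are made up, and the single SQL statement is only generated when the query is enumerated:

using System;
using System.Linq;

public class RecordRow // hypothetical EAV projection
{
    public string TextValue { get; set; }
    public DateTime DateValue { get; set; }
}

public static class RecordFilters
{
    // Each method adds to the expression tree; nothing executes yet.
    public static IQueryable<RecordRow> TextLike(
        this IQueryable<RecordRow> source, string fragment)
    {
        return source.Where(r => r.TextValue.Contains(fragment));
    }

    public static IQueryable<RecordRow> OnOrBefore(
        this IQueryable<RecordRow> source, DateTime cutoff)
    {
        return source.Where(r => r.DateValue <= cutoff);
    }
}

// Usage - rules AND together by chaining, and SQL is produced at
// materialization:
//   var results = db.RecordRows
//       .TextLike("444")
//       .OnOrBefore(DateTime.Now.AddDays(-5))
//       .ToList();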
Updated: You might also try using PredicateBuilder from C# 3.0 in a Nutshell.
Example: find all Books where the Title contains one of a set of search terms and the publisher is O'Reilly.
var predicate = PredicateBuilder.True<Book>();
predicate = predicate.And(b => b.Publisher == "O'Reilly");

var titlePredicate = PredicateBuilder.False<Book>();
foreach (var term in searchTerms)
{
    titlePredicate = titlePredicate.Or(b => b.Title.Contains(term));
}
predicate = predicate.And(titlePredicate);

var books = dc.Book.Where(predicate);
The way I've seen it done is by creating objects that model each of the conditions you want the user to build their query from, and building up a tree of those objects.
From the tree of objects you should be able to recursively build up an SQL statement that satisfies the query.
The basic ones you'll need will be AND and OR objects, as well as objects to model comparison, like EQUALS, LESSTHAN etc. You'll probably want to use an interface for these objects to make chaining them together in different ways easier.
A trivial example:
using System.Text;

public interface IQueryItem
{
    string GenerateSQL();
}

public class AndQueryItem : IQueryItem
{
    private IQueryItem _FirstItem;
    private IQueryItem _SecondItem;

    // Properties and the like

    public string GenerateSQL()
    {
        StringBuilder builder = new StringBuilder();
        builder.Append(_FirstItem.GenerateSQL());
        builder.Append(" AND ");
        builder.Append(_SecondItem.GenerateSQL());
        return builder.ToString();
    }
}
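A comparison node plus usage might then look like this sketch. Note that the value is inlined purely for illustration; real code should emit parameter placeholders to avoid SQL injection, and the AndQueryItem constructor used below is an assumption.

public class EqualsQueryItem : IQueryItem
{
    private readonly string _column;
    private readonly string _value;

    public EqualsQueryItem(string column, string value)
    {
        _column = column;
        _value = value;
    }

    public string GenerateSQL()
    {
        // Illustration only - production code should generate "@pN"
        // placeholders and bind the values separately.
        return "(" + _column + " = '" + _value + "')";
    }
}

// Usage (assuming AndQueryItem gains a two-argument constructor):
//   IQueryItem condition = new AndQueryItem(
//       new EqualsQueryItem("Text", "444"),
//       new EqualsQueryItem("DropDown", "ICU"));
//   string whereClause = condition.GenerateSQL();
//   // -> (Text = '444') AND (DropDown = 'ICU')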
Implementing it this way should allow you to Unit Test the rules pretty easily.
On the negative side, this solution still leaves the database to do a lot of the work, which it sounds like you don't really want to do.
