Abstracting Data Access Layer with Entity Framework code first - C#

I have two separate databases. Database A is small and contains 6 tables and some stored procedures; database B is larger and contains 52 tables, 159 views, and some stored procedures. I have been tasked with creating an application that does operations on both of these databases. There is already code that accesses these databases, but it is not testable, so I would like to create the DAL code from scratch. So I made a code first Entity Framework connection and let it create all of the POCOs for me as well as the Context classes. Now I have 3 DatabaseContext classes: one for database A, which is just the 6 tables; one for the tables from database B; and one for the views from database B. I split the context class for database B into 2 separate context classes because the tables and the views are used in 2 different scenarios.
I have created 2 projects in a single solution: DomainModel and DataAccessLayer. The DomainModel project holds all of the POCOs which Entity Framework code first generated for me. DataAccessLayer is where the DatabaseContexts are and where I would like to create the code which other projects can use to access the database. My goal is to make the data access layer code reusable and testable.
The QUESTION is: what do I create in the DataAccessLayer project to make the DAL reusable in other projects that want to be able to talk to the database in an abstract way, and what is the best way to structure it?
What I have done:
In my search on the internet I have seen that people recommend the repository pattern (sometimes with UnitOfWork) and it seems like that is the way to go, so I created an interface called IGenericRepository and an implementation called GenericEntityFrameworkRepository. The interface has methods such as GetByID, Insert, Delete, and Update. The implementation takes a DbContext as a parameter in the constructor and implements all of the methods. The next step might be to create specific implementations for the different tables and views but that is going to be a very tedious thing to do since there are so many tables and views.
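A trimmed-down sketch of what that pair looks like (EF6; the interface, class and method names are as described above, the bodies are simplified):

using System.Data.Entity;

public interface IGenericRepository<TEntity> where TEntity : class
{
    TEntity GetByID(object id);
    void Insert(TEntity entity);
    void Update(TEntity entity);
    void Delete(object id);
}

public class GenericEntityFrameworkRepository<TEntity> : IGenericRepository<TEntity>
    where TEntity : class
{
    private readonly DbContext _context;

    public GenericEntityFrameworkRepository(DbContext context)
    {
        _context = context;
    }

    public TEntity GetByID(object id)
    {
        return _context.Set<TEntity>().Find(id);
    }

    public void Insert(TEntity entity)
    {
        _context.Set<TEntity>().Add(entity);
    }

    public void Update(TEntity entity)
    {
        _context.Set<TEntity>().Attach(entity);
        _context.Entry(entity).State = EntityState.Modified;
    }

    public void Delete(object id)
    {
        var entity = _context.Set<TEntity>().Find(id);
        if (entity != null)
        {
            _context.Set<TEntity>().Remove(entity);
        }
    }
}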
I have also read that Entity Framework IS the DAL. Then it might be enough to just create another project that holds the business logic and returns IEnumerables. But that seems like it will be hard to test.
What I am trying to achieve is a well structured and testable foundation to access our database that can be tested thoroughly and expanded as other projects start to require other functionality to the database.

Design and architecture is all about trade-offs. Be careful when you consider abstracting your data access for purposes of reuse: within an application this often leads to an overly complex and poorly performing solution to what could be a very simple and fast one.
For instance, if I want to build a business domain layer as a set of services to be consumed by a number of different applications (web applications and services/APIs that will serve mobile apps or third-party interests), then that domain layer becomes a boundary that defines what information will come in and what information will go out. Everything in and out of that abstraction should be DTOs that reflect the domain I am making available via the actions I make available. Within that boundary, EF should be leveraged to perform the projection needed to produce the relevant DTOs, or to update entities based on the specific details passed in (after verification). The trade-off is that every consumer will need to make do with this common access point, so there is little wiggle room to customize what comes out unless you build it in. This is still preferable to having everything do its own thing accessing the database and running custom queries or calling stored procedures.
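For example, inside that boundary a query can project straight to a DTO so only the needed columns cross it (a rough sketch; the entity shapes, DTO and context name are invented for illustration):

using System;
using System.Collections.Generic;
using System.Linq;

public class OrderSummaryDto
{
    public int OrderId { get; set; }
    public string CustomerName { get; set; }
    public decimal Total { get; set; }
}

public IReadOnlyList<OrderSummaryDto> GetRecentOrders(DateTime since)
{
    using (var context = new AppDbContext()) // invented context
    {
        // EF composes the Select into the SQL, so only these columns are read
        return context.Orders
            .Where(o => o.OrderDate >= since)
            .Select(o => new OrderSummaryDto
            {
                OrderId = o.Id,
                CustomerName = o.Customer.Name,
                Total = o.Lines.Sum(l => l.Quantity * l.UnitPrice)
            })
            .ToList();
    }
}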
What I see most often though is that within a single solution, developers want to abstract the system not to "know" that the data domain is provided by EF. The application passes around POCO Entities or DTOs and they implement things like Generic Repositories with GetById, GetAll, Update, Upsert, etc. This is a very, very poor design decision because it means that your data domain cannot take advantage of things like projection, pagination, custom filtering, ordering, eager loading related data, etc. Abstracting the EF DbContext is certainly a valid objective for enabling unit testing, but I strongly recommend avoiding the performance pitfall of a Generic Repository pattern. IMO attempting to abstract away EF just for the sake of hiding it or making it swappable is either a poor performing and/or an overly complex mess.
Take for example a typical GetAll() method.
public IEnumerable<T> GetAll()
{
    return _context.Set<T>().ToList();
}
What happens if you only want orders from the last 3 months?
var orders = orderRepository.GetAll()
    .Where(x => x.OrderDate >= startDate)
    .ToList();
The issue here is that GetAll() will always return all rows. If you have 4 million orders you can appreciate that is not desirable.
You could make an OrderRepository that extends Repository to implement a method like GetOrdersAfter(DateTime), but soon, what is the point of the Generic Repository? The next option would be to pass something like an Expression<Func<TEntity, bool>> Where clause into the GetAll() method. There are plenty of examples of that out there. However, doing that is leaking EF-isms into the consuming code. The Where clause expression has to conform to EF rules. For instance, passing order => order.SomeUnmappedProperty == someValue would cause the query to fail as EF won't be able to convert that down to SQL. Even if you're OK with that, the GetAll method will need to start looking like:
IEnumerable<TEntity> GetAll(Expression<Func<TEntity, bool>> whereClause = null,
    IEnumerable<OrderByClause> orderByClauses = null,
    IEnumerable<string> includes = null,
    int? pageNumber = null, int? pageSize = null)
or some similar monstrosity to handle filtering, ordering, eager loading, and pagination, and this doesn't even cover projection. For that you end up with additional methods like:
IEnumerable<TDTO> GetAll<TDTO>(Expression<Func<TEntity, bool>> whereClause = null,
    IEnumerable<OrderByClause> orderByClauses = null,
    IEnumerable<string> includes = null,
    int? pageNumber = null, int? pageSize = null)
which will call the other GetAll then perform a Mapper.Map to return a desired DTO/ViewModel that is hopefully registered in a mapping dependency. Then you need to consider async vs. synchronous flavours of the methods. (or forcing all calls to be synchronous or async)
Overall my advice would be to avoid Generic Repository patterns and don't abstract away EF just for the sake of abstraction; instead look for ways to leverage the most you can out of it. Most of the problems developers run into with performance and odd behaviour with EF stem from efforts to abstract it. They are so worried about needing to abstract it to be able to replace it if they need to, that they end up with those performance and complexity problems entirely because of the limitations imposed by the abstraction. (A self-fulfilling prophecy.) "Architecture" can be a dirty word when you try to prematurely optimize a solution to solve a problem you don't currently face, and probably never will. Always keep YAGNI and KISS at the forefront of all design considerations.
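To contrast with the GetAll variants above, this is the sort of thing I mean by leveraging EF directly: filtering, ordering, projection and pagination composed into one query that executes as a single SQL statement (a sketch reusing the invented order/DTO shapes from earlier):

// One composed query instead of GetAll() plus in-memory filtering
var page = context.Orders
    .Where(o => o.OrderDate >= startDate)
    .OrderByDescending(o => o.OrderDate)
    .Select(o => new OrderSummaryDto
    {
        OrderId = o.Id,
        CustomerName = o.Customer.Name
    })
    .Skip(pageNumber * pageSize)
    .Take(pageSize)
    .ToList();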

I guess you are struggling to select the right architecture for the given solution. Software must be created to meet the business needs, so depending on the business and the complexity of the domain, we have to choose the right architecture.
Microsoft elaborates on this in a detailed guide, which you can find here:
https://learn.microsoft.com/en-us/dotnet/architecture/modern-web-apps-azure/
If you want some level of abstraction between your data and the BL, then "N-Layer" architecture is not the best solution. I am not saying N-Layer is entirely outdated, but there are more suitable solutions for this problem.
And if you are using EF or EF Core, then there is no need to implement a Repository, because EF itself contains a unit of work and repositories.
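To illustrate (a minimal sketch; the ShopContext and Order types are placeholders): the DbContext is the unit of work and each DbSet behaves like a repository.

using (var context = new ShopContext()) // placeholder DbContext
{
    // DbSet<Order> already gives repository-style access
    var order = context.Orders.Find(orderId);
    order.Status = "Shipped"; // assumed string column, for the sketch only

    context.Orders.Add(new Order { OrderDate = DateTime.UtcNow });

    // SaveChanges commits everything above as one unit of work
    context.SaveChanges();
}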
Clean Architecture introduces a higher level of abstraction between layers. It abstracts away your data layer and makes it reusable across different projects or even different solutions. Clean Architecture is a big topic that I can't describe here in more detail, but I learned the subject from Jason Taylor:
Jason Taylor- Clean Architecture
https://www.youtube.com/watch?v=dK4Yb6-LxAk
https://github.com/jasontaylordev/CleanArchitecture


How can I reduce the amount of SQL-querying in my code? [closed]

I'm writing a big C# application that communicates with a MS-SQL Server database.
As the app grows bigger, I find myself writing more and more "boilerplate" code containing various SQL queries in various classes and forms like this:
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Windows.Forms;

public class SomeForm : Form
{
    public void LoadData(int ticketId)
    {
        // some multi-table SQL select and join query
        string sqlQuery = @"
            SELECT TOP(1) [Foo].Id AS FooId, [Foo].Name AS FooName, [Foo].Address AS FooAddress,
                [Bar].Name AS BarName, [Bar].UnitPrice AS BarPrice,
                [Bif].Plop
            FROM [dbo].[Foo]
            INNER JOIN [dbo].[Bar]
                ON [Bar].Id = [Foo].BarId
            INNER JOIN [dbo].[Bif]
                ON [Bar].BifId = [Bif].Id
            WHERE [Foo].TicketId = @ticketId";

        SqlCommand sqlCmd = new SqlCommand();
        sqlCmd.CommandText = sqlQuery;
        sqlCmd.Parameters.AddWithValue("@ticketId", ticketId);

        // connection string params etc and connection open/close handled by this call below
        DataTable resultsDataTable = SqlQueryHelper.ExecuteSqlReadCommand(sqlCmd);

        if (resultsDataTable.Rows.Count > 0)
        {
            var row = resultsDataTable.Rows[0];

            // read-out the fields
            int fooId = 0;
            if (!row.IsNull("FooId"))
                fooId = row.Field<int>("FooId");

            string fooName = "";
            if (!row.IsNull("FooName"))
                fooName = row.Field<string>("FooName");

            // read out further fields...

            // display in form
            this.fooNameTextBox.Text = fooName;
            // etc.
        }
    }
}
There are dozens of forms in this project all doing conceptually the same thing, just with different SQL queries (different columns selected, etc.)
And each time the forms are opened, the database is being continually queried.
For a local DB server the speed is OK but using the app over a slow VPN is painful.
Are there better ways of cutting down the amount of querying the database? Some sort of caching the database in memory and performing the queries on the in-memory data?
I've added some data tables to a data source in my project but can't understand how I can do complex queries like the one stated above.
Is there a better way of doing this?
Thanks for all your suggestions folks!
I think none of the recommendations, like using a framework to access the database, will solve your problem. The problem is not writing SQL or LINQ queries; you always have to dispatch a query against the database or a set of data at some point.
The problem is from where you query the database.
Your statement about writing "code containing various SQL queries in various classes and forms" gives me chills. I recently worked for a company that did exactly the same. As a result, they can't maintain their database anymore. Maintenance is still possible, but only in a rudimentary way: very expensive, time consuming and very frustrating, so nobody likes to do it, therefore nobody does it, and as a result it gets worse and worse. Queries get slower and slower, and the only possible quick fix is to buy more bandwidth for the database server.
The actual queries are scattered all over the projects making it impossible to improve/refactor queries (and identify them) or improve the table design. But the worst is, they are not able to switch to a faster or cheaper database model e.g. a graph structured database, although there is a burning desire and an urgent need to do so. It makes you look sloppy in front of customers. Absolutely no fun at all to work in such an environment (so I left). Sounds bad?
You really should decouple the database and the SQL from your business code. All this should be hidden behind an interface:
IRepository repository = Factory.GetRepository();
// IRepository exposes query methods, which hide
// the actual query and the database related code details from the repository client or business logic
var customers = repository.GetCustomers();
Spreading this code through your project doesn't hurt. It will improve maintainability and readability. It separates the data persistence from the actual business logic, since you are hiding/encapsulating all the details like the actual database, the queries and the query language. If you want to switch to another database, you just have to provide a new IRepository implementation with the modified queries. Changing the database won't break the application.
And all queries are implemented in one location/layer. When talking about queries, this includes LINQ queries as well (that's why using a framework like Entity Framework doesn't solve your problem). You can use dependency injection or the Factory pattern to distribute the implementation of IRepository. This even allows you to switch between different databases at runtime without recompiling the application.
Using this pattern (Repository Pattern) also allows you to decouple frameworks like Entity Framework from your business logic.
Take a look at the Repository Pattern. If it's not too late, you should start to refactor your existing code. The price is too high if you keep it as it is.
Regarding caching: I know that the database or the DBMS already handles the caching of data very efficiently. What you can do is:
Cache data locally, e.g. when the data won't be changed remotely (by some other client), such as local client settings. This way you can reduce network traffic significantly.
Cache data locally when you access a data set frequently and remote changes are unlikely to occur or won't have an impact. You can update the local data store periodically, or invalidate it after a period of time, to force the client to dispatch a new query and refresh the cache.
You can also filter data locally using LINQ. This may reduce network traffic too, although you may end up reading more data than necessary. Also, filtering is generally done more efficiently by the DBMS.
You can consider upgrading the database server or the VPN to increase bandwidth.
Indexing will also improve lookup times significantly.
You should consider refactoring all your SQL queries. There are many articles on how to improve the performance of SQL queries. The way you build up a query has significant impact on performance and execution time, especially with big data.
You can use data virtualization: it doesn't make sense to pull thousands of records from the database when you can only show 20 of them to the user. Pull more data as the user scrolls the view. Or even better, display a pre-selected list, e.g. of the most recent items, and allow the user to search for the data of interest. This way you read only the data the user explicitly asked for, which will improve overall performance drastically, since usually the user is only interested in very few records (see the paging sketch below).
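For the last point, a minimal paging sketch (SQL Server 2012+ OFFSET/FETCH; the table, columns and helper follow the question's example):

// Fetch one page of rows instead of the whole table
string pagedSql = @"
    SELECT [Id], [Name]
    FROM [dbo].[Foo]
    ORDER BY [Name]
    OFFSET @offset ROWS FETCH NEXT @pageSize ROWS ONLY";

SqlCommand pagedCmd = new SqlCommand(pagedSql);
pagedCmd.Parameters.AddWithValue("@offset", pageIndex * pageSize);
pagedCmd.Parameters.AddWithValue("@pageSize", pageSize);
DataTable pageTable = SqlQueryHelper.ExecuteSqlReadCommand(pagedCmd);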
Before introducing an interface (Dependency Inversion)
The following examples are meant to show that this is a question of architecture or design, rather than a question of frameworks or libraries.
Libraries or frameworks can help on a different level, but won't solve the problem that is introduced by spreading environment-specific queries all over the business code. Queries should always be neutral. When looking at the business code, you shouldn't be able to tell whether the data is fetched from a file or a database. These details must be hidden or encapsulated.
When you spread the actual database access code (whether plain SQL or with the help of a framework) throughout your business code, you are not able to write unit tests without a database attached. This is not desirable: it makes testing too complicated and the tests execute unnecessarily slowly. You want to test your business logic and not the database; these are separate tests. You usually want to mock the database away.
The problem:
you need data from the database in multiple places across the application's model or business logic.
The most intuitive approach is to dispatch a database query whenever and wherever you need the data. This means that when the database is a PostgreSQL database, all the code would of course use PostgreSQL directly, or some framework like Entity Framework or an ORM in general. If you decide to change the database or DBMS, e.g. to Oracle, or want to use a different framework to manage your entities, you would be forced to touch and rewrite every piece of code that uses PostgreSQL or Entity Framework.
In a big business application, this will be the reason that forces your company to stay with what you have and leaves your team dreaming of a better world. The frustration level will rise. Maintaining database-related code becomes nearly impossible, error prone and time consuming. Since the actual database access is not centralized, rewriting the database-related code means crawling through the complete application. The worst case is the spreading of meaningless SQL query strings that nobody understands or remembers. It becomes impossible to move to a new database or to refactor queries to improve performance, without sacrificing valuable and expensive time and team resources.
Imagine the following simplified symbolic method is repeated in some form across the application's business logic, maybe accessing different entities and using different filters, but using the same query language, framework or library. Let's say we find similar code a thousand times:
private IEnumerable<Customer> GetCustomers()
{
    // Use Entity Framework to manage the database directly
    return DbContext.Customers;
}
We have introduced a tight coupling to the framework, as it is woven deep into our business code. The code "knows" how the database is managed; it knows about Entity Framework, as it has to use its classes or API everywhere.
The proof: if you wanted to replace Entity Framework with some other framework, or just wanted to drop it, you would have to refactor the code in a thousand places - everywhere you used this framework in your application.
After introducing an interface (Dependency Inversion) and encapsulating all the database access
Dependency Inversion helps to remove a dependency on concrete classes by introducing interfaces. Since we prefer loose coupling between components and classes (to enhance flexibility, testability and maintainability), when using a helper framework or a plain SQL dialect we have to wrap this specific code and hide it behind an interface (Repository Pattern).
Instead of having a thousand places which explicitly use the database framework or SQL or LINQ queries to read, write or filter data, we now introduce interface methods, e.g. GetHighPriorityCustomers and GetAllCustomers. How the data is supplied, or from which kind of database it is fetched, are details that are known only to the implementation of this interface.
Now the application no longer uses any framework or database specific languages directly:
interface IRepository
{
    IEnumerable<Customer> GetHighPriorityCustomers();
    IEnumerable<Customer> GetAllCustomers();
}
The previous thousand places now look something like:
private IRepository Repository { get; } // Initialized e.g. from constructor

private IEnumerable<Customer> GetCustomers()
{
    // Use a repository hidden behind an interface.
    // We don't know in this place (business logic) how the interface is implemented
    // and what classes it uses. When the implementation changes from Entity Framework
    // to something else, no changes have to be made here (loose coupling).
    return this.Repository.GetAllCustomers();
}
The implementation of IRepository:
class EntityFrameworkRepository : IRepository
{
    public IEnumerable<Customer> GetAllCustomers()
    {
        // If we want to drop Entity Framework, we just have to provide a new implementation of IRepository
        return DbContext.Customers;
    }

    ...
}
Now, you decide to use plain SQL. The only change to make is to implement a new IRepository, instead of changing a thousand places to remove the specialized code:
class MySqlRepository : IRepository
{
    // The caller still accesses this method via the interface IRepository.GetAllCustomers()
    public IEnumerable<Customer> GetAllCustomers()
    {
        return this.Connection.ExecuteQuery("SELECT * FROM ...");
    }

    ...
}
Now you decide to replace MySQL with Microsoft SQL. All you have to do is to implement a new IRepository.
You can swap in and out any database and change the query language or introduce helper frameworks without affecting your original business logic. Once written, never touched again (at least for the changes regarding the database).
If you move the implementation to a separate assembly, you can even swap them at runtime.
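The Factory.GetRepository() used earlier can then be as simple as this (a sketch; the configuration key is invented):

using System;
using System.Configuration;

static class Factory
{
    public static IRepository GetRepository()
    {
        // Pick the implementation from configuration, e.g. app.config
        string provider = ConfigurationManager.AppSettings["RepositoryProvider"]; // invented key

        switch (provider)
        {
            case "EntityFramework": return new EntityFrameworkRepository();
            case "MySql": return new MySqlRepository();
            default: throw new InvalidOperationException("Unknown provider: " + provider);
        }
    }
}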
There is another solution besides Entity Framework.
For example you can use The Sharp Factory.
It is a commercial product but it not only maps database objects like Entity Framework but also creates a full Repository based on layers.
It is better than EF in my opinion if you are willing to pay for it.
There are some downsides to Entity Framework, for example the fact that your repository will leak throughout your layers, because you need to reference Entity Framework in all consumers of your entities. So even if your architecture looks correct, at runtime you can still execute SQL queries from your upper layers, even unknowingly.
I can suggest several things:
Instead of using plain ADO.NET, switch to Entity Framework, where you can use LINQ to SQL / LINQ to Entities (the newer version) and write queries in plain C#, without worrying about hand-writing SQL.
Use stored procedures, SQL functions and views, written in the SQL Server database, whose execution plans SQL Server caches, which provides more efficient execution, security and more maintainability.
To make your queries against database tables more efficient, consider using a Full-Text Index over the tables whose data you use most often in filtering operations, like search.
Use the Repository and Unit of Work patterns in your C# code (including integration with Entity Framework), which will do exactly the thing you want, i.e. collect several SQL queries and send them to SQL Server to execute at once, instead of sending queries one by one. This will not only drastically improve performance but will also keep your code as simple as it can be.
Note: one of the problems with your queries is related not only to their execution but also to opening and closing SQL database connections each time you need to execute a particular query. This problem is also addressed by the Repository and Unit of Work design pattern approach.
Based on your business needs, use in-memory or database caching for data that is repeated for a lot of users.

Automapper vs Dapper for mapping

This question is to verify if the current implementation is the right way to go about it in terms of best practices and performance. So far in all my previous companies I have been using AutoMapper to map relational objects to domain model entities, and domain model entities to DTOs. The ORM tool has been Entity Framework.
In my current company they are using Dapper as the ORM tool and do not use AutoMapper, as they say Dapper does the mapping for you internally. So the way they have structured the project is to create a separate class library project that contains the DTOs, and reference the DTOs in the data access and business layers.
The query result returned by Dapper is internally mapped to the DTOs. These DTOs are returned to the business layer and so on.
For example
In the code below, the ParticipantFunction class is a DTO.
Repository file in DataAccess layer
public List<ParticipantFunction> GetParticipantFunctions(int workflowId)
{
    // Update the Action for Participant
    string selectSql = @"SELECT [WFPatFunc_ID] AS WFPatFuncID
            ,[WFFunction]
            ,[SubIndustryID]
            ,[DepartmentID]
        FROM [dbo].[WF_ParticipantsFunctions]
        WHERE [DepartmentID] = (SELECT TOP 1 [DepartmentID] FROM [dbo].[WF] WHERE [WF_ID] = @workflowId)";

    return _unitOfWork.GetConnection().Query<ParticipantFunction>(selectSql, new
    {
        workflowId = workflowId
    }).ToList();
}
The reason, as I have been told by the developers, is that AutoMapper would just be overhead and reduce speed, and since Dapper does the mapping internally, there is no need for it.
I would like to know if the practice they are following is fine and have no issues.
There is no right or wrong here. If the current system works and solves all their requirements, then great: use that! If you have an actual need for something where auto-mapper would be useful, then great: use that!
But: if you don't have a need for the thing that auto-mapper does (and it appears that they do not), then... don't use that?
Perhaps one key point / question is: what is your ability to refactor the code if your requirements change later. For many people, the answer there is "sure, we can change stuff" - so in that case I would say: defer on adding an additional layer until you actually have a requirement for an additional layer.
If you will absolutely not be able to change the code later, perhaps due to lots of public-facing APIs (software as a product), then it makes sense to de-couple everything now so there is no coupling / dependency in the public API. But: most people don't have that. Besides which, dapper makes no demands on your type model whatsoever other than: it must look kinda like the tables. If it does that, then again: why add an additional layer if you don't need it?
This is more of an architecture problem and there is no good or bad.
Pros of DTOs:
Layering - You are not directly exposing your data objects, so you can use attributes for mapping and other stuff not needed in your UI. That way you can have the DTOs in a library that has no dependencies on your data access stuff. (Note: you could do this with fluent mapping, but this way you can use whatever you like.)
Modification - If your domain model changes, your public interface will stay the same. Let's say you add a property to the model: all the stuff you already built won't suddenly get the new field in its JSON for no reason.
Security - This is why Microsoft started pushing DTOs, if I remember correctly. I think CodePlex (not 100% sure it was them) was using EF entities directly to add stuff to the database. Someone figured this out and just expanded the JSON with stuff he was not allowed to access; for example, if you have a post that has a reference to a user, you could change the role of the user by adding a new post, because of change tracking. There are ways to protect yourself from this, but security should always be opt-out, not opt-in.
I like to use DTOs when I need to expose a business-logic level to a public interface, for example if I have an API with system operations like api/Users/AllUsers or api/Users/GetFilteredUsers.
No DTOs
Performance - Generally, not using DTOs will run faster: there is no extra mapping step. Projections help with this, but you can really optimize when you know what you need to do.
Speed of development and smaller code base - Sometimes a large architecture is overkill and you just want to get things done, without spending most of your time copy-pasting properties.
More flexibility - This is the opposite of the security point: sometimes you want to use the same API to do more than one thing, for example letting the UI decide what it wants to see from a big object, like select and expand. (Note: this can be done with DTOs, but if you have ever tried to do expand with DTOs you know how tricky it can get.)
I use this when I need to expose the data access level to the client, e.g. if you need to use Breeze or JayData and/or OData, with an API like api/Users and api/Users?filter=(a=>a.Value > 5).

Using View-Models with Repository pattern

I'm using a Domain-driven N-layered application architecture with EF code first in my recent project. I defined my repository contracts in the Domain layer.
A basic contract to make other Repositories less verbose:
public interface IRepository<TEntity, in TKey> where TEntity : class
{
    TEntity GetById(TKey id);
    void Create(TEntity entity);
    void Update(TEntity entity);
    void Delete(TEntity entity);
}
And specialized Repositories per each Aggregation root, e.g:
public interface IOrderRepository : IRepository<Order, int>
{
    IEnumerable<Order> FindAllOrders();
    IEnumerable<Order> Find(string text);
    //other methods that return Order aggregation root
}
As you see, all of these methods depend on Domain entities.
But in some cases an application's UI needs some data that isn't an entity; that data may be made from the data of two or more entities (View-Models). In those cases, I define the View-Models in the Application layer, because they depend closely on the application's needs and not on the domain.
So, I think I have 2 ways to show data as View-Models in the UI:
Leave the specialized repositories depending on entities only, and map the results of the repositories' methods to View-Models when I want to show them to the user (usually in the Application layer).
Add some methods to my specialized repositories that return their results as View-Models directly, and use those returned values in the Application layer and then the UI (I call these specialized repository contracts Readonly Repository Contracts; they live in the Application layer, unlike the other repository contracts, which live in the Domain).
Suppose my UI needs a View-Model with 3 or 4 properties (from 3 or 4 big entities).
Its data could be generated with a simple projection, but in case 1, because my methods cannot access View-Models, I have to fetch all the fields of all 3 or 4 tables, with sometimes huge joins, and then map the results to View-Models.
But in case 2, I could simply use a projection and fill the View-Models directly.
So, from a performance point of view, case 2 is better than case 1, but I have read that from a design point of view repositories should depend on entities and not View-Models.
Is there any better way that neither makes the Domain layer depend on the Application layer nor hits performance? Or is it acceptable that, for read queries, my repositories depend on View-Models (case 2)?
Perhaps using the command-query separation (at the application level) might help a bit.
You should make your repositories dependent on entities only, and keep only the trivial retrieve method - that is, GetOrderById() - on your repository (along with create / update / merge / delete, of course). Imagine that the entities, the repositories, the domain services, the user-interface commands and the application services that handle those commands (for example, a certain web controller that handles POST requests in a web application etc.) represent your write model, the write-side of your application.
Then build a separate read model that can be as dirty as you wish - put there the joins of 5 tables, the code that reads from a file the number of stars in the Universe, multiplies it by the number of books starting with A (after doing a query against Amazon) and builds up an n-dimensional structure etc. - you get the idea :) But on the read model, do not add any code that deals with modifying your entities. You are free to return any View-Models you want from this read model, but do not trigger any data changes from here.
The separation of reads and writes should decrease the complexity of the program and make everything a bit more manageable. And you may also see that it won't break the design rules you have mentioned in your question (hopefully).
From a performance point of view, using a read model - that is, writing the code that reads data separately from the code that writes / changes data - is as good as it gets :) This is because you can even mangle some SQL code in there without sleeping badly at night, and SQL queries, if written well, will give your application a considerable speed boost.
Nota bene: I was joking a bit on what and how you can code your read side - the read-side code should be as clean and simple as the write-side code, of course :)
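A read-side class can then stay very small (a sketch; the view-model shape and context name are invented):

using System.Collections.Generic;
using System.Linq;

// Read model: query-only, returns View-Models, never changes entities
public class OrderListItemViewModel
{
    public int OrderId { get; set; }
    public string CustomerName { get; set; }
}

public class OrderReadModel
{
    private readonly AppDbContext _context; // invented context

    public OrderReadModel(AppDbContext context)
    {
        _context = context;
    }

    public IReadOnlyList<OrderListItemViewModel> Search(string text)
    {
        return _context.Orders
            .Where(o => o.Customer.Name.Contains(text))
            .Select(o => new OrderListItemViewModel
            {
                OrderId = o.Id,
                CustomerName = o.Customer.Name
            })
            .ToList();
    }
}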
Furthermore, you may get rid of the generic repository interface if you want, as it just clutters the domain you are modeling and forces every concrete repository to expose methods that are not necessary :) See this. For example, it is highly probable that the Delete() method would never be used for the OrderRepository - as, perhaps, Orders should never be deleted (of course, as always, it depends). Of course you can keep the database-row-managing primitives in a single module and reuse those primitives in your concrete repositories, but to not expose those primitives to anyone else but the implementation of the repositories - simply because they are not needed anywhere else and may confuse a drunk programmer if publicly exposed.
Finally, perhaps it would also be beneficial not to think about the Domain Layer, Application Layer, Data Layer or View Models Layer in a too strict manner. Please read this. Packaging your software modules by their real-world meaning / purpose (or feature) is a bit better than packaging them based on an unnatural, hard-to-understand, hard-to-explain-to-a-5-year-old criterion, that is, packaging them by layer.
Well, personally I would map the View-Model into Model objects and use those in my repositories for reading / writing purposes. As you may know there are several tools you can use for this; in my case I use AutoMapper, which I found very easy to implement.
Try to keep the dependency between the Web layer and the Repository layer as separate as possible: repositories should talk only to the model, and your web layer should talk to your view models.
One option is to use DTOs in a service and automap those objects in the web layer (it may even be a one-to-one mapping); the disadvantage is that you may end up with a lot of boilerplate code, and the DTOs and view models can feel duplicated.
The other option is to return partially hydrated objects from your model and expose those objects as DTOs, then map those objects to your view models. This solution can be a little bit obscure, but you can make the projections you want and return only the information that you need.
Or you can get rid of the view models, expose the DTOs in your web layer and use them as view models; less code, but a more coupled approach.
I kind of agree with Pedro here: using an application service layer could be beneficial. If you are aiming for an MVVM type of implementation, I would advise creating a Model class that is responsible for holding the data retrieved using the service layer. Mapping the data with AutoMapper is a really good idea if your entities, DTOs and models are named consistently (so you don't have to write a lot of manual mappings).
In my experience, using your entities/POCOs in view models to display data will result in big balls of mud. Different views have different needs and each will add the need for more properties on an entity, slowly making your queries more complex and slower.
If your data is not changing that often, you might want to consider introducing (SQL/database) views that transfer some of the heavy lifting to the database (where it is highly optimized). EF handles database views fairly well; retrieving the entity and mapping the data (from the views) to the model or DTO then becomes fairly straightforward (see the sketch below).
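For example, in EF6 code first a database view can be mapped like a read-only table (a sketch; the view and entity names are invented):

using System.Data.Entity;

public class OrderSummary // mirrors the view's columns
{
    public int OrderId { get; set; }
    public string CustomerName { get; set; }
    public decimal Total { get; set; }
}

public class ReportingContext : DbContext
{
    public DbSet<OrderSummary> OrderSummaries { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Map the entity to the view; query it like a table, just never insert into it
        modelBuilder.Entity<OrderSummary>()
            .HasKey(v => v.OrderId)
            .ToTable("vw_OrderSummary"); // invented view name
    }
}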

Relational databases and object oriented environment

As everybody knows, object-oriented languages provide reusability features.
I have a simple three tier application:
presentation layer
business layer is designed to reap the benefits of reusability
datalayer is a dumb ADO.NET library (by dumb I mean that it has no business logic in place)
I have been struggling to enforce code reusability in this datalayer. Here is the pseudo pattern used in one of my methods in the datalayer:
create connection object
open connection to database
create a transaction object
begin transaction
create command object
execute it
.
.
.
.
create nth command object
execute it
commit transaction
close connection
In reality this code swells to around 300 to 400 lines and becomes impossible to read.
In this series of command executions we are running select / insert / update queries on different tables. If it were not in a transaction, I would have separated this code into the respective classes.
There is again a spaghetti pattern which I recently encountered:
business layer method1 calls datalayer method to update column1
business layer method2 calls datalayer method to update column2
business layer method3 calls datalayer method to save the entire result in the table by updating it.
This pattern emerged when I was trying to reap the benefits of reusability: these methods are called from different locations, so they were reused. However, if I were to write a simple SQL query without keeping reusability in mind, there would have been a single call to the database.
So, is there any pattern or technique by which reusability can be achieved in the data layer?
Note:
I don't want to use any stored procedures, despite the fact that they offer precompilation benefits etc., as they tend to tie the data layer to a particular database.
I am also currently not considering any ORM solutions here, only plain ADO.NET.
Excuses for not considering any ORMs:
Learning curve.
Avoiding tight coupling to a specific ORM, which I think can be removed by restricting the ORM code to the data layer itself.
I checked the internet some 6 months ago and there were only two popular or widely used ORM solutions available then: Entity Framework and NHibernate. I chose Entity Framework (for some reasons I will link later: link1, link2; besides that, I had a feeling that working with EF would be easy as it is provided by Microsoft) to start learning.
I used the Microsoft recommended book; in this book there were three inheritance-mapping techniques as I understood it: TPT (table per type), TPH (table per hierarchy) and TPC (table per concrete class); TPC I never tried.
When I checked the SQL generated by Entity Framework, it was very ugly and was creating some extra columns (Ids, some ugly CASE statements etc.), so it seemed that for a highly transactional system the ORM solution cannot be applied. By highly transactional system I mean thousands of insertions happening every single minute. The database continues to swell in size and reaches somewhere near 500 to 600 GB in some distant future.
I agree with the comments to your question; you should really avoid re-inventing the wheel here and go with an ORM, if at all possible. Speaking from experience, you're going to end up writing code and solving problems that have long ago been solved and it will probably take you more time in the long run. However, I understand that sometimes there are constraints that don't permit the use of an ORM.
Here are some articles that I have found helpful:
This first article is an old one but it explains the different options that you have for data access design patterns. It has a few different patterns and only you can really decide which one will be best for you but it sounds like you might want to look at the Repository Pattern:
http://msdn.microsoft.com/en-us/magazine/dd569757.aspx
This next article is the first in a series that talks about how to implement a repository pattern with a data mapper which, based on your example above, will probably help to reduce some of your redundant code.
http://blogsprajeesh.blogspot.com/2010/02/data-access-layer-in-c-using-repository.html
Finally, depending on how you implement your data access pattern, you may find the template pattern and generics helpful. The following article talks about that a little bit and you can glean some helpful information from it:
http://www.c-sharpcorner.com/UploadFile/rmcochran/elegant_dal05212006130957PM/elegant_dal.aspx
Without knowing more about your project, it's hard to say exactly which pattern will best suit your needs. However, using a combination of the Unit of Work pattern with repositories and data mappers will probably help you to reuse some code and manage your data access; a sketch of the reusable transaction plumbing follows below.
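As a starting point, the repeated open/begin/execute/commit plumbing from your question can be factored into a single helper so it is written exactly once (a sketch in plain ADO.NET; the Db class name is invented):

using System;
using System.Data.SqlClient;

public static class Db
{
    // The connection/transaction boilerplate lives here once;
    // callers only supply the commands to run inside the transaction.
    public static void InTransaction(string connectionString, Action<SqlCommand> work)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (var transaction = connection.BeginTransaction())
            using (var command = connection.CreateCommand())
            {
                command.Transaction = transaction;
                try
                {
                    work(command);
                    transaction.Commit();
                }
                catch
                {
                    transaction.Rollback();
                    throw;
                }
            }
        }
    }
}

A caller sets CommandText and parameters and executes as many commands as needed inside the callback; commit and rollback are handled in one place.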
What I'm not seeing is your model layer.
You have a business layer, and a DAO layer, but no model.
business layer method1 calls datalayer method to update column1
business layer method2 calls datalayer method to update column2
business layer method3 calls datalayer method to save the entire result in the table by updating it.
Why isn't this:
business layer updates model/domain object A
business layer updates model/domain object A in a different way
business layer persists model/domain to database through data layer.
This way you get re-use, and avoid repeated loops back and forth to database.
Ultimately it sounds like your business layer knows FAR too much about the database data model. You need business objects, not just business methods (see the sketch below).
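Concretely, that turns the three round-trips into a single write (a sketch; all names are invented):

// Load once, apply business changes in memory, persist once
var report = dataLayer.GetReport(reportId);   // data layer: one read

report.Column1 = CalculateColumn1(report);    // business layer: update 1
report.Column2 = CalculateColumn2(report);    // business layer: update 2

dataLayer.SaveReport(report);                 // data layer: one write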

What's the best way to set up data access for an ASP.NET MVC project?

I am starting a new ASP.NET MVC project to learn with, and am wondering what's the optimal way to set up the project(s) to connect to a SQL server for the data. For example, let's pretend we have a Product table and a Product object I want to use to populate data in my view.
I know somewhere in here I should have an interface that gets implemented, etc but I can't wrap my mind around it today :-(
EDIT: Right now (i.e. the current, poorly coded version of this app) I am just using plain old SQL Server (2000, even) with only stored procedures for data access, but I would not be averse to adding in an extra layer of flexibility for using LINQ to SQL or something.
EDIT #2: One thing I wanted to add: I will be writing this against a V1 of the database, and I will need to be able to let our DBA re-work the database and give me a V2 later, so it would be nice to only have to change a few small things that are not provided by the database now but will be later, rather than having to re-write a whole new DAL.
It really depends on which data access technology you're using. If you're using Linq To Sql, you might want to abstract away the data access behind some sort of "repository" interface, such as an IProductRepository. The main appeal for this is that you can change out the specific data access implementation at any time (such as when writing unit tests).
I've tried to cover some of this here:
I would check out Rob Conery's videos on his creation of an MVC store front. The series can be found here: MVC Store Front Series
This series dives into all sorts of design-related subjects as well as coding/testing practices to use with MVC and other projects.
In my site's solution, I have the MVC web application project and a "common" project that contains my POCOs (plain ol' C# objects), business managers and data access layers.
The DAL classes are tied to SQL Server (I didn't abstract them out) and return POCOs to the business managers that I call from my controllers in the MVC project.
I think that Billy McCafferty's S#arp Architecture is a quite nice example of using ASP.NET MVC with a data access layer (using NHibernate as default), dependency injection (Ninject atm, but there are plans to support the CommonServiceLocator) and test-driven development. The framework is still in development, but I consider it quite good and stable. As of the current release, there should be few breaking changes until there is a final release, so coding against it should be okay.
I have done a few MVC applications and I have found a structure that works very nicely for me. It is based upon Rob Conery's MVC Storefront Series that JPrescottSanders mentioned (although the link he posted is wrong).
So here goes - I usually try to restrict my controllers to only contain view logic. This includes retrieving data to pass on to the views and mapping from data passed back from the view to the domain model. The key is to try and keep business logic out of this layer.
To this end I usually end up with 3 layers in my application. The first is the presentation layer - the controllers. The second is the service layer - this layer is responsible for executing complex queries as well as things like validation. The third layer is the repository layer - this layer is responsible for all access to the database.
So in your products example, this would mean that you would have a ProductRepository with methods such as GetProducts() and SaveProduct(Product product). You would also have a ProductService (which depends on the ProductRepository) with methods such as GetProductsForUser(User user), GetProductsWithCategory(Category category) and SaveProduct(Product product). Things like validation would also happen here. Finally your controller would depend on your service layer for retrieving and storing products.
You can get away with skipping the service layer but you will usually find that your controllers get very fat and tend to do too much. I have tried this architecture quite a few times and it tends to work quite nicely, especially since it supports TDD and automated testing very well.
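In code, the products example shapes up roughly like this (a sketch; the method names follow the description above, while Product, User, IsVisibleTo and CurrentUser are placeholders):

public interface IProductRepository
{
    IEnumerable<Product> GetProducts();
    void SaveProduct(Product product);
}

public class ProductService
{
    private readonly IProductRepository _repository;

    public ProductService(IProductRepository repository)
    {
        _repository = repository;
    }

    public IEnumerable<Product> GetProductsForUser(User user)
    {
        // validation and business rules live here, then delegate to the repository
        return _repository.GetProducts().Where(p => p.IsVisibleTo(user)); // placeholder rule
    }
}

public class ProductsController : Controller
{
    private readonly ProductService _service;

    public ProductsController(ProductService service)
    {
        _service = service;
    }

    public ActionResult Index()
    {
        return View(_service.GetProductsForUser(CurrentUser)); // CurrentUser is a placeholder
    }
}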
For our application I plan on using LINQ to Entities, but as it's new to me there is the possibility that I will want to replace it in the future if it doesn't perform as I would like, and use something else like LINQ to SQL or NHibernate, so I'll be abstracting the data access objects into an abstract factory so that the implementation is hidden from the application.
How you do it is up to you; as long as you choose a proven and well-known design pattern for the implementation, I think your final product will be well supported and robust.
Use LINQ. Create a LINQ to SQL file and drag and drop all the tables and views you need. Then when you call your model all of your CRUD level stuff is created for you automagically.
LINQ is the best thing I have seen in a long long time. Here are some simple samples for grabbing data from Scott Gu's blog.
LINQ Tutorial
I just did my first MVC project and I used a Service-Repository design pattern. There is a good bit of information about it on the net right now. It made my transition from Linq->Sql to Entity Framework effortless. If you think you're going to be changing a lot put in the little extra effort to use Interfaces.
I recommend Entity Framework for your DAL/Repository.
Check out the Code Camp Server for a good reference application that does this very thing, and as @haacked stated, abstract that goo away to keep them separated.
I think you need an ORM, for example Entity Framework (code first).
You can create some model classes, use those models for your logic and views, and map them to the database (v1).
When the DBA gives you a new database (v2), you only change the mapping configuration. This works when v1 and v2 are both relational databases (SQL Server, MySQL, Oracle...); if v1 is a relational database and v2 is a NoSQL store (Mongo, Redis, Couchbase...), it won't work and you may need to do some find and replace.
