DataContext and data access class design improvements - C#

I am building a small, data-intensive app with Windows Forms. In the main project I have a folder that holds my DBML as well as data classes that provide CRUD operations against the database. There are currently about 10 of these data classes.
The code-behind in the form instantiates business objects and calls them to do all the work. Those business objects in turn call the static data access classes.
An example of a data class would be something like this
static class CustomerData
{
    public static IEnumerable<Customer> GetCustomersForRun(int runID)
    {
        var db = new FooDataContext("connectionString");
        return db.Customers.Where(ri => ri.RunID == runID);
    }
}
Now obviously there are a few problems with my initial design that I need to address.
1) It's not nice that each static method has to create its own DataContext; this doesn't seem very DRY.
2) Because I'm relying on some lazy loading I'm not able to wrap my DataContext in a using statement.
A couple of different ideas I have to fix this problem are
1) Get rid of the static methods and instead create an abstract base data access class that can instantiate my DataContext.
2) Have each business object create its own DataContext and pass it into the static methods of the data access classes.
An example of the method signature would then be
public static IEnumerable<Customer> GetCustomerForRun(DataContext db, int runID)
My specific questions are
1) Am I over complicating this?
2) Do you typically dispose of your DataContext objects?
3) Which of my solutions makes most sense? If none of them what do you recommend?

1) Am I over complicating this?
It really depends. If your application is very small, shoehorning a pattern into the mix might make things more complicated; simply using the DataContext directly can be easier to understand than putting a layer of abstraction on top of LINQ to SQL.
2) Do you typically dispose of your DataContext objects?
It depends on your implementation. If you plan on passing an IQueryable<T> around to do filtering, wrapping the context in a using(){} block will cause you grief: since LINQ to SQL only issues the SQL query when something calls GetEnumerator(), your context may already be disposed by then and the call will fail.
Consider this example:
IQueryable<Table> GetStuff()
{
    using (var db = new DataContext())
    {
        return db.Tables.Where(i => i.Id == 1);
    }
}
If in another method you then call GetStuff().Where(i => i.Name == "Jon").ToList(), the query fails because the context has already been disposed.
If instead you don't dispose the context inside the method, you keep the power of IQueryable:
IQueryable<Table> GetStuff()
{
    // db is a longer-lived DataContext held by the containing class,
    // not created and disposed inside this method.
    return db.Tables.Where(i => i.Id == 1);
}
Now GetStuff().Where(i => i.Name == "Jon").ToList() will work, letting you compose filters and defer execution of the SQL statement until the very last moment. More information can be found here.
3) Which of my solutions makes most sense? If none of them what do you recommend?
I usually try to stay away from static classes/methods since they make unit testing very difficult. The Repository pattern is worth a look, and this answer gives some quick information.
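As a rough illustration only, a repository for the CustomerData example in the question might look something like this. It reuses the FooDataContext and Customer types from the question, and it materializes the query with ToList() so the context can be disposed safely (trading away deferred execution):

using System;
using System.Collections.Generic;
using System.Linq;

public interface ICustomerRepository : IDisposable
{
    IEnumerable<Customer> GetCustomersForRun(int runID);
}

public class CustomerRepository : ICustomerRepository
{
    private readonly FooDataContext _db;

    public CustomerRepository(string connectionString)
    {
        _db = new FooDataContext(connectionString);
    }

    public IEnumerable<Customer> GetCustomersForRun(int runID)
    {
        // Materialize here so callers never depend on a live, undisposed context.
        return _db.Customers.Where(c => c.RunID == runID).ToList();
    }

    public void Dispose()
    {
        _db.Dispose();
    }
}

The business objects would then depend on ICustomerRepository, which is easy to mock in unit tests.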

Related

WebApi Speed for Returning Related Entities

My WebApi works with a lot of internal references between my objects and I'm wondering what would be less costly for the application. I'm using EF database first, so I don't have access to the generated classes (I know I can edit them, but that's not a good idea).
For example, I have some areas with 5 relations, and those relations are deep, but I don't want to return them all every time because I won't use all that data; sometimes I just need the parent object. To work around that I'm using AutoMapper and creating some ViewModels into which I copy my object.
At the points in my API where I only want to return some entities, I configure AutoMapper and tell it what to ignore for that case.
My problem is, as I said, that I have a lot of data; this system is going to be used by 15k - 20k users. Will AutoMapper ignoring the data become a bottleneck down the road? If so, would some other alternative be better?
If this isn't the best option, what else could I use?
This is an example of how I'm working:
Controller:
public async Task<EventVM> Get(int id)
{
    // "event" is a reserved word in C#, so a different variable name is needed
    var eventVm = await eventService.Get(id);
    return eventVm;
}
Service:
public async Task<EventVM> Get(int id)
{
    var eventEntity = await _context.Event.FindAsync(id);
    return eventEntity;
}
Also, I checked my configuration and lazy loading is enabled.
Some of the things in your initial post are not clear at all.
You say you use code first but don't have access to generated classes. Well, if you use code first there won't be any generated classes, but you must have some classes initially from which your SQL tables get generated, right?
As a rule of thumb, do not use anything from EF directly in your WebApi. Have your API return only the data and properties you need for each endpoint. This means creating another set of classes, typically DTOs, which are much lighter: they have no methods, only public properties with exactly the data you need. Yes, you will need an extra step in between to transform the data, but that is absolutely fine.
This should help you get started; just remember the important rule: return exactly what you need, nothing more, nothing less.
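As a sketch of what that could look like (EF6 assumed; the DTO property names and the MyDbContext/EventService names are placeholders, not your real types), one way to do the mapping without AutoMapper is to project straight into the DTO inside the query, which also keeps the SQL narrow:

using System;
using System.Data.Entity;   // for SingleOrDefaultAsync (EF6)
using System.Linq;
using System.Threading.Tasks;

// A lightweight DTO holding only what this endpoint returns.
// The property names (Id, Name, StartDate) are placeholders for your real fields.
public class EventDto
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime StartDate { get; set; }
}

public class EventService
{
    private readonly MyDbContext _context; // your EF context

    public EventService(MyDbContext context)
    {
        _context = context;
    }

    public async Task<EventDto> Get(int id)
    {
        // Projecting directly into the DTO means EF only selects these columns
        // and never touches the lazy-loaded relations.
        return await _context.Event
            .Where(e => e.Id == id)
            .Select(e => new EventDto
            {
                Id = e.Id,
                Name = e.Name,
                StartDate = e.StartDate
            })
            .SingleOrDefaultAsync();
    }
}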

Is it proper form to extend a model object (e.g. Product) and add a Create() method that inserts into the database? (MVC 5 Entity Framework 6)

So I am currently extending the classes that Entity Framework automatically generated for each of the tables in my database. I placed some helpful methods for processing data inside these partial classes that do the extending.
My question, however, is concerning the insertion of rows in the database. Would it be good form to include a method in my extended classes to handle this?
For example, in the Product controller's Create method have something like this:
[HttpPost]
public ActionResult Create(Product p)
{
    p.InsertThisProductIntoTheDatabase(); // my custom method for inserting into db
    return View();
}
Something about this feels wrong to me, but I can't put my finger on it. It feels like this functionality should instead be placed inside a generic MyHelpers.cs class, or something, and then just do this:
var h = new MyHelpers();
h.InsertThisProductIntoTheDatabase(p);
What do you guys think? I would prefer to do this the "correct" way.
MVC 5, EF 6
edit: the InsertThisProductIntoTheDatabase method might look something like:
public partial class Product
{
    public void InsertThisProductIntoTheDatabase()
    {
        var context = new MyEntities();
        this.CreatedDate = DateTime.Now;
        this.CreatedByID = SomeUserClass.ID;
        // Some additional transformation/preparation of the object's data would be
        // done here too. My goal is to bring all of this out of the controller.
        context.Products.Add(this);
        context.SaveChanges();
    }
}
One of the problems I see is that the Entity Framework DbContext is a unit of work. If you create a unit of work on Application_BeginRequest and pass it into the controller constructor, it acts as a unit of work for the entire request. Maybe it's only updating one entity in your scenario, but you could be writing more information to your database. Unless you are wrapping everything in a TransactionScope, all these saves are going to be independent, which could leave your database in an inconsistent state. And even if you are wrapping everything in a TransactionScope, I'm pretty sure that transaction is going to be promoted to the DTC because you are making multiple physical connections in a single controller and SQL Server isn't that smart.
Going the BeginRequest route seems like less work than adding methods to all of your entities so they can save themselves. Another issue is that an EF entity isn't really supposed to know anything about its own persistence; that's what the DbContext is for. Putting a reference back to the DbContext breaks this isolation.
Your second reason, adding audit information to the entity: again, adding this to each entity is a lot of work. You could override SaveChanges on the context and do it once for every entity. See this SO answer.
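A minimal sketch of that idea (EF6 assumed; IAuditable is an illustrative interface your generated entities could pick up via partial classes; MyEntities and SomeUserClass come from the question):

using System;
using System.Data.Entity;
using System.Linq;

// Entities that carry audit fields opt in via this (illustrative) interface.
public interface IAuditable
{
    DateTime CreatedDate { get; set; }
    int CreatedByID { get; set; }
}

// The generated context is partial, so the override can live in your own file.
public partial class MyEntities
{
    public override int SaveChanges()
    {
        var added = ChangeTracker.Entries<IAuditable>()
                                 .Where(e => e.State == EntityState.Added);
        foreach (var entry in added)
        {
            entry.Entity.CreatedDate = DateTime.Now;
            entry.Entity.CreatedByID = SomeUserClass.ID; // however you resolve the current user
        }
        return base.SaveChanges();
    }
}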
By going down this road I think you are breaking SOLID design principles: your entities violate the single responsibility principle, you introduce a lot of coupling, and you end up writing more code than you need. So I'd advocate against doing it your way.
Why don't you simply use:
db.Products.Add(p);
db.SaveChanges();
Your code will be much cleaner, and it will certainly be easier to manage and to get help with in the future. Most samples available on the internet use this approach; extension methods on entities do not look pleasant.
BTW: isn't InsertThisProductIntoTheDatabase() a rather long method name?

How do I maintain referential transparency between related entities without relying on a common data context instance?

Thanks for looking.
Background
In my .NET applications I usually have a Business Logic Layer (BLL) containing my business methods and a Data Access Layer (DAL) which contains my entity classes and any methods for dealing with atomic entities (i.e. CRUD methods for a single entity). This is a pretty typical design pattern.
Here is a pseudocode example of what I mean:
BLL
public static int CreateProduct(ProductModel product){
    return DAL.SomeClass.CreateProduct(new DAL.Product{
        Name = product.Name,
        Price = product.Price
    });
}
DAL
public int CreateProduct(Product p){
    var db = new MyDataContext();
    db.Products.AddObject(p);
    db.SaveChanges();
    return p.Id;
}
No problems with this simple example.
Ideally, all the business of instantiating a data context and using that data context lives in the DAL. But this becomes a problem if I attempt to deal with slightly more complex objects:
BLL
public static int CreateProduct(ProductModel product){
    return DAL.SomeClass.CreateProduct(new DAL.Product{
        Name = product.Name,
        Price = product.Price,
        ProductType = DAL.SomeClass.GetProductTypeById(product.ProductTypeId) //<--PROBLEM
    });
}
Now, instead of saving the entity, I get the following error:
An entity object cannot be referenced by multiple instances of IEntityChangeTracker
Ok, so the answer to dealing with that is to pass a common data context to both calls:
BLL
public static int CreateProduct(ProductModel product){
    using (var db = new DAL.MyDataContext()){
        return DAL.SomeClass.CreateProduct(new DAL.Product{
            Name = product.Name,
            Price = product.Price,
            ProductType = DAL.SomeClass.GetProductTypeById(product.ProductTypeId, db) //<--CONTEXT
        }, db); //<--CONTEXT
    }
}
Problem
This solves the immediate problem, but now my referential transparency is blown because I have to:
Instantiate the data context in the BLL
Pass the data context to the DAL from the BLL
Create overloaded methods in the DAL that accept a data context as a parameter.
This may not be a problem for some but for me, since I write my code in a more functional style, it is a big problem. It's all the same database after all, so why the heck can't I deal with unique entities regardless of their data context instance?
Other Notes
I realize that some may be tempted to simply say to create a common data context for all calls. This won't fly as doing so is bad practice for a multitude of reasons and ultimately causes a connection pool overflow. See this great answer for more details.
Any constructive input is appreciated.
Personally, I track my unit of work and associate a data context with it via static methods. This works great as long as you aren't talking about operations with long lifetimes. In my current project, an ASP.NET application, every request is a (mostly) distinct unit, and the start and end of a request coincide with the start and end of the unit of work. I store the data context in the request's current context which, if you aren't familiar with it, is basically a system-managed dictionary that gives each request its own storage, accessible from static methods. The work's already done for me there, but you can find lots of examples of implementing your own unit of work pattern. One DbContext per web request... why?
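A minimal sketch of that idea for ASP.NET, assuming HttpContext.Current.Items as the request-scoped store and the MyDataContext from the question (the key and the class name are purely illustrative):

using System.Web;

// Request-scoped DataContext held in HttpContext.Current.Items,
// the per-request dictionary described above.
public static class RequestDataContext
{
    private const string Key = "__dataContext";

    public static MyDataContext Current
    {
        get
        {
            var items = HttpContext.Current.Items;
            if (items[Key] == null)
            {
                items[Key] = new MyDataContext();
            }
            return (MyDataContext)items[Key];
        }
    }

    public static void DisposeCurrent()
    {
        var db = HttpContext.Current.Items[Key] as MyDataContext;
        if (db != null)
        {
            db.Dispose();
            HttpContext.Current.Items.Remove(Key);
        }
    }
}

You would call RequestDataContext.DisposeCurrent() from Application_EndRequest in Global.asax so the context is retired along with the request.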
Another equally workable answer for many is injection. Used for this purpose (injecting the data context), it basically mimics the code you wrote at the end of your question, but it shields you from the "non-functional" stuff you dislike.
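For instance, a rough constructor-injection sketch (the ProductRepository name and the ProductTypes/ProductTypeId members are illustrative, not taken from your code):

using System.Linq;

// The DAL class receives its context once, via the constructor, instead of every
// method taking a context parameter. The composition root (or an IoC container)
// controls the context's lifetime, e.g. one per request.
public class ProductRepository
{
    private readonly MyDataContext _db;

    public ProductRepository(MyDataContext db)
    {
        _db = db;
    }

    public int CreateProduct(Product p)
    {
        _db.Products.AddObject(p);
        _db.SaveChanges();
        return p.Id;
    }

    // ProductTypes/ProductTypeId are assumed names for illustration.
    public ProductType GetProductTypeById(int productTypeId)
    {
        return _db.ProductTypes.Single(pt => pt.ProductTypeId == productTypeId);
    }
}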
Yes, you are only accessing one database, but if you look closely you will see the database is not the constraint here. The constraint arises from the cache, which is designed to permit multiple, differing, concurrent copies of the data. If you don't wish to permit that, then you have a whole host of other solutions available.

LINQ to SQL: Reusing DataContext

I have a number of static methods that perform simple operations like insert or delete a record. All these methods follow this template of using:
public static UserDataModel FromEmail(string email)
{
    using (var db = new MyWebAppDataContext())
    {
        db.ObjectTrackingEnabled = false;
        return (from u in db.UserDataModels
                where u.Email == email
                select u).Single();
    }
}
I also have a few methods that need to perform multiple operations that use a DataContext:
public static UserPreferencesViewModel Preferences(string email)
{
    return UserDataModel.Preferences(UserDataModel.FromEmail(email));
}

private static UserPreferencesViewModel Preferences(UserDataModel user)
{
    using (var db = new MyWebAppDataContext())
    {
        var preferences = (from u in db.UserDataModels
                           where u == user
                           select u.Preferences).Single();
        return new UserPreferencesViewModel(preferences);
    }
}
I like that I can divide simple operations into faux stored procedures in my data models with static methods like FromEmail(), but I'm concerned about the cost of Preferences() opening two connections (right?) via the two using DataContext statements.
Do I need to be? Is what I'm doing less efficient than using a single using(var db = new MyWebAppDataContext()) statement?
If you examine those "two" operations, you might see that they could be performed in one database roundtrip. Minimizing database roundtrips is a major performance objective (second only to minimizing database IO).
If you have multiple DataContexts, they view the same record differently. Normally, object tracking requires that the same instance always be used to represent a single record; with two DataContexts, each does its own object tracking on its own instances.
Suppose the record changes between DC1 observing it and DC2 observing it. In that case, the record will not only have two different instances, those instances will also have different values. It can be very challenging to express business logic against such a moving target.
You should definitely retire the DataContext after the unit of work, to protect yourself from stale instances of records.
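For example, the two methods from the question can be merged into a single query (reusing the question's types as-is), so only one context and one roundtrip are needed:

// Both lookups collapsed into one query: a single roundtrip, a single context.
public static UserPreferencesViewModel Preferences(string email)
{
    using (var db = new MyWebAppDataContext())
    {
        db.ObjectTrackingEnabled = false;
        var preferences = (from u in db.UserDataModels
                           where u.Email == email
                           select u.Preferences).Single();
        return new UserPreferencesViewModel(preferences);
    }
}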
Normally you should use one context for one logical unit of work. So have a look at the unit of work pattern, ex. http://dotnet.dzone.com/news/using-unit-work-pattern-entity
Of course there is some overhead in creating a new DataContext each time, but it's good practice to do as Ludwig stated: one context per unit of work.
Connections are pooled, so it's not an overly expensive operation.
I also think creating a new DataContext each time is the correct way but this link explains different approaches for handling the data context. Linq to SQL DataContext Lifetime Management
I developed a wrapper component that uses an interface like:
public interface IContextCacher
{
    DataContext GetFromCache();
    void SaveToCache(DataContext ctx);
}
I then use a wrapper to obtain the context: if one exists in the cache it's pulled from there; otherwise a new instance is created and passed to the save method, so all subsequent calls get it from the getter.
The actual caching mechanism depends on the type of application. An ASP.NET web application, for instance, could store the context in the request's Items collection so it lives only for that request. A Windows app could pull it from some singleton collection. It can be whatever you want behind the scenes.
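A sketch of the wrapper side of that idea (the DataContextProvider name is illustrative; MyWebAppDataContext is the concrete context from the question):

using System.Data.Linq;

// The IContextCacher decides where the context lives (request Items, a singleton,
// etc.); the wrapper only handles the get-or-create logic.
public class DataContextProvider
{
    private readonly IContextCacher _cacher;

    public DataContextProvider(IContextCacher cacher)
    {
        _cacher = cacher;
    }

    public DataContext GetContext()
    {
        var ctx = _cacher.GetFromCache();
        if (ctx == null)
        {
            ctx = new MyWebAppDataContext(); // or whichever concrete context you use
            _cacher.SaveToCache(ctx);
        }
        return ctx;
    }
}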

Problem using LINQ to SQL with one DataContext per atomic action

I have started using Linq to SQL in a (bit DDD like) system which looks (overly simplified) like this:
public class SomeEntity // Imagine this is a fully mapped linq2sql class.
{
    public Guid SomeEntityId { get; set; }
    public AnotherEntity Relation { get; set; }
}

public class AnotherEntity // Imagine this is a fully mapped linq2sql class.
{
    public Guid AnotherEntityId { get; set; }
}
public interface IRepository<TId, TEntity>
{
    TEntity Get(TId id);
}
public class SomeEntityRepository : IRepository<Guid, SomeEntity>
{
    public SomeEntity Get(Guid id)
    {
        SomeEntity someEntity = null;
        using (DataContext context = new DataContext())
        {
            someEntity = (
                from e in context.SomeEntity
                where e.SomeEntityId == id
                select e).SingleOrDefault<SomeEntity>();
        }
        return someEntity;
    }
}
Now, I got a problem. When I try to use SomeEntityRepository like this
public static class Program
{
    public static void Main(string[] args)
    {
        IRepository<Guid, SomeEntity> someEntityRepository = new SomeEntityRepository();
        SomeEntity someEntity = someEntityRepository.Get(new Guid("98011F24-6A3D-4f42-8567-4BEF07117F59"));
        Console.WriteLine(someEntity.SomeEntityId);
        Console.WriteLine(someEntity.Relation.AnotherEntityId);
    }
}
everything works nicely until the program reaches the last WriteLine, which throws an ObjectDisposedException because the DataContext no longer exists.
I do see the actual problem, but how do I solve it? I guess there are several solutions, but none of those I have thought of so far would be good in my situation.
Get away from the repository pattern and using a new DataContext for each atomic part of work.
I really would not want to do this. One reason is that I do not want the applications to be aware of the repository. Another is that I do not think making the linq2sql classes COM visible would be good.
Also, I think that doing context.SubmitChanges() would probably commit much more than I intended to.
Specifying DataLoadOptions to fetch related elements.
As I want my Business Logic Layer to just reply with some entities in some cases, I do not know which sub-properties they need to use.
Disabling lazy loading/delayed loading for all properties.
Not an option, because there are quite a few tables and they are heavily linked. This could cause a lot of unnecessary traffic and database load.
Some post on the internet said that using .Single() should help.
Apparently it does not ...
Is there any way to solve this misery?
BTW: We decided to use LINQ to SQL because it is a relatively lightweight ORM solution and is included with the .NET Framework and Visual Studio. If the .NET Entity Framework would fit this pattern better, switching to it may be an option. (We are not that far into the implementation yet.)
Rick Strahl has a nice article about DataContext lifecycle management here: http://www.west-wind.com/weblog/posts/246222.aspx.
Basically, the atomic action approach is nice in theory but you're going to need to keep your DataContext around to be able to track changes (and fetch children) in your data objects.
See also: Multiple/single instance of Linq to SQL DataContext and LINQ to SQL - where does your DataContext live?.
You have to either:
1) Leave the context open because you haven't fully decided what data will be used yet (aka, Lazy Loading).
or 2) Pull more data on the initial load if you know you will need that other property.
Explanation of the latter: here
I'm not sure you have to abandon Repository if you go with atomic units of work. I use both, though I admit to throwing out the optimistic concurrency checks since they don't work out in layers anyway (without using a timestamp or some other required convention). What I end up with is a repository that uses a DataContext and throws it away when it's done.
This is part of an unrelated Silverlight example, but the first three parts show how I'm using a Repository pattern with a throwaway LINQ to SQL context, FWIW: http://www.dimebrain.com/2008/09/linq-wcf-silver.html
Specifying DataLoadOptions to fetch related elements. As I want my Business Logic Layer to just reply with some entities in some cases, I do not know which sub-properties they need to use.
If the caller is granted the coupling necessary to use the .Relation property, then the caller might as well specify the DataLoadOptions.
DataLoadOptions loadOptions = new DataLoadOptions();
loadOptions.LoadWith<SomeEntity>(e => e.Relation);

SomeEntity someEntity = someEntityRepository
    .Get(new Guid("98011F24-6A3D-4f42-8567-4BEF07117F59"),
         loadOptions);

// and inside the repository's Get:
using (DataContext context = new DataContext())
{
    context.LoadOptions = loadOptions;
    // ... run the query as before
}
This is what I do, and so far it's worked really well.
1) Make the DataContext a member variable in your repository. Yes, this means your repository should now implement IDisposable and not be left open... maybe something you want to avoid having to do, but I haven't found it to be inconvenient.
2) Add some methods to your repository like this:
public SomeEntityRepository WithSomethingElseTheCallerMightNeed()
{
    dlo.LoadWith<SomeEntity>(se => se.RelatedEntities);
    return this; // so you can do method chaining
}
Then, your caller looks like this:
SomeEntity someEntity = someEntityRepository.WithSomethingElseTheCallerMightNeed().Get(new Guid("98011F24-6A3D-4f42-8567-4BEF07117F59"));
You just need to make sure that when your repository hits the db, it uses the data load options specified in those helper methods... in my case "dlo" is kept as a member variable, and then set right before hitting the db.
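Putting those pieces together, a sketch using the types from this question might look like the following (DataContext here is the generated context from the question, LoadWith targets the Relation property of SomeEntity, and details like the connection string are omitted):

// The context and the DataLoadOptions ("dlo") are members, the With... methods
// register what to eager-load, and Get applies the options right before querying.
public class SomeEntityRepository : IRepository<Guid, SomeEntity>, IDisposable
{
    private readonly DataContext context = new DataContext();
    private readonly DataLoadOptions dlo = new DataLoadOptions();

    public SomeEntityRepository WithSomethingElseTheCallerMightNeed()
    {
        dlo.LoadWith<SomeEntity>(se => se.Relation);
        return this; // enables method chaining
    }

    public SomeEntity Get(Guid id)
    {
        // LoadOptions must be assigned before the first query runs on this context.
        context.LoadOptions = dlo;
        return (from e in context.SomeEntity
                where e.SomeEntityId == id
                select e).SingleOrDefault();
    }

    public void Dispose()
    {
        context.Dispose();
    }
}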
