I have the following repository. I have a mapping between LINQ to SQL generated classes and domain objects, using a factory.
The following code works, but I see two potential issues:
1) It issues a SELECT query before the UPDATE statement.
2) It needs to update all the columns (not only the changed ones), because we don't know which columns were changed in the domain object.
How to overcome these shortcomings?
Note: There can be scenarios (like triggers) that fire based on a specific column being updated, so I cannot update a column unnecessarily.
REFERENCE:
LINQ to SQL: Updating without Refresh when “UpdateCheck = Never”
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=113917
CODE
namespace RepositoryLayer
{
public interface ILijosBankRepository
{
void SubmitChangesForEntity();
}
public class LijosSimpleBankRepository : ILijosBankRepository
{
private IBankAccountFactory bankFactory = new MySimpleBankAccountFactory();
public System.Data.Linq.DataContext Context
{
get;
set;
}
public virtual void SubmitChangesForEntity(DomainEntitiesForBank.IBankAccount iBankAcc)
{
//Does not get help from automated change tracking (due to mapping)
//Selecting the required entity
DBML_Project.BankAccount tableEntity = Context.GetTable<DBML_Project.BankAccount>().SingleOrDefault(p => p.BankAccountID == iBankAcc.BankAccountID);
if (tableEntity != null)
{
//Setting all the values to be updated (except the primary key)
tableEntity.Status = iBankAcc.AccountStatus;
//Type Checking
if (iBankAcc is DomainEntitiesForBank.FixedBankAccount)
{
tableEntity.AccountType = "Fixed";
}
if (iBankAcc is DomainEntitiesForBank.SavingsBankAccount)
{
tableEntity.AccountType = "Savings";
}
Context.SubmitChanges();
}
}
}
}
namespace DomainEntitiesForBank
{
public interface IBankAccount
{
int BankAccountID { get; set; }
double Balance { get; set; }
string AccountStatus { get; set; }
void FreezeAccount();
}
public class FixedBankAccount : IBankAccount
{
public int BankAccountID { get; set; }
public string AccountStatus { get; set; }
public double Balance { get; set; }
public void FreezeAccount()
{
AccountStatus = "Frozen";
}
}
}
If I understand your question, you are being passed an entity that you need to save to the database without knowing what the original values were, or which of the columns have actually changed.
If that is the case, then you have four options:
You need to go back to the database to see the original values, i.e. perform the SELECT, as your code is doing. This allows you to set all your entity values, and LINQ to SQL will take care of which columns have actually changed. So if none of your columns have actually changed, no update statement is issued.
You need to avoid the SELECT and just update the columns. You already know how to do this (but for others, see this question and answer); a minimal sketch of this option appears after the list below. Since you don't know which columns have changed, you have no option but to set them all. This will produce an update statement even if no columns have actually changed, and that can fire any database triggers. Apart from disabling the triggers, about the only thing you can do here is make sure the triggers are written to compare the old and new column values and avoid any further unnecessary updates.
You need to change your requirements/program so that you require both old and new entities values, so you can determine which columns have changed without going back to the database.
Don't use LINQ for your updates. LINQ stands for Language Integrated QUERY and it is (IMHO) brilliant at query, but I always looked on the updating/deleting features as an extra bonus, but not something which it was designed for. Also, if timing/performance is critical, then there is no way that LINQ will match properly hand-crafted SQL.
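Here is a minimal sketch of option 2, the attach-based update. It assumes the DBML columns are marked UpdateCheck = Never (or the table has a timestamp/version member), as in the reference linked in the question; the type and property names are taken from the question's code:
public virtual void SubmitChangesForEntity(DomainEntitiesForBank.IBankAccount iBankAcc)
{
    // Build a fresh table entity instead of selecting the existing row.
    var tableEntity = new DBML_Project.BankAccount
    {
        BankAccountID = iBankAcc.BankAccountID,
        Status = iBankAcc.AccountStatus,
        AccountType = iBankAcc is DomainEntitiesForBank.FixedBankAccount ? "Fixed" : "Savings"
    };

    // Attach as modified: LINQ to SQL issues an UPDATE without a prior SELECT.
    // This only works with UpdateCheck = Never on the columns or a version/timestamp member.
    Context.GetTable<DBML_Project.BankAccount>().Attach(tableEntity, true);
    Context.SubmitChanges();
}
Note that this still writes every mapped column, so the trigger concern from the question remains.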
This isn't really a DDD question; from what I can tell you are asking:
Use linq to generate direct update without select
The accepted answer there was that no, it's not possible, but there's a higher-voted answer suggesting you can attach an object to your context to initiate the data context's change tracking.
Your second point about disabling triggers has been answered here and here. But as others have commented, do you really need the triggers? Should you not be controlling these updates in code?
In general I think you're looking at premature optimization. You're using an ORM, and as part of that you're trusting L2S to make the database plumbing decisions for you. But remember that, where appropriate, you can use stored procedures or execute your own specific SQL.
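For example, if in a given scenario you know that only the status changed, a hand-rolled statement through the same DataContext keeps the update narrow (a sketch only; the table and column names are assumed from the question's code):
// Updates only the Status column, so column-specific triggers fire only when intended.
Context.ExecuteCommand(
    "UPDATE dbo.BankAccount SET Status = {0} WHERE BankAccountID = {1}",
    iBankAcc.AccountStatus,
    iBankAcc.BankAccountID);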
Related
I have two fields with data type datetime:
Added
Modified
When inserting a new record, the values for both fields must be System.DateTime.Now;
but when updating, only Modified needs to be changed.
I can set StoreGeneratedPattern to Computed and handle the Modified field with GETDATE() in the database, but the problem is the Added field.
My guess is that I have to override SavingChanges() or something similar, but I don't know how.
EDIT: What I have tried so far
I added another class in my project with the following code:
namespace Winpro
{
public partial class Customer
{
public Customer()
{
this.Added = DateTime.UtcNow;
}
}
}
but then the solution does not build:
Type 'Winpro.Customer' already defines a member called 'Customer' with the same parameter types
One option is to define a constructor for the type that sets the field.
Big important note: unless you know exactly what you're doing, always store dates and times in a database in UTC. DateTime.Now is the computer's local time which can vary according to daylight savings, timezone changes (brought about by political/legislative reasons), and can end up rendering date information useless. Use DateTime.UtcNow.
public partial class MyEntity {
public MyEntity() {
this.Added = DateTime.UtcNow;
}
}
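That covers Added. For the Modified stamp, one option (a sketch only; it assumes an EDMX-generated ObjectContext, here given the hypothetical name WinproEntities, and that Customer exposes a Modified property) is to hook the context's SavingChanges event:
// requires: using System; using System.Data; (for EntityState)
namespace Winpro
{
    public partial class WinproEntities   // hypothetical name of the generated context
    {
        partial void OnContextCreated()
        {
            // Stamp every entity that is about to be updated.
            this.SavingChanges += (sender, e) =>
            {
                foreach (var entry in this.ObjectStateManager
                             .GetObjectStateEntries(EntityState.Modified))
                {
                    var customer = entry.Entity as Customer;
                    if (customer != null)
                    {
                        customer.Modified = DateTime.UtcNow;
                    }
                }
            };
        }
    }
}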
We did something quite similar in the past.
We needed to store the date and time and the user responsible for creating the record. Also, on every change, regardless of whether an audit record is written or not, the base record should get the date and time and the user responsible for the change.
Here's what we have done:
Interfaces
To add some standard behavior and make things more extensible, we've created two interfaces, as follows:
public interface IAuditCreated
{
DateTime CreatedDateTime { get; set; }
string CreationUser { get; set; }
}
public interface IAuditChanged
{
DateTime LastChangeDateTime { get; set; }
string LastChangeUser { get; set; }
}
Override SaveChanges() to add some automatic control
public class WhateverContext : DbContext
{
// Some behavior and all...
public override int SaveChanges()
{
// Added ones...
var _entitiesAdded = ChangeTracker.Entries()
.Where(_e => _e.State == EntityState.Added)
.Where(_e => _e.Entity.GetType().GetInterfaces().Any(_i => _i == typeof(IAuditCreated)))
.Select(_e => _e.Entity);
foreach(var _entity in _entitiesAdded) { /* Set date and user */ }
// Changed ones...
var _entitiesChanged = ChangeTracker.Entries()
.Where(_e => _e.State == EntityState.Modified)
.Where(_e => _e.Entity.GetType().GetInterfaces().Any(_i => _i == typeof(IAuditChanged)))
.Select(_e => _e.Entity);
foreach(var _entity in _entitiesChanged) { /* Set date and user */ }
// Save...
return base.SaveChanges();
}
}
Do not simply copy and paste!
This code was written a few years ago, in the days of Entity Framework 4. It assumes that changes have already been detected (ChangeTracker available), among other things.
Also, we have absolutely no idea how this code impacts performance. That's because this system is used far more for viewing than for updating, and because it's a desktop application, so we have plenty of memory and processing time to spare.
You should take that into account and you might find a better way to implement this. But the whole idea is the same: filter which entities are being updated and which are being added to properly handle that.
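For illustration, the placeholder bodies above could be filled in along these lines (a sketch; GetCurrentUserName() is a hypothetical stand-in for however you resolve the current user):
foreach (var _entity in _entitiesAdded)
{
    var _created = (IAuditCreated)_entity;
    _created.CreatedDateTime = DateTime.UtcNow;
    _created.CreationUser = GetCurrentUserName();   // hypothetical user lookup
}
foreach (var _entity in _entitiesChanged)
{
    var _changed = (IAuditChanged)_entity;
    _changed.LastChangeDateTime = DateTime.UtcNow;
    _changed.LastChangeUser = GetCurrentUserName(); // hypothetical user lookup
}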
Another approach
There are many approaches to this. One other that might be better for performance in some cases (but is also more complex) is to have some sort of proxy, similar to an EF proxy, handling that.
Again, even with an empty interface, it's good to have one to clearly distinguish between auditable records and regular ones.
If it is possible to force all of them to have the same property names and types, do it.
My application has an entity model as below and uses Dapper.
public class Goal
{
public string Text { get; set; }
public List<SubGoal> SubGoals { get; set; }
}
public class SubGoal
{
public string Text { get; set; }
public List<Practise> Practices { get; set; }
public List<Measure> Measures { get; set; }
}
and has a repository as below
public interface IGoalPlannerRepository
{
IEnumerable<Goal> FindAll();
Goal Get(int id);
void Save(Goal goal);
}
I came across two scenarios as below
While retrieving data (goal entity), it needs to retrieve all the related objects in hierarchy (all subgoals along with practices and measures)
When a goal is saved all the related data need to be inserted and/or updated
Please suggest whether there is a better way to handle these scenarios other than looping through the collections and writing lots and lots of SQL queries.
The best way to do large batch data updates in SQL using Dapper is with compound queries.
You can retrieve all your objects in one query as a multiple resultset, like this:
CREATE PROCEDURE get_GoalAndAllChildObjects
    @goal_id int
AS
SELECT * FROM goal WHERE goal_id = @goal_id
SELECT * FROM subgoals WHERE goal_id = @goal_id
Then, you write a dapper function that retrieves the objects like this:
using (var multi = connection.QueryMultiple("get_GoalAndAllChildObjects",
    new { goal_id = m_goal_id }, commandType: CommandType.StoredProcedure))
{
    // First resultset is the goal row, second resultset its subgoals.
    var goal = multi.Read<Goal>().Single();
    goal.SubGoals = multi.Read<SubGoal>().ToList();
}
Next comes updating large data in batches. You do that through table parameter inserts (I wrote an article on this here: http://www.altdevblogaday.com/2012/05/16/sql-server-high-performance-inserts/ ). Basically, you create one table for each type of data you are going to insert, then write a procedure that takes those tables as parameters and write them to the database.
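A rough sketch of the table-parameter approach follows; the table type, procedure and column names are illustrative assumptions, not taken from your model, and goal/goalId are assumed to be in scope:
// requires: using System.Data; using Dapper;
// One-time setup on the database side (illustrative names):
//   CREATE TYPE dbo.SubGoalTableType AS TABLE (GoalId int, [Text] nvarchar(max));
//   CREATE PROCEDURE dbo.save_SubGoals @subgoals dbo.SubGoalTableType READONLY AS
//       INSERT INTO subgoals (goal_id, [text]) SELECT GoalId, [Text] FROM @subgoals;

var table = new DataTable();
table.Columns.Add("GoalId", typeof(int));
table.Columns.Add("Text", typeof(string));
foreach (var subGoal in goal.SubGoals)
    table.Rows.Add(goalId, subGoal.Text);

// Dapper sends the whole DataTable to the procedure as a single table-valued parameter.
connection.Execute(
    "dbo.save_SubGoals",
    new { subgoals = table.AsTableValuedParameter("dbo.SubGoalTableType") },
    commandType: CommandType.StoredProcedure);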
This is super high performance and about as optimized as you can get, plus the code isn't too complex.
However, I need to ask: is there any point in keeping "subgoals" and all the other objects relational? One easy alternative is to create an XML or JSON document that contains your goal and all its child objects serialized into text, and just save that to the file system. It's unbelievably high performance, very simple, very extensible, and takes very little code. The only downside is that you can't write a SQL statement to browse across all subgoals without a bit of extra work. Consider it - it might be worth a thought ;)
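If you want to try the document route, here is a minimal sketch (it assumes Json.NET's JsonConvert, which is an assumption about your toolset, and a hypothetical file name):
// requires: using System.IO; using Newtonsoft.Json;
// Serialize the whole aggregate (goal + subgoals + practices + measures) to one file...
File.WriteAllText("goal-42.json", JsonConvert.SerializeObject(goal));

// ...and load it back in one shot, with no joins and no mapping code.
var loaded = JsonConvert.DeserializeObject<Goal>(File.ReadAllText("goal-42.json"));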
I'm having a very odd issue with SubSonic: when I edit a class, the database isn't being updated, even when I delete it and regenerate it.
Example: Simple class
public class Customer {
public Guid Id { get; set; }
public string Description { get; set; }
}
Customer c = new Customer() { Id = Guid.NewGuid(), Description = "Toaster" };
var repo = new SimpleRepository("CustomerTest",
SimpleRepositoryOptions.RunMigrations);
repo.Add(c);
If I run this code it works perfectly, creates a table "Customer" and inserts the row for the toaster. However if I decide to change my Customer class to:
public class Customer {
public Guid Id { get; set; }
public string Description { get; set; }
public int Cost { get; set;}
}
And run the same code, adding a value for the Cost property, the database table remains "Id, Description". If I create a totally new class and paste in the Customer fields, it will create the table correctly the first time, but again any changes don't appear to work.
Any help?
First of all, you should try to figure out whether SubSonic detects your class definition changes properly.
This code should give you an overview of the statements SubSonic wants to execute:
var migrator=new SubSonic.Schema.Migrator(Assembly.GetExecutingAssembly());
var provider=ProviderFactory.GetProvider("CustomerTest");
string[] commands=migrator.MigrateFromModel<Customer>(provider);
commands should contain all the changes SubSonic wants to make to your database.
You can execute these commands yourself with:
BatchQuery query = new BatchQuery(provider);
foreach(var s in commands)
query.QueueForTransaction(new QueryCommand(s.Trim(), provider));
//pop the transaction
query.ExecuteTransaction();
(code taken from http://subsonicproject.com/docs/3.0_Migrations).
That said, I suppose commands will be empty in your case.
If so, that could be caused by a statement that is not implemented by the provider you are using (SqlServer/MySQL/SQLite/Oracle). Maybe you should download the SubSonic source and step into the migrator.MigrateFromModel(...) method to see what happens.
Another possible cause (if you use MySQL) could be that your information schema is not up to date. I encountered this problem a while ago. After changing my database and regenerating the DAL with SubSonic 2, my generated code didn't change.
I figured out that the MySQL information schema (and SubSonic queries the information schema) hadn't changed yet.
I solved this by executing FLUSH TABLES, which caused the information schema to reload. I don't know whether that's a bug in MySQL or desired behaviour, but you should try FLUSH TABLES first.
I'm currently doing some research on db4o as a storage option for my web application. I'm quite happy with how easily db4o works. So when I read about the Code First approach I kinda liked it, because the way of working with EF4 Code First is quite similar to working with db4o: create your domain objects (POCOs), throw them at db4o, and never look back.
But when I did a performance comparison, EF 4 was horribly slow. And I couldn't figure out why.
I use the following entities:
public class Recipe
{
private List<RecipePreparation> _RecipePreparations = new List<RecipePreparation>();
public int ID { get; set; }
public String Name { get; set; }
public String Description { get; set; }
public List Tags { get; set; }
public ICollection<RecipePreparation> Preparations
{ get { return _RecipePreparations.AsReadOnly(); } }
public void AddPreparation(RecipePreparation preparation)
{
this._RecipePreparations.Add(preparation);
}
}
public class RecipePreparation
{
public String Name { get; set; }
public String Description { get; set; }
public int Rating { get; set; }
public List Steps { get; set; }
public List Tags { get; set; }
public int ID { get; set; }
}
To test the performance I new up a recipe and add 50,000 RecipePreparations. Then I stored the object in db4o like so:
IObjectContainer db = Db4oEmbedded.OpenFile(Db4oEmbedded.NewConfiguration(), @"RecipeDB.db4o");
db.Store(recipe1);
db.Close();
This takes around 13,000 ms.
I store the stuff with EF4 in SQL Server 2008 (Express, locally) like this :
cookRecipes.Recipes.Add(recipe1);
cookRecipes.SaveChanges();
And that takes 200,000 ms.
Now how on earth is db4o 15(!!!) times faster than EF4/SQL? Am I missing a secret turbo button for EF4? I even think db4o could be made faster, since I don't initialize the database file; I just let it grow dynamically.
Did you call SaveChanges() inside the loop? No wonder it's slow! Try doing this:
foreach(var recipe in The500000Recipes)
{
cookRecipes.Recipes.Add(recipe);
}
cookRecipes.SaveChanges();
EF expects you to make all the changes you want, and then call SaveChanges once. That way, it can optimize database communication and sql to perform the changes between opening state and saving state, ignoring all changes that you have undone. (For example, adding 50 000 records, then removing half of them, then hitting SaveChanges will only add 25 000 records to the database. Ever.)
Perhaps you can disable change tracking while adding new objects; this would really increase performance.
context.Configuration.AutoDetectChangesEnabled = false;
see also for more info: http://coding.abel.nu/2012/03/ef-code-first-change-tracking/
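A typical shape for this (a sketch only; cookRecipes is assumed to be the DbContext from the question):
try
{
    // Turn off automatic DetectChanges while adding; each Add no longer
    // scans the whole tracked graph, which is what makes bulk Add slow.
    cookRecipes.Configuration.AutoDetectChangesEnabled = false;

    cookRecipes.Recipes.Add(recipe1);          // entities added via Add are still tracked as Added

    cookRecipes.ChangeTracker.DetectChanges(); // one explicit pass before saving
    cookRecipes.SaveChanges();
}
finally
{
    // Restore the default so the rest of the application behaves normally.
    cookRecipes.Configuration.AutoDetectChangesEnabled = true;
}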
The EF excels at many things, but bulk loading is not one of them. If you want high-performance bulk loading, doing it directly through the DB server will be faster than any ORM. If your app's sole performance constraint is bulk loading, then you probably shouldn't use the EF.
Just to add on to the other answers: db4o typically runs in-process, while EF abstracts an out-of-process (SQL) database. However, db4o is essentially single-threaded. So while it might be faster for this one example with one request, SQL will handle concurrency (multiple queries, multiple users) much better than a default db4o database setup.
It's probably a simple 3-tier problem. I just want to make sure we use best practice for this, and I am not that familiar with the structures yet.
We have the 3 tiers:
GUI: ASP.NET for Presentation-layer (first platform)
BAL: Business-layer will be handling the logic on a webserver in C#, so we both can use it for webforms/MVC + webservices
DAL: LINQ to SQL in the Data-layer, returning BusinessObjects not LINQ.
DB: The SQL will be Microsoft SQL Server/Express (haven't decided yet).
Let's think of a setup where we have a database of [Person]s. They can all have multiple [Address]es, and we have a complete list of all [PostalCode]s and corresponding city names, etc.
The deal is that we have joined a lot of details from other tables.
{Relations}/[tables]
[Person]:1 --- N:{PersonAddress}:M --- 1:[Address]
[Address]:N --- 1:[PostalCode]
Now we want to build the DAL for Person. How should the PersonBO look, and when do the joins occur?
Is it a business-layer concern to fetch all city names and possible addresses per Person, or should the DAL complete all this before returning the PersonBO to the BAL?
class PersonBO
{
    public int ID { get; set; }
    public string Name { get; set; }
    public List<AddressBO> Addresses { get; set; } // Question #1
}
// Q1: do we retrieve the address objects before returning the PersonBO, and should it be an array instead? Or is this totally wrong for n-tier/3-tier?
class AddressBO
{
    public int ID { get; set; }
    public string StreetName { get; set; }
    public int PostalCode { get; set; } // Question #2
}
// Q2: do we make the lookup here, or just leave the PostalCode for a later lookup?
Can anyone explain in what order to pull which objects? Constructive criticism is very welcome. :o)
You're kind of reinventing the wheel; ORMs already solve most of this problem for you and you're going to find it a little tricky to do yourself.
The way ORMs like LINQ to SQL, Entity Framework and NHibernate do this is with a technique called lazy loading of associations (which can optionally be overridden with an eager load).
When you pull up a Person, it does not load the Address until you specifically ask for it, at which point another round-trip to the database occurs (lazy load). You can also specify on a per-query basis that you want the Address to be loaded for every person (eager load).
In a sense, with this question you are basically asking whether or not you should perform lazy or eager loads of the AddressBO for the PersonBO, and the answer is: neither. There isn't one single approach that universally works. By default you should probably lazy load, so that you don't do a whole lot of unnecessary joins; in order to pull this off, you'll have to build your PersonBO with a lazy-loading mechanism that maintains some reference to the DAL. But you'll still want to have the option to eager-load, which you'll need to build into your "business-access" logic.
Another option, if you need to return a highly-customized data set with specific properties populated from many different tables, is to not return a PersonBO at all, but instead use a Data Transfer Object (DTO). If you implement a default lazy-loading mechanism, you can sometimes substitute this as the eager-loading version.
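For example, the per-query eager load in LINQ to SQL looks roughly like this (a sketch; PersonDataContext, Person and Addresses are assumed names, not something from your code):
using (var context = new PersonDataContext())   // hypothetical generated DataContext
{
    // Eager load: fetch each Person's Addresses in the same round trip as the Person query.
    var options = new System.Data.Linq.DataLoadOptions();
    options.LoadWith<Person>(p => p.Addresses);
    context.LoadOptions = options;

    var people = context.Persons.ToList();       // Addresses is already populated for each person
}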
FYI, lazy loaders in data access frameworks are usually built with the loading logic in the association itself:
public class PersonBO
{
public int ID { get; set; }
public string Name { get; set; }
public IList<AddressBO> Addresses { get; set; }
}
This is just a POCO, the magic happens in the actual list implementation:
// NOT A PRODUCTION-READY IMPLEMENTATION - DO NOT USE
internal class LazyLoadList<T> : IList<T>
{
private IQueryable<T> query;
private List<T> items;
public LazyLoadList(IQueryable<T> query)
{
if (query == null)
throw new ArgumentNullException("query");
this.query = query;
}
private void Materialize()
{
if (items == null)
items = query.ToList();
}
public void Add(T item)
{
Materialize();
items.Add(item);
}
// Etc.
}
(This obviously isn't production-grade, it's just to demonstrate the technique; you start with a query and don't materialize the actual list until you have to.)