My application has an entity model as below and uses Dapper:
public class Goal
{
public string Text { get; set; }
public List<SubGoal> SubGoals { get; set; }
}
public class SubGoal
{
public string Text { get; set; }
public List<Practise> Practices { get; set; }
public List<Measure> Measures { get; set; }
}
and has a repository as below
public interface IGoalPlannerRepository
{
IEnumerable<Goal> FindAll();
Goal Get(int id);
void Save(Goal goal);
}
I came across two scenarios:
When retrieving data (a goal entity), it needs to retrieve all the related objects in the hierarchy (all subgoals along with their practices and measures)
When a goal is saved, all the related data needs to be inserted and/or updated
Please suggest: is there a better way to handle these scenarios other than looping through the collections and writing lots and lots of SQL queries?
The best way to do large batch data updates in SQL using Dapper is with compound queries.
You can retrieve all your objects in one query as a multiple resultset, like this:
CREATE PROCEDURE get_GoalAndAllChildObjects
    @goal_id int
AS
SELECT * FROM goal WHERE goal_id = @goal_id
SELECT * FROM subgoals WHERE goal_id = @goal_id
Then you write a Dapper function that retrieves the objects like this:
using (var multi = connection.QueryMultiple("get_GoalAndAllChildObjects",
    new { goal_id = m_goal_id }, commandType: CommandType.StoredProcedure))
{
    var goal = multi.Read<Goal>().Single();         // first result set
    goal.SubGoals = multi.Read<SubGoal>().ToList(); // second result set
}
Next comes updating large data in batches. You do that through table parameter inserts (I wrote an article on this here: http://www.altdevblogaday.com/2012/05/16/sql-server-high-performance-inserts/ ). Basically, you create one table for each type of data you are going to insert, then write a procedure that takes those tables as parameters and writes them to the database.
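For the goal scenario, the Dapper side of a table-parameter save might look like the sketch below. The table type name dbo.SubGoalTableType, the procedure save_GoalWithSubGoals, and the column layout are illustrative assumptions, not part of the original code.
// Requires: using Dapper; using System.Data;
// Build a DataTable whose columns match the hypothetical table type
// dbo.SubGoalTableType (goal_id int, text nvarchar(max)) exactly.
var subGoalTable = new DataTable();
subGoalTable.Columns.Add("goal_id", typeof(int));
subGoalTable.Columns.Add("text", typeof(string));
foreach (var subGoal in goal.SubGoals)
    subGoalTable.Rows.Add(m_goal_id, subGoal.Text);

// One round trip: the whole batch travels as a table-valued parameter.
connection.Execute(
    "save_GoalWithSubGoals",
    new
    {
        goal_id = m_goal_id,
        goal_text = goal.Text,
        subgoals = subGoalTable.AsTableValuedParameter("dbo.SubGoalTableType")
    },
    commandType: CommandType.StoredProcedure);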
This is super high performance and about as optimized as you can get, plus the code isn't too complex.
However, I need to ask: is there any point in keeping "subgoals" and all the other objects relational? One easy alternative is to create an XML or JSON document that contains your goal and all its child objects serialized into text, and just save that object to the file system. It's unbelievably high performance, very simple, very extensible, and takes very little code. The only downside is that you can't write a SQL statement to browse across all subgoals without a bit of extra work. Consider it - it might be worth a thought ;)
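A minimal sketch of that document approach, assuming Json.NET (Newtonsoft.Json) and a hypothetical goalId used for the file name:
// Serialize the whole aggregate (goal + subgoals + practices + measures)
// into one JSON document and persist it as a file.
var json = JsonConvert.SerializeObject(goal);
File.WriteAllText("goal-" + goalId + ".json", json);

// Loading is the mirror image:
var loaded = JsonConvert.DeserializeObject<Goal>(File.ReadAllText("goal-" + goalId + ".json"));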
I have the following nested object model:
public class Product
{
public List<ProductOffering> ProductOfferings { get; set; }
}
public class ProductOffering
{
public int OfferingId { get; set; }
public string OfferingDescription { get; set; }
public string OfferingType { get; set; }
public List<OfferingPriceRegion> PriceRegions { get; set; }
}
I want to insert a Product along with its list of ProductOfferings (each of which in turn has a list of OfferingPriceRegions) in a single stored procedure (SPInsertProduct) using C#. What is the best approach, other than Entity Framework? The ProductOfferings collection in a Product may be large, say 400 items, where Entity Framework may take too long looping through the save. Please suggest.
Since Dapper is an ADO.NET-based object mapper, the best option would be table-valued parameters (TVPs), which let you send all the required data to the database in a single call.
The important points:
Dapper takes the TVP as a DataTable
To convert an IEnumerable<T> to a DataTable, you can use the System.Data.DataSetExtensions method CopyToDataTable, or the FastMember NuGet package (see the sketch after the caveats below)
A few caveats:
The number of columns, the column names, and their order must be exactly the same for the TVP and the DataTable; otherwise it will not work, and the error will not point to the issue. This mapping is not like JSON mapping, where a schema mismatch isn't a problem.
If the number of records is very high, you may want to split the data into multiple DataTables and use async/await to run the same operation concurrently.
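Putting those points together for the product example, a minimal sketch might look like this. The table type dbo.OfferingTableType and the SPInsertProduct parameter name are assumptions for illustration:
// Requires: using Dapper; using FastMember; using System.Data;
// Convert IEnumerable<ProductOffering> to a DataTable with FastMember.
// Column names and their order must match dbo.OfferingTableType exactly.
var offeringTable = new DataTable();
using (var reader = ObjectReader.Create(product.ProductOfferings,
    "OfferingId", "OfferingDescription", "OfferingType"))
{
    offeringTable.Load(reader);
}

// Single call: the whole batch goes to the database as one TVP.
connection.Execute(
    "SPInsertProduct",
    new { Offerings = offeringTable.AsTableValuedParameter("dbo.OfferingTableType") },
    commandType: CommandType.StoredProcedure);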
I have a sample ASP.NET application, and I want to build it using an n-tier architecture. I have a database that contains tables and stored procedures (which perform CRUD operations on those tables). When I tried to create the data access layer, I wrote methods that use ADO.NET to call these stored procedures, but these methods return DataTables, like this one:
public DataTable getallcourcesdetailsbyid(string courseid)
{
    SqlParameter[] parameter = new SqlParameter[] { new SqlParameter("@courseid", courseid) };
    return sqlhelper.ExecuteParamerizedSelectCommand("usp_getcoursedetailsbyid", CommandType.StoredProcedure, parameter);
}
Then I found that there is a better way: create classes with properties that represent the tables in the database to hold the data returned by the data access layer, like this one:
class course{
public int courseid { get; set; }
public string coursename { get; set; }
public short specializationid { get; set; }
public short subjectid { get; set; }
public short instructorid { get; set; }
public string startdate { get; set; }
public string enddate { get; set; }
public bool isactive { get; set; }
public bool isdeleted { get; set; }
}
But these stored procedures do not always return data from one specific table. For example, the course class above represents the course table in the database, but the getallcourcesdetailsbyid method above calls a stored procedure with the following code:
select courseid, coursename, startdate, enddate, courseimgpath, specialization, firstname, lastname, subjectname, price, coursedetails, teacherimgpath
from joacademytest.course
inner join joacademytest.specialization on joacademytest.course.specializationid = joacademytest.specialization.specializationid
inner join joacademytest.[subject] on joacademytest.course.subjectnameid = joacademytest.[subject].subjectid
inner join joacademytest.teachers on joacademytest.course.instructerid = joacademytest.teachers.teacherid
inner join dbo.courcesprices on joacademytest.course.priceid = dbo.courcesprices.priceid
where joacademytest.course.isactive = 1 and joacademytest.course.isdeleted = 0 and courseid = @courseid;
So the stored procedure does not return the same columns that exist in the course object; it returns columns from four joined tables. Do I have to create the entity classes based on the columns in the tables, or based on the columns returned by my stored procedure? I have searched the Internet and never found anybody mention creating entity classes based on the columns a stored procedure returns, which left me confused.
There is nothing wrong with creating classes that map to the results of a stored procedure. In fact, many ORMs such as Entity Framework and NHibernate allow you to do that. In the end it all depends on what you want to achieve (performance, maintenance, etc.), and these are very broad topics. To answer your question, keeping in mind your current setup, I would propose:
Create entities that map to stored procedure results if stored procedures are your preferred way of getting data from the database. So that you don't reinvent the wheel, use one of the many ORM tools (Dapper, EF) rather than hand-rolling the mapping.
Or, instead of creating entities from your DataTables, you can return dynamic objects.
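For example, with Dapper you can define a class shaped like the procedure's result set (rather than like the course table) and map straight to it. A sketch against the usp_getcoursedetailsbyid procedure shown above; only some of the returned columns are listed, and the column types are assumptions:
// Requires: using Dapper; using System.Data;
// The DTO mirrors the columns the stored procedure returns, not a single table.
public class CourseDetails
{
    public int courseid { get; set; }
    public string coursename { get; set; }
    public string specialization { get; set; }
    public string firstname { get; set; }
    public string lastname { get; set; }
    public string subjectname { get; set; }
    public decimal price { get; set; }
}

var course = connection.QuerySingleOrDefault<CourseDetails>(
    "usp_getcoursedetailsbyid",
    new { courseid },
    commandType: CommandType.StoredProcedure);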
Hope this helps.
What I always do is create classes that represent the tables in my database. In a scenario like you describe, where a stored procedure returns joined data, I move the query into a view and create a class that represents that view. If you can't do that and you must keep the stored procedure, then I would just create a class around what the stored procedure returns.
I have the following repository. I have a mapping between LINQ to SQL generated classes and domain objects using a factory.
The following code works, but I see two potential issues:
1) It issues a SELECT query before the UPDATE statement.
2) It needs to update all the columns (not only the changed ones), because we don't know which columns were changed in the domain object.
How can I overcome these shortcomings?
Note: there can be scenarios (such as triggers) that execute based on a specific column being updated, so I cannot update a column unnecessarily.
REFERENCE:
LINQ to SQL: Updating without Refresh when “UpdateCheck = Never”
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=113917
CODE
namespace RepositoryLayer
{
public interface ILijosBankRepository
{
void SubmitChangesForEntity(DomainEntitiesForBank.IBankAccount iBankAcc);
}
public class LijosSimpleBankRepository : ILijosBankRepository
{
private IBankAccountFactory bankFactory = new MySimpleBankAccountFactory();
public System.Data.Linq.DataContext Context
{
get;
set;
}
public virtual void SubmitChangesForEntity(DomainEntitiesForBank.IBankAccount iBankAcc)
{
//Does not get help from automated change tracking (due to mapping)
//Selecting the required entity
DBML_Project.BankAccount tableEntity = Context.GetTable<DBML_Project.BankAccount>().SingleOrDefault(p => p.BankAccountID == iBankAcc.BankAccountID);
if (tableEntity != null)
{
//Setting all the values to updates (except primary key)
tableEntity.Status = iBankAcc.AccountStatus;
//Type Checking
if (iBankAcc is DomainEntitiesForBank.FixedBankAccount)
{
tableEntity.AccountType = "Fixed";
}
if (iBankAcc is DomainEntitiesForBank.SavingsBankAccount)
{
tableEntity.AccountType = "Savings";
}
Context.SubmitChanges();
}
}
}
}
namespace DomainEntitiesForBank
{
public interface IBankAccount
{
int BankAccountID { get; set; }
double Balance { get; set; }
string AccountStatus { get; set; }
void FreezeAccount();
}
public class FixedBankAccount : IBankAccount
{
public int BankAccountID { get; set; }
public string AccountStatus { get; set; }
public double Balance { get; set; }
public void FreezeAccount()
{
AccountStatus = "Frozen";
}
}
}
If I understand your question, you are being passed an entity that you need to save to the database without knowing what the original values were, or which of the columns have actually changed.
If that is the case, then you have four options
You can go back to the database to see the original values, i.e. perform the select, as your code is doing. This allows you to set all your entity values, and LINQ to SQL will take care of which columns have actually changed. So if none of your columns actually changed, then no update statement is issued.
You can avoid the select and just update the columns. You already know how to do this (but for others, see this question and answer; a sketch also follows this list). Since you don't know which columns have changed, you have no option but to set them all. This will produce an update statement even if no columns actually changed, and that can fire any database triggers. Apart from disabling the triggers, about the only thing you can do here is make sure the triggers are written to check the old and new column values, to avoid any further unnecessary updates.
You can change your requirements/program so that you require both the old and new entity values, letting you determine which columns changed without going back to the database.
Don't use LINQ for your updates. LINQ stands for Language Integrated QUERY, and it is (IMHO) brilliant at querying, but I have always regarded the updating/deleting features as an extra bonus, not something it was designed for. Also, if timing/performance is critical, there is no way LINQ will match properly hand-crafted SQL.
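For option 2, here is a minimal sketch of the attach approach, assuming the DBML columns are marked UpdateCheck = Never (see the reference link in the question):
// Build a fresh entity from the domain object and attach it as modified,
// so LINQ to SQL issues an UPDATE without a prior SELECT.
public virtual void SubmitChangesForEntity(DomainEntitiesForBank.IBankAccount iBankAcc)
{
    var tableEntity = new DBML_Project.BankAccount
    {
        BankAccountID = iBankAcc.BankAccountID, // primary key
        Status = iBankAcc.AccountStatus,
        AccountType = iBankAcc is DomainEntitiesForBank.FixedBankAccount ? "Fixed" : "Savings"
    };
    // Attach as modified: every column is included in the UPDATE statement.
    Context.GetTable<DBML_Project.BankAccount>().Attach(tableEntity, true);
    Context.SubmitChanges();
}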
This isn't really a DDD question; from what I can tell you are asking:
Use linq to generate direct update without select
The accepted answer there was no, it's not possible, but there's a higher-voted answer suggesting you can attach an object to your context to initiate the data context's change tracking.
Your second point about disabling triggers has been answered here and here. But, as others have commented, do you really need the triggers? Should you not be controlling these updates in code?
In general I think you're looking at premature optimization. You're using an ORM, and as part of that you're trusting L2S to make the database plumbing decisions for you. But remember that, where appropriate, you can use stored procedures or execute specific SQL of your own.
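For instance, a targeted update that touches only the changed column can bypass change tracking entirely. A sketch using the DataContext directly; the table and column names follow the DBML classes above:
// ExecuteCommand parameterizes the {0}/{1} placeholders, and only the
// Status column is touched, so column-specific triggers fire only when intended.
Context.ExecuteCommand(
    "UPDATE BankAccount SET Status = {0} WHERE BankAccountID = {1}",
    iBankAcc.AccountStatus,
    iBankAcc.BankAccountID);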
I'm currently doing some research on using db4o as the storage for my web application. I'm quite happy with how easily db4o works. So when I read about the Code First approach, I kind of liked it, because working with EF4 Code First is quite similar to working with db4o: create your domain objects (POCOs), throw them at db4o, and never look back.
But when I did a performance comparison, EF 4 was horribly slow. And I couldn't figure out why.
I use the following entities:
public class Recipe
{
    private List<RecipePreparation> _RecipePreparations = new List<RecipePreparation>();
    public int ID { get; set; }
    public String Name { get; set; }
    public String Description { get; set; }
    public List<String> Tags { get; set; }
    public ICollection<RecipePreparation> Preparations
    { get { return _RecipePreparations.AsReadOnly(); } }
    public void AddPreparation(RecipePreparation preparation)
    {
        this._RecipePreparations.Add(preparation);
    }
}
public class RecipePreparation
{
    public String Name { get; set; }
    public String Description { get; set; }
    public int Rating { get; set; }
    public List<String> Steps { get; set; }
    public List<String> Tags { get; set; }
    public int ID { get; set; }
}
To test the performance, I new up a Recipe and add 50,000 RecipePreparations. Then I store the object in db4o like so:
IObjectContainer db = Db4oEmbedded.OpenFile(Db4oEmbedded.NewConfiguration(), @"RecipeDB.db4o");
db.Store(recipe1);
db.Close();
This takes around 13,000 ms.
I store the stuff with EF4 in SQL Server 2008 (Express, locally) like this:
cookRecipes.Recipes.Add(recipe1);
cookRecipes.SaveChanges();
And that takes 200,000 ms.
Now how on earth is db4o 15(!!!) times faster than EF4/SQL? Am I missing a secret turbo button for EF4? I also think db4o could be made faster, since I don't pre-allocate the database file; I just let it grow dynamically.
Did you call SaveChanges() inside the loop? No wonder it's slow! Try doing this:
foreach(var recipe in The500000Recipes)
{
cookRecipes.Recipes.Add(recipe);
}
cookRecipes.SaveChanges();
EF expects you to make all the changes you want, and then call SaveChanges once. That way, it can optimize database communication and sql to perform the changes between opening state and saving state, ignoring all changes that you have undone. (For example, adding 50 000 records, then removing half of them, then hitting SaveChanges will only add 25 000 records to the database. Ever.)
Perhaps you can disable change tracking while adding new objects; this should really increase performance.
context.Configuration.AutoDetectChangesEnabled = false;
See this post for more info: http://coding.abel.nu/2012/03/ef-code-first-change-tracking/
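Combining this with the previous answer's single SaveChanges call might look like the sketch below (AutoDetectChangesEnabled lives on DbContext.Configuration, available from EF 4.1 onward):
cookRecipes.Configuration.AutoDetectChangesEnabled = false;
try
{
    // No change detection runs per Add now.
    foreach (var recipe in The500000Recipes)
    {
        cookRecipes.Recipes.Add(recipe);
    }
    cookRecipes.SaveChanges(); // one batched round of inserts
}
finally
{
    // Restore the default so later operations behave normally.
    cookRecipes.Configuration.AutoDetectChangesEnabled = true;
}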
The EF excels at many things, but bulk loading is not one of them. If you want high-performance bulk loading, doing it directly through the DB server will be faster than any ORM. If your app's sole performance constraint is bulk loading, then you probably shouldn't use the EF.
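If bulk load speed is the only concern, going straight to SqlBulkCopy avoids the ORM overhead entirely. A sketch, assuming a dbo.RecipePreparations table whose columns match the three used below:
// Requires: using System.Data; using System.Data.SqlClient;
var table = new DataTable();
table.Columns.Add("Name", typeof(string));
table.Columns.Add("Description", typeof(string));
table.Columns.Add("Rating", typeof(int));
foreach (var p in recipe1.Preparations)
{
    table.Rows.Add(p.Name, p.Description, p.Rating);
}
using (var bulk = new SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "dbo.RecipePreparations";
    bulk.WriteToServer(table); // streams all rows in a single bulk operation
}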
Just to add on to the other answers: db4o typically runs in-process, while EF abstracts an out-of-process (SQL) database. However, db4o is essentially single-threaded. So while it might be faster for this one example with one request, SQL will handle concurrency (multiple queries, multiple users) much better than a default db4o database setup.
It's probably a simple 3-tier problem. I just want to make sure we use the best practice for this, and I am not that familiar with the structures yet.
We have the 3 tiers:
GUI: ASP.NET for the presentation layer (first platform)
BAL: business layer handling the logic in C# on a web server, so we can use it for both WebForms/MVC and web services
DAL: LINQ to SQL in the data layer, returning business objects rather than LINQ entities
DB: The database will be Microsoft SQL Server/Express (haven't decided yet).
Let's think of a setup where we have a database of [Person]s. They can all have multiple [Address]es, and we have a complete list of all [PostalCode]s and corresponding city names, etc.
The deal is that we have joined a lot of details from other tables.
{Relations}/[tables]
[Person]:1 --- N:{PersonAddress}:M --- 1:[Address]
[Address]:N --- 1:[PostalCode]
Now we want to build the DAL for Person. How should the PersonBO look, and when do the joins occur?
Is it the business layer's job to fetch all city names and possible addresses per Person, or should the DAL complete all this before returning the PersonBO to the BAL?
class PersonBO
{
    public int ID { get; set; }
    public string Name { get; set; }
    public List<AddressBO> Addresses { get; set; } // Question #1
}
// Q1: do we retrieve the address objects before returning the PersonBO, and should it be an array instead? Or is this totally wrong for n-tier/3-tier?
class AddressBO
{
    public int ID { get; set; }
    public string StreetName { get; set; }
    public int PostalCode { get; set; } // Question #2
}
// Q2: do we make the city-name lookup here or just leave the PostalCode for a later lookup?
Can anyone explain in what order to pull which objects? Constructive criticism is very welcome. :o)
You're kind of reinventing the wheel; ORMs already solve most of this problem for you and you're going to find it a little tricky to do yourself.
The way ORMs like LINQ to SQL, Entity Framework, and NHibernate do this is a technique called lazy loading of associations (which can optionally be overridden with an eager load).
When you pull up a Person, it does not load the Address until you specifically ask for it, at which point another round-trip to the database occurs (lazy load). You can also specify on a per-query basis that you want the Address to be loaded for every person (eager load).
In a sense, with this question you are basically asking whether or not you should perform lazy or eager loads of the AddressBO for the PersonBO, and the answer is: neither. There isn't one single approach that universally works. By default you should probably lazy load, so that you don't do a whole lot of unnecessary joins; in order to pull this off, you'll have to build your PersonBO with a lazy-loading mechanism that maintains some reference to the DAL. But you'll still want to have the option to eager-load, which you'll need to build into your "business-access" logic.
Another option, if you need to return a highly-customized data set with specific properties populated from many different tables, is to not return a PersonBO at all, but instead use a Data Transfer Object (DTO). If you implement a default lazy-loading mechanism, you can sometimes substitute this as the eager-loading version.
FYI, lazy loaders in data access frameworks are usually built with the loading logic in the association itself:
public class PersonBO
{
public int ID { get; set; }
public string Name { get; set; }
public IList<AddressBO> Addresses { get; set; }
}
This is just a POCO, the magic happens in the actual list implementation:
// NOT A PRODUCTION-READY IMPLEMENTATION - DO NOT USE
internal class LazyLoadList<T> : IList<T>
{
private IQueryable<T> query;
private List<T> items;
public LazyLoadList(IQueryable<T> query)
{
if (query == null)
throw new ArgumentNullException("query");
this.query = query;
}
private void Materialize()
{
if (items == null)
items = query.ToList();
}
public void Add(T item)
{
Materialize();
items.Add(item);
}
// Etc.
}
(This obviously isn't production-grade, it's just to demonstrate the technique; you start with a query and don't materialize the actual list until you have to.)
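Usage in the DAL would then look something like this hypothetical sketch, where row is the person record from your data source and addressesForPerson is an IQueryable<AddressBO> already filtered and projected for that person:
// The BO receives a query, not data; the addresses are fetched from the
// database only the first time the Addresses list is actually used.
var person = new PersonBO
{
    ID = row.ID,
    Name = row.Name,
    Addresses = new LazyLoadList<AddressBO>(addressesForPerson)
};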