Poor performance with POCO Generator - C#

I am working on a legacy application, and we have poor performance with Entity Framework (4.0.0) and massive inserts.
When I tried the POCO Generator (T4), the issue got worse: SaveChanges took three times longer. This is a huge difference, so if you have any idea why this happens, I am interested.

I don't have any performance metrics for the different generators, but the bottleneck should not be in your context anyway. Keep in mind that EF generates one SQL statement per insert, update and delete, and if you don't explicitly open the connection first, it will open and close the connection to SQL Server once per SQL statement.
The context also has to maintain entity states and relationships, so performance degrades as your context gets larger and larger. SaveChanges must first work out what has changed in the context, which is probably why the POCO Generator and EntityObject generators end up with different execution times. As for it being three times longer, more details would be needed to figure that out.
PS: if you are stuck with the legacy code, you should look into using bulk copy alongside EF.
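For example, a minimal sketch of the bulk-copy route using ADO.NET's SqlBulkCopy next to the EF model (the table, columns and variable names here are illustrative assumptions, not from the original post):

using System;
using System.Data;
using System.Data.SqlClient;

// Build a DataTable whose columns match the target table (illustrative schema).
var table = new DataTable();
table.Columns.Add("Name", typeof(string));
table.Columns.Add("CreatedOn", typeof(DateTime));
foreach (var c in customersToInsert)          // the entities you would otherwise Add() to the context
    table.Rows.Add(c.Name, c.CreatedOn);

using (var bulk = new SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "dbo.Customers";
    bulk.BatchSize = 5000;        // rows are streamed in batches instead of one INSERT per row
    bulk.WriteToServer(table);    // single bulk operation for the whole set
}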

Related

Entity Framework can't keep up with current amount of traffic

I have an ASP.NET project running on a web server that receives random amounts of traffic and needs to write this information to a SQL database as soon as it receives it. It needs to handle up to 2000 - 3000 messages a second at times, and at other times just a few a second.
The programmers above me are set on using Entity Framework for the safety it provides, but I can't keep up with the surge of messages sometimes, as they need to hit the database fast and can't be queued. The best I've gotten is about 1200 messages a second with Entity Framework using a save after each request, which I would think is not how Entity Framework should be used. I know bulk insert is far more effective, but it isn't an option because we can't hold on to the messages, per the requirements given to me. If I do a direct SQL insert I can keep up with the message load, but my management says no because of the type safety.
Any suggestions on how I can make Entity Framework keep up with the load, or any other frameworks that provide the safety and backing Entity Framework has, that I can bring to management? I've heard Dapper is the other good contender, but I have no experience with it to justify it for enterprise solutions.
I've tried researching all the Microsoft documents on Entity Framework and the entityframework.net documentation. I tried setting AutoDetectChangesEnabled to false. Everything I read just points to doing a bulk insert. I've also tried stripping out other tables and using a staging table to see if I can make it faster.
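For reference, the per-message insert path I've been testing looks roughly like this (the context and entity names are placeholders, not the real ones):

// One short-lived context per incoming message, saving immediately.
using (var db = new MessagesContext())
{
    db.Configuration.AutoDetectChangesEnabled = false;   // EF6 setting I tried
    db.Messages.Add(new Message { Body = payload, ReceivedAt = DateTime.UtcNow });
    db.SaveChanges();   // still one INSERT round trip per message
}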

New project: ADO.Net vs Entity Framework - trying to understand if EF works out

We are at the beginning of a new project, which will replace a legacy project. The legacy one is written in .NET Framework 4.0 (SOA with WCF) + SQL Server. The connection to SQL is made with ADO.NET + stored procedures. There is a structural mistake in having most of the logic in the stored procedures, and on top of that, it is a monolith.
The new project will be built with .NET 6 APIs and, in some cases, it will use SQL Server as well, for operational data.
So, looking at the new product, the question was raised: should we move from ADO.NET to EF? This is tempting since it reduces the development effort, but performance is a concern.
Taking a look at the technical must-haves:
Get the product to be as fast as possible (performance is a concern)
The new project is expected to live at least for the next 15 years
Operations are executed against tables with 30 to 50 million records
We must be able to run operations against the regular database, but also against the read-only one (AlwaysOn)
We must be able to apply resiliency policies such as retries in case of deadlocks (see the sketch right after this list)
We don't have much room for changes if we choose one path and somewhere along the way realize we should have gone with the other option
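If we do go with EF, the retry requirement would presumably map onto EF Core's built-in execution strategy; a minimal sketch of the startup configuration (the context name and connection string are placeholders):

using Microsoft.EntityFrameworkCore;

// .NET 6 startup: register the operational DbContext with automatic retries.
builder.Services.AddDbContext<OperationalContext>(options =>
    options.UseSqlServer(connectionString, sql =>
        sql.EnableRetryOnFailure(
            maxRetryCount: 5,
            maxRetryDelay: TimeSpan.FromSeconds(10),
            // 1205 = chosen as deadlock victim; listed explicitly in case the
            // EF Core version in use does not already treat it as transient.
            errorNumbersToAdd: new[] { 1205 })));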
Quite honestly, IMHO, based on our tech requirements I feel we should move forward with ADO.NET + stored procedures (without any business logic) + some sort of package that translates the SQL results into my objects quickly, but I'd like to give EF a shot, at least at this stage of the process where we are investigating possibilities.
I'd like to gather opinions if possible, especially from anyone who went with EF with requirements similar to ours, or who didn't go with EF, or who had to switch from EF to ADO.NET somewhere along the way.
Thanks.
The only thing in your requirements that could support using ADO.NET over EF is
Get the product to be as fast as possible (performance is a concern)
Which is a nonsense requirement, as you can always write more code and make things more complex to make them marginally faster. You need a real performance requirement so you can measure the different approaches against it.

Why Entity Framework performs faster than Dapper in direct select statement [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I'm new to using an ORM for dealing with databases. I'm currently starting a new project and I have to decide whether I'll use Entity Framework or Dapper. I read many articles which say that Dapper is faster than Entity Framework.
So I made two simple prototype projects, one using Dapper and the other using Entity Framework, each with one function that gets all the rows from one table.
The table schema is as in the following picture,
and the code for both projects is as follows.
For the Dapper project:
System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
sw.Start();
IEnumerable<Emp> emplist = cn.Query<Emp>(@"Select * From Employees");
sw.Stop();
MessageBox.Show(sw.ElapsedMilliseconds.ToString());
For the Entity Framework project:
System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
sw.Start();
IEnumerable<Employee> emplist = hrctx.Employees.ToList();
sw.Stop();
MessageBox.Show(sw.ElapsedMilliseconds.ToString());
After trying the above code many times, the Dapper code is faster only the first time I run the project; after that first time, I always get better results from the Entity Framework project.
I also tried the following statement in the Entity Framework project to disable lazy loading:
hrctx.Configuration.LazyLoadingEnabled = false;
but it's still the same: EF performs faster, except for the first time.
Can anyone give me an explanation or some guidance on what makes EF faster in this sample, even though all the articles on the web say the opposite?
Update
I've changed the line of code in the Entity Framework sample to
IEnumerable<Employee> emplist = hrctx.Employees.AsNoTracking().ToList();
Using AsNoTracking, as mentioned in some articles, stops Entity Framework's caching, and with the caching stopped the Dapper sample performs better (though not by a very big difference).
An ORM (Object Relational Mapper) is a tool that creates a layer between your application and the data source and returns relational objects instead of (in terms of the C# you are using) ADO.NET objects. This is the basic thing every ORM does.
To do this, ORMs generally execute the query and map the returned DataReader to the POCO class. Dapper stops here.
To extend this further, some ORMs (also called "full ORMs") do much more: they generate the query for you to make your application database-independent, cache your data for future calls, manage the unit of work for you, and a lot more. These are all good tools and add value to the ORM, but they come at a cost. Entity Framework falls into this class.
To generate the query, EF has to execute additional code. Caching improves performance, but managing the cache requires executing additional code. The same is true for the unit of work and any other add-on feature provided by EF. All of this saves you from writing additional code, and EF pays the cost.
And the cost is performance. As Dapper does a very basic job, it is faster, but you have to write more code. As EF does much more than that, it is (a bit) slower, but you have to write less code.
So why do your tests show the opposite result?
Because the tests you are executing are not comparable.
Full ORMs have many good features, as explained above; one of them is the unit of work. Tracking is one of the responsibilities of the UoW. When an object is requested (via a SQL query) for the first time, it causes a round trip to the database. The object is then kept in the memory cache. The full ORM keeps track of changes made to these already-loaded objects. If the same object is requested again (another query in the same UoW scope that includes the loaded object), no database round trip is made; the object is returned from the memory cache instead. This way, considerable time is saved.
Dapper does not support this feature, which causes it to perform slower in your tests.
But this benefit only applies if the same object(s) are loaded multiple times. Also, if the number of objects loaded into memory is too high, it will slow the full ORM down instead, because the time required to check for the objects in memory becomes higher. So again, this benefit depends on the use case.
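A small sketch of that tracking behaviour, assuming an EF context named HrContext with an Employees set and an Id key (names are illustrative):

using (var ctx = new HrContext())
{
    // First query: one SQL round trip; the entity is materialized and tracked.
    var first = ctx.Employees.First(e => e.Id == 1);

    // Find() checks the change tracker by key before querying,
    // so this call can be served from memory with no second round trip.
    var second = ctx.Employees.Find(1);

    Console.WriteLine(ReferenceEquals(first, second));   // True - same tracked instance
}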
I read many articles which says that Dapper is faster than Entity Framework
The problem with most of the benchmarks on the internet is that they compare EF LINQ to Dapper, and that's what you did too, which is unfair. An auto-generated query (EF) is often not equal to one written by a good developer.
This,
IEnumerable<Employee> emplist = hrctx.Employees.ToList();
should be replaced by this.
IEnumerable<Employee> emplist = hrctx.Employees.FromSql(@"Select * From Employees").AsNoTracking().ToList();
Edit:
As pointed out by @mjwills, below is the results table for insert, update and select statements.
Dapper outperforms EF Core 2. However, it can be seen that for plain EF queries the difference is minimal. I have posted the complete details here.
There is no problem with mixing them. In my current project I'm using Dapper for selecting data and EF for creating, updating and database migrations.
Dapper becomes extremely helpful when it comes to complex queries where more than two tables are involved, or where there are complex operations (joining on more than one column, joining with >= and <= conditions, recursive selects, CTEs, etc.) for which pure SQL is much easier to write than LINQ. As far as I know, Entity Framework (unlike Dapper) cannot use the .FromSql() method with custom DTOs; it can only map to an entity type that is part of your database context.
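For example, a Dapper query that joins two tables straight into a hand-written DTO might look like this (the tables, columns and DTO are made-up for illustration; requires the Dapper and System.Data.SqlClient packages):

public class EmployeeDeptDto
{
    public int EmployeeId { get; set; }
    public string EmployeeName { get; set; }
    public string DepartmentName { get; set; }
}

using (var cn = new SqlConnection(connectionString))
{
    // Dapper maps the result columns to the DTO properties by name;
    // the DTO does not have to exist in any DbContext.
    var rows = cn.Query<EmployeeDeptDto>(@"
        SELECT e.Id   AS EmployeeId,
               e.Name AS EmployeeName,
               d.Name AS DepartmentName
        FROM   Employees e
        JOIN   Departments d ON d.Id = e.DepartmentId").ToList();
}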
The article Entity Framework Core 2.0 vs. Dapper performance benchmark, querying SQL Azure tables confirms that Dapper is a bit quicker, but not by enough to ignore the "full ORM" benefits.

DbContext Query performance poor vs ObjectContext [duplicate]

This question already has an answer here:
Is it always better to use 'DbContext' instead of 'ObjectContext'?
(1 answer)
Closed 9 years ago.
I recently moved my entity model from an ObjectContext using 4.1 to a DbContext using 5.0. I'm starting to regret doing that because I'm noticing some very poor performance on queries using the DbContext vs the ObjectContext. Here's the test scenario:
Both contexts use the same database with about 600 tables. LazyLoading and ProxyCreation are turned off for both (not shown in the code example). Both have pre-generated views.
The test first makes one call to load the metadata workspace. Then, in a for loop that executes 100 times, I new up a context and make one call that takes the first 10 rows. (I'm creating the context inside the for loop because this simulates being used in a WCF service, which would create the context every time.)
for (int i = 0; i < 100; i++)
{
    using (MyEntities db = new MyEntities())
    {
        var a = db.MyObject.Take(10).ToList();
    }
}
When I run this with the ObjectContext it takes about 4.5 seconds. When I run it using the DbContext it takes about 17 seconds. I profiled this using Red Gate's performance profiler. For the DbContext, the major culprit seems to be a method called UpdateEntitySetMappings. This is called on every query and appears to retrieve the metadata workspace and cycle through every item in the OSpace. AsNoTracking did not help.
EDIT: To give some better detail, the problem has to do with the creation/initialization of a DbSet vs an ObjectSet, not the actual query. When I make a call with the ObjectContext, it takes on average 42 ms to create the ObjectSet. When I make a call with the DbContext, it takes about 140 ms to create the internal DbSet. Both ObjectSet and DbSet do some entity-set mapping lookups in the metadata workspace. What I've noticed is that DbSet does it for ALL the types in the workspace while ObjectSet does not. I'm guessing (haven't tried it) that with a model with fewer tables the performance difference would be smaller.
I've also been concerned by the underperformance of the code-first approach, and I've performed some benchmarks in a scenario similar to yours:
http://netpl.blogspot.com/2013/05/yet-another-orm-micro-benchmark-part-23_15.html
The results were no surprise; since the DbContext is a wrapper over the ObjectContext, it has to sacrifice performance for the simplicity. However, my tests show that:
the more records you retrieve, the smaller the difference is
the more records you retrieve, the more important it is to turn off tracking if you want to be faster
For example, retrieving just 10 records:
Note that code first is significantly slower than model first, and there is no noticeable difference between tracking and no tracking - both observations are exactly like yours.
However, when retrieving 10000 rows you have:
Note that there is almost no difference between code first and model first in the no-tracking version. Also, both perform surprisingly well, almost as fast as the raw ADO.NET DataReader.
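For reference, the no-tracking variant of the benchmark loop from the question looks like this (same MyEntities context as above):

for (int i = 0; i < 100; i++)
{
    using (MyEntities db = new MyEntities())
    {
        // AsNoTracking skips change-tracker bookkeeping for read-only queries.
        var a = db.MyObject.AsNoTracking().Take(10).ToList();
    }
}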
Please follow my blog entry for more details.
That simple benchmark helped me accept the nature of code first. I still prefer it for smaller projects because of two features: POCO entities and migrations. On the other hand, I would never pick either of the two for a project where performance is a critical requirement. This effectively means that I will probably never use the model-first approach again.
(A side note: my benchmark also reveals that there is something wrong with NHibernate. I still haven't found anyone to help me explain this, even though I've consulted two independent developers who use NH daily.)
DbContext is a wrapper for ObjectContext. Here is a good answer to your question. It is possible that they sacrificed performance to make it easier to use.
I use Simple.Data to query millions of records and it works quite well and fast.

Is there a benefit to override the default Insert/Update/Delete queries for EF

As the question asks really.
The EF modeller tool allows us to map the Insert/Update/Delete functions to a sproc; is there any benefit to overriding them?
If it requires some custom validation then obviously yes, but if I'm happy with how it is now, is it worth creating sprocs for them all?
I can't remember how to view the SQL it's executing to find out the exact query, but I imagine it'd be pretty similar to a standard Insert/Update/Delete query.
I can think of a few cases where it could be useful:
You're working with a legacy database which doesn't quite map to your EF model precisely.
You need extra queries to be executed on insert/update/delete, but you don't have the rights to create triggers on your database.
Soft deletes in your database which you want to abstract away, so that a regular delete actually performs a soft delete.
I'm not quite sure how viable these options are, as I personally am more of an NHibernate guy; these are theoretical options.
As for viewing the executed queries, there are a few ways to do that. You could attach a profiler to your SQL Server instance and look at the raw queries that are executed. There's also Entity Framework Profiler (by Ayende / Oren Eini), which isn't free, but it does make reading and debugging the queries a lot easier.
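If you happen to be on EF6 or later, there is also the built-in Database.Log hook; a minimal sketch (the context and set names are illustrative):

using (var db = new MyEntities())
{
    // EF6+: every command EF sends to the database is written to the console.
    db.Database.Log = Console.Write;

    var item = db.Customers.Find(42);   // the generated SELECT appears in the output
    db.SaveChanges();                   // as do any generated INSERT/UPDATE/DELETE statements
}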
Yes. There is a benefit to overriding them.
Not everybody actually updates or deletes a row of data when an update or delete happens.
In some cases, deleting a record really just means setting an EffectiveUntil date on the existing record and keeping it in the database for historical purposes.
The same can go for an update: instead of updating the existing row, the current row gets its EffectiveUntil date set and a brand new row is inserted with the new data and a null EffectiveUntil date (or a similar mechanism).
By providing Insert/Update/Delete logic to Entity Framework, you are allowed to specify exactly what those operations mean in terms of your database rather than what they mean in the scope of an RDBMS.
As for the second question (which I apparently originally missed): if you're happy with what is currently being generated, then no, it's not worth creating them. You'd just add the extra headache of having to remember to update your stored procedures whenever you change the table structure.
