I'm trying to root-cause slow sign-in performance in my EF6 web app hosted in Azure. Despite all the research I've done, I still don't quite understand why my app is behaving the way it is.
When I first deploy the app and attempt to sign in, it's slow; I understand why, and that's acceptable. Subsequent sign-ins and calls are relatively quick. What I don't understand is why, if I don't interact with the application for maybe 5 minutes (even though my Azure Web App has Always On enabled and is on the Standard pricing tier), the next login is back to taking 20+ seconds.
I also don't quite understand what "first run" means when people say it in regard to Entity Framework being slow. Does it only mean "the first time the web app is accessed by ANYONE", or does it mean something like "when the dbContext is instantiated by ONE SPECIFIC client for the first time, that is THEIR specific first run, and their instance of the app/dbContext is now warmed up and ready"?
The latter doesn't seem to make sense, because I can sign in on one machine, then move to another machine, and it will be relatively quick there as well.
"First time" means first dbContext use (query) after the application starts. But when app is iddle for some time, app pool is restarted and next time you enter the site it will start again. That's why EF takes time when there is no activity for some time.
Have a look at this post about app pool restart in azure
"First Run" would refer to the first time an EF query is run after the Entity Framework assemblies are loaded in to an app domain.
EF is a fairly large set of assemblies and they take a bit of time to load initially. On the first query they will also do a lot of work verifying the model. A lot of this time can be reduced by Pre-caching the views for the model (MSDN). For databases with a lot of mapped tables and stored procs this can be a quite a long time. I've had projects that could take up to 3 minutes to start up. Precaching reduced that to about 10 seconds. It does add quite a bit of complexity to managing schema changes though.
Before the Entity Framework can execute a query or save changes to the data source, it must generate a set of mapping views to access the database. These mapping views are a set of Entity SQL statements that represent the database in an abstract way, and are part of the metadata which is cached per application domain. If you create multiple instances of the same context in the same application domain, they will reuse mapping views from the cached metadata rather than regenerating them. Because mapping view generation is a significant part of the overall cost of executing the first query, the Entity Framework enables you to pre-generate mapping views and include them in the compiled project.
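If you want to pay that cost at application start instead of on a user's first request, another option is to force view generation yourself during startup. Here's a minimal warm-up sketch for EF6, assuming a hypothetical MyContext; the GenerateViews call is what triggers the expensive step:

using System.Collections.Generic;
using System.Data.Entity.Core.Mapping;
using System.Data.Entity.Core.Metadata.Edm;
using System.Data.Entity.Infrastructure;

public static class EfWarmup
{
    // Call once from Application_Start so the first real query
    // doesn't pay the mapping-view generation cost.
    // "MyContext" is a placeholder for your own DbContext type.
    public static void GenerateMappingViews()
    {
        using (var ctx = new MyContext())
        {
            var objectContext = ((IObjectContextAdapter)ctx).ObjectContext;
            var mappingCollection = (StorageMappingItemCollection)objectContext
                .MetadataWorkspace.GetItemCollection(DataSpace.CSSpace);
            mappingCollection.GenerateViews(new List<EdmSchemaError>());
        }
    }
}

Note this only warms the current app domain; it doesn't survive an app pool recycle, which is why the compiled-in pre-generated views are the more thorough fix.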
Related
I'm upgrading an old MVC web application from EF4 to EF6 (and MVC3 to MVC5) using a guide I found on SO.
Functionally it appears OK, but I've noticed performance issues.
Specific requests on the prod environment for this application running MVC3/EF4 finish in under half a second.
The same requests on my dev system after the upgrade take seconds.
For comparison, I've created a new test MVC/EF6 solution on the same dev machine where I am working on the migrated application. I surfaced the LINQ below via an MVC action, and found a big performance difference between the two applications.
Note: neither the old nor the test application has any overhead in the controller constructor; they both only create the dbContext and run the query.
var sites = DB.Sites.Take(50).Include("Users").OrderBy(s => s.SiteName).ToList();
new test EF6 application: 200ms
old application upgraded: 2 seconds
I have profiled the requests on SQL Server, and I can't see any problem there.
I am considering removing the ADO.NET Entity Framework model from the old project and adding it again from scratch, but this was a model-first application, and doing so appears to remove all the partial classes where the metadata has been defined (resulting in lots of compilation errors).
Should I remove the ADO.NET Entity Framework model from the old project and recreate it as a database-first application?
Is there something I have missed that could be causing the issue?
How can I find out where the time is being used?
Edit
I removed the ADO.NET Entity Framework model (edmx) and regenerated it from the database. This resulted in a lot of refactoring, due to pluralisation differences between EF4 and EF6. There were also changes to the Add/Update/Delete entity behaviours.
This hasn't resolved the performance issue.
This one's going to be really hard to debug from afar, but I'd start by capturing the query that's being created by EF with SQL Profiler and checking whether you're missing any indexes on the db. SQL Server is a temperamental beast when it comes to producing query plans, and if EF has changed its queries between 4 and 6, and there's a reasonable amount of data in the tables, this is most likely what's causing your issue. You might find it just needs a new index or something along those lines.
Another option that's not specifically tied to your problem, but could have an effect, is precompiling the views in the EF context, which should reduce the first-query time after the application starts.
Solved:
In the upgraded app's web.config I found the connection string had Pooling=False; I removed this.
Additionally, in my test app I found the connection string had App=EntityFramework.
The performance of the upgraded app immediately improved to match that of the test application.
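For anyone comparing their own config: a connection string with pooling left at its default (enabled) might look roughly like this; the server and database names are illustrative:

<!-- web.config sketch; Data Source and Initial Catalog are illustrative.
     Omitting Pooling=False leaves ADO.NET connection pooling enabled. -->
<connectionStrings>
  <add name="MyContext"
       connectionString="Data Source=.;Initial Catalog=MyDb;Integrated Security=True;App=EntityFramework"
       providerName="System.Data.SqlClient" />
</connectionStrings>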
I'm building a web app that will have 30-35 tables in one database. The thing is, I want to split the app into 3 different front ends (different teams want different things): 3 different projects.
App1 might use 15-20 tables, App2 might use 10, App3 might use 15.
I was planning on making a project called Models that has a dbContext with all the tables in the database, and using that from the web app projects. If I need to add to or update the database, I can just update that one Models project.
A colleague mentioned that you should only include what you need, so I should make 3 separate dbContexts, one per web project, or there will be a performance hit for including unnecessary tables.
To answer the question in the title: no, I haven't seen any performance hit with extremely large DbContexts. In one project I worked on, where the DbContext was defined with close to a thousand DbSets, the configuration time (the time taken to perform the calls to OnConfiguring and OnModelCreating) was around 2 seconds, with every single entity configured through the Fluent API; so you could say the hit is negligible (if there is one at all) for only 35 entities.
That said, whether you use one DbContext or several depends on how you will use them. If there's a clear separation of data where you can clearly say "this table will only be used here" and you won't end up with repeated DbSets, you could keep them separate.
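For illustration, a split along those lines might look roughly like this; the entity and context names are hypothetical:

using System.Data.Entity;

// Hypothetical entities living in one database
public class Customer { public int Id { get; set; } public string Name { get; set; } }
public class Order { public int Id { get; set; } public int CustomerId { get; set; } }
public class Report { public int Id { get; set; } }

// Each front end references only the context (and tables) it needs
public class App1Context : DbContext
{
    public DbSet<Customer> Customers { get; set; }
    public DbSet<Order> Orders { get; set; }
}

public class App2Context : DbContext
{
    public DbSet<Report> Reports { get; set; }
}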
A colleague mentioned [...] there will be a performance hit for including unnecessary tables
When colleagues say things like that, you tell them to either back such claims with evidence or to shut up. Seriously, there's enough cargo-cult programming in the world already. It's the same as colleagues insisting you use String.Empty because it's faster than "", because they read that on a blog once. Hint: it isn't.
It's very healthy to apply criticism to every claim you hear, especially if that claim is not grounded in any reality whatsoever.
Yes, loading a type with more properties will require more disk I/O and more CPU cycles. This will be extremely negligible, though; you will not notice it in the grand scheme of things.*
It becomes quite a different story if you're using an EDMX, though, as loading and parsing those 5 MB of metadata will literally add seconds to your application's startup time.*
*: yes, I'm looking for sources for both of those claims at the moment.
I think it's not a problem from a performance perspective, but I definitely see a challenge from a maintenance perspective.
I experienced a similar situation where we had one edmx-based data model shared across different capabilities, even though each capability focused on only a specific set of tables.
The problem we started facing was that any change to a table specific to one capability required touching the single shared data model, and it also led to unnecessary merge conflicts during check-ins.
I'm using Entity Framework 6 against an existing SQL Server database (database first, so there's an EDMX in my project).
I've noticed that the first time I request an entity, it can take up to thirty seconds for the query to be executed. Subsequent queries for the same object then complete in a matter of milliseconds. The actual SQL being executed is very fast, so it's not a slow query.
I've found that Entity Framework generates views in the background and that this is the most likely culprit. What I haven't found, however, is a good solution. There's a NuGet package that can handle the view generation (EFInteractiveViews), but it hasn't been updated since 2014 and I can hardly find any information on how to use it.
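For what it's worth, the package's readme suggests usage boils down to pointing the context at a view cache before its first use; a sketch based on that readme, where the context type and file path are illustrative:

using InteractivePreGeneratedViews;

// Views are generated on the first run and written to the XML file;
// subsequent application starts load them from the file instead of
// regenerating them. Call this before the context is first used.
using (var ctx = new MyContext())
{
    InteractiveViews.SetViewCacheFactory(
        ctx,
        new FileViewCacheFactory(@"C:\MyViews\views.xml"));
}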
What options do I have nowadays? I've tried initializing Entity Framework in Application_Start by running a few queries, but this doesn't seem to help much. It's also quite difficult to perform the real queries in Application_Start, because most queries use data from the current user (who is not yet logged on at that point), so it's hard to run them in advance.
I've thought about creating an ashx file that constantly polls the application by calling the API to keep it alive. I've also set the Application Pool to "AlwaysRunning" so that EF doesn't restart when the app pool is recycled.
Does anyone have any tips or ideas on how I can resolve this or things I can try?
Thanks a lot in advance. I've spent the better part of two days already searching for a viable solution.
There are many practices to speed up Entity Framework; I will mention some of them:
Turn off lazy loading (open the EDMX file, right-click anywhere => Properties => set Lazy Loading Enabled to false).
Use AsNoTracking().ToList() for read-only queries, and when you want to update, use Attach and set the object's state to EntityState.Modified (see the sketch after this list).
Use indexes on your tables.
Use paging; do not load all the data at once.
Split your Edmx into several smaller ones and only include the entities you need on each page (this will affect performance in a good way).
If you want to load related objects, "be eager and not lazy": use Include. You may need a using System.Data.Entity; directive to get the lambda Include overloads.
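A quick sketch of the no-tracking read and the attach-then-modify update from the second point; db, People, and person are illustrative names:

using System.Data.Entity;
using System.Linq;

// Read-only query: AsNoTracking skips change-tracking overhead
var people = db.People.AsNoTracking().ToList();

// Updating a detached entity later: attach it, mark it modified, save
db.People.Attach(person);
db.Entry(person).State = EntityState.Modified;
db.SaveChanges();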
An example of splitting your Edmx:
If you have the following objects for a rent-a-car app: Country, City, Person, Car, Rent, Gender, Engine, Manufacturer, etc.
If you are working on a screen to manage (CRUD) a Person, you don't need Car, Rent, or Manufacturer, so create a ManagePerson.edmx containing Country, City, Person, and Gender.
If you are working on managing (CRUD) a Car, you don't need Person, City, Gender, or Rent, so create a ManageCar.edmx containing Car, Manufacturer, Country, and Engine.
Entity Framework must first compile and translate your LINQ queries into SQL, but after that it caches them. The first hit to a query is always going to take longer, but as you mention, after that the query runs very quickly.
When I first used EF it was constantly an issue brought up by testers, but when the system went live and was used frequently (and queries were cached) it wasn't an issue.
See Hadi Hassan's answer for general speed-up tips.
EDIT:
This question is not about compiled queries; it is about generating the EF database views at compile time.
From the ADO.NET team blog: Exploring the Performance of the ADO.NET Entity Framework - Part 1:
View Generation 56% – A big part of creating an abstracted view of the database is providing the actual view for queries and updates in the store's native language. During this step, the store views are created. The good news is there is a way of making view generation part of the build process so that this step can be avoided at run time.
The first database call in my web app takes about 2.5 seconds instead of ~30 ms for subsequent identical calls.
I generated a precompiled view source file using the T4 template from the ADO.NET team blog, but it has made no detectable difference.
The T4 template takes about 2.5 seconds to run and the generated code compiles.
What am I missing?
Fixed it!
The generated views class (derived from EntityViewContainer) must be in the assembly that contains the STOs (Self-Tracking Objects), not the one that contains the edmx model.
On your first call to the application after a build, the web server will have unloaded the application, and that first call "starts" the application again, along with everything associated with that. This has nothing to do with pre-compiled views; the first call will always take longer because the application is being loaded.
Incidentally, the same thing will happen on your production server. An idle worker process may well be unloaded, and the next call will load the application again, taking significantly longer than usual requests.
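For reference, on IIS 8+ the usual mitigation is to combine an always-running application pool with application preload (the Application Initialization module); an applicationHost.config sketch, with illustrative pool and site names:

<!-- applicationHost.config sketch (IIS 8+); names are illustrative -->
<applicationPools>
  <add name="MyAppPool" startMode="AlwaysRunning" />
</applicationPools>
<sites>
  <site name="MySite">
    <application path="/" applicationPool="MyAppPool" preloadEnabled="true" />
  </site>
</sites>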
I am making a member-based web app in ASP.NET MVC3 and I am trying to plan ahead. At first our user base will not be huge, but as with any software, the potential for a sudden volume spike is always a possibility.
Thinking ahead to this scenario, I know that the database is the bottleneck area of most web apps. We are using MSSQL 2008 R2 and will have dedicated servers with several client databases each. Each client has their own database, so if one server begins to bottleneck we can scale vertically, or move some of the databases to a new server and begin filling it up.
To access the databases we primarily use LINQ to SQL, and we are currently refactoring some of our code to make use of the IQueryable mechanisms to lazy-load content, but each page contains quite a bit of content from various parts of the database.
We also have a few large databases that are used for widgets in the program; they rarely change but have millions of rows. The goal with those is to somehow sync them to the primary source, distribute them across several machines, and then load-balance those servers.
With this layout, should I even worry about caching, or will the built-in caching mechanisms in MSSQL be sufficient?
If so, where should I begin? I have looked briefly at AppFabric, but it looks as though it is for Azure only?
Resources:
How to cache data in a MVC application
http://stephenwalther.com/blog/archive/2008/08/28/asp-net-mvc-tip-39-use-the-velocity-distributed-cache.aspx
http://stephenwalther.com/blog/archive/2008/08/29/asp-net-mvc-tip-40-don-t-cache-pages-that-require-authentication.aspx
Lazy loading is a performance killer. It's better to load the entire object graph with one join than to lazy-load the other properties. This is especially the case with a list of objects: if you iterate, you'll end up lazy loading for each item in the list. Furthermore, every call to the db has overhead; fewer calls = better performance.
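To make the iteration point concrete, here is a sketch of the N+1 pattern versus a single eager-loaded query; db, Sites, and Users are illustrative, and Include is the EF spelling (LINQ to SQL uses DataLoadOptions for the same idea):

using System.Data.Entity; // for the lambda Include extension
using System.Linq;

// N+1: one query for the list, then one more query per item as each
// Users collection is lazily loaded inside the loop
var sites = db.Sites.ToList();
foreach (var site in sites)
{
    var userCount = site.Users.Count; // triggers a query per site
}

// Eager loading: a single joined query fetches the whole graph
var sitesWithUsers = db.Sites.Include(s => s.Users).ToList();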
SO was a top 1000 website before it needed two database servers. I think you'll be ok.
If your revenue model says "each client will have its own database", then your scaling issues should be really easy to solve. It sounds like you already have a plan to scale with more servers as your client base increases. What's the problem?
Caching on the web tier is usually the first scaling fix you'll have to worry about. You probably don't need to do a fresh db call with each page request.
Overall this sounds like a lot of premature optimization. Your traffic hasn't reached a point where you need to worry about scaling. Make these kinds of decisions at the last possible moment.
The database cache is different from most caches: it can of course load frequently used data into memory and re-use query plans, but that isn't really a cache as such.
AppFabric is definitely not just Azure; after all, if it was, you wouldn't be able to install it (and use it) locally :) But in truth there is little to choose between AppFabric, Redis and memcached (the latter lacks persistence, of course).
But I think you should initially look at using the inbuilt ASP.NET caching: both data caching via HttpContext.Cache, and caching of entire responses (or, in MVC 3, partials). Obviously you should have a broad idea of what data is used heavily by lots of requests and is safe to re-use: cache that!
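As a sketch of the data-caching side, assuming a hypothetical, heavily-read and rarely-changing Widget list (MyContext and Widgets are illustrative names):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.Caching;

public static class WidgetCache
{
    // Returns the cached list when present; otherwise loads it from the
    // database and caches it for five minutes.
    public static List<Widget> GetWidgets()
    {
        const string key = "widgets";
        var cached = HttpContext.Current.Cache[key] as List<Widget>;
        if (cached != null) return cached;

        List<Widget> widgets;
        using (var db = new MyContext()) // illustrative data context
        {
            widgets = db.Widgets.ToList();
        }
        HttpContext.Current.Cache.Insert(key, widgets, null,
            DateTime.UtcNow.AddMinutes(5), Cache.NoSlidingExpiration);
        return widgets;
    }
}

Note the sketch re-adds a fresh list rather than mutating the cached one, which matters for the next point.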
Just make sure you treat all cached data as immutable (if you need to update the cache, re-add a modified value; don't modify the existing objects). The reason: it won't behave the same once you need distributed caching, which uses serialization, so any in-place changes you make won't be seen by the next request.