PetaPoco repository pattern: significant performance issues - C#

We are using the PetaPoco repository pattern (similar to this blog post). As the page loads we open the repository, run the query, dispose of it and carry on processing. This is fine on light pages, but when this happens a few times in the page we see quite significant performance degradation.
I had assumed, perhaps wrongly, that connection pooling (which is enabled) would cope with this.
I ran a couple of tests.
The page it's on (an ASPX page) takes around 1.2 seconds to load at the moment. It runs around 30 database queries and, looking at the profiler, does a login and logout per query (even with connection pooling).
If I persist the connection and don't close it until the page ends, this drops to around 70ms, which is quite a significant saving.
Perhaps we need to keep the Database object hanging around for the request, but I didn't think PetaPoco had this much of an overhead...particularly with the connection pooling.
I created a test app to demonstrate this: loading a user 1000 times takes 230ms if the repository is reused, but 3.5 seconds if the repository is recreated every time.
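For reference, here is a minimal sketch of that comparison. UserRepository and User are hypothetical stand-ins for our real classes, and the connection string name is made up; the PetaPoco calls are written from memory of its API.

using System;
using System.Diagnostics;

class RepoBenchmark
{
    static void Main()
    {
        var sw = Stopwatch.StartNew();

        // Recreate the PetaPoco Database (the repository's backing object) on every call.
        for (int i = 0; i < 1000; i++)
        {
            using (var db = new PetaPoco.Database("MainConnectionString"))
            {
                var user = db.SingleOrDefault<User>("WHERE UserId = @0", 1);
            }
        }
        Console.WriteLine(sw.ElapsedMilliseconds);   // ~3500ms in our test

        // Reuse one Database instance for all 1000 calls.
        sw.Restart();
        using (var db = new PetaPoco.Database("MainConnectionString"))
        {
            for (int i = 0; i < 1000; i++)
            {
                var user = db.SingleOrDefault<User>("WHERE UserId = @0", 1);
            }
        }
        Console.WriteLine(sw.ElapsedMilliseconds);   // ~230ms in our test
    }
}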

The way you are using the connections is breaking connection pooling best practices.
Nowhere does it say to get rid of the connection after every statement. I normally keep a connection/repository around while doing processing and only close it when my function is finished (with MVC).
Yes, even the connection pool has overhead, and you seem to be determined to make that overhead show.

What I always do is create a single instance of my repository per request. Because I develop almost exclusively with the MVC pattern, that means I create a private class-level variable in each of my controllers and use it to service any requests within my action methods. Translated to WebForms (ASPX), that means I would create one in PreLoad (or whatever the event just before Page_Load is) and pass it around as needed. I don't think keeping a class-level instance is a good idea for WebForms, though, but I can't remember enough to be sure.
The rule of thumb is to use one instance of your repo (or any other class, really) for the entirety of your request, which is usually a page load or an Ajax call, and for the reasons you've pointed out.
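As a minimal sketch of the controller-level approach (UserRepository here is a hypothetical PetaPoco-backed repository that owns a PetaPoco.Database and implements IDisposable):

using System.Web.Mvc;

public class UsersController : Controller
{
    // One repository per request: MVC creates a new controller instance for every request.
    private readonly UserRepository _repo = new UserRepository();

    public ActionResult Details(int id)
    {
        var user = _repo.GetById(id);
        return View(user);
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
            _repo.Dispose();   // closes the underlying connection when the request ends
        base.Dispose(disposing);
    }
}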
Remember: information on the internet is free, and you get what you pay for.

Related

Web API 2 controller and handler have a five-second delay between them

Recently the web requests on my Web API 2 / Entity Framework 6.1 server have slowed drastically, adding ~5000ms to all requests that query the database. I've spent the last three days tearing my hair out trying to figure it out.
Setup:
Web Api 2.2
Entity Framework 6.1.1
Autofac for IoC, DbContext is InstancePerLifetimeScope() along with everything else.
One custom HttpParameterBinding for getting entity IDs from an access token. This does query the db.
Only one DelegatingHandler, for logging requests
What I've done:
Pre-generated views, slight improvement
Reduced properties in entities we query, no improvement
Turned off AutoTrackChanges, no improvement
Tried AsNoTracking() on a number of requests, no improvement
Profiling with Ant Performance Profiler, nothing useful
Profiling database with SQL Management Studio, the queries are all fast
Why do I say there's a delay between the handler and the controller? I timed it with DateTime.Now at the beginning and end of the controller action: 1745ms. The logging handler takes a time before and after the await base.SendAsync(request, cancellationToken): 6234ms. I timed the binding as well: only 2ms.
That's 4489ms of time that's unaccounted for. Other requests have similar timings. It happens after the logging handler gets the request and reports it, but before the binding starts. What happens in there? Where is it coming from? We don't have any async void methods that spin off, and we don't have any per-request actions that should take that long. Totally stumped.
Edit: Repeating the same request does not improve the performance. I don't believe first-hit performance is the issue; it's consistently poor.
I appreciate the help; I did end up finding the answer.
We had services being injected into the controllers whose constructors were making potentially async calls to preload some data. Changing them to use AsyncLazy was the solution.
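As a rough illustration of the change (CatalogService, IItemStore and Item are hypothetical names, and AsyncLazy<T> is the one from Stephen Cleary's Nito.AsyncEx package or an equivalent hand-rolled class):

using System.Collections.Generic;
using System.Threading.Tasks;
using Nito.AsyncEx;

public class CatalogService
{
    // Before, the constructor blocked on the async preload, e.g.
    //     _items = store.LoadAllAsync().Result;   // can stall or deadlock request processing
    // Now the load is deferred until something actually awaits it.
    private readonly AsyncLazy<List<Item>> _items;

    public CatalogService(IItemStore store)
    {
        _items = new AsyncLazy<List<Item>>(() => store.LoadAllAsync());
    }

    public async Task<List<Item>> GetItemsAsync()
    {
        return await _items;   // the first awaiter triggers the load; later awaiters reuse the same task
    }
}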
Some steps that might help others in similar situations:
Ever played the board game Guess Who? Debugging is strikingly similar. You want to ask questions that knock out half of the remaining possibilities. Don't start with "is it this specific method that I felt dirty about?"; instead start with:
What works and what doesn't work? Find the differences. That's your problem set.
Narrow down the problem set with generic questions. Find the shared similarities and get rid of them. Is it async calls? (Thanks, commenter.) Is it deadlock-like stuff? (Thanks again.) Is it first hit or initial loading?
Once you've removed the shared similarities it's time to start commenting out code. I narrowed it down to a constructor that was being injected with three objects and not doing any work itself. When it wasn't the first two objects, I knew where my problem was!

EF6, Windows Service & Database polling

I have a Windows service that is polling a database. I am using EF6 and LINQ to do my queries and updates, etc.
The polling needs to be as often as possible, probably every 2 seconds or something in that area.
My gut tells me to have one connection and keep it open while my service is running, but something else tells me to open and close the connection every time. I feel that the latter will slow it down (will it really slow it down that much?).
What are the best practices when it comes to polling a database within a windows service? Should I really be polling my database so often?
I think you should dispose of the context frequently and create a new one every time you poll the database.
The main reason is that unless you disable object tracking (really only suitable for read only operation), the context gets bigger and bigger over time, with each successive polling operation loading more data into the context's cache. As well as the increase in memory this causes, SaveChanges() gets slower as the ObjectContext then looks for changes in the objects which are attached to it.
If the connection is lost for any reason, you'll also have a hard time associating a new connection with the existing context. Regardless, based on my own experience, it won't slow anything down: EF context objects are quick to construct after the first one, because the model is cached on first load.
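A minimal sketch of that pattern; PollingDbContext, PendingJob and the job-processing line are hypothetical, and the CancellationToken would come from the service's stop handling:

using System;
using System.Linq;
using System.Threading;

public class PollingWorker
{
    // Runs on a background thread started from the service's OnStart.
    public void Run(CancellationToken cancellation)
    {
        while (!cancellation.IsCancellationRequested)
        {
            using (var db = new PollingDbContext())   // fresh, short-lived context per poll
            {
                var due = db.PendingJobs.Where(j => !j.Processed).ToList();

                foreach (var job in due)
                {
                    // ... do the actual work for this job ...
                    job.Processed = true;
                }

                db.SaveChanges();   // change tracking stays cheap because the context is new each time
            }

            Thread.Sleep(TimeSpan.FromSeconds(2));
        }
    }
}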
I wouldn't worry about polling every 2 seconds. That seems totally reasonable to me.
As an aside, if you're using SQL Server, you can use SqlDependency to fire an event when data changes, but polling is the most reliable option.
http://msdn.microsoft.com/en-us/library/62xk7953(v=vs.110).aspx
Alternatively, if you're dead set against polling, you could look at using a Message Broker system like RabbitMQ and updating your apps to use it, but be prepared to lose a couple of weeks implementing the infrastructure.

Resources for learning how to handle a heavy-traffic ASP.NET MVC site?

How much traffic is heavy traffic? What are the best resources for learning about heavy-traffic web site development? What are the typical approaches?
There are a lot of principles that apply to any web site, irrespective of the underlying stack:
Use HTTP caching facilities. For one, there is the user-agent cache. Second, the entire web backbone is full of proxies that can cache your requests, so use this to full advantage. A request that doesn't even land on your server adds zero load; you can't optimize better than that :)
Corollary to the point above: use CDNs (Content Delivery Networks, like CloudFront) for your static content. CSS, JPG, JS, static HTML and many more files can be served from a CDN, saving the web server an HTTP request.
Second corollary to the first point: add expiration caching hints to your dynamic content (see the sketch after this list). Even a short cache lifetime like 10 seconds will save a lot of hits, which will instead be served by the proxies sitting between the client and the server.
Minimize the number of HTTP requests. It seems basic, but it is probably the most overlooked optimization available. In fact, Yahoo's best practices put this as the topmost optimization; see Best Practices for Speeding Up Your Web Site. Here is their best practices list:
Minimize HTTP Requests
Use a Content Delivery Network
Add an Expires or a Cache-Control Header
Gzip Components
... (the list is quite long actually, just read the link above)
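As promised above, a minimal sketch of adding expiration hints to dynamic ASP.NET MVC output (the action names and the 10-second lifetime are only examples):

using System;
using System.Web;
using System.Web.Mvc;
using System.Web.UI;

public class HomeController : Controller
{
    // Declarative: ASP.NET emits the Cache-Control/Expires headers and caches the rendered output.
    [OutputCache(Duration = 10, Location = OutputCacheLocation.Any)]
    public ActionResult Index()
    {
        return View();
    }

    // Imperative equivalent: set the headers by hand.
    public ActionResult Dashboard()
    {
        Response.Cache.SetCacheability(HttpCacheability.Public);
        Response.Cache.SetMaxAge(TimeSpan.FromSeconds(10));
        Response.Cache.SetExpires(DateTime.UtcNow.AddSeconds(10));
        return View();
    }
}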
Now, after you've eliminated as many superfluous hits as possible, you're still left with optimizing whatever requests actually do hit your server. Once your ASP code starts to run, everything will pale in comparison with the database requests:
Reduce the number of DB calls per page. The best optimization possible is, obviously, not to make the request to the DB at all. Some say 4 reads and 1 write per page are the most a high-load server should handle, others say one DB call per page, still others say 10 calls per page is OK. The point is that fewer is always better than more, and writes are significantly more costly than reads. Review your UI design; perhaps that hit count in the corner of the page that nobody sees doesn't need to be that accurate...
Make sure every single DB request you send to the SQL server is optimized. Look at each and every query plan, make sure you have proper covering indexes in place, make sure you don't do any table scans, review your clustered index design strategy, review all your IO load, storage design, etc. Really, there is no shortcut you can take here; you have to analyze and optimize the heck out of your database, because it will be your choking point.
Eliminate contention. Don't have readers wait for writers. For your stack, SNAPSHOT ISOLATION is a must.
Cache results. And usually this is where the cookie crumbles. Designing a good cache is actually quite hard to pull off. I would recommend you watch the Facebook SOCC keynote, Building Facebook: Performance at Massive Scale. Somewhere around slide 47 they show what a typical internal Facebook API looks like:
cache_get(
    $ids,
    'cache_function',
    $cache_params,
    'db_function',
    $db_params);
Everything is requested from a cache first and, if not found, requested from their MySQL back end. You probably won't start with 60,000 servers though :)
On the SQL Server stack the best caching strategy is one based on Query Notifications. You can almost mix it with LINQ...
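A rough sketch of that idea using SqlCacheDependency, which sits on top of Query Notifications. The query, cache key and connection string name are made up, and Service Broker/query notifications must be enabled for this to work:

using System;
using System.Configuration;
using System.Data;
using System.Data.SqlClient;
using System.Web;
using System.Web.Caching;

public static class ProductCache
{
    public static DataTable GetActiveProducts()
    {
        var cached = (DataTable)HttpRuntime.Cache["ActiveProducts"];
        if (cached != null)
            return cached;

        // SqlDependency.Start(connectionString) must have been called once, e.g. in Application_Start.
        var connectionString = ConfigurationManager.ConnectionStrings["Main"].ConnectionString;
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "SELECT ProductId, Name FROM dbo.Products WHERE Active = 1", connection))
        {
            // Create the dependency before executing; it invalidates the entry when the result set changes.
            var dependency = new SqlCacheDependency(command);

            var table = new DataTable();
            connection.Open();
            table.Load(command.ExecuteReader());

            HttpRuntime.Cache.Insert("ActiveProducts", table, dependency);
            return table;
        }
    }
}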
I will define heavy traffic as traffic which triggers resource-intensive work. Meaning, if one web request triggers multiple SQL calls, or they all calculate pi to a lot of decimal places, then it is heavy.
If you are returning static HTML, then your bandwidth is more of an issue than what a good server today can handle (more or less).
The principles for optimizing for speed are the same whether or not you use MVC.
A decoupled architecture makes it easier to scale by adding more servers, etc.
Use a repository pattern for data retrieval; it makes adding a cache easier (see the sketch after this list).
Cache data which is expensive to query.
Data to be written could be written through a cache, so that the client doesn't have to wait for the actual database commit.
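A minimal sketch of the repository-plus-cache idea; the IProductRepository interface, the key and the five-minute lifetime are made up for illustration:

using System;
using System.Collections.Generic;
using System.Runtime.Caching;

public interface IProductRepository
{
    IReadOnlyList<string> GetProductNames();
}

// Wraps any repository and serves repeated reads from an in-memory cache.
public class CachedProductRepository : IProductRepository
{
    private readonly IProductRepository _inner;
    private readonly MemoryCache _cache = MemoryCache.Default;

    public CachedProductRepository(IProductRepository inner)
    {
        _inner = inner;
    }

    public IReadOnlyList<string> GetProductNames()
    {
        var cached = (IReadOnlyList<string>)_cache.Get("ProductNames");
        if (cached != null)
            return cached;

        var fresh = _inner.GetProductNames();
        _cache.Set("ProductNames", fresh, DateTimeOffset.UtcNow.AddMinutes(5));
        return fresh;
    }
}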
There are probably more ground rules as well. Maybe you could say something about the architecture of your application and how much load you need to plan for?
MSDN has some resources on this. This particular article is out of date, but it's a start.
I would also suggest not limiting yourself to reading about the MVC stack: many of the principles are cross-platform.

Some questions coming from application programming (C#/Visual C++) to ASP.NET (C#)

At the new place I am working, I've been tasked with developing a web-application framework. I am new (6 months or so) to the ASP.NET framework and things seem pretty straightforward, but I have a few questions that I'd like to ask you ASP.NET professionals. I'll note that I am no stranger to C#.
Long life objects/Caching
What is the preferred method of dealing with objects that you don't want to re-initialize every time a page is hit? I noticed that there is a cache manager that can be used, but are there any caveats to using it? For example, I might want to cache various things, and I was thinking about writing a wrapper around the cache that prefixes cache names so that I could implement different caches using the same underlying .NET cache manager.
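Something like this sketch is what I have in mind (the class name, key names and HttpRuntime.Cache usage are just my assumption of how it could be wired up):

using System;
using System.Web;

// A thin wrapper that namespaces keys so several logical caches can share the one .NET cache.
public class PrefixedCache
{
    private readonly string _prefix;

    public PrefixedCache(string prefix)
    {
        _prefix = prefix + ":";
    }

    public void Set(string key, object value, TimeSpan lifetime)
    {
        HttpRuntime.Cache.Insert(
            _prefix + key,
            value,
            null,                                  // no cache dependency
            DateTime.UtcNow.Add(lifetime),         // absolute expiration
            System.Web.Caching.Cache.NoSlidingExpiration);
    }

    public T Get<T>(string key) where T : class
    {
        return HttpRuntime.Cache.Get(_prefix + key) as T;
    }
}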
1) Are there any design considerations I need to think about for the objects that I want to cache?
2) If I want to implement a manager of some kind that is around for the lifetime of the web application (thread-safe, obviously), is it enough to initialize it during app_start and kill it in app_end? Or is this practice frowned upon, with any managers created uniquely in the constructor/init method of the page being served?
3) If I have a long-term object initialized at app start, is it likely to be affected when the app pool is recycled? If it is destroyed at app end, is it a case of it simply getting destroyed and then recreated again? I am fine with this restriction, I just want to get a little clearer :)
Long Life Threads
I've done a bit of research on this, and this question is probably redundant. It seems it is not safe to start a worker thread in the ASP.NET environment; instead, a Windows service should be used for long-running tasks. The latter isn't exactly a problem (the target environments will have the facility to install services), but I just wanted to double-check that this is absolutely necessary. I understand threads can throw exceptions and die, but I do not understand the reasoning behind prohibiting them. If .NET provided a thread framework that encompassed System.Thread but also provided notifications for when the application server was going to recycle the app pool, we could actually do something about it rather than just keel over and die at the point we were stopped.
Are there any solutions for threading in ASP.NET, or is it basically "use a service"?
I am sure I'll have more queries, but this is it for now.
EDIT: Thank you for all the responses!
So here's the main thing that you're going to want to keep in mind: IIS may get reset, or may reset itself (based on criteria), while you're working. You can never know when that will happen unless it stops rendering your page while you're waiting on the response (in which case you'll eventually get a browser notice that the page stopped responding).
Threads
This is why you shouldn't use threads in ASP.NET apps. However, that's not to say you can't. Once again, you'll need to configure the IIS engine properly (I've had it hang when spawning a lot of threads, but that may have been machine-dependent). If you can trust that nobody will cause ASP.NET to recompile your code or restart your application (by saving the web.config, for instance) then you will have fewer issues than you might otherwise.
Instead of running a Windows service, you could use an ASMX or WCF service, which also runs on IIS/.NET. That's up to you. With multiple application pools it lets you keep everything "in the same environment" as far as installations and builds are concerned, although they obviously don't share the same process pool/memory space.
"You're Wrong!"
I'm sure someone will read this far and go "but you can't thread in ASP.NET!!!", so here's the link that shows you how to do it, from that venerable MSDN: http://msdn.microsoft.com/en-us/magazine/cc164128.aspx
Now onto Long life objects/Caching
Caching
So it depends on what you mean by caching. Is it per user, per system, per application, per database, or per page? Each is possible, but takes some contrivance and complexity, depending on your needs.
The simplest way to do it per page is with static variables. This is also highly dangerous if you're using it for user data, because there's no indication to the end user that the variable is going to change if more than one user uses the page. Instead, if you need something to live with the user while they work with a particular page, you can either stuff it into Session (server-side caching that stays with the user and can be used across multiple pages) or stick it into ViewState.
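For example, in a WebForms code-behind (the key names and values here are arbitrary):

// Inside a class deriving from System.Web.UI.Page:
protected void RememberSelection(int customerId)
{
    // Server-side; follows the user across pages for the lifetime of their session.
    Session["SelectedCustomerId"] = customerId;

    // Serialized into the page itself; scoped to this page only.
    ViewState["SortColumn"] = "Name";
}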
The cache manager you reference above would be good for application-style caching, where everyone using the web app shares the same data store. That might be good for intensive queries where you want to get the values back as quickly as possible, so long as they're not stale. That's up to you to decide. Also, things like application settings could be stored there, if you use a database layer for storage.
Long term cache objects
You could initialize it in app_start with no problem, and the same goes for destroying it at the end if you feel the need, but yes, you do need to watch out for what I described at the start about the system throwing all your code out and restarting.
Keel over and die
But you don't get notified when you (the app pool, here) are going to be restarted (as far as I know), so you can pretty much keel over and die on anything. Always assume the app is going to go down before your request, and that every request is the first one.
Really though, that just leads back to web design in the first place. You don't know whether this is the first visitor or the fifty-millionth (unless you're storing that information in memory, of course), so just as the app is stateless, you also need to plan your architecture to be as stateless as possible. That's where web apps are great.
If you need state on a regular basis, consider sticking with desktop apps. If you can live with statelessness, welcome to ASP.NET and web development.
1) The main thing about caching is understanding the lifetime of the cache and the effects of caching (particularly large) objects. Consider caching a 1MB object that is generated each time your default.aspx page is hit; after a year in production you're getting 10,000 hits an hour, and the object lifetime is 2 hours. You can easily chew up tons of memory, which can hurt performance and may also cause things to be prematurely evicted from the cache, which in turn can cause other issues. As long as you understand the effects of all of this, you're fine.
2) Starting it up in Application_Start and shutting it down in Application_End is fine. You could also implement a custom HttpApplication with an HTTP module.
3) Yes, when your app pool is recycled it calls Application_End and everything is shut down and destroyed.
4) (Threads) The issue with threads comes up in relation to scaling. If that default.aspx page fires up a thread, and the page gets hit 10,000 times in 2 minutes, you could potentially have a ton of threads running in your application pool. Again, as long as you understand the ramifications of firing up a thread, you can do it. The ThreadPool is another story: the ASP.NET runtime uses the ThreadPool to process requests, so if you tie up all the ThreadPool threads, your application can hang because there isn't a thread available to process a request.
1) Are there any design considerations I need to think about for the objects that I want to cache?
2) If I want to implement a manager of some kind that is around for the lifetime of the web application (thread-safe, obviously), is it enough to initialize it during app_start and kill it in app_end? Or is this practice frowned upon, with any managers created uniquely in the constructor/init method of the page being served?
There's a difference between data caching and output caching. I think you're looking for data caching, which means caching some object for use in the application. This can be done via HttpContext.Current.Cache. You can also cache page output and vary it on conditions so the page logic doesn't have to run at all; that functionality is also built into ASP.NET. Something to keep in mind when doing data caching is that you need to be careful about the scope of the things you cache. For example, when using Entity Framework, you might be tempted to cache an object that's been retrieved from the DB. However, if your DB context is scoped per request (a new one for every user visiting your site, which is probably the correct way), then your cached object will rely on that DB context for lazy loading, but the context will have been disposed of once the first request ends.
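A short sketch of the data-caching case; MyDbContext and Country are hypothetical EF6 types, and the point is the .ToList() that materializes the data before caching so nothing lazy-loads from a disposed context:

using System;
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;
using System.Web;
using System.Web.Caching;

public static class CountryLookup
{
    public static List<Country> GetAll(MyDbContext db)
    {
        var cached = (List<Country>)HttpContext.Current.Cache["AllCountries"];
        if (cached != null)
            return cached;

        // Materialize the results before caching so the cached objects
        // no longer depend on the per-request context for lazy loading.
        var countries = db.Countries.AsNoTracking().ToList();

        HttpContext.Current.Cache.Insert(
            "AllCountries",
            countries,
            null,
            DateTime.UtcNow.AddHours(1),
            Cache.NoSlidingExpiration);

        return countries;
    }
}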
3) If I have a long-term object initialized at app start, is it likely to be affected when the app pool is recycled? If it is destroyed at app end, is it a case of it simply getting destroyed and then recreated again? I am fine with this restriction, I just want to get a little clearer :)
Perhaps the biggest issue with threading in ASP.NET is that it runs in the same process as all your requests. Even if this weren't an issue in and of itself, IIS can be configured (and if you don't own the servers, almost certainly will be configured) to shut down the app if it's inactive (which you mentioned), and that can cause issues for these threads. I have seen solutions to that ranging from making sure IIS never recycles the app pool to spawning a thread that hits the site to keep it alive, even on hosted servers.

NHibernate session management?

Firstly, let me give a brief description of the scenario. I'm writing a simple game where pretty much all of the work is done on the server side, with a thin client for players to access it. A player logs in or creates an account and can then interact with the game by moving around a grid. When they enter a cell, they should be informed of other players in that cell, and similarly, other players in that cell will be informed of that player entering it. There are lots of other interactions and actions that can take place, but it's not worth going into detail on them as it's just more of the same. When a player logs out and back in, or if the server goes down and comes back up, all of the game state should persist, although if the server crashes, it doesn't matter if I lose 10 minutes or so of changes.
I've decided to use NHibernate and a SQLite database, so I've been reading up a lot on NHibernate, following tutorials and writing some sample applications, and am thoroughly confused as to how I should go about this!
The question I have is: what's the best way to manage my sessions? Just from the small amount that I do understand, all these possibilities jump out at me:
Have a single session that's always opened that all clients use
Have a single session for each client that connects and periodically flush it
Open a session every time I have to use any of the persisted entities and close it as soon as the update, insert, delete or query is complete
Have a session for each client, but keep it disconnected and only reconnect it when I need to use it
Same as above, but keep it connected and only disconnect it after a certain period of inactivity
Keep the entities detached and only attach them every 10 minutes, say, to commit the changes
What kind of strategy should I use to get decent performance given that there could be many updates, inserts, deletes and queries per second from possibly hundreds of clients all at once, and they all have to be consistent with each other?
Another smaller question: how should I use transactions in an efficient manner? Is it fine for every single change to be in its own transaction, or is that going to perform badly when I have hundreds of clients all trying to alter cells in the grid? Should I try to figure out how to bulk together similar updates and place them within a single transaction, or is that going to be too complicated? Do I even need transactions for most of it?
I would use a session per request to the server, and one transaction per session. I wouldn't optimize for performance before the app is mature.
Answers to your proposed solutions:
Have a single session that's always open that all clients use: you will have performance issues here because the session is not thread-safe and you will have to lock all calls to it.
Have a single session for each client that connects and periodically flush it: you will have performance issues here because all data used by the client will be cached; you will also see problems with stale data from the cache.
Open a session every time you have to use any of the persisted entities and close it as soon as the update, insert, delete or query is complete: you won't have any performance problems here. A disadvantage is possible concurrency or corrupt-data problems, because related SQL statements are not executed in the same transaction.
Have a session for each client, but keep it disconnected and only reconnect it when you need to use it: NHibernate already has built-in connection management, and it is already well optimized.
Same as above, but keep it connected and only disconnect it after a certain period of inactivity: this will cause problems because the number of SQL connections is limited, which will also limit the number of users of your application.
Keep the entities detached and only attach them every 10 minutes, say, to commit the changes: this will cause problems because of stale data in the detached entities. You will have to track changes yourself, and you will end up with a piece of code that looks like the session itself.
It would be useless to go into more detail now, because I would just be repeating the manuals/tutorials/books. With a session per request, you probably won't have problems in 99% of the application you describe (and maybe none at all). The session is a lightweight, non-thread-safe class that is meant to live for a very short time. When you want to know exactly how the session/connection/caching/transaction management works, I recommend reading a manual first and then asking more detailed questions about the unclear subjects.
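A minimal sketch of session-per-request with one transaction per session (sessionFactory would be the application's single ISessionFactory, and Player is a hypothetical mapped entity):

using NHibernate;

public class MoveHandler
{
    private readonly ISessionFactory _sessionFactory;

    public MoveHandler(ISessionFactory sessionFactory)
    {
        _sessionFactory = sessionFactory;
    }

    // Called once per client request to the game server.
    public void MovePlayer(int playerId, int cellId)
    {
        using (ISession session = _sessionFactory.OpenSession())
        using (ITransaction tx = session.BeginTransaction())
        {
            var player = session.Get<Player>(playerId);
            player.CellId = cellId;

            session.Update(player);
            tx.Commit();   // flushes the session and commits the one transaction for this request
        }
    }
}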
Read the 'ISessionFactory' section on this page of the NHibernate documentation. ISessions are meant to be single-threaded (i.e., not thread-safe), which probably means that you shouldn't share one across users. The ISessionFactory should be created once by your application, and ISessions should be created for each unit of work. Remember that creating an ISession does not necessarily open a database connection; that depends on how your session factory's connection pooling strategy is configured.
You may also want to look at Hibernate's Documentation on Session and Transaction.
I would aim to keep everything in memory, and either journal changes or take periodic offline snapshots.
Have a read through NHibernate Best Practices with ASP.NET; there are some very good tips in there for a start. As mentioned already, be very careful with an ISession, as it is NOT thread-safe, so just keep that in mind.
If you require something a little more complex, then take a look at the NHibernate.Burrow contrib project. It states something like "the real power Burrow provides is that a Burrow conversation can span over multiple http requests".
