My question is simple. About 2 years ago we began migrating to ASP.NET from ASP Classic.
Our issue is that we currently have about 350 sites on a server, and the server seems to be getting bogged down. We have been trying various things to improve performance (query optimizations, disabling ViewState, disabling session state, etc.) and they have all helped, but as we add more sites we end up using more of the server's resources, so the improvements we made in code are virtually erased.
Basically we're now at a tipping point; our CPUs currently average near 100%. Our IS department would like us to find new ways to rework the code on the sites to improve performance.
I have a theory, that we are simply at the limit on the amount of sites one server can handle.
Any ideas? Please only respond if you have a good idea about what you are talking about. I've heard a lot of people theorize about the situation. I need someone who has actual knowledge about what might be going on.
Here are the details.
250 ASP.NET Sites
250 Admin Sites (Written in ASP.NET, basically they are backend admin sites)
100 Classic ASP Sites
Running on a virtualized Windows Server 2003.
3 CPUs, 4 GB Memory.
Memory stays around 3 - 3.5 GB
CPUs spike very badly, sometimes remaining near 100% for short periods of time (30-180 seconds)
The database is on a separate server and is SQL Server 2005.
It looks like you've reached that point. You've optimised your apps, you've looked at server performance, you can see you are hitting peak memory usage and maxing out the CPU, and, let's face it, administering so many websites can't be easy.
Also, the spec of your VM isn't fantastic. Its memory, in particular, potentially isn't great for the number of sites you have.
You have plenty of reasons to move.
However, some things to look at:
1) How many of those 250 sites are actually used? Which ones are the peak performance offenders? Those ones are prime candidates for being moved off onto their own box.
2) How many are not used at all? Can you retire any?
3) You are running on a virtual machine. What kind of virtual machine platform are you using? What other servers are running on that hardware?
4) What kind of redundancy do you currently have? 250 sites on one box with no backup? If you have a backup server, you could use that to round robin requests, or as a web farm, sharing the load.
Let's say you decide to move. The first thing you should probably think about is how.
Are you going to simply halve the number of sites? 125 + admins on one box, 125 + admins on the other? Or are you going to move the most used?
Or you could have several virtual machines, all active, as part of a web farm or load balanced system.
By the sounds of things, though, there's a real resistance to buying more hardware.
At some point, you are going to have to though, as sometimes, things just get old or get left behind. New servers have much more processing power and memory in the same space, and can be cheaper to run.
Oh, and one more thing. The cost of all those repeated optimizations and testing probably could easily be offset by buying more hardware. That's no excuse for not doing any optimization at all, of course, and I am impressed by the number of sites you are running, especially if you have a good number of users, but there is a balance, and I hope you can tilt towards the "more hardware" side of it some more.
I think you've answered your own question really. You've optimised the sites, you've got the database server on a different server. And you have 600 sites (250 + 250 + 100).
The answer is pretty clear to me. Buy a box with more memory and CPU power.
There is no real limit on the number of sites your server can handle; if all 600 sites had no users, you wouldn't have much load on the server.
I think you might find a better answer at serverfault, but here are my 2 cents.
You can scale up or scale out.
Scale up -- upgrade the machine with more memory / more cores in the CPU.
Scale out -- distribute the load by splitting the sites across 2 or more servers. 300 on server A, 300 on server B, or 200 each across 3 servers.
As #uadrive mentions, this is an issue of load, not of # of sites.
Just thinking this through, it seems like you would be better off measuring the # of users hitting the server instead of # of sites. You could have 300 sites and only half are used. Knowing the usage would be better in my mind.
There's no simple formula answer, like "you can have a maximum of 47.3 sites per gig of RAM". You could surely maintain performance with many more sites if each site had only one user per day. There are likely servers that have only two sites but performance is terrible because each hit requires a massive database query.
In practice, the only way to approach this is empirically: When performance starts to degrade, you have a problem. The fact that somebody wrote in a book somewhere that a server with such-and-such resources should be able to support more sites is of little value if, in practice, YOUR server can't support YOUR sites and YOUR users.
Realistic options are:
(a) Optimize your code and database queries. You say you've already done that. Maybe you can do more. It's unlikely that your code is now the absolute best that it can possibly be, but it may well be that the effort to find further improvements will be hugely expensive.
(b) Buy a bigger server.
(c) Break your sites across multiple servers, and either update DNS or install a front-end to map requests to the correct server.
Maxing out on CPU use can be a good sign, in the sense that moving to a larger server, or dividing the sites between multiple servers, is likely to help.
There are many things you can do to help improve performance and scalability (in fact, I've written a book on this subject -- see my profile).
It's difficult to make meaningful suggestions without knowing much more about your apps, but here are a few quick tips that might help to get you started:
Multiple AppPools are expensive. How many sites do you have per AppPool? Combine multiple sites per AppPool if you can
Minimize client round-trips: improve client and proxy-level caching, offload static files to a CDN, use image sprites, merge multiple CSS and JS files
Enable output caching on pages and/or controls where possible (see the sketch after this list)
Enable compression for static files (more CPU use on first access, but less after that)
Avoid session state altogether if you can (prefer cookies for state management). If you can't, then at least configure EnableSessionState="ReadOnly" for pages that don't need to write it, or "False" for pages that don't need it at all (also shown in the sketch below)
Many things on the SQL Server side: caching, SqlCacheDependency, command batching, grouping multiple insert/update/deletes into a single transaction, using stored procedures instead of dynamic SQL, using async ADO.NET instead of LINQ or EF, make sure your DB logs are on separate spindles from data, etc
Look for algorithmic issues with your code; for example, hash tables are often better than linear searches, etc
Minimize cookie sizes, and only set cookies on pages, not on static content.
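For the output-caching and read-only session state tips above, here's a minimal sketch of what the page directives might look like on a WebForms page. The 60-second duration is just an illustrative choice, not a recommendation:

    <%@ Page Language="C#" EnableSessionState="ReadOnly" %>
    <%-- Serve a cached copy of this page's output for 60 seconds --%>
    <%@ OutputCache Duration="60" VaryByParam="none" %>

Note that VaryByParam is required on the OutputCache directive; "none" means every request gets the same cached copy, so use "*" or a specific parameter list if the output depends on the query string.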
In addition, using a VM is likely to cost you up to about 10% in performance -- make sure it's really worth that for what it buys you in terms of improved manageability.
Related
I have been analyzing a DB running in Azure SQL that is performing VERY badly. It is on the premium tier with 1750 DTUs available, and at times it can still max out DTUs.
I've identified a variety of queries and terrible data access patterns through stored procs, which has reduced load. But there is still this massive disparity between DTU and CPU usage in the image below; any other image I see of the "Query Performance Insight" in Azure SQL shows the DTU line aligning with the CPU usage for the most part.
[Image: DTU (in red) vs. CPU usage per query]
Looking at the C# app sitting on top of this: for each user that uses the app, it creates a SQL user and uses that user in the connection string to access the DB. This means that connection pooling is not being used, resulting in a massively larger number of active users/sessions on the Azure SQL DB. Could this be the sole reason why there is such high DTU usage?
Or could I possibly be missing something regarding IO that isn't visible in the Azure portal?
Thanks
Neil
EDIT: Adding sessions and workers image!
Based on that I'm not convinced now... What is the session percentage of? It's 10%, but 10% of what? The max allowed?
Edit2: Adding more metrics:
One week:
2-3 hours when load is high:
The purple spike, I believe, is the reindex, so you can ignore that!
Trying to understand DTU versus resources was a stumbling block for me too. Click on your Resource utilization chart and click Edit.
Then you get a slider with a lot of resources you can monitor. Select Sessions and Workers percent. More than likely one of these is your problem. If not, you can add in CPU, Data IO, Log IO, and/or In-memory OLTP percentage. Hit OK.
Now, what you should find is the real cost of your query or queries. Learning how your queries consume the different resources can help you fix performance problems like these. I learned this when doing large inserts: I was maxing out my Log IO while everything else was at <5% utilization.
Try that, and if you are right about connection pooling, unfortunately that will mean some refactoring in the application. At the very least, using this will give you more insight than just looking at that DTU percentage!
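To make the pooling point concrete: ADO.NET keys its connection pools on the exact connection string (and identity), so a SQL login per end user means nearly every user gets a pool of their own and connection reuse largely disappears. A rough sketch of the difference, where the userName/userPassword variables and the server, database, and login names are all made up:

    using System.Data.SqlClient;

    // Anti-pattern: a distinct connection string per end user means a
    // distinct connection pool per end user, so connections are rarely reused.
    var perUser = new SqlConnection(
        "Server=myserver.database.windows.net;Database=AppDb;User Id=" + userName +
        ";Password=" + userPassword + ";");

    // Preferred: one shared application login, one pool, connections reused.
    // The end user's identity is passed to queries as data (or via EXECUTE AS)
    // rather than baked into the connection string.
    var shared = new SqlConnection(
        "Server=myserver.database.windows.net;Database=AppDb;" +
        "User Id=app_login;Password=app_password;");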
I am at the start of a mid-sized ASP.NET C# project with a performance requirement to support around 400+ concurrent users.
What are the things I need to keep in mind while architecting an application to meet such performance and availability standards? Pages need to be served in under 5 seconds. I plan to have the application and database on separate physical machines. From a coding and application layering perspective:
If I have the database layer exposed to the application layer via a WCF service, will it hamper performance? Should I use a direct TCP connection instead?
Will it matter if I am using Entity Framework, some other ORM, or the Enterprise Library data access block?
Should I log exceptions to database or a text file?
How do I check while development if the code being built is going to meet those performance standards eventually? Or is this even a point I need to worry about at development stage?
Do I need to put my database connection code, and other classes that hold lookup data that rarely changes for the life of the application, in static classes so they are available throughout the life of the application?
What kind of caching policy should I apply?
What free tools can I use to measure and test performance? I know of Red Gate's performance measurement tools, but they have a high license cost, so free tools are what I'd prefer.
I apologize if this question is too open ended. Any tips or thoughts on how I should proceed?
Thanks for your time.
An important consideration when designing a scalable application is to make it stateless. No sessions. Another important consideration is to cache everything that you can in order to reduce database queries, and this cache should be distributed to other machines which are specifically designed to store it. Then all you have to do is add another server when the application starts to run slowly due to increased user load.
As far as your questions about WCF are concerned, you can use WCF, it won't be a bottleneck for your application. It will definitely add an additional layer which will slow things a bit but if you want to expose a reusable layer that can scale independently on its own WCF is great.
ORMs might indeed introduce a performance slowdown in your application. It's mostly because you have less control over the generated SQL queries, which makes them more difficult to tune. This doesn't mean that you shouldn't use an ORM; you just need to be careful about the SQL it emits and tune it with your DB admin. There are also lightweight ORMs such as Dapper, PetaPoco and Massive that you might consider.
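For illustration, here's a minimal sketch of what the Dapper approach looks like: you keep full control of the SQL, and the micro-ORM only handles materializing the rows into objects. The Product class, the Products table, its columns and the connection string are all hypothetical:

    using System.Collections.Generic;
    using System.Data.SqlClient;
    using Dapper;

    public class Product
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public decimal Price { get; set; }
    }

    public class ProductRepository
    {
        private readonly string _connectionString;

        public ProductRepository(string connectionString)
        {
            _connectionString = connectionString;
        }

        // You write the SQL yourself, so it is easy to review and tune with your DBA.
        public IEnumerable<Product> GetByCategory(int categoryId)
        {
            using (var connection = new SqlConnection(_connectionString))
            {
                return connection.Query<Product>(
                    "SELECT Id, Name, Price FROM Products WHERE CategoryId = @CategoryId",
                    new { CategoryId = categoryId });
            }
        }
    }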
As far as static classes are concerned, they won't improve performance that much compared to instance classes. A class instantiation on the CLR is a pretty fast operation as Ayende explains. Static classes will introduce tight coupling between your data access layer and your consuming layer. So you can forget about static classes for the moment.
For error logging, I would recommend ELMAH.
For benchmarking there are quite a lot of tools; Apache Bench is one that is simple to use.
There's always a trade-off between developer productivity, maintainability and performance; you can only really make that trade-off sensibly if you can measure. Productivity is measured by how long it takes to get something done; maintainability is harder to measure, but luckily, performance is fairly easy to quantify. In general, I'd say to optimize for productivity and maintainability first, and only optimize for performance if you have a measurable problem.
To work in this way, you need to have performance targets, and a way of regularly assessing the solution against those targets - it's very hard to retro-fit performance into a project. However, optimizing for performance without proven necessity tends to lead to obscure, hard-to-debug software solutions.
Firstly, you need to turn your performance target into numbers you can measure; for web applications, that's typically "dynamic page requests per second". 400 concurrent users probably don't all request pages at exactly the same time - they usually spend some time reading the page, completing forms etc. On the other hand, AJAX-driven sites request a lot more dynamic pages.
Use Excel or something to work from peak concurrent users to dynamic page generations per second based on wait time, requests per interaction, and build in a buffer - I usually over-provision by 50%.
For instance:
400 concurrent users with a session length of 5 interactions and
2 dynamic pages per interaction means 400 * 5 * 2 = 4000 page requests.
With a 30 seconds wait time, those requests will be spread over 30 * 5 = 150 seconds.
Therefore, your average page requests / second is 4000 / 150 = 27 requests / second.
With a 50% buffer, you need to be able to support a peak of roughly 40 requests / second.
That's not trivial, but by no means exceptional.
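If you prefer code to a spreadsheet, here's the same back-of-the-envelope arithmetic as a small helper; all the input values are just the example numbers above:

    // Mirrors the arithmetic above: peak dynamic page requests per second, plus a buffer.
    static double TargetRequestsPerSecond(
        int concurrentUsers,            // e.g. 400
        int interactionsPerSession,     // e.g. 5
        int dynamicPagesPerInteraction, // e.g. 2
        int waitSecondsPerInteraction,  // e.g. 30 seconds of "think time"
        double bufferFactor)            // e.g. 1.5 for a 50% buffer
    {
        double totalPageRequests = concurrentUsers * interactionsPerSession * dynamicPagesPerInteraction; // 4000
        double sessionLengthSeconds = waitSecondsPerInteraction * interactionsPerSession;                 // 150
        return (totalPageRequests / sessionLengthSeconds) * bufferFactor;                                 // ~40
    }

    // TargetRequestsPerSecond(400, 5, 2, 30, 1.5) gives roughly 40 requests/second.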
Next, set up a performance testing environment whose characteristics you completely understand and can replicate, and can map to the production environment. I usually don't recommend re-creating production at this stage. Instead, reduce your page generations / second benchmark to match the performance testing environment (e.g. if you have 4 servers in production and only 2 in the performance testing environment, reduce by half).
As soon as you start developing, regularly (at least once a week, ideally every day) deploy your work-in-progress to this testing environment. Use a load test generator (Apache Benchmark or Apache JMeter work for me), write load tests simulating typical user journeys (but without the wait time), and run them against your performance test environment. Measure success by hitting your target "page generations / second" benchmark. If you don't hit the benchmark, work out why (Redgate's ANTS profiler is your friend!).
Once you get closer to the end of the project, try to get a test environment that's closer to the production system in terms of infrastructure. Deploy your work, and re-run your performance tests, increasing the load to reflect the "real" pages / second requirement. At this stage, you should have a good idea of the performance characteristics of the app, so you're really only validating your assumptions. It's usually a lot harder and more expensive to get such a "production-like" environment, and it's usually a lot harder to make changes to the software, so you should use this purely to validate, not to do the regular performance engineering work.
Can Access 2007 work well with 30 parallel users through my C# program?
Thanks in advance
Access is not very good for concurrent use. I have seen recommendations of a maximum of 10 people at one time.
To be honest, it depends on how these users are working and the load, but it is not designed for such use (it is designed to be a desktop database, not an enterprise database), so it may fail under such usage. Use a database designed for your scenario - something like MySQL or SQL Server Express, if you want to avoid extra costs.
See this article on 15seconds for a discussion of the suitability (or lack thereof) of Access for concurrent usage.
The Jet and ACE database engines have a limit of 255 concurrent connections, which gets quoted as "255 users" because the standard for interaction with a Jet/ACE data store is a single connection per user, opened and then re-used throughout the session. However, under normal usage Jet/ACE may well open more than one connection per user, so 255 is not even a reliable theoretical limit.
Jet/ACE interacts with a data file, and maintains locking via its locking file (*.LDB). Contention for the data file and the LDB file can easily overwhelm the file system's ability to keep up, so in general, the practical limit on number of users is much lower than the 255 theoretical limit (you'll note that 255 is one less than a power of 2, hint, hint).
In real-world scenarios, a properly designed Access application with a Jet/ACE data store running on a reliable network and stored on a server with a native Windows file system can be quite stable into the 20-30 users range. But it depends on what those users are doing. The more that are read-only, the higher the number of simultaneous users that can be supported.
Experienced Access developers report engineering apps to work with as many as 100 simultaneous users, but at that point, you basically have to rewrite as an unbound app, and then you're giving up most of the advantages of Access as front end in order to nurse along a back end that is better used with a smaller user population.
My basic rule is that any time a user population reaches 15 simultaneous users, I start talking to the client about upsizing to SQL Server, not because it's required, but because they need to get used to the idea that as usage grows, they're going to need to upsize. Whether that happens at 15 users or 20 or 30 depends on the nature of the particular app. As I said above, if many of the users are read-only for most of their session, you have more headroom than if everybody is adding/updating records most of the time.
Given that a C# app is going to be an unbound app, I wouldn't think that 30 users should be terribly problematic, but I'm not a C# programmer. If it's new development and there's any possibility that the user population will grow beyond 30 users, it just seems like a no-brainer to me to build with a server back end instead of with Jet/ACE.
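From the C# side, the switch is mostly a matter of provider and connection string, which is part of why starting with a server back end is such a low-cost decision. A rough sketch, where the file path, server instance and database name are all made up:

    using System.Data.OleDb;
    using System.Data.SqlClient;

    // Jet/ACE: everything funnels through one .accdb file on a share.
    var access = new OleDbConnection(
        @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=\\server\share\app.accdb;");

    // SQL Server Express: a real server engine handles the concurrency for you.
    var sqlExpress = new SqlConnection(
        @"Data Source=.\SQLEXPRESS;Initial Catalog=AppDb;Integrated Security=True;");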
I never did it with 2007, but I had problems in the past with the XP version and only 3 users working 8 hours a day.
So, based on my previous experience, try to avoid it. Getting your customer to change their requirement will be easier than dealing with the problems that come from using Access in a parallel environment. After all, also based on my experience... your customer will be changing their requirements almost every week! :D
May the Force be with you.
I have coded up an ASP.NET website and running on win'08 (remotely hosted). The application queries 11 very large Lucene indexes (each ~100GB). I open IndexSearchers on Page_load() and keep them open for the duration of the user session.
My questions:
The queries take ~5 seconds to complete - understandable, since these are very large indexes - but users want faster responses. I was curious how to squeeze out better performance. (I did look over the Apache Lucene website and try some of the ideas there.) I'm interested in whether and how you've tweaked things further, especially from an ASP.NET perspective.
One idea was to use Solr instead of querying Lucene directly. But that seems counter-intuitive, introducing another abstraction in between that might add to the latency. Is it worth the headache of porting to Solr? Can anyone share some metrics on the improvement you got after switching to Solr, if it has been worth it?
Are there some key things that could be done in Solr that could be replicated to speed up response times?
Some questions / ideas:
Are you hitting all 11 indexes for a single request?
Can you reorganize the indexes so that you hit only 1 index (i.e. sharding) ?
Have you run a profile of the application (using dotTrace or similar tool)? Where is the time spent? Lucene.Net?
If most of the time is spent in Lucene.Net, then if you migrate to Solr the added latency should be negligible (compared to the rest of the time spent). Plus, Solr can be easily distributed to increase performance.
I'm not all too familiar with Lucene (I use Solr) but if you're searching 11 indexes per request, can you run those searches in parallel (e.g. with TPL) ?
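To make the parallel idea concrete, here's a rough sketch assuming a Lucene.Net 3.x-style API, with the 11 IndexSearcher instances already open. Note that naively merging scores from separate indexes isn't strictly correct ranking, so treat it as a starting point:

    using System.Linq;
    using Lucene.Net.Search;

    // 'searchers' is your existing collection of 11 open IndexSearcher instances,
    // 'query' is the parsed Lucene query; both come from the current code.
    ScoreDoc[] topHits = searchers
        .AsParallel()                                   // run the per-index searches concurrently
        .SelectMany(s => s.Search(query, 50).ScoreDocs) // top 50 hits from each index
        .OrderByDescending(d => d.Score)                // crude merge; scores from different indexes
        .Take(50)                                       // aren't strictly comparable
        .ToArray();

    // Remember that each ScoreDoc.Doc id is only meaningful within the index it came from.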
The biggest thing is removing the search from the web tier and isolating it on its own tier (a search tier). That way, you have a dedicated box with dedicated resources that has the indexes loaded and "warmed up" in cache, instead of having each user hold a copy of its own index reader.
I've yet to find a decent solution to my scenario. Basically I have an ASP.NET MVC website which has a fair bit of database access to build the views (2-3 queries per view), and I would like to take advantage of caching to improve performance.
The problem is that the views contain data that can change irregularly, like it might be the same for 2 days or the data could change several times in an hour.
The queries are quite simple (select... from where...) and not huge joins, each one returns on average 20-30 rows of data (with about 10 columns).
The queries are quite simple at the site's current stage, but over time the owner will be adding more data and the visitor numbers will increase. They aren't large at the moment, but I would be looking at caching as traffic will mostly be coming from Google AdWords etc., and fast-loading pages will be a benefit (apparently).
The site will be hosted on a Microsoft SQL Server 2005 database (but we can upgrade to 2008 if required).
Do I either:
Set the caching to the minimum time an item doesn't change for (e.g. cache for, say, 3 minutes) and tell the owner that any changes will take up to 3 minutes to appear?
Find a way to force the cache to clear and reprocess on changes (E.g. if the owner adds an item in the administration panel it clears the relevant caches)
Forget caching altogether
Or is there another option that would suit this scenario?
If you are using Sql Server, there's also another option to consider:
Use the SqlCacheDependency class to have your cache invalidated when the underlying data is updated. Obviously this achieves a similar outcome to option 2.
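A minimal sketch of what that might look like using the query-notification flavour of SqlCacheDependency (which works with SQL Server 2005, provided Service Broker is enabled on the database). The connection string, cache key, query, and the LoadProducts helper are all made up:

    using System.Data.SqlClient;
    using System.Web;
    using System.Web.Caching;

    // Once, at application start-up (e.g. Application_Start in Global.asax):
    SqlDependency.Start(connectionString);

    // When populating the cache:
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(
        "SELECT ProductId, Name, Price FROM dbo.Products", connection))
    {
        // The dependency must be created before the command is executed.
        var dependency = new SqlCacheDependency(command);

        connection.Open();
        var products = LoadProducts(command); // hypothetical helper that executes the command and reads the rows

        // The cache entry is evicted automatically when the underlying data changes.
        HttpRuntime.Cache.Insert("products", products, dependency);
    }

Note that query notifications only work for queries written a certain way (explicit column lists, two-part table names, and so on), so check the requirements before relying on it.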
I might actually have to agree with Agileguy though - your query descriptions seem pretty simplistic. Thinking forward and keeping caching in mind while you design is a good idea, but have you proven that you actually need it now? Option 3 seems a heck of a lot better than option 1, assuming you aren't actually dealing with significant performance problems right now.
Premature optimization is the root of all evil ;)
That said, if you are going to Cache I'd use a solution based around option 2.
You have less opportunity for "dirty" data in that manner.
Kindness,
Dan
The 2nd option is the best. It shouldn't be too hard if the same app edits and caches the data; it can be trickier if there is more than one app.
If you can't go that way, the 1st might be acceptable too. With some tweaks (e.g. I would try to update the cache silently on another thread when it hits its timeout; see the sketch below) it might work well enough, if the data is allowed to be a bit old.
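One way to get that "refresh silently when it times out" behaviour with the built-in ASP.NET cache is to repopulate from a removal callback. A rough sketch, where the 3-minute window, the cache key and the LoadDataFromDatabase helper are made up:

    using System;
    using System.Web;
    using System.Web.Caching;

    static void CacheData(string key, object data)
    {
        HttpRuntime.Cache.Insert(
            key,
            data,
            null,                          // no dependency
            DateTime.Now.AddMinutes(3),    // absolute expiration
            Cache.NoSlidingExpiration,
            CacheItemPriority.Normal,
            OnCacheItemRemoved);           // called when the entry is removed
    }

    static void OnCacheItemRemoved(string key, object value, CacheItemRemovedReason reason)
    {
        if (reason == CacheItemRemovedReason.Expired)
        {
            // Re-query and re-insert in the background so the next request
            // still sees (slightly stale) data instead of waiting on the database.
            CacheData(key, LoadDataFromDatabase(key)); // LoadDataFromDatabase is hypothetical
        }
    }

Keep in mind the callback runs on a background thread with no HttpContext, so the reload code has to be self-contained.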
Never drop caching if it's possible to use it. Everyone knows the "premature optimization..." quote, but caching is one of those things that can increase the scalability/performance of an application dramatically.