I've created an old-style .ASMX web service and would like to know how the built-in ADO.NET connection pooling works with it.
The web service does not use a singleton pattern, so it is instantiated anew with every request. My question is: will connections be removed from the pool after each service request, or are they kept in the pool across requests? My service is called very frequently, but I don't want to pay for connection setup and teardown on every call if it can be avoided.
I have read that the pool is maintained for the AppDomain, but I'm not sure if each request generates a new AppDomain or not.
I am also curious if it would be beneficial to set Min Pool Size (to a small number other than 0) in this case.
Anyone know?
No, each request does not generate a new AppDomain. All requests for that web site/application run in the same application domain and so share the connection pool. Once the ASMX request is finished with a connection, it returns it to the pool, and the next request in line grabs it (unless another connection is already sitting idle in the pool).
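To make that concrete, here is a minimal sketch of the usual per-request pattern (the method, table, and connection-string details are illustrative, not from the question): open late, close early, and let the pool do the caching.

    using System.Data.SqlClient;
    using System.Web.Services;

    public class OrdersService : WebService // hypothetical ASMX service
    {
        [WebMethod]
        public int GetOrderCount(string customerId)
        {
            // Opening a SqlConnection does not create a new physical
            // connection each time; it borrows one from the AppDomain-wide
            // pool keyed by this exact connection string.
            using (var conn = new SqlConnection(
                "Server=dbserver;Database=Shop;Integrated Security=true"))
            using (var cmd = new SqlCommand(
                "SELECT COUNT(*) FROM Orders WHERE CustomerId = @id", conn))
            {
                cmd.Parameters.AddWithValue("@id", customerId);
                conn.Open();                     // borrow from the pool
                return (int)cmd.ExecuteScalar();
            }                                    // Dispose returns it to the pool
        }
    }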
One point of clarification: you can have two different web applications that point to the same code but live in different app domains. The two applications don't share anything (think of launching the same application twice).
"I am also curious if it would be beneficial to set Min Pool Size (to a small number other than 0) in this case."
It can be beneficial, depending on the application. Creating connections takes time, so having some ready lets you skip that step. If a request uses only one connection, the wait might be acceptable (it all depends on how fast you want the application to respond). It really comes into play when one request needs, say, 3 or 4 connections open at once. Why would you need multiple connections? What about one for accessing data and a separate thread logging to the database (logging to the database vs. a file is a totally different conversation)? Now you need two. There are plenty of scenarios like this. Depending on your database server, holding an open connection can be pretty cheap, so setting Min Pool Size to a small number can be a huge bang for your buck. (For the record, I've seen scenarios where connecting to a database took several seconds, like 3-5; in that case holding a connection open for the user was clearly worthwhile.)
As for Max Pool Size: no, it's not beneficial to lower it, because all requests to that service share the same pool (assuming the connections use the same connection string and aren't hitting different servers; those get separate pools). Running out of available connections is a really fast and surefire way of crushing the performance of your service.
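For illustration, both settings live in the connection string itself, and every caller that uses the exact same string shares one pool (the values below are examples, not recommendations):

    // Min Pool Size keeps a few connections warm even when the service is
    // idle; Max Pool Size (default 100) is the ceiling - lowering it risks
    // starving concurrent requests.
    const string ConnectionString =
        "Server=dbserver;Database=Shop;Integrated Security=true;" +
        "Min Pool Size=5;Max Pool Size=100";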
I am trying to create a monitoring application so our operations department can be proactive when dealing with systems that are encountering problems. I created an app that does the job, but it has some drawbacks:
Each running copy of the app issues its own pings to the systems, when one ping would suffice.
I have 3 different APIs for getting the status of our systems, depending on whether the system is hosted in IIS, is a WCF service, or is a desktop application.
To fix the first issue, I was going to create a database that an interim monitoring service would populate by making the pings; each copy of the app would then query the database for updates. After thinking about this, I realized it doesn't address the second issue and decided that is a future problem.
So my thought was, rather than have the interim application ping the systems, to simply have each system expose one interface through which it posts its status to the database every x interval. But then I ran into a problem with the WCF and IIS services we have: these services can sit for days without anyone actually using them. How would I make these services keep posting their data?
My questions are:
Is it better to have data REQUESTED or PUSHED in this type of situation?
If REQUESTED, what is a suggested practice for maintaining a single API across multiple platforms (IIS, WCF, desktop)?
If PUSHED, how would you handle the web services, which are instance-based and not continuously running?
For web services, one solution might be to implement a health-check endpoint, something that you can simply call, like: webservice/isServiceUp?
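As a rough sketch, such an endpoint could look like this in WCF (the names and the idea of checking dependencies are my assumptions, not from the answer):

    using System.ServiceModel;
    using System.ServiceModel.Web;

    [ServiceContract]
    public interface IHealthCheck
    {
        // Reachable as GET .../isServiceUp
        [OperationContract]
        [WebGet(UriTemplate = "isServiceUp")]
        bool IsServiceUp();
    }

    public class HealthCheck : IHealthCheck
    {
        public bool IsServiceUp()
        {
            // Ideally verify real dependencies (database, queues) here
            // rather than returning a constant, so "up" means "usable".
            return true;
        }
    }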
I prefer that this information be PULLED. If a service / web service / application is down, you can't possibly rely on it to write something to the DB... it would be possible, but highly risky and unreliable.
In the real world it is a little more complicated than that, because something might happen between your service host and the consumer (a DNS problem, for example), in which case you would want to handle the case of not getting anything back from isServiceUp (no true, no false, just a 400-level error)...
Consider using your load balancer to check on apps / web services and proactively switch to a different IP in case of issues... it is a possibility.
We are developing a multi-tenant application. Architecturally, we have designed a shared middle tier for the business logic and one database per tenant for data persistence. This means the business tier establishes a set of connections (a connection pool) with the database server for each tenant, i.e. the application maintains a separate connection pool per tenant. If we expect around 5,000 tenants, this solution demands heavy resource utilization (connections between the app server and the database server per tenant), which leads to performance issues.
We resolved that by keeping a common connection pool. To maintain a single connection pool across different databases, we created a new database called ‘App-master’. Now we always connect to the ‘App-master’ database first and then switch to the tenant-specific database. That solved our connection-pool issue.
This solution works perfectly fine with an on-premise database server, but it does not work with Azure SQL, which does not support changing the database on an open connection.
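For context, the on-premise trick described above boils down to something like this (names are illustrative): the pool key is always the ‘App-master’ connection string, so all tenants share one pool.

    using System.Data.SqlClient;

    static SqlConnection OpenTenantConnection(string tenantId)
    {
        var conn = new SqlConnection(
            "Server=dbserver;Database=App-master;Integrated Security=true");
        conn.Open();                               // borrowed from the shared pool
        conn.ChangeDatabase("Tenant_" + tenantId); // issues USE [Tenant_...],
                                                   // which Azure SQL rejects
        return conn; // caller disposes, returning the connection to the pool
    }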
I would appreciate suggestions on how to maintain the connection pool, or a better approach / best practice for dealing with such a multi-tenant scenario.
I have seen this problem before with multi-tenancy schemes that use separate databases. There are two overlapping problems: the number of web servers per tenant, and the total number of tenants. The first is the bigger issue. If you are caching database connections via ADO.NET connection pooling, the likelihood of any specific customer's request landing on a web server that already holds an open connection to their database is inversely proportional to the number of web servers you have. The more you scale out, the more any given customer will notice a per-call (not initial login) delay as the web server makes the initial connection to their database on their behalf. Each call made to a non-sticky, highly scaled web server tier is decreasingly likely to find an existing open database connection that can be reused.
The second problem is simply having so many connections in your pool, and the likelihood of this creating memory pressure or poor performance.
You can "solve" the first problem by establishing a limited number of database application servers (simple WCF endpoints) which carry out database communications on behalf of your web server. Each WCF database application server serves a known pool of customer connections (Eastern Region go to Server A, Western Region go to Server B) which means a very high likelihood of a connection pool hit for any given request. This also allows you to scale access to the database separately to access to HTML rendering web servers (the database is your most critical performance bottleneck so this might not be a bad thing).
A second solution is to use content-specific routing via an NLB router. These route traffic based on content and allow you to segment your web server tier by customer grouping (Western Region, Eastern Region, etc.); each set of web servers therefore has a much smaller number of active connections, with a corresponding increase in the likelihood of getting an open and unused connection.
Both of these problems are caching issues in general: the more you scale out as a completely "unsticky" architecture, the less likely any call is to hit cached data, whether that is a cached database connection or read-cached data. Managing user connections to maximize the likelihood of a cache hit would be useful for maintaining high performance.
Another method of restricting the number of connection pools per app server is to use Application Request Routing (ARR) to divide up your tenants and assign them to subsets of the web tier. This lends itself to a more scalable "pod" architecture where a "pod" is a small collection of web/app servers coupled to a subset of the databases. A good article on this approach is here:
http://azure.microsoft.com/blog/2013/10/31/application-request-routing-in-csf/
If you are building a multi-tenant DB application on Azure you should also check out the new Elastic Scale client libraries, which simplify data-dependent routing and facilitate cross-shard queries and management operations: http://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-documentation-map/
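A rough sketch of what data-dependent routing looks like with that library (the shard map name, key type, and connection strings are assumptions for illustration):

    using Microsoft.Azure.SqlDatabase.ElasticScale.ShardManagement;

    static void QueryTenant(int tenantId,
                            string shardMapManagerConnStr,
                            string tenantCredentialsConnStr)
    {
        // The shard map manager lives in its own database and caches the
        // routing data, so steady-state lookups avoid extra round trips.
        var smm = ShardMapManagerFactory.GetSqlShardMapManager(
            shardMapManagerConnStr, ShardMapManagerLoadPolicy.Lazy);

        var map = smm.GetListShardMap<int>("TenantShardMap"); // hypothetical name

        // Routes the key to the right tenant database and returns a pooled
        // SqlConnection; pooling is handled per shard under the covers.
        using (var conn = map.OpenConnectionForKey(
            tenantId, tenantCredentialsConnStr, ConnectionOptions.Validate))
        {
            // ... run tenant-scoped commands on conn ...
        }
    }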
I am developing a Windows RT application that needs to get data from an MVC WebApi server.
The problem is that the response can take from a few seconds to 3 minutes.
What is the best approach to handle this?
For now, I call the web API asynchronously and set a long timeout value to avoid exceptions. Is this a good approach? I don't like it much because the server has a connection open the whole time. Can that significantly affect server performance?
Is there something like a "callback" for web services? I mean having the server call the client to send the data.
Yes, there are ways to get the server to call back the client, for example WCF duplex communication. However, such techniques usually keep the connection open (in most cases a TCP session). Most web servers do not support huge numbers of concurrent requests, so each prolonged call to the server increments the number of concurrently connected clients. This leads to heavy resource utilisation where there shouldn't be any. If you have many clients, such an architecture is bound to fail.
REST requests should be lightweight, small and fast. Consider using a database to store temporary results, and worker servers to process the load. This is a server-side problem, not a client-side one.
Finally I solved it using WebSockets (thanks oleksii). The connection stays open, but I avoid polling for the result repeatedly: when the server finishes the process, it sends the data directly to the client. WebSockets is a protocol that runs over TCP and has been standardized.
http://en.wikipedia.org/wiki/WebSocket
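For reference, a minimal client-side sketch using .NET's ClientWebSocket (the URI handling and single-frame assumption are mine):

    using System;
    using System.Net.WebSockets;
    using System.Text;
    using System.Threading;
    using System.Threading.Tasks;

    static async Task<string> WaitForResultAsync(Uri serverUri)
    {
        using (var socket = new ClientWebSocket())
        {
            // One connect, then the client simply awaits the server's push
            // instead of polling for the result.
            await socket.ConnectAsync(serverUri, CancellationToken.None);

            var buffer = new ArraySegment<byte>(new byte[8192]);
            var result = await socket.ReceiveAsync(buffer, CancellationToken.None);

            // Assumes the payload fits in one frame; real code would loop
            // on result.EndOfMessage and reassemble fragments.
            return Encoding.UTF8.GetString(buffer.Array, 0, result.Count);
        }
    }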
I am working on a project in which a WCF service will be consumed by iOS apps. The number of hits expected on the web server at any given point in time is around 900-1000, every request may take 1-2 seconds to complete, and the same number of requests is expected every second, 24/7.
This is my plan:
1. Write a WCF RESTful service (the instance context mode will be PerCall).
2. Requests/responses will be in JSON.
3. Some information needs to be persisted on the server; it is actually received from another remote system and is shared among all requests. Since using a database may not be a good idea (response time is critical: 2 seconds is the maximum the customer will wait), would it be good to keep it in server memory (say, a static Dictionary)? Assume this dictionary will hold 150,000 objects, each consisting of 5-7 strings plus their keys. I know, this is volatile!
4. Each request will spawn a new thread (using Threading.Timer) to do some cleanup; this thread will do some database reads/writes as well.
5. If a load balancer is introduced sometime later, the in-memory objects cannot be shared between requests routed through another node. Any ideas?
I hope you gurus can help by sharing your comments/suggestions on the overall architecture, WCF throttling, object state persistence, etc. Please provide some pointers on the required hardware as well. We plan to use Windows Server 2008 Enterprise Edition, IIS, and a SQL Server 2008 Standard Edition database.
Adding more to #3:
As I said, we get some information to the service from a remote system. On the web server where the WCF service is hosted, a client of that remote system will be installed, and the WCF service references one of this client's DLLs to fetch the information as a Hashtable (the method returns a Hashtable of around 150,000 objects). Would you suggest writing this information to the database and having the iOS requests (arriving every second) retrieve it from the database directly? Would that perform better than reading it straight from this Hashtable if it were made static?
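One note on the thread safety of #3: with PerCall instancing, many threads will touch that shared map concurrently, so a plain static Dictionary is unsafe. A minimal sketch with ConcurrentDictionary (the record type is a stand-in for the 5-7 string fields described above):

    using System.Collections.Concurrent;

    public class StatusRecord // stand-in for the real 5-7 string fields
    {
        public string SystemName { get; set; }
        public string State { get; set; }
        public string LastUpdatedUtc { get; set; }
    }

    public static class StatusCache
    {
        // Thread-safe reads and atomic updates without explicit locking.
        public static readonly ConcurrentDictionary<string, StatusRecord> Items =
            new ConcurrentDictionary<string, StatusRecord>();
    }

    // Inside a service operation:
    //   StatusCache.Items.AddOrUpdate(key, record, (k, old) => record);
    //   StatusRecord found;
    //   StatusCache.Items.TryGetValue(key, out found);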
Since you are using Windows Server 2008, I would definitely use the Windows Server AppFabric Cache to store your state:
http://msdn.microsoft.com/en-us/library/ff383813.aspx
It is free to use, well supported and integrated, and is (more or less) API-compatible with the Windows Azure AppFabric Cache if you ever shift your service to Azure. In our company (disclaimer: not my team) we used to use Memcached but changed to the AppFabric Cache and don't regret it.
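For flavour, the client API is small; a sketch under the assumption that a default cache has already been configured (key names and the stored type are illustrative):

    using Microsoft.ApplicationServer.Caching; // AppFabric Cache client

    public static class StateStore
    {
        // DataCacheFactory is expensive; create it once and reuse it.
        static readonly DataCacheFactory Factory = new DataCacheFactory();
        static readonly DataCache Cache = Factory.GetDefaultCache();

        public static void Save(string key, string status)
        {
            Cache.Put(key, status);        // insert or overwrite
        }

        public static string Load(string key)
        {
            return (string)Cache.Get(key); // null when the key is absent
        }
    }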
Let me throw out some comments/suggestions based on my experience serving a similar volume of requests under the WCF framework (3.5, back in the day).
I don't agree with #3. Using a database here is the right thing to do. To address response time, implement caching, and possibly cache dependencies, to keep the data synchronized across all instances (assuming you are load balanced; also see the AppFabric Cache suggested above). In real-world scenarios data changes, often, and you must minimize the impact.
We used Barracuda hardware and software to handle scalability, as far as I can tell.
Consider indexing keys/values with Lucene if applicable. Lucene delivers extremely good performance for both reads and writes. Do not use it to store your entire data set; read from it. A lifesaver if used correctly. Note that it can be complicated to implement in a load-balanced environment.
Basically, caching might be the only change your architecture needs.
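As one concrete (assumed) shape for that caching: an in-process read-through cache with a short expiry, where the loader delegate stands in for whatever fills the Hashtable today.

    using System;
    using System.Collections;
    using System.Runtime.Caching; // MemoryCache ships with .NET 4

    static Hashtable GetSnapshot(Func<Hashtable> loadFromDatabase)
    {
        var cache = MemoryCache.Default;
        var snapshot = cache.Get("remote-system-snapshot") as Hashtable;
        if (snapshot == null)
        {
            snapshot = loadFromDatabase(); // hits the database only on expiry
            cache.Set("remote-system-snapshot", snapshot, new CacheItemPolicy
            {
                // Trade-off: acceptable staleness vs. database load
                AbsoluteExpiration = DateTimeOffset.UtcNow.AddSeconds(30)
            });
        }
        return snapshot;
    }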
I'm attempting to create a WCF service where several thousand (~10,000) clients can connect via a duplex NetTcpBinding for extended periods of time (weeks, maybe months).
After a bit of reading, it looks like it's better to host in IIS than in a custom application or Windows service.
Is using WCF for such a service acceptable, or even possible? If so, where can I expect to run into throttling or performance issues, such as increasing the WCF ListenBacklog & MaxConcurrentConnections?
Thanks!
Why do you need to maintain an open connection for weeks/months? That will introduce a lot of complexity: timeout handling, error handling, recreating connections, etc. I even doubt that this will work.
Net.tcp connections use a transport session, which leads to PerSession instancing of the WCF service: a single service instance serves all of a client's requests and lives for the whole duration of the session (weeks or months in your case), so the instance and all of its content stay in memory. Any interruption or unhandled exception will fault the channel and close the session; all of the session's local data are lost, and the client must create a new proxy to start a new session. Any timeout (the default is 20 minutes of inactivity) will also close the session. Finally, depending on the complexity of your business logic, you may find that if even a few hundred clients need processing at the same time, a single server cannot serve them all, and some clients will time out (again breaking their sessions). Load balancing with net.tcp demands an algorithm with sticky sessions (session affinity), and the whole architecture becomes even more complicated and fragile. Scalability with net.tcp means the service can be deployed on multiple servers, but each client session must be handled by a single server (if a server dies, all sessions served by that server die with it).
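For what it's worth, the knobs the question mentions look roughly like this when self-hosting (the service and contract names are hypothetical and the values purely illustrative); the defaults sit far below 10,000 sessions:

    using System.ServiceModel;
    using System.ServiceModel.Description;

    [ServiceContract]
    public interface IMonitoringContract
    {
        [OperationContract]
        void ReportStatus(string status);
    }

    public class MonitoringService : IMonitoringContract
    {
        public void ReportStatus(string status) { /* ... */ }
    }

    static class HostRunner
    {
        public static void StartHost()
        {
            var binding = new NetTcpBinding
            {
                ListenBacklog = 200,    // pending socket accepts
                MaxConnections = 10000  // TCP connections per endpoint
            };

            var host = new ServiceHost(typeof(MonitoringService));
            host.AddServiceEndpoint(typeof(IMonitoringContract),
                binding, "net.tcp://localhost:9000/monitoring");

            // Without raising the throttle, long-lived duplex sessions are
            // refused long before 10,000 clients are connected.
            host.Description.Behaviors.Add(new ServiceThrottlingBehavior
            {
                MaxConcurrentSessions = 10000,
                MaxConcurrentCalls = 512,
                MaxConcurrentInstances = 10000
            });

            host.Open();
        }
    }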
Hosting in IIS/WAS/AppFabric has several advantages, two of which are health monitoring and process recycling. Health monitoring continuously checks that the worker process is alive and able to process requests; if it isn't, a new worker process is silently started and new incoming requests are routed to it. Process recycling regularly recycles the application domain (the default is every 29 hours), which keeps the process healthy and reduces memory leaks. The side effect is that recreating either the process or the application domain kills all sessions. Once you self-host the service you lose all of this, so you have to look after the health of your service yourself.
Edit:
IMHO, health-status information doesn't have to be sent over TCP. It doesn't require any of the fancy stuff, and if you lose some of it nothing is affected, so you can use UDP for health-status transfers.
When using TCP you don't need to keep a proxy/session open just to keep the connection open. A TCP connection is not closed immediately when you close the proxy; it remains open in a pool for a short time, and if another proxy needs a connection to the same server it is reused (the default idle timeout in the pool should be 2 minutes). I discussed the Net.Tcp transport in WCF in another answer.
I'm not a fan of callbacks; the whole concept is overused and abused in WCF. Keeping 10,000 TCP connections open for months just in case you need to send data back to a few PCs sounds ridiculous. If you need to communicate with a PC, expose a service on the PC and call it when you need to send commands. Just add functionality so that the PC calls the server when it starts and when it is about to shut down, plus the transfer of monitoring information.
Anyway, 10,000 PCs sending information every minute can mean 10,000 requests arriving at the same time, which can have the same effect as a denial-of-service attack. Depending on the processing time, your server(s) may not be able to keep up and many requests will time out. You could also think about message queuing or publish-subscribe protocols: messages are passed to a queue or topic, and the server(s) process them continuously.
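As one possible shape for that queuing idea, a fire-and-forget MSMQ sketch (the queue path and message format are mine): the client never waits on the server, and the server drains the queue at its own pace.

    using System;
    using System.Messaging; // reference System.Messaging.dll

    static class StatusSender
    {
        const string QueuePath = @".\Private$\monitoring"; // hypothetical queue

        public static void SendStatus(string machine, string state)
        {
            if (!MessageQueue.Exists(QueuePath))
                MessageQueue.Create(QueuePath);

            using (var queue = new MessageQueue(QueuePath))
            {
                // A durable, one-way drop; no session, no long-lived TCP
                // connection held open by the client.
                queue.Send(machine + ";" + state + ";" +
                           DateTime.UtcNow.ToString("o"));
            }
        }
    }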