I am trying to solve the problem of very long response times from MySQL when opening a connection using the MySQL Connector for .net.
I have installed MySQL 5.5 running on an Azure VM (Server 2008) with --skip-name-resolve, and the database user accounts' host restrictions are using IP addresses. I am using the latest MySQL Connector for .net in my WCF service running on Azure (in the same location, US East; I have been using a trial subscription, no affinity set). My connection string in the WCF service is using the internal IP address of the VM hosting MySQL as the server parameter value. I also have "pooling = true;Min Pool Size=2;" just in case (I have tried without these parameters too).
When tracing the WCF service, the query response times once the service is running and processing requests are pretty good (even where each query result is unique and so not being cached), and I have no issues with the performance of MySQL provided it is getting hit frequently.
But the huge problem I haven't been able to crack is the length of time it takes to get the connection to MySQL open after no calls to the database have been made for about 3 or 4 minutes. If no database calls are made for a few minutes, it takes 8 or 9 seconds or more to open the connection again. I wrapped the actual "conn.Open();" call with trace statements before and after, and this is the behaviour I see logged time and time again after a few minutes of inactivity.
Incidentally, I have also tried (and still am using) the 'using' style of connection handling to ensure that the MySQL Connector is managing the connection pool.
e.g.:
using (var conn = new MySqlConnection(Properties.Settings.Default.someConnectionString)) { ... statements ..}
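For reference, here is a slightly fuller sketch of that pattern, including the trace calls around Open() mentioned above (the query and trace messages are placeholders rather than my real code):
// requires MySql.Data.MySqlClient and System.Diagnostics
using (var conn = new MySqlConnection(Properties.Settings.Default.someConnectionString))
{
    Trace.WriteLine("Before Open: " + DateTime.UtcNow.ToString("O"));
    conn.Open();   // this is the call that takes 8-9 seconds after a few minutes of inactivity
    Trace.WriteLine("After Open: " + DateTime.UtcNow.ToString("O"));

    using (var cmd = new MySqlCommand("SELECT 1", conn))   // placeholder query
    {
        cmd.ExecuteScalar();
    }
}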
I feel like I have reached a dead end on this one so any suggestions would be greatly appreciated.
I can explain why what you describe happens: "the length of time it takes to get the connection to MySQL open after no calls to the database have been made for about 3 or 4 minutes. If no database calls are made for a few minutes, it takes 8 or 9 seconds or more to open the connection again."
Windows Azure Web Sites uses a concept of hot (active) and cold (inactive) sites: if a site has no active connections, it goes into a cold state, which means the host IIS process exits. When a new connection is made to that site, it takes a few seconds to get the site ready and working again. Because you have a MySQL backend associated with the site, it takes a few seconds longer still to get the request served, since the IIS host process has to start up first. That is the reason the response time is longer after a few minutes of inactivity.
You can see the following presentation for more details on Windows Azure Hot (active) and Cold (inactive) Websites:
http://video.ch9.ms/teched/2012/na/AZR305.pptx
At this time, I am not sure how you can keep the site always hot, whether moving to a shared website would help, or whether it is possible at all. What I can suggest is that you post your issue to the Windows Azure Web Sites forum, and someone from that team will provide an appropriate answer.
Related
I have a C# Windows Service running in approximately 500 locations. When the services start they initiate a connection with a remote SignalR hub running on a VM in AWS. The connection/tunnel is then used for sending information from the server to each location throughout the day.
This has been running fine for the past 18 months, but for the past week the remote services have been losing connection and have to be restarted to reinitiate the connection with the server. I'm using SignalR 2.4.1 which doesn't have the auto reconnection feature of the newer versions.
I've noticed that it seems to be happening about every 18 hours. Nothing has changed on the clients or on the server that anyone is aware of.
I'm looking for ideas on what could be causing this. Any input would be appreciated.
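In case it helps anyone suggest something, this is roughly the shape of the manual reconnect fallback I'm considering adding on the client side, since 2.4.1 won't reconnect by itself (the hub URL, hub name, and delay below are placeholders, not our real values):
// requires Microsoft.AspNet.SignalR.Client and System.Threading.Tasks
var connection = new HubConnection("https://example-hub.mycompany.com/signalr");   // placeholder URL
var hubProxy = connection.CreateHubProxy("LocationHub");                            // placeholder hub name

// SignalR 2.x raises Closed once it has given up its own retry cycle,
// so restart the connection ourselves after a short delay.
connection.Closed += () =>
{
    Task.Delay(TimeSpan.FromSeconds(30)).ContinueWith(_ => connection.Start());
};

connection.Start().Wait();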
I have a SQL Server with two databases, a production database and a development database. The .net 2.0 website hitting the production database with manual SqlConnection code is working fine. The other database is being hit from a newer ASP.NET MVC app using Entity Framework 6.2 and is getting timeout issues. The timeout takes 30 seconds the first time, but the page comes back almost instantaneously on subsequent refreshes. Both websites are on the same box as the database, so are only using "localhost" to connect. They are using SQL Server user logins, not Windows authentication.
I copied the .edmx and .tt files into a .net console app and that app has no problem hitting the database with the exact same linq query and pulling the same data that is failing.
I then created a new web site and copied just that same code into an aspx page. It fails the first time with a timeout, and then works on subsequent attempts (and a week ago, the main dev site was doing the same thing).
I separated the dev database from the SQL Server 2008 R2 server and attached it to a newly installed instance of SQL Server Express on a different port, and get the same results.
The web server is windows server 2008 standard 32-bit. I copied both websites and the console application to a new box (I thought was 2016, but it turns out it is 2008 standard 64-bit) and get the same results.
The dev site was working up until a couple of months ago. The client was using local user accounts for everything, but had a domain and wanted to do testing with windows authentication for an old vb app that hits the same database, and I had started migrating testing accounts to the domain. When the client tried to later, for an unrelated reason, change his password, we discovered that he was already using a domain account, but that his laptop could not connect to the domain. We found several other computers that could not connect, even though the machines I had connected to the domain during my testing were working fine. An outside network "friend" was brought in to figure out what was going on. At that point, I lost all track of what was actually done. I know that different network and domain configurations were tried and didn't fix the domain issues, but I don't know what. However, the production site was never rendered inoperative.
I have no idea what is going on. Does anyone else?
Oh, and in case it was a provider issue, I've also tried a manual connection using OleDbConnection from the web app, and it also fails with the timeout issue.
Update:
I spun up a new DataCenter 2016 box, installed IIS and .net on it and copied the website to that box. It has no problems hitting the database and pulling the data from the other server.
I know patches and such were updated on the original box while the domain and network were being manipulated, but I don't know how far behind they were. I suspect that some patch changed some default or inherited .net configuration options or something. I did do a "repair" on the .net installation, and that didn't make a difference. However, with the production site working fine, I'm not currently willing to uninstall .net or anything else. I'm afraid I would risk pushing this same error into the production site and the client would be screwed.
It seems that for some reason, the timeout period elapsed while attempting to consume the pre-login handshake acknowledgement.
Try increasing the connect timeout property in your connection string to 60 or more. Default is 15 (in seconds).
Example: Data Source=(LocalDB)\v11.0;Integrated Security=True;Connect Timeout=30
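If the connection string is built in code rather than in config, the same change can be sketched with SqlConnectionStringBuilder (the connection string name below is just an example, not from the question):
// requires System.Data.SqlClient and System.Configuration
var builder = new SqlConnectionStringBuilder(
    ConfigurationManager.ConnectionStrings["MyDb"].ConnectionString)   // example name
{
    ConnectTimeout = 60   // seconds; the default is 15
};

using (var conn = new SqlConnection(builder.ConnectionString))
{
    conn.Open();
}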
I've been trying to diagnose an issue pertaining to thousands of hung/stuck EndRequest requests in IIS. This is becoming a large problem for us as we're hitting the concurrent connection cap after about a week or two and have to recycle the whole application pool to clear the request list.
Because this is a live application, I have limited troubleshooting options, so anything that would halt or bring down the application pool I am not allowed to do.
IIS Information
Concurrent connection cap is set to its maximum of 65535.
Configuration debug in the web config is set to false and we have a timeout set at 110 seconds.
Windows Server 2012 R2 Version 6.2 (Build 9200)
IIS Version 8.5.9600.16384
The long running requests have 0 data transfer, checked with WireShark.
I'm pretty much at a loss on why these aren't timing out. I've set all the appropriate settings - the ones I could find from MSDN and other sources. We have a very, very hard time replicating this in our development environment, so it's been blind testing for the most part. I've found articles and such on other state hangs, but I cannot find anything on why a request in the EndRequest state will not time out.
Advanced Settings Page:
https://postimg.org/image/gxec32kmt/
Application Pool Requests Page:
https://postimg.org/image/qupcw57o5/
Web Config:
https://postimg.org/image/5xt4rh1xh/
Update 1
I did a bit of digging into our fallback that is supposed to close connections after an hour of no usage. We seem to currently have 10,153 sessions still active with a last active time of 3 days ago. I've stepped through this function quite a bit and it seems to be working as intended. It goes through the list of sessions, and any session over an hour of inactivity has its WebSocketHandler.Close() method called. However, it seems some sessions are refusing to close after the method is called. We have logging in place to tell us if any exceptions are being thrown during the run, but it seems as though it's running as expected.
This was my mistake. I was running against an old sessions data pull. A current pull of the session data shows no sessions running greater than their specified time. This means that WebSocketHandler.Close() was called on them and they were removed from our in-memory list.
Update 2
NETSTAT using netstat -s on pastebin: https://pastebin.com/embed_js/qBbZ4gJ1
Update 3
Correction to update 1. Can a connection close be called and fail? If so, then we're accidentally orphaning the reference to the connection in our server. I would still expect the IIS timeout to kick in, however; there must be some catch to how it collects requests.
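To rule that out, one option I'm looking at is changing the sweep so a failed Close() can't orphan the entry - something along these lines (the session list, properties, and logger are placeholder names for our own types, not the actual code):
foreach (var session in activeSessions.ToList())
{
    if (DateTime.UtcNow - session.LastActivityUtc < TimeSpan.FromHours(1))
        continue;

    try
    {
        session.Handler.Close();   // WebSocketHandler.Close()
    }
    catch (Exception ex)
    {
        // even if Close() throws, we still drop our reference below so the
        // session can't linger in the in-memory list forever
        logger.Warn("Close() failed for session " + session.Id, ex);
    }
    finally
    {
        activeSessions.Remove(session);
    }
}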
Very very strange issue here... Apologies in advance for the wall of text.
We have a suite of applications running on an EC2 instance, all connecting to an RDS instance.
We are hosting the staging and production applications on the same EC2 server.
With one of the applications, as soon as the staging app is moved to prod, over 250 or so connections to the DB are opened, causing the RDS instance to max out CPU usage and make the entire suite slow down. The staging application itself does not have this issue.
The issue can be replicated by both deploying the app via our Octopus setup, and also physically copy pasting the BIN/Views folder from staging to live.
The connections are instant, boosting the CPU usage to 99% in less than a minute.
Things to note...
Running the queries from "How to see active SQL Server connections?" will show the bulk connections, none of which have a LoginName.
Resource Monitor on the FE server will list the connections, all coming from IIS, seemingly scanning all outbound ports, attempting to connect to the DB server on its port. The FE server address and DB server address are blacked out respectively; this is only a snippet of all of the connections.
The app needs users to log in to perform 99.9% of tasks. There is a public "Forgot your password" method that was updated to accept either a username or password. No change to the form structure or form action URL, just an extra check in the back.
Other changes were around how data was to be displayed and payment restrictions under certain conditions. Both of which require a login.
Things I've tried...
New app pools
Just giving it a few days to forget this ever happened
Not using Octopus to publish
Checking all areas that were updated between versions to see if a connection was not closed properly.
Really at a loss as to what is happening. This is the first time that I've seen something like this. Especially strange that staging is fine, but the same app on another URL/Connection string fails so badly.
The only thing I can think of would potentially be some kind of scraper polling the public form, but that makes no sense, as why isn't it happening with the current app...
Is there something in AWS that can monitor the calls that are being made? I vaguely remember something in NewRelic being able to do so.
Any suggestions and/or similar experiences are welcomed.
Edits.
Nothing outstanding in logs for the day of the issue (yesterday)
No incoming traffic to match all of the outbound requests
No initialisation is performed by the application on startup
Update...
We use ADO for most of our queries. A query was updated to get data from different tables. The method name and parameters were not changed, just the body of the query. If I use sys.dm_exec_sql_text to see what is getting sent to the DB, I can see that it IS the updated query that is being sent in each of the hundreds of connections. They are all showing as suspended though... Nothing has changed in regard to how that query is sent to the server, just the body of the query itself...
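For anyone wanting to repeat that check, this is roughly the lookup I ran, wrapped in C# here (the admin connection string is a placeholder):
// requires System.Data.SqlClient
using (var conn = new SqlConnection(adminConnectionString))   // placeholder connection string
using (var cmd = conn.CreateCommand())
{
    conn.Open();
    cmd.CommandText = @"
        SELECT r.session_id, r.status, r.wait_type, t.text
        FROM sys.dm_exec_requests r
        CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
        ORDER BY r.session_id;";

    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // shows the suspended sessions and the exact statement each one is running
            Console.WriteLine("{0}\t{1}\t{2}\t{3}",
                reader["session_id"], reader["status"], reader["wait_type"], reader["text"]);
        }
    }
}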
So, one of the other queries that was published in the update broke it. We reverted only that query and deployed a new version, and it is fine.
Strangely enough, it's one that is being run in one form or another over the entire suite. But just died under any sort of load that wasn't staging, which is why I assumed it would be the last place to look.
I have developed a standard web application to allow users to display and update a set of data from an SQL database.
The Web Application uses an AngularJS client side which interacts with the Web Server via MVC Web API calls to retrieve and update data on the database.
The Server side code is written in C# using .NET 4.5 and uses Entity Framework v6.0 to access the database.
The Web Application is hosted in an Azure Web App.
The Database is an Azure SQL Database.
The issue is that when the Application has not been used for about 10-15 minutes and is then used again, the first data retrieval often takes over 10 seconds to return to the browser. After that the performance is fine until the next time the application is left unused.
I've put trace in the application and we see that the delay is when the connection opens. The actual query on the database runs sub-second.
I've noticed though that with different hosting configurations I get different results. In particular hosting in house and pointing to the Azure database does not encounter anywhere near the same delays.
I've changed one of the routines to use ADO.NET instead of Entity Framework and changed the trace to try to narrow it down further.
What I see is this:
ConnectionStringSettings ADOcnxstring = ConfigurationManager.ConnectionStrings["DevFEConnectAdo"];
DbConnection ADOconnection = new SqlConnection(ADOcnxstring.ConnectionString);
ADOconnection.Open();
The delay is here, on the Open() call (before the SQL has even been defined)!
and then I build the command and do the DataReader etc:
DbCommand ADOcommand = ADOconnection.CreateCommand();
:
etc
So the delay is on opening the Connection to the database.
My connection string is standard:
<add name="DevFEConnectAdo" connectionString="data
source=feeunsqldevfeconnect.database.windows.net;initial
catalog=feeunsqldbdevfeconnect;persist security info=True;user id=???
#???;password=???;multipleactiveresultsets=True"></add>
15 minutes is too short for your app to be recycled (as suggested by CSharpRocks). I don't think that's the issue here.
The delay is because a new DB connection is established upon the first call after the idle timeout. Typically, if a connection is inactive for 4-10 minutes it will be closed. If a minimum pool size is specified, those connections will be kept alive even after the idle timeout expires.
Try using this connection string (adjust min pool size as per your needs)
<add name="DevFEConnectAdo" connectionString="data
source=feeunsqldevfeconnect.database.windows.net;initial
catalog=feeunsqldbdevfeconnect;persist security info=True;user id=???
#???;password=???;multipleactiveresultsets=True;Min Pool Size=3;Load Balance Timeout=180;"></add>
Further details
Why do we need to set Min pool size in ConnectionString
List of SQL Connection Properties - documentation
After some time, this eventually got resolved with some help from Microsoft Azure support.
The detail that I left out was that my Web App was actually pointing to 2 databases:
- the Application Azure SQL database, which I was having the delay problem with
- a 'Data Warehouse' we had on an Azure Virtual Machine
Because of replication between in-house database servers and the 'Data Warehouse', the Virtual Machine and Web App were both in an Azure Virtual Network.
The problem was that there can be network problems when a Web App inside a Virtual Network wants to talk to an Azure SQL Database (which cannot be within a Virtual Network).
My solution was to
configure an Endpoint on the Data Warehouse Virtual Server,
take the Web App out of the Virtual Network and make it point to the Virtual Server by means of the Endpoint.
At this point all the delays went away and I could take off the Min Pool Size setting (and the Timeout, which I later discovered did nothing anyway).
Web apps are recycled after a few minutes of inactivity. Try enabling the Always On setting located in Settings/Application Settings in the portal to see if this helps with your issue.