How to diagnose multiple unexpected client connections/disconnections in Blazor server-side - C#

I have a Blazor server-side app deployed on around 100 sites, and it generally works pretty well.
But on a very few sites, end users complain about micro disconnections/reconnections in the app (they see the components-reconnect-modal for a few seconds).
I logged some information in the OnConnectionUpAsync and OnConnectionDownAsync overrides of the CircuitHandler to see what happens, and indeed the logs show clients frequently disconnecting for one, two or three seconds. But I don't have the reason.
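The handler itself is nothing more than a CircuitHandler that logs both events, roughly like this (simplified sketch; the actual log messages differ):

    using System;
    using System.Threading;
    using System.Threading.Tasks;
    using Microsoft.AspNetCore.Components.Server.Circuits;
    using Microsoft.Extensions.Logging;

    public class LoggingCircuitHandler : CircuitHandler
    {
        private readonly ILogger<LoggingCircuitHandler> _logger;

        public LoggingCircuitHandler(ILogger<LoggingCircuitHandler> logger)
        {
            _logger = logger;
        }

        // Fired when the SignalR connection backing a circuit is (re)established.
        public override Task OnConnectionUpAsync(Circuit circuit, CancellationToken cancellationToken)
        {
            _logger.LogInformation("Connection UP for circuit {CircuitId} at {Utc}", circuit.Id, DateTime.UtcNow);
            return Task.CompletedTask;
        }

        // Fired when the SignalR connection backing a circuit is lost.
        public override Task OnConnectionDownAsync(Circuit circuit, CancellationToken cancellationToken)
        {
            _logger.LogInformation("Connection DOWN for circuit {CircuitId} at {Utc}", circuit.Id, DateTime.UtcNow);
            return Task.CompletedTask;
        }
    }

    // Registered in ConfigureServices (scoped = one handler instance per circuit):
    // services.AddScoped<CircuitHandler, LoggingCircuitHandler>();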
There are on average somewhere between 50 and 100 end users connected to the app at the same time on this site.
I really don't know where to start investigating this case. The client's server is well sized, runs a recent version of Windows Server, and the configuration is OK.
I don't have any specific configuration or parameters on the app.
I would suspect the problem is on my client's network (so not specific to my app, e.g. the network dropping for 3 seconds), but the client assures me their network works very well and that they don't have any problems with their other apps.
Do you have any idea what I can do to get more information on this problem?
Do you know of any tools which can test the network and show potential problems?
Thanks in advance.

Related

Huge amount of CLOSE_WAIT connections after .NET Core web app recycled on IIS behind Windows NLB

I've been trying to find the root cause of this issue for weeks without success; I hope someone will have new ideas on where to look.
We have an in-process .NET Core application running on Windows Server 2012 R2 with IIS 8.5 (.NET Core hosting bundle 2.2.6).
This application is duplicated on 2 servers sharing the same IP through Windows NLB.
Both machines are physical, with recent hardware and a 1 Gb link.
The traffic handled by the application is IMHO not high at all (300 requests per minute!).
However, at random times when the IIS application pool is recycling, the application is not able to send the final ACK of the TCP close sequence, so connections pile up in CLOSE_WAIT and their number grows exponentially (see the CLOSE_WAIT comparison in the traffic picture of the 2 servers at the time of the issue).
At some point the server is not able to serve anything anymore.
This ends either when the default Windows keep-alive timeout kicks in or when we manually recycle the app pool again.
Looking at the TcpView tool at that time, the CLOSE_WAIT connections are linked to the System process, so we are not able to tell whether it is the new worker process or the closing one that triggered the issue.
What I don't understand is why the app is suddenly unable to ACK the connections that end up in CLOSE_WAIT.
Is it because of the "VIP" (NLB) shared by the servers? To us, the configuration and heartbeat traffic seem OK.
Something wrong with IIS and the .NET Core in-process model? I didn't see anyone else complaining about that.
An HttpClient definition in the code done wrongly? The devs had a look and didn't find anything wrong there either (the pattern we checked for is sketched after this question).
Is the application not able to send the ACK, or not able to detect the FIN/ACK? It's quite hard to tell when the CLOSE_WAIT count increases so quickly, even with a Wireshark dump on a big server.
Our mitigation was to reduce the Windows keep-alive time and to schedule recycling at very low traffic times, but I would like to know if anyone has any other ideas on where to dig, or has had the same issue...
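For reference, the kind of wrong HttpClient definition we were checking for is the per-request pattern below; the shared-client version is what we expect to see instead (class, method and endpoint names are illustrative, not our actual code):

    using System.Net.Http;
    using System.Threading.Tasks;

    public class StatusClient
    {
        // Anti-pattern we audited for: a new HttpClient per call. Each instance gets its
        // own connection pool, so under load sockets pile up instead of being reused.
        public async Task<string> GetStatusPerCallAsync()
        {
            using (var client = new HttpClient())
            {
                return await client.GetStringAsync("https://example.com/status");
            }
        }

        // Preferred: one shared, long-lived HttpClient (or IHttpClientFactory on
        // .NET Core 2.1+), so connections are pooled and reused across requests.
        private static readonly HttpClient SharedClient = new HttpClient();

        public Task<string> GetStatusSharedAsync()
        {
            return SharedClient.GetStringAsync("https://example.com/status");
        }
    }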

Application switched to live URL causes excessive DB usage

Very very strange issue here... Apologies in advance for the wall of text.
We have a suite of applications running on an EC2 instance, all connecting to an RDS instance.
We are hosting the staging and production applications on the same EC2 server.
With one of the applications, as soon as the staging app is moved to prod, around 250 connections to the DB are opened, causing the RDS instance to max out CPU usage and making the entire suite slow down. The staging application itself does not have this issue.
The issue can be replicated both by deploying the app via our Octopus setup and by physically copy-pasting the BIN/Views folders from staging to live.
The connections are instant, boosting the CPU usage to 99% in less than a minute.
Things to note...
Running the query from "how to see active SQL Server connections?" will show the bulk of the connections, none of which have a LoginName.
Resource Monitor on the FE server will list the connections, all coming from IIS, seemingly using every outbound port while attempting to connect to the DB server on its port. (FE server address and DB server address are blacked out respectively; this is only a snippet of all of the connections.)
The app needs users to log in to perform 99.9% of tasks. There is a public "Forgot your password" method that was updated to accept either a username or password. No change to the form structure or form action URL, just an extra check on the back end.
Other changes were around how data was to be displayed and payment restrictions under certain conditions. Both of which require a login.
Things I've tried...
New app pools
Just giving it a few days to forget this ever happened
Not using Octopus to publish
Checking all areas that were updated between versions to see if a connection was not closed properly (a sketch of the pattern we audited for follows this list).
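What we looked for was simply whether every ADO.NET call wraps its connection, command and reader in using blocks so connections go back to the pool; the query, table and names below are invented, not our real code:

    using System.Data.SqlClient;

    public static class QueryExample
    {
        public static void Run(string connectionString)
        {
            // Connection, command and reader are all disposed even if an exception is thrown,
            // so the connection is returned to the pool instead of being leaked.
            using (var connection = new SqlConnection(connectionString))
            using (var command = new SqlCommand("SELECT Id, Name FROM dbo.SomeTable", connection))
            {
                connection.Open();
                using (var reader = command.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        // ... map rows ...
                    }
                }
            }
        }
    }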
Really at a loss as to what is happening. This is the first time that I've seen something like this. Especially strange that staging is fine, but the same app on another URL/Connection string fails so badly.
The only thing I can think of would be some kind of scraper polling the public form, but that makes no sense, as why isn't it happening with the current app...
Is there something in AWS that can monitor the calls that are being made? I vaguely remember something in New Relic being able to do so.
Any suggestions and/or similar experiences are welcomed.
Edits.
Nothing outstanding in logs for the day of the issue (yesterday)
No incoming traffic to match all of the outbound requests
No initialisation is performed by the application on startup
Update...
We use ADO for most of our queries. A query was updated to get data from different tables. The method name and parameters were not changed, just the body of the query. If I use sys.dm_exec_sql_text to see what is getting sent to the DB, I can see that it IS the updated query being sent in each of the hundreds of connections. They are all showing as suspended though... Nothing has changed in regards to how that query is sent to the server, just the body of the query itself...
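For anyone wanting to do the same check, it was along these lines; the exact query shape and the helper below are illustrative rather than a copy of our code, and can equally be run from SSMS:

    using System;
    using System.Data.SqlClient;

    public static class ActiveRequests
    {
        // Lists the SQL text and status of every active request, skipping system sessions.
        public static void Dump(string connectionString)
        {
            const string dmvQuery = @"
                SELECT r.session_id, r.status, r.wait_type, t.text
                FROM sys.dm_exec_requests r
                CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
                WHERE r.session_id > 50;";

            using (var connection = new SqlConnection(connectionString))
            using (var command = new SqlCommand(dmvQuery, connection))
            {
                connection.Open();
                using (var reader = command.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        Console.WriteLine("{0}\t{1}\t{2}\t{3}",
                            reader["session_id"], reader["status"], reader["wait_type"], reader["text"]);
                    }
                }
            }
        }
    }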
So, one of the other queries that was published in the update broke it. We reverted only that query and deployed a new version, and it is fine.
Strangely enough, it's a query that is run in one form or another across the entire suite, but it only died under any sort of load that wasn't staging, which is why I assumed it would be the last place to look.

EWS intermittent "Unable to connect to the remote server" - Works 99% of the time

I have a C# app that exports data from the database to the user's Exchange Calendar, using Microsoft.Exchange.WebServices (15.0.0.0). My app targets .NET 4.5.2 and I use Visual Studio 2013.
The app has been working fine for a year, and even now works 95% of the time. The app runs constantly on a machine that I am logged into and monitor. This is a low-volume process; perhaps 50 items get exported per day at most.
Every couple of days, the program will give this error:
"The request failed. Unable to connect to the remote server"
when the app is attempting to create an EWS connection to the server.
It will do this for maybe 10-20 items. So I shut it down, run it again, and it works perfectly fine on all the records that failed before.
This is my first EWS app; I've been programming for 30+ years but have fairly little knowledge of internet-based apps.
Any helpful information or suggestions would be greatly appreciated!
Thank you kind people!
The internet often works, but should never be considered reliable. Anything using it should handle errors and retry if desired. The classic approach is exponential backoff. It's possible something nasty is going on, like your ISP swapping your IP address, or it may just be intermittent failure. I don't know anything EWS-specific, but there may be sources of flaky issues there as well.
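A minimal sketch of that retry-with-backoff idea in C# (the helper name, retry count and delays are just examples; narrow the catch to the transient exceptions you actually see from EWS):

    using System;
    using System.Threading.Tasks;

    public static class Retry
    {
        // Retries the operation with exponentially increasing delays: 1s, 2s, 4s, 8s, ...
        public static async Task<T> WithBackoffAsync<T>(Func<Task<T>> operation, int maxAttempts = 5)
        {
            for (var attempt = 1; ; attempt++)
            {
                try
                {
                    return await operation();
                }
                catch (Exception)
                {
                    // Ideally catch only the transient exception types you actually see.
                    if (attempt >= maxAttempts)
                    {
                        throw;
                    }
                    await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt - 1)));
                }
            }
        }
    }

    // Usage (ExportItemToCalendar is a stand-in for whatever EWS call intermittently fails):
    // var result = await Retry.WithBackoffAsync(() => Task.Run(() => ExportItemToCalendar(record)));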

ASP.NET 2.0-4.0 web applications experiencing extremely slow initial start-up

(Sorry if this is a really long question; it said to be specific.)
The company I work for has a number of sites, which have been running for some time with no problems. The applications are a mix of ASP.NET 2.0, 3.5, and 4.0, all using ADO.NET to connect to a SQL Server Standard instance (on the same web server), all hosted on IIS 7.
The problem began when we moved to an upgraded web server. We made every effort to set up the server, DB instance and IIS with the exact same settings (except for the different machine name, and the fact that we had upgraded from SQL Express to Standard), and as far as we could tell, we did. Both servers are running Windows Server 2008 R2 (all current updates applied) and received a default install.
The problem is very apparent when starting up one of these applications. When you reach the login page of our application, the page itself loads extremely fast. This is true even when you load the page from a new machine that could not possibly have the page cached, with IIS caching disabled. The problem actually becomes visible when you enter your login information and click the login button. Because of the (not great) design of our databases, the login process must access a number of databases, theoretically up to 150 separate DBs, but in practice usually 2. The problem occurs even when only 2 databases (the minimum) are opened. Not a great design, but we have to live with it for now.
When trying to initially open a connection to the database, the entire process stops for about 20 seconds every time, regardless of whether you are connecting to 2 DBs or 40. I have run a .NET profiler (JetBrains dotTrace) against the process, and the only information I could take from it was that one or all of the calls to SqlConnection.Open() accounted for 90% of the time. This only happens on first use of the application, but the problem is compounded by the fact that IIS seems to disregard the recycling settings we have set for it, and recycles the application after a few minutes of idle, causing the problem to occur again.
I also tried to use SQL Server Profiler to see which database operations were the cause of the slowdown, but because of all the other DB activity (and the fact that I had to do this on our production server, because the problem doesn't occur in our test environments) I couldn't pin down the exact operation that was causing the stoppage. I will try coming in late at night and shutting down the production sites to run the SQL profiler, but I might not be able to do this right away.
In the course of researching the problem, I have tried a couple of solutions:
Thinking it might be a name resolution problem, I tried both modifying the hosts file on the web server and giving the connection strings an IP address instead of the server name to resolve, with no difference. I have heard of the LLMNR protocol causing problems like this, but I think trying to connect by IP or resolving with the hosts file should have eliminated that possibility, though I admit I never tried actually turning off LLMNR.
I have increased the idle timeouts, recycling intervals etc in IIS, but this doesn't even seem to be respected, much less solving the problem. This leads me to believe there is a setting overriding the IIS application settings on the machine.
Multiple other code fixes, none of which made any difference. Is a SQL Server setting causing the problem?
Other stuff that I've forgotten by now.
Any ideas, experience or whatevers would be greatly appreciated in helping me solve this problem!
I would advise using a non-TCP connection if you are still running the SQL instance on the local machine. SQL Server supports several protocols; TCP, named pipes, and shared memory are the most common.
Named Pipes
Data Source=np:computer\instance
Shared Memory
Data Source=lpc:computer\instance
Personally I prefer shared memory. Remember that you need to enable these protocols, and to avoid configuration mistakes I suggest you disable all the ones you are not using.
See http://msdn.microsoft.com/en-us/library/ms187892.aspx
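For example, plugged into a standard SqlConnection (server, instance and database names below are placeholders):

    using System.Data.SqlClient;

    class SharedMemoryExample
    {
        static void Main()
        {
            // lpc: forces the shared-memory protocol, which skips TCP and name resolution entirely.
            var connectionString =
                "Data Source=lpc:MYWEBSERVER\\SQL2008;Initial Catalog=MyAppDb;Integrated Security=True";

            using (var connection = new SqlConnection(connectionString))
            {
                connection.Open();
                // ... run queries as usual ...
            }
        }
    }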
IIS Reset
In IIS 7 there are two ways to configure the idle timeout. Both begin by clicking on the "Application Pools" section and right-clicking the appropriate application pool. If you click the "Recycling..." option there is one setting. The other is in "Advanced Settings...": under the "Process Model" section you will find "Idle Time-out (minutes)", which when set to zero disables the process timeout. This latter option is the one that works for us.
If I were you I'd solve this problem first as restarting the appdomain and/or worker process is always painful even if you don't have a 20 second lag.
Some ideas:
from the web server, can you ping the db server and get a "normal" response, or are you seeing a similar delay?
if you're seeing a delay, run a tracert to see if you can nail down where the slowness is occurring
try using a tool like QueryExpress (http://www.albahari.com/queryexpress.aspx) which doesn't require an install to run. You can download this EXE and run it from your web server. See if you can connect to your db using this and run queries in a normal fashion.
Try something like SysInternals' TcpView (http://technet.microsoft.com/en-us/sysinternals/bb897437) to take a look at your open connections and see what activity is happening on your server and how much data is being sent to and received from your db server.
Just some initial thoughts on where I'd start to look based upon your problem description. I hope this helps. Good luck with things!
With IIS not respecting recycling settings: did restarting IIS/rebooting change the behavior?

What sort of web host lets you run crawlers on it?

I'm working on a graduation project for one of my university courses, and I need to find somewhere to run several crawlers I wrote in C#. With no web hosting experience, I'm a bit lost. Is this something that any host allows? Do I need a special host that gives more access to the server? The crawler is a simple app that does its work, then periodically writes information to a remote database.
A web crawler is a simulation of a normal user. It accesses sites like browsers do, getting the HTML code (JavaScript, etc.) returned from the server (so there is no internal access to server code). That being so, any site can be crawled.
Be aware of web crawler ethics guidelines. There are pages you shouldn't index or whose links you shouldn't follow, and web developers publish files and instructions for web crawlers (such as robots.txt) saying what you can index or follow.
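To illustrate: at its core a crawler is just an HTTP GET plus link extraction, something like the sketch below (the URL is a placeholder, and a real crawler should also honour robots.txt and throttle itself):

    using System;
    using System.Net;
    using System.Text.RegularExpressions;

    class MinimalCrawler
    {
        static void Main()
        {
            using (var client = new WebClient())
            {
                // Fetch the page exactly as a browser would; we only ever see the returned HTML.
                var html = client.DownloadString("https://example.com/");

                // Pull out absolute links to visit next (a naive regex, fine for a sketch).
                foreach (Match link in Regex.Matches(html, "href=\"(https?://[^\"]+)\""))
                {
                    Console.WriteLine(link.Groups[1].Value);
                }
            }
        }
    }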
If you can't run it off your desktop for some reason, you'll need a host that lets you execute arbitrary C# code. Most cheap web servers don't do this due to the potential security implications, since there will be several other people running on the same server.
This means you'll need to be on a server where you have your own OS. Either a VPS - Virtual Private Server, where virtualization is used to give you your own OS but share the hardware - or your own dedicated server, where you have both the hardware and software to yourself.
Note that if you're running on a server that's shared in any way, you'll need to make sure to throttle yourself so as to not cause problems for your neighbors; your primary issue will be not using too much CPU or bandwidth. This isn't just for politeness - most web hosts will suspend your hosting if you're causing problems on their network, such as denying the other users of the hardware you're on resources by consuming them all yourself. You can usually burst higher usage levels, but they'll cut you off if you sustain them for a significant period of time.
This doesn't seem to have anything to do with web hosting. You just need a machine with an internet connection and a database server.
I'd check with your university if I were you. At least in my time, a lot was possible to arrange in-house when it came to graduation projects.
Failing that, you could look into a simple VPS (Virtual Private Server) account. Unless you are sure your app runs under Mono, you will need a Windows one. The resource limits are usually a lot lower than you'd get from a dedicated server, but they're relatively affordable. Some will offer an MS SQL Server database you can use alongside the VPS account (on another machine). Installing SQL Server on the VPS itself can be a problem license-wise.
Make sure you check the terms of usage before you open an account, as well as the (virtual) system specs though. Also check if there is some kind of minimum contract period. Sometimes this can be longer than a single month, especially if there is no setup fee.
If at all possible, find a host that's geographically close to you. A server on the other side of the world can get a little annoying to access remotely using Remote Desktop.
80legs lets you use their crawlers to process millions of web pages with your own program.
The rates are:
$2.00 per million pages
$0.03 per CPU-hour
They claim to crawl 2 billion web pages a day.
You will need a VPS (Virtual Private Server) or a full-on dedicated server. Crawlers are nothing more than applications that "crawl" the internet. While you could set up a web site to be a crawler, it is not practical because the web page would have to be accessed for your crawler to work. You will have to read the ToS (Terms of Service) of the host to see what the terms of usage are. Some of the lower-priced hosts will cut your connection with a reason of "negatively impacting the network" if you try to use too much bandwidth, even though they have given you plenty to use.
VPSs run around $30-80 for a Linux server and $60+ for a Windows server.
Dedicated servers run $100+ for both Linux and Windows.
You don't need any web hosting to run your spider. Just ask for a PC with a web connection that can act as a dedicated server, configure the database, and run the crawler from there.
