Trace messages on StackExchange Redis client - c#

We are using this StackExchange Redis C# client and occasionally experiencing errors such as this:
StackExchange.Redis.RedisConnectionException: No connection is
available to service this operation: EXISTS
OnDemand:ExportDocument:Subscription:f3d45517-c26e-4e99-82c0-5c532a68081b
at
StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message
message, ResultProcessor`1 processor, ServerEndPoint server) in
c:\TeamCity\buildAgent\work\3ae0647004edff78\StackExchange.Redis\StackExchange\Redis\ConnectionMultiplexer.cs:line
1734
Looking through the code, it appears that we can enable some sort of verbose tracing to better understand what is happening under the hood. I looked through the configuration section of the documentation and found no mention of tracing there.
Does anyone know how to enable tracing on this client?

ConnectionMultiplexer.Connect() optionally accepts a TextWriter for logging.
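As a minimal sketch of that answer: the overload ConnectionMultiplexer.Connect(string configuration, TextWriter log) writes connection diagnostics to whatever writer you pass in. The endpoint localhost:6379, the class name RedisTraceExample, and the choice of a StringWriter are assumptions for illustration. abortConnect=false is used so that Connect returns even when the server is unreachable and the log can still be inspected.

```csharp
using System;
using System.IO;
using StackExchange.Redis; // NuGet package: StackExchange.Redis

public static class RedisTraceExample
{
    // Connects with a log writer attached and returns the captured connection log.
    public static string ConnectAndCaptureLog(string configuration)
    {
        using (var log = new StringWriter())
        {
            // The second argument receives the client's connection diagnostics.
            using (ConnectionMultiplexer muxer = ConnectionMultiplexer.Connect(configuration, log))
            {
                log.WriteLine("IsConnected: " + muxer.IsConnected);
            }
            return log.ToString();
        }
    }

    public static void Main()
    {
        // Hypothetical endpoint; replace with your own connection string.
        Console.WriteLine(ConnectAndCaptureLog("localhost:6379,abortConnect=false"));
    }
}
```

In a real application you would more likely pass a writer backed by your logging framework, or Console.Out while debugging.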

Related

Stackexchange.Redis timeouts & socketfailures

I am using Azure Redis (using StackExchange.Redis) as a cache store and it's generally working fine. But I get timeout errors now and then, and I can't pin down why.
My redis connection settings:
value="dev.redis.cache.windows.net,ssl=true,password=secret,abortConnect=false,syncTimeout=3000"
I am getting all of these exceptions in the same second (across multiple calls). [I get these on GET operations as well. Almost all of the exceptions are on StringSet and StringGet; I rarely get exceptions on HashSets or HashGets.]
Timeout performing SET {key}, inst: 1, mgr: ExecuteSelect, queue: 6, qu=0, qs=6, qc=0, wr=0/0, in=0/0
SocketFailure on SET
SocketFailure on SET
No connection is available to service this operation: SET
My guess is that setting the object is taking longer than expected, perhaps because the object is large. I could increase syncTimeout, but would that just hide some other problem?
I only get these exceptions on synchronous calls to StackExchange.Redis; I have not seen one when the call is asynchronous.
Stacktrace:
StackExchange.Redis.RedisConnectionException: SocketFailure on SET
at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) i
at StackExchange.Redis.RedisBase.ExecuteSync[T](Message message, ResultProcessor`1 processor, ServerEndPoint server)
at StackExchange.Redis.RedisDatabase.StringSet(RedisKey key, RedisValue value, Nullable`1 expiry, When when, CommandFlags flags)
at calling method
Edit: I am using StackExchange.Redis 1.0.414 package and I am using MessagePack to serialize my objects
Timeouts are typically caused by one of a few things. Here are some examples:
- Client or server CPU hitting 100%
- Poorly configured ThreadPool settings, combined with bursts of traffic
- Clients sending expensive commands to the server
- Maxing out your network bandwidth (on the client or on the server)
Tips for Client side issues: https://gist.github.com/JonCole/db0e90bedeb3fc4823c2
Tips for server side issues: https://gist.github.com/JonCole/9225f783a40564c9879d
I would also recommend upgrading to a newer version of StackExchange.Redis. Version 1.1.603 has more detailed diagnostic info in the timeout error message, which may help you identify some of the common client-side issues I listed above.
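One of the causes listed above is a ThreadPool whose minimum thread count is too low for bursts of traffic. A minimal sketch, assuming the standard System.Threading.ThreadPool API and a caller-chosen target (the helper name EnsureMinThreads and any specific number you pass are hypothetical; tune the value against your own workload), raises the minimums at application startup:

```csharp
using System;
using System.Threading;

public static class ThreadPoolTuning
{
    // Raises the minimum worker and I/O completion thread counts to at least
    // `desiredMinimum`, never lowering a value that is already higher.
    // Returns true if the runtime accepted the new settings.
    public static bool EnsureMinThreads(int desiredMinimum)
    {
        ThreadPool.GetMinThreads(out int worker, out int iocp);
        int newWorker = Math.Max(worker, desiredMinimum);
        int newIocp = Math.Max(iocp, desiredMinimum);
        return ThreadPool.SetMinThreads(newWorker, newIocp);
    }
}
```

Calling ThreadPoolTuning.EnsureMinThreads(...) once during startup is usually enough. A higher minimum lets the pool add threads immediately during a burst instead of throttling their creation.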
As for Socket failures, a couple of common causes for connection drops between the client and server that I have seen are:
Scaling the client - I have seen brief client side connectivity issues when scaling client apps in Azure.
When Redis is patched, there will be some connection blips. Azure Redis patching is explained here: https://gist.github.com/JonCole/317fe03805d5802e31cfa37e646e419d
Please check the port number on which you are running Redis. In my case I had configured port 6359, but the actual port number was 6379.

display wcf data - The underlying connection was closed:

I'm trying to implement a simple WCF service. I think that my server-side and client-side endpoints are set correctly.
On debugging I can see that my service returns data properly, but when it comes to displaying it on screen (a simple console application is the client of the service) it says:
The underlying connection was closed: A connection that was expected
to be kept alive was closed by the server.
using System;
using System.Configuration;
using System.ServiceModel;

class Program
{
    static void Main(string[] args)
    {
        // Name of the client endpoint configuration, read from appSettings
        string endPoint = ConfigurationManager.AppSettings["BookServiceActiveEndPoint"];
        IBookService proxy = new ChannelFactory<IBookService>(endPoint).CreateChannel();
        // This is where it breaks
        Console.WriteLine(proxy.GetBookDetails("TestBookTitle"));
        Console.ReadLine();
    }
}
Any ideas where to look for further info?
The WCF host is a website, and the solution has multiple startup projects:
- webservices.host (website)
- webservices.consoletests
This can be caused by a lot of problems, some I've encountered are:
Problems with serialization/deserialization
Endpoint configuration issues
The app pool has terminated (e.g., when a StackOverflowException is thrown)
And many more
So the best solution is to enable tracing and look at the trace logs. I won't try to explain something that's already well explained on the web, so I'll just give you a few links:
How to turn on WCF tracing?
http://msdn.microsoft.com/en-us/library/ms733025(v=vs.110).aspx
Usually, it is enough to enable tracing on your web server. However, there might be (very rare) cases where you won't find anything wrong in the server trace logs. In this case, you might also want to enable tracing in the client (same procedure, if you have a .NET client).

Error consuming web service: An existing connection was forcibly closed

I have a WinForms application written in C# that consumes web services from a Windows 2008 IIS ColdFusion server. All the web service calls succeed but one, which fails about 50% of the time with the following error:
System.InvalidOperationException was unhandled by user code
Message=There is an error in XML document (1254, 7).
with an inner exception of:
InnerException: System.IO.IOException
Message=Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.
I checked my IIS logs and I get a 503 error (Service Unavailable) and an IIS code of 64 (The specified network is no longer available). Any suggestions would be great.
I run my web service in SOAP UI and I get the following error:
javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Connection reset
This code works fine at one company but this error pops up almost every time for this company I'm currently working with.
I'm not sure this is applicable to the OP's specific situation, but this may help others who arrive here nowadays. One potential cause of this exception is mismatched security protocols. If the server you're calling requires TLS 1.2 and you're using an older version of ASP.NET (<= version 4.0), you will be using an older security protocol to make your calls unless you change it. You can force ASP.NET to use TLS 1.2 (shown below). This can be done anywhere in the application, but I put it just before the line that calls the web service requiring TLS 1.2:
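using System.Net;
...
// Enable TLS 1.2. The value 3072 corresponds to SecurityProtocolType.Tls12,
// which the enum does not define before .NET 4.5, hence the cast.
ServicePointManager.SecurityProtocol = (SecurityProtocolType)3072;
// Call the web service that requires TLS 1.2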
I recently got a similar message when consuming a WCF-Webservice. In my case it turned
out to be a configuration error on the server side. Maybe something is configured differently
on the one server where this happens to you?
My problem was that the default maximum message size was configured to be too small
on the server and this resulted in the same forced connection closing. There is a default
maximum message size to avoid DoS attacks...
If you are using a WCF client to connect to the service, enable service trace logging in your client application with the following config:
<system.diagnostics>
  <trace autoflush="true" />
  <sources>
    <source name="System.ServiceModel"
            switchValue="Error"
            propagateActivity="true">
      <listeners>
        <add name="sdt"
             type="System.Diagnostics.XmlWriterTraceListener"
             initializeData="ErrorTrace.svclog" />
      </listeners>
    </source>
  </sources>
</system.diagnostics>
Download the Windows SDK and you get a nice trace viewer (the Service Trace Viewer, SvcTraceViewer.exe) for these log files. It helps you get to the bottom of errors in WCF communication.
With cross-platform communication it sometimes happens (it once happened in my code) that the exception thrown does not really describe what is going on inside.
One cause of this exception is that your client timeout is slightly shorter than the time the web service method needs to complete. So try increasing the timeout in your app.config.
If that doesn't work, there are two other possible problems in your case:
- If SSL is used, there may be a problem with the validity of the SSL certificate.
- The XML contains invalid characters; for example, your platform doesn't support Unicode and the XML contains an unsupported character.
But I hope just increasing the timeout will fix this.
I got a similar error, and the cause was an exception during XML serialization. This mostly happens when the XmlSerializer reads a property whose getter throws, for example because a database connection is already closed or some other resource is unavailable.
Have you tried logging exceptions in the Error event inside global.asax?
Sometimes, if global.asax does not raise the Error event, the only way to log the error is through a response filter. You can add a custom response filter in web.config, which lets you analyze how much of the XML was serialized correctly and where it might be failing.
http://msdn.microsoft.com/en-us/library/aa479332.aspx
http://www.raboof.com/projects/elmah/
An intermittent "An existing connection was forcibly closed by the remote host" from only one destination sounds like a networking issue to me.
Try getting logs from the server you try to access and from the involved firewalls of both locations.
You may run Fiddler or NetMon / WireShark / Ethereal to diagnose further.
A connection can be closed under many circumstances. Make sure the timeout is generous on both the server and the client, and make sure there is no recursion (circular references) in the data you are returning. Serialization matters here because the data is serialized when it is returned.
Enable WCF tracing and check the answer there. Any fault on the server will close the connection. If the server requires credentials, make sure they are correct. Take care of the SSL error. Use a WCF client to test the service.
This may be a shot in the dark but here is my theory:
The first error is happening on the web service side, with a poor exception being thrown. Maybe some invalid data is being passed into the service? That could produce the error about the XML being malformed. I would run several test cases to see what data is being passed into the service and what causes the issue.
The second error I have seen before in one particular circumstance: a web service exception being thrown with a try-catch wrapped around a using statement for the service. This combination of logic caused an early exit that wasn't cleaned up.
Try checking the protocols in use at your previous company and comparing them with those at your current company (TCP/IP and so on).
Check the app pool recycling configuration in IIS. I have seen this error, for example, when the "Private Memory Limit" is set to a value (say 100mb) and then the w3wp process exceeds this limit which will cause the app pool to be recycled.
This normally isn't a problem since any existing connections are given time to complete and new connections will be processed by the newly spun-up app pool.
If all the connections are not closed within the shutdown time limit though (normally 90 seconds) then they are killed by IIS and the client may raise the "An existing connection was forcibly closed" error.

What should the client do while the TIBCO EMS server attempts failover?

The TIBCO EMS user's guide (pg 292) says:
The backup server will work indefinitely to either A) become the
primary server or B) reconnect to the primary server. It also says
clients may receive fail-over notification when the switch is successful (see also TIBCO EMS .NET reference pg 220).
I have some questions spinning off of these facts...
What kind of errors occur on the client side while the servers are attempting fail-over/reconnect?
What is the appropriate response from the client?
Get new Connection objects from the ConnectionFactory until one works?
Wait for fail-over notification? (are current Connection instances fixed at this time? or do I need to get a new instance?)
I hope the scenario is clear, any related information or advice would be appreciated too.
I can at least answer #1 above.
If you have enabled Tibems.SetExceptionOnFTSwitch(true); and have set up an exception handler to capture the messages the server sends to the client, you will see the following:
For single-server, non-fault tolerant connection failures:
"Connection has been terminated".
For fault-tolerant connection failures:
"Connection has performed fault-tolerant switch to "
If you attempt to publish while the connection is down, a TIBCO.EMS.IllegalStateException is thrown with the "Producer is closed" message.
For #2 above, I think the answer is to let the EMS library handle as much as possible. Once we got the EMS reconnect functionality to work, it kept trying to reconnect until the server became available again, and once it reconnected it was as if there had never been a problem. The only gotcha is probably trying to publish a message before the EMS connection is back. This is where the exception handler comes in: once notified that you are in failover mode, you can adjust exception handling on the publisher side to suppress the error until the connection is back. What I don't know is how to tell when all reconnect attempts have been exhausted.
Anyway, it seems our two worlds are closely related when it comes to EMS. I hope our findings (based on your comments on my questions) help you.
We use TEMS (TIBCO EMS, a TIBCO product for WCF), so it becomes a custom binding. We tried to break it by doing things like bouncing the server to force switchovers, and it works really well. Make sure you are using version 1.2, not 1.1, because otherwise you cannot do anything other than client acknowledgement.

How do you deal with transport-level errors in SqlConnection?

Every now and then in a high volume .NET application, you might see this exception when you try to execute a query:
System.Data.SqlClient.SqlException: A transport-level error has
occurred when sending the request to the server.
According to my research, this is something that "just happens" and not much can be done to prevent it. It does not happen as a result of a bad query, and generally cannot be duplicated. It just crops up maybe once every few days in a busy OLTP system when the TCP connection to the database goes bad for some reason.
I am forced to detect this error by parsing the exception message, and then retrying the entire operation from scratch, to include using a new connection. None of that is pretty.
Anybody have any alternate solutions?
I posted an answer on another question on another topic that might have some use here. That answer involved SMB connections, not SQL. However it was identical in that it involved a low-level transport error.
What we found was that in a heavy load situation, it was fairly easy for the remote server to time out connections at the TCP layer simply because the server was busy. Part of the reason was the defaults for how many times TCP will retransmit data on Windows weren't appropriate for our situation.
Take a look at the registry settings for tuning TCP/IP on Windows. In particular you want to look at TcpMaxDataRetransmissions and maybe TcpMaxConnectRetransmissions. These default to 5 and 2 respectively; try raising them a little on the client system and reproducing the load situation.
Don't go crazy! TCP doubles the timeout with each successive retransmission, so the timeout behavior for bad connections can go exponential on you if you increase these too much. As I recall upping TcpMaxDataRetransmissions to 6 or 7 solved our problem in the vast majority of cases.
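To see why that warning matters, here is a rough calculation. It assumes a simplified model in which the original send waits for an initial timeout and every retransmission waits twice as long as the previous attempt. The class name TcpBackoff and the 1-second initial timeout used in the note below are illustrative assumptions; the real initial value is derived from the measured round-trip time.

```csharp
using System;

public static class TcpBackoff
{
    // Total seconds spent waiting before the connection is abandoned, assuming
    // the original send plus `maxRetransmissions` retries, each waiting twice
    // as long as the previous attempt.
    public static double TotalWaitSeconds(double initialTimeoutSeconds, int maxRetransmissions)
    {
        double total = 0;
        double timeout = initialTimeoutSeconds;
        for (int attempt = 0; attempt <= maxRetransmissions; attempt++)
        {
            total += timeout;
            timeout *= 2;
        }
        return total;
    }
}
```

Under this model, with a 1-second initial timeout, 5 retransmissions add up to 63 seconds while 7 retransmissions add up to 255 seconds, so two extra retries roughly quadruple how long a dead connection lingers.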
This blog post by Michael Aspengren explains the error message "A transport-level error has occurred when sending the request to the server."
To answer your original question:
A more elegant way to detect this particular error, without parsing the error message, is to inspect the Number property of the SqlException.
(This actually returns the error number from the first SqlError in the Errors collection, but in your case the transport error should be the only one in the collection.)
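A minimal sketch of that idea follows. The specific numbers are assumptions, drawn from common Winsock and Win32 error codes that tend to surface for broken connections (for example 10054, "connection forcibly closed by the remote host"); log ex.Number in your own environment to confirm which values you actually see. The helper name TransportErrors is hypothetical, and it takes a plain int so that it does not depend on a particular SqlClient package.

```csharp
using System.Collections.Generic;

public static class TransportErrors
{
    // Assumed set of error numbers that indicate a broken connection:
    // 10053/10054/10060 are Winsock errors, 64 and 121 are Win32 network
    // errors, 233 means no process is on the other end of the pipe.
    private static readonly HashSet<int> Numbers =
        new HashSet<int> { 64, 121, 233, 10053, 10054, 10060 };

    public static bool IsTransportLevelError(int sqlErrorNumber)
    {
        return Numbers.Contains(sqlErrorNumber);
    }
}

// Usage inside data-access code (SqlException comes from System.Data.SqlClient):
//   catch (SqlException ex) when (TransportErrors.IsTransportLevelError(ex.Number))
//   {
//       // open a new connection and retry the operation
//   }
```

Filtering on the number rather than the message text also keeps the check working when the server or client returns localized messages.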
I had the same problem albeit it was with service requests to a SQL DB.
This is what I had in my service error log:
System.Data.SqlClient.SqlException: A transport-level error has occurred when sending the request to the server. (provider: TCP Provider, error: 0 - An existing connection was forcibly closed by the remote host.)
I have a C# test suite that tests a service. The service and DB were both on external servers so I thought that might be the issue. So I deployed the service and DB locally to no avail. The issue continued. The test suite isn't even a hard pressing performance test at all, so I had no idea what was happening. The same test was failing each time, but when I disabled that test, another one would fail continuously.
I tried other methods suggested on the Internet that didn't work either:
Increase the registry values of TcpMaxDataRetransmissions and TcpMaxConnectRetransmissions.
Disable the "Shared Memory" option within SQL Server Configuration Manager under "Client Protocols" and sort TCP/IP to 1st in the list.
This might occur when you are testing scalability with a large number of client connection attempts. To resolve this issue, use the regedit.exe utility to add a new DWORD value named SynAttackProtect to the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\ with value data of 00000000.
My last resort was the age-old saying "try and try again". So I have nested try-catch statements to ensure that if the TCP/IP connection is lost in the lower communications protocol, the code doesn't just give up there but tries again. This is now working for me; however, it's not a very elegant solution.
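The nested try-catch blocks can be flattened into a small retry loop. This is a sketch rather than the poster's actual code: RetryHelper, its parameters, and the fixed delay between attempts are hypothetical, and the caller decides which exceptions count as transient.

```csharp
using System;
using System.Threading;

public static class RetryHelper
{
    // Runs `operation`, making up to `maxAttempts` attempts in total and
    // retrying only when `isTransient` returns true for the thrown exception.
    // Non-transient exceptions and the final failed attempt propagate to the caller.
    public static T Execute<T>(Func<T> operation, Func<Exception, bool> isTransient,
                               int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                Thread.Sleep(delay); // give the network a moment before retrying
            }
        }
    }
}
```

For this to help with a dead TCP connection, the operation passed in should open a fresh connection on every attempt instead of reusing the one that just failed.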
use Enterprise Services with transactional components
I have seen this happen in my own environment a number of times. The client application in this case is installed on many machines. Some of those machines are laptops, and people were leaving the application open, disconnecting the laptop, then plugging it back in and attempting to use the application. This causes the error you mentioned.
My first step would be to look at the network and ensure that the servers aren't on DHCP and renewing IP addresses, which would cause this error. If that isn't the case, then you have to start trawling through your event logs looking for other network-related errors.
Unfortunately it is as stated above a network error. The main thing you can do is just monitor the connections using a tool like netmon and work back from there.
Good Luck.
You should also check hardware connectivity to the database.
Perhaps this thread will be helpful:
http://channel9.msdn.com/forums/TechOff/234271-Conenction-forcibly-closed-SQL-2005/
I'm using a reliability layer around my DB commands (abstracted away in the repository interface). Basically that's just code that intercepts any expected exception (DbException and also InvalidOperationException, which happens to get thrown on connectivity issues), logs it, captures statistics and retries everything again.
With that reliability layer present, the service has been able to survive stress-testing gracefully (constant dead-locks, network failures etc). Production is far less hostile than that.
PS: There is more on that here (along with a simple way to define reliability with the interception DSL)
I had the same problem. I asked my network geek friends, and all said what people have replied here: it's the connection between the computer and the database server. In my case it was my Internet Service Provider, or their router, that was the problem. After a router update, the problem went away. But do you have any other internet connection drop-outs from your computer or server? I had...
I experienced the transport error this morning in SSMS while connected to SQL 2008 R2 Express.
I was trying to import a CSV with \r\n. I coded my row terminator for 0x0d0x0a. When I changed it to 0x0a, the error stopped. I can change it back and forth and watch it happen/not happen.
BULK INSERT #t1 FROM 'C:\123\Import123.csv' WITH
( FIRSTROW = 1, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0d0x0a' )
I suspect I am not writing my row terminator correctly because SQL parses one character at a time right while I'm trying to pass two characters.
Anyhow, this error is 4 years old now, but it may provide a bit of information for the next user.
I just wanted to post a fix here that worked for our company on new software we've installed. We were getting the following error since day 1 on the client log file: Server was unable to process request. ---> A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.) ---> The semaphore timeout period has expired.
What completely fixed the problem was to set up a link aggregate (LAG) on our switch. Our Dell FX1 server has redundant fiber lines coming out of the back of it. We did not realize that the switch they're plugged into needed to have a LAG configured on those two ports. See details here: https://docs.meraki.com/display/MS/Switch+Ports#SwitchPorts-LinkAggregation
