Web socket stops responding after a while - c#

I have a Windows service that uses a WebSocket (from http://sta.github.io/websocket-sharp/) to connect to Slack and monitor messages.
My connection code looks something like this:
ws = new WebSocket(connection.Url);
ws.OnMessage += async (sender, e) =>
{
var msg = JsonConvert.DeserializeObject<MessageFromSlack>(e.Data);
if (msg.Type == "message" && msg.Text != null && msg.User != UserId)
{
if (userMatcher.IsMatch(msg.Text))
{
await ProcessDirectMessage(msg);
}
await ProcessMessage(msg);
}
if (msg.Type == "channel_joined")
{
await ChannelJoined(msg.ChannelModel.Id);
}
};
ws.OnClose += (sender, e) =>
{
var reason = e.Reason;
var code = e.Code;
System.Diagnostics.Debug.WriteLine($"{code}:{reason}");
};
ws.Connect();
Basically it waits for a message and then if it's directed at my bot, it'll call ProcessDirectMessage and if not it'll call ProcessMessage. The details of those functions are, I think, unimportant (they do some matching looking for key phrases and respond by sending a message back).
This all works fine. For a while. But after some period of time (usually more than a day), it just stops responding altogether. My OnMessage handler never gets hit. I thought that maybe what is happening is the websocket is getting closed on the server side, so I added the OnClose handler, but that never seems to get hit either.
Does anybody have an idea what might be happening here? Is there a way to keep the connection alive, or else reconnect it when it dies?

By the nature of a TCP connection, the only reliable way to detect that it is gone is to write something to it. If you are only reading (waiting for data to arrive), you can do that for a very long time after the other side has died. That happens when the other side did not close the connection gracefully (which involves an exchange of some TCP packets).
The WebSocket protocol defines a special Ping frame and a corresponding Pong frame, which you should use to avoid the situation described in the question. From time to time you should send a Ping frame and wait (up to a certain timeout) for the server to respond with a Pong frame. If it does not respond within that timeout, assume the connection is dead and reconnect.
As far as I know, the library you use does not automatically send ping requests on your behalf. However, it allows you to do that via the Ping method.
You need to configure timeout with
ws.WaitTime = TimeSpan.FromSeconds(5);
And then, from time to time (for example - when you did not receive any new messages in last X seconds), do:
bool isAlive = ws.Ping();
There is also boolean property which does the same:
bool isAlive = ws.IsAlive;
Both of the above are blocking calls. Each returns true if the server replied with a Pong within the ws.WaitTime interval, and false otherwise. Then you can act accordingly.
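The "ping when idle, reconnect when there is no Pong" idea can be sketched roughly as follows. ConnectionWatchdog is a hypothetical helper, not part of websocket-sharp; the ping and reconnect actions are passed in as delegates (for example `() => ws.Ping()`), and the idle limit is an arbitrary assumption. It is not synchronized, so add locking if the timer and OnMessage run on different threads.

```csharp
using System;

// Hypothetical helper (not part of websocket-sharp): decides when to ping
// and when to reconnect. Ping and reconnect are delegates so the logic can
// be exercised without a live socket.
public sealed class ConnectionWatchdog
{
    private readonly Func<bool> _ping;      // e.g. () => ws.Ping()
    private readonly Action _reconnect;     // e.g. () => { ws.Close(); ws.Connect(); }
    private readonly TimeSpan _idleLimit;
    private DateTime _lastActivityUtc;

    public ConnectionWatchdog(Func<bool> ping, Action reconnect, TimeSpan idleLimit, DateTime nowUtc)
    {
        _ping = ping ?? throw new ArgumentNullException(nameof(ping));
        _reconnect = reconnect ?? throw new ArgumentNullException(nameof(reconnect));
        _idleLimit = idleLimit;
        _lastActivityUtc = nowUtc;
    }

    public int ReconnectCount { get; private set; }

    // Call from OnMessage so a busy connection is not pinged needlessly.
    public void NoteActivity(DateTime nowUtc) => _lastActivityUtc = nowUtc;

    // Call periodically (for example from a timer). Returns true when the
    // connection is believed to be alive after the check.
    public bool Check(DateTime nowUtc)
    {
        if (nowUtc - _lastActivityUtc < _idleLimit)
            return true;                    // recent traffic, nothing to do

        if (_ping())
        {
            _lastActivityUtc = nowUtc;      // Pong arrived in time
            return true;
        }

        _reconnect();                       // no Pong: assume the link is dead
        ReconnectCount++;
        _lastActivityUtc = nowUtc;
        return false;
    }
}
```

In the service, NoteActivity would be called at the top of the OnMessage handler and Check from a timer with a period somewhat shorter than the idle limit.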

Related

WebSocket simplex stream - how can the client close the connection?

I'm using WebSockets to stream values from server to client. The connection should be closed when the stream of values is completed (server-side termination), or when the client stops subscribing (client-side termination).
How can the client gracefully close the connection?
A rough sample to demonstrate the issue in AspNetCore; server pushes a (potentially) infinite stream of values; client subscribes to the first 5 values, and should then close the connection.
app.Use(async (context, next) =>
{
if (context.Request.Path == "/read")
{
var client = new ClientWebSocket();
await client.ConnectAsync(new Uri(@"wss://localhost:7125/write"), CancellationToken.None);
for (int i = 0; i < 5; i++)
await client.ReceiveAsync(new ArraySegment<byte>(new byte[sizeof(int)]), CancellationToken.None);
// This does not seem to have any particular effect on writer
// await client.CloseOutputAsync(WebSocketCloseStatus.NormalClosure, string.Empty, CancellationToken.None);
// This freezes in AspNetcore on .NET6 core (because the implementation waits for the connection to close, which never happens)
// In AspNet (non-core, .Net Framework 4.8), this seems to throw an exception that data cannot be read after the connection has been closed (i.e. the socket seems to only be closeable if no data is pending to be read)
await client.CloseAsync(WebSocketCloseStatus.NormalClosure, string.Empty, CancellationToken.None);
}
if (context.Request.Path == "/write")
{
var ws = await context.WebSockets.AcceptWebSocketAsync();
await foreach (var number in GetNumbers())
{
var bytes = BitConverter.GetBytes(number);
if (ws.State != WebSocketState.Open)
throw new Exception("I wish we'd hit this branch, but we never do!");
await ws.SendAsync(new ArraySegment<byte>(bytes), WebSocketMessageType.Binary, true, CancellationToken.None);
}
}
});
static async IAsyncEnumerable<int> GetNumbers()
{
for (int i = 0; i <= int.MaxValue; i++)
{
yield return i;
await Task.Delay(25);
}
}
The general issue seems to be that the close message isn't picked up by the /write method, i.e. ws.State remains WebSocketState.Open. I'm assuming that only receive operations update the connection status?
Is there any good way to handle this situation / for the server to pick up the client's request to close the connection?
I would quite like to avoid the client having to send any explicit messages to the server / for the server to have to read the stream explicitly. I'm increasingly wondering if that is possible, though.
The WebSocket protocol works much like TCP: a connection is established first, the only difference being that it is initiated over http[s].
One send action on one side matches one receive action on the other, and vice versa.
You can see this detail (if I am not mistaken) in the remarks of the documentation:
Exactly one send and one receive is supported on each WebSocket object in parallel.
So you should receive at least one data segment and recognize the close-intention message from the client. The same applies on the client side.
How to receive message, recognize close intention, and properly react to it - see here.
How to send close intention message - see here.
I suspect you should call webSocket.ReceiveAsync at least once in the background on your server.
Then, in a ContinueWith task, call CancellationTokenSource.Cancel() for the current server socket session.
That repo works, except for docker-compose. I am something of a newcomer to complex DevOps things.
UPDATE
The Remarks part of the docs is not about matching send and receive actions on different sides of the conversation. I just wanted you to notice how this TCP concept works, i.e. you should receive data at least once.
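A minimal sketch of the "receive in the background, then cancel the sender" suggestion, using only System.Net.WebSockets. CloseWatcher and its method names are made up for illustration, and the 1 KB buffer is an arbitrary assumption. The watcher only signals; the sending loop replies to the close itself, so that two send operations never overlap.

```csharp
using System;
using System.Net.WebSockets;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: keeps one receive pending so the server notices
// the client's close frame, then signals the sending loop to stop.
public static class CloseWatcher
{
    // True when the received frame is the peer's close request.
    public static bool IsCloseRequest(WebSocketReceiveResult result) =>
        result.MessageType == WebSocketMessageType.Close;

    // Reads and discards incoming frames until the peer asks to close
    // (or the connection fails), then cancels senderCts.
    public static async Task WatchAsync(WebSocket ws, CancellationTokenSource senderCts)
    {
        var buffer = new byte[1024];
        try
        {
            while (ws.State == WebSocketState.Open)
            {
                var result = await ws.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
                if (IsCloseRequest(result))
                    break;
            }
        }
        catch (WebSocketException)
        {
            // Connection dropped without a close handshake.
        }
        finally
        {
            senderCts.Cancel();
        }
    }
}

// Usage inside the /write branch from the question (sketch):
//
//   var cts = new CancellationTokenSource();
//   var watcher = CloseWatcher.WatchAsync(ws, cts);
//   await foreach (var number in GetNumbers())
//   {
//       if (cts.IsCancellationRequested) break;
//       await ws.SendAsync(new ArraySegment<byte>(BitConverter.GetBytes(number)),
//                          WebSocketMessageType.Binary, true, CancellationToken.None);
//   }
//   if (ws.State == WebSocketState.CloseReceived)
//       await ws.CloseOutputAsync(WebSocketCloseStatus.NormalClosure, string.Empty, CancellationToken.None);
//   await watcher;
```

The token is deliberately checked between sends rather than passed to SendAsync, since cancelling an in-flight send may abort the socket instead of closing it cleanly.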

Handling network disconnect

I am trying to do "long polling" with an HttpWebRequest object.
In my C# app, I am making an HTTP GET request, using HttpWebRequest. Afterwards, I wait for the response with BeginGetResponse(). I am using ThreadPool.RegisterWaitForSingleObject to wait for the response, or to time out (after 1 minute).
I have set the target web server to take a long time to respond, so that I have time to disconnect the network cable.
After sending the request, I pull the network cable.
Is there a way to get an exception when this happens? So I don't have to wait for the timeout?
Instead of an exception, the timeout (from RegisterWaitForSingleObject) happens after the 1 minute timeout has expired.
Is there a way to determine that the network connection went down? Currently, this situation is indistinguishable from the case where the web server takes more than 1 minute to respond.
I found a solution:
Before calling BeginGetResponse, I can call the following on the HttpWebRequest:
req.ServicePoint.SetTcpKeepAlive(true, 10000, 1000);
I think this means that after 10 seconds of inactivity, the client will send a TCP "keep alive" over to the server. That keep alive will fail if the network connection is down because the network cable was pulled.
So, when the cable is pulled, a keep alive gets sent within 10 seconds (at most), and then the callback for BeginGetResponse happens. In the callback, I get an exception when I call req.EndGetResponse().
I guess this defeats one of the benefits of long polling, though. Since we're still sending packets around.
I'll leave it to you to try pulling the plug on this.
ManualResetEvent done = new ManualResetEvent(false);
void Main()
{
// set physical address of network adapter to monitor operational status
string physicalAddress = "00215A6B4D0F";
// create web request
var request = (HttpWebRequest)HttpWebRequest.Create(new Uri("http://stackoverflow.com"));
// create timer to cancel operation on loss of network
var timer = new System.Threading.Timer((s) =>
{
NetworkInterface networkInterface =
NetworkInterface.GetAllNetworkInterfaces()
.FirstOrDefault(nic => nic.GetPhysicalAddress().ToString() == physicalAddress);
if(networkInterface == null)
{
throw new Exception("Could not find network interface with physical address " + physicalAddress + ".");
}
else if(networkInterface.OperationalStatus != OperationalStatus.Up)
{
Console.WriteLine ("Network is down, aborting.");
request.Abort();
done.Set();
}
else
{
Console.WriteLine ("Network is still up.");
}
}, null, 100, 100);
// start asynchronous request
IAsyncResult asynchResult = request.BeginGetResponse(new AsyncCallback((o) =>
{
try
{
var response = (HttpWebResponse)request.EndGetResponse((IAsyncResult)o);
var reader = new StreamReader(response.GetResponseStream(), System.Text.Encoding.UTF8);
var writer = new StringWriter();
writer.Write(reader.ReadToEnd());
Console.Write(writer.ToString());
}
finally
{
done.Set();
}
}), null);
// wait for the end
done.WaitOne();
}
I don't think you are going to like this. You can test for internet connectivity after you create the request to the slow server.
There are many ways to do that - from another request to google.com (or some ip address in your network) to P/Invoke. You can get more info here: Fastest way to test internet connection
After you create the original request, you go into a loop that checks for internet connectivity until either the internet is down or the original request comes back (it can set a variable to stop the loop).
Helps at all?
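The loop described in this answer might look something like the sketch below. NetworkAwareWait is a hypothetical helper; the connectivity check is passed in as a delegate, so any of the approaches mentioned above can be plugged in (one option is System.Net.NetworkInformation.NetworkInterface.GetIsNetworkAvailable, which only reports whether a local interface is up, not whether the server is reachable).

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: races a pending response against a periodic
// connectivity check.
public static class NetworkAwareWait
{
    // Returns true if the response completed, false if isNetworkUp()
    // reported the network as down before the response completed.
    public static async Task<bool> WaitForResponseOrNetworkLossAsync(
        Task response, Func<bool> isNetworkUp, TimeSpan pollInterval)
    {
        while (true)
        {
            if (response.IsCompleted) return true;
            if (!isNetworkUp()) return false;

            // Wake up early if the response arrives during the delay.
            await Task.WhenAny(response, Task.Delay(pollInterval));
        }
    }
}
```

When the method returns false, the caller can call request.Abort() on the original HttpWebRequest instead of waiting for the one-minute timeout.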

RasConnectionNotification after computer resumes from sleep

I've got a project called DotRas on CodePlex that exposes a component called RasConnectionWatcher, which uses the RasConnectionNotification Win32 API to receive notifications when connections on a machine change. One of my users recently brought to my attention that if the machine comes out of sleep mode and attempts to redial the connection, the connection goes into a loop indicating that it is already being dialed even though it isn't. This loop will not end until the application is restarted, even if the dial is done through a synchronous call in which all values on the structs are unique to that specific call and none of it is retained once the call completes.
I've done as much as I can to fix the problem, but I fear the problem is something I've done with the RasConnectionNotification API and using ThreadPool.RegisterWaitForSingleObject which might be blocking something else in Windows.
The below method is used to register 1 of the 4 change types the API supports, and the handle to associate with it to monitor. During runtime, the below method would be called 4 times during initialization to register all 4 change types.
private void Register(NativeMethods.RASCN changeType, RasHandle handle)
{
AutoResetEvent waitObject = new AutoResetEvent(false);
int ret = SafeNativeMethods.Instance.RegisterConnectionNotification(handle, waitObject.SafeWaitHandle, changeType);
if (ret == NativeMethods.SUCCESS)
{
RasConnectionWatcherStateObject stateObject = new RasConnectionWatcherStateObject(changeType);
stateObject.WaitObject = waitObject;
stateObject.WaitHandle = ThreadPool.RegisterWaitForSingleObject(waitObject, new WaitOrTimerCallback(this.ConnectionStateChanged), stateObject, Timeout.Infinite, false);
this._stateObjects.Add(stateObject);
}
}
The event passed into the API gets signaled when Windows detects a change in the connections on the machine. The callback used just takes the change type registered from the state object and then processes it to determine exactly what changed.
private void ConnectionStateChanged(object obj, bool timedOut)
{
lock (this.lockObject)
{
if (this.EnableRaisingEvents)
{
try
{
// Retrieve the active connections to compare against the last state that was checked.
ReadOnlyCollection<RasConnection> connections = RasConnection.GetActiveConnections();
RasConnection connection = null;
switch (((RasConnectionWatcherStateObject)obj).ChangeType)
{
case NativeMethods.RASCN.Disconnection:
connection = FindEntry(this._lastState, connections);
if (connection != null)
{
this.OnDisconnected(new RasConnectionEventArgs(connection));
}
if (this.Handle != null)
{
// The handle that was being monitored has been disconnected.
this.Handle = null;
}
this._lastState = connections;
break;
}
}
catch (Exception ex)
{
this.OnError(new System.IO.ErrorEventArgs(ex));
}
}
}
}
}
Everything works perfectly, other than when the machine comes out of sleep. Now the strange thing is when this happens, if a MessageBox is displayed (even for 1 ms and closed by using SendMessage) it will work. I can only imagine something I've done is blocking something else in Windows so that it can't continue processing while the event is being processed by the component.
I've stripped down a lot of the code here, the full source can be found at:
http://dotras.codeplex.com/SourceControl/changeset/view/68525#1344960
I've come for help from people much smarter than myself, I'm outside of my comfort zone trying to fix this problem, any assistance would be greatly appreciated!
Thanks! - Jeff
After a lot of effort, I tracked down the problem. Thankfully it wasn't a blocking issue in Windows.
For those curious, basically once the machine came out of sleep the developer was attempting to immediately dial a connection (via the Disconnected event). Since the network interfaces hadn't finished initializing, an error was returned and the connection handle was not being closed. Any attempts to close the connection would throw an error indicating the connection was already closed, even though it wasn't. Since the handle was left open, any subsequent attempts to dial the connection would cause an actual error.
I just had to make an adjustment in the HangUp code to hide the error thrown when a connection is closed that has already been closed.

MSMQ Receive() method timeout

My original question from a while ago is MSMQ Slow Queue Reading; however, I have advanced from that and now think I understand the problem a bit more clearly.
My code (well actually part of an open source library I am using) looks like this:
queue.Receive(TimeSpan.FromSeconds(10), MessageQueueTransactionType.Automatic);
Which is using the Messaging.MessageQueue.Receive function and queue is a MessageQueue. The problem is as follows.
The above line of code will be called with the specified timeout (10 seconds). The Receive(...) function is a blocking function, and is supposed to block until a message arrives in the queue at which time it will return. If no message is received before the timeout is hit, it will return at the timeout. If a message is in the queue when the function is called, it will return that message immediately.
However, what is happening is the Receive(...) function is being called, seeing that there is no message in the queue, and hence waiting for a new message to come in. When a new message comes in (before the timeout), it isn't detecting this new message and continues waiting. The timeout is eventually hit, at which point the code continues and calls Receive(...) again, where it picks up the message and processes it.
Now, this problem only occurs after a number of days/weeks. I can make it work normally again by deleting & recreating the queue. It happens on different computers, and different queues. So it seems like something is building up, until some point when it breaks the triggering/notification ability that the Receive(...) function uses.
I've checked a lot of different things, and everything seems normal & isn't different from a queue that is working normally. There is plenty of disk space (13gig free) and RAM (about 350MB free out of 1GB from what I can tell). I have checked registry entries which all appear the same as other queues, and the performance monitor doesn't show anything out of the normal. I have also run the TMQ tool and can't see anything noticeably wrong from that.
I am using Windows XP on all the machines and they all have service pack 3 installed. I am not sending a large amount of messages to the queues, at most it would be 1 every 2 seconds but generally a lot less frequent than that. The messages are only small too and nowhere near the 4MB limit.
The only thing I have just noticed is the p0000001.mq and r0000067.mq files in C:\WINDOWS\system32\msmq\storage are both 4,096KB however they are that size on other computers also which are not currently experiencing the problem. The problem does not happen to every queue on the computer at once, as I can recreate 1 problem queue on the computer and the other queues still experience the problem.
I am not very experienced with MSMQ so if you post possible things to check can you please explain how to check them or where I can find more details on what you are talking about.
Currently the situation is:
ComputerA - 4 queues normal
ComputerB - 2 queues experiencing problem, 1 queue normal
ComputerC - 2 queues experiencing problem
ComputerD - 1 queue normal
ComputerE - 2 queues normal
So I have a large number of computers/queues to compare and test against.
Any particular reason you aren't using an event handler to listen to the queues? The System.Messaging library allows you to attach a handler to a queue instead of, if I understand what you are doing correctly, looping Receive every 10 seconds. Try something like this:
class MSMQListener
{
public void StartListening(string queuePath)
{
MessageQueue msQueue = new MessageQueue(queuePath);
msQueue.ReceiveCompleted += QueueMessageReceived;
msQueue.BeginReceive();
}
private void QueueMessageReceived(object source, ReceiveCompletedEventArgs args)
{
MessageQueue msQueue = (MessageQueue)source;
//once a message is received, stop receiving
Message msMessage = null;
msMessage = msQueue.EndReceive(args.AsyncResult);
//do something with the message
//begin receiving again
msQueue.BeginReceive();
}
}
We are also using NServiceBus and had a similar problem inside our network.
Basically, MSMQ is using UDP with two-phase commits. After a message is received, it has to be acknowledged. Until it is acknowledged, it cannot be received on the client side as the receive transaction hasn't been finalized.
This was caused by different things in different times for us:
once, this was due to the Distributed Transaction Coordinator being unable to communicate between machines because of a firewall misconfiguration
another time, we were using cloned virtual machines without sysprep, which made internal MSMQ ids non-unique, so a message was received on one machine and acknowledged to another. Eventually, MSMQ figures things out but it takes quite a while.
Try this
public Message Receive( TimeSpan timeout, Cursor cursor )
overloaded function.
To get a cursor for a MessageQueue, call the CreateCursor method for that queue.
A Cursor is used with such methods as Peek(TimeSpan, Cursor, PeekAction) and Receive(TimeSpan, Cursor) when you need to read messages that are not at the front of the queue. This includes reading messages synchronously or asynchronously. Cursors do not need to be used to read only the first message in a queue.
When reading messages within a transaction, Message Queuing does not roll back cursor movement if the transaction is aborted. For example, suppose there is a queue with two messages, A1 and A2. If you remove message A1 while in a transaction, Message Queuing moves the cursor to message A2. However, if the transaction is aborted for any reason, message A1 is inserted back into the queue but the cursor remains pointing at message A2.
To close the cursor, call Close.
If you want to use something completely synchronous and without event you can test this method
public object Receive(string path, int millisecondsTimeout)
{
var mq = new System.Messaging.MessageQueue(path);
var asyncResult = mq.BeginReceive();
var handles = new System.Threading.WaitHandle[] { asyncResult.AsyncWaitHandle };
var index = System.Threading.WaitHandle.WaitAny(handles, millisecondsTimeout);
if (index == System.Threading.WaitHandle.WaitTimeout) // timed out (the constant's value is 258)
{
mq.Close();
return null;
}
var result = mq.EndReceive(asyncResult);
return result;
}

How do I obtain the latency between server and client in C#?

I'm working on a C# server application for a game engine I'm writing in ActionScript 3. I'm using an authoritative server model so as to prevent cheating and ensure a fair game. So far, everything works well:
When the client begins moving, it tells the server and starts rendering locally; the server, then, tells everyone else that client X has begun moving, along with details so they can also begin rendering. When the client stops moving, it tells the server, which performs calculations based on the time the client began moving and the client render tick delay and replies to everyone, so they can update with the correct values.
The thing is, when I use the default 20ms tick delay on server calculations, when the client moves for a rather long distance, there's a noticeable tilt forward when it stops. If I increase slightly the delay to 22ms, on my local network everything runs very smoothly, but in other locations, the tilt is still there. After experimenting a little, I noticed that the extra delay needed is pretty much tied to the latency between client and server. I even boiled it down to a formula that would work quite nicely: delay = 20 + (latency / 10).
So, how would I proceed to obtain the latency between a certain client and the server (I'm using asynchronous sockets)? The CPU effort can't be too much, so as not to have the server run slowly. Also, is this really the best way, or is there a more efficient/easier way to do this?
Sorry that this isn't directly answering your question, but generally speaking you shouldn't rely too heavily on measuring latency because it can be quite variable. Not only that, you don't know if the ping time you measure is even symmetrical, which is important. There's no point applying 10ms of latency correction if it turns out that the ping time of 20ms is actually 19ms from server to client and 1ms from client to server. And latency in application terms is not the same as in networking terms - you may be able to ping a certain machine and get a response in 20ms but if you're contacting a server on that machine that only processes network input 50 times a second then your responses will be delayed by an extra 0 to 20ms, and this will vary rather unpredictably.
That's not to say latency measurement doesn't have a place in smoothing predictions out, but it's not going to solve your problem, just clean it up a bit.
On the face of it, the problem here seems to be that you're sent information in the first message which you use to extrapolate data from until the last message is received. If all else stays constant then the movement vector given in the first message multiplied by the time between the messages will give the server the correct end position that the client was in at roughly now-(latency/2). But if the latency changes at all, the time between the messages will grow or shrink. The client may know he's moved 10 units, but the server simulated him moving 9 or 11 units before being told to snap him back to 10 units.
The general solution to this is to not assume that latency will stay constant but to send periodic position updates, which allow the server to verify and correct the client's position. With just 2 messages as you have now, all the error is found and corrected after the 2nd message. With more messages, the error is spread over many more sample points allowing for smoother and less visible correction.
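To illustrate how periodic updates spread the correction, here is a rough one-dimensional sketch. TrackedEntity, its members, and the blend factor are all hypothetical, and validating the reported position (which an authoritative server would need to do before trusting it) is left out.

```csharp
using System;

// Hypothetical server-side record of one moving client, in one dimension.
public sealed class TrackedEntity
{
    public double Position { get; private set; }
    public double Velocity { get; set; }    // units per second

    // Advance the server's prediction by dtSeconds.
    public void Predict(double dtSeconds) => Position += Velocity * dtSeconds;

    // Move part of the way toward a position reported in a periodic update.
    // blend = 1 snaps immediately; smaller values spread the correction
    // over several updates so it is less visible.
    public void Correct(double reportedPosition, double blend)
    {
        if (blend < 0 || blend > 1)
            throw new ArgumentOutOfRangeException(nameof(blend));
        Position += (reportedPosition - Position) * blend;
    }
}
```

With only a start and a stop message, the whole error lands in a single Correct call with blend 1, which is the visible snap described above. With updates several times a second, each call only has to absorb a small difference.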
It can never be perfect though: all it takes is a lag spike in the last millisecond of movement and the server's representation will overshoot. You can't get around that if you're predicting future movement based on past events, as there's no real alternative to choosing either correct-but-late or incorrect-but-timely since information takes time to travel. (Blame Einstein.)
One thing to keep in mind when using ICMP based pings is that networking equipment will often give ICMP traffic lower priority than normal packets, especially when the packets cross network boundaries such as WAN links. This can lead to pings being dropped or showing higher latency than traffic is actually experiencing and lends itself to being an indicator of problems rather than a measurement tool.
The increasing use of Quality of Service (QoS) in networks only exacerbates this and as a consequence though ping still remains a useful tool, it needs to be understood that it may not be a true reflection of the network latency for non-ICMP based real traffic.
There is a good post at the Itrinegy blog How do you measure Latency (RTT) in a network these days? about this.
You could use the already available Ping Class. Should be preferred over writing your own IMHO.
Have a "ping" command, where you send a message from the server to the client, then time how long it takes to get a response. Barring CPU overload scenarios, it should be pretty reliable. To get the one-way trip time, just divide the time by 2.
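A small sketch of the bookkeeping such an application-level ping needs. RttTracker and its members are made-up names; timestamps are supplied by the caller (for example from Stopwatch.ElapsedMilliseconds), and the window of 5 samples is an arbitrary assumption. Halving the round trip assumes a symmetric path, which is only an approximation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: matches ping replies to requests by id and keeps
// a short moving average of the round-trip time.
public sealed class RttTracker
{
    private readonly Dictionary<int, long> _sentAtMs = new Dictionary<int, long>();
    private readonly Queue<long> _samples = new Queue<long>();
    private readonly int _window;
    private int _nextId;

    public RttTracker(int window = 5)
    {
        if (window < 1) throw new ArgumentOutOfRangeException(nameof(window));
        _window = window;
    }

    // Record that a ping is being sent now; returns the id to put in the message.
    public int BeginPing(long nowMs)
    {
        int id = _nextId++;
        _sentAtMs[id] = nowMs;
        return id;
    }

    // Record the matching reply. Returns false for unknown or duplicate ids.
    public bool EndPing(int id, long nowMs)
    {
        if (!_sentAtMs.TryGetValue(id, out long sentMs)) return false;
        _sentAtMs.Remove(id);
        _samples.Enqueue(nowMs - sentMs);
        if (_samples.Count > _window) _samples.Dequeue();
        return true;
    }

    // Mean round-trip time over the window, or null before any reply.
    public double? AverageRttMs
    {
        get
        {
            if (_samples.Count == 0) return null;
            long sum = 0;
            foreach (long sample in _samples) sum += sample;
            return (double)sum / _samples.Count;
        }
    }

    // One-way estimate, assuming the path is symmetric.
    public double? OneWayEstimateMs => AverageRttMs / 2;
}
```

Pings that never get a reply stay in the dictionary, so a real server would also want to expire old entries.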
We can measure the round-trip time using the Ping class of the .NET Framework.
Instantiate a Ping and subscribe to the PingCompleted event:
Ping pingSender = new Ping();
pingSender.PingCompleted += PingCompletedCallback;
Add code to configure and action the ping.
Our PingCompleted event handler (PingCompletedEventHandler) has a PingCompletedEventArgs argument. The PingCompletedEventArgs.Reply gets us a PingReply object. PingReply.RoundtripTime returns the round trip time (the "number of milliseconds taken to send an Internet Control Message Protocol (ICMP) echo request and receive the corresponding ICMP echo reply message"):
public static void PingCompletedCallback(object sender, PingCompletedEventArgs e)
{
...
Console.WriteLine($"Roundtrip Time: {e.Reply.RoundtripTime}");
...
}
Code-dump of a full working example, based on MSDN's example. I have modified it to write the RTT to the console:
public static void Main(string[] args)
{
string who = "www.google.com";
AutoResetEvent waiter = new AutoResetEvent(false);
Ping pingSender = new Ping();
// When the PingCompleted event is raised,
// the PingCompletedCallback method is called.
pingSender.PingCompleted += PingCompletedCallback;
// Create a buffer of 32 bytes of data to be transmitted.
string data = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
byte[] buffer = Encoding.ASCII.GetBytes(data);
// Wait 12 seconds for a reply.
int timeout = 12000;
// Set options for transmission:
// The data can go through 64 gateways or routers
// before it is destroyed, and the data packet
// cannot be fragmented.
PingOptions options = new PingOptions(64, true);
Console.WriteLine("Time to live: {0}", options.Ttl);
Console.WriteLine("Don't fragment: {0}", options.DontFragment);
// Send the ping asynchronously.
// Use the waiter as the user token.
// When the callback completes, it can wake up this thread.
pingSender.SendAsync(who, timeout, buffer, options, waiter);
// Prevent this example application from ending.
// A real application should do something useful
// when possible.
waiter.WaitOne();
Console.WriteLine("Ping example completed.");
}
public static void PingCompletedCallback(object sender, PingCompletedEventArgs e)
{
// UserState is the AutoResetEvent object that the main thread
// is waiting for.
var waiter = (AutoResetEvent)e.UserState;
if (e.Cancelled)
{
// The operation was canceled; display a message to the user.
Console.WriteLine("Ping canceled.");
}
else if (e.Error != null)
{
// An error occurred; display the exception to the user.
Console.WriteLine("Ping failed:");
Console.WriteLine(e.Error.ToString());
}
else
{
// e.Reply is only read when the ping was neither canceled nor failed.
// RoundtripTime is only meaningful when e.Reply.Status is IPStatus.Success.
Console.WriteLine($"Roundtrip Time: {e.Reply.RoundtripTime}");
}
// Let the main thread resume.
waiter.Set();
}
You might want to perform several pings and then calculate an average, depending on your requirements of course.
