Using NetMQMonitor to detect server disconnects?

I am looking for a better way to detect disconnects when a Router/server goes down or becomes unreachable because of a poor connection. (I'm listening from a Dealer/client running on Wi-Fi.) I found zmq_socket_monitor() and discovered that NetMQ has the same feature. My understanding from the documentation is that when you monitor a socket you give it an inproc address, and it notifies you of any socket changes using that address. I couldn't find any examples of NetMQMonitor apart from the unit tests. Am I using it correctly in the code below? Is it valid to use it alongside a NetMQPoller?
// run poller on a separate thread
_poller = new NetMQPoller { _dealer, _subscriber, _outgoingMessageQueue, _subscriptionChanges};
_poller.RunAsync();
// run a monitor listening for Connected and Disconnected events
_monitor = new NetMQMonitor(_dealer, "inproc://rep.inproc", SocketEvents.Disconnected | SocketEvents.Connected);
_monitor.EventReceived += _monitor_EventReceived;
_monitor.StartAsync();
**** UPDATE ****
So... after posting this I discovered the answer in the NetMQPoller tests on github, so that answers whether you can use the NetMQMonitor with a NetMQPoller, but I'm still curious if anyone has thoughts on the overall approach of using a monitor to track connection state. Here's the relevant code for anyone interested:
[Fact]
public void Monitoring()
{
    var listeningEvent = new ManualResetEvent(false);
    var acceptedEvent = new ManualResetEvent(false);
    var connectedEvent = new ManualResetEvent(false);
    using (var rep = new ResponseSocket())
    using (var req = new RequestSocket())
    using (var poller = new NetMQPoller())
    using (var repMonitor = new NetMQMonitor(rep, "inproc://rep.inproc", SocketEvents.Accepted | SocketEvents.Listening))
    using (var reqMonitor = new NetMQMonitor(req, "inproc://req.inproc", SocketEvents.Connected))
    {
        repMonitor.Accepted += (s, e) => acceptedEvent.Set();
        repMonitor.Listening += (s, e) => listeningEvent.Set();
        repMonitor.AttachToPoller(poller);
        int port = rep.BindRandomPort("tcp://127.0.0.1");
        reqMonitor.Connected += (s, e) => connectedEvent.Set();
        reqMonitor.AttachToPoller(poller);
        poller.RunAsync();
        req.Connect("tcp://127.0.0.1:" + port);
        req.SendFrame("a");
        rep.SkipFrame();
        rep.SendFrame("b");
        req.SkipFrame();
        Assert.True(listeningEvent.WaitOne(300));
        Assert.True(connectedEvent.WaitOne(300));
        Assert.True(acceptedEvent.WaitOne(300));
    }
}

Using the monitor is exactly the right way to look for changes in connection state.
Under the hood the management threads are ping-ponging across the connection. If the ping-pongs dry up, then there is a problem. This detects network issues, but also detects things like crashes; if the process at one end of a socket dies, the process at the other end is informed of the dead connection.
The only inadequacy is if it matters to you what happens to sent messages. Different sockets cache messages in different places, some being biased to keeping them at the sending end until the receiver is ready, others storing them at the receiving end. If the connection dies and you want your undelivered messages back (to send elsewhere, perhaps), you can't get them. ZMQ is like a post office. As soon as you hand the letter over the counter, they cannot and will not give it back to you, even if you can still see it!
This is the nature of the Actor model, which is what ZMQ implements. Communicating Sequential Processes, a development of the Actor model, does not store messages in a channel at all, meaning that if the connection dies the application still owns the unsent message. Sometimes it's useful to know for sure that a message was not delivered.

Related

Web socket stops responding after awhile

I have a Windows service that uses a websocket (from http://sta.github.io/websocket-sharp/) to connect to Slack and monitor messages.
My connection code looks something like this:
ws = new WebSocket(connection.Url);
ws.OnMessage += async (sender, e) =>
{
    var msg = JsonConvert.DeserializeObject<MessageFromSlack>(e.Data);
    if (msg.Type == "message" && msg.Text != null && msg.User != UserId)
    {
        if (userMatcher.IsMatch(msg.Text))
        {
            await ProcessDirectMessage(msg);
        }
        await ProcessMessage(msg);
    }
    if (msg.Type == "channel_joined")
    {
        await ChannelJoined(msg.ChannelModel.Id);
    }
};
ws.OnClose += (sender, e) =>
{
    var reason = e.Reason;
    var code = e.Code;
    System.Diagnostics.Debug.WriteLine($"{code}:{reason}");
};
ws.Connect();
Basically it waits for a message, and if the message is directed at my bot it calls ProcessDirectMessage; it then calls ProcessMessage either way. The details of those functions are, I think, unimportant (they do some matching looking for key phrases and respond by sending a message back).
This all works fine. For a while. But after some period of time (usually more than a day), it just stops responding altogether. My OnMessage handler never gets hit. I thought that maybe what is happening is the websocket is getting closed on the server side, so I added the OnClose handler, but that never seems to get hit either.
Does anybody have an idea what might be happening here? Is there a way to keep the connection alive, or else reconnect it when it dies?

By the nature of a TCP connection, the only reliable way to detect that it's gone is to write something to it. If you are only reading (waiting for data to arrive), you can do that for a very long time while the other side is long dead. That happens if the other side did not close the connection gracefully (which involves an exchange of some TCP packets).
The WebSocket protocol defines a special Ping frame and a corresponding Pong frame, which you should use to avoid the situation described in the question. From time to time, send a Ping frame and wait (for a certain timeout) for the server to respond with a Pong frame. If it does not respond within the timeout, assume the connection is dead and reconnect.
As far as I know, the library you use does not automatically send ping requests on your behalf. However, it allows you to do that via the Ping method.
You need to configure timeout with
ws.WaitTime = TimeSpan.FromSeconds(5);
And then, from time to time (for example - when you did not receive any new messages in last X seconds), do:
bool isAlive = ws.Ping();
There is also boolean property which does the same:
bool isAlive = ws.IsAlive;
This is a blocking call (both of the above). It will return true if server replied with Pong during ws.WaitTime interval, and false otherwise. Then you can act accordingly.
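The ping-then-reconnect cycle described above could be sketched roughly as follows. This is a minimal sketch, not part of websocket-sharp: the KeepAlive class and its members are hypothetical names, and it takes delegates so it compiles without the library. With websocket-sharp one would presumably pass () => ws.Ping() as the ping delegate; how best to reconnect (for example closing and calling ws.Connect() again) is an assumption to verify against the library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of websocket-sharp): periodically checks a
// connection and triggers a reconnect when the check fails.
public sealed class KeepAlive
{
    private readonly Func<bool> _ping;   // assumed usage: () => ws.Ping()
    private readonly Action _reconnect;  // assumed usage: close and connect again

    public KeepAlive(Func<bool> ping, Action reconnect)
    {
        _ping = ping ?? throw new ArgumentNullException(nameof(ping));
        _reconnect = reconnect ?? throw new ArgumentNullException(nameof(reconnect));
    }

    // Performs one liveness check. Returns true if the peer answered;
    // otherwise invokes the reconnect action and returns false.
    public bool CheckOnce()
    {
        bool alive;
        try
        {
            alive = _ping();
        }
        catch (Exception)
        {
            alive = false; // a ping that throws is treated as a dead connection
        }

        if (!alive)
        {
            _reconnect();
        }
        return alive;
    }

    // Runs CheckOnce every 'interval' until the token is cancelled.
    public async Task RunAsync(TimeSpan interval, CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            CheckOnce();
            try
            {
                await Task.Delay(interval, token);
            }
            catch (OperationCanceledException)
            {
                break; // cancellation requested while waiting
            }
        }
    }
}
```

Because the ping itself blocks for up to ws.WaitTime, running the loop on a background task (as RunAsync does) keeps it off the thread that handles incoming messages.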

Rabbit MQ - Recovery of connection/channel/consumer

I am creating a consumer that runs in an infinite loop to read messages from the queue. I am looking for advice/sample code on how to recover and continue within my infinite loop even if there are network disruptions. The consumer has to stay running as it will be installed as a Windows service.
1) Can someone please explain how to properly use these settings? What is the difference between them?
NetworkRecoveryInterval
AutomaticRecoveryEnabled
RequestedHeartbeat
2) Please see my current sample code for the consumer. I am using the .Net RabbitMQ Client v3.5.6.
How will the above settings do the "recovery" for me?
e.g. will consumer.Queue.Dequeue block until it is recovered?
That doesn't seem right
so...
Do I have to code for this manually? e.g. will consumer.Queue.Dequeue throw an exception for which I have to detect and manually re-create my connection, channel, and consumer? Or just the consumer, as "AutomaticRecovery" will recover the channel for me?
Does this mean I should move the consumer creation inside the while loop? what about the channel creation? and the connection creation?
3) Assuming I have to do some of this recovery code manually, are there event callbacks (and how do I register for them) to tell me that there are network problems?
Thanks!
public void StartConsumer(string queue)
{
    using (IModel channel = this.Connection.CreateModel())
    {
        var consumer = new QueueingBasicConsumer(channel);
        const bool noAck = false;
        channel.BasicConsume(queue, noAck, consumer);
        // do I need these conditions? or should I just do while(true)???
        while (channel.IsOpen &&
               Connection.IsOpen &&
               consumer.IsRunning)
        {
            try
            {
                BasicDeliverEventArgs item;
                if (consumer.Queue.Dequeue(Timeout, out item))
                {
                    string message = System.Text.Encoding.UTF8.GetString(item.Body);
                    DoSomethingMethod(message);
                    channel.BasicAck(item.DeliveryTag, false);
                }
            }
            catch (EndOfStreamException ex)
            {
                // this is likely due to some connection issue -- what am I to do?
            }
            catch (Exception ex)
            {
                // should never happen, but lets say my DoSomethingMethod(message); throws an exception
                // presumably, I'll just log the error and keep on going
            }
        }
    }
}
public IConnection Connection
{
    get
    {
        if (_connection == null) // _connection defined in class -- private static IConnection _connection;
        {
            _connection = CreateConnection();
        }
        return _connection;
    }
}
private IConnection CreateConnection()
{
    ConnectionFactory factory = new ConnectionFactory()
    {
        HostName = "RabbitMqHostName",
        UserName = "RabbitMqUserName",
        Password = "RabbitMqPassword",
    };
    // why do we need to set this explicitly? shouldn't this be the default?
    factory.AutomaticRecoveryEnabled = true;
    // what is a good value to use?
    factory.NetworkRecoveryInterval = TimeSpan.FromSeconds(5);
    // what is a good value to use? How is this different from NetworkRecoveryInterval?
    factory.RequestedHeartbeat = 5;
    IConnection connection = factory.CreateConnection();
    return connection;
}

RabbitMQ features
The documentation on RabbitMQ's site is actually really good. If you want to recover queues, exchanges and consumers, you're looking for topology recovery, which is enabled by default. Automatic Recovery (which is enabled by default) includes:
Reconnect
Restore connection listeners
Re-open channels
Restore channel listeners
Restore channel basic.qos setting, publisher confirms and transaction settings
The NetworkRecoveryInterval is the amount of time to wait before an automatic recovery attempt is retried (defaults to 5s).
Heartbeat has another purpose, namely to identify dead TCP connections. There is more to read about that on RabbitMQ's site.
Code sample
Writing reliable code for recovery is tricky. The EndOfStreamException is (as you suspect) most likely due to network problems. If you use the management plugin, you can reproduce this by closing the connection from there and see that the exception is triggered. For production-like applications, you might want to have a set of brokers that you alternate between in case of connection failure. If you have several RabbitMQ brokers, you might also want to guard yourself against long-term server failure on one or more of the servers. You might want to implement error strategies, like requeuing the message, or using a dead letter exchange.
I've been thinking a bit about these things and have written a thin client, RawRabbit, that handles some of them. Maybe it could be something for you? If not, I would suggest that you change the QueueingBasicConsumer to an EventingBasicConsumer. It is event driven rather than thread blocking.
var eventConsumer = new EventingBasicConsumer(channel);
eventConsumer.Received += (sender, args) =>
{
    var body = args.Body;
    eventConsumer.Model.BasicAck(args.DeliveryTag, false);
};
channel.BasicConsume(queue, false, eventConsumer);
If you have topology recovery activated, the consumer will be restored by the RabbitMQ Client and start receiving messages again.
For more granular control, hook up event handlers for ConsumerCancelled and Shutdown to detect connectivity problems and Registered to know when the consumer can be used again.

C# program hangs on Socket.Accept()

I created a server "middleman" application that uses sockets and multi-threading techniques (ServerListener is run in a new thread). I found early on that when I would use the Socket.Accept() method, the program would hang indefinitely, waiting for that connection to happen. The problem is, as far as I can tell there is no reason for it not to.
I spent a good portion of the day trying lots of different things to make it work, and somewhere something changed because it suddenly started working for a while. However, as soon as I accidentally chose a different data source than "localhost" for the client application, the problem popped back up again. I have tried running the program without the firewall OR antivirus running, but no luck. The client program IS set to connect on port 10000. Here is my code:
public void ServerListener() {
    UpdateStatus("Establishing link to server");
    server = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
    server.Bind(new IPEndPoint(IPAddress.Any, defaultPort));
    server.Listen(queue);
    UpdateStatus("Accepting Connections");
    while (true) {
        Socket client = default(Socket);
        try {
            client = server.Accept();
            if (client != null) {
                ++count;
                UpdateCount(count.ToString());
                new Thread(
                    () => {
                        Client myclient = new Client(client, defaultPort, this);
                    }
                ).Start();
            }
        }
        catch( Exception ex ){
            MessageBox.Show(ex.ToString());
            client.Close();
        }
    }
}
It will run just fine right up until server.Accept(), then hangs. As stated, it did work for a while earlier, but is now hanging again. I've tried to see if any other programs are using port 10000, and they aren't. I went over and over this with a friend, and we couldn't find the problem. Please help!
EDIT To be clear, I do know that Accept is a blocking call. The client program makes the connection on port 10000, but this program keeps on waiting on the Accept as if nothing happened. It did work for a time, so I know the connection is working like it is supposed to from the client program's end. However, I can't fathom why this program is now acting like that connection never happens, and continues to wait on the Accept.

Accept blocks on purpose. If you want to do other things while waiting for another client to connect you can:
Run the ServerListener in another Thread or better - a long running task:
using System.Threading.Tasks;
...
Task.Factory.StartNew(ServerListener, TaskCreationOptions.LongRunning);
Use the AcceptAsync method which uses the SocketAsyncEventArgs class. For that to work, you create a new SocketAsyncEventArgs instance, set its values and pass it to socket.AcceptAsync.
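The AcceptAsync option could look roughly like this. It is a minimal sketch using only System.Net.Sockets; the AsyncAcceptor class, its members, and the backlog of 100 are hypothetical choices for illustration, and the loop simply stops accepting on any socket error (including the listener being disposed), which a real server would probably want to handle more carefully.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

// Hypothetical helper: accepts clients without blocking the calling thread.
public sealed class AsyncAcceptor : IDisposable
{
    private readonly Socket _listener;
    private readonly Action<Socket> _onClient;

    public AsyncAcceptor(IPEndPoint endPoint, Action<Socket> onClient)
    {
        _onClient = onClient ?? throw new ArgumentNullException(nameof(onClient));
        _listener = new Socket(endPoint.AddressFamily, SocketType.Stream, ProtocolType.Tcp);
        _listener.Bind(endPoint);
        _listener.Listen(100); // backlog value is an arbitrary choice
    }

    // The endpoint actually bound (useful when port 0 was requested).
    public IPEndPoint LocalEndPoint => (IPEndPoint)_listener.LocalEndPoint;

    // Starts accepting; returns immediately.
    public void Start()
    {
        var args = new SocketAsyncEventArgs();
        // Completed fires only when AcceptAsync returned true (pending).
        args.Completed += (sender, e) =>
        {
            if (HandleAccept(e)) BeginAccept(e);
        };
        BeginAccept(args);
    }

    private void BeginAccept(SocketAsyncEventArgs args)
    {
        while (true)
        {
            args.AcceptSocket = null; // must be cleared before the args are reused
            bool pending;
            try
            {
                pending = _listener.AcceptAsync(args);
            }
            catch (ObjectDisposedException)
            {
                return; // listener was disposed; stop accepting
            }
            if (pending) return;             // Completed will be raised later
            if (!HandleAccept(args)) return; // completed synchronously
        }
    }

    // Hands the accepted socket to the callback; false means stop accepting.
    private bool HandleAccept(SocketAsyncEventArgs args)
    {
        if (args.SocketError != SocketError.Success) return false;
        _onClient(args.AcceptSocket);
        return true;
    }

    public void Dispose() => _listener.Dispose();
}
```

The callback runs on a thread-pool thread, so any UI update made from it (such as the UpdateCount call in the question) would still need to be marshalled to the UI thread.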

Multithreading using AsyncCallback and GUI controls

Multithread programming is a new concept for me. I’ve done a bunch of reading and even with many examples, I just can’t seem to figure it out. I'm new to C# and programming.
I have a WinForms project with lots of custom controls I’ve imported and will utilize many TcpClients. I’m trying to get each control to be hosted on its own separate thread. Right now, I’m trying to get one control to behave appropriately with its own thread.
I'll show you what I have and then follow up with some questions regarding guidance.
string asyncServerHolder; // gets the server name from a text_changed event
int asyncPortHolder; // gets the port # from a text_changed event
TcpClient wifiClient = new TcpClient();
private void btnStart_Click(object sender, EventArgs e)
{
    ... // variable initialization, etc.
    ... // XML setup, http POST setup.
    send(postString + XMLString); // Content to send.
}
private void send(string msg)
{
    AsyncCallback callBack = new AsyncCallback(ContentDownload);
    wifiClient.BeginConnect(asyncServerHolder, asyncPortHolder, callBack, wifiClient);
    wifiClient.Client.Send(System.Text.Encoding.ASCII.GetBytes(msg));
}
private void ContentDownload(IAsyncResult result)
{
    if (wifiClient.Connected)
    {
        string response4 = "Connected!!"; //debug msg
        byte[] buff = new byte[1024];
        int i = wifiClient.Client.Receive(buff);
        do
        {
            response1 = System.Text.Encoding.UTF8.GetString(buff, 0, i);
        } while (response1.Length == 0);
        response2 = response1.Substring(9, 3); // pick out status code to be displayed after
        wifiClient.Client.Dispose();
        wifiClient.Close();
    }
}
If you're knowledgeable about this, I bet you see lots of problems above. As it stands right now, I always get an exception on my first iteration of running this sequence:
"A request to send or receive data was disallowed because the socket is not connected and (when sending on a datagram socket using a sendto call) no address was supplied"
Why is this? I have confirmed that my asyncServerHolder and my asyncPortHolder are correct. My second iteration of attempting allowed me to see response4 = "Connected!!" but I get a null response on response1.
Eventually I'd like to substitute in my user controls which I have in a List. I'd just like to gracefully connect, send my msg, receive my response and then allow my form to notify me from that particular control which plays host to that tcp client. My next step would be link up many controls.
Some questions:
1) Do I need more TCP clients? Should they be in a list and be the # of controls I have enabled at that time of btnStart_Click?
2) My controls are on my GUI, does that mean I need to invoke if I'm interacting with them?
3) I see many examples using static methods with this context. Why is this?
Thanks in advance. All criticism is welcome, feel free to be harsh!

BeginConnect returns immediately. Probably, no connection has been established yet when Send runs. Make sure that you use the connection only after having connected.
if (wifiClient.Connected) and what if !Connected? You just do nothing. That's not a valid error recovery strategy. Remove this if entirely.
In your read loop you destroy the previously read contents on each iteration. In fact, you can't split up a UTF-8 encoded string at arbitrary byte positions and decode the parts separately, because a multi-byte character may straddle the split. Read all bytes into a buffer and decode them to a string only once you have received everything.
wifiClient.Client.Dispose();
wifiClient.Close();
Superstitious dispose pattern. wifiClient.Dispose(); is the canonical way to release everything.
I didn't quite understand what "controls" you are talking about. A socket is not a control. UI controls are single-threaded. Only access them on the UI thread.
Do I need more TCP clients?
You need one for each connection.
Probably, you should use await for all blocking operations. There are wrapper libraries that make the socket APIs usable with await.
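Put together, the points above (connect before sending, collect all bytes before decoding, dispose once) might look like the following sketch. It uses the built-in Task-based TcpClient and Stream methods rather than a wrapper library; the TcpRequest class and its method names are hypothetical, and it assumes the server closes the connection after sending its response, which is what makes "read until end" terminate.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper illustrating the advice above.
public static class TcpRequest
{
    // Reads until the remote side closes the stream and returns every byte.
    public static async Task<byte[]> ReadToEndAsync(Stream stream)
    {
        var buffer = new byte[1024];
        using (var collected = new MemoryStream())
        {
            int read;
            while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                // Append rather than overwrite, so earlier chunks are kept.
                collected.Write(buffer, 0, read);
            }
            return collected.ToArray();
        }
    }

    // Connects, sends the message, and returns the full response as text.
    // Assumes the server closes the connection when its response is complete.
    public static async Task<string> SendAsync(string host, int port, string message)
    {
        using (var client = new TcpClient()) // Dispose releases the socket too
        {
            // The connection is established before anything is sent.
            await client.ConnectAsync(host, port);
            NetworkStream stream = client.GetStream();

            byte[] request = Encoding.ASCII.GetBytes(message);
            await stream.WriteAsync(request, 0, request.Length);

            // Decode only after all bytes have arrived.
            byte[] response = await ReadToEndAsync(stream);
            return Encoding.UTF8.GetString(response);
        }
    }
}
```

From a button handler this could be called as string response = await TcpRequest.SendAsync(asyncServerHolder, asyncPortHolder, postString + XMLString); in an async void btnStart_Click, after which the code continues on the UI thread and can update controls directly.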

Waiting for networking C# console application to fully start

I have run into an issue with slow C# start-up time causing the first UDP packets to drop. Below is what I have done to mitigate this start-up delay. I essentially wait an additional 10 ms after each of the first two packet transmissions. This fixes the initial drops, at least on my machine. My concern is that a slower machine may need a longer delay than this.
private void FlushPacketsToNetwork()
{
    MemoryStream packetStream = new MemoryStream();
    while (packetQ.Count != 0)
    {
        byte[] packetBytes = packetQ.Dequeue().ToArray();
        packetStream.Write(packetBytes, 0, packetBytes.Length);
    }
    byte[] txArray = packetStream.ToArray();
    udpSocket.Send(txArray);
    txCount++;
    ExecuteStartupDelay();
}
// socket takes too long to transmit unless I give it some time to "warm up"
private void ExecuteStartupDelay()
{
    if (txCount < 3)
    {
        timer.SpinWait(10e-3);
    }
}
So, I am wondering: is there a better approach to let C# fully load all of its dependencies? I really don't mind if it takes several seconds to completely load; I just do not want to do any high-bandwidth transmissions until C# is ready for full speed.
Additional relevant details
This is a console application, the network transmission is run from a separate thread, and the main thread just waits for a key press to terminate the network transmitter.
In the Program.Main method I have tried to get the most performance from my application by using the highest priorities reasonable:
public static void Main(string[] args)
{
    Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.High;
    ...
    Thread workerThread = new Thread(new ThreadStart(worker.Run));
    workerThread.Priority = ThreadPriority.Highest;
    workerThread.Start();
    ...
    Console.WriteLine("Press any key to stop the stream...");
    WaitForKeyPress();
    worker.RequestStop = true;
    workerThread.Join();
Also, the socket settings I am currently using are shown below:
udpSocket = new Socket(targetEndPoint.Address.AddressFamily,
                       SocketType.Dgram,
                       ProtocolType.Udp);
udpSocket.Ttl = ttl;
udpSocket.SendBufferSize = 1024 * 1024;
udpSocket.Blocking = true;
udpSocket.Connect(targetEndPoint);
The default SendBufferSize is 8192, so I went ahead and moved it up to a megabyte, but this setting did not seem to have any effect on the dropped packets at the beginning.

From the comments I learned that TCP is not an option for you (because of inherent delays in transmission), and that you do not want to lose packets because the other side is not fully loaded.
So you actually need to implement some of the features present in TCP (retransmission), but in a more lightweight fashion. I also assume that you are in control of the receiving side.
I propose that you send some predetermined number of packets and then wait for confirmation. For instance, every packet can carry an id that increases with each packet. Every N packets, the receiving application sends the id of the last packet it received back to the sender. After receiving this number, the sender knows whether it needs to repeat the last N packets.
This approach should not hurt your bandwidth very much, and you will get some information about the data that was received (although delivery is still not guaranteed).
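The sender's bookkeeping for such a scheme could be sketched as below. This is only one possible reading of the proposal: it assumes the receiver reports the highest id it received in order, so that everything after that id still has to be resent. The AckWindow class and its members are hypothetical names, and the actual socket sends and receives are left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sender-side bookkeeping for "acknowledge every N packets".
public sealed class AckWindow
{
    private readonly int _windowSize; // N: packets sent between acknowledgements
    private readonly Queue<KeyValuePair<long, byte[]>> _unacked =
        new Queue<KeyValuePair<long, byte[]>>();
    private long _nextId = 1;

    public AckWindow(int windowSize)
    {
        if (windowSize <= 0) throw new ArgumentOutOfRangeException(nameof(windowSize));
        _windowSize = windowSize;
    }

    // Assigns the next id to a payload and remembers it until acknowledged.
    public long Register(byte[] payload)
    {
        long id = _nextId++;
        _unacked.Enqueue(new KeyValuePair<long, byte[]>(id, payload));
        return id;
    }

    // True once N packets are in flight and the sender should wait
    // for the receiver's report before sending more.
    public bool AwaitingAck => _unacked.Count >= _windowSize;

    // Called with the id the receiver reported (assumed: highest id received
    // in order). Drops acknowledged packets and returns those to resend.
    public IReadOnlyList<KeyValuePair<long, byte[]>> Acknowledge(long lastReceivedId)
    {
        while (_unacked.Count > 0 && _unacked.Peek().Key <= lastReceivedId)
        {
            _unacked.Dequeue();
        }
        return _unacked.ToArray();
    }
}
```

In use, the sender would call Register before each udpSocket.Send, prepend the returned id to the datagram, and once AwaitingAck is true wait (with a timeout) for the receiver's report before passing it to Acknowledge and resending whatever comes back.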
Otherwise it is best to switch to TCP. By the way, did you try using TCP? How much does it hurt your bandwidth?
