Use increased priority for sockets communication in .NET? - c#

I have a C# application that uses System.Net.Sockets.Socket to communicate with some devices on a local network. Each time a message is sent from the application to a device, an acknowledgement from the device is expected, usually within 200 milliseconds. If the acknowledgement is not received within a given timeout period, an exception is thrown.
There is one socket per device.
Reception is done with the socket's ReceiveAsync method.
Some users report seeing the acknowledgement timeout exception even though I have increased the timeout period to one second. My worry is that these users may be running another application that is CPU intensive and thus interfering with the reception of packets from the devices.
Should I consider raising the priority of my application? Or does .NET already assign an increased priority to socket events, or is the system's time slice for each thread short enough that I do not need to worry about this?
Thank you.

Network performance can take a "burp" and suffer a hit at any time, regardless of process priority.
With that in mind, 1 second is a really short timeout interval. You didn't say TCP or UDP. For TCP, I'd wait longer - much longer. For UDP, add retry logic.
Users don't want to see exceptions. They want to see the application working.
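If the protocol is UDP, the retry pattern is basically: send, wait briefly for the acknowledgement, and resend a bounded number of times before surfacing a failure. A rough sketch using a synchronous UdpClient; the timeout, retry count, and the IsAckFor check are made up for illustration and are not from the question:

    using System;
    using System.Net;
    using System.Net.Sockets;

    static class AckedSend
    {
        // Sends a datagram and waits for an acknowledgement, retrying a few
        // times before giving up. Timeout and retry count are illustrative.
        public static bool SendWithRetry(UdpClient client, byte[] message, IPEndPoint device)
        {
            const int maxAttempts = 3;
            client.Client.ReceiveTimeout = 200;  // milliseconds per attempt

            for (int attempt = 0; attempt < maxAttempts; attempt++)
            {
                client.Send(message, message.Length, device);
                try
                {
                    IPEndPoint from = null;
                    byte[] reply = client.Receive(ref from);
                    if (IsAckFor(reply, message))   // hypothetical ack check
                        return true;
                }
                catch (SocketException ex)
                {
                    if (ex.SocketErrorCode != SocketError.TimedOut)
                        throw;
                    // No ack within the timeout; fall through and retry.
                }
            }
            return false;  // the caller decides how to surface the failure (log, exception, etc.)
        }

        // Hypothetical: how an ack is matched to a message depends on the device protocol.
        private static bool IsAckFor(byte[] reply, byte[] message)
        {
            return reply != null && reply.Length > 0;
        }
    }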

Related

Socket buffers the data it receives

I have a client .NET application and a server .NET application, connected through sockets.
The client sends a string of 20 or so characters every 500 milliseconds.
On my local development machine this works perfectly, but once the client and the server are on two different servers, the server does not receive the string immediately when it is sent. The client still sends perfectly; I've confirmed this with Wireshark. I have also confirmed that the server does receive the strings every 500 milliseconds.
The problem is that my server application, which is waiting for the message, only actually receives it every 20 seconds or so - and then it receives all of the content from those 20 seconds at once.
I use asynchronous sockets and for some reason the callback is just not invoked more than once every 20 seconds.
In AcceptCallback it establishes the connection and calls BeginReceive:
handler.BeginReceive(state.buffer, 0, StateObject.BufferSize, 0, new AsyncCallback(ReadCallback), state);
This works on my local machine, but on my production server the ReadCallback doesn't happen immediately.
The BufferSize is set to 1024. I also tried setting it to 10. It makes a difference in how much data it will read from the socket at one time once the ReadCallback is invoked, but that's not really the problem here. Once it invokes ReadCallback, the rest works fine.
I'm using Microsoft's Asynchronous Server Socket Example, so you can see there what my ReadCallback method looks like.
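For readers who don't have that sample to hand, its ReadCallback is roughly along these lines (a paraphrase of the MSDN example, not the poster's exact code):

    using System;
    using System.Net.Sockets;
    using System.Text;

    // Per-connection state, as in the MSDN sample.
    public class StateObject
    {
        public const int BufferSize = 1024;
        public Socket workSocket;
        public byte[] buffer = new byte[BufferSize];
        public StringBuilder sb = new StringBuilder();
    }

    public class AsynchronousSocketListener
    {
        public static void ReadCallback(IAsyncResult ar)
        {
            StateObject state = (StateObject)ar.AsyncState;
            Socket handler = state.workSocket;

            int bytesRead = handler.EndReceive(ar);
            if (bytesRead > 0)
            {
                // Accumulate what has arrived so far.
                state.sb.Append(Encoding.ASCII.GetString(state.buffer, 0, bytesRead));

                if (state.sb.ToString().IndexOf("<EOF>") > -1)
                {
                    // A complete message has been received; hand it off / send a response.
                }
                else
                {
                    // More data may follow; post another receive.
                    handler.BeginReceive(state.buffer, 0, StateObject.BufferSize, 0,
                                         new AsyncCallback(ReadCallback), state);
                }
            }
        }
    }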
How can I get the BeginReceive callback immediately when data arrives at the server?
--
UPDATE
This has been solved. It was because the server had a single processor with a single core. After adding another core, the problem was instantly solved; ReadCallback is now called immediately when the call reaches the server.
Thank you all for your suggestions!
One approach might be to adjust the SO_SNDBUF option on the send side. Since you are not running into this problem when the server and client are on the same box, it is possible that a small buffer is throttling the send side due to a (possibly) slower sending rate between the servers. If the sender cannot send fast enough, the send-side buffer might be filling up sooner.
Update: we did some debugging, and it turns out that the issue is the application itself being slower.
It might be that the Nagle algorithm is waiting on the sender side for more data. If you are sending small chunks of data, they will be merged into one packet so you don't pay a large TCP header overhead for each small piece of data.
You can disable it using: StreamSocketControl.NoDelay
See: http://msdn.microsoft.com/en-us/library/windows/apps/windows.networking.sockets.streamsocketcontrol.nodelay
The Nagle algorithm might be disabled for loopback and this is a possible explanation of why it works when you have both the sender and the receiver on the same machine.
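The linked StreamSocketControl.NoDelay property applies to Windows Runtime sockets; for a classic System.Net.Sockets socket, the equivalent is the Socket.NoDelay property. A minimal sketch, assuming a plain TCP socket:

    using System.Net.Sockets;

    // Disable the Nagle algorithm so small writes are sent immediately
    // instead of being coalesced into larger segments.
    Socket socket = new Socket(AddressFamily.InterNetwork,
                               SocketType.Stream, ProtocolType.Tcp);
    socket.NoDelay = true;

    // Equivalent, via the socket-option API:
    socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.NoDelay, true);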
At the request of the OP, I'm duplicating my comment as an answer here.
My guess was that the problem appeared because of thread scheduling on a single-core machine. This is an old problem, almost extinct in the modern age of hyper-threaded/multi-core processors. When a thread is spawned in the course of a program's execution, it needs scheduled time to run.
On a single-core machine, if one thread continues to execute without explicitly passing control to OS scheduler (by waiting for mutex/signal or by calling Sleep), the execution of any other thread (in the same process and with lower priority) may be postponed indefinitely by the scheduler. Hence, in the case described, the asynchronous network thread was (most likely) just starved for execution time - getting only pieces from time to time.
Adding a second CPU/core obviously fixed that by providing a parallel scheduling environment.

Communication WCF constant connection

I am using WCF for two computers to communicate over the network, and I am executing a method on the remote server. The time the operation can take is not known; it can be anywhere from 1 second to a day or more. So I want to set the ((IClientChannel)pipeProxy).OperationTimeout property to a high value, but is this the way to go, or is it a dirty way of programming because a connection stays active the whole time? (It is all on a relatively stable LAN.)
I wouldn't do it like that. Such a long timeout is likely to cause issues.
I would split the operation into two: One call from client to server which starts the operation, and then a callback from the server to the client to say that it's finished. The callback would of course include any result information (success, failure etc).
For something which takes such a long time, you might also want to introduce a "keep alive" mechanism where the client periodically calls the server to check that it is still responding.
If you have a very long timeout, it makes it hard to know if something has actually gone wrong. But if you split the operation into two, it makes it impossible to know if something has gone wrong unless you poll occasionally with a keep-alive (or more accurately, "are you alive?") style message.
Alternatively, you could have the server call back occasionally with a progress message, but that's a bit harder to manage than having the client poll the server occasionally (because the client would have to track the last time the server called it back to determine whether the server had stopped responding).
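A minimal sketch of the split call-plus-callback shape, using a WCF duplex contract. The contract and operation names (ILongRunningService, IOperationCallback, StartOperation, Ping) are hypothetical, not from the question:

    using System.ServiceModel;

    // Callback contract implemented by the client; the server obtains it via
    // OperationContext.Current.GetCallbackChannel<IOperationCallback>().
    public interface IOperationCallback
    {
        [OperationContract(IsOneWay = true)]
        void OperationCompleted(bool success, string result);
    }

    // Service contract: StartOperation returns as soon as the work is queued,
    // so no channel has to stay blocked for hours.
    [ServiceContract(CallbackContract = typeof(IOperationCallback))]
    public interface ILongRunningService
    {
        [OperationContract(IsOneWay = true)]
        void StartOperation(string jobId);

        // Lightweight "are you alive?" check the client can poll periodically.
        [OperationContract]
        bool Ping();
    }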

Windows kernel queuing outbound network connections

We have an application (a meta-search engine) that must make 50 - 250 outbound HTTP connections in response to a user action, frequently.
The way we do this is by creating a bunch of HttpWebRequests and running them asynchronously using Action.BeginInvoke. This obviously uses the ThreadPool to launch the web requests, which run synchronously on their own thread. Note that it is currently this way as this was originally a .NET 2.0 app and there was no TPL to speak of.
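For context, that pattern looks roughly like the following sketch on classic .NET Framework (the FetchUrl helper and searchEndpoints list are made up; the question does not show its actual code):

    using System;
    using System.IO;
    using System.Net;

    class FanOutSketch
    {
        // Synchronous fetch of one endpoint; runs on a ThreadPool thread.
        static void FetchUrl(string url)
        {
            try
            {
                HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
                using (WebResponse response = request.GetResponse())
                using (StreamReader reader = new StreamReader(response.GetResponseStream()))
                {
                    string body = reader.ReadToEnd();
                    // ... hand the body off for result aggregation ...
                }
            }
            catch (WebException)
            {
                // A real implementation would log and surface the failure.
            }
        }

        static void Main()
        {
            string[] searchEndpoints = { "http://example.com/a", "http://example.com/b" };

            foreach (string url in searchEndpoints)
            {
                Action<string> fetch = FetchUrl;
                // Queues the synchronous fetch onto the ThreadPool (the pre-TPL pattern).
                fetch.BeginInvoke(url, fetch.EndInvoke, null);
            }

            Console.ReadLine();  // keep the process alive while requests complete
        }
    }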
What we see using ETW (our event sources combined with the .NET Framework and kernel ones) and NetMon is that while the thread pool can start 200 threads running our code in about 300 ms (so, no thread-pool exhaustion issues here), it takes a variable amount of time, sometimes up to 10 - 15 seconds, for the Windows kernel to make all the TCP connections that have been queued up.
This is very obvious in NetMon - you see around 60 - 100 TCP connections open (SYN) immediately (the number varies, but it's never more than around 120), then the rest trickle in over a period of time. It's as if the connections are being queued somewhere, but I don't know where, and I don't know how to tune this so we can make more concurrent outgoing connections. The Perfmon Outbound Connection Queue counter stays at 0, but in the Connections Established counter you can see an initial spike of connections, then a gradual increase as the rest filter through.
It does appear that latency to the endpoints to which we are connecting plays a part, as running the code close to the endpoints it connects to doesn't show the problem as significantly.
I've taken comprehensive ETW traces but there is no decent documentation on many of the Microsoft providers, which would be a help I'm sure.
Any advice to work around this or advice on tuning windows for a large amount of outgoing connections would be great. The platform is Win7 (dev) and Win2k8R2 (prod).
It looks like slow DNS queries are the culprit here. Looking at the ETW provider "Microsoft-Windows-Networking-Correlation", I can trace the network call from inception to connection and note that many connections are taking > 1 second at the DNS resolver (Microsoft-Windows-RPC).
It appears our local DNS server is slow, can't handle the load we are throwing at it, and isn't caching aggressively. Production wasn't showing symptoms as severe because the prod DNS servers do everything right.
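A quick way to confirm a finding like this outside of ETW is to time the resolution step on its own. A minimal sketch; the host list is hypothetical and should be replaced with the hosts the requests actually target:

    using System;
    using System.Diagnostics;
    using System.Net;

    class DnsTimingSketch
    {
        static void Main()
        {
            // Hypothetical host list; substitute the hosts your requests resolve.
            string[] hosts = { "example.com", "example.org" };

            foreach (string host in hosts)
            {
                Stopwatch sw = Stopwatch.StartNew();
                IPAddress[] addresses = Dns.GetHostAddresses(host);  // blocking resolve
                sw.Stop();

                Console.WriteLine("{0}: {1} ms, {2} address(es)",
                                  host, sw.ElapsedMilliseconds, addresses.Length);
            }
        }
    }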

500 Socket clients trying to push data to my socket server. Need help in handling

I am required to create a high-performance application where I will be getting 500 socket messages from my socket clients simultaneously. Based on my logs, I can see that my dual-core system is processing 80 messages at a time.
I am using async sockets (BeginReceive) and I have set NoDelay to true.
From the logs on my clients and my server, I can see that a message written by my client is read by my server only after 3-4 seconds.
The service time of my application should be much lower.
First, you should post your current code so any potential bugs can be identified.
Second, if you're on .NET 3.5, you might want to look at the SocketAsyncEventArgs enhancements.
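For reference, a receive loop built on SocketAsyncEventArgs looks roughly like this sketch (not the poster's code; the HandleMessage helper is hypothetical):

    using System;
    using System.Net.Sockets;

    class ReceiveLoopSketch
    {
        // One SocketAsyncEventArgs per connection; reusing it avoids allocating
        // an IAsyncResult per receive, which is the main win over BeginReceive.
        public void StartReceive(Socket socket)
        {
            SocketAsyncEventArgs args = new SocketAsyncEventArgs();
            args.SetBuffer(new byte[1024], 0, 1024);
            args.UserToken = socket;
            args.Completed += OnReceiveCompleted;

            // ReceiveAsync returns false when the operation completed synchronously,
            // in which case Completed will not be raised, so handle it inline.
            if (!socket.ReceiveAsync(args))
                OnReceiveCompleted(socket, args);
        }

        private void OnReceiveCompleted(object sender, SocketAsyncEventArgs args)
        {
            Socket socket = (Socket)args.UserToken;
            if (args.SocketError == SocketError.Success && args.BytesTransferred > 0)
            {
                HandleMessage(args.Buffer, args.BytesTransferred);

                // Re-issue the receive to keep the loop going.
                if (!socket.ReceiveAsync(args))
                    OnReceiveCompleted(socket, args);
            }
            else
            {
                socket.Close();
            }
        }

        // Hypothetical handler; parse and dispatch the received bytes here.
        private void HandleMessage(byte[] buffer, int count)
        {
        }
    }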
Start looking at your resource usages:
CPU usage - both on the overall system, as well as your specific process.
Memory usage - same as above.
Networking statistics.
Once you identify where the bottleneck is, both you and the community will have an easier time deciding what to focus on next to improve performance.
A review of your code may also be necessary - but this may be more appropriate for https://codereview.stackexchange.com/.
When you call Socket.Listen, what is your backlog set to? I can't speak to .NET 4.0, but with 2.0 I have seen a problem where, once your backlog fills up (too many connection attempts too fast), some of the sockets get a TCP accept and then a TCP reset. The client then may or may not attempt to reconnect later. This causes a connection bottleneck rather than a data-throughput or processing bottleneck.
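For illustration, the backlog is the integer passed to Listen. A sketch of raising it from the small values often used in samples; the port number and backlog value here are arbitrary:

    using System.Net;
    using System.Net.Sockets;

    class ListenBacklogSketch
    {
        static void Main()
        {
            Socket listener = new Socket(AddressFamily.InterNetwork,
                                         SocketType.Stream, ProtocolType.Tcp);
            listener.Bind(new IPEndPoint(IPAddress.Any, 9000));  // arbitrary port for the sketch

            // The backlog bounds how many pending connections the OS will queue
            // before new connection attempts start getting refused or reset.
            listener.Listen(200);
        }
    }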

UDP Delay Potential

I have an application that consists of numerous systems using UDP clients in remote locations. All clients send UDP packets to a central location for processing. In my application, it is critical that the central location knows what time the packet was sent by the remote location.
From a design perspective, would it be "safe" to assume that the central location could timestamp the packets as they arrive and use that as the "sent time"? Since the app uses UDP, the packets should either arrive immediately or not arrive at all? The other option would be to set up some kind of time syncing on each remote location. The disadvantage to this is that then I would need to continually ensure that the time syncing is working on each of potentially hundreds of remote locations.
My question is whether timestamping the UDP packets at the central location to determine "sent time" is a potential flaw. Is it possible to experience any delay with UDP?
For seconds-level resolution you can timestamp the packet when you receive it, but you still need a sequence number to reject re-ordered or duplicate packets.
This can make your remote stations less complex, as they won't need a battery-backed clock or synchronisation techniques.
For millisecond resolution you would want to calculate the round-trip time (RTT) and use it as an offset against the clock on the receiver.
Unless you are using the precision time protocol (PTP) in a controlled environment you can never trust the clock of remote hosts.
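As a rough illustration of the sequence-number idea above, the receiver can keep the highest sequence number seen per sender and drop anything at or below it. A minimal sketch; the class and field names are made up:

    using System.Collections.Generic;
    using System.Net;

    class SequenceFilter
    {
        // Highest sequence number accepted so far, per remote endpoint.
        private readonly Dictionary<IPEndPoint, long> _lastSeen = new Dictionary<IPEndPoint, long>();

        // Returns true if the packet should be processed,
        // false if it is a duplicate or arrived out of order.
        public bool Accept(IPEndPoint sender, long sequenceNumber)
        {
            long last;
            if (_lastSeen.TryGetValue(sender, out last) && sequenceNumber <= last)
                return false;

            _lastSeen[sender] = sequenceNumber;
            return true;
        }
    }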
There is always a delay in transmission, and UDP packets do not have guaranteed delivery nor are they guaranteed to arrive in sequence.
We would need more information about the context to recommend a better solution.
One option would be to require that the client clocks are synchronized with an external atomic clock. To ensure this, and to make your UDP more robust, the server can reject any packets that arrive "late" (as determined by the difference between the server's clock, also externally synced, and the packet timestamp).
If your server is acking packets, it can report to the client that it is (possibly) out of sync so that it can re-sync itself.
If your server is not acking packets, your whole scheme is probably going to fail anyhow due to dropped or out-of-order packets.
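A sketch of the server-side rejection rule described above, assuming both clocks are externally synchronized; the tolerance value is arbitrary for illustration:

    using System;

    static class LatePacketCheck
    {
        // Reject packets whose claimed send time differs from the server's
        // (externally synced) clock by more than the allowed tolerance.
        private static readonly TimeSpan Tolerance = TimeSpan.FromSeconds(2);  // arbitrary for the sketch

        public static bool IsAcceptable(DateTime packetTimestampUtc)
        {
            TimeSpan skew = DateTime.UtcNow - packetTimestampUtc;
            return skew.Duration() <= Tolerance;
        }
    }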
