It seems that with any given input both of these functions return the same value.
Does that mean my computer (Win7) is big-endian? I know network order is big-endian, so converting between the two should do nothing, then?
I am a bit confused about when I have to use these functions. I am trying to write a simple client-server program and am currently just familiarizing myself with what MSDN has to say about the NetworkStream, IPAddress, and TcpClient classes.
When would I need to use these functions, if at all? When sending byte arrays to the server and back, would I need to call these functions on the individual bytes before sending them off? I'd imagine not. What about if I prepend the data with a length integer - would I need to call HostToNetworkOrder on that?
Both functions do the exact same conversion; there are two functions so that your code will be more readable and the intention will stand out better.
Your Windows system is running on an Intel (or AMD) processor, and it's the processor that determines the byte order... these are little-endian machines.
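For illustration, here's a minimal sketch (assuming a typical little-endian Intel/AMD machine) showing that both calls simply swap the byte order, which is why they return the same value for the same input; on a big-endian machine both would be no-ops instead:
using System;
using System.Net;

int host = 0x01020304;

// On a little-endian machine both calls reverse the byte order,
// so they return the same value for the same input.
int toNetwork = IPAddress.HostToNetworkOrder(host);
int backToHost = IPAddress.NetworkToHostOrder(host);

Console.WriteLine(BitConverter.IsLittleEndian);   // True on x86/x64
Console.WriteLine(toNetwork == backToHost);       // True
Console.WriteLine(toNetwork.ToString("X8"));      // 04030201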
Okay, so I'm trying to make a multiplayer game with self-built netcode in Unity3D using C#.
The thing is, since I'm using raw TCP I need to convert everything to a byte[], but I got sick of using Array.Copy, because I'm reserving a few bytes of every message sent over the network as a sort of message identifier that I can use to interpret the data I receive.
So my question is, for the purpose of making this code more friendly to myself, is it a terrible idea to use a List<byte> instead of a byte array, and once I've prepared the message to be sent, just call .ToArray() on that list?
Would this be terrible for performance?
As a general rule when dealing with sockets: you should usually be working with oversized arrays (ArrayPool<byte>.Shared can be useful here), and then use the Send overloads that accept either byte[], int, int (offset+count) or ArraySegment<byte> - so you aren't constantly re-copying things, and can re-use buffers. However, frankly: you may also want to look into "pipelines"; this is the new IO API that Kestrel uses that deals with all the buffer management for you, so you don't have to. In the core framework, pipelines is currently mostly server-focused, but this is hopefully being improved in .NET 5 as part of "Bedrock" - however, client-side pipelines wrappers are available in Pipelines.Sockets.Unofficial (on NuGet).
To be explicit: no, constantly calling ToArray() on a List<byte> is not a good way of making buffers more convenient; that will probably work, but could contribute to GC stalls, plus it is unnecessary overhead in your socket code.
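As a rough sketch of the rented-buffer idea described above (the SendMessage helper and its msgId parameter are just illustrative names, not part of any library):
using System;
using System.Buffers;
using System.Net.Sockets;

// Rent an oversized buffer, write the identifier byte plus payload,
// send only the bytes actually used, then return the buffer to the pool.
static void SendMessage(Socket socket, byte msgId, ReadOnlySpan<byte> payload)
{
    byte[] buffer = ArrayPool<byte>.Shared.Rent(1 + payload.Length);
    try
    {
        buffer[0] = msgId;                    // message identifier
        payload.CopyTo(buffer.AsSpan(1));     // payload follows the header
        socket.Send(buffer, 0, 1 + payload.Length, SocketFlags.None);
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(buffer);   // re-use instead of allocating
    }
}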
I'm writing a system where two applications (say server and client) communicate via a TCP-based connection on localhost.
The code is fairly performance critical, so I'm trying to optimize as best as possible.
The code below is from the server application. To send messages, my naive approach was to create a BinaryWriter from the TcpClient's stream, and write each value of the message via the BinaryWriter.
So let's say the message consists of four values: a long, followed by a boolean, and then two more longs. The naive approach was:
TcpClient client = ...;
var writer = new BinaryWriter(client.GetStream());
// The following takes ca. 0.55ms:
writer.Write((long)123);
writer.Write(true);
writer.Write((long)456);
writer.Write((long)2);
With 0.55ms execution time, this strikes me as fairly slow.
Then, I've tried the following instead:
TcpClient client = ...;
// The following takes ca. 0.15ms:
var b1 = BitConverter.GetBytes((long)123);
var b2 = BitConverter.GetBytes(true);
var b3 = BitConverter.GetBytes((long)456);
var b4 = BitConverter.GetBytes((long)2);
var result = new byte[b1.Length + b2.Length + b3.Length + b4.Length];
Array.Copy(b1, 0, result, 0, b1.Length);
Array.Copy(b2, 0, result, b1.Length, b2.Length);
Array.Copy(b3, 0, result, b1.Length + b2.Length, b3.Length);
Array.Copy(b4, 0, result, b1.Length + b2.Length + b3.Length, b4.Length);
client.GetStream().Write(result, 0, result.Length);
The latter runs in ca. 0.15 ms, while the first approach took roughly 0.55 ms, so the first is 3-4 times slower.
I'm wondering ... why?
And more importantly, what would be the best way to write messages as fast as possible (while maintaining at least a minimum of code readability)?
The only way I could think of right now is to create a custom class similar to BinaryWriter;
but instead of writing each value directly to the stream, it would buffer a certain amount of data (say 10,000 bytes or so) and only send it to the stream when its internal buffer is full, or when some .Flush() method is explicitly called (e.g. when the message is done being written).
This should work, but I wonder if I'm overcomplicating things and there's an even simpler way to achieve good performance?
And if this was indeed the best way - any suggestions on how big the internal buffer should ideally be? Does it make sense to align this with Winsock's send and receive buffers, or is it best to make it as big as possible (or rather, as big as is sensible given memory constraints)?
Thanks!
The first code does four blocking network I/O operations, while the second one does only one. Most types of I/O operations incur quite heavy overhead, so you usually want to avoid small writes/reads and batch things up.
You should always serialize your data and, if possible, batch it into a single message. This way you avoid as much I/O overhead as possible.
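As a rough sketch of that batching, one option is to keep the BinaryWriter convenience but point it at a MemoryStream and then push the whole message to the socket with a single write:
using System.IO;
using System.Net.Sockets;

TcpClient client = ...;

var ms = new MemoryStream();
var writer = new BinaryWriter(ms);

// Build the whole message in memory first - no network I/O yet.
writer.Write((long)123);
writer.Write(true);
writer.Write((long)456);
writer.Write((long)2);
writer.Flush();

// One blocking network write instead of four.
client.GetStream().Write(ms.GetBuffer(), 0, (int)ms.Length);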
The question is probably more about interprocess communication (IPC) than about the TCP protocol. There are multiple options for IPC (see the Interprocess Communications page on Microsoft Dev Center). First you need to define your system requirements (how the system should perform/scale), then you need to choose the simplest option that works best in your particular scenario, using performance metrics.
Relevant excerpt from Performance Culture article by Joe Duffy:
Decent engineers intuit. Good engineers measure. Great engineers do both.
Measure what, though?
I put metrics into two distinct categories:
Consumption metrics. These directly measure the resources consumed by running a test.
Observational metrics. These measure the outcome of running a test, observationally, using metrics “outside” of the system.
Examples of consumption metrics are hardware performance counters, such as instructions retired, data cache misses, instruction cache misses, TLB misses, and/or context switches. Software performance counters are also good candidates, like number of I/Os, memory allocated (and collected), interrupts, and/or number of syscalls. Examples of observational metrics include elapsed time and cost of running the test as billed by your cloud provider. Both are clearly important for different reasons.
As for TCP, I don't see the point of writing data in small pieces when you can write it all at once. You can use BufferedStream to decorate the TCP client's stream and use the same BinaryWriter with it. Just make sure you don't mix reads and writes in a way that forces BufferedStream to try to write its internal buffer back to the stream, because that operation is not supported by NetworkStream. See the Is it better to send 1 large chunk or lots of small ones when using TCP? and Why would BufferedStream.Write throw “This stream does not support seek operations”? discussions on StackOverflow.
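A minimal sketch of that approach (the 8 KB buffer size is just an example value, not a recommendation):
using System.IO;
using System.Net.Sockets;

TcpClient client = ...;

// Small writes are coalesced in memory and only hit the socket when
// the buffer fills up or Flush() is called.
var buffered = new BufferedStream(client.GetStream(), 8192);
var writer = new BinaryWriter(buffered);

writer.Write((long)123);
writer.Write(true);
writer.Write((long)456);
writer.Write((long)2);

writer.Flush();   // pushes the buffered bytes to the network stream in one go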
For more information check Example of Named Pipes, C# Sockets vs Pipes, IPC Mechanisms in C# - Usage and Best Practices, When to use .NET BufferedStream class? and When is optimisation premature? discussions on StackOverflow.
I was wondering about the order of bytes sent and received by a TCP socket.
I have a socket implemented; it's up and working, so that's good.
I also have something called "a message" - it's a byte array that contains a string (serialized to bytes) and two integers (converted to bytes). It has to be like that - project specifications :/
Anyway, I was wondering about how it is working on bytes:
In byte array, we have order of bytes - 0,1,2,... Length-1. They sit in memory.
How are they sent? Is the last one the first to be sent, or is it the first one? Receiving, I think, is quite easy - the first byte to arrive goes into the first free place in the buffer.
I think a little image I made nicely shows what I mean.
They are sent in the same order they are present in memory. Doing otherwise would be more complex... How would you even do that with a continuous stream of bytes? Wait until the last one has been sent and then reverse everything? Or should the inversion work "packet by packet", so that each block of 2 KB (or whatever the size of the TCP packets is) is internally reversed but the order of the packets is "correct"?
Receiving, I think, is quite easy - the first byte to arrive goes into the first free place in the buffer.
Why on earth should the sender reverse the bytes but not the receiver? If you build a symmetric system, either both do an action or neither does!
Note that the real problem is normally one of endianness. The memory layout of an int on your computer could be different from the layout of an int on another computer, so one of the two computers may have to reverse the 4 bytes of the int. But endianness is something that is resolved primitive type by primitive type. Many internet protocols are, for historical reasons, big-endian, while Intel CPUs are little-endian. Even the internal fields of TCP are big-endian (see Big endian or Little endian on net?), but here we are speaking of the fields of TCP itself, not of the data moved by the TCP protocol.
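If your project's wire format does specify a byte order (say, big-endian) for those two integers, a small sketch of converting a single int in the usual per-primitive way might look like this:
using System;
using System.Net;

int value = 123456;

// HostToNetworkOrder produces network (big-endian) order regardless of the
// local CPU; on a little-endian PC it swaps the bytes, on big-endian it's a no-op.
byte[] wireBytes = BitConverter.GetBytes(IPAddress.HostToNetworkOrder(value));

// The receiver undoes the conversion, again primitive type by primitive type.
int received = IPAddress.NetworkToHostOrder(BitConverter.ToInt32(wireBytes, 0));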
When writing functions that operate on a potentially infinite "stream" of data, i.e. bytes, chars, whatever, what are some design considerations in deciding to use Strings/Arrays vs. Streams for the input/output?
Is there a huge performance impact of always writing functions to use streams, and then making overload methods that use stream wrappers (i.e. StringReader/Writer) to return "simple data" like an array or a string that doesn't require disposing and other considerations?
I would think functions operating on arrays are far more convenient because you can "return" a resulting array and you don't usually have to worry about disposing. I assume stream operations are nice because they can operate on an infinite source of data, and are probably memory-efficient as well.
If you are working with binary data of unknown size, always use streams. Reading an entire file into a byte array, for example, is usually a bad idea if it can be avoided. Most functions in .NET that work with binary data, such as encryption and compression, are built to use streams as input/output.
If you are writing a function to process a stream of data, then why not pass it as an IEnumerable<T>? You can then return a stream of results as an IEnumerable<T> from a generator function - in other words, using yield return to return each result one at a time.
You can end up with asymptotic improvements in performance in some cases because the evaluation is done as needed.
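A small sketch of that generator idea (the ReadLines name is just illustrative):
using System.Collections.Generic;
using System.IO;

// Lazily reads a potentially unbounded text stream one line at a time;
// each element is produced only when the caller asks for the next one.
static IEnumerable<string> ReadLines(TextReader reader)
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        yield return line;
    }
}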
I'm using C#.Net and the Socket class from the System.Net.Sockets namespace. I'm using the asynchronous receive methods. I understand this can be more easily done with something like a web service; this question is borne out of my curiosity rather than a practical need.
My question is: assume the client is sending some binary-serialized object of an unknown length. On my server with the socket, how do I know the entire object has been received and that it is ready for deserialization? I've considered prepending the object with the length of the object in bytes, but this seems unnecessary in the .Net world. What happens if the object is larger than the buffer? How would I know, 'hey, gotta resize the buffer because the object is too big'?
You either need the protocol to be self-terminating (like XML is, effectively - you know when you've finished receiving an XML document when it closes the root element) or you need to length-prefix the data, or you need the other end to close the stream when it's done.
In the case of a self-terminated protocol, you need to have enough hooks in so that the reading code can tell when it's finished. With binary serialization you may well not have enough hooks. Length-prefix is by far the easiest solution here.
If you use pure sockets, you need to know the length. Otherwise, the size of the buffer is not relevant, because even if you have a buffer the size of the whole data, a single read still may not fill it - check the Stream.Read method; it returns the number of bytes actually read, so you need to loop until all data has been received.
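A rough sketch of the length-prefixed read with that loop (the 4-byte length prefix and the ReadExactly helper name are just an assumed convention for illustration):
using System;
using System.IO;

// Keep calling Read until exactly 'count' bytes have arrived, because a
// single Read may return fewer bytes than requested.
static void ReadExactly(Stream stream, byte[] buffer, int count)
{
    int offset = 0;
    while (offset < count)
    {
        int read = stream.Read(buffer, offset, count - offset);
        if (read == 0)
            throw new EndOfStreamException("Connection closed before the full message arrived.");
        offset += read;
    }
}

// Usage: read the length prefix first, then size the payload buffer to fit.
// byte[] prefix = new byte[4];
// ReadExactly(stream, prefix, 4);
// int length = BitConverter.ToInt32(prefix, 0);
// byte[] payload = new byte[length];
// ReadExactly(stream, payload, length);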
Yeah, you won't be able to deserialize until you've received all the bytes.