I'm using C#/.NET and the Socket class from the System.Net.Sockets namespace, with the asynchronous receive methods. I understand this can be done more easily with something like a web service; this question is born out of curiosity rather than a practical need.
My question is: assume the client is sending some binary-serialized object of an unknown length. On my server with the socket, how do I know the entire object has been received and that it is ready for deserialization? I've considered prepending the object with the length of the object in bytes, but this seems unnecessary in the .Net world. What happens if the object is larger than the buffer? How would I know, 'hey, gotta resize the buffer because the object is too big'?
You either need the protocol to be self-terminating (like XML is, effectively - you know when you've finished receiving an XML document when it closes the root element) or you need to length-prefix the data, or you need the other end to close the stream when it's done.
In the case of a self-terminated protocol, you need to have enough hooks in so that the reading code can tell when it's finished. With binary serialization you may well not have enough hooks. Length-prefix is by far the easiest solution here.
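A minimal sketch of length-prefixing, assuming a 4-byte big-endian length header in front of each serialized object. The header size, the byte order, and the names Framing, WriteFrame and TryReadFrame are my own choices for illustration, not something the question prescribes. It uses the synchronous Stream API for brevity; the same loop applies to the asynchronous receive methods.

```csharp
using System;
using System.IO;

public static class Framing
{
    // Writes a 4-byte big-endian length followed by the payload bytes.
    public static void WriteFrame(Stream stream, byte[] payload)
    {
        int len = payload.Length;
        byte[] header =
        {
            (byte)(len >> 24), (byte)(len >> 16), (byte)(len >> 8), (byte)len
        };
        stream.Write(header, 0, header.Length);
        stream.Write(payload, 0, payload.Length);
    }

    // Reads one frame. Returns false if the stream ended cleanly before a new frame began.
    public static bool TryReadFrame(Stream stream, out byte[] payload)
    {
        payload = new byte[0];
        byte[] header = new byte[4];
        int got = ReadFully(stream, header, 0, 4);
        if (got == 0) return false;
        if (got < 4) throw new EndOfStreamException("Truncated length header.");

        int length = (header[0] << 24) | (header[1] << 16) | (header[2] << 8) | header[3];
        if (length < 0) throw new InvalidDataException("Negative frame length.");

        // The buffer is sized from the header, so no resizing is ever needed.
        payload = new byte[length];
        if (ReadFully(stream, payload, 0, length) < length)
            throw new EndOfStreamException("Truncated payload.");
        return true;
    }

    // Loops because a single Read may return fewer bytes than requested.
    private static int ReadFully(Stream stream, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = stream.Read(buffer, offset + total, count - total);
            if (n == 0) break; // end of stream
            total += n;
        }
        return total;
    }
}
```

Once TryReadFrame returns true, the whole object is in payload and can be handed to the deserializer. A real server would probably also reject lengths above some sane maximum before allocating.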
If you use raw sockets, you need to know the length. Beyond that, the size of the buffer is not relevant, because even if your buffer is as large as the whole message, a single read may still not fill it. Check the Stream.Read method: it returns the number of bytes actually read, so you need to loop until all the data has been received.
Yeah, you won't deserialize until you've received all the bytes.
Related
I need a collection type for received bytes in my socket application (which deals with ~5k of concurrent connections).
I tried using a List<byte> but since it has one internal array and I receive lots of data, it can cause OutOfMemoryExceptions.
So I need a collection that,
Keeps the data in smaller blocks; like an Unrolled Linked List.
Provides fast lookup (Preferably an IList<T>) because I look for a delimiter that marks the end of the message after each receive operation.
What I use right now is Stream. I supply a MemoryStream for the operations that don't involve too much data and supply a FileStream of a temporary file for the operations that involve serious amounts of data.
MemoryStream is no different from a List<T>, though, and I prefer not to use files as buffers.
So...
What collection or approach do you recommend?
It appears that you are using an inappropriate architecture for a network application. You should buffer only the data that is required. Here you are using a list to buffer the data until the required amount has been received.
I would recommend checking for the delimiter in each chunk of data as it is received and, if it is there, pushing in only the data up to the delimiter. Once the message is complete, you should fetch it out of the list, use it, and dispose of the list. Adding everything to the list is not a good approach and will surely consume a lot of memory.
Ideally, you should have a protocol that always informs you of the length of the data before you actually receive it. That way you can be sure the required data has been received, and you do not have to rely on the delimiter.
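A rough sketch of the per-receive delimiter check described above, assuming a single-byte delimiter. DelimitedMessageReader and Append are hypothetical names. Only the newly received chunk is scanned, and only the bytes of the message currently in progress are kept.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public sealed class DelimitedMessageReader
{
    private readonly byte _delimiter;
    private MemoryStream _current = new MemoryStream();

    public DelimitedMessageReader(byte delimiter)
    {
        _delimiter = delimiter;
    }

    // Feed the bytes from one receive call.
    // Returns every message that this chunk completed (without the delimiter).
    public List<byte[]> Append(byte[] chunk, int count)
    {
        var completed = new List<byte[]>();
        int start = 0;
        while (start < count)
        {
            int hit = Array.IndexOf(chunk, _delimiter, start, count - start);
            if (hit < 0)
            {
                // No delimiter in the rest of the chunk: keep it for the next receive.
                _current.Write(chunk, start, count - start);
                break;
            }
            _current.Write(chunk, start, hit - start);
            completed.Add(_current.ToArray());
            _current = new MemoryStream(); // drop the finished message's buffer
            start = hit + 1;
        }
        return completed;
    }
}
```

Each connection would own one reader and call Append with the buffer and byte count from every completed receive.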
A possible quick and dirty solution:
At the start of the program, allocate a buffer large enough for the largest amount of data you will receive. Use a separate 'count' field to keep track of how much data is currently in use.
(I don't really like this solution; I'd use files or find some way of working with the data in blocks, but it might work for you).
I want to transfer data over sockets, and currently I am creating a MemoryStream.
I can also use a NetworkStream.
Can anyone please help me understand the difference between the C# NetworkStream and MemoryStream?
A NetworkStream is directly related to a socket; it does not know its own length, you cannot seek, and the read/write functions are directly bound to the receive/send APIs (and therefore read and write are entirely unrelated to each other). It can time out, and a read can take a considerable time if it is waiting for more data.
A MemoryStream is basically a wrapper over a local byte[]. It has a known length (which can change), you can seek, and read/write are directly related: both increment the same position cursor, and you can write something, rewind, and then read it. All operations are very timely.
It might be easier to ask "what are the similarities", which would be simply: both have a read/write API, by virtue of being subclasses of Stream.
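To make the MemoryStream half of that concrete, here is a small sketch of the write, rewind, read cycle. MemoryStreamDemo and WriteRewindRead are names made up for this example.

```csharp
using System;
using System.IO;
using System.Text;

public static class MemoryStreamDemo
{
    // Writes text into a MemoryStream, rewinds, and reads the same bytes back.
    public static string WriteRewindRead(string text)
    {
        using (var ms = new MemoryStream())
        {
            byte[] data = Encoding.UTF8.GetBytes(text);
            ms.Write(data, 0, data.Length);  // Position is now at the end of the data
            ms.Position = 0;                 // rewind; a NetworkStream cannot do this

            byte[] back = new byte[ms.Length];
            int n = ms.Read(back, 0, back.Length);
            return Encoding.UTF8.GetString(back, 0, n);
        }
    }
}
```

Without the rewind, the Read would return 0 bytes, because read and write share one position cursor.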
Both streams derive from Stream; these classes are wrappers for different purposes.
As I understand it, a NetworkStream reads from the network interface, whereas if you use a MemoryStream in the same scenario, all the data will be loaded into memory first (I assume it reads to the end of the actual stream), and the read operations will then read from memory.
By the time the first read operation occurs on the MemoryStream, all the data needs to have been loaded into memory.
With a NetworkStream, you can read the data as it arrives.
I have a class that is wrapping a stream, with the intention of that stream likely being a NetworkStream. However in unit testing it is much easier to use a MemoryStream to verify functionality. However I have noticed that MemoryStream and NetworkStream don't actually act the same on write.
When I write to a MemoryStream the seek pointer of the stream is set to the end of what I wrote. In order to read out what I wrote I have to adjust the Seek pointer back to the beginning of what I wrote.
When I write to a NetworkStream the other end of the stream can read that data without adjusting the seek pointer.
I assume that a NetworkStream is handling the concept of the seek pointer internally and making sure that even though it is filling data into the stream on the other side it is not adjusting the pointer.
My question then is, what is the best way to mock that same behavior in the MemoryStream? I thought I might just seek the pointer back by the same number of bytes I write, but that seems kludgy. Also, if I stack several of these streams together, or other wrapped streams, how will they know whether the pointer needs to be reset or not?
MemoryStream isn't designed for how you are using it. A MemoryStream is designed to allow you to fill up a block of memory with your output. It is not designed to have a second object reading from what the original object is writing.
I would suggest that you create a FakeNetworkStream class. You would have two objects, writing to one would add data to the other. But the advantage of a FakeNetworkStream is that you can implement pathological socket behavior to detect as many bugs as possible.
Don't send any data until the user flushes the stream. Subtle bugs can arise when a socket is not flushed, because whether or not the data is actually sent can vary. By always waiting until the flush, you can have your tests fail when a flush actually needs to happen.
Read isn't guaranteed to produce all of the data you request. Simulate this by only ever producing a single byte at a time.
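A minimal sketch of such a FakeNetworkStream, implementing both pathological behaviors listed above. The CreatePair factory and the field names are my own invention. Two simplifications to be aware of: it is not thread-safe, and Read returns 0 when nothing has arrived yet, where a real NetworkStream would block.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public sealed class FakeNetworkStream : Stream
{
    private readonly Queue<byte> _incoming = new Queue<byte>();
    private readonly List<byte> _pendingOutput = new List<byte>();
    private FakeNetworkStream _peer;

    private FakeNetworkStream() { }

    // Creates two connected ends: what one end writes and flushes, the other reads.
    public static (FakeNetworkStream Client, FakeNetworkStream Server) CreatePair()
    {
        var a = new FakeNetworkStream();
        var b = new FakeNetworkStream();
        a._peer = b;
        b._peer = a;
        return (a, b);
    }

    public override bool CanRead => true;
    public override bool CanWrite => true;
    public override bool CanSeek => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    // Pathology 1: written bytes are held back until Flush is called.
    public override void Write(byte[] buffer, int offset, int count)
    {
        for (int i = 0; i < count; i++) _pendingOutput.Add(buffer[offset + i]);
    }

    public override void Flush()
    {
        foreach (byte b in _pendingOutput) _peer._incoming.Enqueue(b);
        _pendingOutput.Clear();
    }

    // Pathology 2: never hand out more than one byte per call.
    public override int Read(byte[] buffer, int offset, int count)
    {
        if (count == 0 || _incoming.Count == 0) return 0;
        buffer[offset] = _incoming.Dequeue();
        return 1;
    }

    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}
```

Code under test that forgets to flush, or that assumes one Read fills the buffer, should fail quickly against this.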
This might help.
https://github.com/billpg/POP3Listener/commit/a73a325f21a955edd9f1023a94c586aba29cfadc#diff-80d1609ed97fdaa9538dea7547301577cc19b6f4466a1a0a3a0ac08ad3eea23e
This function returns two NetworkStream objects, a client and a server, by briefly opening a server, connecting to that server, and returning both ends.
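I have not reproduced the linked code here. The following is my own sketch of the approach as described, assuming a loopback TCP listener on an OS-assigned port; LoopbackStreams and CreatePair are hypothetical names.

```csharp
using System.Net;
using System.Net.Sockets;

public static class LoopbackStreams
{
    // Briefly opens a listener on the loopback interface, connects to it,
    // and returns both ends of the resulting connection as NetworkStreams.
    public static (NetworkStream Client, NetworkStream Server) CreatePair()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0); // port 0: let the OS choose
        listener.Start();
        try
        {
            int port = ((IPEndPoint)listener.LocalEndpoint).Port;

            var clientSocket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
            clientSocket.Connect(IPAddress.Loopback, port);
            Socket serverSocket = listener.AcceptSocket();

            // ownsSocket: true, so disposing a stream also closes its socket.
            return (new NetworkStream(clientSocket, true), new NetworkStream(serverSocket, true));
        }
        finally
        {
            listener.Stop();
        }
    }
}
```

Unlike a MemoryStream, these are real NetworkStream objects, so tests see the genuine non-seekable, two-pipe behavior.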
I receive the following exception:
System.NotSupportedException : This stream does not support seek operations.
at System.Net.Sockets.NetworkStream.Seek(Int64 offset, SeekOrigin origin)
at System.IO.BufferedStream.FlushRead()
at System.IO.BufferedStream.WriteByte(Byte value)
The following link shows that this is a known problem for Microsoft.
http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=273186
This stack trace shows two things:
System.IO.BufferedStream performs some absurd pointer move operation. A BufferedStream should buffer the underlying stream and nothing more. The quality of the buffering will be poor if there are such seek operations.
It will never work reliably with a stream that does not support Seek.
Are there any alternatives?
Do I need a buffer together with a NetworkStream in C#, or is it already buffered?
Edit: I simply want to reduce the number of read/write calls to the underlying socket stream.
The NetworkStream is already buffered. All data that is received is kept in a buffer waiting for you to read it. Calls to read will either be very fast or will block waiting for data to be received from the other peer on the network; a BufferedStream will not help in either case.
If you are concerned about the blocking then you can look at switching the underlying socket to non-blocking mode.
The solution is to use two independent BufferedStreams, one for receiving and one for sending. And don't forget to flush the sending BufferedStream appropriately.
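A small sketch of that arrangement, with a hypothetical wrapper class DuplexBuffered and an assumed default buffer size of 8192 bytes. The point is only that reads and writes never share a BufferedStream instance.

```csharp
using System.IO;

public sealed class DuplexBuffered
{
    // Only ever read from this one.
    public BufferedStream Input { get; }

    // Only ever write to this one, and Flush when a message is complete.
    public BufferedStream Output { get; }

    public DuplexBuffered(Stream duplex, int bufferSize = 8192)
    {
        // Two separate buffers over the same underlying stream, so a write
        // never has to discard (and seek back over) buffered read data.
        Input = new BufferedStream(duplex, bufferSize);
        Output = new BufferedStream(duplex, bufferSize);
    }
}
```

Note that disposing either BufferedStream also closes the underlying stream, so dispose them together at the end of the connection.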
Since even in 2018 it seems hard to get a satisfying answer to this question, for the sake of humanity, here are my two cents:
The NetworkStream is buffered on the OS side. However, that does not mean there are no reasons to buffer on the .net side. TCP behaves well on Write-Read (repeat), but stalls on Write-Write-Read due to delayed ack, etc, etc.
If you, like me, have a bunch of sub-par protocol code to take into the twenty-first century, you want to buffer.
Alternatively, if you stick to the above, you could also buffer only reads/rcvs or only writes/sends, and use the NetworkStream directly for the other side, depending on how broken what code is. You just have to be consistent!
What BufferedStream docs fail to make abundantly clear is that you should only switch reading and writing if your stream is seekable. This is because it buffers reads and writes in the same buffer. BufferedStream simply does not work well for NetworkStream.
As Marc pointed out, the cause of this lameness is the conflation of two streams into one NetworkStream which is not one of .net's greatest design decisions.
A BufferedStream simply acts to reduce the number of read/write calls to the underlying stream (which may be IO/hardware bound). It cannot provide seek capability (and indeed, buffering and seeking are in many ways contrary to each other).
Why do you need to seek? Perhaps copy the stream to something seekable first - a MemoryStream or a FileStream - then do your actual work from that second, seekable stream.
Do you have a specific purpose in mind? I may be able to suggest more appropriate options with more details...
In particular: note that NetworkStream is a curiosity - with most streams, read/write relate to the same physical stream; however, a NetworkStream actually represents two completely independent pipes; read and write are completely unrelated. Likewise, you can't seek in bytes that have already zipped past you... you can skip data, but that is better done by doing a few Read operations and discarding the data.
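Skipping by reading and discarding could look roughly like this. StreamSkipper, Skip and the 4096-byte scratch buffer are arbitrary choices for the sketch.

```csharp
using System;
using System.IO;

public static class StreamSkipper
{
    // Discards up to 'count' bytes from a forward-only stream.
    // Returns the number of bytes actually skipped (less if the stream ended).
    public static long Skip(Stream stream, long count)
    {
        byte[] scratch = new byte[4096];
        long skipped = 0;
        while (skipped < count)
        {
            int want = (int)Math.Min(scratch.Length, count - skipped);
            int n = stream.Read(scratch, 0, want);
            if (n == 0) break; // end of stream
            skipped += n;
        }
        return skipped;
    }
}
```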
I need a C# implementation of Java's PushbackInputStream. I have made my own very basic one, but I wondered if there was a well tested and decently performing version already available somewhere. As it happens I always push back the same bytes I read so really it just needs to be able to reposition backwards, buffering up to a number of bytes I specify. (like Java's BufferedInputStream with the mark and reset methods).
Update: I should add that I can't simply reposition the stream, as CanSeek may be false (e.g. when the input stream is a NetworkStream).
The problem with pushing data back into a stream is that any readers that sit on top of the stream may already have a local buffer of data. This makes this approach very brittle. Personally, I would try to avoid this scenario, and use data constructs where I either don't need to push back, or can use single-byte Peek etc.
You need to build a wrapper class that either functions as a stream, but supports a buffer of the last X bytes so you can seek back at least for a limited distance, or something that isn't a stream at all where you can indeed "push data back into the input stream".
Either way you're going to have to write something yourself.
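For what it's worth, a bare-bones version of such a wrapper might look like the sketch below. PushbackStream and Unread are hypothetical names loosely modeled on Java's class. Unlike the Java version, it puts no limit on how much can be pushed back, and it is read-only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public sealed class PushbackStream : Stream
{
    private readonly Stream _inner;
    private readonly Stack<byte> _pushed = new Stack<byte>();

    public PushbackStream(Stream inner)
    {
        if (inner == null) throw new ArgumentNullException(nameof(inner));
        _inner = inner;
    }

    // Pushes bytes back so that the next Read returns them first, in their original order.
    public void Unread(byte[] buffer, int offset, int count)
    {
        // Push the last byte first so the first byte ends up on top of the stack.
        for (int i = offset + count - 1; i >= offset; i--) _pushed.Push(buffer[i]);
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int n = 0;
        while (n < count && _pushed.Count > 0) buffer[offset + n++] = _pushed.Pop();

        // A short read is legal, and returning here avoids blocking on the
        // inner stream when we already have something to hand back.
        if (n > 0) return n;
        return _inner.Read(buffer, offset, count);
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

The brittleness mentioned earlier still applies: anything layered on top with its own buffer (a StreamReader, say) will not see the pushed-back bytes in the right place.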
Can't you just use a System.IO.Stream and seek backwards after reading from current position?
stream.Seek(-1, System.IO.SeekOrigin.Current)
Where -1 could be a variable of how far you want to go back?
So long as the stream indicates it supports seeking (CanSeek) then
stream.Seek(-offset, System.IO.SeekOrigin.Current)
Will be fine.