c# Serial Port Binary stream processing - c#

I have a serial device that has a binary output and I capture the data using the following.
    private void port_DataReceived(object sender, SerialDataReceivedEventArgs e)
    {
        int count = sp.BytesToRead;
        byte[] data = new byte[count];
        sp.Read(data, 0, data.Length);
        file.WriteLine(BitConverter.ToString(data));
    }
The data comes through and looks like this...
06-14-F2-A1-64-2D-62-00-1A-31-00-06-14-F3-84-62-59-01-00-1A-31-00-06-14-F3-85-56-52-55-31
1A-31-00-06-14-F4-18-04-2E-62-00-1A-31-00-06-14-F4-E3-27-5B-01-00-1A-31-00-06-14-F4-E4-1C-51-55-31
1A-31-00-06-14-F5-71-4C-59-71-20-1A-31-00-06-14-F5-8E-A5-2E-62-00-1A-31-00-06-14-F5-F4-47-56-55-31-1A-31-00-06-14-F6-10-1A-1A-31-52-24-1A-31-00-06-14-F6-3D-40-19-70-00-1A-31-00-06-14-F6-3E-9C-4C-55-31-1A-33-00-06-14-F6-F6-11-3D-A0-00-17-B0-C8-4E-42-70-AA-00-00-59-51-1E-1A-31-00-06-14-F7-05-4A-2E-62-00-1A-31-00-06-14-F7-83-5C-56-55-31-1A-31-00-06-14-F7-99-04-5A-01-00-1A-31-00-06-14-F7-99-F8-51-55-31-1A-31-00-06-14-F8-7B-EA-2E-62-00-1A-31-00-06-14-F9-00-CE-56-01-00-1A-31-00-06-14-F9-0E-DF-51-55-31-1A-31-00-06-14-F9-F2-8B-2B-62-00-1A-31-00-06-14-FA-15-1F-1D-05-30-1A-31-00-06-14-FA-62-4D-59-01-00-1A-31-00-06-14-FA-63-41-55-55-31-1A-31-00-06-14-FA-6F-6E-1D-67-67-1A-31-00-06-14-FA-EC-50-2E-72-00-1A-31-00-06-14-FB-22-96-38-62-00-1A-31-00-06-14-FB-3B-7A-40-20-43-1A-31-00-06-14-FB-69-2E-2B-62-00-1A-31-00-06-14-FC-62-F1-2D-72-00-1A-31-00-06-14-FC-DF-D1-2E-62-00-1A-31-00-06
The hex isn't the issue here, as I can decode that, but the statement I am looking for begins with 1A-31 and is then a set number of bytes long. As you can see, the serial stream in this case starts mid-flow, so the first statement is incomplete.
How can I look for this marker, discard the beginning, and then start processing? Bear in mind that this will happen multiple times, as the readBuffer will at some point truncate the stream and I will need to piece it back together again.

You're almost there. Your problem is that the data you're streaming arrives in chunks that do not line up with where statements begin and end. I'm going to assume that the end of a statement is marked by the 1A-31 that starts the next statement. If this isn't true, reinterpret this answer accordingly.
Now, you will not be able to do anything with the very first piece of data in your example, which contains half a statement. So, let's start by assuming that the first chunk of data you get does start with 1A-31.
There are now two options:
1. You can find the entire statement inside the chunk (i.e. you encounter another 1A-31 inside it). In this case, eat it up and do with it whatever you'd want to do with it (I'd add a StatementReceived event and send it there, or something like that). Repeat this exercise until the chunk has been entirely processed.
2. The statement is not entirely contained inside the chunk. Copy the data you already got to a temporary buffer and wait for the next port_DataReceived call.
If the second option was the case, the data in the next port_DataReceived call normally continues the buffered statement rather than starting a new one (the exception is a chunk boundary that falls exactly on a statement boundary, in which case the buffered statement is already complete). Scan the new chunk up to the next 1A-31, prepend the temporary buffer (stored in the previous port_DataReceived call) to the bytes before it, raise StatementReceived, and erase the temporary buffer.
With a similar approach, you can also deal with statements that require more than 2 chunks of data to be sent; each time you do not encounter a 1A-31, append the received data to the temporary buffer, until the statement is complete.
Finally, if the very first bytes that you read upon startup do not start with 1A-31, you'll just have to discard them. You can't do anything with half a statement.
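The buffering scheme described in this answer can be sketched roughly as follows. This is a minimal illustration, not the answerer's code: StatementFramer and Feed are hypothetical names, and it relies on the same assumption as the answer, namely that a statement ends where the next 1A-31 begins. If 1A-31 can also occur inside a statement's data, this split will be wrong and the fixed statement length should be used instead.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of SerialPort): accumulates raw chunks and
// returns every complete statement, where a statement runs from one 1A-31
// marker up to (but not including) the next one.
public sealed class StatementFramer
{
    private const byte Marker0 = 0x1A;
    private const byte Marker1 = 0x31;
    private readonly List<byte> pending = new List<byte>();

    public List<byte[]> Feed(byte[] chunk, int count)
    {
        var statements = new List<byte[]>();
        for (int i = 0; i < count; i++) pending.Add(chunk[i]);

        int start = IndexOfMarker(0);
        if (start < 0)
        {
            // No marker seen yet: discard everything, except a trailing 0x1A
            // that may be the first half of a marker split across two chunks.
            bool keepLast = pending.Count > 0 && pending[pending.Count - 1] == Marker0;
            pending.Clear();
            if (keepLast) pending.Add(Marker0);
            return statements;
        }

        // Drop the partial statement that precedes the first marker.
        pending.RemoveRange(0, start);

        // Emit statements for as long as a following marker closes one.
        int next;
        while ((next = IndexOfMarker(2)) >= 0)
        {
            statements.Add(pending.GetRange(0, next).ToArray());
            pending.RemoveRange(0, next);
        }
        return statements; // the unfinished tail stays in 'pending'
    }

    private int IndexOfMarker(int from)
    {
        for (int i = from; i + 1 < pending.Count; i++)
        {
            if (pending[i] == Marker0 && pending[i + 1] == Marker1) return i;
        }
        return -1;
    }
}
```

Inside port_DataReceived, one would pass the buffer and the value returned by sp.Read to Feed and raise StatementReceived for each returned array. Because the search always runs over the accumulated bytes, a marker split across two chunks is still found.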

Related

FileStream.Read() - bytes read

FileStream.Read() returns the number of bytes read, but is there any situation, other than having reached the end of the file, in which it will read fewer bytes than the number requested and not throw an exception?
The documentation says:
The Read method returns zero only after reaching the end of the stream. Otherwise, Read always reads at least one byte from the stream before returning. If no data is available from the stream upon a call to Read, the method will block until at least one byte of data can be returned. An implementation is free to return fewer bytes than requested even if the end of the stream has not been reached.
But this doesn't quite explain in what situations data would be unavailable and cause the method to block until it can read again. I mean, shouldn't most situations where data is unavailable force an exception?
What are real situations where comparing the number of bytes read against the number of expected bytes could differ (assuming that we're already checking for end of file when we mention number of bytes expected)?
EDIT: A bit more information: the reason I'm asking is that I've come across a bit of code where the developer did something like this:
    bytesExpected = (remainingBytesInFile > 94208 ? 94208 : remainingBytesInFile);
    while (bytesRead < bytesExpected)
    {
        bytesRead += fileStream.Read(buffer, bytesRead, bytesExpected - bytesRead);
    }
Now, I can't see any advantage to having this while loop at all. I'd expect Read to throw an exception if it can't read the number of bytes expected (bearing in mind that the code already takes into account that there are that many bytes left to read).
What reason could one possibly have for something like this? I'm sure I'm missing something.
The documentation is for Stream.Read, from which FileStream is derived. Since FileStream is a stream, it should obey the stream contract. Not all streams do, but unless you have a very good reason, you should stick to that.
In a typical file stream, you'll only get a return value smaller than count when you reach the end of file (and it's a pretty simple way of checking for the end of file).
However, in a NetworkStream, for example, you keep reading in a loop until the method returns zero - signalling the end of stream. The same works for file streams - you know you're at the end of the file when Read returns zero.
Most importantly, FileStream isn't just for what you'd consider files - it's also for pseudo-files like standard input/output pipes and COM ports, for example (try opening a file stream on PRN, for example). In that case, you're not reading a file with a fixed length, and the behaviour is the same as with NetworkStream.
Finally, don't forget that FileStream isn't sealed. It's perfectly fine for you to implement a virtualized file system, for example - and it's perfectly fine if your virtualized file system doesn't support seeking, or checking the length of file.
EDIT:
To address your edit, this is exactly how you're supposed to read any stream. Nothing wrong with it. If there's nothing else to read in a stream, the Read method will simply return 0, and you know the stream is over. The only thing is, it seems that he tries to fill his buffer to full, one buffer at a time - this only makes sense if you explicitly need to partition the file by 94208 bytes, and pass that byte[] for further processing somewhere.
If that's not the case, you don't really need to fill the full buffer - you just keep reading (and probably writing on some other side) until Read returns 0. And indeed, by default, FileStream will always fill the whole buffer unless it's built around a pipe handle - but since that's a possibility, you shouldn't rely on the "real file" behaviour, so as long as you need those byte[] for something non-stream (e.g. parsing messages), this is entirely fine. If you're only using the stream as an actual stream, and you're streaming the data somewhere else, it doesn't have a point, really - you only need one while to read the file.
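As a rough sketch of the "fill the buffer" pattern discussed here, the loop below keeps calling Read until the requested count is reached or Read returns 0. StreamHelpers and ReadFully are hypothetical names chosen for this example. Unlike the loop quoted in the question, it stops at end of stream instead of spinning forever if the file turns out to be shorter than expected.

```csharp
using System;
using System.IO;

public static class StreamHelpers
{
    // Reads up to 'count' bytes into 'buffer' starting at 'offset'.
    // Returns the number of bytes actually read, which is smaller than
    // 'count' only if the stream ended first.
    public static int ReadFully(Stream stream, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = stream.Read(buffer, offset + total, count - total);
            if (n == 0) break; // 0 means end of stream, not "try again"
            total += n;
        }
        return total;
    }
}
```

The caller compares the return value with the requested count to decide whether a short read is an error for its purposes.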
Your expectations would only apply to the case where the stream is reading data from a no-latency source. Other I/O sources can be slow, which is why the Read method will not always be able to return immediately. That doesn't mean that there is an error (so no exception), just that it has to wait for data to arrive.
Examples: network stream, file stream on slow disk, etc.
(UPDATE, HDD example) To give an example specific to files (since your case is FileStream, although Read is defined on Stream and so all implementations should fulfill the requirements): mechanical hard drives go to "sleep" when not active (especially on battery-powered devices, i.e. laptops). Spinning up can take a second or so. That is not an IOException, but your read would have to wait for a second before any data is read.
The simple answer is that on a FileStream it probably never happens.
However, keep in mind that the Read method is inherited from Stream, which serves as the base for many other streams such as NetworkStream. In that case you may not be able to read as many bytes as you requested simply because they haven't been received from the network yet.
So like the documentation says it all depends on the implementation of the specific type of stream - FileStream, NetworkStream, etc.

How to place a delimiter in a NetworkStream byte array?

I'm setting up a way to communicate between a server and a client. At the moment, the first byte of a stream contains an indicator of what is coming, and by looking up that request's class I can determine the length of the request:
    stream.Read(message, 0, 1);
    if (message[0] == <byte representation of a known class>)
    {
        stream.Read(message, 0, Class.RequestSize);
    }
I'm curious how to handle the case where the class size is not known, or where the data turns out to be corrupt after reading a known request.
I'm thinking that I can insert in some sort of delimiter into the stream, but since a byte can only be between 0-255, I'm not sure how to go about creating a unique delimiter. Do I want to place a pattern into the stream to represent the end of a message? How can I be sure that this pattern is unique enough to not be mistaken for actual data?
There are different approaches to this. One option would be to send the length of the class name, and possibly of the whole packet, first (e.g. always in the first byte). This way you can read just that byte and then n more bytes to get the class name.
By this approach you don't end up reading a lot of stuff a malicious client sends you with the intent to DoS your application and you can quickly determine if you read enough to handle the packet or if it's not yet complete.
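A minimal sketch of such a length-prefixed packet might look like the following. The layout (one type byte, then a 4-byte little-endian length, then the payload) and the names Framing, WriteFrame, ReadFrame and MaxPayload are assumptions made for illustration, not something specified in the question. The length check is what keeps a malicious or corrupt length from making the reader allocate or wait for an absurd amount of data.

```csharp
using System;
using System.IO;
using System.Text;

public static class Framing
{
    // Assumed upper bound for one payload; pick a value that fits the protocol.
    public const int MaxPayload = 64 * 1024;

    public static void WriteFrame(Stream stream, byte type, byte[] payload)
    {
        // leaveOpen: true, so disposing the writer flushes without closing the stream.
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, true))
        {
            writer.Write(type);           // 1 byte: message type
            writer.Write(payload.Length); // 4 bytes: payload length (little-endian)
            writer.Write(payload);        // payload bytes
        }
    }

    public static byte[] ReadFrame(Stream stream, out byte type)
    {
        using (var reader = new BinaryReader(stream, Encoding.UTF8, true))
        {
            type = reader.ReadByte();
            int length = reader.ReadInt32();
            if (length < 0 || length > MaxPayload)
                throw new InvalidDataException("Implausible payload length: " + length);

            // ReadBytes keeps reading until 'length' bytes arrive or the stream ends.
            byte[] payload = reader.ReadBytes(length);
            if (payload.Length != length)
                throw new EndOfStreamException("Stream ended inside a frame.");
            return payload;
        }
    }
}
```

Both ends have to agree on the header layout; if the stream ends before a header is complete, ReadByte and ReadInt32 throw EndOfStreamException.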
There are some low level bytes which are used especially as delimiters. Start of Text and End of Text have a (hex) value of 0x02 and 0x03 respectively. And you have Start of Heading coupled with End of Transmission, 0x01 and 0x04; you could use these.

how do you account for when TCP does not get all the bytes in one read

I just read an article that says TCPClient.Read() may not get all the sent bytes in one read. How do you account for this?
For example, the server can write a string to the tcp stream. The client reads half of the string's bytes, and then reads the other half in another read call.
how do you know when you need to combine the byte arrays received in both calls?
You need to decide this at the protocol level. There are four common models:
Close-on-finish: each side can only send a single "message" per connection. After sending the message, they close the sending side of the socket. The receiving side keeps reading until it reaches the end of the stream.
Length-prefixing: Before each message, include the number of bytes in the message. This could be in a fixed-length format (e.g. always 4 bytes) or some compressed format (e.g. 7 bits of size data per byte, top bit set for the final byte of size data). Then there's the message itself. The receiving code will read the size, then read that many bytes.
Chunking: Like length-prefixing, but in smaller chunks. Each chunk is length-prefixed, with a final chunk indicating "end of message"
End-of-message signal: Keep reading until you see the terminator for the message. This can be a pain if the message has to be able to include arbitrary data, as you'd need to include an escaping mechanism in order to represent the terminator data within the message.
Additionally, less commonly, there are protocols where each message is always a particular size - in which case you just need to keep going until you've read that much data.
In all of these cases, you basically need to loop, reading data into some sort of buffer until you've got enough of it, however you determine that. You should always use the return value of Read to note how many bytes you actually read, and always check whether it's 0, in which case you've reached the end of the stream.
Also note that this doesn't just affect network streams - for anything other than a local MemoryStream (which will always read as much data as you ask for in one go, if it's in the stream at all), you should assume that data may only become available over the course of multiple calls.
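To illustrate the end-of-message signal model from the list above, here is a small sketch that reads until a terminator byte is seen. DelimitedReader and ReadUntil are hypothetical names, and the sketch assumes the terminator never appears inside a message (so no escaping is implemented). It reads one byte at a time to avoid consuming bytes that belong to the next message; wrapping the underlying stream in a BufferedStream is one way to make that cheaper.

```csharp
using System;
using System.IO;

public static class DelimitedReader
{
    // Returns the bytes before the next terminator, or null if the stream
    // ended before a terminator was seen.
    public static byte[] ReadUntil(Stream stream, byte terminator)
    {
        var message = new MemoryStream();
        int b;
        while ((b = stream.ReadByte()) != -1) // -1 signals end of stream
        {
            if (b == terminator) return message.ToArray();
            message.WriteByte((byte)b);
        }
        return null;
    }
}
```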
You should call read() in a loop. The condition of that loop would check if there is still any data available to be read.
That is kind of hard to answer, because you can never know when data will arrive, and that's why I usually use a thread for receiving data in my chat program. But you should be able to use something similar to this:
    do
    {
        numberOfBytesRead = myNetworkStream.Read(myReadBuffer,
                                                 0,
                                                 myReadBuffer.Length);
        myCompleteMessage.AppendFormat("{0}",
            Encoding.ASCII.GetString(myReadBuffer, 0, numberOfBytesRead));
    }
    // DataAvailable only reports data that has already arrived,
    // so the loop can end before the sender has finished.
    while (myNetworkStream.DataAvailable);
Look at this source!

How to gather received buffers in socket programming (TCP/IP) in .net?

I am using the server-client model to communicate with a hardware board using socket programming.
I receive data from the board using the Read() method of the NetworkStream class, which fills a buffer of a specified maximum size and returns the number of valid bytes in it. I have set the maximum buffer size to a sufficiently large number.
The board sends a set of messages every 100 ms. Each message consists of a constant 2-byte header followed by a variable number of data bytes.
The problem is that I do not receive the messages one by one! Instead, a buffer may contain 2 or 3 messages, or one message may be split across two buffers.
Currently, I am using a DFA that gathers the content of messages using the constant header bytes (we do not know the length of the messages, only the header bytes), but the problem is that the data bytes may happen to contain the header bytes!
Is there an efficient way to gather the bytes of each message from the buffers using a specific stream or class? How can I overcome this problem?
You need to add an additional buffer component between your consumer DFA and the socket client.
Whenever data is available from the NetworkStream, the buffer component will read it and append it to its own private buffer, incrementing an "available bytes" counter. The buffer component needs to expose at least the following functionality to its users:
a BytesAvailable property -- this returns the value of the counter
a PeekBytes(int count) method -- this returns the first count bytes of the buffer, if that much is available at least, and does not modify the counter or the buffer
a ReadBytes(int count) method -- as above, but it decrements the counter by count and removes the bytes read from the buffer so that subsequent PeekBytes calls will never read them again
Keep in mind that you don't need to be able to service an arbitrarily high count parameter; it is enough if you can service a count as long as the longest message it would be possible to receive at all times.
Obviously the buffer component needs to keep some kind of data structure that allows "wrapping around" of some kind; you might want to look into a circular (ring) buffer implementation, or you can just use two fixed buffers of size N where N is the length of the longest message and switch from one to the other as they become full. You should be careful so that you stop pulling in data from the NetworkStream if your buffers become full and only continue pulling after the DFA has called ReadBytes to free up some buffer space.
Whenever your DFA needs to read data, it will first ask your buffer stage how much data it has accumulated and then proceed accordingly. It would look something like this:
    if BytesAvailable < 2
        return; // no header to read, cannot do anything
    // peek at the header -- do not remove it from the buffer!
    header = PeekBytes(2);
    // calculate the length X of the message data that follows the header
    // if this is not possible from just the header, you might want to do this
    // iteratively, or you might want to change the header so that it is possible
    length = X;
    if BytesAvailable < 2 + X
        return; // no full message to read, cannot continue
    header = ReadBytes(2); // to remove it from the buffer
    message = ReadBytes(X); // remove the message data as well
This way your DFA will only ever deal with whole messages.
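A bare-bones version of the buffer component described in this answer could look like the sketch below. It uses a growable List<byte> instead of the ring buffer the answer suggests, purely to keep the example short, so it does not limit how much data is accumulated. The class name ReceiveBuffer and the Append method are assumptions; BytesAvailable, PeekBytes and ReadBytes follow the names used in the answer.

```csharp
using System;
using System.Collections.Generic;

public sealed class ReceiveBuffer
{
    private readonly List<byte> data = new List<byte>();

    // Number of bytes received but not yet consumed.
    public int BytesAvailable
    {
        get { return data.Count; }
    }

    // Called by the socket reader with the buffer and the count returned by Read.
    public void Append(byte[] source, int count)
    {
        for (int i = 0; i < count; i++) data.Add(source[i]);
    }

    // Returns the first 'count' bytes without consuming them.
    public byte[] PeekBytes(int count)
    {
        if (count < 0 || count > data.Count)
            throw new InvalidOperationException("Not enough buffered data.");
        return data.GetRange(0, count).ToArray();
    }

    // Returns the first 'count' bytes and removes them from the buffer.
    public byte[] ReadBytes(int count)
    {
        byte[] result = PeekBytes(count);
        data.RemoveRange(0, count);
        return result;
    }
}
```

The consumer checks BytesAvailable before every PeekBytes or ReadBytes call, so a partially received message simply stays in the buffer until the rest arrives.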

SslStream equivalent of TcpClient.Available?

Based on the advice of Len Holgate in this question, I'm asynchronously requesting 0-byte reads and, in the callback, accepting the available bytes with synchronous reads, since I know the data is available and the reads won't block. This seems so efficient and wonderful.
But then I add the option for SslStream, and the approach falls apart. The zero-byte read is fine, but the SslStream decrypts the bytes, leaving a zero byte-count in the TcpClient's buffer (appropriately so), and I cannot determine how many bytes are now in the SslStream available for reading.
Is there a simple trick around this?
Some code, just for context:
    sslStream.BeginRead(this.zeroByteBuffer, 0, 0, DataAvailable, this);
And after the EndRead() (which correctly returns 0), DataAvailable contains:
    // by now this is 0, because sslStream has already consumed the bytes
    available = myTcpClient.Available;
    if (0 < available) // Never occurs
    {
        // this part can be distractingly complicated, but
        // it's based on the available byte count
        sslStream.Read(...);
    }
And due to the protocol, I need to evaluate byte-by-byte and decode variable byte-width unicode and stuff. I don't want to have to read byte-by-byte asynchronously!
If I understood correctly, your messages are delimited by a certain character, and you are already using a StringBuilder to cover the case when a message is fragmented into multiple pieces.
You could consider ignoring the delimiter when reading data: append any data to the StringBuilder as it becomes available, and then inspect the StringBuilder for the delimiter character. When it is found, you can extract a single message using sb.ToString(0, delimiterIndex) and then call sb.Remove(0, delimiterIndex + 1) to drop the message together with its delimiter, repeating until no delimiters remain.
This would also cover the case when two messages are received simultaneously.
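The extraction step might be sketched like this, assuming a single-character delimiter and text that has already been decoded into the StringBuilder. MessageSplitter and ExtractMessages are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class MessageSplitter
{
    // Removes every complete message from 'sb' and returns them in order.
    // Text after the last delimiter stays in 'sb' until more data arrives.
    public static List<string> ExtractMessages(StringBuilder sb, char delimiter)
    {
        var messages = new List<string>();
        int index;
        while ((index = IndexOf(sb, delimiter)) >= 0)
        {
            messages.Add(sb.ToString(0, index));
            sb.Remove(0, index + 1); // also drop the delimiter itself
        }
        return messages;
    }

    // StringBuilder has no IndexOf of its own, so scan it character by character.
    private static int IndexOf(StringBuilder sb, char value)
    {
        for (int i = 0; i < sb.Length; i++)
        {
            if (sb[i] == value) return i;
        }
        return -1;
    }
}
```

Calling this after every append means the reads no longer need to know how many bytes are available; they just hand over whatever arrived.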
