Programming practices when receiving and manipulating received TCP/HTTP data? - c#

Once data has been returned over TCP or HTTP, should it be received and manipulated as a byte array, or is it acceptable practice to receive it as a string? I've been looking for professional projects on GitHub to answer this, but have had no luck. Microsoft's HttpClient examples on MSDN usually call GetByteArrayAsync(website) rather than GetStringAsync(website). Is there a reason to prefer GetByteArrayAsync over GetStringAsync, which would make data manipulation much easier right off the bat? Are there any advantages to using GetByteArrayAsync first?

What moves "through the wire" are bytes, not strings.
They might be text, but they could equally be a picture or a zip file.
At the TCP/HTTP level this is unknown, and it does not matter.
That decision belongs to a higher level.
HTTP carries a bit more information than TCP, so you might have a MIME type to help you decide what those bytes are.
Even if you know it is some kind of text, you still need to know the character set. You might get that from the HTTP header, from the document itself, or from a standard that fixes the encoding.
Only then can you convert the bytes to a string.
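
To illustrate the last point, here is a minimal sketch of decoding a received body only after the character set is known. The class name BodyDecoder and the choice of UTF-8 as the fallback when no charset is declared are assumptions made for this example, not something the answer prescribes.

```csharp
using System;
using System.Net.Http.Headers;
using System.Text;

public static class BodyDecoder
{
    // Decodes raw body bytes using the charset named in a Content-Type header value.
    // Assumption: fall back to UTF-8 when the header is missing or names no charset.
    // Encoding.GetEncoding throws ArgumentException for names the runtime does not know.
    public static string Decode(byte[] body, string contentTypeHeader)
    {
        Encoding encoding = Encoding.UTF8;

        if (MediaTypeHeaderValue.TryParse(contentTypeHeader, out var parsed)
            && !string.IsNullOrEmpty(parsed.CharSet))
        {
            // Some servers quote the charset value, so strip the quotes first.
            encoding = Encoding.GetEncoding(parsed.CharSet.Trim('"'));
        }

        return encoding.GetString(body);
    }
}
```

The same four bytes can decode to different text under different encodings, which is why receiving bytes first and converting afterwards is the safer order.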

Related

How can I evaluate a stream (string)

First of all, what I do at the moment:
I sniff an asynchronous serial bus that uses a 9-bit protocol and send the data to the PC. On the PC side I receive the data as an endless string that looks like this: .12_80E886.02_80E894.13. The PC software is written in C# with WinForms. My problem is that the stream has no clear start, as you can see in the example, because I start sniffing somewhere in the middle of the protocol.
What I want to do:
I think I can use startindex = IndexOf("_") and treat that position as the new start. I have to evaluate the characters in the stream, which is built like this: _(timestamp in milliseconds).(addressbyte databyte). The only thing I want to display in my RichTextBox is the data byte. I also need a way to manage the timestamps, because the GUI has a function that shows the time between two or more data bytes; for that I am thinking of using an SQL database. I need the address byte so that I can show bytes with a one as address in a special color.
Question:
How can I evaluate the stream so that I get the timestamp, address byte and data byte alternately as single substrings?
I want them this way because I think I can then write a simple if / else if / else block to do everything I need.
If someone has a better suggestion for my project, please write it as a comment.
Best regards, sniffi
I think you're trying to solve two problems at the same time. It would be better to separate them and solve them individually.
First, there is the issue of transporting the data, that is, sending and receiving the data (bits). You are using streams for this, which is a valid solution.
Second, there is the problem of transforming these bits (after receiving them) into actual objects (dates, strings, etc.). For that you can use a simple parser or tokenizer, a local script that extracts the correct parts from the data and converts them, or a serialization framework (like DataContracts).
If you have simple data, I would opt for using a single method that can parse the data. For more complex scenarios I would look into serialization.
Also beware that you need to validate your inputs, since you cannot assume that the software sending you the bits is always trusted (not compromised).
I think a string is a bad choice. The data is probably sent as bytes, so sniff bytes rather than a string. You also need a description of the protocol to understand the data.
You need to read the bytes from the bus and interpret them.
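
As a rough illustration of the "single method that can parse the data" suggested above, the sketch below splits the string form from the question into records. It assumes that each record is _(timestamp).(payload), that the first payload character is the address flag and the rest is the data; the question does not spell this out, so treat the field boundaries and the names Sample and SniffParser as hypothetical.

```csharp
using System;
using System.Collections.Generic;

public sealed class Sample
{
    public Sample(string timestamp, char addressFlag, string data)
    {
        Timestamp = timestamp;
        AddressFlag = addressFlag;
        Data = data;
    }

    public string Timestamp { get; }   // text between '_' and '.'
    public char AddressFlag { get; }   // assumed: first character after '.'
    public string Data { get; }        // assumed: remaining characters after the flag
}

public static class SniffParser
{
    public static List<Sample> Parse(string stream)
    {
        var samples = new List<Sample>();

        // Discard everything before the first '_' because sniffing may start mid-record.
        int start = stream.IndexOf('_');
        if (start < 0) return samples;

        foreach (string record in stream.Substring(start + 1).Split('_'))
        {
            int dot = record.IndexOf('.');
            // Validate the input: skip records without a timestamp or with a payload shorter than 2 characters.
            if (dot <= 0 || record.Length - dot - 1 < 2) continue;

            string payload = record.Substring(dot + 1);
            samples.Add(new Sample(record.Substring(0, dot), payload[0], payload.Substring(1)));
        }

        return samples;
    }
}
```

With a live stream the last record may still be incomplete when Parse is called, so the caller would need to keep the unparsed tail and prepend it to the next chunk.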

HTTP POST - Can it contain complex objects directly?

From all I've read it seems that it's always of the form string=string&string=string... (all the strings being encoded to exclude & and =) however, searching for it (e.g. Wikipedia, SO, ...) I haven't found that mentioned as an explicit restriction.
(Of course a base64 string of a binary of complex objects can be sent. That's not the question.) But:
Can POST contain complex objects directly or is it all sent as a string?
There is nothing in HTTP that prevents the posting of binary data. You do not have to convert binary data to base64 or other text encodings. Though the common "key1=val1&key2=val2" usage is very widely conventional and convenient, it is not required. It only depends upon what the sender and receiver agree upon. See these threads or google "http post binary data" or the like.
Sending binary data over http
How to correctly send binary data over HTTPS POST?
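
As a sketch of what posting raw bytes can look like from .NET, the helper below builds a POST request whose body is the unmodified byte array. The class name BinaryPost and the choice of application/octet-stream are assumptions for this example; sender and receiver can agree on any Content-Type.

```csharp
using System.Net.Http;
using System.Net.Http.Headers;

public static class BinaryPost
{
    // Builds a POST request that carries the payload bytes as-is (no base64, no form encoding).
    public static HttpRequestMessage Build(string url, byte[] payload)
    {
        var content = new ByteArrayContent(payload);
        // Assumption: a generic binary media type; use a more specific one if both sides agree on it.
        content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

        return new HttpRequestMessage(HttpMethod.Post, url) { Content = content };
    }
}
```

The request would then be sent with an HttpClient instance, for example await client.SendAsync(BinaryPost.Build(url, bytes)), where url is whatever endpoint the receiver exposes.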
It is just a string, just like any binary stream. There are various ways to encode complex objects to fit into a string though. base64 is an option, and so is JSON (the latter probably being more desirable).
PHP has a specific way to deal with this. This:
a[]=1&a[]=2
Will result in an array with 1, 2.
This:
a[foo]=bar&a[gir]=zim
Also creates an array, this time with 2 keys.
I've also seen this format in some frameworks:
a.foo=bar&b.gir=zim
So while urlencoding does not have a specific, standard syntax to do this, that does not mean you can't add meaning and do your own post-processing.
If you're building an API, you are probably best off not using urlencoding at all. There are much more capable and better formats. You can use whatever Content-Type you'd like.
HTTP itself is just based on strings. There's no notion of "objects", only text. The definition of "object" is dependent on whatever data format you transport over HTTP (XML, JSON, binary files, ...).
So, POST can contain "complex objects" if they are appropriately encoded into text.

Silverlight Binary Serialization over the wire

While nearly completing a new release, we've ignored the large size of the XML data that our WCF service returns to our silverlight client. Now we're investigating how to shrink the data, so that the results aren't in the 10-100mb range.
It seems clear that binary serialization is the solution, and it seems easy enough to serialize the data into binary with, for instance, SharpSerializer. But in all of the SO posts about binary serialization and the other tutorials I've come across, no one addresses how to send the serialized data across the wire to the client. I expect I'm missing some obvious but critical piece of the WCF service puzzle.
Hopefully someone can lend me some help. Let me know if I should include more information.
First, try the built-in binary encoding (<binaryMessageEncoding> in config, see http://www.mostlydevelopers.com/blog/post/2009/10/14/Silverlight-3-WCF-Binary-Message-Encoding.aspx and http://www.silverlight.net/learn/data-networking/network-services-(soap,-rest-and-more)/how-do-i-use-binary-encoding-for-wcf-with-silverlight-3 ).
Your data will probably shrink, but please note that the built-in binary encoding was designed to be as fast as possible, not as small as possible.
If that's not enough and you want to use a 3rd-party component to do the serialization to binary data, you can indeed return this data as a byte[] (but you will also need to use <binaryMessageEncoding> above to prevent WCF from base64-encoding the data to make it valid XML). You can also use Stream instead of byte[]; this won't give you true streaming behavior on the Silverlight client side but can give you true streaming on the server side.

Is it smart to output data from embedded device in xml format?

Our company makes many embedded devices that communicate with PCs via applications that I write in C#.NET. I have been considering different ways of improving the data transfer so that the PC application can be more easily synchronized with the device's current state (which in some cases is continually changing).
I have been thinking about an approach where the device formats its description and state messages into an XML message before sending them across the serial port, USB, Ethernet socket, etc. I was thinking that it may make the process of getting all of this data into my C# classes simpler.
The alternative is an approach where the host application sends a command like GETSTATUS and the device responds with an array of bytes, each representing a different property, sensor reading, etc.
I don't have a great deal of experience with XML, but from what I have seen that can be done with LINQ to XML, it seems like it might be a good idea. What do you guys think? Is this something that is done commonly? Is it a horrible idea?!?
First, whichever way you go, make sure the returned data has a version number embedded so that you can revise the data structure.
Is both an option? Seriously, there are always situations where sending data in a more readable form is preferable, and others where a denser representation is best (these are fewer than most people think, but I don't want to start a religious war about it). People will passionately argue for both, because they are optimizing for different things. Providing both options would satisfy both camps.
A nice, clear XML status could definitely lower the bar for people who are starting to work with your devices. You could also build a C# object that can be deserialized from the binary data that is returned.
It isn't a terrible idea, but it is probably overdesign. I would prefer a format that the embedded device can generate more easily and quickly. Then on the PC side I would insert a layer to convert it to a convenient format. Why not send the data in binary form or in a simple ASCII protocol and then convert it to C# objects? LINQ also works with objects, so you can still use LINQ to access the data. In my opinion, XML introduces unnecessary complexity in this case.
There are tradeoffs either way, so the right choice depends on your application, how powerful your device is and who is going to be using this protocol.
You mention that the alternative is a binary-serialized, request-response approach. I think that there are two separate dimensions here: the serialization format (binary or XML) and the communication style. You can use whatever serialization format you want in either a push protocol or in a request-response protocol.
XML might be a good choice if:
readability is important;
there is variation between devices, i.e. different devices have different properties, since XML tends to be self-describing;
or you want to publish your device's data to the Internet.
Of course, XML is verbose and there are certainly ways to accomplish all of the above with a binary protocol (e.g. tagged values can be used to make your binary protocol more descriptive).
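
To make the "tagged values" idea concrete, here is a minimal sketch of a tag-length-value reader. The layout (one tag byte, one length byte, then the value bytes, repeated) and the name TaggedValues are assumptions for illustration; the answer does not specify a wire format.

```csharp
using System;
using System.Collections.Generic;

public static class TaggedValues
{
    // Assumed layout: [tag: 1 byte][length: 1 byte][value: length bytes], repeated until the end.
    // Unknown tags are kept, so a newer device can add fields without breaking an older reader.
    public static Dictionary<byte, byte[]> Parse(byte[] message)
    {
        var fields = new Dictionary<byte, byte[]>();
        int i = 0;

        while (i < message.Length)
        {
            if (i + 2 > message.Length)
                throw new FormatException("Truncated field header.");

            byte tag = message[i];
            int length = message[i + 1];
            i += 2;

            if (i + length > message.Length)
                throw new FormatException("Truncated field value.");

            var value = new byte[length];
            Array.Copy(message, i, value, 0, length);
            fields[tag] = value;   // a repeated tag overwrites the earlier value
            i += length;
        }

        return fields;
    }
}
```

A device with fewer sensors simply omits the tags it does not support, which gives some of the self-describing flexibility of XML at a fraction of the size.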
One of the founders of this very site has some sane and amusing opinions on XML in XML: The Angle Bracket Tax
I did something very similar in a previous design with PC to microprocessor communications using an XML format. It worked very well on the PC side since Adobe Flex (which we were using) could interpret XML very easily, and I suspect .NET can do the same thing very easily.
The more complicated part of it was on the microprocessor side. The XML parsing had to be done manually, which was not really that complicated, but just time intensive. Creating the XML string can also be quite a lot of code depending on what you're doing.
Overall - If I had to do it again, I still think XML was a good choice because it is a very flexible protocol. RAM was not that much of an issue with regards to storing a few packets in our FIFO buffer on the microprocessor side but that may be something to consider in your application.
It's a waste of precious embedded CPU time to generate and transmit XML files. Instead, I would just use an array of bytes to represent the data, and use structs to help interpret it. The struct feature of C# lets you easily interpret an array of bytes as meaningful data. Here's an example:
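using System.Runtime.InteropServices;

// Pack = 1 removes padding, so the fields sit at exactly the byte offsets noted below.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct DeviceStatus
{
    public ushort position;     // Bytes 0 and 1
    public byte counter;        // Byte 2
    public Fruit currentFruit;  // Byte 3
}

// Public, because it is used as a field type of the public struct above.
public enum Fruit : byte
{
    Off = 0,
    Apple = 1,
    Orange = 2,
    Banana = 3,
}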
Then you would have a function that converts your array of bytes to this struct:
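public unsafe DeviceStatus getStatus()
{
    byte[] dataFromDevice = fetchStatusFromDevice();
    // Reject short buffers so the cast below never reads past the end of the array.
    if (dataFromDevice.Length < sizeof(DeviceStatus))
        throw new System.InvalidOperationException("Status message is too short.");
    fixed (byte* pointer = dataFromDevice)
    {
        // Reinterpret the leading bytes as a DeviceStatus, using the PC's byte order.
        return *(DeviceStatus*)pointer;
    }
}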
Compared to XML, this method will save CPU time on the device and on the PC, and it is easier to maintain than an XML schema plus the complementary functions for building and parsing the XML file. All you have to do is make sure that the struct and enum definitions in your embedded device are the same as the definitions in your C# code, so that the C# program and device agree on the protocol to use.
You'll probably want to use the "packed" attribute on both the C# and embedded side so that all the struct elements are positioned in a predictable way.
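
If unsafe code is not an option, the same 4-byte message can be read field by field. The sketch below is an alternative to the pointer cast above, not the answerer's method: it assumes the device sends the 16-bit position in little-endian order, and the names ParsedStatus, FruitCode and StatusParser are hypothetical.

```csharp
using System;
using System.Buffers.Binary;

public enum FruitCode : byte { Off = 0, Apple = 1, Orange = 2, Banana = 3 }

public readonly struct ParsedStatus
{
    public ParsedStatus(ushort position, byte counter, FruitCode fruit)
    {
        Position = position;
        Counter = counter;
        Fruit = fruit;
    }

    public ushort Position { get; }
    public byte Counter { get; }
    public FruitCode Fruit { get; }
}

public static class StatusParser
{
    public const int MessageLength = 4;

    public static ParsedStatus Parse(ReadOnlySpan<byte> data)
    {
        if (data.Length < MessageLength)
            throw new ArgumentException("Status message is too short.", nameof(data));

        // Assumption: bytes 0 and 1 hold the position in little-endian order.
        ushort position = BinaryPrimitives.ReadUInt16LittleEndian(data);
        byte counter = data[2];

        // Reject values that do not map to a known enum member.
        if (!Enum.IsDefined(typeof(FruitCode), data[3]))
            throw new ArgumentException("Unknown fruit code.", nameof(data));

        return new ParsedStatus(position, counter, (FruitCode)data[3]);
    }
}
```

Reading each field explicitly fixes the byte order in the code itself, so the result does not depend on the endianness of the machine running the PC application.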

Protocol Buffers c# (protobuf-net) Message::ByteSize

I am looking for the protobuf-net equivalent to the C++ API Message::ByteSize to find out the serialized message length in bytes.
I haven't played with the C++ API, so you'll have to give me a bit more context / information. What does this method do? Perhaps a sample usage?
If you are consuming data from a stream, there are "WithLengthPrefix" versions to automate limiting to discrete messages, or I believe the method to just read the next length from the stream is on the public API.
If you want to get a length in place of serializing, then currently I suspect the easiest option might be to serialize to a dummy stream and track the length. Oddly enough, an early version of protobuf-net did have "get the length without doing the work" methods, but after discussion on the protobuf-net I removed these. The data serialized is still tracked, obviously. However, because the API is different, the binary data length for objects is not available "for free".
If you clarify what the use-case is, I'm sure we can make it easily available (if it isn't already).
Re the comment: that is what I suspected. Because protobuf-net defers the binary translation to the last moment (because it is dealing with regular .NET types, not some self-generated code), there is no automatic way of getting this value without doing the work. I could add a mechanism to let you get this value by writing to Stream.Null, but if you need the data anyway you might benefit from just writing to a MemoryStream and checking the .Length before copying the data.
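
The "dummy stream that tracks the length" idea can be sketched with plain .NET types. CountingStream below is a hypothetical helper, not part of protobuf-net: it discards everything written to it and only counts the bytes.

```csharp
using System;
using System.IO;

// A write-only stream that throws the data away and remembers how many bytes were written.
public sealed class CountingStream : Stream
{
    public long BytesWritten { get; private set; }

    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => BytesWritten;

    public override long Position
    {
        get => BytesWritten;
        set => throw new NotSupportedException();
    }

    public override void Flush() { }

    // Only the count is kept; the buffer contents are ignored.
    public override void Write(byte[] buffer, int offset, int count) => BytesWritten += count;

    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}
```

Any serializer that writes to a Stream can be pointed at an instance of this class, after which BytesWritten holds the serialized size. This still does the full serialization work, so if the bytes are needed afterwards, writing to a MemoryStream once is likely cheaper than serializing twice.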
