I am working on an existing project built with the DotNetty library, which is mostly a networking framework. I am not very familiar with this framework, but as a quick fix I want to convert an IByteBuffer value to a string.
IByteBuffer represents a stream of binary data of various data types, not a string.
If you want a dump of all the bytes in the buffer, you can use ByteBufferUtil.HexDump. This gives you a string of the individual bytes in hexadecimal. This is useful for troubleshooting, if the buffer doesn't contain quite the data you are expecting - you can go tracing the data byte by byte, and find where it goes wrong.
If you want to interpret the bytes differently, you really need to know the types in the buffer. There's no generic method, because the buffer isn't self-descriptive (unlike e.g. XML). If you're trying to get a quick look at the string data in the buffer, and the data happens to be encoded in ASCII, you can try something like this:
Encoding.ASCII.GetString(byteBuffer.Array, byteBuffer.ArrayOffset + byteBuffer.ReaderIndex, byteBuffer.ReadableBytes)
Needless to say, unless the whole buffer contains an ASCII string, this will produce lots of garbage. Whether this is useful or not depends entirely on the data you're working with; if the buffer has something like an HTTP request, you'll probably see the data just fine. This should only be used for debugging purposes. For any production use, you should really know the layout of the buffer explicitly, rather than guessing at it.
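For a quick look along those lines, here is a minimal sketch. It uses a plain byte[] as a stand-in for the buffer's backing array, so the array contents, offset and count are made up for illustration; with a real heap buffer they would come from properties such as ArrayOffset, ReaderIndex and ReadableBytes (worth checking against your DotNetty version). It shows the hex view and the ASCII view of only the readable window, since a pooled backing array may hold unrelated bytes before and after it.

```csharp
using System;
using System.Text;

// Hex view of a window of the array, two digits per byte, for byte-level tracing.
static string HexWindow(byte[] data, int offset, int count) =>
    BitConverter.ToString(data, offset, count).Replace("-", " ");

// ASCII view of the same window; bytes outside the ASCII range decode as '?'.
static string AsciiWindow(byte[] data, int offset, int count) =>
    Encoding.ASCII.GetString(data, offset, count);

// Stand-in for a backing array: only the middle part is real payload,
// the "xxxx"/"yyyy" padding plays the role of leftover bytes.
byte[] backing = Encoding.ASCII.GetBytes("xxxxGET / HTTP/1.1\r\nyyyy");
int offset = 4;                   // hypothetical: ArrayOffset + ReaderIndex
int count = backing.Length - 8;   // hypothetical: ReadableBytes

Console.WriteLine(HexWindow(backing, offset, count));
Console.WriteLine(AsciiWindow(backing, offset, count));
```

Decoding the whole array instead of the window would also print the padding, which is exactly the kind of garbage to expect from a guess like this.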
Related
So I've delved into serializing data using BinaryFormatter, which I am impressed with. But the problem is compatibility. I want my serialized data to be portable, and therefore accessible from different platforms. XML serialization may seem like the answer, but the files produced are too large and there is no need for human readability.
So I thought about creating my own encoding/serialization system so that I can write a long[] array and a string[]/List<string> containing hexadecimal values to a file.
I thought about converting all of the arrays into byte[], but I'm not sure whether I should be concerned about character text encoding. I only intend to serialize/encode arrays containing hexadecimal and long values.
byte[] Bytes = HexArray.Select(s => Convert.ToByte(s, 16)).ToArray();
After converting all of the arrays to byte[], I could write them to a file, while noting the byte offsets of the individual arrays so that they can be recovered.
Any ideas on a better way to do this? I really don't want to resort to XML. I wish BinaryFormatter was portable. This has to be cross-platform, so it can't be affected by endianness.
You might want to take a look at Protocol Buffers (protobuf):
a language-neutral, platform-neutral, extensible way of serializing structured data for use in communications protocols, data storage, and more.
A couple of popular C# libraries are:
protobuf (Google) and
protobuf-net
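If pulling in a library is more than you need, a hand-rolled format for the long[] case can be quite small. The sketch below is one assumed layout, not a standard: a 4-byte count followed by 8 bytes per value, always written little-endian through BinaryPrimitives so the result does not depend on the machine's endianness. WriteLongs and ReadLongs are names made up for this example. Bytes parsed from the hex strings need no such treatment, since a single byte has no byte order.

```csharp
using System;
using System.Buffers.Binary;

// Layout (an assumption for this sketch): int32 count, then count * int64,
// all little-endian regardless of the platform the code runs on.
static byte[] WriteLongs(long[] values)
{
    var buffer = new byte[4 + 8 * values.Length];
    BinaryPrimitives.WriteInt32LittleEndian(buffer.AsSpan(0, 4), values.Length);
    for (int i = 0; i < values.Length; i++)
        BinaryPrimitives.WriteInt64LittleEndian(buffer.AsSpan(4 + 8 * i, 8), values[i]);
    return buffer;
}

// Reverses WriteLongs: reads the count, then each 8-byte value.
static long[] ReadLongs(byte[] buffer)
{
    int count = BinaryPrimitives.ReadInt32LittleEndian(buffer.AsSpan(0, 4));
    var values = new long[count];
    for (int i = 0; i < count; i++)
        values[i] = BinaryPrimitives.ReadInt64LittleEndian(buffer.AsSpan(4 + 8 * i, 8));
    return values;
}

long[] original = { 1, -2, long.MaxValue };
byte[] encoded = WriteLongs(original);
long[] decoded = ReadLongs(encoded);
Console.WriteLine(string.Join(", ", decoded));
```

The count prefix replaces the separate table of byte offsets: each array can be read back in sequence without knowing its position in advance. What this does not give you is versioning or schema evolution, which is where something like protobuf earns its keep.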
First of all, what I do at the moment:
I sniff an asynchronous serial bus that uses a 9-bit protocol and send the data to the PC. On the PC side I receive the data as an endless string that looks like this: .12_80E886.02_80E894.13. The PC software is written in C# with WinForms. My problem is that the stream has no clear starting point, as you can see in the example, because the sniffing starts somewhere in the middle of the protocol.
What I want to do:
I think I can use startindex = IndexOf("_") and treat that position as the new start. Then I have to evaluate the characters in the stream, which is built like this: _(timestamp in milliseconds).(addressbyte databyte). The only thing I want to display in my RichTextBox is the data byte, but I also need some way to manage the timestamps, because the GUI has a function that shows the time between two or more data bytes; for that I am thinking of using an SQL database. I need the address byte so that bytes flagged with a one as an address can be shown in a special color.
Question:
How can I split the stream so that I get the timestamp, address byte and data byte in turn, each as a separate substring?
The reason I want them this way is that I think a simple if / else if / else block would then be enough to do everything I want.
If someone has a better suggestion for my project, please write it as a comment.
With friendly wishes sniffi
I think you're trying to solve two problems at the same time. It would be better to separate them and solve them individually.
There is the issue of transporting the data, for this you are using streams. That is a valid solution. There is sending and receiving the data (bits) over the stream.
You have the problem of transforming these bits (after receiving them) into actual objects (dates, strings, etc.). For that you can use a simple parser or tokenizer, a local script that extracts the correct parts from the data and converts them, or a serialization framework (like DataContracts).
If you have simple data, I would opt for using a single method that can parse the data. For more complex scenarios I would look into serialization.
Also beware that you will need to validate your inputs, since you cannot assume that the bits are always sent to you by a trusted (non-compromised) piece of software.
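As a rough illustration of the "single method that can parse the data" option, here is a sketch built on a regular expression. The record layout is a guess from the sample in the question: an underscore, a hexadecimal timestamp, a dot, one character for the address flag (0 or 1), then the data in hex. ParseRecords is a made-up name, and the pattern would need adjusting if the real format differs.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

// Assumed record layout: '_' + hex timestamp + '.' + address flag (0/1) + hex data.
// Anything before the first '_' is a partial record and is skipped automatically.
static List<(long Timestamp, bool IsAddress, string Data)> ParseRecords(string stream)
{
    var records = new List<(long Timestamp, bool IsAddress, string Data)>();
    var pattern = new Regex(@"_(?<ts>[0-9A-Fa-f]+)\.(?<addr>[01])(?<data>[0-9A-Fa-f]+)");
    foreach (Match m in pattern.Matches(stream))
    {
        long timestamp = long.Parse(m.Groups["ts"].Value,
            NumberStyles.HexNumber, CultureInfo.InvariantCulture);
        bool isAddress = m.Groups["addr"].Value == "1";
        records.Add((timestamp, isAddress, m.Groups["data"].Value));
    }
    return records;
}

var parsed = ParseRecords(".12_80E886.02_80E894.13");
foreach (var r in parsed)
    Console.WriteLine($"{r.Timestamp} address={r.IsAddress} data={r.Data}");
```

Because only text that matches the whole pattern is accepted, malformed input is ignored rather than trusted. The last record in a live stream may still be cut off mid-data, so it is safer to hold it back until the next underscore arrives.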
I think a string is a bad choice. The data is probably sent as bytes, so sniff bytes rather than a string. You also need a description of the protocol to understand the data.
You need to read bytes from the bus and interpret them.
Is there a good reason that .NET provides string functions (like search, substring extraction, splitting, etc) only for UTF-16 and not for byte arrays? I see many cases when it would be easier and much more efficient to work with 8-bit chars instead of 16-bit.
Let's take MIME (.EML) format for example. It's basically 8-bit text file. You cannot read it properly using ANY encoding (because encoding info is contained within the file, moreover, different parts can have different encodings).
So you are basically better off reading a MIME file as bytes, determining its structure (ideally, using 8-bit string parsing tools), and after finding the encodings for all encoding-dependent data blocks, applying encoding.GetString(data) to get a normal UTF-16 representation of them.
Another thing is base64 data blocks (base64 is just an example, there are also UUE and others). Currently .NET expects you to have a base64 16-bit string, but it is inefficient to read data at double the size and do all the conversions from bytes to string just to decode it. When dealing with megabytes of data, this becomes important.
The lack of byte string manipulation functions means having to write them manually, and such an implementation is obviously less efficient than a native code implementation of the string functions.
I don't say it needs to be called 8-bit chars, let's keep it bytes. Just have a set of native methods which reflect most string manipulation routines, but with byte arrays. Is this needed only by me or am I missing something important about common .NET architecture?
Let's take MIME (.EML) format for example. It's basically 8-bit text file. You cannot read it properly using ANY encoding. (because encoding info is contained within the file, moreover, different parts can have different encodings).
So, you're talking about a case where general-purpose byte-string methods aren't very useful, and you'd need to specialise.
And then for other cases, you'd need to specialise again.
And again.
I actually think byte-string methods would be more useful than your example suggests, but it remains that a lot of cases for them have specialised needs that differ from other uses in incompatible ways.
Which suggests it may not be well-suited for the base library. It's not like you can't make your own that do fit those specialised needs.
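For the base64 case from the question specifically, newer .NET versions do have a byte-oriented decoder in System.Buffers.Text, so the detour through a UTF-16 string can be avoided. A minimal sketch, assuming the base64 bytes have already been isolated and contain no line breaks (MIME bodies are normally wrapped, and whether whitespace is tolerated depends on the runtime version); DecodeBase64Bytes is a name made up for this example:

```csharp
using System;
using System.Buffers;
using System.Buffers.Text;
using System.Text;

// Decodes base64 held as raw ASCII/UTF-8 bytes without building a string first.
static byte[] DecodeBase64Bytes(ReadOnlySpan<byte> base64)
{
    var output = new byte[Base64.GetMaxDecodedFromUtf8Length(base64.Length)];
    OperationStatus status = Base64.DecodeFromUtf8(base64, output, out _, out int written);
    if (status != OperationStatus.Done)
        throw new FormatException($"Base64 decode failed: {status}");
    // Trim the buffer to the number of bytes actually produced.
    return output.AsSpan(0, written).ToArray();
}

byte[] raw = Encoding.ASCII.GetBytes("SGVsbG8sIE1JTUU=");
byte[] payload = DecodeBase64Bytes(raw);
Console.WriteLine(Encoding.ASCII.GetString(payload));
```

This is still a specialised API for one format rather than a general byte-string toolkit, which fits the point above.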
Code to deal with mixed-encoding string manipulation is unnecessarily hard and much harder to explain/get right. The way you suggest handling mixed encodings, every "string" would need to keep encoding information in it, and the framework would have to provide implementations for all possible combinations of encodings.
The standard solution for such a problem is to provide a well-defined way to convert all types to/from a single "canonical" representation and perform most operations on that canonical type. You see this more easily in image/video processing, where arbitrary incoming formats are converted into one format the tool knows about, processed, and converted back to the original or any other format.
.NET strings are almost there as a "canonical" way to represent Unicode strings. There are still many ways to represent what is, from the user's point of view, the same string but is actually composed of different char elements. Even regular string comparison is a huge problem (as there are frequently locale differences in addition to encoding).
Notes
There are already plenty of APIs for comparing/slicing byte arrays, both in the Array/List classes and as LINQ helpers. The only real missing part is regex-like matching.
Even dealing with a single string encoding (UTF-16 in .NET, UTF-8 in many other systems) is hard enough. Even getting the "string length" is a problem (do you need to count surrogate pairs only, or include all combining characters, or is just .Length enough?).
It is a good idea to try writing the code yourself to see where the complexity comes from and whether a particular framework decision makes sense. Try to implement 10-15 common string functions that support several encodings (e.g. UTF-8, UTF-16, and one of the 8-bit encodings).
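To give an idea of what the existing byte-level APIs look like in practice, here is a small sketch that finds the end of a MIME header block directly in a byte array. It relies on the span IndexOf extension from System.MemoryExtensions, which is available in newer .NET versions; the sample message and the FindHeaderEnd name are invented for the example.

```csharp
using System;
using System.Text;

// Returns the index of the blank line (CR LF CR LF) that ends a header block,
// or -1 if there is none. Works on raw bytes; no text decoding is involved.
static int FindHeaderEnd(ReadOnlySpan<byte> message)
{
    ReadOnlySpan<byte> separator = new byte[] { 13, 10, 13, 10 };
    return message.IndexOf(separator);
}

byte[] message = Encoding.ASCII.GetBytes(
    "Content-Type: text/plain; charset=koi8-r\r\n\r\nbody bytes");
int headerEnd = FindHeaderEnd(message);

// Only the header part is decoded here; the body stays as bytes until its
// charset has been read from the headers.
string headers = Encoding.ASCII.GetString(message, 0, headerEnd);
Console.WriteLine(headers);
```

Searching and slicing are covered this way; splitting on patterns or case-insensitive matching over bytes is where you would still end up writing your own code.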
I'm developing an application that listens to the data coming to the PC and stores it in a database.
When I use any sniffing software, it decodes the data and I can read it...
But in my code I can't read it at all.
It comes in a format like this:
1822262151622341817118815518211616121520941131921572041519912321413018224510453482062312258624219217426213385792952422362282081777270129716688629114817282188771708157542505055171418651781981425595109572128317191993018793431541418175198551682143218916536118562071014546919618158204181231187237183188160147127165111798312311810419822146114761993113815821216617541542372062129733198212250147199288115346102031191275215728146245198190171121209115149107193226253199151253205183146112072202559697791491441131572351381412278441552554817712614110121823714822712523618924690185291182071331471286244143181469018522814822821118012620321315924832238219115405615512392145202385512115735771691111055935782371281492476567165158924021493139815144225143762294713291762001113814720516216041120169912317914878167571392103510118386589521910621319622274158971538465206168139190127867123282255271781242497522124211517622131122113236255230254211206911242051832545515823012124925217318223920523316923122925514321122343602492471242........
Can anyone tell me what kind of data this is, and suggest any code to decode it?
To see what a real packet sniffer looks like, check out Wireshark. There are many different protocols over TCP, and many of them are binary. Those that aren't may be using Unicode text encoded as UTF-16, where characters take at least two bytes each, so an ASCII display of them would be meaningless.
Anyway, the data you're displaying is pretty meaningless. It looks like decimal data. Are you concatenating a bunch of decimal representations of the binary stream interpreted as byte or integer values? That would explain it. You should start by running the stream through System.Text.Encoding.ASCII.GetString; you'll probably see some recognizable strings. Then try System.Text.Encoding.Unicode.GetString, etc.
No, we cannot. And the reason is simple, we don't know what application you are sniffing.
That stream of data could mean anything.
But, I suggest you print the data in hexadecimal. Maybe the data would make more sense.
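A small sketch of the difference, using a few made-up byte values: concatenating the decimal text of each byte throws away the byte boundaries, while two hex digits per byte keep them, and an ASCII column next to it shows any readable text. The three helper names are invented for this example.

```csharp
using System;
using System.Linq;

// Concatenating each byte's decimal text loses the byte boundaries:
// the result could be split back into bytes in many different ways.
static string DecimalRun(byte[] data) => string.Concat(data.Select(b => b.ToString()));

// Two hex digits per byte keep every byte recoverable.
static string HexRun(byte[] data) => BitConverter.ToString(data).Replace("-", " ");

// Printable ASCII shown as-is, everything else as '.', the way sniffers display it.
static string AsciiRun(byte[] data) =>
    new string(data.Select(b => b >= 32 && b < 127 ? (char)b : '.').ToArray());

byte[] received = { 182, 226, 215, 72, 105 };   // hypothetical sample bytes

Console.WriteLine(DecimalRun(received));
Console.WriteLine(HexRun(received));
Console.WriteLine(AsciiRun(received));
```

If the decimal line looks like the dump in the question, the fix is on the logging side: keep the received byte[] and format it as above instead of appending numbers to a string.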
I use the excellent FileHelpers library when I work with text data. It allows me to very easily dump text fields from a file or in-memory string into a class that represents the data.
In working with a big-endian microcontroller-based system I need to read a serial data stream. In order to save space on the very limited microcontroller platform I need to write raw binary data which contains fields of various multi-byte types (essentially just dumping a struct variable out the serial port).
I like the architecture of FileHelpers. I create a class that represents the data and tag it with attributes that tell the engine how to put data into the class. I can feed the engine a string representing a single record and get a deserialized representation of the data. However, this is different from object serialization in that the raw data is not delimited in any way; it's a simple binary fixed record format.
FileHelpers is probably not suitable for reading such binary data as it cannot handle the nulls that show up and* I suspect that there might be Unicode issues (the engine takes input as a string, so I have to read bytes from the serial port and translate them into a Unicode string before they go to my data converter classes). As an experiment I have set it up to read the binary stream, and as long as I'm careful not to send nulls it works quite well so far. It is easy to set up new converters that read the raw data and account for endian formatting issues and such. It currently fails on nulls and cannot process multiple records (it expects a CRLF between records).
What I want to know is if anyone knows of an open-source library that works similarly to FileHelpers but that is designed to handle binary data.
I'm considering deriving something from FileHelpers to handle this task, but it seems like there ought to be something already available to do this.
*It turns out that it does not complain about nulls in the input stream. I had an unrelated bug in my test program that came up where I expected a problem with the nulls. Should have investigated a little deeper first!
I haven't used filehelpers, so I can't do a direct comparison; however, if you have an object-model that represents your objects, you could try protobuf-net; it is a binary serialization engine for .NET using Google's compact "protocol buffers" wire format. Much more efficient than things like xml, but without the need to write all your own serialization code.
Note that "protocol buffers" does include some very terse markers between fields (typically one byte); this adds a little padding, but greatly improves version tolerance. For "packed" data (i.e. blocks of ints, say, from an array) this can be omitted if desired.
So: if you just want a compact output, it might be good. If you need a specific output, probably less so.
Disclosure: I'm the author, so I'm biased; but it is free.
When I am fiddling with GPS data in the SIRFstarIII binary mode, I use the Python interactive prompt with the serial module to fetch the stream from the USB/serial port and the struct module to convert the bytes as needed (per some format defined by SIRF). Using the interactive prompt is very flexible because I can read the string to a variable, process it, view the results and try again if needed. After the prototyping stage is finished, I have the data format strings that I need to put into the final program.
Your question doesn't mention anything about why you have a C# tag. I understand FileHelpers is a C# library, but that doesn't tell me what environment you are working in. There is an implementation of Python for .NET called IronPython.
I realize this answer might mean you have to learn a new language, but having an interactive prompt is a very powerful tool for any programmer.
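If you stay in C#, the closest counterpart to unpacking with a format string is reading each field at a fixed offset with an explicit byte order. A minimal sketch using BinaryPrimitives, available in newer .NET versions; the 7-byte record layout (a ushort id, an int reading, a status byte) is purely hypothetical and stands in for whatever struct the microcontroller dumps:

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical 7-byte big-endian record:
//   offset 0: ushort id, offset 2: int reading, offset 6: byte status
static (ushort Id, int Reading, byte Status) ParseRecord(ReadOnlySpan<byte> record)
{
    if (record.Length < 7)
        throw new ArgumentException("Record too short", nameof(record));
    ushort id = BinaryPrimitives.ReadUInt16BigEndian(record.Slice(0, 2));
    int reading = BinaryPrimitives.ReadInt32BigEndian(record.Slice(2, 4));
    byte status = record[6];
    return (id, reading, status);
}

// Sample bytes as they might arrive from the serial port; the zero byte at
// the end is ordinary data here, not a terminator.
byte[] sample = { 0x01, 0x02, 0xFF, 0xFF, 0xFF, 0xFE, 0x00 };
var fields = ParseRecord(sample);
Console.WriteLine($"{fields.Id} {fields.Reading} {fields.Status}");
```

Since the input never passes through a string, null bytes and text encodings are not a concern, and fixed-size records can be read back to back without any delimiter.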