ByteArray class for C# - c#

I'm trying to translate a function from ActionScript 3 into C# .NET.
What I have trouble is how to properly use ByteArrays in C#. In As3 there is a specific Class for it that already has most of the functionality i need, but in C# nothing of that sort seems to exist and I can't wrap my head around it.
This is the As3 function:
private function createBlock(type:uint, tag:uint,data:ByteArray):ByteArray
{
var ba:ByteArray = new ByteArray();
ba.endian = Endian.LITTLE_ENDIAN;
ba.writeUnsignedInt(data.length+16);
ba.writeUnsignedInt(0x00);
ba.writeUnsignedInt(type);
ba.writeUnsignedInt(tag);
data.position = 0;
ba.writeBytes(data);
ba.position = 0;
return ba;
}
But from what I gather, in C# I have to use a normal Array with the byte type, like this
byte[] ba = new byte[length];
Now, I looked into the Encoding Class, the BinaryWriter and BinaryFormatter class and researched if somebody made a Class for ByteArrays, but with no luck.
Can somebody nudge me in the right direction please?

You should be able to do this using a combination of MemoryStream and BinaryWriter:
public static byte[] CreateBlock(uint type, uint tag, byte[] data)
{
using (var memory = new MemoryStream())
{
// We want 'BinaryWriter' to leave 'memory' open, so we need to specify false for the third
// constructor parameter. That means we need to also specify the second parameter, the encoding.
// The default encoding is UTF8, so we specify that here.
var defaultEncoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier:false, throwOnInvalidBytes:true);
using (var writer = new BinaryWriter(memory, defaultEncoding, leaveOpen:true))
{
// There is no Endian - things are always little-endian.
writer.Write((uint)data.Length+16);
writer.Write((uint)0x00);
writer.Write(type);
writer.Write(data);
}
// Note that we must close or flush 'writer' before accessing 'memory', otherwise the bytes written
// to it may not have been transferred to 'memory'.
return memory.ToArray();
}
}
However, note that BinaryWriter always uses little-endian format. If you need to control this, you can use Jon Skeet's EndianBinaryWriter instead.
As an alternative to this approach, you could pass streams around instead of byte arrays (probably using a MemoryStream for implementation), but then you will need to be careful about lifetime management, i.e. who will close/dispose the stream when it's done with? (You might be able to get away with not bothering to close/dispose a memory stream since it uses no unmanaged resources, but that's not entirely satisfactory IMO.)

You want to have a byte stream and then extract the array from it:
using(MemoryStream memory = new MemoryStream())
using(BinaryWriter writer = new BinaryWriter(memory))
{
// write into stream
writer.Write((byte)0); // a byte
writer.Write(0f); // a float
writer.Write("hello"); // a string
return memory.ToArray(); // returns the underlying array
}

Related

Convert MemoryStream to Array C#

I have implemented a code block in order to convert Stream into Byte Array. And code snippet is shown below. But unfortunately, it gives OutOfMemory Exception while converting MemoryStream to Array (return newDocument.ToArray();). please could someone help me with this?
public byte[] MergeToBytes()
{
using (var processor = new PdfDocumentProcessor())
{
AppendStreamsToDocumentProcessor(processor);
using (var newDocument = new MemoryStream())
{
processor.SaveDocument(newDocument);
return newDocument.ToArray();
}
}
}
public Stream MergeToStream()
{
return new MemoryStream(MergeToBytes());
}
Firstly: how big is the document? if it is too big for the byte[] limit: you're going to have to use a different approach.
However, a MemoryStream is already backed by an (oversized) array; you can get this simply using newDocument.TryGetBuffer(out var buffer), and noting that you must restrict yourself to the portion of the .Array indicated by .Offset (usually, but not always, zero) and .Count (the number of bytes that should be considered "live"). Note that TryGetBuffer can return false, but not in the new MemoryStream() scenario.
If is also interesting that you're converting a MemoryStream to a byte[] and then back to a MemoryStream. An alternative here would just have been to set the Position back to 0, i.e. rewind it. So:
public Stream MergeToStream()
{
using var processor = new PdfDocumentProcessor();
AppendStreamsToDocumentProcessor(processor);
var newDocument = new MemoryStream();
processor.SaveDocument(newDocument);
newDocument.Position = 0;
return newDocument;
}

Send object over a TCP connection

I've started windows mobile programming today and I have successfully connected to my server.
The application I am making on Visual Studio is not a universal application, but a Windows Mobile Application.
The API DataWriter is used to write data to an output stream, in the applications scenario the output stream is the socket. I.E:
DataWriter dw = new DataWriter(clientSocket.OutputStream);
One of the methods I have been looking at is WriteBytes and WriteBuffer
(Documentation can be found her for API documentation for DataWriter
.
Which method do I use, and why?
How can I convert this class and sent it to my server using the methods mentioned above.
public class Message
{
public string pas { get; internal set; }
public int type { get; internal set; }
public string us { get; internal set; }#
}
//the below code goes in a seperate function
DataWriter dw = new DataWriter(clientSocket.OutputStream);
Message ms = new Message();
ms.type = 1;
ms.us = usernameTextBox.Text;
ms.pas = usernameTextBox.Text;
//TODO: send ms to the server
Between the two methods, WriteBytes() seems like the simpler approach. WriteBuffer() offers you more control over the output buffer/stream, which you can certainly use if and when you need to. But, for all intents and purposes, if you just need to simply open a connection and send it a byte stream, WriteBytes() does the job.
How can I convert this class and sent it to my server
That's entirely up to you, really. What you have to do is define how you're going to "serialize" your class to transmit over the connection (and thereby have to "deserialize" it when the other code receives the data).
There are a few ways to do that, among many others. A straightforward approach (taken from the top answer on that linked question), would be to use the BinaryFormatter class. Something like this:
var ms = new Message();
ms.type = 1;
ms.us = usernameTextBox.Text;
ms.pas = usernameTextBox.Text;
byte[] serializedMessage;
var formatter = new BinaryFormatter();
using (var stream = new MemoryStream())
{
formatter.Serialize(stream, ms);
serializedMessage = ms.ToArray();
}
// now serializedMessage is a byte array to be sent
Then on the other end you'd need to deserialize it back to an object instance. Which might look something like this:
// assuming you have a variable called serializedMessage as the byte array received
Message ms;
using (var stream = new MemoryStream())
{
var formatter = new BinaryFormatter();
stream.Write(serializedMessage, 0, serializedMessage.Length);
stream.Seek(0, SeekOrigin.Begin);
ms = (Message)formatter.Deserialize(stream);
}
You can of course abstract these behind a simpler interface. Or if you're looking for any kind of human readability in the serialization you might try something like a JSON serializer and directly convert the string to a byte array, etc.
Edit: Note that this is really just an example of one of many ways to "serialize" an object. And, as pointed out by a user in comments below, there could be drawbacks to using this binary serializer.
You can use any serializer, really. You can even make your own. Technically overriding .ToString() to print all the properties and then calling that is a form of serialization. The concept is always the same... Convert the in-memory object to a transmittable piece of data, from which an identical in-memory object can later be built. (Technically, saving to a database is a form of serialization.)

How can I calculate (manually) the SizeOf a string, also does a class add to a size?

First part:
I have a string...
string sTest = "This is my test string";
How can I (manually, without code) determine the SizeOf the string? What should it end up being? How do you get that size?
Second Part:
I have a class...
[StructLayout(LayoutKind.Sequential)]
public class TestStruct
{
public string Test;
}
I use it...
TestStruct oTest = new TestStruct();
oTest.Test = "This is my test string";
Are there any differences in size from the first part to the second part?
Update:
The point of this is to use the size as a way to create a memory map file.
long lMapSize = System.Runtime.InteropServices.Marshal.SizeOf(oTest);
mmf = MemoryMappedFile.CreateOrOpen("testmap", lMapSize);
Just thought it was worth noting. Currently lMapSize = 4. Which confuses the ... out of me! Thanks everyone!
string is a reference type, so field inside other class or struct or local method variable will be a pointer - IntPtr. Size of pointer is 32 bit on 32 bit pc and 64 for 64.
If you want to know size of string itself, you need to know which encoding you are going to use. Encoding.UTF8.GetByteCount("aaa") will return size of "aaa" in UTF8 encoding. But this will return only size of string, not .net object size.
Looks like there is no accurate way to calculate size of object from code. If you want to know what's going on in your application there is some memory profilers for it, just search 'C# memory profiler'. I've used only mono profiler with heap-shot, so I can't recomend you which of profilers is good.
Another way is to use WinDBG with sos.dll and sosex.dll. Here you can find example of using sos.dll.
Update:
You can also serialize your object into byte array. Serializer will add some extra data like assembly name and version.
byte[] bytes;
using (MemoryStream stream = new MemoryStream())
{
IFormatter formatter = new BinaryFormatter();
formatter.Serialize(stream, myObject);
bytes = stream.ToArray();
}
using (MemoryStream stream = new MemoryStream(bytes))
{
IFormatter formatter = new BinaryFormatter();
myObject = formatter.Deserialize(stream);
}

No map for object error when deserializing object

I have the following C# code which is supposed to serialize arbitrary objects to a string, and then of course deserialize it.
public static string Pack(Message _message)
{
BinaryFormatter formatter = new BinaryFormatter();
MemoryStream original = new MemoryStream();
MemoryStream outputStream = new MemoryStream();
formatter.Serialize(original, _message);
original.Seek(0, SeekOrigin.Begin);
DeflateStream deflateStream = new DeflateStream(outputStream, CompressionMode.Compress);
original.CopyTo(deflateStream);
byte[] bytearray = outputStream.ToArray();
UTF8Encoding encoder = new UTF8Encoding();
string packed = encoder.GetString(bytearray);
return packed;
}
public static Message Unpack(string _packed_message)
{
UTF8Encoding encoder = new UTF8Encoding();
byte[] bytearray = encoder.GetBytes(_packed_message);
BinaryFormatter formatter = new BinaryFormatter();
MemoryStream input = new MemoryStream(bytearray);
MemoryStream decompressed = new MemoryStream();
DeflateStream deflateStream = new DeflateStream(input, CompressionMode.Decompress);
deflateStream.CopyTo(decompressed); // EXCEPTION
decompressed.Seek(0, SeekOrigin.Begin);
var message = (Message)formatter.Deserialize(decompressed); // EXCEPTION 2
return message;
}
But the problem is that any time the code is ran, I am experiencing an exception. Using the above code and invoking it as shown below, I am receiving InvalidDataException: Unknown block type. Stream might be corrupted. at the marked // EXCEPTION line.
After searching for this issue I have attempted to ditch the deflation. This was only a small change: in Pack, bytearray gets created from original.ToArray() and in Unpack, I Seek() input instead of decompressed and use Deserialize(input) instead of decompressed too. The only result which changed: the exception position and body is different, yet it still happens. I receive a SerializationException: No map for object '201326592'. at // EXCEPTION 2.
I don't seem to see what is the problem. Maybe it is the whole serialization idea... the problem is that somehow managing to pack the Message instances is necessary because these objects hold the information that travel between the server and the client application. (Serialization logic is in a .Shared DLL project which is referenced on both ends, however, right now, I'm only developing the server-side first.) It also has to be told, that I am only using string outputs because right now, the TCP connection between the servers and clients are based on string read-write on the ends. So somehow it has to be brought down to the level of strings.
This is how the Message object looks like:
[Serializable]
public class Message
{
public MessageType type;
public Client from;
public Client to;
public string content;
}
(Client right now is an empty class only having the Serializable attribute, no properties or methods.)
This is how the pack-unpack gets invoked (from Main()...):
Shared.Message msg = Shared.MessageFactory.Build(Shared.MessageType.DEFAULT, new Shared.Client(), new Shared.Client(), "foobar");
string message1 = Shared.MessageFactory.Pack(msg);
Console.WriteLine(message1);
Shared.Message mess2 = Shared.MessageFactory.Unpack(message1); // Step into... here be exceptions
Console.Write(mess2.content);
Here is an image showing what happens in the IDE. The output in the console window is the value of message1.
Some investigation unfortunately also revealed that the problem could lie around the bytearray variable. When running Pack(), after the encoder creates the string, the array contains 152 values, however, after it gets decoded in Unpack(), the array has 160 values instead.
I am appreciating any help as I am really out of ideas and having this problem the progress is crippled. Thank you.
(Update) The final solution:
I would like to thank everyone answering and commenting, as I have reached the solution. Thank you.
Marc Gravell was right, I missed the closing of deflateStream and because of this, the result was either empty or corrupted. I have taken my time and rethought and rewrote the methods and now it works flawlessly. And even the purpose of sending these bytes over the networked stream is working too.
Also, as Eric J. suggested, I have switched to using ASCIIEnconding for the change between string and byte[] when the data is flowing in the Stream.
The fixed code lies below:
public static string Pack(Message _message)
{
using (MemoryStream input = new MemoryStream())
{
BinaryFormatter bformatter = new BinaryFormatter();
bformatter.Serialize(input, _message);
input.Seek(0, SeekOrigin.Begin);
using (MemoryStream output = new MemoryStream())
using (DeflateStream deflateStream = new DeflateStream(output, CompressionMode.Compress))
{
input.CopyTo(deflateStream);
deflateStream.Close();
return Convert.ToBase64String(output.ToArray());
}
}
}
public static Message Unpack(string _packed)
{
using (MemoryStream input = new MemoryStream(Convert.FromBase64String(_packed)))
using (DeflateStream deflateStream = new DeflateStream(input, CompressionMode.Decompress))
using (MemoryStream output = new MemoryStream())
{
deflateStream.CopyTo(output);
deflateStream.Close();
output.Seek(0, SeekOrigin.Begin);
BinaryFormatter bformatter = new BinaryFormatter();
Message message = (Message)bformatter.Deserialize(output);
return message;
}
}
Now everything happens just right, as the screenshot proves below. This was the expected output from the first place. The Server and Client executables communicate with each other and the message travels... and it gets serialized and unserialized properly.
In addition to the existing observations about Encoding vs base-64, note you haven't closed the deflate stream. This is important because compression-streams buffer: if you don't close, it may not write the end. For a short stream, that may mean it writes nothing at all.
using(DeflateStream deflateStream = new DeflateStream(
outputStream, CompressionMode.Compress))
{
original.CopyTo(deflateStream);
}
return Convert.ToBase64String(outputStream.GetBuffer(), 0,
(int)outputStream.Length);
Your problem is most probably in the UTF8 encoding. Your bytes are not really a character string and UTF-8 is a encoding with different byte lengths for characters.
This means the byte array may not correspond to a correctly encoded UTF-8 string (there may be some bytes missing at the end for instance.)
Try using UTF16 or ASCII which are constant length encodings (the resulting string will likely contain control characters so it won't be printable or transmitable through something like HTTP or email.)
But if you want to encode as a string it is customary to use UUEncoding to convert the byte array into a real printable string, then you can use any encoding you want.
When I run the following Main() code against your Pack() and Unpack():
static void Main(string[] args)
{
Message msg = new Message() { content = "The quick brown fox" };
string message1 = Pack(msg);
Console.WriteLine(message1);
Message mess2 = Unpack(message1); // Step into... here be exceptions
Console.Write(mess2.content);
}
I see that the bytearray
byte[] bytearray = outputStream.ToArray();
is empty.
I did modify your serialized class slightly since you did not post code for the included classes
public enum MessageType
{
DEFAULT = 0
}
[Serializable]
public class Message
{
public MessageType type;
public string from;
public string to;
public string content;
}
I suggest the following steps to resolve this:
Check the intermediate results along the way. Do you also see 0 bytes in the array? What is the string value returned by Pack()?
Dispose of your streams once you are done with them. The easiest way to do that is with the using keyword.
Edit
As Eli and Marc correctly pointed out, you cannot store arbitrary bytes in a UTF8 string. The mapping is not bijective (you can't go back and forth without loss/distortion of information). You will need a mapping that is bijective, such as the Convert.ToBase64String() approach Marc suggests.

Bytes consumed by StreamReader

Is there a way to know how many bytes of a stream have been used by StreamReader?
I have a project where we need to read a file that has a text header followed by the start of the binary data. My initial attempt to read this file was something like this:
private int _dataOffset;
void ReadHeader(string path)
{
using (FileStream stream = File.OpenRead(path))
{
StreamReader textReader = new StreamReader(stream);
do
{
string line = textReader.ReadLine();
handleHeaderLine(line);
} while(line != "DATA") // Yes, they used "DATA" to mark the end of the header
_dataOffset = stream.Position;
}
}
private byte[] ReadDataFrame(string path, int frameNum)
{
using (FileStream stream = File.OpenRead(path))
{
stream.Seek(_dataOffset + frameNum * cbFrame, SeekOrigin.Begin);
byte[] data = new byte[cbFrame];
stream.Read(data, 0, cbFrame);
return data;
}
return null;
}
The problem is that when I set _dataOffset to stream.Position, I get the position that the StreamReader has read to, not the end of the header. As soon as I thought about it this made sense, but I still need to be able to know where the end of the header is and I'm not sure if there's a way to do it and still take advantage of StreamReader.
You can find out how many bytes the StreamReader has actually returned (as opposed to read from the stream) in a number of ways, none of them too straightforward I'm afraid.
Get the result of textReader.CurrentEncoding.GetByteCount(totalLengthOfAllTextRead) and then seek to this position in the stream.
Use some reflection hackery to retrieve the value of the private variable of the StreamReader object that corresponds to the current byte position within the internal buffer (different from that with the stream - usually behind, but no more than equal to of course). Judging by .NET Reflector, the this variable seems to be named bytePos.
Don't bother using a StreamReader at all but instead implement your custom ReadLine function built on top of the Stream or BinaryReader even (BinaryReader is guaranteed never to read further ahead than what you request). This custom function must read from the stream char by char, so you'd actually have to use the low-level Decoder object (unless the encoding is ASCII/ANSI, in which case things are a bit simpler due to single-byte encoding).
Option 1 is going to be the least efficient I would imagine (since you're effectively re-encoding text you just decoded), and option 3 the hardest to implement, though perhaps the most elegant. I'd probably recommend against using the ugly reflection hack (option 2), even though it's looks tempting, being the most direct solution and only taking a couple of lines. (To be quite honest, the StreamReader class really ought to expose this variable via a public property, but alas it does not.) So in the end, it's up to you, but either method 1 or 3 should do the job nicely enough...
Hope that helps.
So the data is utf8 (the default encoding for StreamReader). This is a multibyte encoding, so IndexOf would be inadvisable. You could:
Encoding.UTF8.GetByteCount(string)
on your data so far, adding 1 or 2 bytes for the missing line ending.
If you're needing to count bytes, I'd go with the BinaryReader. You can take the results and cast them about as needed, but I find its idea of its current position to be more reliable (in that since it reads in binary, its immune to character-set problems).
So your last line contains 'DATA' + an unknown amount of data bytes. You could extract the position by using IndexOf() with your last read line. Then readjust the stream.Position.
But I am not sure if you should use ReadLine() at all in this case. Maybe it would be better to read byte by byte until you reach the 'DATA' mark.
The line breaks are easily identifiable without needing to decode the stream first (except for some encodings rarely used for text files like EBCDIC, UTF-16, UTF-32), so you can just read each line as bytes and then decode the entire line:
using (FileStream stream = File.OpenRead(path)) {
List<byte> buffer = new List<byte>();
bool hasCr = false;
bool done = false;
while (!done) {
int b = stream.ReadByte();
if (b == -1) throw new IOException("End of file reached in header.");
if (b == 13) {
hasCr = true;
} else if (b == 10 && hasCr) {
string line = Encoding.UTF8.GetString(buffer.ToArray(), 0, buffer.Count);
if (line == "DATA") {
done = true;
} else {
HandleHeaderLine(line);
}
buffer.Clear();
hasCr = false;
} else {
if (hasCr) buffer.Add(13);
hasCr = false;
buffer.Add((byte)b);
}
}
_dataOffset = stream.Position;
}
Instead of closing the stream and open it again, you could of course just keep on reading the data.

Categories