I have the following C# code which is supposed to serialize arbitrary objects to a string, and then of course deserialize it.
public static string Pack(Message _message)
{
BinaryFormatter formatter = new BinaryFormatter();
MemoryStream original = new MemoryStream();
MemoryStream outputStream = new MemoryStream();
formatter.Serialize(original, _message);
original.Seek(0, SeekOrigin.Begin);
DeflateStream deflateStream = new DeflateStream(outputStream, CompressionMode.Compress);
original.CopyTo(deflateStream);
byte[] bytearray = outputStream.ToArray();
UTF8Encoding encoder = new UTF8Encoding();
string packed = encoder.GetString(bytearray);
return packed;
}
public static Message Unpack(string _packed_message)
{
UTF8Encoding encoder = new UTF8Encoding();
byte[] bytearray = encoder.GetBytes(_packed_message);
BinaryFormatter formatter = new BinaryFormatter();
MemoryStream input = new MemoryStream(bytearray);
MemoryStream decompressed = new MemoryStream();
DeflateStream deflateStream = new DeflateStream(input, CompressionMode.Decompress);
deflateStream.CopyTo(decompressed); // EXCEPTION
decompressed.Seek(0, SeekOrigin.Begin);
var message = (Message)formatter.Deserialize(decompressed); // EXCEPTION 2
return message;
}
But the problem is that any time the code is ran, I am experiencing an exception. Using the above code and invoking it as shown below, I am receiving InvalidDataException: Unknown block type. Stream might be corrupted. at the marked // EXCEPTION line.
After searching for this issue I have attempted to ditch the deflation. This was only a small change: in Pack, bytearray gets created from original.ToArray() and in Unpack, I Seek() input instead of decompressed and use Deserialize(input) instead of decompressed too. The only result which changed: the exception position and body is different, yet it still happens. I receive a SerializationException: No map for object '201326592'. at // EXCEPTION 2.
I don't seem to see what is the problem. Maybe it is the whole serialization idea... the problem is that somehow managing to pack the Message instances is necessary because these objects hold the information that travel between the server and the client application. (Serialization logic is in a .Shared DLL project which is referenced on both ends, however, right now, I'm only developing the server-side first.) It also has to be told, that I am only using string outputs because right now, the TCP connection between the servers and clients are based on string read-write on the ends. So somehow it has to be brought down to the level of strings.
This is how the Message object looks like:
[Serializable]
public class Message
{
public MessageType type;
public Client from;
public Client to;
public string content;
}
(Client right now is an empty class only having the Serializable attribute, no properties or methods.)
This is how the pack-unpack gets invoked (from Main()...):
Shared.Message msg = Shared.MessageFactory.Build(Shared.MessageType.DEFAULT, new Shared.Client(), new Shared.Client(), "foobar");
string message1 = Shared.MessageFactory.Pack(msg);
Console.WriteLine(message1);
Shared.Message mess2 = Shared.MessageFactory.Unpack(message1); // Step into... here be exceptions
Console.Write(mess2.content);
Here is an image showing what happens in the IDE. The output in the console window is the value of message1.
Some investigation unfortunately also revealed that the problem could lie around the bytearray variable. When running Pack(), after the encoder creates the string, the array contains 152 values, however, after it gets decoded in Unpack(), the array has 160 values instead.
I am appreciating any help as I am really out of ideas and having this problem the progress is crippled. Thank you.
(Update) The final solution:
I would like to thank everyone answering and commenting, as I have reached the solution. Thank you.
Marc Gravell was right, I missed the closing of deflateStream and because of this, the result was either empty or corrupted. I have taken my time and rethought and rewrote the methods and now it works flawlessly. And even the purpose of sending these bytes over the networked stream is working too.
Also, as Eric J. suggested, I have switched to using ASCIIEnconding for the change between string and byte[] when the data is flowing in the Stream.
The fixed code lies below:
public static string Pack(Message _message)
{
using (MemoryStream input = new MemoryStream())
{
BinaryFormatter bformatter = new BinaryFormatter();
bformatter.Serialize(input, _message);
input.Seek(0, SeekOrigin.Begin);
using (MemoryStream output = new MemoryStream())
using (DeflateStream deflateStream = new DeflateStream(output, CompressionMode.Compress))
{
input.CopyTo(deflateStream);
deflateStream.Close();
return Convert.ToBase64String(output.ToArray());
}
}
}
public static Message Unpack(string _packed)
{
using (MemoryStream input = new MemoryStream(Convert.FromBase64String(_packed)))
using (DeflateStream deflateStream = new DeflateStream(input, CompressionMode.Decompress))
using (MemoryStream output = new MemoryStream())
{
deflateStream.CopyTo(output);
deflateStream.Close();
output.Seek(0, SeekOrigin.Begin);
BinaryFormatter bformatter = new BinaryFormatter();
Message message = (Message)bformatter.Deserialize(output);
return message;
}
}
Now everything happens just right, as the screenshot proves below. This was the expected output from the first place. The Server and Client executables communicate with each other and the message travels... and it gets serialized and unserialized properly.
In addition to the existing observations about Encoding vs base-64, note you haven't closed the deflate stream. This is important because compression-streams buffer: if you don't close, it may not write the end. For a short stream, that may mean it writes nothing at all.
using(DeflateStream deflateStream = new DeflateStream(
outputStream, CompressionMode.Compress))
{
original.CopyTo(deflateStream);
}
return Convert.ToBase64String(outputStream.GetBuffer(), 0,
(int)outputStream.Length);
Your problem is most probably in the UTF8 encoding. Your bytes are not really a character string and UTF-8 is a encoding with different byte lengths for characters.
This means the byte array may not correspond to a correctly encoded UTF-8 string (there may be some bytes missing at the end for instance.)
Try using UTF16 or ASCII which are constant length encodings (the resulting string will likely contain control characters so it won't be printable or transmitable through something like HTTP or email.)
But if you want to encode as a string it is customary to use UUEncoding to convert the byte array into a real printable string, then you can use any encoding you want.
When I run the following Main() code against your Pack() and Unpack():
static void Main(string[] args)
{
Message msg = new Message() { content = "The quick brown fox" };
string message1 = Pack(msg);
Console.WriteLine(message1);
Message mess2 = Unpack(message1); // Step into... here be exceptions
Console.Write(mess2.content);
}
I see that the bytearray
byte[] bytearray = outputStream.ToArray();
is empty.
I did modify your serialized class slightly since you did not post code for the included classes
public enum MessageType
{
DEFAULT = 0
}
[Serializable]
public class Message
{
public MessageType type;
public string from;
public string to;
public string content;
}
I suggest the following steps to resolve this:
Check the intermediate results along the way. Do you also see 0 bytes in the array? What is the string value returned by Pack()?
Dispose of your streams once you are done with them. The easiest way to do that is with the using keyword.
Edit
As Eli and Marc correctly pointed out, you cannot store arbitrary bytes in a UTF8 string. The mapping is not bijective (you can't go back and forth without loss/distortion of information). You will need a mapping that is bijective, such as the Convert.ToBase64String() approach Marc suggests.
Related
I have implemented a code block in order to convert Stream into Byte Array. And code snippet is shown below. But unfortunately, it gives OutOfMemory Exception while converting MemoryStream to Array (return newDocument.ToArray();). please could someone help me with this?
public byte[] MergeToBytes()
{
using (var processor = new PdfDocumentProcessor())
{
AppendStreamsToDocumentProcessor(processor);
using (var newDocument = new MemoryStream())
{
processor.SaveDocument(newDocument);
return newDocument.ToArray();
}
}
}
public Stream MergeToStream()
{
return new MemoryStream(MergeToBytes());
}
Firstly: how big is the document? if it is too big for the byte[] limit: you're going to have to use a different approach.
However, a MemoryStream is already backed by an (oversized) array; you can get this simply using newDocument.TryGetBuffer(out var buffer), and noting that you must restrict yourself to the portion of the .Array indicated by .Offset (usually, but not always, zero) and .Count (the number of bytes that should be considered "live"). Note that TryGetBuffer can return false, but not in the new MemoryStream() scenario.
If is also interesting that you're converting a MemoryStream to a byte[] and then back to a MemoryStream. An alternative here would just have been to set the Position back to 0, i.e. rewind it. So:
public Stream MergeToStream()
{
using var processor = new PdfDocumentProcessor();
AppendStreamsToDocumentProcessor(processor);
var newDocument = new MemoryStream();
processor.SaveDocument(newDocument);
newDocument.Position = 0;
return newDocument;
}
I am trying to send a serialized object through a network tunnel, and while sending it, it adds whitespace (blank) at the end of the byte[], so it fits the selected size. When I go to deserialize it, it throws an error:
SerializationException: End of Stream encountered before parsing was
completed
Code:
[Serializable]
class Foo
{
public int number;
public string str;
}
public class ExampleServer
{
public static void Main(string[] args)
{
TcpListener listener = new TcpListener(9000);
listener.Start();
TcpClient serverClient = listener.AcceptTcpClient();
IFormatter formatter = new BinaryFormatter();
Stream stream = new MemoryStream();
byte[] buffer = new byte[2048];
serverClient.GetStream().Read(buffer, 0, buffer.Length);
stream.Write(buffer, 0, buffer.Length);
Foo fo = (Foo)formatter.Deserialize(stream);
Console.WriteLine(fo.number);
}
}
public class ExampleClient
{
public static void Main(string[] args)
{
TcpClient client = new TcpClient();
client.Connect("127.0.0.1", 9000);
Foo fo = new Foo();
fo.number = 10;
fo.str = "str";
IFormatter formatter = new BinaryFormatter();
Stream stream = new MemoryStream();
formatter.Serialize(stream, fo);
byte[] buffer = new byte[2048];
stream.Read(buffer, 0, buffer.Length);
client.GetStream().Write(buffer, 0, buffer.Length);
}
}
How would I fix this?
In both code samples, the fundamental problem is that you write a buffer to a MemoryStream and then immediately try to read from it, without rewinding. MemoryStream is like a video tape: it only has one position, and if you don't rewind: you're in the wrong place. So fixing that will probably help.
However, that is not the only problem:
when reading from the network stream, you don't check the return of Read. That means that all you have is at least one byte. Calling Read with a buffer does not fill the buffer - it blocks until it can read at least one byte, or the end of the stream is detected; you must check the return value of Read
technically this also applies to MemoryStream, although in reality you will get away with it there
sending / expecting 2k is ... bizarre; send what the payload size is, not something arbitrarily oversized; you probably need to know about "framing", which in a binary format usually means: length-prefix
perhaps ironically, BinaryFormatter does framing itself anyway; if you'd just pointed BinaryFormatter at the network stream, it probably would have worked, but:
BinaryFormatter is a terrible choice, frankly; I could write many paragraphs explaining why - but I already did
As general advice: network code is hard. If you aren't an expert in that area, I'd strongly recommend using a pre-rolled RPC or similar layer.
Try to add [Serializable] tag to the class you are Serializing and check if this solve you problem.
add like this:
[Serializable]
public class TestSimpleObject {
public int member1;
public string member2;
public string member3;
public double member4;
}
I'm trying to translate a function from ActionScript 3 into C# .NET.
What I have trouble is how to properly use ByteArrays in C#. In As3 there is a specific Class for it that already has most of the functionality i need, but in C# nothing of that sort seems to exist and I can't wrap my head around it.
This is the As3 function:
private function createBlock(type:uint, tag:uint,data:ByteArray):ByteArray
{
var ba:ByteArray = new ByteArray();
ba.endian = Endian.LITTLE_ENDIAN;
ba.writeUnsignedInt(data.length+16);
ba.writeUnsignedInt(0x00);
ba.writeUnsignedInt(type);
ba.writeUnsignedInt(tag);
data.position = 0;
ba.writeBytes(data);
ba.position = 0;
return ba;
}
But from what I gather, in C# I have to use a normal Array with the byte type, like this
byte[] ba = new byte[length];
Now, I looked into the Encoding Class, the BinaryWriter and BinaryFormatter class and researched if somebody made a Class for ByteArrays, but with no luck.
Can somebody nudge me in the right direction please?
You should be able to do this using a combination of MemoryStream and BinaryWriter:
public static byte[] CreateBlock(uint type, uint tag, byte[] data)
{
using (var memory = new MemoryStream())
{
// We want 'BinaryWriter' to leave 'memory' open, so we need to specify false for the third
// constructor parameter. That means we need to also specify the second parameter, the encoding.
// The default encoding is UTF8, so we specify that here.
var defaultEncoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier:false, throwOnInvalidBytes:true);
using (var writer = new BinaryWriter(memory, defaultEncoding, leaveOpen:true))
{
// There is no Endian - things are always little-endian.
writer.Write((uint)data.Length+16);
writer.Write((uint)0x00);
writer.Write(type);
writer.Write(data);
}
// Note that we must close or flush 'writer' before accessing 'memory', otherwise the bytes written
// to it may not have been transferred to 'memory'.
return memory.ToArray();
}
}
However, note that BinaryWriter always uses little-endian format. If you need to control this, you can use Jon Skeet's EndianBinaryWriter instead.
As an alternative to this approach, you could pass streams around instead of byte arrays (probably using a MemoryStream for implementation), but then you will need to be careful about lifetime management, i.e. who will close/dispose the stream when it's done with? (You might be able to get away with not bothering to close/dispose a memory stream since it uses no unmanaged resources, but that's not entirely satisfactory IMO.)
You want to have a byte stream and then extract the array from it:
using(MemoryStream memory = new MemoryStream())
using(BinaryWriter writer = new BinaryWriter(memory))
{
// write into stream
writer.Write((byte)0); // a byte
writer.Write(0f); // a float
writer.Write("hello"); // a string
return memory.ToArray(); // returns the underlying array
}
I have now an issue with deserializing an object sent over TCP.
When deserializing, I get the following SerializationException (code below):
Additional information: Binary stream '0' does not contain a valid
BinaryHeader. Possible causes are invalid stream or object version
change between serialization and deserialization.
Serialization code:
public static void SerializeRO(Stream stream, ReplicableObject ro) {
MemoryStream serializedObjectStream = new MemoryStream();
Formatter.Serialize(serializedObjectStream, ro);
BinaryWriter bw = new BinaryWriter(stream);
bw.Write(serializedObjectStream.Length);
serializedObjectStream.WriteTo(stream);
serializedObjectStream.Close();
bw.Close();
}
Deserialization code:
public static List<ReplicableObject> ParseStreamForObjects(Stream stream) {
List<ReplicableObject> result = new List<ReplicableObject>();
while (true) {
if (!(stream as NetworkStream).DataAvailable) break;
BinaryReader br = new BinaryReader(stream);
int length = br.ReadInt32();
byte[] bytes = br.ReadBytes(length);
MemoryStream ms = new MemoryStream(bytes);
ms.Position = 0;
// ERROR OCCURS ON THE LINE BELOW
result.Add((ReplicableObject) Formatter.Deserialize(ms));
ms.Close();
br.Close();
}
return result;
}
The objects are being serialized at runtime, so I don't think it's a versioning issue. I am new to streaming etc, so I may have missed something obvious.
I'd like to suggest what I think it could be, but I'm really stuck. :)
Thanks.
serializedObjectStream.Length is a long.
You're writing a 64-bit value to the network, but you're trying to read it as a 32-bit int.
I'm trying to serialize/deserialize string. Using the code:
private byte[] StrToBytes(string str)
{
BinaryFormatter bf = new BinaryFormatter();
MemoryStream ms = new MemoryStream();
bf.Serialize(ms, str);
ms.Seek(0, 0);
return ms.ToArray();
}
private string BytesToStr(byte[] bytes)
{
BinaryFormatter bfx = new BinaryFormatter();
MemoryStream msx = new MemoryStream();
msx.Write(bytes, 0, bytes.Length);
msx.Seek(0, 0);
return Convert.ToString(bfx.Deserialize(msx));
}
This two code works fine if I play with string variables.
But If I deserialize a string and save it to a file, after reading the back and serializing it again, I end up with only first portion of the string.
So I believe I have a problem with my file save/read operation. Here is the code for my save/read
private byte[] ReadWhole(string fileName)
{
try
{
using (BinaryReader br = new BinaryReader(new FileStream(fileName, FileMode.Open)))
{
return br.ReadBytes((int)br.BaseStream.Length);
}
}
catch (Exception)
{
return null;
}
}
private void WriteWhole(byte[] wrt,string fileName,bool append)
{
FileMode fm = FileMode.OpenOrCreate;
if (append)
fm = FileMode.Append;
using (BinaryWriter bw = new BinaryWriter(new FileStream(fileName, fm)))
{
bw.Write(wrt);
}
return;
}
Any help will be appreciated.
Many thanks
Sample Problematic Run:
WriteWhole(StrToBytes("First portion of text"),"filename",true);
WriteWhole(StrToBytes("Second portion of text"),"filename",true);
byte[] readBytes = ReadWhole("filename");
string deserializedStr = BytesToStr(readBytes); // here deserializeddStr becomes "First portion of text"
Just use
Encoding.UTF8.GetBytes(string s)
Encoding.UTF8.GetString(byte[] b)
and don't forget to add System.Text in your using statements
BTW, why do you need to serialize a string and save it that way?
You can just use File.WriteAllText() or File.WriteAllBytes. The same way you can read it back, File.ReadAllBytes() and File.ReadAllText()
The problem is that you are writing two strings to the file, but only reading one back.
If you want to read back multiple strings, then you must deserialize multiple strings. If there are always two strings, then you can just deserialize two strings. If you want to store any number of strings, then you must first store how many strings there are, so that you can control the deserialization process.
If you are trying to hide data (as indicated by your comment to another answer), then this is not a reliable way to accomplish that goal. On the other hand, if you are storing data an a user's hard-drive, and the user is running your program on their local machine, then there is no way to hide the data from them, so this is as good as anything else.