I have implemented a code block in order to convert Stream into Byte Array. And code snippet is shown below. But unfortunately, it gives OutOfMemory Exception while converting MemoryStream to Array (return newDocument.ToArray();). please could someone help me with this?
public byte[] MergeToBytes()
{
using (var processor = new PdfDocumentProcessor())
{
AppendStreamsToDocumentProcessor(processor);
using (var newDocument = new MemoryStream())
{
processor.SaveDocument(newDocument);
return newDocument.ToArray();
}
}
}
public Stream MergeToStream()
{
return new MemoryStream(MergeToBytes());
}
Firstly: how big is the document? if it is too big for the byte[] limit: you're going to have to use a different approach.
However, a MemoryStream is already backed by an (oversized) array; you can get this simply using newDocument.TryGetBuffer(out var buffer), and noting that you must restrict yourself to the portion of the .Array indicated by .Offset (usually, but not always, zero) and .Count (the number of bytes that should be considered "live"). Note that TryGetBuffer can return false, but not in the new MemoryStream() scenario.
If is also interesting that you're converting a MemoryStream to a byte[] and then back to a MemoryStream. An alternative here would just have been to set the Position back to 0, i.e. rewind it. So:
public Stream MergeToStream()
{
using var processor = new PdfDocumentProcessor();
AppendStreamsToDocumentProcessor(processor);
var newDocument = new MemoryStream();
processor.SaveDocument(newDocument);
newDocument.Position = 0;
return newDocument;
}
Related
I needed to put a byte to a memory stream so initially, I used:
byte[] Input;
using (MemoryStream mem = new MemoryStream())
{
mem.Write(Input, 0, (int)Input.Length);
StreamReader stream = new StreamReader(mem);
...
}
I wanted to use the Streamreader to read lines from a text file.
It didn't work.
Then I used
using (MemoryStream mem = new MemoryStream(Input))
instead and removed
mem.Write(Input, 0, (int)Input.Length);
It worked. I don't know why. Why did it work?
In your first approach, you use mem.Write(Input, 0, (int)Input.Length);. Note that MemoryStream.Write sets the stream read/write position behind the written data. In your example case this is equivalent with a position signifying the end of the stream. Trying to read from the MemoryStream again will not return any data, as the MemoryStream read/write position is at the end of the stream.
In your second approach, you passed the Input byte array as argument to the MemoryStream constructor. Providing the byte array through the constructor not only will make MemoryStream use this byte array, but more importantly it keeps the initial stream position of zero. Thus, when trying to read from the MemoryStream initialized in this way, the data contained in the input byte array will be returned as expected.
How to fix the problem with the first approach?
You can make the first approach with MemoryStream.Write working by simply setting the MemoryStream position back to the intended/original value (in your example case it would be zero) after writing the data to the MemoryStream:
byte[] Input;
using (MemoryStream mem = new MemoryStream())
{
mem.Write(Input, 0, (int)Input.Length);
mem.Position = 0;
using (StreamReader stream = new StreamReader(mem))
{
...
}
}
I'm trying to translate a function from ActionScript 3 into C# .NET.
What I have trouble is how to properly use ByteArrays in C#. In As3 there is a specific Class for it that already has most of the functionality i need, but in C# nothing of that sort seems to exist and I can't wrap my head around it.
This is the As3 function:
private function createBlock(type:uint, tag:uint,data:ByteArray):ByteArray
{
var ba:ByteArray = new ByteArray();
ba.endian = Endian.LITTLE_ENDIAN;
ba.writeUnsignedInt(data.length+16);
ba.writeUnsignedInt(0x00);
ba.writeUnsignedInt(type);
ba.writeUnsignedInt(tag);
data.position = 0;
ba.writeBytes(data);
ba.position = 0;
return ba;
}
But from what I gather, in C# I have to use a normal Array with the byte type, like this
byte[] ba = new byte[length];
Now, I looked into the Encoding Class, the BinaryWriter and BinaryFormatter class and researched if somebody made a Class for ByteArrays, but with no luck.
Can somebody nudge me in the right direction please?
You should be able to do this using a combination of MemoryStream and BinaryWriter:
public static byte[] CreateBlock(uint type, uint tag, byte[] data)
{
using (var memory = new MemoryStream())
{
// We want 'BinaryWriter' to leave 'memory' open, so we need to specify false for the third
// constructor parameter. That means we need to also specify the second parameter, the encoding.
// The default encoding is UTF8, so we specify that here.
var defaultEncoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier:false, throwOnInvalidBytes:true);
using (var writer = new BinaryWriter(memory, defaultEncoding, leaveOpen:true))
{
// There is no Endian - things are always little-endian.
writer.Write((uint)data.Length+16);
writer.Write((uint)0x00);
writer.Write(type);
writer.Write(data);
}
// Note that we must close or flush 'writer' before accessing 'memory', otherwise the bytes written
// to it may not have been transferred to 'memory'.
return memory.ToArray();
}
}
However, note that BinaryWriter always uses little-endian format. If you need to control this, you can use Jon Skeet's EndianBinaryWriter instead.
As an alternative to this approach, you could pass streams around instead of byte arrays (probably using a MemoryStream for implementation), but then you will need to be careful about lifetime management, i.e. who will close/dispose the stream when it's done with? (You might be able to get away with not bothering to close/dispose a memory stream since it uses no unmanaged resources, but that's not entirely satisfactory IMO.)
You want to have a byte stream and then extract the array from it:
using(MemoryStream memory = new MemoryStream())
using(BinaryWriter writer = new BinaryWriter(memory))
{
// write into stream
writer.Write((byte)0); // a byte
writer.Write(0f); // a float
writer.Write("hello"); // a string
return memory.ToArray(); // returns the underlying array
}
I'm having an issue with copying data from a MemoryStream into a Stream inside a ZipArchive. The following is NOT working - it returns only 114 bytes:
GetDataAsByteArray(IDataSource dataSource)
{
using (var zipStream = new MemoryStream())
{
using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Create, true))
{
var file = archive.CreateEntry("compressed.file");
using (var targetStream = file.Open())
{
using (var sourceStream = new MemoryStream())
{
await dataSource.LoadIntoStream(sourceStream);
sourceStream.CopyTo(targetStream);
}
}
}
var result = zipStream.ToArray();
zipStream.Close();
return result;
}
}
However, using the implementation below for the "copy"-process, all 1103 bytes are written to the array/memory stream:
await targetStream.WriteAsync(sourceStream.ToArray(), 0, (int) sourceStream.Length);
I'm wondering why the CopyTo yields less bytes. Also I'm feeling unsecure with the cast to Int32 in the second implementation.
FYI: Comparing the byte array: It looks like only the header and footer of the zip file were written by the first implementation.
Stream.CopyTo() starts copying from the stream's current Position. Which probably isn't 0 after that LoadIntoStream() call. Since it is a MemoryStream, you can simply fix it like this:
await dataSource.LoadIntoStream(sourceStream);
sourceStream.Position = 0;
sourceStream.CopyTo(targetStream);
Set sourceStream.Position = 0 before copying it. The copy will copy from the current position to the end of the stream.
As other have said the Position is probably no longer 0. You can't always set the Position back to 0 though, such as for Network and Compressed streams. You should check the stream.CanSeek property before doing any operations and if it is false then copy the stream to a new MemoryStream first (which can be seeked) and then after each operation which changes the position set the Position back to 0.
I have the following C# code which is supposed to serialize arbitrary objects to a string, and then of course deserialize it.
public static string Pack(Message _message)
{
BinaryFormatter formatter = new BinaryFormatter();
MemoryStream original = new MemoryStream();
MemoryStream outputStream = new MemoryStream();
formatter.Serialize(original, _message);
original.Seek(0, SeekOrigin.Begin);
DeflateStream deflateStream = new DeflateStream(outputStream, CompressionMode.Compress);
original.CopyTo(deflateStream);
byte[] bytearray = outputStream.ToArray();
UTF8Encoding encoder = new UTF8Encoding();
string packed = encoder.GetString(bytearray);
return packed;
}
public static Message Unpack(string _packed_message)
{
UTF8Encoding encoder = new UTF8Encoding();
byte[] bytearray = encoder.GetBytes(_packed_message);
BinaryFormatter formatter = new BinaryFormatter();
MemoryStream input = new MemoryStream(bytearray);
MemoryStream decompressed = new MemoryStream();
DeflateStream deflateStream = new DeflateStream(input, CompressionMode.Decompress);
deflateStream.CopyTo(decompressed); // EXCEPTION
decompressed.Seek(0, SeekOrigin.Begin);
var message = (Message)formatter.Deserialize(decompressed); // EXCEPTION 2
return message;
}
But the problem is that any time the code is ran, I am experiencing an exception. Using the above code and invoking it as shown below, I am receiving InvalidDataException: Unknown block type. Stream might be corrupted. at the marked // EXCEPTION line.
After searching for this issue I have attempted to ditch the deflation. This was only a small change: in Pack, bytearray gets created from original.ToArray() and in Unpack, I Seek() input instead of decompressed and use Deserialize(input) instead of decompressed too. The only result which changed: the exception position and body is different, yet it still happens. I receive a SerializationException: No map for object '201326592'. at // EXCEPTION 2.
I don't seem to see what is the problem. Maybe it is the whole serialization idea... the problem is that somehow managing to pack the Message instances is necessary because these objects hold the information that travel between the server and the client application. (Serialization logic is in a .Shared DLL project which is referenced on both ends, however, right now, I'm only developing the server-side first.) It also has to be told, that I am only using string outputs because right now, the TCP connection between the servers and clients are based on string read-write on the ends. So somehow it has to be brought down to the level of strings.
This is how the Message object looks like:
[Serializable]
public class Message
{
public MessageType type;
public Client from;
public Client to;
public string content;
}
(Client right now is an empty class only having the Serializable attribute, no properties or methods.)
This is how the pack-unpack gets invoked (from Main()...):
Shared.Message msg = Shared.MessageFactory.Build(Shared.MessageType.DEFAULT, new Shared.Client(), new Shared.Client(), "foobar");
string message1 = Shared.MessageFactory.Pack(msg);
Console.WriteLine(message1);
Shared.Message mess2 = Shared.MessageFactory.Unpack(message1); // Step into... here be exceptions
Console.Write(mess2.content);
Here is an image showing what happens in the IDE. The output in the console window is the value of message1.
Some investigation unfortunately also revealed that the problem could lie around the bytearray variable. When running Pack(), after the encoder creates the string, the array contains 152 values, however, after it gets decoded in Unpack(), the array has 160 values instead.
I am appreciating any help as I am really out of ideas and having this problem the progress is crippled. Thank you.
(Update) The final solution:
I would like to thank everyone answering and commenting, as I have reached the solution. Thank you.
Marc Gravell was right, I missed the closing of deflateStream and because of this, the result was either empty or corrupted. I have taken my time and rethought and rewrote the methods and now it works flawlessly. And even the purpose of sending these bytes over the networked stream is working too.
Also, as Eric J. suggested, I have switched to using ASCIIEnconding for the change between string and byte[] when the data is flowing in the Stream.
The fixed code lies below:
public static string Pack(Message _message)
{
using (MemoryStream input = new MemoryStream())
{
BinaryFormatter bformatter = new BinaryFormatter();
bformatter.Serialize(input, _message);
input.Seek(0, SeekOrigin.Begin);
using (MemoryStream output = new MemoryStream())
using (DeflateStream deflateStream = new DeflateStream(output, CompressionMode.Compress))
{
input.CopyTo(deflateStream);
deflateStream.Close();
return Convert.ToBase64String(output.ToArray());
}
}
}
public static Message Unpack(string _packed)
{
using (MemoryStream input = new MemoryStream(Convert.FromBase64String(_packed)))
using (DeflateStream deflateStream = new DeflateStream(input, CompressionMode.Decompress))
using (MemoryStream output = new MemoryStream())
{
deflateStream.CopyTo(output);
deflateStream.Close();
output.Seek(0, SeekOrigin.Begin);
BinaryFormatter bformatter = new BinaryFormatter();
Message message = (Message)bformatter.Deserialize(output);
return message;
}
}
Now everything happens just right, as the screenshot proves below. This was the expected output from the first place. The Server and Client executables communicate with each other and the message travels... and it gets serialized and unserialized properly.
In addition to the existing observations about Encoding vs base-64, note you haven't closed the deflate stream. This is important because compression-streams buffer: if you don't close, it may not write the end. For a short stream, that may mean it writes nothing at all.
using(DeflateStream deflateStream = new DeflateStream(
outputStream, CompressionMode.Compress))
{
original.CopyTo(deflateStream);
}
return Convert.ToBase64String(outputStream.GetBuffer(), 0,
(int)outputStream.Length);
Your problem is most probably in the UTF8 encoding. Your bytes are not really a character string and UTF-8 is a encoding with different byte lengths for characters.
This means the byte array may not correspond to a correctly encoded UTF-8 string (there may be some bytes missing at the end for instance.)
Try using UTF16 or ASCII which are constant length encodings (the resulting string will likely contain control characters so it won't be printable or transmitable through something like HTTP or email.)
But if you want to encode as a string it is customary to use UUEncoding to convert the byte array into a real printable string, then you can use any encoding you want.
When I run the following Main() code against your Pack() and Unpack():
static void Main(string[] args)
{
Message msg = new Message() { content = "The quick brown fox" };
string message1 = Pack(msg);
Console.WriteLine(message1);
Message mess2 = Unpack(message1); // Step into... here be exceptions
Console.Write(mess2.content);
}
I see that the bytearray
byte[] bytearray = outputStream.ToArray();
is empty.
I did modify your serialized class slightly since you did not post code for the included classes
public enum MessageType
{
DEFAULT = 0
}
[Serializable]
public class Message
{
public MessageType type;
public string from;
public string to;
public string content;
}
I suggest the following steps to resolve this:
Check the intermediate results along the way. Do you also see 0 bytes in the array? What is the string value returned by Pack()?
Dispose of your streams once you are done with them. The easiest way to do that is with the using keyword.
Edit
As Eli and Marc correctly pointed out, you cannot store arbitrary bytes in a UTF8 string. The mapping is not bijective (you can't go back and forth without loss/distortion of information). You will need a mapping that is bijective, such as the Convert.ToBase64String() approach Marc suggests.
I'm trying to serialize/deserialize string. Using the code:
private byte[] StrToBytes(string str)
{
BinaryFormatter bf = new BinaryFormatter();
MemoryStream ms = new MemoryStream();
bf.Serialize(ms, str);
ms.Seek(0, 0);
return ms.ToArray();
}
private string BytesToStr(byte[] bytes)
{
BinaryFormatter bfx = new BinaryFormatter();
MemoryStream msx = new MemoryStream();
msx.Write(bytes, 0, bytes.Length);
msx.Seek(0, 0);
return Convert.ToString(bfx.Deserialize(msx));
}
This two code works fine if I play with string variables.
But If I deserialize a string and save it to a file, after reading the back and serializing it again, I end up with only first portion of the string.
So I believe I have a problem with my file save/read operation. Here is the code for my save/read
private byte[] ReadWhole(string fileName)
{
try
{
using (BinaryReader br = new BinaryReader(new FileStream(fileName, FileMode.Open)))
{
return br.ReadBytes((int)br.BaseStream.Length);
}
}
catch (Exception)
{
return null;
}
}
private void WriteWhole(byte[] wrt,string fileName,bool append)
{
FileMode fm = FileMode.OpenOrCreate;
if (append)
fm = FileMode.Append;
using (BinaryWriter bw = new BinaryWriter(new FileStream(fileName, fm)))
{
bw.Write(wrt);
}
return;
}
Any help will be appreciated.
Many thanks
Sample Problematic Run:
WriteWhole(StrToBytes("First portion of text"),"filename",true);
WriteWhole(StrToBytes("Second portion of text"),"filename",true);
byte[] readBytes = ReadWhole("filename");
string deserializedStr = BytesToStr(readBytes); // here deserializeddStr becomes "First portion of text"
Just use
Encoding.UTF8.GetBytes(string s)
Encoding.UTF8.GetString(byte[] b)
and don't forget to add System.Text in your using statements
BTW, why do you need to serialize a string and save it that way?
You can just use File.WriteAllText() or File.WriteAllBytes. The same way you can read it back, File.ReadAllBytes() and File.ReadAllText()
The problem is that you are writing two strings to the file, but only reading one back.
If you want to read back multiple strings, then you must deserialize multiple strings. If there are always two strings, then you can just deserialize two strings. If you want to store any number of strings, then you must first store how many strings there are, so that you can control the deserialization process.
If you are trying to hide data (as indicated by your comment to another answer), then this is not a reliable way to accomplish that goal. On the other hand, if you are storing data an a user's hard-drive, and the user is running your program on their local machine, then there is no way to hide the data from them, so this is as good as anything else.