I have a bunch of data in a MemoryStream and I want to write all of it to a BinaryWriter.
Lucky for me all streams now have a Stream.CopyTo(Stream) method, and I can get the target stream from the BinaryWriter.BaseStream property.
public void WriteTo(BinaryWriter writer)
{
this.memoryStream.CopyTo(writer.BaseStream);
}
Can I safely bypass the writer like this and copy it directly to the base stream, or will this mess up the writer's buffer, position, or something else? Or, how would you propose copying a Stream to a BinaryWriter?
If you don't want to mess with the data in the initial stream, wouldn't something like this work:
var myStreamInitially = new MemoryStream();
var myStreamClone = new MemoryStream();
myStreamInitially.CopyTo(myStreamClone);
var binaryWriter = new BinaryWriter(myStreamClone);
If you're using .NET 4+ the CopyTo method is very handy.
UPDATE
Isn't this safer than writing directly to the BinaryWriter's underlying BaseStream:
void WriteToStreamInUnknownStatus(BinaryWriter binaryWriter)
{
var myStream = new MemoryStream();
try
{
binaryWriter.Write(myStream.ToArray());
}
catch
{ }
}
UPDATE 2
If you try this you get an exception: "Memory stream is not expandable."
static void Main(string[] args)
{
var binaryWrite = new BinaryWriter(new MemoryStream(new byte[] {1, 2, 3, 4}));
binaryWrite.Seek(3, SeekOrigin.Begin);
var position = binaryWrite.BaseStream.Position;
new MemoryStream(new byte[] {1, 2, 3, 4}).CopyTo(binaryWrite.BaseStream);
position = binaryWrite.BaseStream.Position;
}
So on top of having to be sure that the property is thread safe, you also need to know the type of the inner stream. Too risky, IMO.
What you propose is safe, provided:
That you're using this in a single-threaded context. Either there are no other threads, or you have an exclusive lock on the writer at the time you call this. ... AND
You call Flush on the writer before writing directly to the BaseStream. The writer could have some data buffered that it hasn't yet written to the stream.
So your modified code is:
public void WriteTo(BinaryWriter writer)
{
writer.Flush();
this.memoryStream.CopyTo(writer.BaseStream);
}
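A minimal sketch of the flush-then-copy pattern described above. The class name StreamCopyDemo and its Run method are hypothetical, introduced only for illustration. The sketch also rewinds the source first, on the assumption that you want the whole MemoryStream: CopyTo starts at the source's current Position, so a stream that has just been written to would otherwise copy nothing.

```csharp
using System;
using System.IO;

public static class StreamCopyDemo
{
    // Hypothetical helper: appends everything in 'source' to the stream behind 'writer'.
    public static void WriteTo(MemoryStream source, BinaryWriter writer)
    {
        writer.Flush();                    // push anything the writer still holds to BaseStream
        source.Position = 0;               // assumption: copy the whole stream, not just the tail
        source.CopyTo(writer.BaseStream);  // bypass the writer and copy the raw bytes
    }

    // Writes one byte through the writer, copies a stream, then writes another byte.
    public static byte[] Run()
    {
        var target = new MemoryStream();
        using (var writer = new BinaryWriter(target))
        {
            writer.Write((byte)0xAA);
            var source = new MemoryStream(new byte[] { 1, 2, 3 });
            WriteTo(source, writer);
            writer.Write((byte)0xBB);      // lands after the copied bytes
        }
        return target.ToArray();           // ToArray still works after the stream is closed
    }
}
```

Calling StreamCopyDemo.Run() should yield the writer's bytes and the copied bytes in the order they were written, which is what "safe" means here for a single-threaded caller.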
I have implemented a code block to convert a Stream into a byte array; the snippet is shown below. Unfortunately, it throws an OutOfMemoryException while converting the MemoryStream to an array (return newDocument.ToArray();). Could someone please help me with this?
public byte[] MergeToBytes()
{
using (var processor = new PdfDocumentProcessor())
{
AppendStreamsToDocumentProcessor(processor);
using (var newDocument = new MemoryStream())
{
processor.SaveDocument(newDocument);
return newDocument.ToArray();
}
}
}
public Stream MergeToStream()
{
return new MemoryStream(MergeToBytes());
}
Firstly: how big is the document? If it is too big for the byte[] limit, you're going to have to use a different approach.
However, a MemoryStream is already backed by an (oversized) array; you can get this simply using newDocument.TryGetBuffer(out var buffer), and noting that you must restrict yourself to the portion of the .Array indicated by .Offset (usually, but not always, zero) and .Count (the number of bytes that should be considered "live"). Note that TryGetBuffer can return false, but not in the new MemoryStream() scenario.
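As a rough sketch of what working with TryGetBuffer looks like, the hypothetical helper below (BufferDemo.SumLiveBytes is not from the original post) walks only the "live" part of the backing array, from Offset for Count bytes, without copying it as ToArray would.

```csharp
using System;
using System.IO;

public static class BufferDemo
{
    // Hypothetical helper: sums the bytes actually written to the stream,
    // reading them straight from the backing array instead of copying it.
    public static int SumLiveBytes(MemoryStream stream)
    {
        if (!stream.TryGetBuffer(out ArraySegment<byte> buffer))
            throw new InvalidOperationException("The stream does not expose its buffer.");

        int sum = 0;
        // buffer.Array is usually larger than the data; respect Offset and Count.
        for (int i = buffer.Offset; i < buffer.Offset + buffer.Count; i++)
            sum += buffer.Array[i];
        return sum;
    }
}
```

The backing array is shared with the stream, so this only makes sense while nothing else is writing to the stream.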
It is also interesting that you're converting a MemoryStream to a byte[] and then back to a MemoryStream. An alternative here would have been to just set the Position back to 0, i.e. rewind it. So:
public Stream MergeToStream()
{
using var processor = new PdfDocumentProcessor();
AppendStreamsToDocumentProcessor(processor);
var newDocument = new MemoryStream();
processor.SaveDocument(newDocument);
newDocument.Position = 0;
return newDocument;
}
var incomingStream = ...
var outgoingStream = ...
await incomingStream.CopyToAsync(outgoingStream);
The code above is simple enough: it copies an incoming stream to the outgoing stream. Both streams are chunked transfers coming and going over the internet.
Now, let's say I wanted to transform the stream with something like Func<Stream,Stream,Task>. How would I do that without reading all the data in?
Of course I could just do
var ms = new MemoryStream();
incomingStream.CopyTo(ms);
--- do transform of streams and seek
ms.CopyTo(outgoingStream)
but that would read the whole thing into ms. Is there anything built in that lets me read from the incoming stream and write to a new stream that doesn't buffer everything, but instead keeps only a small internal buffer and doesn't read from the incoming stream again until data has been pulled off it?
What I am trying to do is:
protected async Task XmlToJsonStream(Stream instream, Stream outStream)
{
XmlReaderSettings readerSettings = new XmlReaderSettings();
readerSettings.IgnoreWhitespace = false;
var reader = XmlReader.Create(instream, readerSettings);
var jsonWriter = new JsonTextWriter(new StreamWriter(outStream));
jsonWriter.WriteStartObject();
while (await reader.ReadAsync())
{
jsonWriter.writeReader(reader);
}
jsonWriter.WriteEndObject();
jsonWriter.Flush();
}
protected async Task XmlFilterStream(Stream instream, Stream outStream)
{
XmlReaderSettings readerSettings = new XmlReaderSettings();
readerSettings.IgnoreWhitespace = false;
var reader = XmlReader.Create(instream, readerSettings);
var writer = XmlWriter.Create(outStream, new XmlWriterSettings { Async = true, CloseOutput = false });
while (reader.Read())
{
writer.writeReader(reader);
}
}
but I don't know how to hook it up.
var incomingStream = ...
var outgoingStream = ...
var temp=...
XmlFilterStream(incomingStream,temp);
XmlToJsonStream(temp,outgoingStream);
because if I use a MemoryStream as temp, wouldn't it end up with everything stored in the stream? I'm looking for a stream that throws the data away again once it has been read.
All of the above is just example code, missing some disposes and seeks of course, but I hope I managed to illustrate what I am going for: to be able, based on settings, to plug and play between just copying the stream, doing XML filtering, and optionally transforming it to JSON.
Streams are sequences of bytes, so a stream transformation would be something like Func<ArraySegment<byte>, ArraySegment<byte>>. You can then apply it in a streaming way:
static async Task TransformAsync(this Stream source, Func<ArraySegment<byte>, ArraySegment<byte>> transform, Stream destination, int bufferSize = 1024)
{
var buffer = new byte[bufferSize];
while (true)
{
var bytesRead = await source.ReadAsync(buffer, 0, bufferSize);
if (bytesRead == 0)
return;
var bytesToWrite = transform(new ArraySegment<byte>(buffer, 0, bytesRead));
if (bytesToWrite.Count != 0)
await destination.WriteAsync(bytesToWrite.Array, bytesToWrite.Offset, bytesToWrite.Count);
}
}
It's a bit more complicated than that, but that's the general idea. It needs some logic to ensure WriteAsync writes all the bytes; and there's also usually a "flush" method that is required in addition to the transform method, which is called when the source stream finishes, so the transform algorithm has a last chance to return its final data to write to the output stream.
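One way the "flush" idea could look, as a sketch only: the extra flush parameter, the class name StreamTransformExtensions and the UpperAscii example transform are all assumptions added for illustration, not part of the answer above. The transform sees one chunk at a time, and flush is called once after the source is exhausted so the algorithm can emit any trailing bytes.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class StreamTransformExtensions
{
    // Streams 'source' through 'transform' chunk by chunk, then writes whatever 'flush' returns.
    public static async Task TransformAsync(
        this Stream source,
        Func<ArraySegment<byte>, ArraySegment<byte>> transform,
        Func<ArraySegment<byte>> flush,          // hypothetical "last chance" callback
        Stream destination,
        int bufferSize = 1024)
    {
        var buffer = new byte[bufferSize];
        int bytesRead;
        while ((bytesRead = await source.ReadAsync(buffer, 0, bufferSize)) != 0)
        {
            var chunk = transform(new ArraySegment<byte>(buffer, 0, bytesRead));
            if (chunk.Count != 0)
                await destination.WriteAsync(chunk.Array, chunk.Offset, chunk.Count);
        }

        // The source is finished: give the transform a chance to emit its final data.
        var tail = flush();
        if (tail.Count != 0)
            await destination.WriteAsync(tail.Array, tail.Offset, tail.Count);
    }

    // Example stateless transform: upper-cases ASCII letters in place and returns the same chunk.
    public static ArraySegment<byte> UpperAscii(ArraySegment<byte> chunk)
    {
        for (int i = chunk.Offset; i < chunk.Offset + chunk.Count; i++)
        {
            if (chunk.Array[i] >= (byte)'a' && chunk.Array[i] <= (byte)'z')
                chunk.Array[i] -= 32;
        }
        return chunk;
    }
}
```

Only one buffer of bufferSize bytes is held at a time, so memory use does not grow with the length of the input. A stateful transform (for example one that must see a full record before emitting anything) would keep its leftovers in a field and return them from flush.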
If you want streams of other things, like XML or JSON types, then you're probably better off going with Reactive Extensions.
I'm not sure I understand your question fully, but I think you're asking how you would operate on an input stream without loading it entirely into memory first.
In this case, you wouldn't want to do something like this:
var ms = new MemoryStream();
incomingStream.CopyTo(ms);
This does load the entire input stream incomingStream into memory -- into ms.
From what I can see, your XmlFilterStream method seems to be redundant, i.e. XmlToJsonStream does everything that XmlFilterStream does anyway.
Why not just have:
protected async Task XmlToJsonStream(Stream instream, Stream outStream)
{
XmlReaderSettings readerSettings = new XmlReaderSettings();
readerSettings.IgnoreWhitespace = false;
var reader = XmlReader.Create(instream, readerSettings);
var jsonWriter = new JsonTextWriter(new StreamWriter(outStream));
jsonWriter.WriteStartObject();
while (await reader.ReadAsync())
{
jsonWriter.writeReader(reader);
}
jsonWriter.WriteEndObject();
jsonWriter.Flush();
}
And call it like this:
var incomingStream = ...
var outgoingStream = ...
XmlToJsonStream(incomingStream, outgoingStream);
If the answer is that you have omitted some important details in XmlFilterStream then, without seeing those details, I would recommend that you just integrate those into the one XmlToJsonStream function.
I use the following snippet of code, and I'm unsure whether I need to call the Flush methods (once on StreamWriter, once on MemoryStream):
//converts an xsd object to the corresponding xml string, using the UTF8 encoding
public string Serialize(T t)
{
using (var memoryStream = new MemoryStream())
{
var encoding = new UTF8Encoding(false);
using (var writer = new StreamWriter(memoryStream, encoding))
{
var serializer = new XmlSerializer(typeof (T));
serializer.Serialize(writer, t);
writer.Flush();
}
memoryStream.Flush();
return encoding.GetString(memoryStream.ToArray());
}
}
First of all, because the code is inside the using block, I think the automatically called dispose method might do this for me. Is this true, or is flushing an entirely different concept?
According to stackoverflow itself:
Flush meaning clears all buffers for a stream and causes any buffered data to be written to the underlying device.
What does that mean in the context of the code above?
Secondly, the Flush method of the MemoryStream does nothing according to the API, so what's up with that? Why do we call a method that does nothing?
You don't need to use Flush on the StreamWriter, as you are disposing it (by having it in a using block). When it's disposed, it's automatically flushed and closed.
You don't need to use Flush on the MemoryStream, as it's not buffering anything that is written to any other source. There is simply nothing to flush anywhere.
The Flush method is only present in the MemoryStream object because it inherits from the Stream class. You can see in the source code for the MemoryStream class that the flush method actually does nothing.
In general Streams will buffer data as it's written (periodically flushing the buffer to the associated device if there is one) because writing to a device, usually a file, is expensive. A MemoryStream writes to RAM so the whole concept of buffering and flushing is redundant. The data is always in RAM already.
And yes, disposing the stream will cause it to be flushed.
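To make the buffering visible, here is a small sketch (FlushDemo and Measure are hypothetical names) that records the MemoryStream's length while the StreamWriter is still open and again after the using block has disposed it. The text is expected to sit in the writer's own buffer until the dispose, which is the flush the answers above refer to.

```csharp
using System;
using System.IO;
using System.Text;

public static class FlushDemo
{
    // Returns the stream length before and after the StreamWriter is disposed.
    public static (long BeforeDispose, long AfterDispose) Measure()
    {
        var memoryStream = new MemoryStream();
        long before;
        using (var writer = new StreamWriter(memoryStream, new UTF8Encoding(false)))
        {
            writer.Write("hello");
            // No Flush yet: the characters are expected to still be in the writer's buffer.
            before = memoryStream.Length;
        }
        // Dispose flushed the writer (and closed the stream); ToArray still works.
        long after = memoryStream.ToArray().Length;
        return (before, after);
    }
}
```

The MemoryStream itself never needed a Flush at any point; only the StreamWriter sitting on top of it holds unwritten data.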
With the Flush calls commented out, this returns an empty byte[], even though I am using using:
byte[] filecontent = null;
using var ms = new MemoryStream();
using var sw = new StreamWriter(ms);
sw.WriteCSVLine(new[] { "A", "B" });//This is extension to write as CSV
//sw.Flush();
//ms.Flush();
ms.Position = 0;
filecontent = ms.ToArray();
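A likely explanation, sketched below under the assumption that the snippet uses C# 8 using declarations: a using declaration disposes its variable only when the enclosing scope ends, so the StreamWriter has not been disposed (and therefore not flushed) at the point where ToArray is called. CsvDemo is a hypothetical class, and a plain WriteLine stands in for the WriteCSVLine extension, which is not shown in the post.

```csharp
using System;
using System.IO;

public static class CsvDemo
{
    // using declarations: the writer lives until the method returns, so flush explicitly.
    public static byte[] BuildWithDeclaration()
    {
        using var ms = new MemoryStream();
        using var sw = new StreamWriter(ms);
        sw.WriteLine("A,B");     // stand-in for the WriteCSVLine extension
        sw.Flush();              // without this, the text is still in the writer's buffer
        return ms.ToArray();
    }

    // using block: the writer is disposed (and flushed) at the closing brace.
    public static byte[] BuildWithBlock()
    {
        var ms = new MemoryStream();
        using (var sw = new StreamWriter(ms))
        {
            sw.WriteLine("A,B");
        }
        return ms.ToArray();     // safe on the now-closed MemoryStream
    }
}
```

Either form works; the difference is only where the dispose happens relative to the ToArray call. Setting Position has no effect on ToArray, which always returns the whole contents.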
I'm calling a library method that writes to a stream. But I want to write to a string. Is this possible? (I do not control the source code of the method I'm calling and so changing that is not an option.)
Experimenting, I tried something like this:
iCalendarSerializer serializer = new iCalendarSerializer();
MemoryStream stream = new MemoryStream();
serializer.Serialize(new iCalendar(), stream, System.Text.Encoding.UTF8);
byte[] buff = new byte[stream.Length];
stream.Read(buff, 0, (int)stream.Length);
But I get an error on the last line that's something about not being able to access a closed stream. Apparently, the Serialize() method closes the stream when it's done.
Are there other options?
How about byte[] buff = stream.ToArray()?
ToArray is one of the 2 correct ways of getting the data out of a memory stream (the other is GetBuffer combined with Length). It looks like you just want a byte array sized to the data in the stream, and ToArray does exactly that.
Note that ToArray and GetBuffer are by design safe to call on a disposed stream, so you can safely wrap using(stream) around the code that writes data to the stream.
In your case the stream looks to be disposed by the serialization code (.Serialize).
iCalendarSerializer serializer = new iCalendarSerializer();
MemoryStream stream = new MemoryStream();
using(stream)
{
serializer.Serialize(new iCalendar(), stream, System.Text.Encoding.UTF8);
}
byte[] buff = stream.ToArray();
In your example you need to change the position of the stream before read takes place:
stream.Position = 0;
stream.Read(buff, 0, (int)stream.Length);
To read the stream into a string you can use the StreamReader.ReadToEnd() method:
var reader = new StreamReader(stream);
var text = reader.ReadToEnd();
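Since the library method here closes the stream, reading from it afterwards (with Read or a StreamReader) would fail, while ToArray still works. A minimal sketch of getting a string that way follows; StringFromStream, Capture and WriteAndClose are hypothetical names, and WriteAndClose merely imitates a library call that writes UTF-8 text and then closes the stream.

```csharp
using System;
using System.IO;
using System.Text;

public static class StringFromStream
{
    // Stand-in for a library method that writes to the stream and closes it when done.
    public static void WriteAndClose(Stream stream)
    {
        using (var writer = new StreamWriter(stream, new UTF8Encoding(false)))
        {
            writer.Write("BEGIN:VCALENDAR");
        }
    }

    // Runs 'write' against a MemoryStream and decodes whatever ended up in it.
    public static string Capture(Action<Stream> write)
    {
        var stream = new MemoryStream();
        write(stream);
        // ToArray works even if 'write' closed the stream.
        return Encoding.UTF8.GetString(stream.ToArray());
    }
}
```

If the library writes a byte order mark (as it may when handed Encoding.UTF8), the decoded string will start with U+FEFF, which you may want to trim.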
I've tried this code:
byte[] someData = new byte[] { 1, 2, 3, 4 };
MemoryStream stream = new MemoryStream(someData, 1, someData.Length - 1, true);
using (BinaryWriter writer = new BinaryWriter(stream))
{
writer.Write(1);
}
stream.Dispose();
Every time it's run, a NotSupportedException is thrown, telling me that the stream cannot be written to. Why is this the case? The last parameter of the initialization shown in line 2 clearly is true, so I should be able to write to the stream.
It works if I don't specify the start index and count.
Why does this happen?
Always (almost always) create a memory stream without parameters in the constructor:
using (MemoryStream stream = new MemoryStream())
{
using (BinaryWriter writer = new BinaryWriter(stream))
{
writer.Write(1);
}
stream.Flush();
byte[] bytes = stream.ToArray(); // GetBuffer() would return the whole internal buffer, including unused capacity
//use it
}
This code works fine
From MSDN:
Initializes a new non-resizable instance of the MemoryStream class
based on the specified region of a byte array, with the CanWrite
property set as specified.
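The stream covers only a 3-byte region of the array (someData.Length - 1), but writer.Write(1) writes a 4-byte Int32. The stream would have to be resized to hold it, and this is not allowed: you can only write to the already allocated bytes of the stream. That is also why it works without the start index and count, because the stream then covers all 4 bytes.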
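The size limit can be checked with a small sketch. FixedRegionDemo.TryWriteInt32 is a hypothetical helper that wraps the last count bytes of a 4-byte array in a writable, non-resizable MemoryStream and reports whether a 4-byte Int32 fits.

```csharp
using System;
using System.IO;

public static class FixedRegionDemo
{
    // Returns true if an Int32 could be written into a region of 'count' bytes.
    public static bool TryWriteInt32(int count)
    {
        byte[] data = { 1, 2, 3, 4 };
        // Writable, but fixed to the given region of the array.
        var stream = new MemoryStream(data, data.Length - count, count, true);
        try
        {
            using (var writer = new BinaryWriter(stream))
            {
                writer.Write(1);   // an int literal, so 4 bytes are written
            }
            return true;
        }
        catch (NotSupportedException)
        {
            // The write would have had to grow the stream past its fixed region.
            return false;
        }
    }
}
```

With count set to 3 (the case in the question) the write is rejected, and with 4 it succeeds; writing a single byte with writer.Write((byte)1) would fit in either case.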