Unexpected MemoryStream behavior - c#

I am reading and writing to the same MemoryStream.
Something like this (possible compilation mistakes):
MemoryStream stream = new MemoryStream();
byte[] data = Encoding.ASCII.GetBytes("1234");
stream.Write(data, 0, 4);
stream.Position -= 4;
byte[] buffer = new byte[4];
stream.Read(buffer, 0, 4);
Why do I HAVE to move Position? Why aren't there separate positions for reading and writing?
Is there any other Stream that can be used?

Because that's how streams are supposed to work. You have one position, see it as a cursor, set at a point in the stream at which you can read or write. Reading and writing both advance this position.
If you're merely using a MemoryStream to exchange data between callers, as a pseudo IPC mechanism, then perhaps some better way exists to do so.
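As a minimal sketch of the single-cursor behaviour described above (the class and method names are made up for illustration), the following writes some bytes, rewinds the shared position, and reads them back:

```csharp
using System;
using System.IO;
using System.Text;

public static class SharedPositionDemo
{
    // Writes text to a MemoryStream, rewinds, and reads it back.
    public static string WriteThenRead(string text)
    {
        byte[] data = Encoding.ASCII.GetBytes(text);
        using (var stream = new MemoryStream())
        {
            stream.Write(data, 0, data.Length);

            // The one and only cursor now sits just past the written bytes,
            // so reading here would find nothing left to read.
            stream.Position = 0; // rewind before reading

            var buffer = new byte[data.Length];
            int read = stream.Read(buffer, 0, buffer.Length);
            return Encoding.ASCII.GetString(buffer, 0, read);
        }
    }
}
```

Calling SharedPositionDemo.WriteThenRead("1234") should hand back the same four characters; without the rewind, Read would return 0 bytes.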

Related

Guidelines for designing a robust file format writer?

Suppose you want to write a .WAV file format writer like so:
using var stream = File.Create("test.wav");
using var writer = new WavWriter(stream, ... /* .WAV format parameters */);
// write the file
// writer.Dispose() does a few things:
// - writes user-added chunks
// - updates the file header (chunk size) so the file is valid
There is a conceptual problem in doing so:
the user can change the stream position and therefore break the writing process.
You may suggest the following:
- the writer should own the stream: this would work when writing to a file, but not to an arbitrary stream
- the writer should own its own memory stream so it can write to streams too: okay, but there are memory concerns
I guess you get the point...
To me, the only viable thing would be to document that aspect but I may have missed something, hence the question.
Question:
How can a file format writer write to a stream yet guard against possible changes to the stream's position?
My suggestion would be to keep an internal position field in the WavWriter. Each time you do some operation you can check that this matches the position in the backing stream and throw an exception if it does not. Update this value at the end of each write operation.
Ideally you should also handle streams that do not support seeking, but it does not sound like your design would permit that anyway. It might be a good idea to check CanSeek in the constructor and throw if seeking is not supported. It is in general a good idea to validate any arguments before use.
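A rough sketch of that suggestion might look like the following. GuardedWriter is a hypothetical stand-in for the WavWriter from the question (its name and members are assumptions), reduced to the position check and the CanSeek validation:

```csharp
using System;
using System.IO;

// Hypothetical helper: only illustrates the position guard, not WAV writing.
public sealed class GuardedWriter
{
    private readonly Stream _stream;
    private long _expectedPosition; // where the writer believes the cursor is

    public GuardedWriter(Stream stream)
    {
        if (stream == null) throw new ArgumentNullException(nameof(stream));
        if (!stream.CanSeek) throw new ArgumentException("Stream must support seeking.", nameof(stream));
        if (!stream.CanWrite) throw new ArgumentException("Stream must be writable.", nameof(stream));
        _stream = stream;
        _expectedPosition = stream.Position;
    }

    public void Write(byte[] data)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));

        // Fail fast if someone moved the cursor behind the writer's back.
        if (_stream.Position != _expectedPosition)
            throw new InvalidOperationException("The stream position was changed outside the writer.");

        _stream.Write(data, 0, data.Length);
        _expectedPosition = _stream.Position; // remember where we ended up
    }
}
```

This only detects tampering; it does not prevent it. Whether to throw or to silently seek back to the expected position is a design choice the answer leaves open.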

C# serializing/deserializing with memory stream System.OutOfMemoryException

I am trying to serialize/de-serialize a stream of around 50MB xml data with the following code and I get System.OutOfMemoryException exception.
var formatter = new BinaryFormatter();
var stream = new MemoryStream();
using (stream)
{
formatter.Serialize(stream, source);
stream.Seek(0, SeekOrigin.Begin);
return (T) formatter.Deserialize(stream);
}
I debugged the code and it throws OutOfMemoryException exception on the formatter.Serialize(stream,source) line.
I did some searching and it says that the limit is 2GB. How can I debug this to find out the reason, or is there a more efficient way of writing this code? Or is there any tool to watch the memory usage?
Thanks,
Dealing with 2GB of xml is never going to be efficient. However, to make it work, you could try writing to a FileStream instead of a MemoryStream, since a MemoryStream has a 2GB limit. Alternatively, you could write your own Stream implementation using multiple buffers rather than a single large buffer.
However, I strongly suggest that what you actually want to do here is some combination of:
use a different serialization format (xml is a poor choice for large data)
don't require it all in one blob
have less data
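To illustrate the FileStream suggestion, here is a small sketch that round-trips an object through a temporary file instead of a MemoryStream. Note the assumptions: XmlSerializer stands in for BinaryFormatter (a substitution of mine, not the poster's code), and the class and method names are invented for the example.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public static class FileBackedClone
{
    // Serializes source to a temporary file, rewinds, and deserializes it again.
    public static T RoundTrip<T>(T source)
    {
        var serializer = new XmlSerializer(typeof(T));
        string path = Path.GetTempFileName();
        try
        {
            using (var stream = new FileStream(path, FileMode.Create, FileAccess.ReadWrite))
            {
                serializer.Serialize(stream, source);
                stream.Seek(0, SeekOrigin.Begin); // same rewind as in the question
                return (T)serializer.Deserialize(stream);
            }
        }
        finally
        {
            File.Delete(path); // clean up the temporary file
        }
    }
}
```

The data lives on disk rather than in one large in-memory buffer, at the cost of file I/O. XmlSerializer needs public types with a parameterless constructor, so this is not a drop-in replacement for every object graph.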

Is there a way to make this faster? MemoryStream vs FileStream

I am working with iTextSharp, and need to generate hundreds of thousands of RTF documents - the resulting files are between 5KB and 500KB.
I am listing 2 approaches below. The original approach wasn't necessarily slow, but I figured: why write to a file and read it back just to get the output string I need? I saw this other approach using MemoryStream, but it actually slowed things down. I essentially just need the outputted RTF content, so that I can run some filters on that RTF to clean up unnecessary formatting. The queries bringing back the data are very quick, seemingly instant. Generating 1,000 files (actually 2,000 files are created in the process) takes about 15 minutes with the original approach; the same with the second approach takes about 25-30 minutes. The resulting files that I've run average around 80KB.
Is there something wrong with the second approach? Seems like it should be faster than the first one, not slower.
Original approach:
RtfWriter2.GetInstance(doc, new FileStream(RTFFilePathName, FileMode.Create));
doc.Open();
//Add Tables and stuff here
doc.Close(); //It saves a file here to (RTFPathFileName)
StreamReader srRTF = new StreamReader(RTFFilePathName);
string rtfText = srRTF.ReadToEnd();
srRTF.Close();
//Do additional things with rtfText before writing to my final file
New approach, trying to speed it up but this is actually half as fast:
MemoryStream stream = new MemoryStream();
RtfWriter2.GetInstance(doc, stream);
doc.Open();
//Add Tables and stuff here
doc.Close();
string rtfText = ASCIIEncoding.ASCII.GetString(stream.GetBuffer());
stream.Close();
//Do additional things with rtfText before writing to my final file
The second approach I am trying I found here:
iTextSharp - How to generate a RTF document in the ClipBoard instead of a file
How big is your resulting stream? MemoryStream performs a lot of memory copy operations while growing, so for large results it may take significantly longer to write data in small chunks compared with FileStream.
To verify whether this is the problem, set the initial size of the MemoryStream to some large value around the resulting size and re-run the code.
To fix it you can pre-grow the memory stream initially (if you know the approximate output size) or write your own stream that uses a different scheme when growing. Also, using a temporary file might be good enough for your purposes as is.
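A quick way to try the pre-sizing idea is the MemoryStream constructor that takes an initial capacity. The helper below is only a sketch (its name and the 256-byte chunk size are arbitrary choices); it writes a given number of bytes in small chunks and reports the capacity the stream ended up with:

```csharp
using System;
using System.IO;

public static class PresizedStreamDemo
{
    // Writes totalBytes in small chunks and returns the stream's final capacity.
    public static int FillInChunks(int initialCapacity, int totalBytes)
    {
        var chunk = new byte[256];
        using (var stream = new MemoryStream(initialCapacity))
        {
            int written = 0;
            while (written < totalBytes)
            {
                int count = Math.Min(chunk.Length, totalBytes - written);
                stream.Write(chunk, 0, count);
                written += count;
            }
            // If the initial capacity was large enough, no regrowth happened
            // and the capacity is unchanged.
            return stream.Capacity;
        }
    }
}
```

If FillInChunks(100000, 80000) still returns 100000, the buffer never had to be reallocated; starting from a capacity of 0 forces the stream to grow along the way. Whether that growth explains the slowdown in the question would still need to be measured.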
Like Alexei said, it's probably caused by the fact that you are creating a MemoryStream every time, and every time it continuously reallocates memory as it grows. Try creating only one stream and resetting it to the beginning before every write.
Also, stream.GetBuffer() returns the stream's whole underlying buffer, not just the bytes you wrote, so try using a StreamReader over your MemoryStream instead.
It also seems your code can be easily parallelised, so you can try running it using Parallel Extensions or the ThreadPool.
And it seems a little weird that you are writing your text as bytes to a stream, then reading this stream as bytes and converting it to text. Wouldn't it be possible to save your document directly as text?
A MemoryStream is not associated with a file, and has no concept of a filename. Basically, you can't do that.
You certainly can't cast between them; you can only cast upwards and downwards, not sideways. To visualise:
        Stream
          |
    -------------
    |           |
FileStream  MemoryStream
You can cast a MemoryStream to a Stream trivially, and a Stream to a MemoryStream via a type-check; but never a FileStream to a MemoryStream. That is like saying a dog is an animal, and an elephant is an animal, so we can cast a dog to an elephant.
You could subclass MemoryStream and add a Name property (that you supply a value for), but there would still be no commonality between a FileStream and a YourCustomMemoryStream, and FileStream doesn't implement a pre-existing interface to get a Name; so the caller would have to explicitly handle both separately, or use duck-typing (maybe via dynamic or reflection).
Another option (perhaps easier) might be: write your data to a temporary file; use a FileStream from there; then (later) delete the file.
I know this is old but there is a lot of misinformation in this thread.
It's all about buffer size. The internal buffers are significantly smaller with a memory stream than with a file stream. Smaller buffers cause more reads/writes.
Just initialize your memory stream with either a file stream or a byte array with a size of around 80k. Close the doc, set the stream position to 0 and read the contents to the end.
On a side note, GetBuffer will return the whole allocated buffer. So if you only wrote 1 byte and the buffer is 4k, you will have a lot of garbage in your string.
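The GetBuffer pitfall can be shown with a small sketch (the helper names are invented for the example). Decoding the whole backing array picks up the unused tail, while limiting the decode to Length bytes returns only what was written:

```csharp
using System;
using System.IO;
using System.Text;

public static class BufferDecodeDemo
{
    // Decodes only the bytes actually written to the stream.
    public static string DecodeWritten(MemoryStream stream)
    {
        return Encoding.ASCII.GetString(stream.GetBuffer(), 0, (int)stream.Length);
    }

    // Decodes the entire backing array, including the unused tail.
    public static string DecodeWholeBuffer(MemoryStream stream)
    {
        return Encoding.ASCII.GetString(stream.GetBuffer());
    }
}
```

For a stream created with new MemoryStream(16) that has had three bytes written to it, DecodeWritten returns three characters, whereas DecodeWholeBuffer returns sixteen, the last thirteen being NUL characters. stream.ToArray() is another option; it copies just the written bytes.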

C# How to write one byte at an offset?

I'm trying to write a single byte at a certain location in a file. This is what I'm using at the moment:
BinaryWriter bw = new BinaryWriter(File.Open(filename, FileMode.Open));
bw.BaseStream.Seek(0x6354C, SeekOrigin.Begin);
bw.Write(0xB0);
bw.Close();
The problem is that BinaryWriter.Write(args) writes a four-byte signed integer at the position. I wish to write only one byte at that particular location, and then later possibly two bytes elsewhere. How do I specify how many bytes to write?
change
bw.Write(0xB0);
to
bw.Write((byte)0xB0);
There is absolutely no need to use a high-level BinaryWriter just to write a simple byte to a stream - It's more efficient and tidy just to do this:
using (Stream outStream = File.Open(filename, FileMode.Open))
{
    outStream.Seek(0x6354C, SeekOrigin.Begin);
    outStream.WriteByte(0xb0);
}
(In general you also shouldn't really Seek after attaching a BinaryWriter to your stream - the BinaryWriter should be in control of the stream, and changing things "behind its back" is a bit dirty)
You could cast to byte:
bw.Write((byte)0xB0);
This should cause the correct overloaded version of Write to be invoked.
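To see the overload selection at work, the sketch below (class and method names are illustrative) writes the same literal through different BinaryWriter.Write overloads into a MemoryStream and reports how many bytes each call produced:

```csharp
using System;
using System.IO;

public static class WriteOverloadDemo
{
    // An int literal binds to Write(int): four bytes.
    public static long BytesWrittenAsInt()
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms))
        {
            bw.Write(0xB0);
            bw.Flush();
            return ms.Length;
        }
    }

    // The cast selects Write(byte): one byte.
    public static long BytesWrittenAsByte()
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms))
        {
            bw.Write((byte)0xB0);
            bw.Flush();
            return ms.Length;
        }
    }

    // Casting to ushort selects Write(ushort): two bytes.
    public static long BytesWrittenAsUShort()
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms))
        {
            bw.Write((ushort)0xB0);
            bw.Flush();
            return ms.Length;
        }
    }
}
```

So the number of bytes written follows from the static type of the argument: cast to byte for one byte, to short or ushort for two, or pass a byte[] for an arbitrary count.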

weird behaviour of seek C#

I am having some difficulties with streams. I am using FileStream and BinaryReader and I am seeing some weird behaviour. First of all (and this was in another question), when I used StreamReader, calling Peek changed the position, so I switched to BinaryReader, which was fine. Now I have a problem with Seek (called, of course, on the underlying base stream, the FileStream): sometimes it works fine and gets to the right position, but sometimes it jumps to a position that is way beyond the file's length. It doesn't happen all the time. For instance, I had a problem getting to position 1233*267, but a day later that was fine and the problem was at another place.
FileStream m_fsReader = new FileStream(m_strDataFileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
BinaryReader m_brReader = new BinaryReader(m_fsReader);
and the seek part:
m_fsReader.Seek(offset, SeekOrigin.Begin);
Thanks,
I've noticed that every Stream keeps its own position. When a Stream is constructed from another stream, the position is initially the same; but if the second stream seeks, it doesn't synchronize its base stream's position.
Try watching the Position property of both streams after read and seek operations. You will see discrepancies between the operation and the base stream's Position value.
I solved this problem by calling Seek on the base stream myself after the work done by a substream.
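One place where such a discrepancy is easy to observe is a buffering reader such as StreamReader, which the question mentions. The sketch below is my own illustration, not the poster's code: it consumes a single character and then reports where the underlying stream's cursor is.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReaderBufferingDemo
{
    // Reads one character through a StreamReader and returns the
    // position of the underlying stream afterwards.
    public static long BasePositionAfterOneChar()
    {
        byte[] bytes = Encoding.ASCII.GetBytes("0123456789");
        using (var ms = new MemoryStream(bytes))
        using (var reader = new StreamReader(ms, Encoding.ASCII))
        {
            reader.Read(); // consume a single character
            // The reader filled its internal buffer from the stream, so the
            // base position is ahead of the one character consumed.
            return ms.Position;
        }
    }
}
```

With this ten-byte input the reader pulls in the whole stream on the first read, so the base stream's position is at the end even though only one character has been consumed. Seeking the base stream without telling the reader (for StreamReader, via DiscardBufferedData) leaves the two out of step.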
It is difficult to say, but I'm fairly sure that if it works one day and not the next, the file has probably been changed.
Regarding the Seek method, it allows you to seek to any location beyond the length of the stream.
From MSDN:
You can seek to any location beyond the length of the stream. When you seek beyond the length of the file, the file size grows.
http://msdn.microsoft.com/en-us/library/system.io.filestream.seek.aspx
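To check how seeking past the end behaves on a given runtime, a small experiment along these lines may help. The names are made up, and the sketch only reports values rather than asserting what the file length is immediately after the seek:

```csharp
using System;
using System.IO;

public static class SeekPastEndDemo
{
    // Returns { position after seek, length after seek, length after a one-byte write }.
    public static long[] Run()
    {
        string path = Path.GetTempFileName(); // creates an empty temporary file
        try
        {
            using (var fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
            {
                fs.Seek(1000, SeekOrigin.Begin); // well past the end of the empty file
                long positionAfterSeek = fs.Position;
                long lengthAfterSeek = fs.Length;

                fs.WriteByte(0xFF); // writing at the new position extends the file
                fs.Flush();
                long lengthAfterWrite = fs.Length;

                return new[] { positionAfterSeek, lengthAfterSeek, lengthAfterWrite };
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

The seek itself succeeds without an exception, which is why a wrong offset shows up as a position beyond the file's length rather than as an error. After the single-byte write at offset 1000 the file is 1001 bytes long.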
