Copy between two streams in .net 2.0 - c#

I have been using the following code to Compress data in .Net 4.0:
public static byte[] CompressData(byte[] data_toCompress)
{
using (MemoryStream outFile = new MemoryStream())
{
using (MemoryStream inFile = new MemoryStream(data_toCompress))
using (GZipStream Compress = new GZipStream(outFile, CompressionMode.Compress))
{
inFile.CopyTo(Compress);
}
return outFile.ToArray();
}
}
However, the Stream.CopyTo method is not available in .Net 2.0, so I tried writing a replacement:
public static byte[] CompressData(byte[] data_toCompress)
{
using (MemoryStream outFile = new MemoryStream())
{
using (MemoryStream inFile = new MemoryStream(data_toCompress))
using (GZipStream Compress = new GZipStream(outFile, CompressionMode.Compress))
{
//inFile.CopyTo(Compress);
Compress.Write(inFile.GetBuffer(), (int)inFile.Position, (int)(inFile.Length - inFile.Position));
}
return outFile.ToArray();
}
}
The compression fails, though, when using the above attempt - I get an error saying:
MemoryStream's internal buffer cannot be accessed.
Could anyone offer any help on this issue? I'm really not sure what else to do here.
Thank you,
Evan

This is the code straight out of .Net 4.0 Stream.CopyTo method (bufferSize is 4096):
byte[] buffer = new byte[bufferSize];
int count;
while ((count = this.Read(buffer, 0, buffer.Length)) != 0)
destination.Write(buffer, 0, count);

Since you have access to the array already, why don't you do this:
using (MemoryStream outFile = new MemoryStream())
{
using (GZipStream Compress = new GZipStream(outFile, CompressionMode.Compress))
{
Compress.Write(data_toCompress, 0, data_toCompress.Length);
}
return outFile.ToArray();
}
In your sample code, inFile.GetBuffer() most likely throws because the stream was not created with the right constructor. Not every MemoryStream instance gives you access to its internal buffer; the documentation describes the constructor overload that does:
Initializes a new instance of the MemoryStream class based on the
specified region of a byte array, with the CanWrite property set as
specified, and the ability to call GetBuffer set as specified.
This should work - but is not needed anyway in the suggested solution:
using (MemoryStream inFile = new MemoryStream(data_toCompress,
0,
data_toCompress.Length,
false,
true))
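To make the constructor difference concrete, here is a minimal sketch that probes whether GetBuffer is allowed on a given MemoryStream. The class and method names (GetBufferDemo, CanGetBuffer) are made up for illustration; only the MemoryStream constructors and GetBuffer come from the framework.

```csharp
using System;
using System.IO;

public static class GetBufferDemo
{
    // Hypothetical helper: returns true if GetBuffer() is allowed on the stream.
    public static bool CanGetBuffer(MemoryStream stream)
    {
        try
        {
            stream.GetBuffer();
            return true;
        }
        catch (UnauthorizedAccessException)
        {
            // Thrown when the stream was created without a publicly visible buffer.
            return false;
        }
    }
}
```

With a byte[] named data, CanGetBuffer(new MemoryStream(data)) returns false, while CanGetBuffer(new MemoryStream(data, 0, data.Length, false, true)) and CanGetBuffer(new MemoryStream()) both return true.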

Why are you constructing a memory stream with an array and then trying to pull the array back out of the memory stream?
You could just do Compress.Write(data_toCompress, 0, data_toCompress.Length);
If you need to replace the functionality of CopyTo, you can create a buffer array of some length, read data from the source stream and write that data to the destination stream.

You can try
inFile.WriteTo(Compress);
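As a rough sketch of how that suggestion fits into the original method, assuming the input really is a MemoryStream (WriteTo is defined on MemoryStream, not on Stream). The class name GzipHelper and the renamed parameter are placeholders:

```csharp
using System.IO;
using System.IO.Compression;

public static class GzipHelper
{
    public static byte[] CompressData(byte[] dataToCompress)
    {
        using (MemoryStream outFile = new MemoryStream())
        {
            using (MemoryStream inFile = new MemoryStream(dataToCompress))
            using (GZipStream compress = new GZipStream(outFile, CompressionMode.Compress))
            {
                // WriteTo copies the entire contents of the MemoryStream,
                // regardless of its current Position.
                inFile.WriteTo(compress);
            }
            // The GZipStream has been disposed here, so all compressed data is flushed.
            // ToArray still works on a closed MemoryStream.
            return outFile.ToArray();
        }
    }
}
```

This avoids both CopyTo and GetBuffer, so it should compile against .NET 2.0.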

Try replacing the line:
Compress.Write(inFile.GetBuffer(), (int)inFile.Position, (int)(inFile.Length - inFile.Position));
with:
Compress.Write(data_toCompress, 0, data_toCompress.Length);
You can then get rid of this line completely:
using (MemoryStream inFile = new MemoryStream(data_toCompress))
Edit: find an example here: Why does gzip/deflate compressing a small file result in many trailing zeroes?

You should manually read and write between these 2 streams:
private static void CopyStream(Stream from, Stream to)
{
int bufSize = 1024, count;
byte[] buffer = new byte[bufSize];
count = from.Read(buffer, 0, bufSize);
while (count > 0)
{
to.Write(buffer, 0, count);
count = from.Read(buffer, 0, bufSize);
}
}

The open-source NuGet package Stream.CopyTo implements Stream.CopyTo for all versions of the .NET Framework.
Available on GitHub and via NuGet (Install-Package Stream.CopyTo)

Related

Download a file from System.Io.Stream class C# asp.net [duplicate]

I have a StreamReader object that I initialized with a stream, now I want to save this stream to disk (the stream may be a .gif or .jpg or .pdf).
Existing Code:
StreamReader sr = new StreamReader(myOtherObject.InputStream);
I need to save this to disk (I have the filename).
In the future I may want to store this to SQL Server.
I have the encoding type also, which I will need if I store it to SQL Server, correct?
As highlighted by Tilendor in Jon Skeet's answer, streams have a CopyTo method since .NET 4.
var fileStream = File.Create("C:\\Path\\To\\File");
myOtherObject.InputStream.Seek(0, SeekOrigin.Begin);
myOtherObject.InputStream.CopyTo(fileStream);
fileStream.Close();
Or with the using syntax:
using (var fileStream = File.Create("C:\\Path\\To\\File"))
{
myOtherObject.InputStream.Seek(0, SeekOrigin.Begin);
myOtherObject.InputStream.CopyTo(fileStream);
}
You have to call Seek if you're not already at the beginning or you won't copy the entire stream.
You must not use StreamReader for binary files (like gifs or jpgs). StreamReader is for text data. You will almost certainly lose data if you use it for arbitrary binary data. (If you use Encoding.GetEncoding(28591) you will probably be okay, but what's the point?)
Why do you need to use a StreamReader at all? Why not just keep the binary data as binary data and write it back to disk (or SQL) as binary data?
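To illustrate the data-loss point, here is a small sketch that pushes bytes through a StreamReader and encodes the resulting text back to bytes. The class and method names are invented for the example; the sample bytes are arbitrary non-text values, not a real image.

```csharp
using System;
using System.IO;
using System.Text;

public static class BinaryRoundTrip
{
    // Reads the bytes as text with the given encoding, then encodes the text back to bytes.
    public static byte[] ThroughStreamReader(byte[] input, Encoding encoding)
    {
        using (MemoryStream stream = new MemoryStream(input))
        // false: do not switch encodings based on a byte order mark.
        using (StreamReader reader = new StreamReader(stream, encoding, false))
        {
            string text = reader.ReadToEnd();
            return encoding.GetBytes(text);
        }
    }

    // Byte-by-byte comparison of two arrays.
    public static bool SameBytes(byte[] a, byte[] b)
    {
        if (a.Length != b.Length) return false;
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i]) return false;
        }
        return true;
    }
}
```

For an input such as { 0xFF, 0xD8, 0xFF, 0xE0 }, the round trip through Encoding.UTF8 does not reproduce the original bytes, because invalid sequences are replaced during decoding. The round trip through Encoding.GetEncoding(28591) does reproduce them, since that encoding maps every byte value to exactly one character.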
EDIT: As this seems to be something people want to see... if you do just want to copy one stream to another (e.g. to a file) use something like this:
/// <summary>
/// Copies the contents of input to output. Doesn't close either stream.
/// </summary>
public static void CopyStream(Stream input, Stream output)
{
byte[] buffer = new byte[8 * 1024];
int len;
while ( (len = input.Read(buffer, 0, buffer.Length)) > 0)
{
output.Write(buffer, 0, len);
}
}
To use it to dump a stream to a file, for example:
using (Stream file = File.Create(filename))
{
CopyStream(input, file);
}
Note that Stream.CopyTo was introduced in .NET 4, serving basically the same purpose.
public void CopyStream(Stream stream, string destPath)
{
using (var fileStream = new FileStream(destPath, FileMode.Create, FileAccess.Write))
{
stream.CopyTo(fileStream);
}
}
private void SaveFileStream(String path, Stream stream)
{
var fileStream = new FileStream(path, FileMode.Create, FileAccess.Write);
stream.CopyTo(fileStream);
fileStream.Dispose();
}
I don't get all of the answers using CopyTo, where maybe the systems using the app might not have been upgraded to .NET 4.0+. I know some would like to force people to upgrade, but compatibility is also nice, too.
Another thing, I don't get using a stream to copy from another stream in the first place. Why not just do:
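// Note: ToArray() is defined on MemoryStream, not on Stream,
// so this assumes InputStream is actually a MemoryStream.
byte[] bytes = ((MemoryStream)myOtherObject.InputStream).ToArray();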
Once you have the bytes, you can easily write them to a file:
public static void WriteFile(string fileName, byte[] bytes)
{
string path = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location);
if (!path.EndsWith(@"\")) path += @"\";
if (File.Exists(Path.Combine(path, fileName)))
File.Delete(Path.Combine(path, fileName));
using (FileStream fs = new FileStream(Path.Combine(path, fileName), FileMode.CreateNew, FileAccess.Write))
{
fs.Write(bytes, 0, (int)bytes.Length);
//fs.Close();
}
}
This code works as I've tested it with a .jpg file, though I admit I have only used it with small files (less than 1 MB). One stream, no copying between streams, no encoding needed, just write the bytes! No need to over-complicate things with StreamReader if you already have a stream you can convert to bytes directly with .ToArray()!
The only potential downside I can see in doing it this way concerns large files: having the data as a stream and using .CopyTo() or an equivalent lets FileStream stream it, instead of holding everything in a byte array first, so the byte-array approach might be slower. It shouldn't choke, though, since the .Write() method of the FileStream handles writing the bytes, but you do need enough memory to hold the whole stream as a byte[] object. In my situation, where I was getting an OracleBlob, I had to go to a byte[]; it was small enough, and there was no streaming available to me anyway, so I just sent my bytes to the function above.
Another option, using a stream, would be to use it with Jon Skeet's CopyStream function that was in another post - this just uses FileStream to take the input stream and create the file from it directly. It does not use File.Create, like he did (which initially seemed to be problematic for me, but later found it was likely just a VS bug...).
/// <summary>
/// Copies the contents of input to output. Doesn't close either stream.
/// </summary>
public static void CopyStream(Stream input, Stream output)
{
byte[] buffer = new byte[8 * 1024];
int len;
while ( (len = input.Read(buffer, 0, buffer.Length)) > 0)
{
output.Write(buffer, 0, len);
}
}
public static void WriteFile(string fileName, Stream inputStream)
{
string path = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location);
if (!path.EndsWith(@"\")) path += @"\";
if (File.Exists(Path.Combine(path, fileName)))
File.Delete(Path.Combine(path, fileName));
using (FileStream fs = new FileStream(Path.Combine(path, fileName), FileMode.CreateNew, FileAccess.Write))
{
CopyStream(inputStream, fs);
}
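// Flushing a stream that was only read is unnecessary, and calling Flush()
// after Close() can throw ObjectDisposedException, so just close it.
inputStream.Close();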
}
Here's an example that uses proper usings and implementation of IDisposable:
static void WriteToFile(string sourceFile, string destinationfile, bool append = true, int bufferSize = 4096)
{
using (var sourceFileStream = new FileStream(sourceFile, FileMode.OpenOrCreate))
{
using (var destinationFileStream = new FileStream(destinationfile, FileMode.OpenOrCreate))
{
while (sourceFileStream.Position < sourceFileStream.Length)
{
destinationFileStream.WriteByte((byte)sourceFileStream.ReadByte());
}
}
}
}
...and there's also this
public static void WriteToFile(Stream stream, string destinationFile, int bufferSize = 4096, FileMode mode = FileMode.OpenOrCreate, FileAccess access = FileAccess.ReadWrite, FileShare share = FileShare.ReadWrite)
{
using (var destinationFileStream = new FileStream(destinationFile, mode, access, share))
{
while (stream.Position < stream.Length)
{
destinationFileStream.WriteByte((byte)stream.ReadByte());
}
}
}
The key is understanding the proper use of using (which should wrap the instantiation of any object that implements IDisposable, as shown above) and having a good idea of how the stream properties work. Position is the zero-based index within the stream, and it advances as each byte is read with the ReadByte method. In this case I am essentially using it in place of a for-loop variable and letting it run up to Length, which is the size of the entire stream in bytes. The result is something simple that handles everything cleanly. Note that this only works for streams that support Position and Length, that is, seekable streams.
Keep in mind, too, that the ReadByte method returns the byte as an int (or -1 at the end of the stream), which can simply be cast back to a byte.
I'm going to add another implementation I recently wrote that creates a dynamic buffer of sorts, so that data is written sequentially in chunks and does not cause a massive overload:
private void StreamBuffer(Stream stream, int buffer)
{
using (var memoryStream = new MemoryStream())
{
stream.CopyTo(memoryStream);
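// ToArray() returns only the bytes actually written;
// GetBuffer() would also expose unused capacity at the end.
var memoryBuffer = memoryStream.ToArray();
for (int i = 0; i < memoryBuffer.Length;)
{
var networkBuffer = new byte[buffer];
int j = 0;
for (; j < networkBuffer.Length && i < memoryBuffer.Length; j++)
{
networkBuffer[j] = memoryBuffer[i];
i++;
}
//Assuming destination file
// Write only the j bytes that were filled, so the last chunk is not padded.
destinationFileStream.Write(networkBuffer, 0, j);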
}
}
}
The explanation is fairly simple: we have to keep track of the entire set of data we wish to write, but we only want to write a certain amount at a time, so the first loop leaves its last clause empty (making it behave like a while loop). Inside it, we initialize a byte-array buffer of the size that was passed in, and the second loop checks j against the size of that buffer and i against the size of the original byte array, ending the run as soon as either limit is reached.
Why not use a FileStream object?
public void SaveStreamToFile(string fileFullPath, Stream stream)
{
if (stream.Length == 0) return;
// Create a FileStream object to write a stream to a file
using (FileStream fileStream = System.IO.File.Create(fileFullPath, (int)stream.Length))
{
// Fill the bytes[] array with the stream data
byte[] bytesInStream = new byte[stream.Length];
stream.Read(bytesInStream, 0, (int)bytesInStream.Length);
// Use FileStream object to write to the specified file
fileStream.Write(bytesInStream, 0, bytesInStream.Length);
}
}
//If you don't have .Net 4.0 :)
public void SaveStreamToFile(Stream stream, string filename)
{
using(Stream destination = File.Create(filename))
Write(stream, destination);
}
//Typically I implement this Write method as a Stream extension method.
//The framework handles buffering.
public void Write(Stream from, Stream to)
{
for(int a = from.ReadByte(); a != -1; a = from.ReadByte())
to.WriteByte( (byte) a );
}
/*
Note: conceptually, StreamReader yields a sequence of Char while Stream yields a sequence of bytes.
The distinction is significant, for example in multi-byte character encodings
like the Unicode encodings used in .Net, where a Char is one or more bytes (byte[n]). Also, the
resulting translation from bytes to Chars can lose bytes
or insert them (for example, "\n" vs. "\r\n") depending on the StreamReader instance's
CurrentEncoding.
*/
Another option is to get the stream to a byte[] and use File.WriteAllBytes. This should do:
using (var stream = new MemoryStream())
{
input.CopyTo(stream);
File.WriteAllBytes(file, stream.ToArray());
}
Wrapping it in an extension method gives it better naming:
public static void WriteTo(this Stream input, string file)
{
//your fav write method:
using (var stream = File.Create(file))
{
input.CopyTo(stream);
}
//or
using (var stream = new MemoryStream())
{
input.CopyTo(stream);
File.WriteAllBytes(file, stream.ToArray());
}
//whatever that fits.
}
public void testdownload(Stream input)
{
byte[] buffer = new byte[16345];
using (FileStream fs = new FileStream(this.FullLocalFilePath,
FileMode.Create, FileAccess.Write, FileShare.None))
{
int read;
while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
{
fs.Write(buffer, 0, read);
}
}
}

Large PDFsharp (MigraDoc) PdfDocument to byte[]

I have been attempting to save a large PdfDocument into a byte array using various means but always come back to an out of memory exception (file is 200 MB and 2.5K pages).
My initial attempt was to simply use MemoryStream
public static byte[] ProcessLargePdfDocument(PdfDocument pdfDocument)
{
using (MemoryStream stream = new MemoryStream())
{
pdfDocument.Save(stream, true);
return stream.ToArray();
}
}
Then I tried adding in some buffering
public static byte[] ProcessLargePdfDocument(PdfDocument pdfDocument, long whereToStartReading = 0)
{
List<byte> byteList = new List<byte>();
using (MemoryStream stream = new MemoryStream())
{
pdfDocument.Save(stream, false);
byte[] buffer = new byte[megabyte];
stream.Seek(whereToStartReading, SeekOrigin.Begin);
int bytesRead = stream.Read(buffer, 0, megabyte);
while (bytesRead > 0)
{
byteList.AddRange(buffer);
bytesRead = stream.Read(buffer, 0, megabyte);
}
}
return byteList.ToArray();
}
No matter what I try I get an out of memory exception on the pdfDocument.Save call. I am able to write it to a file location and to read it back using a buffered FileStream in dev but I'm not able to do this on the production environment due to permissions (yet).
Two tips:
Make sure your process runs as a 64-bit process to allow it to use more than 2 GiB of RAM.
stream.ToArray() creates a copy, stream.GetBuffer() lets you access the internal buffer of the MemoryStream. If the exception occurs after the Save() this may make a difference.
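A minimal sketch of the second tip, assuming the MemoryStream was created with a constructor that allows GetBuffer (such as the parameterless one). The helper name GetWrittenBytes is hypothetical. The internal buffer is usually larger than the data written, so the valid region has to be bounded by Length:

```csharp
using System;
using System.IO;

public static class BufferAccess
{
    // Returns the valid region of the stream's internal buffer without copying it.
    // Assumes the stream exposes its buffer (e.g. it was created with new MemoryStream()).
    public static ArraySegment<byte> GetWrittenBytes(MemoryStream stream)
    {
        // Only the first Length bytes of the buffer hold data; the rest is spare capacity.
        return new ArraySegment<byte>(stream.GetBuffer(), 0, (int)stream.Length);
    }
}
```

Whether this is enough to avoid the exception depends on where the memory runs out; it only removes the extra copy that ToArray makes.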

protobuf-net returns null when calling Deserialize

My end goal is to use protobuf-net and GZipStream in an attempt to compress a List<MyCustomType> object to store in a varbinary(max) field in SQL Server. I'm working on unit tests to understand how everything works and fits together.
Target .NET framework is 3.5.
My current process is:
1. Serialize the data with protobuf-net (good).
2. Compress the serialized data from #1 with GZipStream (good).
3. Convert the compressed data to a base64 string (good).
At this point, the value from step #3 will be stored in a varbinary(max) field. I have no control over this. The steps resume with needing to take a base64 string and deserialize it to a concrete type.
4. Convert a base64 string to a byte[] (good).
5. Decompress the data with GZipStream (good).
6. Deserialize the data with protobuf-net (bad).
Can someone assist with why the call to Serializer.Deserialize<string> returns null? I'm stuck on this one and hopefully a fresh set of eyes will help.
FWIW, I tried another version of this using List<T>, where T is a custom class I created, and Deserialize<> still returns null.
FWIW 2, data.txt is a 4MB plaintext file residing on my C:.
[Test]
public void ForStackOverflow()
{
string data = "hi, my name is...";
//string data = File.ReadAllText(@"C:\Temp\data.txt");
string serializedBase64;
using (MemoryStream protobuf = new MemoryStream())
{
Serializer.Serialize(protobuf, data);
using (MemoryStream compressed = new MemoryStream())
{
using (GZipStream gzip = new GZipStream(compressed, CompressionMode.Compress))
{
byte[] s = protobuf.ToArray();
gzip.Write(s, 0, s.Length);
gzip.Close();
}
serializedBase64 = Convert.ToBase64String(compressed.ToArray());
}
}
byte[] base64byteArray = Convert.FromBase64String(serializedBase64);
using (MemoryStream base64Stream = new MemoryStream(base64byteArray))
{
using (GZipStream gzip = new GZipStream(base64Stream, CompressionMode.Decompress))
{
using (MemoryStream plainText = new MemoryStream())
{
byte[] buffer = new byte[4096];
int read;
while ((read = gzip.Read(buffer, 0, buffer.Length)) > 0)
{
plainText.Write(buffer, 0, read);
}
// why does this call to Deserialize return null?
string deserialized = Serializer.Deserialize<string>(plainText);
Assert.IsNotNull(deserialized);
Assert.AreEqual(data, deserialized);
}
}
}
}
Because you didn't rewind plainText after writing to it. Actually, that entire Stream is unnecessary - this works:
using (MemoryStream base64Stream = new MemoryStream(base64byteArray))
{
using (GZipStream gzip = new GZipStream(
base64Stream, CompressionMode.Decompress))
{
string deserialized = Serializer.Deserialize<string>(gzip);
Assert.IsNotNull(deserialized);
Assert.AreEqual(data, deserialized);
}
}
Likewise, this should work for the serialize:
using (MemoryStream compressed = new MemoryStream())
{
using (GZipStream gzip = new GZipStream(
compressed, CompressionMode.Compress, true))
{
Serializer.Serialize(gzip, data);
}
serializedBase64 = Convert.ToBase64String(
compressed.GetBuffer(), 0, (int)compressed.Length);
}

How to decompress a string in javascript, compressed in C#? [duplicate]

Possible Duplicate:
ZLIB Decompression - Client Side
I'll try to be clear, and I'm sorry for my bad English. This is the question:
In my web application I receive a string that represents an image compressed with this algorithm, written in C#:
public static class Compression
{
public static string Compress(string text)
{
byte[] buffer = Encoding.UTF8.GetBytes(text);
MemoryStream ms = new MemoryStream();
using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress, true))
{
zip.Write(buffer, 0, buffer.Length);
}
ms.Position = 0;
MemoryStream outStream = new MemoryStream();
byte[] compressed = new byte[ms.Length];
ms.Read(compressed, 0, compressed.Length);
byte[] gzBuffer = new byte[compressed.Length + 4];
System.Buffer.BlockCopy(compressed, 0, gzBuffer, 4, compressed.Length);
System.Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gzBuffer, 0, 4);
return Convert.ToBase64String(gzBuffer);
}
public static string Decompress(string compressedText)
{
byte[] gzBuffer = Convert.FromBase64String(compressedText);
using (MemoryStream ms = new MemoryStream())
{
int msgLength = BitConverter.ToInt32(gzBuffer, 0);
ms.Write(gzBuffer, 4, gzBuffer.Length - 4);
byte[] buffer = new byte[msgLength];
ms.Position = 0;
using (GZipStream zip = new GZipStream(ms, CompressionMode.Decompress))
{
zip.Read(buffer, 0, buffer.Length);
}
return Encoding.UTF8.GetString(buffer);
}
}
}
The Decompress method is used in the server-side application. I receive an XML file containing the string that represents the image compressed with the Compress method, and I want to be able to decompress that string in JavaScript within my web app. Is there a way to do that? Are there other solutions? Thanks to everyone!
The best solution might be to translate the decompression function from C# to JavaScript. You could use one that's already available in JavaScript such as this one, but you would need to change the source of the image or uncompress and recompress at the server, unless it happens to be compatible with the compression you're using.
Another option would be to convert the image into .jpg or .png before you use it, again at the server. This would give you more flexibility in the long run, but might put a load on the server depending on traffic and image size.
You can use the JSXCompressor library to do the decompression (deflate, unzip).
But if your web server supports compression at the HTTP level, I think you can skip the compression and decompression.

Writing to the compression stream is not supported. Using System.IO.GZipStream

I get an exception when trying to decompress a (.gz) file using the GZipStream class that is included in the .NET framework. I am using the MSDN documentation. This is the exception:
Writing to the compression stream is not supported.
Here is the application source:
try
{
var infile = new FileStream(@"C:\TarDecomp\TarDecomp\TarDecomp\bin\Debug\nick_blah-2008.tar.gz", FileMode.Open, FileAccess.Read, FileShare.Read);
byte[] buffer = new byte[infile.Length];
// Read the file to ensure it is readable.
int count = infile.Read(buffer, 0, buffer.Length);
if (count != buffer.Length)
{
infile.Close();
Console.WriteLine("Test Failed: Unable to read data from file");
return;
}
infile.Close();
MemoryStream ms = new MemoryStream();
// Use the newly created memory stream for the compressed data.
GZipStream compressedzipStream = new GZipStream(ms, CompressionMode.Decompress, true);
Console.WriteLine("Decompression");
compressedzipStream.Write(buffer, 0, buffer.Length); //<<Throws error here
// Close the stream.
compressedzipStream.Close();
Console.WriteLine("Original size: {0}, Compressed size: {1}", buffer.Length, ms.Length);
} catch {...}
The exception is thrown at compressedzipStream.Write().
Any ideas? What is this exception telling me?
It is telling you that you should call Read instead of Write since it's decompression! Also the memory stream should be constructed with the data, or rather you should pass the file stream directly to the GZipStream constructor.
Example of how it should have been done (haven't tried to compile it):
Stream inFile = new FileStream(@"C:\TarDecomp\TarDecomp\TarDecomp\bin\Debug\nick_blah-2008.tar.gz", FileMode.Open, FileAccess.Read, FileShare.Read);
Stream decodedStream = new MemoryStream();
byte[] buffer = new byte[4096];
using (Stream inGzipStream = new GZipStream(inFile, CompressionMode.Decompress))
{
int bytesRead;
while ((bytesRead = inGzipStream.Read(buffer, 0, buffer.Length)) > 0)
decodedStream.Write(buffer, 0, bytesRead);
}
// Now decodedStream contains the decoded data
The compression code doesn't work like encryption - you can't decompress from one stream to another by writing the compressed data. You have to provide a stream which contains the compressed data already and let GZipStream read from it. Something like this:
using (Stream file = File.OpenRead(filename))
using (Stream gzip = new GZipStream(file, CompressionMode.Decompress))
using (Stream memoryStream = new MemoryStream())
{
CopyStream(gzip, memoryStream);
return memoryStream.ToArray();
}
CopyStream is a simple utility method to read from one stream and copy all the data to another. Something like this:
static void CopyStream(Stream input, Stream output)
{
byte[] buffer = new byte[8192];
int bytesRead;
while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
{
output.Write(buffer, 0, bytesRead);
}
}
How compression streams work can be puzzling at first.
Reading takes compressed data and writing takes uncompressed data. All in all, the stream ensures you only "see" uncompressed data at all times.
The proper way to achieve what you are trying to do, is to read using the GZipStream and then write using the GZipStream also.
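The two directions can be sketched side by side. This is an illustrative round trip, not code from any of the answers; the class name GzipRoundTrip is a placeholder. In Compress mode the GZipStream is written to, and in Decompress mode it is read from:

```csharp
using System.IO;
using System.IO.Compression;

public static class GzipRoundTrip
{
    public static byte[] Compress(byte[] plain)
    {
        using (MemoryStream compressed = new MemoryStream())
        {
            // Compress mode: write uncompressed bytes INTO the GZipStream;
            // compressed bytes come out in the underlying stream.
            using (GZipStream gzip = new GZipStream(compressed, CompressionMode.Compress))
            {
                gzip.Write(plain, 0, plain.Length);
            }
            return compressed.ToArray();
        }
    }

    public static byte[] Decompress(byte[] compressed)
    {
        // Decompress mode: the underlying stream already holds compressed bytes;
        // read uncompressed bytes OUT of the GZipStream.
        using (MemoryStream input = new MemoryStream(compressed))
        using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
        using (MemoryStream plain = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gzip.Read(buffer, 0, buffer.Length)) > 0)
            {
                plain.Write(buffer, 0, read);
            }
            return plain.ToArray();
        }
    }
}
```

Calling Write on a GZipStream opened in Decompress mode is what produces the "Writing to the compression stream is not supported" exception from the question.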
