Generate zip file with xml content on the fly [duplicate] - c#

I want to write a String to a Stream (a MemoryStream in this case) and read the bytes one by one.
stringAsStream = new MemoryStream();
UnicodeEncoding uniEncoding = new UnicodeEncoding();
String message = "Message";
stringAsStream.Write(uniEncoding.GetBytes(message), 0, message.Length);
Console.WriteLine("This:\t\t" + (char)uniEncoding.GetBytes(message)[0]);
Console.WriteLine("Differs from:\t" + (char)stringAsStream.ReadByte());
The (undesired) result I get is:
This: M
Differs from: ?
It looks like it's not being read correctly, as the first char of "Message" is 'M', which works when getting the bytes from the UnicodeEncoding instance but not when reading them back from the stream.
What am I doing wrong?
The bigger picture: I have an algorithm which will work on the bytes of a Stream, I'd like to be as general as possible and work with any Stream. I'd like to convert an ASCII-String into a MemoryStream, or maybe use another method to be able to work on the String as a Stream. The algorithm in question will work on the bytes of the Stream.

After you write to the MemoryStream and before you read it back, you need to Seek back to the beginning of the MemoryStream so you're not reading from the end.
UPDATE
After seeing your update, I think there's a more reliable way to build the stream:
UnicodeEncoding uniEncoding = new UnicodeEncoding();
String message = "Message";
// You might not want to use the outer using statement that I have
// I wasn't sure how long you would need the MemoryStream object
using(MemoryStream ms = new MemoryStream())
{
var sw = new StreamWriter(ms, uniEncoding);
try
{
sw.Write(message);
sw.Flush();//otherwise you are risking empty stream
ms.Seek(0, SeekOrigin.Begin);
// Test and work with the stream here.
// If you need to start back at the beginning, be sure to Seek again.
}
finally
{
sw.Dispose();
}
}
As you can see, this code uses a StreamWriter to write the entire string (with proper encoding) out to the MemoryStream. This takes the hassle out of ensuring the entire byte array for the string is written.
Update: I have run into the empty-stream issue several times. Calling Flush right after you've finished writing is enough to avoid it.

Try this "one-liner" from Delta's Blog, String To MemoryStream (C#).
MemoryStream stringInMemoryStream =
new MemoryStream(Encoding.ASCII.GetBytes("Your string here")); // ASCIIEncoding.Default is really Encoding.Default, which is not ASCII
The string will be loaded into the MemoryStream, and you can read from it. See Encoding.GetBytes(...), which has also been implemented for a few other encodings.

You're using message.Length, which returns the number of characters in the string, but you should be using the number of bytes to write. You should use something like:
byte[] messageBytes = uniEncoding.GetBytes(message);
stringAsStream.Write(messageBytes, 0, messageBytes.Length);
You're then reading a single byte and expecting to get a character from it just by casting to char. UnicodeEncoding will use two bytes per character.
As Justin says you're also not seeking back to the beginning of the stream.
Basically I'm afraid pretty much everything is wrong here. Please give us the bigger picture and we can help you work out what you should really be doing. Using a StreamWriter to write and then a StreamReader to read is quite possibly what you want, but we can't really tell from just the brief bit of code you've shown.
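To make the length mismatch concrete, here is a minimal sketch. The class and method names (ByteCountDemo, ToUtf16Stream, Demo) are illustrative and not from the question. It writes all of the encoded bytes, rewinds the stream, and then reads the first character back as two bytes, assuming UTF-16 little-endian as produced by Encoding.Unicode.

```csharp
using System;
using System.IO;
using System.Text;

public static class ByteCountDemo
{
    // Hypothetical helper: writes the UTF-16 bytes of a string to a new
    // MemoryStream and rewinds it so the caller can read from the start.
    public static MemoryStream ToUtf16Stream(string message)
    {
        byte[] messageBytes = Encoding.Unicode.GetBytes(message);
        var stream = new MemoryStream();
        // Use the byte count here, not message.Length (the character count).
        stream.Write(messageBytes, 0, messageBytes.Length);
        stream.Position = 0; // rewind before reading
        return stream;
    }

    public static void Demo()
    {
        const string message = "Message";
        using (MemoryStream stream = ToUtf16Stream(message))
        {
            Console.WriteLine(message.Length); // 7 characters
            Console.WriteLine(stream.Length);  // 14 bytes (two per character)

            int low = stream.ReadByte();  // 0x4D, the low byte of 'M'
            int high = stream.ReadByte(); // 0x00, the high byte of 'M'
            Console.WriteLine((char)(low | (high << 8))); // M
        }
    }
}
```

Casting a single byte to char only happens to work for characters whose high byte is zero, which is why a StreamReader is usually the safer way to turn the bytes back into text.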

I think it would be a lot more productive to use a TextWriter, in this case a StreamWriter, to write to the MemoryStream. After that, as others have said, you need to "rewind" the MemoryStream using something like stringAsStream.Position = 0L;.
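stringAsStream = new MemoryStream();
// create stream writer with UTF-16 (Unicode) encoding to write to the memory stream;
// the final argument (leaveOpen: true) keeps the MemoryStream open when the writer is disposed
using(StreamWriter sWriter = new StreamWriter(stringAsStream, UnicodeEncoding.Unicode, 1024, true))
{
sWriter.Write("Lorem ipsum.");
} // disposing the writer flushes its buffer into the stream
stringAsStream.Position = 0L; // rewind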
Note that:
StreamWriter defaults to using an instance of UTF8Encoding unless specified otherwise. This instance of UTF8Encoding is constructed without a byte order mark (BOM)
Also, you don't have to create a new UnicodeEncoding() usually, since there's already one as a static member of the class for you to use in convenient utf-8, utf-16, and utf-32 flavors.
And then, finally (as others have said) you're trying to convert the bytes directly to chars, which they are not. If I had a memory stream and knew it was a string, I'd use a TextReader to get the string back from the bytes. It seems "dangerous" to me to mess around with the raw bytes.
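As a rough sketch of the writer/reader pairing described above, the following round-trips a string through a MemoryStream. The class and method names (TextRoundTrip, WriteText, ReadText) are made up for illustration, and the buffer size of 1024 is an arbitrary choice.

```csharp
using System.IO;
using System.Text;

public static class TextRoundTrip
{
    // Hypothetical helper: encodes text into a new MemoryStream and rewinds it.
    public static MemoryStream WriteText(string text, Encoding encoding)
    {
        var stream = new MemoryStream();
        // leaveOpen: true keeps the MemoryStream usable after the writer is disposed.
        using (var writer = new StreamWriter(stream, encoding, 1024, leaveOpen: true))
        {
            writer.Write(text);
        } // disposing the writer flushes buffered characters into the stream
        stream.Position = 0; // rewind so a reader starts at the first byte
        return stream;
    }

    // Hypothetical helper: decodes the whole stream back into a string.
    public static string ReadText(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling TextRoundTrip.ReadText(TextRoundTrip.WriteText("Lorem ipsum.", Encoding.Unicode), Encoding.Unicode) should give back the original string, without the caller ever handling raw bytes.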

You need to reset the stream to the beginning:
stringAsStream.Seek(0, SeekOrigin.Begin);
Console.WriteLine("Differs from:\t" + (char)stringAsStream.ReadByte());
This can also be done by setting the Position property to 0:
stringAsStream.Position = 0;

Related

Read from a compressing GZipStream

I'm exploring how to implement an HTTP server in C#. (And before you ask, I know there is Kestrel (and nothing else that isn't obsolete), and I want a much, much smaller application.) So, the response could be a Stream that cannot be seeked and has an unknown length. For this situation, chunked encoding can be used instead of sending a Content-Length header.
The response can also be compressed with gzip or br as indicated by the client. This can be accomplished with e.g. the GZipStream class. I had almost said "easily", because that's not really the case. I always find the GZipStream API confusing each time I use it. I usually bump into every exception there is until I finally get it right.
It seems like I can only write (push) to a GZipStream and the compressed data will trickle out the other end into the specified "base" stream. But that's not desirable because I can't just let the compressed data flow to the client. It needs to be chunked. That is, each bit of compressed data needs to be prefixed with its chunk size. Of course the GZipStream cannot produce that format.
Instead, I'd like to read (pull) from the compressing GZipStream, but that doesn't seem to be possible. The documentation says it will throw an exception if I try that. But there has to be some instance that brings the compressed bytes into the chunked format.
So how would I get the expected result? Can it even be achieved with this API? Why can't I pull from the compressing stream, only push?
I'm not trying to make up (non-functional) sample code because that would only be confusing.
PS: Okay, maybe this:
Stream responseBody = ...;
if (canCompress)
{
responseBody = new GZipStream(responseBody, CompressionMode.Compress); // <-- probably wrong
}
// not shown: add appropriate headers
while (true)
{
int chunkLength = responseBody.Read(buffer); // <-- not possible
if (chunkLength == 0)
break;
response.Write($"{chunkLength:X}\r\n");
response.Write(buffer.AsMemory()[..chunkLength]);
response.Write("\r\n");
}
response.Write("0\r\n\r\n");
Your usage of GZipStream is incomplete. While your input responseBuffer is the correct target buffer, you have to actually write the bytes TO the GZipStream itself.
In addition, once you are done writing, you must close the GZipStream instance to write all compressed bytes to your target buffer. This is the critical step because GZipStream buffers data internally and only writes the final compressed block and the gzip trailer when it is closed, so until then the target buffer may be incomplete. As such, this is the critical missing link that MUST happen before you can continue to write the response.
Finally, you need to reset the position of your output stream so that you can read it into an intermediary response buffer.
using MemoryStream responseBody = new MemoryStream();
GZipStream gzipStream = null; // make sure to dispose after use
if (canCompress)
{
using MemoryStream gzipStreamBuffer = new MemoryStream(bytes);
gzipStream = new GZipStream(responseBody, CompressionMode.Compress, true);
gzipStreamBuffer.CopyTo(gzipStream);
gzipStream.Close(); // close the stream so that all compressed bytes are written
responseBody.Seek(0, SeekOrigin.Begin); // reset the response so that we can read it to the buffer
}
var buffer = new byte[20];
while (true)
{
int chunkLength = responseBody.Read(buffer);
if (chunkLength == 0)
break;
// write response
}
In my test example, my bytes input was 241 bytes, whereas the compressed bytes written to the buffer totaled 82 bytes.
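The "// write response" step can be filled in with the chunk framing from the question. The following is only a sketch under assumptions: ChunkedResponse, Compress and WriteChunked are hypothetical names, the destination is taken to be any writable Stream, and the default buffer size of 4096 is arbitrary.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

public static class ChunkedResponse
{
    // Hypothetical helper: gzip-compresses a payload into memory and rewinds the result.
    public static MemoryStream Compress(byte[] payload)
    {
        var compressed = new MemoryStream();
        using (var gzip = new GZipStream(compressed, CompressionMode.Compress, leaveOpen: true))
        {
            gzip.Write(payload, 0, payload.Length);
        } // disposing writes the remaining compressed bytes and the gzip trailer
        compressed.Position = 0;
        return compressed;
    }

    // Hypothetical helper: copies source to destination using HTTP chunked framing,
    // that is "<hex length>\r\n<data>\r\n" per chunk, then the "0\r\n\r\n" terminator.
    public static void WriteChunked(Stream source, Stream destination, int bufferSize = 4096)
    {
        byte[] buffer = new byte[bufferSize];
        byte[] crlf = { 0x0D, 0x0A };
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            byte[] header = Encoding.ASCII.GetBytes(read.ToString("X") + "\r\n");
            destination.Write(header, 0, header.Length);
            destination.Write(buffer, 0, read);
            destination.Write(crlf, 0, crlf.Length);
        }
        byte[] terminator = Encoding.ASCII.GetBytes("0\r\n\r\n");
        destination.Write(terminator, 0, terminator.Length);
    }
}
```

With this shape, ChunkedResponse.WriteChunked(ChunkedResponse.Compress(bytes), response) would send the compressed body in chunks. Note that this still buffers the whole compressed body in memory first, exactly as the answer's approach does.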

How do I read the correct Stream/byte[] from HttpPostedFile InputStream property?

I get an HttpPostedFile that is being uploaded (supposedly a PDF), and I have to use its stream to initialize it in PdfSharp.
The problem is that, although the HttpPostedFile SaveAs() method saves a valid PDF, saving its InputStream doesn't create a valid PDF. When I pass the InputStream to PdfSharp to read the PDF, it throws an exception with "Invalid Pdf".
I also tried saving the InputStream as a byte[], which I tried to get like this:
public byte[] GetBytesFromStream(System.IO.Stream uploadedFile)
{
int length = Convert.ToInt32(uploadedFile.Length); //Length: 103050706
string str = "";
byte[] input = new byte[length];
// Initialize the stream.
System.IO.Stream MyStream = uploadedFile;
// Read the file into the byte array.
MyStream.Read(input, 0, length);
return input;
}
Calling the method like this:
byte[] fileBytes = GetBytesFromStream(uploadedFile.InputStream);
But creating a file from those bytes creates an invalid pdf too...
I created the file from bytes like this...
System.IO.File.WriteAllBytes("Foo.pdf", fileBytes);
I have 2 questions about this then:
1st - Why is the stream I receive from the InputStream invalid, while SaveAs works?
2nd - How could I get the correct stream from the InputStream or the HttpPostedFile, without saving the file to disk and then reading it?
Noticed that this question wasn't answered (since Evk's comment was the solution) and I couldn't accept any answer.
So I'm making this one just to not leave this question unanswered.
tl;dr;
The solution, as per Evk's comment, was the position of the stream. The stream had already been read beforehand, so setting its position to 0 before trying to create the PDF was enough to fix the problem.
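A minimal sketch of that fix, assuming the upload stream is seekable: StreamBytes and ReadAllBytes are hypothetical names, not part of HttpPostedFile or PdfSharp.

```csharp
using System.IO;

public static class StreamBytes
{
    // Hypothetical helper: returns the full contents of a stream,
    // rewinding first in case something has already read from it.
    public static byte[] ReadAllBytes(Stream input)
    {
        if (input.CanSeek)
        {
            input.Position = 0; // an earlier reader may have advanced the position
        }
        using (var copy = new MemoryStream())
        {
            input.CopyTo(copy); // CopyTo keeps reading until the end of the input
            return copy.ToArray();
        }
    }
}
```

Something like StreamBytes.ReadAllBytes(uploadedFile.InputStream) would then replace the single Read call, which also avoids relying on one Read call returning every byte.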

C# Convert FileStream.WriteLine to go to a MemoryStream

I wrote some code in a console program and tested with files.
Now I want to port it to a BizTalk Pipeline Component that implements a specific interface. I wasn't aware that the .Write and .WriteLine methods for a file and for a MemoryStream were so different. I thought I would just be able to swap my objects. There is no .WriteLine method, and the .Write method requires offset and bytes (additional parameters).
So now, what is the best way to change my tested code to write to the memory stream, given that I have a lot of .WriteLine statements? I could write to a StringBuilder first, but then I think that would blow the concept of streaming (i.e. it would have the whole document in memory at one time).
// This is how I used the streams in the Console program
//FileStream originalStream = File.Open(inFilename, FileMode.Open);
//StreamWriter streamToReturn = new StreamWriter(outFilename);
// This is how to get the input stream in the BizTalk Pipeline Component
System.IO.Stream originalStream = pInMsg.BodyPart.GetOriginalDataStream();
MemoryStream streamToReturn = new MemoryStream();
streamToReturn.WriteLine("<" + schemaStructure.rootElement + ">");
There's a lot more code not shown here. Above is just to set the stage for what I did.
Use a StreamWriter which you can use to call WriteLine.
MemoryStream streamToReturn = new MemoryStream();
var writer = new StreamWriter(streamToReturn);
writer.WriteLine("<" + schemaStructure.rootElement + ">");
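One detail worth spelling out: StreamWriter buffers its output, so the MemoryStream may look empty until the writer is flushed, and the stream has to be rewound before something else reads it. The sketch below illustrates this with hypothetical names (XmlLineWriter, WriteRoot). The closing-tag line is added only to make the example complete.

```csharp
using System.IO;

public static class XmlLineWriter
{
    // Hypothetical helper: writes an opening and closing root tag, one per line,
    // into a MemoryStream and returns the stream rewound to the start.
    public static MemoryStream WriteRoot(string rootElement)
    {
        var streamToReturn = new MemoryStream();
        // The writer is deliberately not disposed here, because disposing it
        // would also close streamToReturn.
        var writer = new StreamWriter(streamToReturn);
        writer.WriteLine("<" + rootElement + ">");
        writer.WriteLine("</" + rootElement + ">");
        writer.Flush();              // push buffered text into the MemoryStream
        streamToReturn.Position = 0; // rewind for whoever consumes the stream
        return streamToReturn;
    }
}
```

This keeps every existing .WriteLine call unchanged. Only the construction of the writer and the final Flush and rewind differ from the file-based version.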

Avoiding MemoryStream.ToArray() when using System.IO.Compression.ZipArchive

A helper method to turn a string into a zipped up text file:
public static System.Net.Mail.Attachment CreateZipAttachmentFromString(string content, string filename)
{
using (MemoryStream memoryStream = new MemoryStream())
{
using (ZipArchive zipArchive = new ZipArchive(memoryStream, ZipArchiveMode.Update))
{
ZipArchiveEntry zipArchiveEntry = zipArchive.CreateEntry(filename);
using (StreamWriter streamWriter = new StreamWriter(zipArchiveEntry.Open()))
{
streamWriter.Write(content);
}
}
MemoryStream memoryStream2 = new MemoryStream(memoryStream.ToArray(), false);
return new Attachment(memoryStream2, filename + ".zip", MediaTypeNames.Application.Zip);
}
}
I was really hoping to avoid turning the first memory stream into an array, making another memory stream on it to read it, and passing that to the attachment. My logic was: why copy X megabytes to another place in memory to establish another stream pointing to the copy, when it's essentially just what we started out with? It's the multi-megabyte equivalent of a redundancy like if(myBool == true).
So I figured I would instead Seek back to the start of the first memory stream, and then the attachment could just read that. Or I would establish another MemoryStream pointing to the buffer of the first, with the offset and length parameters set so it would know what to read.
Neither of these approaches works, because it seems that ZipArchive only pushes data into the memory stream (in my case, maybe) when control falls out of the using block and the ZipArchive is disposed. Disposing it also disposes the MemoryStream, and nearly everything (other than ToArray() and GetBuffer()) throws ObjectDisposedException.
Ultimately I can't seek it or get its length after the ZipArchive pumps data into it, and before it pumps the data in, the offset is usually zero and the length is definitely zero, so the values are useless.
Is there a nice optimal way, short of configuring my own over-large buffer (which then makes it non expandable by MemoryStream), to avoid having to burn up around 2x the memory bytes of the archive size with this method?
Most well designed streams and stream-users in .NET have an additional boolean parameter that can be used to instruct them to leave the "base stream" (terrible name) open when disposing.
This is ZipArchive's constructor:
public ZipArchive(
Stream stream,
ZipArchiveMode mode,
bool leaveOpen
)
There is no need for a second MemoryStream. You need to do two things:
Ensure that the MemoryStream is not disposed before the last usage point. This is harmless. Disposing a MemoryStream does nothing helpful and for compatibility reasons can never do anything in the future. The .NET Framework has a very high compatibility bar. They often don't even dare to rename fields.
Seek to offset zero.
So remove the using around the MemoryStream and use the ctor for ZipArchive that allows you to leave the stream open.
Since the Attachment you are returning makes use of the MemoryStream you can't dispose it before exiting the method. Again, this is harmless. The only negative point is that the code becomes less obvious.
There's an entirely different approach: You can write your own Stream class that creates the bytes on demand. That way there is no need to buffer the string and ZIP bytes at all. This is much more work, of course. And it does not change the fact that the whole string must sit in memory at once, so it's still not an O(1) space solution.
public static System.Net.Mail.Attachment CreateZipAttachmentFromString(string content, string filename)
{
MemoryStream memoryStream = new MemoryStream();
using (ZipArchive zipArchive = new ZipArchive(memoryStream, ZipArchiveMode.Update, true))
{
ZipArchiveEntry zipArchiveEntry = zipArchive.CreateEntry(filename);
using (StreamWriter streamWriter = new StreamWriter(zipArchiveEntry.Open()))
{
streamWriter.Write(content);
}
}
memoryStream.Position = 0;
return new Attachment(memoryStream, filename + ".zip", MediaTypeNames.Application.Zip);
}

Reliable way to convert a file to a byte[]

I found the following code on the web:
private byte [] StreamFile(string filename)
{
FileStream fs = new FileStream(filename, FileMode.Open,FileAccess.Read);
// Create a byte array of file stream length
byte[] ImageData = new byte[fs.Length];
//Read block of bytes from stream into the byte array
fs.Read(ImageData,0,System.Convert.ToInt32(fs.Length));
//Close the File Stream
fs.Close();
return ImageData; //return the byte data
}
Is it reliable enough to use to convert a file to byte[] in c#, or is there a better way to do this?
byte[] bytes = System.IO.File.ReadAllBytes(filename);
That should do the trick. ReadAllBytes opens the file, reads its contents into a new byte array, then closes it. Here's the MSDN page for that method.
byte[] bytes = File.ReadAllBytes(filename);
or ...
var bytes = File.ReadAllBytes(filename);
Not to repeat what everyone has already said, but keep the following cheat sheet handy for file manipulations:
System.IO.File.ReadAllBytes(filename);
File.Exists(filename)
Path.Combine(folderName, resOfThePath);
Path.GetFullPath(path); // converts a relative path to absolute one
Path.GetExtension(path);
All these answers with .ReadAllBytes(). Another, similar (I won't say duplicate, since they were trying to refactor their code) question was asked on SO here: Best way to read a large file into a byte array in C#?
A comment was made on one of the posts regarding .ReadAllBytes():
File.ReadAllBytes throws OutOfMemoryException with big files (tested with 630 MB file and it failed) – juanjo.arana Mar 13 '13 at 1:31
A better approach, to me, would be something like this, with BinaryReader:
public static byte[] FileToByteArray(string fileName)
{
byte[] fileData = null;
using (FileStream fs = File.OpenRead(fileName))
{
var binaryReader = new BinaryReader(fs);
fileData = binaryReader.ReadBytes((int)fs.Length);
}
return fileData;
}
But that's just me...
Of course, this all assumes you have the memory to handle the byte[] once it is read in, and I didn't put in the File.Exists check to ensure the file is there before proceeding, as you'd do that before calling this code.
Looks good enough as a generic version. You can modify it to meet your needs, if they're specific enough.
Also test for exceptions and error conditions, such as the file not existing or not being readable.
You can also do the following to save some space:
byte[] bytes = System.IO.File.ReadAllBytes(filename);
Others have noted that you can use the built-in File.ReadAllBytes. The built-in method is fine, but it's worth noting that the code you post above is fragile for two reasons:
Stream is IDisposable - you should place the FileStream fs = new FileStream(filename, FileMode.Open,FileAccess.Read) initialization in a using clause to ensure the file is closed. Failure to do this may mean that the stream remains open if a failure occurs, which will mean the file remains locked - and that can cause other problems later on.
fs.Read may read fewer bytes than you request. In general, the .Read method of a Stream instance will read at least one byte (unless the end of the stream has been reached, in which case it returns 0), but not necessarily all the bytes you ask for. You'll need to write a loop that keeps reading until all bytes are read. This page explains this in more detail.
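Both points can be addressed together. The following is a sketch rather than a drop-in replacement: FileBytes and ReadFile are illustrative names, and the file is assumed to fit in a single byte array.

```csharp
using System.IO;

public static class FileBytes
{
    // Hypothetical helper: reads a whole file, looping because a single
    // Read call may return fewer bytes than requested.
    public static byte[] ReadFile(string filename)
    {
        // The using block closes the file even if an exception is thrown.
        using (var fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
        {
            byte[] data = new byte[fs.Length];
            int offset = 0;
            while (offset < data.Length)
            {
                int read = fs.Read(data, offset, data.Length - offset);
                if (read == 0)
                {
                    // The file ended earlier than its reported length.
                    throw new EndOfStreamException("File ended before all bytes were read.");
                }
                offset += read;
            }
            return data;
        }
    }
}
```

For ordinary files File.ReadAllBytes does the same job. The loop matters mostly when the same pattern is applied to other kinds of streams, such as network streams.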
string filePath = @"D:\MiUnidad\testFile.pdf";
byte[] bytes = await System.IO.File.ReadAllBytesAsync(filePath);
