What is the best method of writing a StringBuilder to a System.IO.Stream?
I am currently doing:
StringBuilder message = new StringBuilder("All your base");
message.Append(" are belong to us");
System.IO.MemoryStream stream = new System.IO.MemoryStream();
System.Text.ASCIIEncoding encoding = new ASCIIEncoding();
stream.Write(encoding.GetBytes(message.ToString()), 0, message.Length);
Don't use a StringBuilder; if you're writing to a stream, do just that, with a StreamWriter:
using (var memoryStream = new MemoryStream())
using (var writer = new StreamWriter(memoryStream))
{
// Various for loops etc as necessary that will ultimately do this:
writer.Write(...);
}
That is the best method. Otherwise, lose the StringBuilder and use something like the following:
using (MemoryStream ms = new MemoryStream())
{
using (StreamWriter sw = new StreamWriter(ms, Encoding.Unicode))
{
sw.WriteLine("dirty world.");
}
// do something with ms
}
Perhaps this will be useful.
var sb = new StringBuilder("All your money");
sb.Append(" are belong to us, dude.");
var myString = sb.ToString();
var myByteArray = System.Text.Encoding.UTF8.GetBytes(myString);
var ms = new MemoryStream(myByteArray);
// Do what you need with MemoryStream
Depending on your use case, it may also make sense to just start with a StringWriter:
StringBuilder sb = null;
// StringWriter - a TextWriter backed by a StringBuilder
using (var writer = new StringWriter())
{
writer.WriteLine("Blah");
. . .
sb = writer.GetStringBuilder(); // Get the backing StringBuilder out
}
// Do whatever you want with the StringBuilder
EDIT: As #ramon-smits points out, if you have access to StringBuilder.GetChunks(), you will also have access to StreamWriter.WriteAsync(StringBuilder). So you can just do this instead:
StringBuilder stringBuilder = new StringBuilder();
// Write data to StringBuilder...
Stream stream = GetStream(); // Get output stream from somewhere.
using (var streamWriter = new StreamWriter(stream, Encoding.UTF8, leaveOpen: true))
{
await streamWriter.WriteAsync(stringBuilder);
}
Much simpler.
I recently had to do exactly this and found this question with unsatisfactory answers.
You can write a StringBuilder to a Stream without materializing the entire string:
StringBuilder stringBuilder = new StringBuilder();
// Write data to StringBuilder...
Stream stream = GetStream(); // Get output stream from somewhere.
using (var streamWriter = new StreamWriter(stream, Encoding.UTF8, leaveOpen: true))
{
foreach (ReadOnlyMemory<char> chunk in stringBuilder.GetChunks())
{
await streamWriter.WriteAsync(chunk);
}
}
N.B. This API (StringBuilder.GetChunks()) is only available in .NET Core 3.0 and above.
If this operation happens frequently, you could further reduce GC pressure by using a StringBuilder object pool.
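A minimal sketch of such a pool, assuming the Microsoft.Extensions.ObjectPool NuGet package (the provider and extension method below come from that package; adapt as needed):
using System.Text;
using Microsoft.Extensions.ObjectPool;

// CreateStringBuilderPool() returns a pool whose policy clears each
// StringBuilder when it is returned, so instances get reused instead of
// creating GC pressure on every operation.
var provider = new DefaultObjectPoolProvider();
ObjectPool<StringBuilder> pool = provider.CreateStringBuilderPool();

StringBuilder stringBuilder = pool.Get();
try
{
    // Write data to stringBuilder and stream it out via GetChunks() as above...
}
finally
{
    pool.Return(stringBuilder); // resets and recycles the builder
}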
If you want to use something like a StringBuilder because it is cleaner to pass around and work with, you can use something like the following StringBuilder alternative I created.
The most important difference is that it allows access to the internal data without having to assemble it into a String or byte array first. This means you don't have to double the memory requirements and risk trying to allocate a contiguous chunk of memory that fits your entire object.
NOTE: I am sure there are better options than using a List<string>() internally, but this was simple and proved to be good enough for my purposes.
public class StringBuilderEx
{
List<string> data = new List<string>();
public void Append(string input)
{
data.Add(input);
}
public void AppendLine(string input)
{
data.Add(input + "\n");
}
public void AppendLine()
{
data.Add("\n");
}
/// <summary>
/// Copies all data to a String.
/// Warning: Will fail with an OutOfMemoryException if the data is too
/// large to fit into a single contiguous string.
/// </summary>
public override string ToString()
{
return String.Join("", data);
}
/// <summary>
/// Process Each section of the data in place. This avoids the
/// memory pressure of exporting everything to another contiguous
/// block of memory before processing.
/// </summary>
public void ForEach(Action<string> processData)
{
foreach (string item in data)
processData(item);
}
}
Now you can dump the entire contents to file using the following code.
var stringData = new StringBuilderEx();
stringData.Append("Add lots of data");
using (StreamWriter file = new System.IO.StreamWriter(localFilename))
{
stringData.ForEach((data) =>
{
file.Write(data);
});
}
I'm probably doing something obviously stupid here. Please point it out!
I have some C# code that is pulling down a bunch of .gz files from SFTP (using the SSH.NET NuGet package - works great!). Each .gz contains only a single .csv file. I want to keep these files in memory without hitting disk (yes, I know, server memory management concerns exist - that's fine, as these files are fairly small), decompress them in memory to extract the CSV file inside, and then return a collection of CSV files in a custom DTO (FtpFile).
My problem is that while my MemoryStream from the SFTP connection has data in it, either my GZipStream never seems to be populated or the copy from the GZipStream to my output MemoryStream is failing. I have tried the more traditional looping over Read with my own buffer, but it had the same results as this code.
Aside from connection details (it connects successfully, so no worries there), here's all of my code:
Logic:
public static List<FtpFile> Foo()
{
var connectionInfo = new ConnectionInfo("example.com",
"username",
new PasswordAuthenticationMethod("username", "password"));
using (var client = new SftpClient(connectionInfo))
{
client.Connect();
var searchResults = client.ListDirectory("/testdir")
.Where(obj => obj.IsRegularFile
&& obj.Name.ToLowerInvariant().StartsWith("test_")
&& obj.Name.ToLowerInvariant().EndsWith(".gz"))
.Take(2)
.ToList();
var fileResults = new List<FtpFile>();
foreach (var file in searchResults)
{
var ftpFile = new FtpFile { FileName = file.Name, FileSize = file.Length };
using (var fileStream = new MemoryStream())
{
client.DownloadFile(file.FullName, fileStream); // Success! All is good here, so far. :)
using (var gzStream = new GZipStream(fileStream, CompressionMode.Decompress))
{
using (var outputStream = new MemoryStream())
{
gzStream.CopyTo(outputStream);
byte[] outputBytes = outputStream.ToArray(); // No data. Sad panda. :'(
ftpFile.FileContents = Encoding.ASCII.GetString(outputBytes);
fileResults.Add(ftpFile);
}
}
}
}
return fileResults;
}
}
FtpFile (just a simple DTO I'm populating):
public class FtpFile
{
public string FileName { get; set; }
public long FileSize { get; set; }
public string FileContents { get; set; }
}
PSA: If anybody comes along and copies this code, be aware that it is NOT good code, in that you could have some serious memory management problems with it! Best practice is to stream to disk instead, which this code does not do. My needs are very specific in that I have to have these files simultaneously in memory for what I'm building with them.
If you are inserting data into the stream, make sure to seek back to its origin before un-gzipping it.
The following should fix your troubles:
using (var fileStream = new MemoryStream())
{
client.DownloadFile(file.FullName, fileStream); // Success! All is good here, so far. :)
fileStream.Seek(0, SeekOrigin.Begin);
using (var gzStream = new GZipStream(fileStream, CompressionMode.Decompress))
{
using (var outputStream = new MemoryStream())
{
gzStream.CopyTo(outputStream);
byte[] outputBytes = outputStream.ToArray(); // Data! The stream was rewound, so the decompressed bytes are here now. :)
ftpFile.FileContents = Encoding.ASCII.GetString(outputBytes);
fileResults.Add(ftpFile);
}
}
}
I am unsure what I am doing wrong. I grab a byte[] (which is emailAttachment.Body) and pass it to the method ExtractZipFile, which converts it to a MemoryStream, unzips it, and returns each entry as a KeyValuePair; I then write each entry to a file using a FileStream.
However, when I go to open the newly created files, there is an error opening them. They cannot be opened.
The below are in the same class
using Ionic.Zip;
var extractedFiles = ExtractZipFile(emailAttachment.Body);
foreach (KeyValuePair<string, MemoryStream> extractedFile in extractedFiles)
{
string FileName = extractedFile.Key;
using (FileStream file = new FileStream(CurrentFileSystem +
FileName.FileFullPath(),FileMode.Create, System.IO.FileAccess.Write))
{
byte[] bytes = new byte[extractedFile.Value.Length];
extractedFile.Value.Read(bytes, 0, (int) extractedFile.Value.Length);
file.Write(bytes,0,bytes.Length);
extractedFile.Value.Close();
}
}
private Dictionary<string, MemoryStream> ExtractZipFile(byte[] messagePart)
{
Dictionary<string, MemoryStream> result = new Dictionary<string,MemoryStream>();
MemoryStream data = new MemoryStream(messagePart);
using (ZipFile zip = ZipFile.Read(data))
{
foreach (ZipEntry ent in zip)
{
MemoryStream memoryStream = new MemoryStream();
ent.Extract(memoryStream);
result.Add(ent.FileName,memoryStream);
}
}
return result;
}
Is there something I am missing? I do not want to save the original zip file just the extracted Files from MemoryStream.
What am I doing wrong?
After writing to your MemoryStream, you're not setting the position back to 0:
MemoryStream memoryStream = new MemoryStream();
ent.Extract(memoryStream);
result.Add(ent.FileName,memoryStream);
Because of this, the stream position will be at the end when you try to read from it, and you'll read nothing. Make sure to rewind it:
memoryStream.Position = 0;
Also, you don't have to handle the copy manually. Just let the CopyTo method take care of it:
extractedFile.Value.CopyTo(file);
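Putting both fixes together, the calling loop might look something like this (a sketch reusing the question's names; rewinding could equally happen inside ExtractZipFile right after Extract):
foreach (KeyValuePair<string, MemoryStream> extractedFile in extractedFiles)
{
    using (FileStream file = new FileStream(CurrentFileSystem + extractedFile.Key.FileFullPath(),
        FileMode.Create, System.IO.FileAccess.Write))
    {
        extractedFile.Value.Position = 0; // rewind before reading
        extractedFile.Value.CopyTo(file); // CopyTo handles the buffering
    }
    extractedFile.Value.Close();
}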
I'd suggest that you clean up your use of MemoryStream in your code.
I agree that calling memoryStream.Position = 0; will allow this code to work correctly, but it's an easy thing to miss when reading and writing memory streams.
It's better to write code that avoids the bug.
Try this:
private IEnumerable<(string Path, byte[] Content)> ExtractZipFile(byte[] messagePart)
{
using (var data = new MemoryStream(messagePart))
{
using (var zipFile = ZipFile.Read(data))
{
foreach (var zipEntry in zipFile)
{
using (var memoryStream = new MemoryStream())
{
zipEntry.Extract(memoryStream);
yield return (Path: zipEntry.FileName, Content: memoryStream.ToArray());
}
}
}
}
}
Then your calling code would look something like this:
foreach (var extractedFile in ExtractZipFile(emailAttachment.Body))
{
File.WriteAllBytes(Path.Combine(CurrentFileSystem, extractedFile.Path.FileFullPath()), extractedFile.Content);
}
It's just a lot less code and a much better chance of avoiding bugs. The number one predictor of bugs in code is the number of lines of code you write.
Since I find it all a lot of code for a simple operation, here's my two cents.
using Ionic.Zip;
using (var s = new MemoryStream(emailAttachment.Body))
using (ZipFile zip = ZipFile.Read(s))
{
foreach (ZipEntry ent in zip)
{
string path = Path.Combine(CurrentFileSystem, ent.FileName.FileFullPath());
using (FileStream file = new FileStream(path, FileMode.Create, FileAccess.Write))
{
ent.Extract(file);
}
}
}
Having some problems with CsvHelper and writing to a memory stream. I've tried flushing the stream writer and setting positions and everything else tried. I figure I've narrowed it down to a really simple test case that obviously fails. What am I doing wrong here?
public OutputFile GetTestFile()
{
using (var ms = new MemoryStream())
using (var sr = new StreamWriter(ms))
using (var csv = new CsvWriter(sr))
{
csv.WriteField("test");
sr.Flush();
return new OutputFile
{
Data = ms.ToArray(),
Length = ms.Length,
DataType = "text/csv",
FileName = "test.csv"
};
}
}
[TestMethod]
public void TestWritingToMemoryStream()
{
var file = GetTestFile();
Assert.IsFalse(file.Data.Length == 0);
}
Editing the correct answer in for people googling, as this corrected code actually passes my test. I have no idea why writing to a StringWriter and then converting it to bytes solves all the crazy flushing issues, but it works now.
using (var sw = new StringWriter())
using (var csvWriter = new CsvWriter(sw, config))
{
csvWriter.WriteRecords(records);
return Encoding.UTF8.GetBytes(sw.ToString());
}
Since CsvHelper is meant to collect several fields per row/line, it does some buffering itself until you tell it the current record is done:
csv.WriteField("test");
csv.NextRecord();
sr.Flush();
Now the MemoryStream should have the data in it. However, unless there is more processing elsewhere, the result in your OutputFile is odd: Data will be a raw byte[] even though DataType says "text/csv". It seems like StringWriter would produce something more appropriate:
string sBuff;
using (StringWriter sw = new StringWriter())
using (CsvWriter csv = new CsvWriter(sw))
{
csv.WriteRecord<SomeItem>(r);
sBuff = sw.ToString();
}
Console.WriteLine(sBuff);
"New Item ",Falcon,7
Using the following code I always get the same hash regardless of the input. Any ideas why that might be?
private static SHA256 sha256;
internal static byte[] HashForCDCR(this string value)
{
byte[] hash;
using (var myStream = new System.IO.MemoryStream())
{
using (var sw = new System.IO.StreamWriter(myStream))
{
sw.Write(value);
hash = sha256.ComputeHash(myStream);
}
}
return hash;
}
You are computing the hash of the empty portion of the stream (the part immediately after the content you wrote with sw.Write), so it is always the same.
Cheap fix: sw.Flush(); myStream.Position = 0;. A better fix is to finish writing and create a new read-only stream for hashing, based on the original stream:
using (var myStream = new System.IO.MemoryStream())
{
using (var sw = new System.IO.StreamWriter(myStream))
{
sw.Write(value);
}
using (var readonlyStream = new MemoryStream(myStream.ToArray(), writable: false))
{
hash = sha256.ComputeHash(readonlyStream);
}
}
You may need to flush your stream. For performance reasons, StreamWriter doesn't write to the stream immediately; it waits for its internal buffer to fill. Flushing the writer immediately pushes the contents of the internal buffer to the underlying stream.
sw.Write(value);
sw.Flush();
myStream.Position = 0;
hash = sha256.ComputeHash(myStream);
I will probably use the solution that Alexei Levenkov called a "cheap fix". However, I did come across one other way to make it work, which I will post for future readers:
var encoding = new System.Text.UTF8Encoding();
var bytes = encoding.GetBytes(value);
var hash = sha256.ComputeHash(bytes);
return hash;
I am working on a class that maintains a dictionary of images.
This dictionary should be saved to and loaded from a file.
I implemented the solution below, but the problem is that, according to the MSDN documentation for Image.FromStream() (http://msdn.microsoft.com/en-us/library/93z9ee4x(v=VS.80).aspx):
"The stream is reset to zero if this method is called successively with the same stream."
Any ideas how to fix this? The speed of loading the dictionary is critical.
class ImageDictionary
{
private Dictionary<string, Image> dict = new Dictionary<string, Image>();
public void AddImage(string resourceName, string filename)
{
//...
}
public Image GetImage(string resourceName)
{
//...
}
public void Save(string filename)
{
var stream = new FileStream(filename, FileMode.Create);
var writer = new BinaryWriter(stream);
writer.Write((Int32) dict.Count);
foreach (string key in dict.Keys)
{
writer.Write(key);
Image img;
dict.TryGetValue(key, out img);
img.Save(stream,System.Drawing.Imaging.ImageFormat.Png);
}
writer.Close();
stream.Close();
}
public void Load(string filename)
{
var stream = new FileStream(filename, FileMode.Open);
var reader = new BinaryReader(stream);
Int32 count = reader.ReadInt32();
dict.Clear();
for (int i = 0; i < count; i++)
{
string key = reader.ReadString();
Image img = Image.FromStream(stream);
dict.Add(key, img);
}
reader.Close();
stream.Close();
}
}
The Image.FromStream method expects a valid image stream. You are concatenating multiple images into a single file, so if you want to reconstruct them you will also need to save their sizes in addition to their number. An easier solution is to simply binary-serialize the image dictionary:
public void Save(string filename)
{
var serializer = new BinaryFormatter();
using (var stream = File.Create(filename))
{
serializer.Serialize(stream, dict);
}
}
public void Load(string filename)
{
var serializer = new BinaryFormatter();
using (var stream = File.Open(filename, FileMode.Open))
{
dict = (Dictionary<string, Image>)serializer.Deserialize(stream);
}
}
You can try to use BinaryFormatter to serialize/deserialize your dictionary dict to/from file.
An off-the-wall idea that might work (I have definitely not tested this):
Create a Substream class that derives from Stream and also wraps an underlying Stream. Its constructor would take a Stream and an offset into that stream that the Substream treats as zero. Substream is basically a constrained view or window into another stream (in this case, your file stream).
Initially, create a Substream over your FileStream with an offset of zero.
When you call Image.FromStream, the position of your Substream will advance to some new position (call this p).
Create a new Substream over your FileStream with an offset of p.
Loop until finished.
The idea is that even if Image.FromStream resets the underlying stream, it will reset the Substream to some offset into the FileStream, which is what you really want.
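A minimal sketch of such a Substream, under the same untested caveat (read-only; only the members Stream requires are overridden):
using System;
using System.IO;

// A constrained window into another stream: position 0 of the Substream maps
// to `origin` in the underlying stream, so any "reset to zero" performed by
// Image.FromStream lands at the start of the current image, not of the file.
class Substream : Stream
{
    private readonly Stream inner;
    private readonly long origin;

    public Substream(Stream inner, long origin)
    {
        this.inner = inner;
        this.origin = origin;
        inner.Position = origin;
    }

    public override bool CanRead => inner.CanRead;
    public override bool CanSeek => inner.CanSeek;
    public override bool CanWrite => false;
    public override long Length => inner.Length - origin;

    public override long Position
    {
        get => inner.Position - origin;         // translate to window coordinates
        set => inner.Position = value + origin;
    }

    public override int Read(byte[] buffer, int index, int count) =>
        inner.Read(buffer, index, count);

    public override long Seek(long offset, SeekOrigin seekOrigin) =>
        seekOrigin == SeekOrigin.Begin
            ? inner.Seek(offset + origin, seekOrigin) - origin
            : inner.Seek(offset, seekOrigin) - origin;

    public override void Flush() => inner.Flush();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int index, int count) => throw new NotSupportedException();
}
In the Load loop, each image would then be read via Image.FromStream(new Substream(stream, stream.Position)), creating a fresh window per image.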
Why not create a custom header for the file that includes the number of images and the starting address of each image, with a separator between each image?
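A sketch of that idea: keep the BinaryWriter/BinaryReader from the question, but prefix each image with its byte length so Load can hand Image.FromStream a stream containing exactly one image (the PNG format choice is carried over from the question):
public void Save(string filename)
{
    using (var stream = new FileStream(filename, FileMode.Create))
    using (var writer = new BinaryWriter(stream))
    {
        writer.Write(dict.Count);
        foreach (var pair in dict)
        {
            writer.Write(pair.Key);
            using (var ms = new MemoryStream())
            {
                pair.Value.Save(ms, System.Drawing.Imaging.ImageFormat.Png);
                writer.Write((int)ms.Length); // length prefix plays the role of the header/separator
                writer.Write(ms.ToArray());
            }
        }
    }
}

public void Load(string filename)
{
    using (var stream = new FileStream(filename, FileMode.Open))
    using (var reader = new BinaryReader(stream))
    {
        dict.Clear();
        int count = reader.ReadInt32();
        for (int i = 0; i < count; i++)
        {
            string key = reader.ReadString();
            int length = reader.ReadInt32();
            // Not disposed on purpose: GDI+ requires the stream backing an
            // Image to stay open for the Image's lifetime.
            var ms = new MemoryStream(reader.ReadBytes(length));
            dict.Add(key, Image.FromStream(ms));
        }
    }
}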