MemoryStream Disposed after using Workbook.Save() method OpenXML - c#

I have been messing with this for hours to no avail. I am trying to copy an Excel file, add a new sheet to it, put the file in a MemoryStream, and then return the stream.
Here is the code:
public Stream ProcessDocument()
{
    var resultStream = new MemoryStream();
    string sourcePath = "path\\to\\file";
    string destinationPath = "path\\to\\file";
    CopyFile(destinationPath, sourcePath);
    var copiedFile = SpreadsheetDocument.Open(destinationPath, true);
    var fileWithSheets = SpreadsheetDocument.Open("path\\to\\file", false);
    AddCopyOfSheet(fileWithSheets.WorkbookPart, copiedFile.WorkbookPart, "foo");
    using (var stream = new MemoryStream())
    {
        copiedFile.WorkbookPart.Workbook.Save(stream);
        stream.Position = 0;
        stream.CopyTo(resultStream);
    }
    return resultStream;
}
public void CopyFile(string outputFullFilePath, string inputFileFullPath)
{
    int bufferSize = 1024 * 1024;
    using (var fileStream = new FileStream(outputFullFilePath, FileMode.OpenOrCreate))
    using (var fs = new FileStream(inputFileFullPath, FileMode.Open, FileAccess.Read))
    {
        fileStream.SetLength(fs.Length);
        int bytesRead;
        byte[] bytes = new byte[bufferSize];
        while ((bytesRead = fs.Read(bytes, 0, bufferSize)) > 0)
        {
            fileStream.Write(bytes, 0, bytesRead);
        }
    }
}
public static void AddCopyOfSheet(WorkbookPart sourceDocument, WorkbookPart destinationDocument, string sheetName)
{
    WorksheetPart sourceSheetPart = GetWorksheetPart(sourceDocument, sheetName);
    destinationDocument.AddPart(sourceSheetPart);
}

public static WorksheetPart GetWorksheetPart(WorkbookPart workbookPart, string sheetName)
{
    string id = workbookPart.Workbook.Descendants<Sheet>().First(x => x.Name.Value.Contains(sheetName)).Id;
    return (WorksheetPart)workbookPart.GetPartById(id);
}
The issue seems to arise from copiedFile.WorkbookPart.Workbook.Save(stream).
After this runs, I get an exception of type System.ObjectDisposedException. The file copies fine, and adding the sheet also seems to work.
Here's what I've tried:
Calling .Save() without a stream parameter. It does nothing.
Using two different streams (hence the resultStream jank left in this code).
Going pure OpenXML and copying the WorkbookParts to a stream directly. This worked with a plain Excel file, but it breaks the desired file, which has advanced formatting that does not seem to survive the round trip. I am open to refactoring if someone knows a workaround.
What I haven't tried:
Creating ANOTHER copy of the copy and using the SpreadsheetDocument.Create(stream, type) method. I have a feeling this would work but it seems like an awful and slow solution.
Updating OpenXML. I am currently on 2.5.
Any feedback or ideas are hugely appreciated. Thank you!
PS: My dev box is airgapped so I had to hand write this code over. Sorry if there are any errors.

Turns out that copiedFile.WorkbookPart.Workbook.Save(stream); disposes the stream it is given. The workaround was to subclass MemoryStream and override Dispose so disposal can be suppressed, like so:
public class DisposeLockableMemoryStream : MemoryStream
{
    public DisposeLockableMemoryStream(bool allowDispose)
    {
        AllowDispose = allowDispose;
    }

    public bool AllowDispose { get; set; }

    protected override void Dispose(bool disposing)
    {
        if (!AllowDispose)
            return;
        base.Dispose(disposing);
    }
}
All you need to do is construct it with allowDispose: false so the Save call can't dispose it, then set stream.AllowDispose = true and dispose of it yourself once you're done.
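For example, a minimal sketch of how it plugs into the original method (untested; the DisposeLockableMemoryStream usage is the only new part):

var stream = new DisposeLockableMemoryStream(allowDispose: false);
copiedFile.WorkbookPart.Workbook.Save(stream); // the internal Dispose call is now a no-op
stream.Position = 0;                           // still usable, since Dispose did nothing
stream.CopyTo(resultStream);
stream.AllowDispose = true;
stream.Dispose();                              // now it actually releases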
Now, this didn't really fix my code, because it turns out that .Save() only persists changes made to the loaded document, not the entire thing! Basically, this library is hot garbage and I regret signing up for this story to begin with.
For more information, see a post I made on r/csharp.
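For anyone landing here later: a pattern that usually sidesteps all of this is to load the whole file into a MemoryStream you own, open the SpreadsheetDocument directly on that stream in editable mode, and let disposing the document flush the complete package back into the stream. A sketch only (same placeholder path as the question, sheet-copying elided):

public Stream ProcessDocument()
{
    var resultStream = new MemoryStream();

    // Load the source file into a stream we control.
    using (var file = File.OpenRead("path\\to\\file"))
        file.CopyTo(resultStream);
    resultStream.Position = 0;

    // Open the package directly on the stream, in editable mode.
    using (var doc = SpreadsheetDocument.Open(resultStream, true))
    {
        // ... add the copied sheet here ...
    } // disposing writes the complete package back into resultStream

    resultStream.Position = 0;
    return resultStream;
}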

Related

How can you benefit from the new System.Buffers (Span, Memory) when doing File I/O and parsing

I'm currently looking at the new System.Buffers with Span, ReadOnlySpan, Memory, ReadOnlyMemory, ...
I understand that passing e.g. a ReadOnlySpan (ROS) can reduce heap allocations in many cases and make code perform better. Most examples regarding Span show the .AsSpan() and .Slice(...) calls, but that's it.
Once I have my data (e.g. a byte[]), I can create a Span or ReadOnlySpan from it and pass that to several methods/classes inside my library.
But how can file I/O be written using System.Buffers (Span/Memory/...)?
I've created two small (partial) examples to demonstrate the situation.
// Example 1:
using (var br = new BinaryReader(File.OpenRead(pathToFile)))
{
    ReadFile(br);
}

private void ReadFile(BinaryReader br)
{
    ParseHeader(br);
}

private void ParseHeader(BinaryReader br)
{
    br.ReadBytes(...);
    br.ReadInt32();
    // ...
}
and
// Example 2:
public Foo GetFileAsFoo(string path)
{
    using (var s = new FileStream(path, FileMode.Open, FileAccess.Read))
    {
        return ReadAndGetFoo(s);
    }
}

public Foo ReadAndGetFoo(Stream file)
{
    // copy to a MemoryStream, as FileStream byte-per-byte I/O is slow
    var ms = new MemoryStream();
    file.CopyTo(ms);
    ms.Position = 0;
    Parser p = new Parser(ms);
    p.Read();
    return p.GetFoo();
}

public class Parser
{
    private readonly Stream _s;

    public Parser(Stream stream)
    {
        _s = stream;
    }

    int Peek()
    {
        if (_s.Position >= _s.Length) return -1;
        int r = _s.ReadByte();
        _s.Seek(-1, SeekOrigin.Current);
        return r;
    }

    public void Read()
    {
        // logic here
    }

    public Foo GetFoo()
    {
        // ...
        return _Foo;
    }

    // other methods to parse
}
The question for the first example is mainly how I can get a ReadOnlySpan/Memory/...(?) from:
using (var br = new BinaryReader(File.OpenRead(pathToFile)))
I'm aware of System.Buffers.Binary.BinaryPrimitives as a replacement for the BinaryReader, but it requires a ReadOnlySpan. How would I get my data from File.OpenRead as a span in the first place, similar to what I do with the BinaryReader?
What options are available?
I guess there is no class around BinaryPrimitives that keeps track of the 'position' the way a BinaryReader does?
https://learn.microsoft.com/en-us/dotnet/api/system.buffers.binary.binaryprimitives?view=net-5.0
The second example works on a Stream instead of a BinaryReader.
To keep it efficient, I first copy to a MemoryStream to reduce the I/O. (I know this one-time copy is slow and should be avoided.)
How could this file-as-a-Stream be read using System.Buffers (Span/Memory/...)?
(It is currently read and parsed byte by byte using Stream.ReadByte().)
Hope to learn something!
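No answer was captured with this question, but one common pattern covers both examples: read the file into a byte[] once, then parse it as a ReadOnlySpan<byte> with BinaryPrimitives, tracking the position yourself by re-slicing. A sketch only; SpanReader and the field layout are made up for illustration (newer runtimes also ship System.Buffers.SequenceReader<byte>, which plays this position-tracking role for ReadOnlySequence<byte>):

using System;
using System.Buffers.Binary;
using System.IO;

// Hypothetical position-tracking reader over a span; the "position" advances by re-slicing.
ref struct SpanReader
{
    private ReadOnlySpan<byte> _remaining;

    public SpanReader(ReadOnlySpan<byte> data) => _remaining = data;

    public int ReadInt32()
    {
        int value = BinaryPrimitives.ReadInt32LittleEndian(_remaining);
        _remaining = _remaining.Slice(sizeof(int));
        return value;
    }

    public ReadOnlySpan<byte> ReadBytes(int count)
    {
        ReadOnlySpan<byte> slice = _remaining.Slice(0, count);
        _remaining = _remaining.Slice(count);
        return slice;
    }
}

class Demo
{
    static void Main()
    {
        // One file read up front; everything after this is allocation-free slicing.
        byte[] data = File.ReadAllBytes("pathToFile");
        var reader = new SpanReader(data);
        int magic = reader.ReadInt32();                   // hypothetical header field
        ReadOnlySpan<byte> payload = reader.ReadBytes(8); // hypothetical payload
        Console.WriteLine($"{magic}, {payload.Length} bytes");
    }
}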

Why can't I append a new byte array to a file using FileStream? [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 1 year ago.
I am trying to write a byte array to a file multiple times. The FileMode is set to Append, which should create the file if it doesn't exist, and otherwise open it and seek to its end as documented. The problem is that when writing to the existing file, it gets overwritten rather than having the new byte array appended. That's all there is to it.
void WriteToFile()
{
    byte[] buffer = new byte[16 * 1024];
    int num;
    using (FileStream dest_stream = new FileStream(filename, FileMode.Append, FileAccess.Write))
    {
        while ((num = ReadFromAnotherStream(my_other_stream, ref buffer)) > 0)
            dest_stream.Write(buffer, 0, num);
    }
}
This function will be called occasionally. If the file already exists, it should seek to the end and continue writing from there; otherwise it should create a new file and write the data.
Instead of appending, it overwrites.
No error is thrown.
Calling Seek on the FileStream does nothing.
The data that overwrites is correct; it just needs to be appended after the previous data instead of overwriting it.
UPDATE: Well, I had no choice but to split each call into multiple "temp" files and merge them into the main file at the end. That worked flawlessly; no seeking was required, and the files were not corrupted.
A downside is the extra processing of merging multiple temp files (especially large ones) into one.
Pseudo-code:
string filename;
List<string> tmp_files = new List<string>();
int __i = 0;
do
{
filename = $"my_file.tmp{__i++}";
tmp_files.Add(filename);
}
while (File.Exists(filename));
// Every time WriteToFile() gets called... This should be on top.
// Instead of writing directly to the file, add more of them until the input stream has been read fully.
using (FileStream out_fs = new FileStream("my_file.bin", FileMode.Create))
{
foreach (string tmp_file in tmp_files)
{
using (FileStream in_fs = new FileStream(tmp_file, FileMode.Open))
{
in_fs.CopyTo(out_fs);
}
File.Delete(tmp_file);
}
}
First of all, thank you to everyone who took part in this thread. Part of the code could not be shared, and that was beyond my control. I understand that there is no magic ball out there to read minds from pseudo-code, but I was desperate to work around this baffling behavior, so I wanted to gather as many possibilities as I could.
I still don't know what the issue is with Append, but the input stream has absolutely nothing to do with it, and that's out of the way for now.
We can't tell what is wrong with your code because you didn't include the code for ReadFromAnotherStream. But here is some code that does what you want, and it works:
/// <summary>
/// Appends the contents of the file at inputFilePath to the file at pathToAppendTo
/// </summary>
void Append(string inputFilePath, string pathToAppendTo)
{
    var buffer = new byte[16];
    using FileStream source = new FileStream(inputFilePath, FileMode.Open, FileAccess.Read);
    using FileStream destinationStream = new FileStream(pathToAppendTo, FileMode.Append, FileAccess.Write);
    while (TryRead(source, buffer, out int bytes))
    {
        destinationStream.Write(buffer, 0, bytes);
    }
}

private bool TryRead(FileStream source, byte[] buffer, out int bytesRead)
{
    bytesRead = source.Read(buffer, 0, buffer.Length);
    return bytesRead > 0;
}
Here is a unit test to verify that it works:
[TestMethod]
public void TestWriteFile()
{
    var inputFileName = "C:/fileYouWantToCopy";
    var outputFileName = "C:/fileYouWantToAppendTo";
    var originalFileBytes = GetFileLength(outputFileName);
    var additionalBytes = GetFileLength(inputFileName);
    Append(inputFileName, outputFileName);
    Assert.AreEqual(GetFileLength(outputFileName), originalFileBytes + additionalBytes);
}

private long GetFileLength(string path) => new FileInfo(path).Length;

Creating and updating a zipfile

Following the documentation, I'm having an extremely difficult time getting this to work. Using ZipFile, I want to create a zip in memory and then be able to update it. On each successive call to update it, the zip reports that it has 0 entries.
What am I doing wrong?
public void AddFile(MemoryStream zipStream, Stream file, string fileName)
{
    // On the first call, zipStream is just an empty stream
    var zip = ZipFile.Create(zipStream);
    zip.BeginUpdate();
    zip.Add(new ZipDataSource(file), fileName);
    zip.CommitUpdate();
    zip.IsStreamOwner = false;
    zip.Close();
    zipStream.Position = 0;
}
public Stream GetFile(Stream zipStream, string pathAndName)
{
    var zip = ZipFile.Create(zipStream);
    zip.IsStreamOwner = false;
    foreach (ZipEntry hi in zip) // count is 0
    {
    }
    var entry = zip.GetEntry(pathAndName);
    return entry == null ? null : zip.GetInputStream(entry);
}
The custom data source
public class ZipDataSource : IStaticDataSource
{
    private Stream _stream;

    public ZipDataSource(Stream stream)
    {
        _stream = stream;
    }

    public Stream GetSource()
    {
        _stream.Position = 0;
        return _stream;
    }
}
ZipFile.Create(zipStream) is not just a convenient static accessor, as one might think: only use it the very first time you create a zip. When opening an existing zip, you need to use var zip = new ZipFile(zipStream) instead.
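A sketch of the corrected reader, assuming the same method shape as the question:

public Stream GetFile(Stream zipStream, string pathAndName)
{
    var zip = new ZipFile(zipStream); // opens the existing archive and reads its entries
    zip.IsStreamOwner = false;
    var entry = zip.GetEntry(pathAndName);
    return entry == null ? null : zip.GetInputStream(entry);
}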
I've personally had many issues with this library in the past and would suggest that anyone looking for a good zip library choose something other than SharpZipLib... The API just plain sucks.

C# .NET Why Stream.Seek is required when unzipping stream

I'm working on a project where I need the ability to zip and unzip streams and byte arrays. I was running some unit tests that create the zip from a stream and then unzip it, and the only way DotNetZip sees the result as a zip is if I run streamToZip.Seek(0, SeekOrigin.Begin) and streamToZip.Flush(). If I don't do this, I get the error "Cannot read Block, No data" on ZipFile.Read(stream).
I was wondering if anyone could explain why that is. I've seen a few articles on using Seek to set the relative read position, but none that really explain why it is required in this situation.
Here is my Code:
Zipping the Object:
public Stream ZipObject(Stream data)
{
    var output = new MemoryStream();
    using (var zip = new ZipFile())
    {
        zip.AddEntry(Name, data);
        zip.Save(output);
        FlushStream(output);
        ZippedItem = output;
    }
    return output;
}
Unzipping the Object:
public List<Stream> UnZipObject(Stream data)
{
    FlushStream(data); // This is what I had to add in to make it work
    using (var zip = ZipFile.Read(data))
    {
        foreach (var item in zip)
        {
            var newStream = new MemoryStream();
            item.Extract(newStream);
            UnZippedItems.Add(newStream);
        }
    }
    return UnZippedItems;
}
Flush method I had to add:
private static void FlushStream(Stream stream)
{
    stream.Seek(0, SeekOrigin.Begin);
    stream.Flush();
}
When you return output from ZipObject, that stream is at the end - you've just written the data. You need to "rewind" it so that the data can then be read. Imagine you had a video cassette, and had just recorded a program - you'd need to rewind it before you watched it, right? It's exactly the same here.
I would suggest doing this in ZipObject itself though - and I don't believe the Flush call is necessary. I'd personally use the Position property, too:
public Stream ZipObject(Stream data)
{
    var output = new MemoryStream();
    using (var zip = new ZipFile())
    {
        zip.AddEntry(Name, data);
        zip.Save(output);
    }
    output.Position = 0;
    return output;
}
When you write to a stream, its position changes. If you want to decompress the same stream object, you'll need to reset the position first; otherwise you'll get an EndOfStreamException, because ZipFile.Read starts reading at stream.Position.
So
stream.Seek(0, SeekOrigin.Begin);
Or
stream.Position = 0;
would do the trick.
Off topic, but useful:
public IEnumerable<Stream> UnZipObject(Stream data)
{
    using (var zip = ZipFile.Read(data))
    {
        foreach (var item in zip)
        {
            var newStream = new MemoryStream();
            item.Extract(newStream);
            newStream.Position = 0;
            yield return newStream;
        }
    }
}
This won't unzip all items into memory at once (unlike the MemoryStream list used in UnZipObject()); items are extracted only as you iterate, because they are yielded (the method returns an IEnumerable<Stream>). More info on yield: http://msdn.microsoft.com/en-us/library/vstudio/9k7k7cf0.aspx
Normally I wouldn't recommend returning data as streams, because a stream is something like an iterator (using .Position as its current position), so it isn't thread-safe by default. I'd rather return each item's bytes via ToArray().
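A sketch of that suggestion (the method name is mine; this assumes DotNetZip as in the question): materialize each entry as a byte[] instead of handing out live streams with shared Position state.

public IEnumerable<byte[]> UnZipObjectAsArrays(Stream data)
{
    using (var zip = ZipFile.Read(data))
    {
        foreach (var item in zip)
        {
            using (var ms = new MemoryStream())
            {
                item.Extract(ms);
                yield return ms.ToArray(); // copies the buffer; the array is safe to share
            }
        }
    }
}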

string serialization and deserialization problem

I'm trying to serialize/deserialize string. Using the code:
private byte[] StrToBytes(string str)
{
    BinaryFormatter bf = new BinaryFormatter();
    MemoryStream ms = new MemoryStream();
    bf.Serialize(ms, str);
    ms.Seek(0, 0);
    return ms.ToArray();
}

private string BytesToStr(byte[] bytes)
{
    BinaryFormatter bfx = new BinaryFormatter();
    MemoryStream msx = new MemoryStream();
    msx.Write(bytes, 0, bytes.Length);
    msx.Seek(0, 0);
    return Convert.ToString(bfx.Deserialize(msx));
}
These two methods work fine if I play with string variables.
But if I serialize a string and save it to a file, then read it back and deserialize it, I end up with only the first portion of the string.
So I believe I have a problem with my file save/read operations. Here is the code for my save/read:
private byte[] ReadWhole(string fileName)
{
    try
    {
        using (BinaryReader br = new BinaryReader(new FileStream(fileName, FileMode.Open)))
        {
            return br.ReadBytes((int)br.BaseStream.Length);
        }
    }
    catch (Exception)
    {
        return null;
    }
}

private void WriteWhole(byte[] wrt, string fileName, bool append)
{
    FileMode fm = FileMode.OpenOrCreate;
    if (append)
        fm = FileMode.Append;
    using (BinaryWriter bw = new BinaryWriter(new FileStream(fileName, fm)))
    {
        bw.Write(wrt);
    }
    return;
}
Any help will be appreciated.
Many thanks
Sample Problematic Run:
WriteWhole(StrToBytes("First portion of text"),"filename",true);
WriteWhole(StrToBytes("Second portion of text"),"filename",true);
byte[] readBytes = ReadWhole("filename");
string deserializedStr = BytesToStr(readBytes); // here deserializeddStr becomes "First portion of text"
Just use
Encoding.UTF8.GetBytes(string s)
Encoding.UTF8.GetString(byte[] b)
and don't forget to add System.Text in your using statements
BTW, why do you need to serialize a string and save it that way?
You can just use File.WriteAllText() or File.WriteAllBytes(). The same way, you can read it back with File.ReadAllBytes() or File.ReadAllText().
The problem is that you are writing two strings to the file, but only reading one back.
If you want to read back multiple strings, then you must deserialize multiple strings. If there are always two strings, then you can just deserialize two strings. If you want to store any number of strings, then you must first store how many strings there are, so that you can control the deserialization process.
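For example, a minimal sketch of the count-then-strings approach (method names are illustrative; this uses BinaryWriter, which length-prefixes each string, rather than BinaryFormatter):

private void WriteStrings(string fileName, IList<string> strings)
{
    using (var bw = new BinaryWriter(File.Create(fileName)))
    {
        bw.Write(strings.Count); // store how many strings follow
        foreach (string s in strings)
            bw.Write(s);         // each string is written with its own length prefix
    }
}

private string[] ReadStrings(string fileName)
{
    using (var br = new BinaryReader(File.OpenRead(fileName)))
    {
        var result = new string[br.ReadInt32()];
        for (int i = 0; i < result.Length; i++)
            result[i] = br.ReadString();
        return result;
    }
}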
If you are trying to hide data (as indicated by your comment to another answer), then this is not a reliable way to accomplish that goal. On the other hand, if you are storing data on a user's hard drive, and the user is running your program on their local machine, then there is no way to hide the data from them, so this is as good as anything else.
