Contents of zip file are truncated - c#

I want to write an XML string to a zip file using C#:
var myXmlString = "..."; // Contains some XML
// Set up zip archive
var archive = ZipFile.Open(FileName, ZipArchiveMode.Create);
var stream = archive.CreateEntry("myfile.xml").Open();
var sw = new StreamWriter(stream, Encoding.UTF8);
// Write data
sw.Write(myXmlString);
// Cleanup
stream.Flush();
stream.Close();
archive.Dispose();
This works fine except for one detail: the text in the zipped file is truncated, so the XML is simply cut off at some point. If I extract the .xml file from the zip, it is 7,171 bytes. The XML string I wrote into the file is 7,404 bytes long.
Can anyone help me find out where the missing bytes went?
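One likely cause, offered as a hedged guess rather than a confirmed diagnosis: `stream.Flush()` flushes the entry stream, but the `StreamWriter` keeps its own buffer and is never flushed or disposed, so the tail of the text may never reach the archive. A minimal sketch, assuming the same System.IO.Compression API as in the question; the `ZipXmlWriter` class, its method names, and its parameters are hypothetical:

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class ZipXmlWriter
{
    // Hypothetical helper: writes one XML string as a single entry of a new zip file.
    public static void WriteXmlToZip(string zipPath, string entryName, string xml)
    {
        // Disposal runs in reverse order: the writer flushes its buffer into
        // the entry stream, then the entry closes, then the archive is finalized.
        using (var archive = ZipFile.Open(zipPath, ZipArchiveMode.Create))
        using (var stream = archive.CreateEntry(entryName).Open())
        using (var writer = new StreamWriter(stream, Encoding.UTF8))
        {
            writer.Write(xml);
        }
    }

    // Reads an entry back as text so the round trip can be checked.
    public static string ReadEntry(string zipPath, string entryName)
    {
        using (var archive = ZipFile.OpenRead(zipPath))
        using (var reader = new StreamReader(archive.GetEntry(entryName).Open(), Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

If the original structure has to stay, calling `sw.Flush()` (or disposing `sw`) before closing the stream should have the same effect.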

Related

Is it possible to create a password-protected zip file without first creating the file and then zipping it?

I am writing data to a text file using the code below:
await using var file = new StreamWriter(filePath);
foreach (var packet in resultPackets)
{
file.WriteLine(JsonConvert.SerializeObject(packet));
}
And I am using the code below to zip the file with password protection using DotNetZip:
using (ZipFile zip = new ZipFile())
{
    zip.Password = "password";
    zip.AddFile(filePath);
    zip.Save(@"C:\tmp\data4.zip");
}
Is there a way to combine both? I want to create the password-protected file on the fly. I don't want to first create a file with the data, then create a zip file from it, and then delete the base file.
Is this possible? Thanks!
Okay, so since this is still unanswered, here's a small program that does the job for me:
using (var stream = new MemoryStream())
using (var streamWriter = new StreamWriter(stream))
{
    // Insert your code in here, i.e.
    //foreach (var packet in resultPackets)
    //{
    //    streamWriter.WriteLine(JsonConvert.SerializeObject(packet));
    //}
    // ... instead I write a simple string.
    streamWriter.Write("Hello World!");
    // Make sure the contents from the StreamWriter are actually flushed into the stream, then seek the beginning of the stream.
    streamWriter.Flush();
    stream.Seek(0, SeekOrigin.Begin);
    using (ZipFile zip = new ZipFile())
    {
        zip.Password = "password";
        // Write the contents of the stream into a file that is called "test.txt"
        zip.AddEntry("test.txt", stream);
        // Save the archive.
        zip.Save("test.zip");
    }
}
Note how AddEntry does not create any form of temporary file. Instead, when the archive is saved, the contents of the stream are read and put into a compressed file within the archive. However, be aware that the whole content of the file is kept in memory until the archive is written to disk.

OutOfMemory exception when trying to download multiple files as a Zip file using Ionic.Zip dll

This is my working code for downloading multiple files as a zip file using the Ionic.Zip dll. The file contents are stored in a SQL database. The program works if I download 1-2 files at a time, but throws an OutOfMemory exception if I try to download many files, as some of them may be very large.
The exception occurs when it is trying to write to outputStream.
How can I improve this code to download multiple files, or is there a better way to download multiple files one by one rather than zipping them into one large file?
Code:
public ActionResult DownloadMultipleFiles()
{
    string connectionString = "MY DB CONNECTION STRING";
    List<Document> documents = new List<Document>();
    var query = "MY LIST OF FILES - FILE METADATA LIKE FILEID, FILENAME";
    documents = query.Query<Document>(connectionString).ToList();
    List<Document> DOCS = documents.GetRange(0, 50); // 50 FILES
    Response.Clear();
    var outputStream = new MemoryStream();
    using (var zip = new ZipFile())
    {
        foreach (var doc in DOCS)
        {
            Stream stream = new MemoryStream();
            byte[] content = GetFileContent(doc.FileContentId); // This method returns file content
            stream.Write(content, 0, content.Length);
            zip.UseZip64WhenSaving = Zip64Option.AsNecessary; // edited
            zip.AddEntry(doc.FileName, content);
        }
        zip.Save(outputStream);
    }
    outputStream.Position = 0; // rewind so the response starts at the beginning of the archive
    return File(outputStream, "application/zip", "allFiles.zip");
}
Download the files to disc instead of to memory, then use Ionic to zip them from disc. This means you don't need to have all the files in memory at once.
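A minimal sketch of that suggestion, deliberately swapping in the built-in System.IO.Compression classes instead of Ionic.Zip. The `DiskZipper` class, its parameters, and the `loadContent` callback (standing in for `GetFileContent`) are hypothetical, and the target zip path is assumed not to exist yet:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class DiskZipper
{
    // Hypothetical helper: writes each file to a temp folder one at a time,
    // then zips the folder, so only one file's bytes are held in memory at once.
    public static string ZipViaDisk(
        IEnumerable<KeyValuePair<string, int>> files, // file name -> content id (assumed shape)
        Func<int, byte[]> loadContent,                // stands in for GetFileContent
        string zipPath)
    {
        string tempDir = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
        Directory.CreateDirectory(tempDir);
        try
        {
            foreach (var file in files)
            {
                // Path.GetFileName drops any directory part of the stored name.
                string target = Path.Combine(tempDir, Path.GetFileName(file.Key));
                File.WriteAllBytes(target, loadContent(file.Value));
            }
            // Compresses the files from disk into an archive that is also on disk.
            ZipFile.CreateFromDirectory(tempDir, zipPath);
        }
        finally
        {
            Directory.Delete(tempDir, recursive: true);
        }
        return zipPath;
    }
}
```

The action could then serve the zip from disk (for example with a path-based file result) instead of returning a MemoryStream.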

Read inside .DAT file using C#

I have a .DAT file from SharePoint. To recover some of the data, I need to read the .DAT file using C#.
Some of the options are
StreamReader objInput = new StreamReader(filename, System.Text.Encoding.Default);
string contents = objInput.ReadToEnd().Trim();
string[] split = System.Text.RegularExpressions.Regex.Split(contents, "\\s+", RegexOptions.None);
foreach (string s in split)
{
Console.WriteLine(s);
}
or
//ObjectToSerialize objectToSerialize;
//Stream stream = File.Open(filename, FileMode.Open);
//BinaryFormatter bFormatter = new BinaryFormatter();
//objectToSerialize = (ObjectToSerialize)bFormatter.Deserialize(stream);
//stream.Close();
.../
The problem is that the DAT file may contain XML files, Doc files, PPT files, or others. I just want to list all the data and files inside the .DAT file.
Is there any way I can do this in C#?
You can read a .dat file using C#, but how you read it depends on the structure of the data inside the file.
Take a look at these links:
How-read-data-from-DAT-file-using-C
how-can-i-read-data-from-dat-files
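Since the right reader depends on what the file actually contains, a first step can be to inspect its leading bytes. A minimal sketch, assuming the .DAT is a single file in a common container format; it does not parse any SharePoint-specific layout, and the `DatSniffer` class and its return labels are hypothetical:

```csharp
using System;
using System.IO;
using System.Text;

public static class DatSniffer
{
    // Guesses the container type from well-known file signatures.
    public static string Guess(byte[] head)
    {
        // ZIP local file header "PK\x03\x04": also used by .docx/.pptx/.xlsx.
        if (StartsWith(head, new byte[] { 0x50, 0x4B, 0x03, 0x04 }))
            return "zip";
        // OLE2 compound file: legacy .doc/.ppt/.xls.
        if (StartsWith(head, new byte[] { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 }))
            return "ole2";
        // Text that begins with an XML declaration (optionally after a UTF-8 BOM).
        string text = Encoding.UTF8.GetString(head).TrimStart('\uFEFF', ' ', '\r', '\n', '\t');
        if (text.StartsWith("<?xml", StringComparison.Ordinal))
            return "xml";
        return "unknown";
    }

    // Reads the first bytes of a file and classifies them.
    public static string GuessFile(string path)
    {
        var head = new byte[16];
        using (var fs = File.OpenRead(path))
        {
            int read = fs.Read(head, 0, head.Length);
            Array.Resize(ref head, read);
        }
        return Guess(head);
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
            if (data[i] != prefix[i]) return false;
        return true;
    }
}
```

A "zip" result could then be listed with a zip library, while "unknown" means the structure has to be learned from whatever produced the file.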

Using updateEntry() method with dotnetzip won't overwrite files correctly

I've been having a bit of a problem lately. I've been trying to extract one zip file into a memory stream and then from that stream, use the updateEntry() method to add it to the destination zip file.
The problem is, when the file in the stream is being put into the destination zip, it works if the file is not already in the zip. If there is a file with the same name, it does not overwrite correctly. It says on the dotnetzip docs that this method will overwrite files that are present in the zip with the same name but it does not seem to work. It will write correctly but when I go to check the zip, the files that are supposed to be overwritten have a compressed byte size of 0 meaning something went wrong.
I'm attaching my code below to show you what I'm doing:
ZipFile zipnew = new ZipFile(forgeFile);
ZipFile zipold = new ZipFile(zFile);
using(zipnew) {
    foreach(ZipEntry zenew in zipnew) {
        percent = (current / zipnew.Count) * 100;
        string flna = zenew.FileName;
        var fstream = new MemoryStream();
        zenew.Extract(fstream);
        fstream.Seek(0, SeekOrigin.Begin);
        using(zipold) {
            var zn = zipold.UpdateEntry(flna, fstream);
            zipold.Save();
            fstream.Dispose();
        }
        current++;
    }
    zipnew.Dispose();
}
Although it might be a bit slow, I found a solution by manually deleting and adding in the file. I'll leave the code here in case anyone else comes across this problem.
ZipFile zipnew = new ZipFile(forgeFile);
ZipFile zipold = new ZipFile(zFile);
using(zipnew) {
    // Loop through each entry in the zip file
    foreach(ZipEntry zenew in zipnew) {
        string flna = zenew.FileName;
        // Create a new memory stream for extracted files
        var ms = new MemoryStream();
        // Extract entry into the memory stream
        zenew.Extract(ms);
        ms.Seek(0, SeekOrigin.Begin); // Rewind the memory stream
        using(zipold) {
            // Remove existing entry first
            try {
                zipold.RemoveEntry(flna);
                zipold.Save();
            }
            catch (System.Exception ex) {} // Ignore if there is nothing found
            // Add in the new entry
            var zn = zipold.AddEntry(flna, ms);
            zipold.Save(); // Save the zip file with the newly added file
            ms.Dispose(); // Dispose of the stream so resources are released
        }
    }
    zipnew.Dispose(); // Close the zip file
}

How can I form a Word document using stream of bytes

I have a stream of bytes which actually (if put right) will form a valid Word file, I need to convert this stream into a Word file without writing it to disk, I take the original stream from SQL Server database table:
ID Name FileData
----------------------------------------
1 Word1 292jf2jf2ofm29fj29fj29fj29f2jf29efj29fj2f9 (actual file data)
the FileData field carries the data.
Microsoft.Office.Interop.Word.Application word = new Microsoft.Office.Interop.Word.Application();
Microsoft.Office.Interop.Word.Document doc = new Microsoft.Office.Interop.Word.Document();
doc = word.Documents.Open(@"C:\SampleText.doc");
doc.Activate();
The above code opens and fills a Word document from the file system. I don't want that; I want to define a new Microsoft.Office.Interop.Word.Document and fill its content manually from the byte stream.
After getting the in-memory Word document, I want to do some parsing of keywords.
Any ideas?
Create an in-memory file system; there are drivers for that.
Give word a path to an ftp server path (or something else) which you then use to push the data.
One important thing to note: storing files in a database is generally not good design.
You could look at how Sharepoint solves this. They have created a web interface for documents stored in their database.
It's not that hard to create or embed a web server in your application that can serve pages to Word. You don't even have to use the standard ports.
There probably isn't any straightforward way of doing this. I found a couple of solutions searching for it:
Use the OpenOffice SDK to manipulate the document instead of Word Interop
Write the data to the clipboard, and then from the clipboard to Word
I don't know if this does it for you, but apparently the API doesn't provide what you're after (unfortunately).
There are really only 2 ways to open a Word document programmatically - as a physical file or as a stream. There's a "package", but that's not really applicable.
The stream method is covered here: https://learn.microsoft.com/en-us/office/open-xml/how-to-open-a-word-processing-document-from-a-stream
But even it relies on there being a physical file in order to form the stream:
string strDoc = @"C:\Users\Public\Public Documents\Word13.docx";
Stream stream = File.Open(strDoc, FileMode.Open);
The best solution I can offer would be to write the file out to a temp location where the service account for the application has permission to write:
string newDocument = @"C:\temp\test.docx";
WriteFile(byteArray, newDocument);
If it didn't have permissions on the "temp" folder in my example, you would simply just add the service account of your application (application pool, if it's a website) to have Full Control of the folder.
You'd use this WriteFile() function:
/// <summary>
/// Write a byte[] to a new file at the location you choose
/// </summary>
/// <param name="byteArray">byte[] that consists of file data</param>
/// <param name="newDocument">Path to where the new document will be written</param>
public static void WriteFile(byte[] byteArray, string newDocument)
{
    // Save the file with the new name
    File.WriteAllBytes(newDocument, byteArray);
}
From there, you can open it with OpenXML and edit the file. There's no way to open a Word document in byte[] form directly into an instance of Word - Interop, OpenXML, or otherwise - because you need a documentPath, or the stream method mentioned earlier that relies on there being a physical file. You can edit the document by reading its main part into a string and then either loading that string as XML or editing the string directly:
string docText = null;
byte[] byteArray = null;
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(documentPath, true))
{
    using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
    {
        docText = sr.ReadToEnd(); // <-- converts byte[] stream to string
    }
    // Play with the XML
    XmlDocument xml = new XmlDocument();
    xml.LoadXml(docText); // the string contains the XML of the Word document
    XmlNodeList nodes = xml.GetElementsByTagName("w:body");
    XmlNode chiefBodyNode = nodes[0];
    // add paragraphs with AppendChild...
    // remove a node by getting a ChildNode and removing it, like this...
    XmlNode firstParagraph = chiefBodyNode.ChildNodes[2];
    chiefBodyNode.RemoveChild(firstParagraph);
    // Or play with the string form
    docText = docText.Replace("John","Joe");
    // If you manipulated the XML, write it back to the string
    //docText = xml.OuterXml; // comment out the line above if XML edits are all you want to do, and uncomment out this line
    // Save the file - yes, back to the file system - required
    using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
    {
        sw.Write(docText);
    }
}
// Read it back in as bytes
byteArray = File.ReadAllBytes(documentPath); // new bytes, ready for DB saving
Reference:
https://learn.microsoft.com/en-us/office/open-xml/how-to-search-and-replace-text-in-a-document-part
I know it's not ideal, but I have searched and not found a way to edit the byte[] directly without a conversion that involves writing out the file, opening it in Word for the edits, then essentially re-uploading it to recover the new bytes. Doing byte[] byteArray = Encoding.UTF8.GetBytes(docText); prior to re-reading the file will corrupt them, as would any other Encoding I tried (UTF7,Default,Unicode, ASCII), as I found when I tried to write them back out using my WriteFile() function, above, in that last line. When not encoded and simply collected using File.ReadAllBytes(), and then writing the bytes back out using WriteFile(), it worked fine.
Update:
It might be possible to manipulate the bytes like this:
//byte[] byteArray = File.ReadAllBytes("Test.docx"); // you might be able to assign your bytes here, instead of from a file?
byte[] byteArray = GetByteArrayFromDatabase(fileId); // function you have for getting the document from the database
// Declared outside the using block so it is still in scope for the final read.
// You would update "documentPath" to a new name first...
string documentPath = @"C:\temp\newDoc.docx";
using (MemoryStream mem = new MemoryStream())
{
    mem.Write(byteArray, 0, byteArray.Length);
    using (WordprocessingDocument wordDoc =
        WordprocessingDocument.Open(mem, true))
    {
        // do your updates -- see string or XML edits, above
        // Once done, you may need to save the changes....
        //wordDoc.MainDocumentPart.Document.Save();
    }
    // But you will still need to save it to the file system here....
    using (FileStream fileStream = new FileStream(documentPath,
        System.IO.FileMode.CreateNew))
    {
        mem.WriteTo(fileStream);
    }
}
// And then read the bytes back in, to save it to the database
byteArray = File.ReadAllBytes(documentPath); // new bytes, ready for DB saving
Reference:
https://learn.microsoft.com/en-us/previous-versions/office/office-12//ee945362(v=office.12)
But note that even this method will require saving the document, then reading it back in, in order to save it to bytes for the database. It will also fail if the document is in .doc format instead of .docx on that line where the document is being opened.
Instead of that last section for saving the file to the file system, you could just take the memory stream and convert it back into bytes once you are outside of the WordprocessingDocument.Open() block, but still inside the using (MemoryStream mem = new MemoryStream()) { ... } statement:
// Convert
byteArray = mem.ToArray();
This will have your Word document byte[].
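Putting that last point together, the whole round trip can stay in memory. The sketch below deliberately swaps in the built-in `System.IO.Compression.ZipArchive` in place of the Open XML SDK, relying on the fact that a .docx file is a ZIP package whose main text lives in `word/document.xml`. The `DocxBytes.ReplaceText` helper is hypothetical, and a plain string replace can miss text that Word has split across several runs:

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class DocxBytes
{
    // Hypothetical helper: replaces text in word/document.xml without touching disk.
    public static byte[] ReplaceText(byte[] docx, string oldText, string newText)
    {
        using (var mem = new MemoryStream())
        {
            // Copy into an expandable stream; new MemoryStream(byte[]) cannot grow.
            mem.Write(docx, 0, docx.Length);

            using (var archive = new ZipArchive(mem, ZipArchiveMode.Update, leaveOpen: true))
            {
                ZipArchiveEntry entry = archive.GetEntry("word/document.xml");
                if (entry == null)
                    throw new InvalidDataException("Not a .docx package (no word/document.xml).");

                using (Stream part = entry.Open())
                {
                    string xml;
                    using (var reader = new StreamReader(part, Encoding.UTF8, true, 1024, leaveOpen: true))
                        xml = reader.ReadToEnd();

                    xml = xml.Replace(oldText, newText);

                    // Truncate the part, then write the edited XML from the start.
                    part.SetLength(0);
                    using (var writer = new StreamWriter(part, new UTF8Encoding(false), 1024, leaveOpen: true))
                        writer.Write(xml);
                }
            } // Disposing the archive writes the updated package back into mem.

            return mem.ToArray();
        }
    }
}
```

Called as `byteArray = DocxBytes.ReplaceText(byteArray, "John", "Joe");`, the result would be ready to store back in the database. Like the Open XML approach, this only applies to .docx, not to the older binary .doc format.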
