Save modified WordprocessingDocument to new file - c#

I'm attempting to open a Word document, change some text and then save the changes to a new document. I can get the first bit done using the code below but I can't figure out how to save the changes to a NEW document (specifying the path and file name).
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
using DocumentFormat.OpenXml.Packaging;
using System.IO;
namespace WordTest
{
class Program
{
static void Main(string[] args)
{
string template = @"c:\data\hello.docx";
string documentText;
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(template, true))
{
using (StreamReader reader = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
{
documentText = reader.ReadToEnd();
}
documentText = documentText.Replace("##Name##", "Paul");
documentText = documentText.Replace("##Make##", "Samsung");
using (StreamWriter writer = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
{
writer.Write(documentText);
}
}
}
}
}
I'm a complete beginner at this, so forgive the basic question!

If you use a MemoryStream you can save the changes to a new file like this:
byte[] byteArray = File.ReadAllBytes("c:\\data\\hello.docx");
using (MemoryStream stream = new MemoryStream())
{
stream.Write(byteArray, 0, (int)byteArray.Length);
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(stream, true))
{
// Do work here
}
// Save the file with the new name
File.WriteAllBytes("C:\\data\\newFileName.docx", stream.ToArray());
}

In Open XML SDK 2.5:
File.Copy(originalFilePath, modifiedFilePath);
using (var wordprocessingDocument = WordprocessingDocument.Open(modifiedFilePath, isEditable: true))
{
// Do changes here...
}
wordprocessingDocument.AutoSave is true by default so Close and Dispose will save changes.
wordprocessingDocument.Close is not needed explicitly because the using block will call it.
This approach doesn't require entire file content to be loaded into memory like in accepted answer. It isn't a problem for small files, but in my case I have to process more docx files with embedded xlsx and pdf content at the same time so the memory usage would be quite high.
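As a rough sketch, the copy-then-edit approach can be combined with the placeholder replacement from the question as shown below. The class and method names (TemplateFiller, FillTemplate) are made up for illustration, and the overwrite: true argument is an assumption about the desired behaviour when the output file already exists.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;

static class TemplateFiller
{
    // Hypothetical helper: copies the template, then edits only the copy.
    public static void FillTemplate(string templatePath, string outputPath)
    {
        // Assumption: an existing output file should be replaced.
        File.Copy(templatePath, outputPath, overwrite: true);

        using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(outputPath, true))
        {
            string documentText;
            using (var reader = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
            {
                documentText = reader.ReadToEnd();
            }

            // Same placeholders as in the question.
            documentText = documentText.Replace("##Name##", "Paul")
                                       .Replace("##Make##", "Samsung");

            // FileMode.Create truncates the part before the new XML is written.
            using (var writer = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
            {
                writer.Write(documentText);
            }
        } // Disposing the document saves the changes to outputPath.
    }
}
```

The template itself is never opened for writing, so it stays untouched. Note that Word sometimes splits text across several runs in the XML, in which case a plain string replace may not find a placeholder.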

Simply copy the source file to the destination and make changes from there.
File.Copy(source, destination);
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(destination, true))
{
// Make changes to the document and save it.
wordDoc.MainDocumentPart.Document.Save();
wordDoc.Close();
}
Hope this works.

This approach streams the "template" file into the new document instead of loading the whole thing into a byte[], which may make it less resource intensive.
var templatePath = @"c:\data\hello.docx";
var documentPath = @"c:\data\newFilename.docx";
using (var template = File.OpenRead(templatePath))
// FileMode.Create truncates an existing file so no stale bytes remain after the copy.
using (var documentStream = File.Open(documentPath, FileMode.Create))
{
template.CopyTo(documentStream);
using (var document = WordprocessingDocument.Open(documentStream, true))
{
//do your work here
document.MainDocumentPart.Document.Save();
}
}

For me this worked fine:
// To search and replace content in a document part.
public static void SearchAndReplace(string document)
{
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(document, true))
{
string docText = null;
using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
{
docText = sr.ReadToEnd();
}
Regex regexText = new Regex("Hello world!");
docText = regexText.Replace(docText, "Hi Everyone!");
using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
{
sw.Write(docText);
}
}
}

Related

How to create a .tar.gz file from text string

I have a dlm file and I want to create a .tar.gz file from its content. The file is created, but when I try to unzip it manually, extraction fails. My code for creating the .tar.gz file is below; targetFileName is something like C:\Folder\xxx.tar.gz:
using (StreamWriter write = new StreamWriter(targetFileName, false, Encoding.Default))
{
write.Write(text.ToString());
write.Close();
}
In the above code, text is the content of the dlm file. Is there anything I am missing? Please help.
Try using SharpZipLib from NuGet:
using System;
using System.IO;
using System.Text;
using ICSharpCode.SharpZipLib.GZip;
using ICSharpCode.SharpZipLib.Tar;
add method:
private static void CreateTarGZ(string tgzFilename, string innerFilename, string text)
{
var uncompressed = Encoding.UTF8.GetBytes(text);
using (Stream outStream = File.Create(tgzFilename))
{
using (GZipOutputStream gzoStream = new GZipOutputStream(outStream))
{
gzoStream.SetLevel(9);
using (TarOutputStream taroStream = new TarOutputStream(gzoStream, Encoding.UTF8))
{
taroStream.IsStreamOwner = false;
TarEntry entry = TarEntry.CreateTarEntry(innerFilename);
entry.Size = uncompressed.Length;
taroStream.PutNextEntry(entry);
taroStream.Write(uncompressed, 0, uncompressed.Length);
taroStream.CloseEntry();
taroStream.Close();
}
}
}
}
then call:
CreateTarGZ("test.tar.gz", "FileName.txt", "my text");
CreateTarGZ("c:\\temp\\test.tar.gz", "foo-folder\\FileName.txt", "my text");
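If an external package is not an option, roughly the same thing can be sketched with the built-in System.Formats.Tar and System.IO.Compression types. This assumes .NET 7 or later; the class name TarGzSketch is made up for illustration, and the method mirrors the CreateTarGZ signature above.

```csharp
using System.Formats.Tar;
using System.IO;
using System.IO.Compression;
using System.Text;

static class TarGzSketch
{
    // Writes a single text entry into a gzip-compressed tar archive.
    public static void CreateTarGz(string tgzFilename, string innerFilename, string text)
    {
        byte[] payload = Encoding.UTF8.GetBytes(text);

        using (FileStream outStream = File.Create(tgzFilename))
        using (var gzip = new GZipStream(outStream, CompressionLevel.Optimal))
        using (var tar = new TarWriter(gzip))
        {
            var entry = new PaxTarEntry(TarEntryType.RegularFile, innerFilename)
            {
                // The entry's content is read from this stream when it is written.
                DataStream = new MemoryStream(payload)
            };
            tar.WriteEntry(entry);
        } // Disposal order matters: tar trailer first, then gzip footer, then the file.
    }

    static void Main()
    {
        CreateTarGz("test.tar.gz", "FileName.txt", "my text");
    }
}
```

Either way, the key point is the same as in the answer above: the text has to be wrapped in a tar entry and then gzip-compressed. Writing plain text to a file named .tar.gz does not produce an archive.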
This is a quick example that creates a .tar.gz and a .gz file containing the file you might be creating with the stream.
Note that I'm using SharpZipLib, which you can find in the NuGet Package Manager for your project. Then make sure to add the references in your code:
Making tar.gz
using ICSharpCode.SharpZipLib.GZip;
using ICSharpCode.SharpZipLib.Tar;
using System.IO;
using System.Text;
static void Main(string[] args)
{
string text = ".Net is Awesome";
string filename = "D:\\text.txt";
string tarfilename = "D:\\text.tar.gz";
using (StreamWriter write = new StreamWriter(filename, false, Encoding.Default))
{
//Writing a text file
write.Write(text.ToString());
write.Close();
//Creating a tar.gz Stream
Stream TarFileStream = File.Create(tarfilename);
Stream GZStream = new GZipOutputStream(TarFileStream);
TarArchive tarArchive = TarArchive.CreateOutputTarArchive(GZStream);
tarArchive.RootPath = "D:/"; //Setting the Root Path for the archive
//Creating a file entry for the tar archive
TarEntry tarEntry = TarEntry.CreateEntryFromFile(filename);
//Writing the entry in the archive.
tarArchive.WriteEntry(tarEntry, false); //set false to only add the concerned file in the archive.
tarArchive.Close();
}
}
Making only .gz
You can create a method to make it more reusable like:
private static void MakeGz(string targetFile)
{
string TargetGz = targetFile + ".gz";
using (Stream GzStream = new GZipOutputStream(File.Create(TargetGz)))
{
using (FileStream fs = File.OpenRead(targetFile))
{
// CopyTo streams the whole file; a single Read call is not guaranteed to fill the buffer.
fs.CopyTo(GzStream);
fs.Close();
GzStream.Close();
}
}
}
Then you can call this method whenever you are creating a file to make an archive for the same at the same time like:
MakeGz(filename);

How can I create named destinations with iTextSharp?

I am trying to convert PDF bookmarks into named destinations with C# and iTextSharp 5 library. Unfortunately iTextSharp seems not to write named destinations into the target PDF file.
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text;
using System.IO;
namespace PDFConvert
{
class Program
{
static void Main(string[] args)
{
String InputPdf = @"test.pdf";
String OutputPdf = "out.pdf";
PdfReader reader = new PdfReader(InputPdf);
var fileStream = new FileStream(OutputPdf, FileMode.Create, FileAccess.Write, FileShare.None);
var list = SimpleBookmark.GetBookmark(reader);
PdfStamper stamper = new PdfStamper(reader, fileStream);
foreach (Dictionary<string, object> entry in list)
{
object o;
entry.TryGetValue("Title", out o);
String title = o.ToString();
entry.TryGetValue("Page", out o);
String location = o.ToString();
String[] aLoc = location.Split(' ');
int page = int.Parse(aLoc[0]);
PdfDestination dest = new PdfDestination(PdfDestination.XYZ, float.Parse(aLoc[2]), float.Parse(aLoc[3]), float.Parse(aLoc[4]));
stamper.Writer.AddNamedDestination(title, page, dest);
// stamper.Writer.AddNamedDestinations(SimpleNamedDestination.GetNamedDestination(reader, false), reader.NumberOfPages);
}
stamper.Close();
reader.Close();
}
}
}
I already tried to use PdfWriter instead of PdfStamper, with the same result. The calls to stamper.Writer.AddNamedDestination(title, page, dest); are definitely executed, but there is no sign of named destinations in my target file.
I have found a solution using iText 7 instead of 5. Unfortunately the syntax is completely different. In my code below I only consider the second level Bookmarks ("Outline") of my PDF.
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Navigation;
using System;
namespace PDFConvert
{
class Program
{
static void Main(string[] args)
{
String InputPdf = @"test.pdf";
String OutputPdf = "out.pdf";
PdfDocument pdfDoc = new PdfDocument(new PdfReader(InputPdf), new PdfWriter(OutputPdf));
PdfOutline outlines = pdfDoc.GetOutlines(false);
// first level
foreach (var outline in outlines.GetAllChildren())
{
// second level
foreach (var second in outline.GetAllChildren())
{
String title = second.GetTitle();
PdfDestination dest = second.GetDestination();
pdfDoc.AddNamedDestination(title, dest.GetPdfObject());
}
}
pdfDoc.Close();
}
}
}

Reading very large .xml.bz2 files

I'd like to parse Wikimedia's .xml.bzip2 dumps without extracting the entire file or performing any XML validation:
var filename = "enwiki-20160820-pages-articles.xml.bz2";
var settings = new XmlReaderSettings()
{
ValidationType = ValidationType.None,
ConformanceLevel = ConformanceLevel.Auto // Fragment ?
};
using (var stream = File.Open(filename, FileMode.Open))
using (var bz2 = new BZip2InputStream(stream))
using (var xml = XmlTextReader.Create(bz2, settings))
{
xml.ReadToFollowing("page");
// ...
}
The BZip2InputStream works - if I use a StreamReader, I can read XML line by line. But when I use XmlTextReader, it fails when I try to perform the read:
System.Xml.XmlException: 'Unexpected end of file has occurred. The following elements are not closed: mediawiki. Line 58, position 1.'
The bzip stream is not at EOF. Is it possible to open an XmlTextReader on top of a BZip2 stream? Or is there some other means to do this?
This should work. I used a combination of XmlReader and LINQ to XML. You can parse the XElement doc as needed.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication29
{
class Program
{
const string URL = @"https://dumps.wikimedia.org/enwiki/20160820/enwiki-20160820-abstract26.xml";
static void Main(string[] args)
{
XmlReader reader = XmlReader.Create(URL);
while (!reader.EOF)
{
if (reader.Name != "doc")
{
reader.ReadToFollowing("doc");
}
if (!reader.EOF)
{
XElement doc = (XElement)XElement.ReadFrom(reader);
}
}
}
}
}

CsvHelper not writing data to memory stream

I'm having some problems with CsvHelper and writing to a memory stream. I've tried flushing the stream writer, setting positions and everything else I could think of. I've narrowed it down to a really simple test case that obviously fails. What am I doing wrong here?
public OutputFile GetTestFile()
{
using (var ms = new MemoryStream())
using (var sr = new StreamWriter(ms))
using (var csv = new CsvWriter(sr))
{
csv.WriteField("test");
sr.Flush();
return new OutputFile
{
Data = ms.ToArray(),
Length = ms.Length,
DataType = "text/csv",
FileName = "test.csv"
};
}
}
[TestMethod]
public void TestWritingToMemoryStream()
{
var file = GetTestFile();
Assert.IsFalse(file.Data.Length == 0);
}
Editing the correct answer in for people googling as this corrected code actually passes my test. I have no idea why writing to a StringWriter then converting it to bytes solves all the crazy flushing issues, but it works now.
using (var sw = new StringWriter())
using (var csvWriter = new CsvWriter(sw, config))
{
csvWriter.WriteRecords(records);
return Encoding.UTF8.GetBytes(sw.ToString());
}
Since CSVHelper is meant to collect several fields per row/line, it does some buffering itself until you tell it the current record is done:
csv.WriteField("test");
csv.NextRecord();
sr.Flush();
Now, the memstream should have the data in it. However, unless there is more processing elsewhere, the result in your OutputFile is wrong: Data will be byte[] not "text/csv". It seems like StringWriter would produce something more appropriate:
string sBuff;
using (StringWriter sw = new StringWriter())
using (CsvWriter csv = new CsvWriter(sw))
{
csv.WriteRecord<SomeItem>(r);
sBuff = sw.ToString();
}
Console.WriteLine(sBuff);
"New Item ",Falcon,7
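The buffering described above happens at more than one layer. As a minimal sketch that leaves CsvHelper out entirely, the following shows the StreamWriter layer alone: nothing reaches the MemoryStream until the writer is flushed. The class name BufferDemo is made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

class BufferDemo
{
    static void Main()
    {
        using (var ms = new MemoryStream())
        // UTF8Encoding(false) avoids a byte-order mark, so only the text bytes are counted.
        using (var writer = new StreamWriter(ms, new UTF8Encoding(false)))
        {
            writer.Write("test");
            // The text is still held in the writer's internal buffer here.
            Console.WriteLine("Before Flush: " + ms.Length);

            writer.Flush();
            // After Flush the bytes have been pushed into the MemoryStream.
            Console.WriteLine("After Flush: " + ms.Length);
        }
    }
}
```

With CsvHelper on top there is one more buffer to drain, which is why the record has to be ended with NextRecord() before flushing the StreamWriter.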

Read the content of an xml file within a zip package

I need to read the contents of an .xml file using a Stream (the XML file is inside a zip package). In the code below I need to get the file path at run time (I have hardcoded the path here for reference). Please let me know how to read the file path at run time.
I have tried string s = entry.FullName.ToString(); but get the error "Could not find the Path". I have also tried hardcoding the path as shown below, but I get the same FileNotFound error.
string metaDataContents;
using (var zipStream = new FileStream(@"C:\OB10LinuxShare\TEST1\Temp" + "\\"+zipFileName+".zip", FileMode.Open))
using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read))
{
foreach (var entry in archive.Entries)
{
if (entry.Name.EndsWith(".xml"))
{
FileInfo metadataFileInfo = new FileInfo(entry.Name);
string metadataFileName = metadataFileInfo.Name.Replace(metadataFileInfo.Extension, String.Empty);
if (String.Compare(zipFileName, metadataFileName, true) == 0)
{
using (var stream = entry.Open())
using (var reader = new StreamReader(stream))
{
metaDataContents = reader.ReadToEnd();
clientProcessLogWriter.WriteToLog(LogWriter.LogLevel.DEBUG, "metaDataContents : " + metaDataContents);
}
}
}
}
}
I have also tried to get the contents of the .xml file using the Stream object as shown below. But here I get the error "Stream was not readable".
Stream metaDataStream = null;
string metaDataContent = string.Empty;
using (Stream stream = entry.Open())
{
metaDataStream = stream;
}
using (var reader = new StreamReader(metaDataStream))
{
metaDataContent = reader.ReadToEnd();
}
Kindly suggest how to read the contents of the XML file within a zip file using Stream and StreamReader, specifying the file path at run time.
Your second code snippet is failing because when you reach the end of the first using statement:
using (Stream stream = entry.Open())
{
metaDataStream = stream;
}
... the stream will be disposed. That's the point of a using statement. You should be fine with this sort of code, but load the XML file while the stream is open:
XDocument doc;
using (Stream stream = entry.Open())
{
doc = XDocument.Load(stream);
}
That's to load it as XML... if you really just want the text, you could use:
string text;
using (Stream stream = entry.Open())
{
using (StreamReader reader = new StreamReader(stream))
{
text = reader.ReadToEnd();
}
}
Again, note how this is reading before it hits the end of either using statement.
Here is a sample of how to read a zip file using .net 4.5
private void readZipFile(String filePath)
{
String fileContents = "";
try
{
if (System.IO.File.Exists(filePath))
{
System.IO.Compression.ZipArchive apcZipFile = System.IO.Compression.ZipFile.Open(filePath, System.IO.Compression.ZipArchiveMode.Read);
foreach (System.IO.Compression.ZipArchiveEntry entry in apcZipFile.Entries)
{
if (entry.Name.ToUpper().EndsWith(".XML"))
{
// Open the entry directly; GetEntry expects the full path, which entry.Name may not be.
using (System.IO.StreamReader sr = new System.IO.StreamReader(entry.Open()))
{
//read the contents into a string
fileContents = sr.ReadToEnd();
}
}
}
}
}
catch (Exception)
{
throw;
}
}
