Convert docx to pdf

Convert docx to pdf - c#

My documents are stored in a database, and I want to send them by mail as attachments.
I want to convert the stored docx files to pdf.
var result = from c in valinor.documents
select new
{
c.document_name,
c.document_size,
c.document_content
};
var kk = result.ToList();
for (int i = 0; i<kk.Count; i++)
{
MemoryStream stream = new MemoryStream(kk[i].document_content);
Attachment attachment = new Attachment(stream, kk[i].document_name + ".pdf", "application/pdf");
mail.Attachments.Add(attachment);
}
How can I convert document_content to pdf?

You need to use Microsoft.Office.Interop.Word, which ships with Microsoft Office.
Add a reference to Microsoft.Office.Interop.Word to your project.
Here is a sample of my code.
It's simple, and it works for me.
// Start Word and open the saved docx read-only.
Microsoft.Office.Interop.Word.Application word = new Microsoft.Office.Interop.Word.Application();
Microsoft.Office.Interop.Word.Document wordDocument = word.Documents.Open(savedFileName, ReadOnly: true);
// Export the open document as a PDF file.
wordDocument.ExportAsFixedFormat(attahcmentPath + "/pdf" + attachment.BetAttachmentCode + ".pdf", Microsoft.Office.Interop.Word.WdExportFormat.wdExportFormatPDF);
// Quit Word without saving changes.
word.Quit(false);
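Since the question starts from a byte array in the database rather than a file, the Interop call above needs a file on disk first. The following is only a sketch under that assumption: the DocxToPdf class and its Convert method are hypothetical names, Word must be installed on the machine, and the temporary file handling is one possible choice, not the only one.

```csharp
using System;
using System.IO;
using Word = Microsoft.Office.Interop.Word;

static class DocxToPdf
{
    // Hypothetical helper: converts DOCX bytes to PDF bytes via temporary files.
    public static byte[] Convert(byte[] docxBytes)
    {
        string docxPath = Path.Combine(Path.GetTempPath(), Guid.NewGuid() + ".docx");
        string pdfPath = Path.ChangeExtension(docxPath, ".pdf");
        File.WriteAllBytes(docxPath, docxBytes);

        Word.Application word = new Word.Application();
        try
        {
            Word.Document doc = word.Documents.Open(docxPath, ReadOnly: true);
            doc.ExportAsFixedFormat(pdfPath, Word.WdExportFormat.wdExportFormatPDF);
            // Cast to the interface to avoid the method/event name ambiguity warning.
            ((Word._Document)doc).Close(SaveChanges: false);
            return File.ReadAllBytes(pdfPath);
        }
        finally
        {
            ((Word._Application)word).Quit(SaveChanges: false);
            // File.Delete does not throw if the file is missing.
            File.Delete(docxPath);
            File.Delete(pdfPath);
        }
    }
}
```

In the loop from the question, the stream would then be built from DocxToPdf.Convert(kk[i].document_content) instead of from the raw docx bytes.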

You will need a third-party component such as ABCpdf, and (probably) Word installed on the machine; that component then converts the docx to pdf.

Related

Convert bytes to PDF File

I am able to create a word doc using the code below.
Question: How do I create a PDF instead of a Word doc?
Code
using (StreamWriter outputFile = new StreamWriter(Path.Combine(docPath, tdindb.TDCode + "-test.doc")))
{
string html = string.Format("<html>{0}</html>", sbHtml);
outputFile.WriteLine(html);
}
string FileLocation = docPath + "\\" + tdindb.TDCode + "-test.doc";
byte[] fileBytes = System.IO.File.ReadAllBytes(FileLocation);
string fileName = Path.GetFileName(FileLocation);
return File(fileBytes, System.Net.Mime.MediaTypeNames.Application.Octet, fileName);
Thank you

You need to use one of the PDF creation libraries. I tried iText, IronPDF, and PDFFlow. All of them create PDF documents from scratch.
PDFFlow was better for my case because I needed automatic page creation and tables that spread across multiple pages.
This is how to create a simple PDF file in C#:
{
DocumentBuilder.New()
.AddSection()
.AddParagraphToSection("your text goes here!")
.ToSection()
.ToDocument()
.Build("Result.PDF");
}
Feel free to ask me if you need more help.

Converting a Word document to HTML has been answered here before. The linked example is the first result of many on this site.
Once you have your HTML, to create a PDF you need to use a PDF creation library. For this example we will use IronPDF, which requires just a few lines of code:
string html = string.Format("<html>{0}</html>", sbHtml);
var renderer = new IronPdf.ChromePdfRenderer();
var pdf = renderer.RenderHtmlAsPdf(html);
// Get the PDF as a byte array
byte[] pdfBytes = pdf.BinaryData;
// Save the PDF to a file
pdf.SaveAs("output.pdf");

Convert a Word (DOCX) file to a PDF in C# on cloud environment

I have generated a Word file using Open XML, and I need to send it as an email attachment in PDF format. I cannot save any physical PDF or Word file to disk because my application runs in a cloud environment (CRM Online).
The only way I have found is Aspose.Words for .NET:
http://www.aspose.com/docs/display/wordsnet/How+to++Convert+a+Document+to+a+Byte+Array But it is too expensive.
Then I found another approach: convert Word to HTML, then convert the HTML to PDF. But my Word document contains a picture, and I cannot resolve the issue that causes.

The most accurate conversion from DOCX to PDF is going to be through Word. Your best option for that is setting up a server with OWAS (Office Web Apps Server) and doing your conversion through that.
You'll need to set up a WOPI endpoint on your application server and call:
/wv/WordViewer/request.pdf?WOPISrc={WopiUrl}&type=downloadpdf
OR
/wv/WordViewer/request.pdf?WOPISrc={WopiUrl}&type=printpdf
Alternatively you could try and do it using OneDrive and Word Online, but you'll need to work out the parameters Word Online uses as well as whether that's permitted within the Ts & Cs.

You can try Gnostice XtremeDocumentStudio .NET.
Converting From DOCX To PDF Using XtremeDocumentStudio .NET
http://www.gnostice.com/goto.asp?id=24900&t=convert_docx_to_pdf_using_xdoc.net
In the published article, conversion has been demonstrated to save to a physical file. You can use documentConverter.ConvertToStream method to convert a document to a Stream as shown below in the code snippet.
DocumentConverter documentConverter = new DocumentConverter();
// input can be a FilePath, Stream, list of FilePaths or list of Streams
Object input = "InputDocument.docx";
string outputFileFormat = "pdf";
ConversionMode conversionMode = ConversionMode.ConvertToSeperateFiles;
List<Stream> outputStreams = documentConverter.ConvertToStream(input, outputFileFormat, conversionMode);
Disclaimer: I work for Gnostice.

If you want to convert a byte array, you can use Metamorphosis:
// Create the converter (SautinSoft PDF Metamorphosis).
SautinSoft.PdfMetamorphosis p = new SautinSoft.PdfMetamorphosis();
string docxPath = @"example.docx";
string pdfPath = Path.ChangeExtension(docxPath, ".pdf");
byte[] docx = File.ReadAllBytes(docxPath);
// Convert DOCX to PDF in memory
byte[] pdf = p.DocxToPdfConvertByte(docx);
if (pdf != null)
{
// Save the PDF document to a file for a viewing purpose.
File.WriteAllBytes(pdfPath, pdf);
System.Diagnostics.Process.Start(pdfPath);
}
else
{
System.Console.WriteLine("Conversion failed!");
Console.ReadLine();
}

I recently used the SautinSoft 'Document .Net' library to convert docx to pdf in my application (React frontend, .NET Core microservices backend). It takes only 15 seconds to generate a 23-page PDF. Those 15 seconds include getting data from the database, merging the data with a docx template, and converting the result to PDF. The code is deployed to an Azure Linux box and works fine.
https://sautinsoft.com/products/document/
Sample code
public string GeneratePDF(PDFDocumentModel document)
{
byte[] output = null;
using (var outputStream = new MemoryStream())
{
// Create single pdf.
DocumentCore singlePDF = new DocumentCore();
var documentCores = new List<DocumentCore>();
foreach (var section in document.Sections)
{
documentCores.Add(GenerateDocument(section));
}
foreach (var dc in documentCores)
{
// Create import session.
ImportSession session = new ImportSession(dc, singlePDF, StyleImportingMode.KeepSourceFormatting);
// Loop through all sections in the source document.
foreach (Section sourceSection in dc.Sections)
{
// Because we are copying a section from one document to another,
// it is required to import the Section into the destination document.
// This adjusts any document-specific references to styles, bookmarks, etc.
// Importing an element creates a copy of the original element, but the copy
// is ready to be inserted into the destination document.
Section importedSection = singlePDF.Import<Section>(sourceSection, true, session);
// First section start from new page.
if (dc.Sections.IndexOf(sourceSection) == 0)
importedSection.PageSetup.SectionStart = SectionStart.NewPage;
// Now the new section can be appended to the destination document.
singlePDF.Sections.Add(importedSection);
//Paging
HeaderFooter footer = new HeaderFooter(singlePDF, HeaderFooterType.FooterDefault);
// Create a new paragraph to insert a page numbering.
// So that, our page numbering looks as: Page N of M.
Paragraph par = new Paragraph(singlePDF);
par.ParagraphFormat.Alignment = HorizontalAlignment.Center;
CharacterFormat cf = new CharacterFormat() { FontName = "Consolas", Size = 11.0 };
par.Content.Start.Insert("Page ", cf.Clone());
// Page numbering is a Field.
Field fPage = new Field(singlePDF, FieldType.Page);
fPage.CharacterFormat = cf.Clone();
par.Content.End.Insert(fPage.Content);
par.Content.End.Insert(" of ", cf.Clone());
Field fPages = new Field(singlePDF, FieldType.NumPages);
fPages.CharacterFormat = cf.Clone();
par.Content.End.Insert(fPages.Content);
footer.Blocks.Add(par);
importedSection.HeadersFooters.Add(footer);
}
}
var pdfOptions = new PdfSaveOptions();
pdfOptions.Compression = false;
pdfOptions.EmbedAllFonts = false;
pdfOptions.EmbeddedImagesFormat = PdfSaveOptions.EmbImagesFormat.Png;
pdfOptions.EmbeddedJpegQuality = 100;
// Don't allow editing after population; this also ensures the content can be printed.
pdfOptions.PreserveFormFields = false;
pdfOptions.PreserveContentControls = false;
if (!string.IsNullOrEmpty(document.PdfProperties.Title))
{
singlePDF.Document.Properties.BuiltIn[BuiltInDocumentProperty.Title] = document.PdfProperties.Title;
}
if (!string.IsNullOrEmpty(document.PdfProperties.Author))
{
singlePDF.Document.Properties.BuiltIn[BuiltInDocumentProperty.Author] = document.PdfProperties.Author;
}
if (!string.IsNullOrEmpty(document.PdfProperties.Subject))
{
singlePDF.Document.Properties.BuiltIn[BuiltInDocumentProperty.Subject] = document.PdfProperties.Subject;
}
singlePDF.Save(outputStream, pdfOptions);
output = outputStream.ToArray();
}
return Convert.ToBase64String(output);
}

Add HTML String to OpenXML (*.docx) Document

I am trying to use Microsoft's OpenXML 2.5 library to create an OpenXML document. Everything works great until I try to insert an HTML string into my document. I have scoured the web, and here is what I have come up with so far (snipped to just the portion I am having trouble with):
Paragraph paragraph = new Paragraph();
Run run = new Run();
string altChunkId = "id1";
AlternativeFormatImportPart chunk =
document.MainDocumentPart.AddAlternativeFormatImportPart(
AlternativeFormatImportPartType.Html, altChunkId);
chunk.FeedData(new MemoryStream(Encoding.UTF8.GetBytes(ioi.Text)));
AltChunk altChunk = new AltChunk { Id = altChunkId };
run.AppendChild(new Break());
paragraph.AppendChild(run);
body.AppendChild(paragraph);
Obviously, I haven't actually added the altChunk in this example, but I have tried appending it everywhere: to the run, paragraph, body, etc. In every case, I am unable to open the docx file in Word 2010.
This is making me a little nutty because it seems like it should be straightforward (I will admit that I'm not fully understanding the AltChunk "thing"). Would appreciate any help.
Side Note: One thing I did find that was interesting, and I don't know if it's actually a problem or not, is this response which says AltChunk corrupts the file when working from a MemoryStream. Can anybody confirm that this is/isn't true?

I can reproduce the error "... there is a problem with the content" by using
an incomplete HTML document as the content of the alternative format import part.
For example if you use the following HTML snippet <h1>HELLO</h1>
MS Word is unable to open the document.
The code below shows how to add an AlternativeFormatImportPart to a word document.
(I've tested the code with MS Word 2013).
using (WordprocessingDocument doc = WordprocessingDocument.Open(@"test.docx", true))
{
string altChunkId = "myId";
MainDocumentPart mainDocPart = doc.MainDocumentPart;
var run = new Run(new Text("test"));
var p = new Paragraph(new ParagraphProperties(
new Justification() { Val = JustificationValues.Center }),
run);
var body = mainDocPart.Document.Body;
body.Append(p);
MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes("<html><head></head><body><h1>HELLO</h1></body></html>"));
// Uncomment the following line to create an invalid word document.
// MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes("<h1>HELLO</h1>"));
// Create alternative format import part.
AlternativeFormatImportPart formatImportPart =
mainDocPart.AddAlternativeFormatImportPart(
AlternativeFormatImportPartType.Html, altChunkId);
//ms.Seek(0, SeekOrigin.Begin);
// Feed HTML data into format import part (chunk).
formatImportPart.FeedData(ms);
AltChunk altChunk = new AltChunk();
altChunk.Id = altChunkId;
mainDocPart.Document.Body.Append(altChunk);
}
According to the Office OpenXML specification valid parent elements for the
w:altChunk element are body, comment, docPartBody, endnote, footnote, ftr, hdr and tc.
So, I've added the w:altChunk to the body element.
For more information on the w:altChunk element see this MSDN link.
EDIT
As pointed out by @user2945722, to make sure that the OpenXml library correctly interprets the byte array as UTF-8, you should add the UTF-8 preamble. This can be done this way:
MemoryStream ms = new MemoryStream(new UTF8Encoding(true).GetPreamble().Concat(Encoding.UTF8.GetBytes(htmlEncodedString)).ToArray());
This will prevent your é's from being rendered as Ã©'s, your ä's as Ã¤'s, etc.
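The preamble step can be isolated into a small helper that uses only the .NET base class library. This is a sketch: the HtmlChunkStream class and its Create method are hypothetical names, not part of the OpenXml library.

```csharp
using System.IO;
using System.Linq;
using System.Text;

static class HtmlChunkStream
{
    // Hypothetical helper: returns a stream holding the UTF-8 byte order mark
    // (EF BB BF) followed by the UTF-8 encoded HTML.
    public static MemoryStream Create(string html)
    {
        byte[] preamble = new UTF8Encoding(true).GetPreamble();
        byte[] payload = Encoding.UTF8.GetBytes(html);
        return new MemoryStream(preamble.Concat(payload).ToArray());
    }
}
```

The resulting stream can be passed to formatImportPart.FeedData in place of ms.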

Had the same problem here, but a totally different cause. Worth a try if the accepted solution doesn't help. Try closing the file after saving. In my case, it happened to be the difference between a corrupt and a clean docx file. Oddly, most other operations work with only a Save() and program exit.
String cid = "chunkid";
WordprocessingDocument document = WordprocessingDocument.Open("somefile.docx", true);
Body body = document.MainDocumentPart.Document.Body;
MemoryStream ms = new MemoryStream(System.Text.Encoding.UTF8.GetBytes("<html><head></head><body>hi</body></html>"));
AlternativeFormatImportPart formatImportPart = document.MainDocumentPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.Html, cid);
formatImportPart.FeedData(ms);
AltChunk altChunk = new AltChunk();
altChunk.Id = cid;
document.MainDocumentPart.Document.Body.Append(altChunk);
document.MainDocumentPart.Document.Save();
// here's the magic!
document.Close();
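An alternative to calling Close() by hand is to wrap the document in a using block, so it is disposed (and the package closed) even if an exception is thrown partway through. The sketch below assumes the Open XML SDK is referenced; the HtmlChunkAppender class and AppendHtml method are hypothetical names, and docxPath must point to an existing docx that already has a body.

```csharp
using System.IO;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class HtmlChunkAppender
{
    // Hypothetical helper: appends a complete HTML document as an altChunk.
    public static void AppendHtml(string docxPath, string html, string chunkId)
    {
        using (WordprocessingDocument document = WordprocessingDocument.Open(docxPath, true))
        {
            MainDocumentPart mainPart = document.MainDocumentPart;
            AlternativeFormatImportPart part = mainPart.AddAlternativeFormatImportPart(
                AlternativeFormatImportPartType.Html, chunkId);
            using (var ms = new MemoryStream(Encoding.UTF8.GetBytes(html)))
            {
                part.FeedData(ms);
            }
            mainPart.Document.Body.Append(new AltChunk { Id = chunkId });
            mainPart.Document.Save();
        } // The document is disposed here, which closes the package.
    }
}
```

A call such as HtmlChunkAppender.AppendHtml("somefile.docx", "<html><head></head><body>hi</body></html>", "chunkid") would mirror the snippet above.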

Embed an Excel graphic in a Word document with OpenXML working with Word 2010 and 2003

I have to implement a Microsoft Word document generator with embedded Excel graphics in it.
One of my constraints is that the generated docx must work with both Microsoft Word 2010 and Word 2003 with the compatibility pack.
I haven't managed to make it work for both. I can make it work for Word 2010, but then the documents don't work in Word 2003, and vice versa.
After some searching, to make it work for Word 2003 I added this to my code:
private static void Word2003(ChartPart importedChartPart, MainDocumentPart mainDocumentPart, Stream fileStream)
{
var ext = new ExternalData { Id = "rel" + 5 };
importedChartPart.ChartSpace.InsertAt(ext, 3);
var fi = new FileInfo(@"generated.xlsx");
importedChartPart.AddExternalRelationship("http://schemas.openxmlformats.org/officeDocument/2006/relationships/package", new Uri(fi.Name, UriKind.Relative), "rel5");
EmbeddedPackagePart embeddedObjectPart = mainDocumentPart.AddEmbeddedPackagePart(@"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
Stream copyStream = new MemoryStream();
fileStream.CopyTo(copyStream);
embeddedObjectPart.FeedData(copyStream);
}
But at this point the generated documents don't work with Word 2010. If I delete these two lines:
var ext = new ExternalData { Id = "rel" + 5 };
importedChartPart.ChartSpace.InsertAt(ext, 3);
from the previous code, it works for Word 2010 but not for Word 2003.
I have tried several things, but I haven't managed to make it work in both cases.
You can find this small piece of code here
The prerequisite is an Excel template file with a chart and a graphic in it.
Edit: The generated document always works with Microsoft Office 2007 (with or without the two problematic lines). I'm still looking for a solution!

I finally found the solution!
The problem was due to two things:
I didn't add the external data correctly, and the external relationship was wrong.
This code makes it work:
private static void Word2003(ChartPart importedChartPart, MainDocumentPart mainDocumentPart, Stream fileStream)
{
// Add the external data id
ExternalData ext = new ExternalData { Id = "rel" + 5 };
AutoUpdate autoUpdate = new AutoUpdate{ Val = false};
ext.Append(autoUpdate);
importedChartPart.ChartSpace.Append(ext);
// Set the relationship
var fi = new FileInfo(@"generated.xlsx");
importedChartPart.AddExternalRelationship("http://schemas.openxmlformats.org/officeDocument/2006/relationships/oleObject", new Uri(fi.Name, UriKind.Relative), "rel5");
// Link to the embedded file
EmbeddedPackagePart embeddedObjectPart = mainDocumentPart.AddEmbeddedPackagePart(@"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
Stream copyStream = new MemoryStream();
fileStream.CopyTo(copyStream);
embeddedObjectPart.FeedData(copyStream);
}
Now the generated Word document works with Word 2003, 2007 and 2010.
Maybe this will help somebody!

Export data to word in c#

Below is the code that creates a PDF file to write into. Every time I call it, it creates a PDF file to write into. My question: is there an equivalent method for exporting to Word or, for simplicity, one that just creates a blank doc file so that I can export data into it?
public void showPDf() {
iTextSharp.text.Document doc = new iTextSharp.text.Document(
iTextSharp.text.PageSize.A4);
string combined = Path.Combine(txtPath.Text,".pdf");
PdfWriter pw = PdfWriter.GetInstance(doc, new FileStream(combined, FileMode.Create));
doc.Open();
}

1. Interop API
It is available in the namespace Microsoft.Office.Interop.Word.
You can use the Word Interop COM API to do that with the following code:
// Open a doc file.
Application application = new Application();
Document document = application.Documents.Open("C:\\word.doc");
// Loop through all words in the document.
int count = document.Words.Count;
for (int i = 1; i <= count; i++)
{
// Write the word.
string text = document.Words[i].Text;
Console.WriteLine("Word {0} = {1}", i, text);
}
// Close word.
application.Quit();
The only drawback is that you must have Office installed to use this feature.
2. OpenXML
You can use OpenXML to build Word documents; try the following link:
http://msdn.microsoft.com/en-us/library/bb264572(v=office.12).aspx

Did you try searching the web for this ?
How to automate Microsoft Word to create a new document by using Visual C#

There is a free solution to export data to word,
http://www.codeproject.com/Articles/151789/Export-Data-to-Excel-Word-PDF-without-Automation-f
