I am able to create a Word document using the code below.
Question: how do I create a PDF instead of a Word document?
Code:
using (StreamWriter outputFile = new StreamWriter(Path.Combine(docPath, tdindb.TDCode + "-test.doc")))
{
string html = string.Format("<html>{0}</html>", sbHtml);
outputFile.WriteLine(html);
}
string FileLocation = docPath + "\\" + tdindb.TDCode + "-test.doc";
byte[] fileBytes = System.IO.File.ReadAllBytes(FileLocation);
string fileName = Path.GetFileName(FileLocation);
return File(fileBytes, System.Net.Mime.MediaTypeNames.Application.Octet, fileName);
Thank you
You need to use a PDF creation library. I tried iText, IronPDF, and PDFFlow. All of them can create PDF documents from scratch.
PDFFlow worked best in my case because I needed automatic page creation and tables that span multiple pages.
This is how to create a simple PDF file in C#:
{
    // Build a one-paragraph document and write it to Result.PDF
    DocumentBuilder.New()
        .AddSection()
        .AddParagraphToSection("your text goes here!")
        .ToSection()
        .ToDocument()
        .Build("Result.PDF");
}
Feel free to ask if you need more help.
Converting a Word document to HTML has been answered here before. The linked example is the first result of many on this site.
Once you have your HTML, to create a PDF you need to use a PDF creation library. For this example we will use IronPDF which requires just 3 lines of code:
string html = string.Format("<html>{0}</html>", sbHtml);
var renderer = new IronPdf.ChromePdfRenderer();
// Render the HTML once and reuse the resulting document
var pdf = renderer.RenderHtmlAsPdf(html);
// Get the PDF as a byte array
byte[] pdfBytes = pdf.BinaryData;
// Or save the PDF to a file
pdf.SaveAs("output.pdf");
I've had this issue for a few days and have tried a number of things. Users edit an SVG and, once they're finished, save it back to a file path on the server. The problem is that the SVG contains an href to an image file in a directory. As an SVG it references the file with no problem, but when I convert it to a PDF the image is missing, although the other SVG elements such as lines and text are there. Has anyone had this issue, or does anyone know how to resolve it? I'm using ImageMagick for the conversion. See my controller code below:
[HttpPost]
public void ConvertToPDF(string[] svgs, string pdfFilePath, List<string> files)
{
MagickReadSettings settings = new MagickReadSettings();
using (MagickImageCollection images = new MagickImageCollection())
{
foreach(var s in svgs)
{
var fileName = System.Web.HttpContext.Current.Server.MapPath(ConfigurationManager.AppSettings["ConvertFromXMLToSvgTempPath"] + s + ".svg");
//images.Read(fileName, settings);
//var readSettings = new MagickReadSettings() { Format = MagickFormat.Svg };
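// Add each SVG to the collection so that images.Write has something to write;
// the collection disposes the images it holds.
var image = new MagickImage(fileName);
image.Format = MagickFormat.Pdf;
images.Add(image);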
}
images.Write(pdfFilePath);
}
}
FYI - I ended up converting the image path into Base64. The rest of the SVG converted correctly. Hope this helps anyone in the future.
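The Base64 workaround mentioned above can be sketched roughly as follows. This is a minimal illustration using only the .NET base class library, not the code from the original post: the class name `SvgImageEmbedder`, its method names, and the MIME type argument are hypothetical, and it assumes the SVG references the image through an `href="..."` or `xlink:href="..."` attribute whose value you already know.

```csharp
using System;
using System.IO;

public static class SvgImageEmbedder
{
    // Builds a data URI such as "data:image/png;base64,...." from raw image bytes.
    public static string ToDataUri(byte[] imageBytes, string mimeType)
    {
        return "data:" + mimeType + ";base64," + Convert.ToBase64String(imageBytes);
    }

    // Replaces href="<imageHref>" (this also matches xlink:href="<imageHref>")
    // in the SVG text with an inline data URI built from the file at imagePath.
    public static string EmbedImage(string svgText, string imageHref, string imagePath, string mimeType)
    {
        string dataUri = ToDataUri(File.ReadAllBytes(imagePath), mimeType);
        return svgText.Replace("href=\"" + imageHref + "\"", "href=\"" + dataUri + "\"");
    }
}
```

The rewritten SVG text can then be saved and passed to the converter as before; because the image data travels inside the SVG, the converter no longer has to resolve a file path.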
You can also try the Syncfusion HTML to PDF library. It works better for me.
Sample code:
HtmlToPdfConverter htmlConverter = new HtmlToPdfConverter(HtmlRenderingEngine.WebKit);
WebKitConverterSettings settings = new WebKitConverterSettings();
// WebKit library path
settings.WebKitPath = @"../../QtBinaries/";
//Assign WebKit settings to HTML converter
htmlConverter.ConverterSettings = settings;
//Convert a SVG file to PDF with HTML converter
PdfDocument document = htmlConverter.Convert(@"../../Sample.svg");
document.Save("Output.pdf");
document.Close(true);
To download webkit libraries
WebKit HTML Converter: https://www.syncfusion.com/downloads/latest-version
I have generated a Word file using Open XML and need to send it as an email attachment in PDF format, but I cannot save a physical PDF or Word file to disk because my application runs in a cloud environment (CRM Online).
The only way I have found is Aspose.Words for .NET:
http://www.aspose.com/docs/display/wordsnet/How+to++Convert+a+Document+to+a+Byte+Array But it is too expensive.
Then I found another approach: convert the Word document to HTML, then convert the HTML to PDF. But my Word document contains a picture, and I cannot get that to work.
The most accurate conversion from DOCX to PDF is going to be through Word. Your best option for that is setting up a server with OWAS (Office Web Apps Server) and doing your conversion through that.
You'll need to set up a WOPI endpoint on your application server and call:
/wv/WordViewer/request.pdf?WOPISrc={WopiUrl}&type=downloadpdf
OR
/wv/WordViewer/request.pdf?WOPISrc={WopiUrl}&type=printpdf
Alternatively you could try and do it using OneDrive and Word Online, but you'll need to work out the parameters Word Online uses as well as whether that's permitted within the Ts & Cs.
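As a rough sketch of how the two request URLs above might be assembled in code: the class name `WopiPdfUrl`, the method name, and the host names in the usage note are hypothetical. It assumes the WOPI source URL should be escaped before being placed in the query string, which is general URL practice rather than something stated in this answer.

```csharp
using System;

public static class WopiPdfUrl
{
    // Builds the WordViewer PDF request URL for a given OWAS base URL and WOPI source URL.
    // forPrint selects type=printpdf; otherwise type=downloadpdf is used.
    public static string Build(string owasBaseUrl, string wopiSrc, bool forPrint)
    {
        string type = forPrint ? "printpdf" : "downloadpdf";
        return owasBaseUrl.TrimEnd('/')
            + "/wv/WordViewer/request.pdf?WOPISrc=" + Uri.EscapeDataString(wopiSrc)
            + "&type=" + type;
    }
}
```

For example, `WopiPdfUrl.Build("https://owas.example.com/", "https://app.example.com/wopi/files/42", false)` yields a `downloadpdf` URL with the WOPI source percent-encoded.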
You can try Gnostice XtremeDocumentStudio .NET.
Converting From DOCX To PDF Using XtremeDocumentStudio .NET
http://www.gnostice.com/goto.asp?id=24900&t=convert_docx_to_pdf_using_xdoc.net
In the published article, conversion has been demonstrated to save to a physical file. You can use documentConverter.ConvertToStream method to convert a document to a Stream as shown below in the code snippet.
DocumentConverter documentConverter = new DocumentConverter();
// input can be a FilePath, Stream, list of FilePaths or list of Streams
Object input = "InputDocument.docx";
string outputFileFormat = "pdf";
ConversionMode conversionMode = ConversionMode.ConvertToSeperateFiles;
List<Stream> outputStreams = documentConverter.ConvertToStream(input, outputFileFormat, conversionMode);
Disclaimer: I work for Gnostice.
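If you want to convert a byte array, you can use Metamorphosis:
// Assumes the SautinSoft PDF Metamorphosis .Net library is referenced
SautinSoft.PdfMetamorphosis p = new SautinSoft.PdfMetamorphosis();
string docxPath = @"example.docx";
string pdfPath = Path.ChangeExtension(docxPath, ".pdf");
byte[] docx = File.ReadAllBytes(docxPath);
// Convert DOCX to PDF in memory
byte[] pdf = p.DocxToPdfConvertByte(docx);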
if (pdf != null)
{
// Save the PDF document to a file for a viewing purpose.
File.WriteAllBytes(pdfPath, pdf);
System.Diagnostics.Process.Start(pdfPath);
}
else
{
System.Console.WriteLine("Conversion failed!");
Console.ReadLine();
}
I recently used the SautinSoft 'Document .Net' library to convert DOCX to PDF in an application with a React front end and .NET Core microservices on the back end. It takes only 15 seconds to generate a 23-page PDF. Those 15 seconds include getting the data from the database, merging it with the DOCX template, and converting the result to PDF. The code is deployed to an Azure Linux box and works fine.
https://sautinsoft.com/products/document/
Sample code
public string GeneratePDF(PDFDocumentModel document)
{
byte[] output = null;
using (var outputStream = new MemoryStream())
{
// Create single pdf.
DocumentCore singlePDF = new DocumentCore();
var documentCores = new List<DocumentCore>();
foreach (var section in document.Sections)
{
documentCores.Add(GenerateDocument(section));
}
foreach (var dc in documentCores)
{
// Create import session.
ImportSession session = new ImportSession(dc, singlePDF, StyleImportingMode.KeepSourceFormatting);
// Loop through all sections in the source document.
foreach (Section sourceSection in dc.Sections)
{
// Because we are copying a section from one document to another,
// it is required to import the Section into the destination document.
// This adjusts any document-specific references to styles, bookmarks, etc.
// Importing a element creates a copy of the original element, but the copy
// is ready to be inserted into the destination document.
Section importedSection = singlePDF.Import<Section>(sourceSection, true, session);
// First section start from new page.
if (dc.Sections.IndexOf(sourceSection) == 0)
importedSection.PageSetup.SectionStart = SectionStart.NewPage;
// Now the new section can be appended to the destination document.
singlePDF.Sections.Add(importedSection);
//Paging
HeaderFooter footer = new HeaderFooter(singlePDF, HeaderFooterType.FooterDefault);
// Create a new paragraph to insert a page numbering.
// So that, our page numbering looks as: Page N of M.
Paragraph par = new Paragraph(singlePDF);
par.ParagraphFormat.Alignment = HorizontalAlignment.Center;
CharacterFormat cf = new CharacterFormat() { FontName = "Consolas", Size = 11.0 };
par.Content.Start.Insert("Page ", cf.Clone());
// Page numbering is a Field.
Field fPage = new Field(singlePDF, FieldType.Page);
fPage.CharacterFormat = cf.Clone();
par.Content.End.Insert(fPage.Content);
par.Content.End.Insert(" of ", cf.Clone());
Field fPages = new Field(singlePDF, FieldType.NumPages);
fPages.CharacterFormat = cf.Clone();
par.Content.End.Insert(fPages.Content);
footer.Blocks.Add(par);
importedSection.HeadersFooters.Add(footer);
}
}
var pdfOptions = new PdfSaveOptions();
pdfOptions.Compression = false;
pdfOptions.EmbedAllFonts = false;
pdfOptions.EmbeddedImagesFormat = PdfSaveOptions.EmbImagesFormat.Png;
pdfOptions.EmbeddedJpegQuality = 100;
//dont allow editing after population, also ensures content can be printed.
pdfOptions.PreserveFormFields = false;
pdfOptions.PreserveContentControls = false;
if (!string.IsNullOrEmpty(document.PdfProperties.Title))
{
singlePDF.Document.Properties.BuiltIn[BuiltInDocumentProperty.Title] = document.PdfProperties.Title;
}
if (!string.IsNullOrEmpty(document.PdfProperties.Author))
{
singlePDF.Document.Properties.BuiltIn[BuiltInDocumentProperty.Author] = document.PdfProperties.Author;
}
if (!string.IsNullOrEmpty(document.PdfProperties.Subject))
{
singlePDF.Document.Properties.BuiltIn[BuiltInDocumentProperty.Subject] = document.PdfProperties.Subject;
}
singlePDF.Save(outputStream, pdfOptions);
output = outputStream.ToArray();
}
return Convert.ToBase64String(output);
}
I want to read a PDF file line by line while keeping its original formatting.
Can I do this with iTextSharp?
I use the following code:
// Requires: using System.Text; (for StringBuilder)
private void button1_Click(object sender, EventArgs e)
{
    string path = "C:\\Documents and Settings\\Rafael\\Desktop\\Imprimiendo\\Print1.pdf";
    StringBuilder text = new StringBuilder();
    PdfReader reader = new PdfReader(path);
    try
    {
        for (int page = 1; page <= reader.NumberOfPages; page++)
        {
            // Append each page; assigning inside the loop would keep only the last page
            text.AppendLine(PdfTextExtractor.GetTextFromPage(reader, page));
        }
    }
    finally
    {
        reader.Close();
    }
    richTextBox1.Text = text.ToString();
}
Thanks, I really need your help.
If your PDF file contains only a small amount of data, iTextSharp is the best choice; you may find an answer here:
Reading PDF content with itextsharp dll in VB.NET or C#
However, if your PDF file contains a large amount of data, iTextSharp may struggle with this task. In that case you may need another third-party library. This article may help:
Read PDF file in C#
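For the "line by line" part of the question, here is a small sketch of how the text of one page could be split into lines once it has been extracted. It assumes the string returned for a page separates lines with newline characters, which I have not verified for every extraction strategy; `PageTextLines` is a hypothetical helper name, and the sketch uses only the .NET base class library.

```csharp
using System.Collections.Generic;
using System.IO;

public static class PageTextLines
{
    // Splits extracted page text into individual lines.
    // StringReader.ReadLine treats "\n", "\r" and "\r\n" as line breaks.
    public static List<string> Split(string pageText)
    {
        var lines = new List<string>();
        using (var reader = new StringReader(pageText))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }
}
```

Calling `PageTextLines.Split(text)` on the string obtained for each page gives a list you can process one line at a time. Note that this recovers line breaks only; fonts, colours, and exact positions are not part of the extracted string.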
I have read some posts about populating Word documents, but I need to populate a Word document (Office 2007) using C#. For example, I want a Word document containing a label [NAME], replace that label with my value from C#, and do all of this in an ASP.NET MVC 3 controller. Any ideas?
You could use the OpenXML SDK provided by Microsoft to manipulate Word documents. And here's a nice article (it's actually the third of a series of 3 articles) with a couple of examples.
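To make the placeholder idea from the question concrete, here is a minimal sketch that does not use the OpenXML SDK at all. It treats the .docx file as the zip package it is and does a plain text replacement in `word/document.xml` using `System.IO.Compression`. The class name `DocxPlaceholder` is hypothetical, and the sketch assumes Word has stored the placeholder (for example `[NAME]`) as one contiguous piece of text; Word sometimes splits text across several runs, in which case this simple replacement will not find it.

```csharp
using System.IO;
using System.IO.Compression;
using System.Security;
using System.Text;

public static class DocxPlaceholder
{
    // Replaces every occurrence of placeholder in the main document part with value.
    // The value is XML-escaped so characters such as & or < do not break the document.
    public static void Replace(string docxPath, string placeholder, string value)
    {
        using (ZipArchive archive = ZipFile.Open(docxPath, ZipArchiveMode.Update))
        {
            ZipArchiveEntry entry = archive.GetEntry("word/document.xml");
            if (entry == null)
            {
                throw new InvalidDataException("word/document.xml not found in package.");
            }

            string xml;
            using (var reader = new StreamReader(entry.Open(), Encoding.UTF8))
            {
                xml = reader.ReadToEnd();
            }

            xml = xml.Replace(placeholder, SecurityElement.Escape(value));

            // Recreate the entry with the updated content.
            entry.Delete();
            ZipArchiveEntry updated = archive.CreateEntry("word/document.xml");
            using (var writer = new StreamWriter(updated.Open(), new UTF8Encoding(false)))
            {
                writer.Write(xml);
            }
        }
    }
}
```

Work on a copy of the template, for example `File.Copy(templatePath, outputPath, true)` followed by `DocxPlaceholder.Replace(outputPath, "[NAME]", "Alice")`. For anything beyond simple text placeholders, the OpenXML SDK is the more robust route.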
You can do it like this:
- Add bookmarks to your Word document template
- Work on a copy of your Word template
- Set the bookmark values from C# code, then save or print your file.
Be careful to release the Word process correctly if you handle several documents in your application :)
OP's solution extracted from the question:
The solution I found is this:
static void Main(string[] args)
{
Console.WriteLine("Starting up Word template updater ...");
//get path to template and instance output
string docTemplatePath = @"C:\Users\user\Desktop\Doc Offices XML\earth.docx";
string docOutputPath = @"C:\Users\user\Desktop\Doc Offices XML\earth_Instance.docx";
//create copy of template so that we don't overwrite it
File.Copy(docTemplatePath, docOutputPath);
Console.WriteLine("Created copy of template ...");
//stand up object that reads the Word doc package
using (WordprocessingDocument doc = WordprocessingDocument.Open(docOutputPath, true))
{
//create XML string matching custom XML part
string newXml = "<root>" +
"<Earth>Outer Space</Earth>" +
"</root>";
MainDocumentPart main = doc.MainDocumentPart;
main.DeleteParts<CustomXmlPart>(main.CustomXmlParts);
//MainDocumentPart mainPart = doc.AddMainDocumentPart();
//add and write new XML part
CustomXmlPart customXml = main.AddCustomXmlPart(CustomXmlPartType.CustomXml);
using (StreamWriter ts = new StreamWriter(customXml.GetStream()))
{
ts.Write(newXml);
}
//closing WordprocessingDocument automatically saves the document
}
Console.WriteLine("Done");
Console.ReadLine();
}
Is there a way to convert HTML or PDF to RTF/DOC, or HTML/PDF to an image, using DevExpress or Infragistics?
I tried this using DevExpress:
string html = new StreamReader(Server.MapPath(@".\teste.htm")).ReadToEnd();
RichEditControl richEditControl = new RichEditControl();
string rtf;
try
{
richEditControl.HtmlText = html;
rtf = richEditControl.RtfText;
}
finally
{
richEditControl.Dispose();
}
StreamWriter sw = new StreamWriter(@"D:\teste.rtf");
sw.Write(rtf);
sw.Close();
But I have complex HTML content (tables, backgrounds, CSS, etc.) and the final result is not good.
To convert HTML content into an image or a PDF, you can use the following code:
using (RichEditControl richEditControl = new RichEditControl()) {
richEditControl.LoadDocument(Server.MapPath(@".\teste.htm"), DocumentFormat.Html);
using (PrintingSystem ps = new PrintingSystem()) {
PrintableComponentLink pcl = new PrintableComponentLink(ps);
pcl.Component = richEditControl;
pcl.CreateDocument();
//pcl.PrintingSystem.ExportToPdf("teste.pdf");
pcl.PrintingSystem.ExportToImage("teste.jpg", System.Drawing.Imaging.ImageFormat.Jpeg);
}
}
I suggest using the latest DevExpress version (10.1.5 at the time of writing). It handles tables much better than previous versions.
Use the following code to avoid encoding issues (the StreamReader and StreamWriter in your sample always use Encoding.UTF8, which corrupts any content stored in another encoding):
using (RichEditControl richEditControl = new RichEditControl()) {
richEditControl.LoadDocument(Server.MapPath(@".\teste.htm"), DocumentFormat.Html);
richEditControl.SaveDocument(@"D:\teste.rtf", DocumentFormat.Rtf);
}
Also take a look at the richEditControl.Options.Import.Html and richEditControl.Options.Export.Rtf properties, you may find them useful for some cases.
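If you do need to read the HTML through a StreamReader yourself, the encoding point above can be handled by passing the encoding explicitly. The following is a minimal sketch using only the .NET base class library; `HtmlFileReader` is a hypothetical helper name, and the choice of ISO-8859-1 in the usage note is only an example of a non-UTF-8 encoding.

```csharp
using System.IO;
using System.Text;

public static class HtmlFileReader
{
    // Reads a text file using the given encoding. If the file starts with a
    // byte order mark, the encoding indicated by the mark takes precedence.
    public static string ReadAllText(string path, Encoding fallbackEncoding)
    {
        using (var reader = new StreamReader(path, fallbackEncoding, true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

For a file saved as Latin-1, for instance, you would call `HtmlFileReader.ReadAllText(path, Encoding.GetEncoding("iso-8859-1"))` and then assign the result to `richEditControl.HtmlText`.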