Convert xsl - fo to pdf - c#

I have an XSL-FO file and data is available in xml/json formats. I wanted to create a pdf using this xsl structure.
Can anyone suggest any open source libraries for conversion? I want it to be done at the C# level.
Note: I tried converting to html but as it is xsl-fo file I can't get the alignment.

You can use Apache FOP (https://xmlgraphics.apache.org/fop/) tool. It can generate PDF document from XSL-FO input.
From the C# it is possible to start Apache FOP process pointing it to the XSL-FO file (or use stdin, so you don't have to use any temporary files). After process exists you get PDF file (in file on disk or stdout).
For start you can make Apache FOP read XSL-FO file and write PDF file to disk, for that use Process class (https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.process?view=netframework-4.8):
Draft code snippet (may contains errors but it should be a good start for you):
Process.Start("C:\\path\\to\\fop input_xsl-fo.xml output.pdf").WaitForExit();

I tried using fo.net and it worked for me,
here is the sample code
string lBaseDir = System.IO.Path.GetDirectoryName("e:\thermalpdf.xsl");
XslCompiledTransform lXslt = new XslCompiledTransform();
lXslt.Load("e:\thermalpdf.xsl");
lXslt.Transform("e:\billingData1.xml", "books1.fo");
FileStream lFileInputStreamFo = new FileStream("books1.fo", FileMode.Open);
FileStream lFileOutputStreamPDF = new FileStream("e:\response2.pdf", FileMode.Create);
FonetDriver lDriver = FonetDriver.Make();
lDriver.BaseDirectory = new DirectoryInfo(lBaseDir);
lDriver.CloseOnExit = true;
lDriver.Render(lFileInputStreamFo, lFileOutputStreamPDF);
lFileInputStreamFo.Close();
lFileOutputStreamPDF.Close();

Related

C# - Save XML file as PDF as Raw Image (not converting)

I am trying to save xml file as PDF as it is. In other words, I am trying to create PDF file that shows content of XML like a screenshot (like raw screenshot). My client somehow needs it like this. I couldn't really find the same question on stackoverflow. Is there anyway I can do this using iText or some other library?
Thank you!
First extract your text from XML file:
XmlDocument doc = new XmlDocument();
doc.Load("c:\\temp.xml");
string myText;
foreach(XmlNode node in doc.DocumentElement.ChildNodes){
myText= node.InnerText; //or loop through its children as well
}
Then create a PDF file and pass this text into it
in this case here I use PDFFlow library to create pdf documents
var DocumentBuilder.New()
.AddSection().AddParagraphToSection(myText).ToDocument()
.Build("Result.PDF");
If you load the xml in a browser you can easily save to searchable PDF
In a shell call (replace msedge with chrome if necessary)
"path to\msedge.exe" --headless --disable-gpu --print-to-pdf="out path\xml.pdf" --enable-logging "file://path to A\file.xml"
Enable logging helps as it can take a long time to process without any visual progress.
[0818/231038.640:INFO:headless_shell.cc(648)] Written to file ...\xml.pdf.
You can also add --print-to-pdf-no-header. Also if adding some style consider --run-all-compositor-stages-before-draw but I have no idea if that works for xml.
ForGet Image --screenshot as a 40 Page high XML as JPEG does NOT translate well to PDF. I tried :-)
If you want that as an image PDF then Re-Print the PDF to PDF using a command line viewer such as this since it is ONLY Print As Image output :-) also note it can in addition read the XML in Black and White (NO linting).
But have not tested how well it does XML2PDF via command line print
SumatraPDF -print-to "My Print to PDF" "path to\filename.pdf" (or xml in mono)
Note "My Print to PDF" is a promptless port you need to configure as required.

C# SaveFileDialog CSV (MS-DOS)

I'm using C# WPF and trying to save a file with a CSV (MS-DOS) type, every time it saves it as a normal CSV file.
How I can force it to be (MS-DOS) type?
According to this page the difference is...
If you export as Windows CSV, those fields are encoded using the Windows-1252 code page. DOS encoding usually uses code page 437, which maps characters used in old pre-Windows PCs.
So, you need to specify an encoding when you create your StreamWriter. Maybe something along the lines of (untested)...
using (StreamWriter sw = new StreamWriter(File.Open("filename", FileMode.Create), Encoding.GetEncoding(437)))
{
sw.WriteLine("my text...");
}

How to convert docx to html file using open xml with formatting

I know there are lot of question having same title but I am currently having some issue for them I didn't get the correct way to go.
I am using Open xml sdk 2.5 along with Power tool to convert .docx file to .html file which uses HtmlConverter class for conversion.
I am successfully able to convert the docx file into the Html file but the problem is, html file doesn't retain the original formatting of the document file. eg. Font-size,color,underline,bold etc doesn't reflect into the html file.
Here is my existing code:
public void ConvertDocxToHtml(string fileName)
{
byte[] byteArray = File.ReadAllBytes(fileName);
using (MemoryStream memoryStream = new MemoryStream())
{
memoryStream.Write(byteArray, 0, byteArray.Length);
using (WordprocessingDocument doc = WordprocessingDocument.Open(memoryStream, true))
{
HtmlConverterSettings settings = new HtmlConverterSettings()
{
PageTitle = "My Page Title"
};
XElement html = HtmlConverter.ConvertToHtml(doc, settings);
File.WriteAllText(#"E:\Test.html", html.ToStringNewLineOnAttributes());
}
}
}
So I just want to know if is there any way by which I can retain the formatting in converted HTML file.
I know about some third party APIs which does the same thing. But I would prefer if there any way using open xml or any other open source to do this.
PowerTools for Open XML just released a new HtmlConverter module. It now contains an open source, free implementation of a conversion from DOCX to HTML formatted with CSS. The module HtmlConverter.cs supports all paragraph, character, and table styles, fonts and text formatting, numbered and bulleted lists, images, and more. See https://openxmldeveloper.org/
Your end result will not look exactly the way your Word Document turns out, but this link might help.
You might want to find an external tool to help you do this, like Aspose Words
You can use OpenXML Viewer extension for Firefox for Converting with formatting.
http://openxmlviewer.codeplex.com
This works for me. Hope this helps.

Creating report with Aspose.Word without losing formatting

I am using Aspose.Words to create reports from a template file (.docx filetype).
After using Aspose.Words to modify the template file and saving it into a new file, the formatting of the template file were lost (such as bold text, comments, etc).
I have tried:
Aspose.Words.Document doc = new Document(inputStream);
var outputStream = new MemoryStream();
doc.Save(outputStream, SaveFormat.docx);
What I did not expect is that outputStream is much less bytes than inputStream although I have yet to make any modification on doc. It may the reason why the report file lose their formatting.
What should I try now?
Ok, the problem is because the current version of Aspose.Words I'm using does not support docx filetype. But it still can read text of a .docx file, and only text(without any associated formatting).

iTextSharp for PDF - how add file attachments?

I am using iTextSharp to create a PDF document in C#. I would like to attach another file to the PDF. I'm having just loads of trouble trying to do so. The examples here show some annotations, which apparently attachments are.
This is what I've tried:
writer.AddAnnotation(its.pdf.PdfAnnotation.CreateFileAttachment(writer, new iTextSharp.text.Rectangle(100,100,100,100), "File Attachment", its.pdf.PdfFileSpecification.FileExtern(writer, "C:\\test.xml")));
Well, what happens is it does add an annotation on the PDF (appears as a little comment voice balloon), which i don't want. test.xml is shown in the attachments pane in Adobe Reader, but it can't be read or saved, and its file size is unknown so it's likely that it's never being properly attached.
Any suggestions?
Well, I got some code working to attach it:
its.Document PDFD = new its.Document(its.PageSize.LETTER);
its.pdf.PdfWriter writer;
writer = its.pdf.PdfWriter.GetInstance(PDFD, new FileStream(targetpath, FileMode.Create));
its.pdf.PdfFileSpecification pfs = its.pdf.PdfFileSpecification.FileEmbedded(writer, "C:\\test.xml", "New.xml", null);
writer.AddFileAttachment(pfs);
where "its"="iTextSharp.text"
Now to read the attachment!

Categories