How to print svg string without file creation - c#

I generate SVG strings in my C# program. How can I print them without creating a file?

A simple example using a WebBrowser class.
You can of course use a WebBrowser Control instead, if you also need to show the SVG on screen.
You can do the same with a WebView2 Control / CoreWebView2 object.
Using a bare-bone HTML5 document text, set the content of an IMG element to the SVG string converted to base64, setting the mime-type of the data to image/svg+xml.
The Width of the IMG element is set to 200 (~2 inches, ~50 millimeters).
Modify as required (considering the DPI, if necessary).
Then create a WebBrowser instance, generate a new empty Document object and Write() the HTML content to it (this method renders the content immediately).
After that, call the WebBrowser.Print() method to send the rendered SVG to the default Printer.
Assume svgContent is your SVG string:
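// Convert.ToBase64String takes a byte[], so encode the string first (UTF-8 assumed; needs using System.Text)
string base64 = Convert.ToBase64String(Encoding.UTF8.GetBytes(svgContent));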
int svgWidth = 200;
string html = $"<!DOCTYPE html><html><body>" +
$"<img width={svgWidth} src=\"data:image/svg+xml;base64,{base64}\"/>" +
"</body></html>";
var wb = new WebBrowser() {ScriptErrorsSuppressed = true };
try {
wb.Navigate("");
var doc = wb.Document.OpenNew(true);
doc.Write(html);
wb.Print();
}
finally {
wb.Dispose();
}

You can place <svg>whatever</svg> tags directly inline into your html5 document as text. When a browser receives that document it renders the svg in whatever. This is a double-cool way to do it because your CSS applies to your svg elements.
Here's a decent introduction. Look for the section on how to use inline svg.
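The inline approach could be sketched like this. It is a minimal illustration, not part of the original answers: the class name, method name and `widthPx` parameter are made up, and it assumes the SVG string is a complete `<svg>...</svg>` element and that the rendering engine in use supports inline SVG (older engines may not).

```csharp
using System;

static class InlineSvgHtml
{
    // Wraps an SVG fragment in a bare-bones HTML5 document.
    // widthPx is a hypothetical parameter mirroring the 200px width used earlier.
    public static string Build(string svgContent, int widthPx)
    {
        if (svgContent == null) throw new ArgumentNullException(nameof(svgContent));

        // The CSS rule targets the inline <svg> element directly,
        // which is the advantage of inlining over an <img> data URI.
        return "<!DOCTYPE html><html><head>" +
               $"<style>svg {{ width: {widthPx}px; height: auto; }}</style>" +
               "</head><body>" + svgContent + "</body></html>";
    }
}
```

The resulting string can be written to the document with doc.Write(html) in the same way as the base64 variant.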

Related

Inserting an image after text

I'm making a simple program with .NET and iText7
Inserting a signature image in a PDF document is one of the functions under production.
So far I can insert the image into the PDF and save the result, but I don't know how to make the image go behind the text.
The Canvas class seems to make this possible, but no matter how many times I look at the examples, I can't find any parameter related to placement.
I would appreciate a keyword or pointer that would let me implement this.
The sample results are attached to help understanding. In the figure, the left is the capture of the PDF in which I inserted my signature using a word processor, and the right is the capture of the PDF generated through IText.
I'm using iText 7.2.1 for .NET.
I attached the code below just in case it was necessary.
Thank you.
public void PDF_SIGN(FileInfo old_fi)
{
string currentPath = System.IO.Path.GetDirectoryName(Process.GetCurrentProcess().MainModule.FileName);
String imageFile = currentPath + "\\sign.jpg";
ImageData data = ImageDataFactory.Create(imageFile);
string source = old_fi.FullName;
string sourceFileName = System.IO.Path.GetFileNameWithoutExtension(source);
string sourceFileExtenstion = System.IO.Path.GetExtension(source);
string dest = old_fi.DirectoryName + "\\" + sourceFileName + "(signed)" + sourceFileExtenstion;
PdfDocument pdfDoc = new PdfDocument(new PdfReader(source), new PdfWriter(dest));
Document document = new Document(pdfDoc);
iText.Layout.Element.Image image = new iText.Layout.Element.Image(data);
image.ScaleAbsolute(Xsize, Ysize);
image.SetFixedPosition(1, Xaxis, Yaxis);
document.Add(image);
document.Close();
pdfDoc.Close();
}
Sample Result (Left: Goal, Right: Current result):
You can organize content you add using iText in the same pass before or behind each other simply by the order of adding, and layout elements may have background images or colors.
Consequently, the previously existing content of the source document usually serves as a mere background for everything new. The exception is when you draw to a page in a content stream that precedes the earlier content.
Unfortunately you cannot use the Document class for this as its renderer automatically works in the foreground. But you can use the Canvas class here; this class only works on a single object (e.g. a single page) but it can be initialized in a far more flexible way.
In your case, therefore, replace
Document document = new Document(pdfDoc);
iText.Layout.Element.Image image = new iText.Layout.Element.Image(data);
image.ScaleAbsolute(Xsize, Ysize);
image.SetFixedPosition(1, Xaxis, Yaxis);
document.Add(image);
document.Close();
by
iText.Layout.Element.Image image = new iText.Layout.Element.Image(data);
image.ScaleAbsolute(Xsize, Ysize);
image.SetFixedPosition(1, Xaxis, Yaxis);
PdfPage pdfPage = pdfDoc.GetFirstPage();
PdfCanvas pdfCanvas = new PdfCanvas(pdfPage.NewContentStreamBefore(), pdfPage.GetResources(), pdfDoc);
using (Canvas canvas = new Canvas(pdfCanvas, pdfPage.GetCropBox()))
{
canvas.Add(image);
}
and you should get the desired result.
(Actually I tested that using Java and ported it to C# in writing this answer. I hope it's ported all right.)
As an aside, if you only want to put an image on the page, you don't really need the Canvas, you can directly use one of the AddImage* methods of PdfCanvas. For multiple elements to be automatically arranged, though, using the Canvas is a good idea.
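As a rough sketch of that aside, the image can be drawn straight onto a PdfCanvas created over a content stream that precedes the existing content. The class and parameter names are illustrative, and it assumes your iText version provides the `AddImageFittedIntoRectangle(ImageData, Rectangle, bool)` overload; I have not run this against 7.2.1.

```csharp
using iText.IO.Image;
using iText.Kernel.Geom;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas;

static class SignatureStamper
{
    // Draws an image behind the existing content of the first page.
    // x, y, width and height are in PDF user-space units (points); all names are illustrative.
    public static void StampBehind(string source, string dest, string imageFile,
                                   float x, float y, float width, float height)
    {
        using (PdfDocument pdfDoc = new PdfDocument(new PdfReader(source), new PdfWriter(dest)))
        {
            PdfPage page = pdfDoc.GetFirstPage();

            // A stream inserted *before* the existing content is painted first,
            // so the original page content ends up on top of the image.
            PdfCanvas canvas = new PdfCanvas(page.NewContentStreamBefore(), page.GetResources(), pdfDoc);

            ImageData data = ImageDataFactory.Create(imageFile);
            canvas.AddImageFittedIntoRectangle(data, new Rectangle(x, y, width, height), false);
        }
    }
}
```

Note that the image only shows through where the existing page content is transparent; an opaque background in the source page would hide it.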
Also I said above that you cannot use Document here. Actually you can if you replace the document renderer that class uses. For the task at hand that would have been an overkill, though.

iText 7 add header and footer in HTML to PDF

I want to add a repeating header and footer to a PDF that iText7 creates by converting HTML.
However, all the examples I have found so far on the internet describe how to create a blank PDF in code with a header and footer.
Does anybody know how I can achieve this? I already tried to use CSS print media queries to specify some areas, but it seems those are ignored by iText7.
The conversion is really simple:
string input = "Bestellung.html";
string output = "Bestellung.pdf";
HtmlConverter.ConvertToPdf(new FileInfo(input), new FileInfo(output));
Bestellung.html is just a plain HTML file with some demo content.
See mediaDeviceDescription under ConverterProperties.
If your input file uses this feature, then you can simply tell pdfHTML
to interpret the relevant set of rules:
ConverterProperties props = new ConverterProperties();
props.setMediaDeviceDescription(new MediaDeviceDescription(MediaType.PRINT));
Then you call the method with this signature:
static void convertToPdf(InputStream htmlStream, PdfDocument pdfDocument, ConverterProperties converterProperties)
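The snippet above uses the Java naming. A C# port for the file-based conversion in the question might look like the sketch below. The wrapper class and method names are made up, and the `iText.StyledXmlParser.Css.Media` namespace is an assumption that holds for recent pdfHTML releases; older releases kept these types elsewhere.

```csharp
using System.IO;
using iText.Html2pdf;
using iText.StyledXmlParser.Css.Media;

static class PrintMediaConversion
{
    // Converts an HTML file to PDF while applying the CSS rules for the "print" media type.
    public static void Convert(string input, string output)
    {
        ConverterProperties props = new ConverterProperties();
        props.SetMediaDeviceDescription(new MediaDeviceDescription(MediaType.PRINT));

        HtmlConverter.ConvertToPdf(new FileInfo(input), new FileInfo(output), props);
    }
}
```

Usage would be PrintMediaConversion.Convert("Bestellung.html", "Bestellung.pdf").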

Convert html to image with pagination using C#

I'm working on a Windows service in C# 4.0 which converts different files into images (TIFF and JPEG).
I have a problem when I want to convert an HTML file (usually an e-mail) into an image.
I use WebBrowser
var browser = new WebBrowser();
browser.DocumentCompleted += this.BrowserDocumentCompleted;
browser.DocumentText = html;
and DrawToBitmap
var browser = sender as WebBrowser;
Rectangle body = new Rectangle(browser.Document.Body.ScrollRectangle.X * scaleFactor,
browser.Document.Body.ScrollRectangle.Y * scaleFactor,
browser.Document.Body.ScrollRectangle.Width * scaleFactor,
browser.Document.Body.ScrollRectangle.Height * scaleFactor);
browser.Height = body.Height;
Bitmap output = new Bitmap(body.Width, body.Height);
browser.DrawToBitmap(output, body);
It works fine for small or medium HTML, but with long HTML (around 22,000 px high or more)
I get GDI exceptions on DrawToBitmap:
Invalid parameter
Not an image GDI+ valid
According to the internet, this kind of error happens because the image is too big.
My question: how can I convert HTML into X images (pagination) without generating the big image and cropping it afterwards, and, if possible, without using a library?
Thank you in advance.
Edit: I found a tricky solution: wrap the HTML in one div that sets the page height and another that sets the offset, for example:
<div style="height:3000px; overflow:hidden">
<div style="margin-top:-3000px">
But this solution can cut through the middle of a line of text or an image...
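The wrapping trick can be generated for every page in a loop. This is only a sketch of the idea from the edit, with made-up names; `totalHeight` is assumed to come from something like Document.Body.ScrollRectangle.Height, and it has the same drawback of cutting through lines and images.

```csharp
using System;
using System.Collections.Generic;

static class HtmlPager
{
    // Yields one HTML fragment per page. Each fragment shows a pageHeight-tall
    // window of the content, shifted up by the page's offset.
    public static IEnumerable<string> Slice(string html, int totalHeight, int pageHeight)
    {
        if (pageHeight <= 0) throw new ArgumentOutOfRangeException(nameof(pageHeight));

        for (int offset = 0; offset < totalHeight; offset += pageHeight)
        {
            yield return
                $"<div style=\"height:{pageHeight}px; overflow:hidden\">" +  // clips to one page
                $"<div style=\"margin-top:-{offset}px\">" +                   // scrolls content up
                html +
                "</div></div>";
        }
    }
}
```

Each fragment would then be loaded into the WebBrowser and drawn to a bitmap of pageHeight pixels, so no single bitmap has to cover the whole document.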
You can try creating a custom IE Print Template and use DEVICERECT and LAYOUTRECT elements to drive the pagination. The lines wouldn't get cut in the middle then, and you'd capture a bitmap of each DEVICERECT as a page. You'd need to issue CGID_MSHTML/IDM_SETPRINTTEMPLATE command to MSHTML document object (webBrowser.Document.DomDocument as IOleCommandTarget) to enable the Print Template-specific element tags like those. More information about Print Templates can be found here.
[EDITED] You can even use IHTMLElementRender::DrawToDC API on a DEVICERECT object to draw its content on a bitmap DC. You'd need to enable FEATURE_IVIEWOBJECTDRAW_DMLT9_WITH_GDI and disable FEATURE_GPU_RENDERING feature control settings for your WebBrowser hosting app to use IHTMLElementRender::DrawToDC.
Thank you for your answer, Noseratio.
I found a solution that uses printing with a virtual printer to get image files.
Save the HTML to a file and remove the encoding declaration:
html = Regex.Replace(html, "<meta[^>]*http-equiv=\"Content-Type\"[^>]*>", string.Empty, RegexOptions.Multiline);
using (var f = File.Create(filePath))
{
var bytes = Encoding.Default.GetBytes(html);
f.Write(bytes, 0, bytes.Length);
}
Run the print job without showing the WebBrowser or the print dialog:
const short PRINT_WAITFORCOMPLETION = 2;
const int OLECMDID_PRINT = 6;
const int OLECMDEXECOPT_DONTPROMPTUSER = 2;
dynamic ie = browser.ActiveXInstance;
ie.ExecWB(OLECMDID_PRINT, OLECMDEXECOPT_DONTPROMPTUSER, PRINT_WAITFORCOMPLETION);
I use PDFCreator as the virtual printer, and it keeps all the files in a folder. Collecting those files is not easy (knowing when printing has finished, how many files there are, and when they can be used...), but that is beyond the scope of this post!

Screen Scraping, Web Scraping, Web Harvesting, Web Data Extraction, etc. using C# and the .NET Framework

I am working on a Microsoft .NET application in C# for web harvesting, web scraping, web data extraction, screen scraping, etc., whatever you want to call it. For parsing HTML, I'm attempting to incorporate HTML Agility Pack, but it's not as easy as I thought it would be. I have included some specifications and images of what I have so far and was hoping to get your opinions on how I could proceed. Basically, I want to do something similar to the layout used in Visual Web Ripper, but I have no idea how they do it... Any ideas?
Specifications:
My goal is to make a very user friendly point-and-click application for downloading data and images from the web. I would like to load HTML pages using the web browser, and output the parsed data and image links into the text box. The user can specify which HTML tags they want and then download the data into the grid. Finally, export the data into whatever format they need.
I'm trying to use HTML Agility Pack to load the HTML on the webpage and display it in the textbox.
// Load Web Browser
private void Form6_Load(object sender, EventArgs e)
{
// Navigate to webpage
webBrowser.Navigate("http://www.webopedia.com/TERM/H/HTML.html");
// Save URL to memory
SiteMemoryArray[count] = urlTextBox.Text;
// Load HTML from webBrowser
HtmlWindow window = webBrowser.Document.Window;
string str = window.Document.Body.OuterHtml;
// Extract tags using HtmlAgilityPack and display in textbox
HtmlAgilityPack.HtmlDocument HtmlDoc = new HtmlAgilityPack.HtmlDocument();
HtmlDoc.LoadHtml(str);
HtmlAgilityPack.HtmlNodeCollection Nodes =
HtmlDoc.DocumentNode.SelectNodes("//a");
foreach (HtmlAgilityPack.HtmlNode Node in Nodes)
{
textBox2.Text += Node.OuterHtml + "\r\n";
}
}
Using:
HtmlWindow window = webBrowser.Document.Window;
I get the error: Object reference not set to an instance of an object.
The page has probably not finished loading when you reference the browser window, so Document is still null. The WebBrowser control raises an event when loading is done (DocumentCompleted in Windows Forms); do the parsing in that handler. See this SO answer for an example: C# how to wait for a webpage to finish loading before continuing
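A minimal sketch of that pattern for a Windows Forms WebBrowser follows. The form, field and handler names are made up for illustration; the URL is the one from the question.

```csharp
using System;
using System.Windows.Forms;

public class ScraperForm : Form
{
    private readonly WebBrowser webBrowser = new WebBrowser { Dock = DockStyle.Fill };
    private readonly TextBox output = new TextBox { Multiline = true, Dock = DockStyle.Bottom, Height = 200 };

    public ScraperForm()
    {
        Controls.Add(webBrowser);
        Controls.Add(output);

        // Subscribe before navigating so the event cannot be missed.
        webBrowser.DocumentCompleted += OnDocumentCompleted;
        Load += (s, e) => webBrowser.Navigate("http://www.webopedia.com/TERM/H/HTML.html");
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // The event can fire several times (e.g. once per frame);
        // wait until the whole page reports it is complete.
        if (webBrowser.ReadyState != WebBrowserReadyState.Complete) return;
        if (webBrowser.Document == null || webBrowser.Document.Body == null) return;

        // Only now is it safe to read the DOM.
        output.Text = webBrowser.Document.Body.OuterHtml;
    }
}
```

The HtmlAgilityPack parsing from the question can then run inside the handler on the OuterHtml string instead of in Form6_Load.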
I am not familiar with HTMLAgilityPack but one component I have used in the past is SGMLReader: http://developer.mindtouch.com/SgmlReader. This functions like a drop-in replacement for an XMLReader and will even convert the document to XML for you if you want. You can load it up into an XMLDocument (or even an XDocument) and then it's up to you what you do with it.
So I'd suggest using a HTTPWebRequest to get the HTML and then load the HTML into this component. that way you don't need to go anywhere near a WebBrowser control.
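A browser-free version could be sketched as below. Note that it swaps in HtmlAgilityPack (the library from the question) rather than the SgmlReader component this answer recommends, assumes the response is UTF-8, and uses made-up class and method names.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Net;
using HtmlAgilityPack;

static class LinkExtractor
{
    // Downloads the raw HTML of a page without a WebBrowser control.
    public static string Download(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd(); // StreamReader defaults to UTF-8 (assumption about the page)
        }
    }

    // Returns the outer HTML of every <a> element in the given markup.
    public static List<string> ExtractAnchors(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<string>();
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//a");
        if (nodes == null) return result; // SelectNodes returns null when nothing matches

        foreach (HtmlNode node in nodes)
            result.Add(node.OuterHtml);
        return result;
    }
}
```

Keep in mind that this only sees the HTML the server sends; content generated by scripts in the page will be missing.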
For screen scraping, if you are searching for particular images/shapes, you can use:
EMGU
You can also read the screen using WinAPI as such:
private Bitmap Capture(IntPtr hwnd)
{
return Capture(hwnd, GetClientRectangle());
}
private Bitmap Capture(IntPtr hwnd, Rectangle zone)
{
IntPtr hdcSrc = GetWindowDC(hwnd);
IntPtr hdcDest = CreateCompatibleDC(hdcSrc);
IntPtr hBitmap = CreateCompatibleBitmap(hdcSrc, zone.Width, zone.Height);
IntPtr hOld = SelectObject(hdcDest, hBitmap);
BitBlt(hdcDest, 0, 0, zone.Width, zone.Height, hdcSrc, zone.X, zone.Y, SRCCOPY);
SelectObject(hdcDest, hOld);
DeleteDC(hdcDest);
ReleaseDC(hwnd, hdcSrc);
Bitmap retBitmap = Bitmap.FromHbitmap(hBitmap);
DeleteObject(hBitmap);
return retBitmap;
}
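The capture code above relies on P/Invoke declarations that are not shown. One possible set is sketched here; the `NativeMethods` class name is a convention, and `GetClientRectangle` is my guess at the missing helper (the original calls it without arguments, so it presumably reads the handle from a field).

```csharp
using System;
using System.Drawing;
using System.Runtime.InteropServices;

static class NativeMethods
{
    public const int SRCCOPY = 0x00CC0020; // raster operation: copy source to destination

    [StructLayout(LayoutKind.Sequential)]
    public struct RECT { public int Left, Top, Right, Bottom; }

    [DllImport("user32.dll")]
    public static extern IntPtr GetWindowDC(IntPtr hWnd);

    [DllImport("user32.dll")]
    public static extern int ReleaseDC(IntPtr hWnd, IntPtr hDC);

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool GetClientRect(IntPtr hWnd, out RECT rect);

    [DllImport("gdi32.dll")]
    public static extern IntPtr CreateCompatibleDC(IntPtr hdc);

    [DllImport("gdi32.dll")]
    public static extern IntPtr CreateCompatibleBitmap(IntPtr hdc, int width, int height);

    [DllImport("gdi32.dll")]
    public static extern IntPtr SelectObject(IntPtr hdc, IntPtr hObject);

    [DllImport("gdi32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool BitBlt(IntPtr hdcDest, int xDest, int yDest, int width, int height,
                                     IntPtr hdcSrc, int xSrc, int ySrc, int rop);

    [DllImport("gdi32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool DeleteDC(IntPtr hdc);

    [DllImport("gdi32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool DeleteObject(IntPtr hObject);

    // Hypothetical helper: client area of a window as a System.Drawing.Rectangle.
    public static Rectangle GetClientRectangle(IntPtr hwnd)
    {
        RECT r;
        if (!GetClientRect(hwnd, out r)) return Rectangle.Empty;
        return Rectangle.FromLTRB(r.Left, r.Top, r.Right, r.Bottom);
    }
}
```

One caveat: GetWindowDC covers the whole window including the title bar and borders, while the client rectangle starts at (0, 0) of the client area, so the two coordinate systems do not line up exactly. If only the client area is wanted, user32's GetDC may be the better fit.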
To parse a HTML document:
using SHDocVw; //Interop.SHDocVw.dll
using mshtml; //Microsoft.mshtml.dll
InternetExplorer ie= new InternetExplorer();
ie.Navigate("www.example.com");
ie.Visible = true;
Thread.Sleep(5000); //Wait until page loads.
mshtml.HTMLDocument doc;
doc = (mshtml.HTMLDocument)ie.Document; //Gives the HTML document of the page (explicit cast, since ie.Document is loosely typed).
To get all elements of a tag:
//HTML element's tag name:
IHTMLElementCollection AnchorColl = doc.getElementsByTagName("a");
Then iterate over AnchorColl to process every element with that tag.

Can a PDF be converted to a vector image format that can be printed from .NET?

We have a .NET app which prints to both real printers and PDF, currently using PDFsharp, although that part can be changed if there's a better option. Most of the output is generated text or images, but there can be one or more pages that get appended to the end. Those pages are provided by the end-user in PDF format.
When printing to paper, our users use pre-printed paper, but in the case of an exported PDF, we concatenate those pages to the end, since they're already in PDF format.
We want to be able to embed those PDFs directly into the print stream so they don't need pre-printed paper. However, there aren't really any good options for rendering a PDF to a GDI page (System.Drawing.Graphics).
Is there a vector format the PDF could be converted to by some external program, that could be rendered to a GDI+ page without being degraded by conversion to a bitmap first?
In an article titled "How To Convert PDF to EMF In .NET," I have shown how to do this using our PDFOne .NET product. EMFs are vector graphics and you can render them on the printer canvas.
A simpler alternative for you is PDF overlay, explained in another article titled "PDF Overlay - Stitching PDF Pages Together in .NET." PDFOne allows x-y offsets in overlays, which lets you stitch pages at the edges. In the article cited here, I have overlaid the pages one over another by setting the offsets to zero. You will have to set them to the page width and height.
DISCLAIMER: I work for Gnostice.
Ghostscript can output PostScript (which is a vector format) which can be sent directly to some types of printers. For example, if you're using an LPR capable printer, the PS file can be sent directly to that printer using something like this project: http://www.codeproject.com/KB/printing/lpr.aspx
There are also some commercial options which can print a PDF (although I'm not sure if the internal mechanism is vector or bitmap based), for example http://www.tallcomponents.com/pdfcontrols2-features.aspx or http://www.tallcomponents.com/pdfrasterizer3.aspx
I finally figured out that there is an option that addresses my general requirement of embedding a vector format into a print job, but it doesn't work with GDI based printing.
The XPS file format created by Microsoft XPS Writer print driver can be printed from WPF, using the ReachFramework.dll included in .NET. By using WPF for printing instead of GDI, it's possible to embed an XPS document page into a larger print document.
The downside is that WPF printing works quite differently, so all the support code that directly uses types in the System.Drawing namespace has to be rewritten.
Here's the basic outline of how to embed the XPS document:
Open the document:
XpsDocument xpsDoc = new XpsDocument(filename, System.IO.FileAccess.Read);
var document = xpsDoc.GetFixedDocumentSequence().DocumentPaginator;
// pass the document into a custom DocumentPaginator that will decide
// what order to print the pages:
var mypaginator = new myDocumentPaginator(new DocumentPaginator[] { document });
// pass the paginator into PrintDialog.PrintDocument() to do the actual printing:
new PrintDialog().PrintDocument(mypaginator, "printjobname");
Then create a descendant of DocumentPaginator, that will do your actual printing. Override the abstract methods, in particular the GetPage should return DocumentPages in the correct order. Here's my proof of concept code that demonstrates how to append custom content to a list of Xps documents:
public override DocumentPage GetPage(int pageNumber)
{
for (int i = 0; i < children.Count; i++)
{
if (pageNumber >= pageCounts[i])
pageNumber -= pageCounts[i];
else
return FixFixedPage(children[i].GetPage(pageNumber));
}
if (pageNumber < PageCount)
{
DrawingVisual dv = new DrawingVisual();
// RenderOpen() alone provides the DrawingContext; dv.Drawing is not needed here
DrawingContext dc = dv.RenderOpen();
DoRender(pageNumber, dc); // some method to render stuff to the DrawingContext
dc.Close();
return new DocumentPage(dv);
}
return null;
}
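For orientation, the remaining abstract members of DocumentPaginator could be filled in along these lines. This is a guess at the surrounding class, not the author's code: the base-class name, the `extraPages` parameter and the fallback page size are all assumptions, and GetPage is left abstract so the method shown above can live in a subclass.

```csharp
using System.Collections.Generic;
using System.Windows;
using System.Windows.Documents;

// Hypothetical base class holding the bookkeeping that GetPage relies on.
public abstract class MergingPaginatorBase : DocumentPaginator
{
    protected readonly IList<DocumentPaginator> children;
    protected readonly List<int> pageCounts = new List<int>();
    private readonly int extraPages; // assumed: number of custom-rendered pages appended at the end

    protected MergingPaginatorBase(IList<DocumentPaginator> children, int extraPages)
    {
        this.children = children;
        this.extraPages = extraPages;

        foreach (DocumentPaginator child in children)
        {
            // Make sure each child knows its final page count before we cache it.
            if (!child.IsPageCountValid) child.ComputePageCount();
            pageCounts.Add(child.PageCount);
        }

        // Fallback of 816 x 1056 is US Letter at 96 DPI (an assumption).
        PageSize = children.Count > 0 ? children[0].PageSize : new Size(816, 1056);
    }

    public override bool IsPageCountValid { get { return true; } }

    public override int PageCount
    {
        get
        {
            int total = extraPages;
            foreach (int count in pageCounts) total += count;
            return total;
        }
    }

    public override Size PageSize { get; set; }

    // No single source document backs a merged paginator.
    public override IDocumentPaginatorSource Source { get { return null; } }
}
```

A concrete subclass would then only need to supply GetPage, DoRender and the FixFixedPage helper discussed in this answer.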
When trying to print to another XPS document, it gives an exception "FixedPage cannot contain another FixedPage", and a post by H.Alipourian demonstrates how to fix it: http://social.msdn.microsoft.com/Forums/da/wpf/thread/841e804b-9130-4476-8709-0d2854c11582
private DocumentPage FixFixedPage(DocumentPage page)
{
if (!(page.Visual is FixedPage))
return page;
// Create a new ContainerVisual as a new parent for page children
var cv = new ContainerVisual();
foreach (var child in ((FixedPage)page.Visual).Children)
{
// Make a shallow clone of the child using reflection
var childClone = (UIElement)child.GetType().GetMethod(
"MemberwiseClone", BindingFlags.Instance | BindingFlags.NonPublic
).Invoke(child, null);
// Setting the parent of the cloned child to the created ContainerVisual
// by using Reflection.
// WARNING: If we use Add and Remove methods on the FixedPage.Children,
// for some reason it will throw an exception concerning event handlers
// after the printing job has finished.
var parentField = childClone.GetType().GetField(
"_parent", BindingFlags.Instance | BindingFlags.NonPublic);
if (parentField != null)
{
parentField.SetValue(childClone, null);
cv.Children.Add(childClone);
}
}
return new DocumentPage(cv, page.Size, page.BleedBox, page.ContentBox);
}
Sorry that it's not exactly compiling code, I just wanted to provide an overview of the pieces of code necessary to make it work to give other people a head start on all the disparate pieces that need to come together to make it work. Trying to create a more generalized solution would be much more complex than the scope of this answer.
While not open source and not .NET native (Delphi based I believe, but offers a precompiled .NET library), Quick PDF can render a PDF to an EMF file which you could load into your Graphics object.
