Some special characters have rendering problems in PDF files generated by LocalReport

I am using Microsoft.Reporting.WinForms.dll to render my RDLC as a PDF file, but when I open the PDF in Adobe Reader I have a problem with some special Turkish characters. In the PDF they look normal, but when I use CTRL + F to search for words that contain them, the words are not found, even though the file contains them. Also, when I copy and paste these words out of the file, I get characters like 􀃹􀃻􀁏􀁈􀁐􀀃􀀮. This is odd, because I use the same DLL, with the same class, the same code and the same method, to render my RDLC as an Excel file, and the Excel file does not have this problem.
I use the byte[] Render(string format) method in WinForms.dll for rendering. Maybe the character codes of some special characters are outside the ASCII range and so cannot be rendered into the byte array for the PDF format, but I am not sure about this.
Thanks...

According to a Microsoft article, there is an issue with special characters that was fixed in SQL Server 2014; the corresponding ReportViewer DLL would be the 2015 runtime.
Maybe you should upgrade.

I had a similar problem. My application generates PDFs using LocalReport.
My solution was:
1- Modify the RDLC XML to use the 2016 schema. Replace the Report element you have with this one:
<Report xmlns="http://schemas.microsoft.com/sqlserver/reporting/2016/01/reportdefinition" xmlns:rd="http://schemas.microsoft.com/SQLServer/reporting/reportdesigner">
Then update the rest of the file to match, because the 2016 schema is not the same as previous versions (DataSources go up ...).
2- Remove <EmbedFonts>None</EmbedFonts> from DeviceInfo.
With these changes, the special characters were rendered and printed correctly.
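Step 2 above could be sketched as follows, assuming the DeviceInfo is assembled as an XML string in code. The class and method names (DeviceInfoHelper, WithoutEmbedFonts) are hypothetical; only System.Xml.Linq is used, so the snippet does not depend on the ReportViewer assembly.

```csharp
using System;
using System.Xml.Linq;

public static class DeviceInfoHelper
{
    // Hypothetical helper: returns the DeviceInfo XML with every direct
    // <EmbedFonts> child removed, so the PDF renderer uses its default
    // font-embedding behaviour instead of "None".
    public static string WithoutEmbedFonts(string deviceInfoXml)
    {
        XElement root = XElement.Parse(deviceInfoXml);
        root.Elements("EmbedFonts").Remove();
        return root.ToString(SaveOptions.DisableFormatting);
    }
}
```

The returned string would then be passed as the deviceInfo argument when calling Render on the LocalReport with the "PDF" format.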

Related

MigraDoc RTF Document cannot translate special characters

I created an application in which we want to create .rtf and .pdf documents.
The documents also contain characters like ä, ü, ö, ß, and our big issue is that those special characters are not shown correctly in the RTF document.
For creating the RTF document, we are using MigraDoc and the RtfDocumentRenderer.
The PDF is created correctly. For the RTF document, we have already tried a few things, all without success:
Setting the UTF encoding before calling the renderer
Changing the culture info
Creating the document as a byte array and re-encoding that byte array
Using the Unicode code instead of the character

The current version 1.51 of PDFsharp/MigraDoc targets .NET 2.0/.NET 3.5.
The next version (coming soon) targets .NET 6 and properly deals with the change of the default encoding.

PDFsharp cannot read text from Crystal Reports CR23-encoded documents

We are using Crystal Reports, C# and PDFsharp to generate PDF documents by individual users. Crystal Reports is first used to create a single monolithic PDF document with all the users' entries, with each user's respective portion delineated by text "tags." Afterwards, a C# program generates individual PDFs from the monolith, by extracting its text with PDFsharp, searching for the tags, and then generating a PDF from each between-tag portion.
This process worked fine for many years, but starting with Crystal Reports Service Pack 23, the encoding of the generated PDFs is no longer readable by PDFsharp, and hence the tags cannot be found. (No such problem occurs when copying from these documents if they are rendered in Chrome or Firefox.)
Is there a setting that can be changed in Crystal Reports to restore the old encoding, or must we either modify PDFsharp or use a different PDF processing library?

I posted this answer but it was deleted. I can't figure out why, given that it addresses an explicit question: "or must we either modify PDFsharp or use a different PDF processing library?"
I have no financial interest in the suggested library! I'm not the developer of it. I only use it.
Perhaps whoever decided to delete it didn't bother to read the whole question.
Consider using a different library. I use the Quick PDF library (Foxit, formerly Debenu) to do PDF splitting by tag in Crystal exports. It works fine for PDFs exported from any version of Crystal, including the latest runtime.

The SP16-generated PDFs used WinAnsi encoding, but the SP23 ones use Unicode. SAP said there is no setting in Crystal Reports to force the encoding to WinAnsi.
Solving this problem required adding ToUnicode CMap-retrieval to PDFsharp and using the CMaps at runtime to map each CString text index to its corresponding Unicode character.
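The mapping step described above could look roughly like this. It is a minimal sketch, not the actual PDFsharp change: it assumes the ToUnicode CMap has already been extracted and decompressed into text, handles only bfchar sections (bfrange is ignored), and assumes 2-byte codes as used with Identity-H. All names (ToUnicodeSketch, ParseBfChars, Decode) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

public static class ToUnicodeSketch
{
    // Reads "<code> <utf16be-hex>" pairs between beginbfchar and endbfchar.
    public static Dictionary<int, string> ParseBfChars(string cmapText)
    {
        var map = new Dictionary<int, string>();
        foreach (Match section in Regex.Matches(cmapText, @"beginbfchar(.*?)endbfchar", RegexOptions.Singleline))
        {
            foreach (Match pair in Regex.Matches(section.Groups[1].Value,
                         @"<([0-9A-Fa-f]{1,8})>\s*<((?:[0-9A-Fa-f]{4})+)>"))
            {
                int code = int.Parse(pair.Groups[1].Value, NumberStyles.HexNumber, CultureInfo.InvariantCulture);
                map[code] = HexToUtf16(pair.Groups[2].Value);
            }
        }
        return map;
    }

    // Decodes a content-stream string made of 2-byte big-endian codes.
    // Codes missing from the map become U+FFFD.
    public static string Decode(byte[] codes, Dictionary<int, string> map)
    {
        var sb = new StringBuilder();
        for (int i = 0; i + 1 < codes.Length; i += 2)
        {
            int code = (codes[i] << 8) | codes[i + 1];
            sb.Append(map.TryGetValue(code, out string text) ? text : "\uFFFD");
        }
        return sb.ToString();
    }

    // Converts hex digits (a multiple of four) to a UTF-16BE string.
    private static string HexToUtf16(string hex)
    {
        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] = Convert.ToByte(hex.Substring(2 * i, 2), 16);
        return Encoding.BigEndianUnicode.GetString(bytes);
    }
}
```

A real implementation would also need to locate the ToUnicode stream for each font, inflate it, and support bfrange entries.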

How to get RDLC report render PDF with ToUnicode entry for copypasting non-ansi text from the resulting PDF

Preface: we have reports generated in a C# application using the Microsoft.Reporting.WebForms.LocalReport class from an RDLC file. They are rendered in PDF format. The text in the reports is mostly Cyrillic. The problem is that it is impossible to copy the text from the resulting PDF file: you get garbage.
The reason you get garbage is that the text is written using the "Identity-H" encoding for the font. It's not a real character encoding; it's just an assignment of CIDs (basically, numbers) to the glyphs used in the PDF file. The PDF format has the "ToUnicode" entry for this reason: it stores the mapping from CIDs to Unicode characters. If this information were present, it would be possible to copy and paste text from the file correctly.
Obviously, this class doesn't write it. While researching the problem, I came across this page that recognizes the lack of copy/paste support and praises it finally being implemented... in SQL Server 2016 Reporting Services.
Well, we don't use the ServerReport class or SQL Server Reporting Services. Or SQL Server 2016. It would be a weird and far too large architectural change to move to it just because managers complain that they cannot copy text from PDFs.
So, is there a workaround? I doubt that no one has faced this problem before. Maybe writing the ToUnicode entry was implemented in LocalReport in a newer version of .NET? Did someone write wrapper classes that take the byte array of the PDF and enhance it? Or maybe people render the report to DOCX and then use some other library to make a correct PDF out of that?
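To check whether a given renderer version writes the entry at all, a quick probe over the rendered byte array might help. This is only a heuristic sketch with hypothetical names (PdfProbe, MentionsToUnicode): it looks for the literal name /ToUnicode in the raw bytes, so it would miss entries stored inside compressed object streams, and it does not validate the CMap itself.

```csharp
using System;
using System.Text;

public static class PdfProbe
{
    // Heuristic: true if the raw PDF bytes contain the name "/ToUnicode".
    public static bool MentionsToUnicode(byte[] pdfBytes)
    {
        // ISO-8859-1 maps every byte to exactly one char, so binary
        // stream data cannot corrupt or hide the ASCII name we search for.
        string raw = Encoding.GetEncoding("ISO-8859-1").GetString(pdfBytes);
        return raw.IndexOf("/ToUnicode", StringComparison.Ordinal) >= 0;
    }
}
```

The input would be the byte array returned by Render for the "PDF" format.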

Edit XPS content

I have got an application that is supposed to send a formatted document to a printer with some barcodes.
I've made other applications that work with printers and print directly through the print server by sending an .xps file, so I thought I would see if I could make an .xps file, change the text and be done with it. However, every article I can find on the net is about creating XPS files, not changing them. I feel like it should be possible, and it would be nice not to have to resort to installing Office on the server and printing through that; then I might as well use Open XML and a .docx file.
It is very simple. Let's say I want to change the text INCNUMMER in a .xps file to "testing123". How would I go about that?
I have tried the whole unzip, open the xml, find the text, edit, rezip but I'm afraid there's too much about the .xps format I don't understand to make that work.
Best regards, Kaspar.

As you already know, an XPS file is just a ZIP archive containing a number of files and folders that have particular names and a defined structure.
At the root level there is a Documents folder which will typically contain just a single document folder named 1. Inside that is a Pages folder containing one or more .fpage files: these define the content of each page in the document.
Documents
  1
    Pages
      1.fpage
      2.fpage
      etc
If you open up these .fpage files in a text editor you will see that they are just XML files. Each page is typically represented by a <Canvas> element that contains multiple <Path> and <Glyphs> elements (text is represented by the latter). However, even though <Glyphs> elements do have a UnicodeString attribute the value of that attribute cannot be changed in isolation.
Each <Glyphs> element also has an Indices attribute. If you remove this attribute altogether and change the UnicodeString attribute at the same time, this almost works. However, you will probably find that when viewing the file in the XPS Viewer application certain characters in the text are replaced by question mark symbols.
Font glyphs are embedded in the XPS file (odttf files in the Resources folder), and the software that generated the XPS file will only embed glyphs that are used in the source document. For example, this means that (for a given font) if you did not use the letter "A" in the source document, then the glyph for that letter will not be written to the resources of the XPS file. Hence if you change the UnicodeString attribute to include a letter "A" then that character will display as a question mark in the viewer because it has no glyph resource that tells it how that character must be drawn.
If you have control over the source document (the one that later gets converted to XPS) then I suppose you could include a piece of text containing all of the characters that you are likely to use, and set its colour to white so that it doesn't print, but I'm not sure whether the XPS printer driver would strip that text out anyway. If it didn't then you could probably do something like this:
Open the relevant .fpage XML file
Search all UnicodeString attributes of <Glyphs> elements to find the text you want
Replace that text with something else
Remove the Indices attribute from the changed <Glyphs> elements
Save the updated XML back to the file
Re-zip then change the extension from ZIP to XPS
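The steps above could be sketched in C# roughly as follows, editing the archive in place instead of unzipping and re-zipping by hand. This is a minimal sketch with hypothetical names (XpsTextReplacer, ReplaceText); it assumes the text to replace sits inside a single UnicodeString attribute, and it does nothing about missing glyphs, so characters that were not embedded in the font resources may still show as question marks.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

public static class XpsTextReplacer
{
    // Replaces oldText with newText in the UnicodeString attribute of every
    // <Glyphs> element in every .fpage part, and removes the Indices
    // attribute from each changed element. Returns the number changed.
    // The stream must be readable, writable and seekable.
    public static int ReplaceText(Stream xpsStream, string oldText, string newText)
    {
        int changed = 0;
        using (var zip = new ZipArchive(xpsStream, ZipArchiveMode.Update, leaveOpen: true))
        {
            var pages = zip.Entries
                .Where(e => e.FullName.EndsWith(".fpage", StringComparison.OrdinalIgnoreCase))
                .ToList();
            foreach (ZipArchiveEntry entry in pages)
            {
                using (Stream part = entry.Open())
                {
                    XDocument page = XDocument.Load(part);
                    int before = changed;
                    foreach (XElement glyphs in page.Descendants().Where(e => e.Name.LocalName == "Glyphs"))
                    {
                        XAttribute text = glyphs.Attribute("UnicodeString");
                        if (text == null || !text.Value.Contains(oldText)) continue;
                        text.Value = text.Value.Replace(oldText, newText);
                        glyphs.Attribute("Indices")?.Remove();
                        changed++;
                    }
                    if (changed == before) continue; // page untouched, leave it as-is
                    part.SetLength(0);               // truncate, then write the new XML
                    page.Save(part);
                }
            }
        }
        return changed;
    }
}
```

For a file on disk, the stream could be a FileStream opened on a copy of the .xps file with FileMode.Open and FileAccess.ReadWrite, for example to change INCNUMMER to testing123.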

Search and Replace PDF using Itext

I need to generate a PDF based on some user inputs. The output PDF has some images, tables and text. I think that iText is not user friendly for generating this report programmatically.
Since the report I need to generate is quite complicated, I was wondering if it is possible to create a template PDF and then load it, search it, and replace the strings/images I want.
The template PDF can be a tagged PDF.
Is it possible to do that?
Is it the best approach?
EDIT: I'm using WPF + MVVM + .NET 3.5

Replacing text within a PDF file is not simple. The PDF file format has a cross-reference table near the end of the file that lists the objects with their byte offsets within the file, and stream objects also carry a field that gives their own length in bytes. If an edit leaves these offsets or lengths wrong, the reader will probably report a broken PDF.
You should have a look at reporting as it is made for these tasks:
http://msdn.microsoft.com/en-us/library/bb885185%28v=vs.100%29.aspx
You can create a template with the report designer, set your data and export it to pdf.
