Hi, does anyone know how to get the glyph names from a TTF or OTF font file in C#? I'm willing to parse the file directly to get them if necessary. I found this post, Access opentype glyph names from WPF, but it got no answer.
P.S. I have created the font myself and am writing a program that generates a CSS (LESS or SASS) file, so that the glyphs I have made can be used easily in web pages, the way Bootstrap or FontAwesome do it. :)
In TrueType-based fonts (.TTF files), you can try parsing the 'post' table. It's fairly easy to figure out, but only format 2.0 explicitly stores glyph names. If the post table is format 3.0, no glyph names are stored at all (a couple of other formats are defined, but fonts using them are very, very rare). In that case, your only option is to back-track using Unicode values from the 'cmap' table; there are standard Unicode-to-glyph-name references that may be useful.
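For what it's worth, here is a minimal C# sketch of that 'post' table approach (format 2.0 only). It assumes a well-formed font, skips the lookup list for the 258 standard Macintosh names, and all of the identifiers are my own:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Minimal sketch: extract glyph names from a TTF's 'post' table (format 2.0 only).
// The 258 standard Macintosh names and most error handling are omitted.
static class PostTable
{
    public static string[] ReadGlyphNames(string path)
    {
        using var br = new BinaryReader(File.OpenRead(path));

        // Offset table: sfnt version, table count, then binary-search helpers.
        br.ReadUInt32();
        ushort numTables = ReadU16(br);
        br.ReadBytes(6); // searchRange, entrySelector, rangeShift

        // Table directory: 16-byte records of tag, checksum, offset, length.
        uint postOffset = 0;
        for (int i = 0; i < numTables; i++)
        {
            string tag = Encoding.ASCII.GetString(br.ReadBytes(4));
            br.ReadUInt32(); // checksum
            uint offset = ReadU32(br);
            br.ReadUInt32(); // length
            if (tag == "post") postOffset = offset;
        }
        if (postOffset == 0) throw new InvalidDataException("No 'post' table.");

        br.BaseStream.Seek(postOffset, SeekOrigin.Begin);
        if (ReadU32(br) != 0x00020000) // only format 2.0 stores names
            throw new NotSupportedException("'post' table is not format 2.0.");

        br.BaseStream.Seek(postOffset + 32, SeekOrigin.Begin); // skip fixed header
        ushort numGlyphs = ReadU16(br);
        var indices = new ushort[numGlyphs];
        int customCount = 0;
        for (int i = 0; i < numGlyphs; i++)
        {
            indices[i] = ReadU16(br);
            if (indices[i] >= 258)
                customCount = Math.Max(customCount, indices[i] - 258 + 1);
        }

        // Custom names follow as Pascal strings: a length byte, then ASCII bytes.
        var custom = new string[customCount];
        for (int i = 0; i < customCount; i++)
            custom[i] = Encoding.ASCII.GetString(br.ReadBytes(br.ReadByte()));

        var names = new string[numGlyphs];
        for (int i = 0; i < numGlyphs; i++)
            names[i] = indices[i] < 258
                ? $"<standard Mac name #{indices[i]}>" // look these up in the spec
                : custom[indices[i] - 258];
        return names;
    }

    // sfnt data is big-endian; BinaryReader reads little-endian, so swap bytes.
    static ushort ReadU16(BinaryReader br) =>
        (ushort)((br.ReadByte() << 8) | br.ReadByte());
    static uint ReadU32(BinaryReader br) =>
        ((uint)ReadU16(br) << 16) | ReadU16(br);
}
```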
For CFF-based fonts (.OTF files), glyph names are stored inside the 'CFF ' table. That's a bit trickier to parse, but if you're only after the glyph names it shouldn't be too difficult to figure out.
I have a TTF file (to be precise, the one that can be downloaded from https://materialdesignicons.com/) and I want to get the name and numeric Unicode value of each glyph it contains (much like this site does: https://andreinitescu.github.io/IconFont2Code/ but I'm obviously too dumb to pick the essential parts out of its GitHub project). I do not want the name of the font or anything like that, just the name of each individual glyph.
I can see the names when I open the file in, for example, Notepad++, but I cannot identify a pattern I could use to extract them programmatically.
I have now searched the internet for more than a day straight and I can't find any helpful answers. It can't be that hard to get working - or can it?
I have generated PDFs using RDLC and then combined multiple PDF files into a single document using iTextSharp's PdfSmartCopy class. But the resulting PDF is large, and I want to reduce its size. I have tried compressing it using iTextSharp, but that doesn't shrink it. When I upload the PDF file to ilivepdf.com for online compression, it compresses the 21 MB file to 1 MB.
Often, the problem is related to embedded fonts.
You see, PDF really strives to preserve your document exactly as you made it.
To do that, a PDF library can decide to embed a font. You can imagine this as simply putting the font file into the PDF document.
But, here comes the tricky part.
The PDF specification took into account that this may be overkill.
I mean, if you are only using the 50-something characters typically used in Western languages, it makes little sense to embed the entire font.
So PDF supports a feature called "font subsetting". This means, instead of embedding the entire font, only those characters that are actually used are embedded in the document.
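As an aside: if you create the PDFs with iTextSharp 5.x yourself, subsetting is the default for embedded fonts, and it can be toggled per font. A minimal sketch, with a placeholder font path:

```csharp
using iTextSharp.text;
using iTextSharp.text.pdf;

// Sketch (iTextSharp 5.x): subsetting is on by default for embedded fonts,
// but can be switched off per font. "arial.ttf" is a placeholder path.
BaseFont bf = BaseFont.CreateFont("arial.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
bf.Subset = false; // embed the full font program instead of a subset
Font font = new Font(bf, 12f);
```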
So what is going wrong exactly when you're merging these documents?
(I will skip a lot of the technical details.)
In order to differentiate between a fully embedded font, system font, or subset embedded font, iText generates a new font name for your fonts whenever it embeds them.
So a document containing a subset of Times New Roman might have "Times-AUHFDI" in its resources.
Similarly, a second document (again containing a subset of Times New Roman) might list "Times-VHUIEF" as one of its resources.
I believe it simply adds a random 6-character suffix. (ex-iText developer here)
PdfSmartCopy has to decide what to do with these resources. And sadly, it doesn't know whether these fonts are actually the same. So it decides to embed both these subsets into the new document.
This is a huge file-size penalty.
If you have 100 documents, all using a subset of the same font, that subset will be embedded 100 times.
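For reference, the kind of merge the question describes typically looks like the sketch below in iTextSharp 5.x (file names are placeholders); note that nothing in it deduplicates the incoming subsets:

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Sketch (iTextSharp 5.x): the usual PdfSmartCopy merge. File names are
// placeholders. Each source's font subsets are copied across unmerged.
string[] sources = { "part1.pdf", "part2.pdf" };
using (var output = new FileStream("merged.pdf", FileMode.Create))
{
    var document = new Document();
    var copy = new PdfSmartCopy(document, output);
    document.Open();
    foreach (string file in sources)
    {
        var reader = new PdfReader(file);
        for (int i = 1; i <= reader.NumberOfPages; i++)
            copy.AddPage(copy.GetImportedPage(reader, i));
        reader.Close();
    }
    document.Close();
}
```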
The other tool you listed might actually check whether these fonts are the same (and if they are, embed them only once). Or the other tool might simply not care that much and assume based on the partial name match that they are the same.
The ideal solution would of course be to compare the actual characters in the font, to see whether these two subsets can be merged.
But that would be much more difficult (and might potentially be a performance penalty).
What can you do?
There are 14 fonts, the so-called standard fourteen, that are never embedded. They are assumed to be present on every system (hence why they are never embedded.)
If you have control over the process that generates the PDF documents, you could simply decide to create them using only these fonts.
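With iTextSharp that simply means requesting the standard fonts without embedding; a minimal sketch:

```csharp
using iTextSharp.text;
using iTextSharp.text.pdf;

// Sketch (iTextSharp 5.x): stick to the standard fonts, which are never
// embedded, so merged documents carry no font programs at all.
BaseFont helvetica = BaseFont.CreateFont(
    BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
Font body = new Font(helvetica, 11f);
```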
Alternatively you could write a smarter PdfSmartCopy. You would need to look into how fonts are built and stored, and perform the actual comparison I mentioned earlier.
Ask for technical support at iText. If enough people request this particular feature, you may get it.
I have an application that is supposed to send a formatted document containing some barcodes to a printer.
I've made other applications that work with printers and print directly through the print server by sending an XPS file, so I thought I would try to make a .xps file, change the text, and be done with it. However, every article I can find on the net is about creating XPS files, not changing them. I feel like it should be possible, and it would be nice not to have to resort to installing Office on the server and printing through that; then I might as well use Open XML and a .docx file.
The task itself is very simple: let's say I want to change the text INCNUMMER in a .xps file to "testing123". How would I go about that?
I have tried the whole unzip, open the XML, find the text, edit, re-zip routine, but I'm afraid there's too much about the .xps format I don't understand to make that work.
Best regards, Kaspar.
As you already know, an XPS file is just a ZIP archive containing a number of files and folders that have particular names and a defined structure.
At the root level there is a Documents folder which will typically contain just a single document folder named 1. Inside that is a Pages folder containing one or more .fpage files: these define the content of each page in the document.
Documents
  1
    Pages
      1.fpage
      2.fpage
      ...
If you open up these .fpage files in a text editor you will see that they are just XML files. Each page is typically represented by a <Canvas> element that contains multiple <Path> and <Glyphs> elements (text is represented by the latter). However, even though <Glyphs> elements do have a UnicodeString attribute the value of that attribute cannot be changed in isolation.
Each <Glyphs> element also has an Indices attribute. If you remove this attribute altogether and change the UnicodeString attribute at the same time, this almost works. However, you will probably find that when viewing the file in the XPS Viewer application certain characters in the text are replaced by question mark symbols.
Font glyphs are embedded in the XPS file (as .odttf files in the Resources folder), and the software that generated the XPS file will only embed the glyphs that are actually used in the source document. For example, if (for a given font) you did not use the letter "A" in the source document, then the glyph for that letter will not be written to the resources of the XPS file. Hence, if you change the UnicodeString attribute to include a letter "A", that character will display as a question mark in the viewer, because there is no glyph resource telling it how the character must be drawn.
If you have control over the source document (the one that later gets converted to XPS) then I suppose you could include a piece of text containing all of the characters that you are likely to use, and set its colour to white so that it doesn't print, but I'm not sure whether the XPS printer driver would strip that text out anyway. If it didn't then you could probably do something like this:
Open the relevant .fpage XML file
Search all UnicodeString attributes of <Glyphs> elements to find the text you want
Replace that text with something else
Remove the Indices attribute from the changed <Glyphs> elements
Save the updated XML back to the file
Re-zip then change the extension from ZIP to XPS
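In C#, those steps might look roughly like the following sketch (using System.IO.Compression; the file names and search text are placeholders, and a real implementation should use an XML parser rather than string surgery). Working on a copy that already has the .xps extension also sidesteps the final rename step:

```csharp
using System.IO;
using System.IO.Compression;
using System.Text.RegularExpressions;

// Sketch: replace text inside an XPS page and drop the Indices attribute.
// "input.xps", the page path and the search/replace strings are placeholders;
// real code should only strip Indices from the <Glyphs> elements it changed.
File.Copy("input.xps", "output.xps", overwrite: true);
using (var zip = ZipFile.Open("output.xps", ZipArchiveMode.Update))
{
    ZipArchiveEntry page = zip.GetEntry("Documents/1/Pages/1.fpage");

    string xml;
    using (var reader = new StreamReader(page.Open()))
        xml = reader.ReadToEnd();

    xml = xml.Replace("UnicodeString=\"INCNUMMER\"", "UnicodeString=\"testing123\"");
    xml = Regex.Replace(xml, " Indices=\"[^\"]*\"", "");

    using (var stream = page.Open())
    {
        stream.SetLength(0); // truncate the old entry content before rewriting
        using var writer = new StreamWriter(stream);
        writer.Write(xml);
    }
}
```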
XNB files are created by Microsoft XNA and distributed with many games. XNB is a general serialization format capable of representing arbitrary .NET objects, but there are common definitions for textures, sound samples, 3D models, fonts, and other game data. XNB files may use LZX compression (referred to as the Xbox XMemCompress format).
I have decompressed XNB files containing fonts, and I need to find out which font and font size were used. I don't have the source of the original application; I would like to use the same font design in another application.
The current XNB font files don't contain all the special characters I need. I am able to generate a new spritefont with the Unicode characters and compile it to XNB files, but because I don't know which font was used, my add-on looks visibly different.
Does anyone know how to detect which font and font size are used in an XNB file?
I was also thinking about changing the encoding with a hex editor, but I couldn't find any information on where the encoding is stored or how to change it easily.
Sample file with font:
https://onedrive.live.com/redir?resid=B6196AD97CA6B88A!251&authkey=!ADaJmio5n3RO2zM&ithint=file%2c.zip
So far I have found the following helpful resources:
What is an XNB File?
Compiled (XNB) Content Format
You can use the "Compiled (XNB) Content Format" project from MSDN.
I haven't tried it myself, but according to Microsoft, you can use it to open an XNB file and see its contents printed on the screen:
It also includes an example .xnb parser, written in native C++, which demonstrates how to parse a compiled XNB file by printing its contents to the screen.
Check it out here.
You can find other similar tools, but I always prefer to go to the owner of the product itself; that way you have a better chance of finding something useful.
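If you want to inspect the container yourself, the header is easy to read. Here is a hedged C# sketch based on the "Compiled (XNB) Content Format" document above (the layout and the 0x80 compression flag are as that spec describes them, and the file name is a placeholder):

```csharp
using System;
using System.IO;
using System.Text;

// Sketch: read the XNB container header as described in the "Compiled (XNB)
// Content Format" document. "SomeFont.xnb" is a placeholder file name.
using var br = new BinaryReader(File.OpenRead("SomeFont.xnb"));

string magic = Encoding.ASCII.GetString(br.ReadBytes(3)); // should be "XNB"
char platform = (char)br.ReadByte(); // 'w' Windows, 'm' Phone, 'x' Xbox 360
byte version = br.ReadByte();        // 5 = XNA Game Studio 4.0
byte flags = br.ReadByte();          // bit 0x80 set = LZX-compressed payload
uint size = br.ReadUInt32();         // little-endian, whole-file size

Console.WriteLine($"{magic} platform={platform} v{version} " +
                  $"compressed={(flags & 0x80) != 0} size={size}");
```

Bear in mind, though, that as far as I know a compiled SpriteFont stores rendered glyph bitmaps and metrics rather than the name of the source font, so the original typeface generally cannot be read out of the XNB alone; you may have to identify it by eye.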
I need to create a C# or C++ (MFC) application that converts PDF files to TXT. I need not only to convert, but also to remove headers, footers, some garbage characters on the left margin, etc. The application should therefore allow the user to set page margins to cut off what is not needed. I have actually already created such an application using xpdf, but it gives me problems when I try to insert custom tags into the extracted text to preserve italics and bold. Could somebody suggest something useful?
Thanks.
There are shareware and freeware utilities out there. Try fetching their source code, or perhaps use them as they are.
A public version of the PDF specification can be found here: Adobe PDF Specification
PDF Shareware readers can be found: PDF Reader source code # SourceForge
Please look at PoDoFo. It's an LGPL-licensed library that has many powerful editing features. One of its examples, txt2pdf IIRC, is a good start: it shows basic text extraction. From there you can check whether pre-filtering (in the PDF engine) or post-filtering (on the text) suffices for your goals. I haven't gotten to use PDF Hummus, but it's supposed to have these capabilities too, although it's less straightforward.