Text chunks getting positions as in html?


Converting PDF points to pixels works correctly:
point-to-pixel factor = 1/72 * 300 (DPI)
When I get the position (X, Y) of each text chunk in a PDF, Y is measured from the bottom up, not from the top down as in standard HTML or JavaScript.
My attempts to convert it to a top-down value do not give an accurate Y position as in HTML or WinForms.
How do I get the correct top-down Y using the page height, the MediaBox or CropBox rectangle, or the rectangle from a text margin finder?
The code I used is based on your example:
public class LocationTextExtractionStrategyClass : LocationTextExtractionStrategy
{
//Hold each coordinate
public List<RectAndText> myPoints = new List<RectAndText>();
/*
//The string that we're searching for
public String TextToSearchFor { get; set; }
//How to compare strings
public System.Globalization.CompareOptions CompareOptions { get; set; }
public MyLocationTextExtractionStrategy(String textToSearchFor, System.Globalization.CompareOptions compareOptions = System.Globalization.CompareOptions.None)
{
this.TextToSearchFor = textToSearchFor;
this.CompareOptions = compareOptions;
}
*/
//Automatically called for each chunk of text in the PDF
public override void RenderText(TextRenderInfo renderInfo)
{
base.RenderText(renderInfo);
//See if the current chunk contains the text
var startPosition = 0;// System.Globalization.CultureInfo.CurrentCulture.CompareInfo.IndexOf(renderInfo.GetText(), this.TextToSearchFor, this.CompareOptions);
//If not found bail
if (startPosition < 0)
{
return;
}
//Grab the individual characters
var chars = renderInfo.GetCharacterRenderInfos().ToList();//.Skip(startPosition).Take(this.TextToSearchFor.Length)
var charsText = renderInfo.GetText();
//Grab the first and last character
var firstChar = chars.First();
var lastChar = chars.Last();
//Get the bounding box for the chunk of text
var bottomLeft = firstChar.GetDescentLine().GetStartPoint();
var topRight = lastChar.GetAscentLine().GetEndPoint();
//Create a rectangle from it
var rect = new iTextSharp.text.Rectangle(
bottomLeft[Vector.I1],
bottomLeft[Vector.I2],
topRight[Vector.I1],
topRight[Vector.I2]
);
BaseColor curColor = new BaseColor(0f, 0f, 0f);
if (renderInfo.GetFillColor() != null)
curColor = renderInfo.GetFillColor();
//Add this to our main collection
myPoints.Add(new RectAndText(rect, charsText, curColor));//this.TextToSearchFor));
}
}//end-of-txtLocation-class//

You are asking many different questions in one post.
First let's start with the coordinate system in the PDF standard. Observe that I am talking about a standard, more specifically about ISO 32000. The coordinate system on a PDF page is explained in my answer to the Stack Overflow question How should I interpret the coordinates of a rectangle in PDF?
As you can see, a rectangle drawn in a PDF using a coordinate (llx, lly) for the lower-left corner and a coordinate (urx, ury) for the upper-right corner, assumes that the X-axis points to the right, and the Y-axis points upwards.
As for the width and the height of a page, that's explained in my answer to the Stack Overflow question How to Get PDF page width and Height?
For instance: you could have a /MediaBox that is defined as [0 0 595 842], and therefore measures 595 x 842 points (an A4 page), but that has a /CropBox that is defined as [5 5 590 837], which means that the visible area is only 585 x 832 points.
You also shouldn't assume that the lower-left corner of a page coincides with the (0, 0) coordinate. See Where is the Origin (x,y) of a PDF page?
When you create a document from scratch, a default margin of half an inch is used if you don't define a margin yourself. If you want to change the default, see Fit content on pdf size with iTextSharp?
Now for the height of a Chunk or, if you're using iText 7 (which you should, but for some reason unknown to me don't), the height of a Text object: this depends on the font size. The font size is an average size of the different glyphs in a font. If you look at the letter g, and you compare it with the letter h, you see that g takes more space under the baseline of the text than h, whereas h takes more space above the baseline than g.
If you want to calculate the exact space that is taken, read my answer to the question How to calculate the height of an element?
If the text snippet is used in the context of lines in a paragraph, you also have to take the leading into account: Changing text line spacing (Maybe that's not relevant in the context of your question, but it's good to know.)
If you have Chunk objects in iText 5, and you want to do specific things with these Chunks, you might benefit from using page events. See How to draw a line every 25 words?
If you want to add a colored background to a Chunk, it's even easier: How to set the paragraph of itext pdf file as rectangle with background color in Java
Update 1: All of the above may be irrelevant if you are looking to convert HTML to PDF. In that case, it's easy: use iText 7 + pdfHTML as described in Converting HTML to PDF using iText and all the Math is done by the pdfHTML add-on.
Update 2: There seems to be some confusion regarding the measurement units. The differences between user units, points and pixels are explained in the FAQ page How do the measurement systems in HTML relate to the measurement system in PDF?
Summarized:
1 in. = 25.4 mm = 72 user units by default (but it can be changed).
1 in. = 25.4 mm = 72 pt.
1 in. = 25.4 mm = 96 px.
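To turn the bottom-up coordinates reported by the extraction strategy into top-down pixel coordinates, one option is to measure from the top edge of the visible box (the CropBox, or the MediaBox when no CropBox is set) and then scale by DPI / 72. The following is a minimal sketch under those assumptions: the class and method names are hypothetical, it assumes the default user unit of 1/72 inch, and it ignores page rotation.

```csharp
using System;

// Hypothetical helper, not part of iText: pure arithmetic only.
public static class PdfToPixel
{
    public const double PointsPerInch = 72.0;

    // Scales a length in points to pixels at the given resolution.
    public static double PointsToPixels(double points, double dpi)
    {
        return points / PointsPerInch * dpi;
    }

    // (llx, lly, urx, ury): rectangle in PDF user space, Y axis pointing up.
    // (boxLeft, boxTop): left and top edge of the visible box in the same space.
    // Returns the rectangle in pixels with the origin at the top-left corner
    // of the visible box and the Y axis pointing down.
    public static (double Left, double Top, double Width, double Height) ToTopDownPixels(
        double llx, double lly, double urx, double ury,
        double boxLeft, double boxTop, double dpi)
    {
        double left = PointsToPixels(llx - boxLeft, dpi);
        double top = PointsToPixels(boxTop - ury, dpi);   // flip the Y axis
        double width = PointsToPixels(urx - llx, dpi);
        double height = PointsToPixels(ury - lly, dpi);
        return (left, top, width, height);
    }
}
```

With iTextSharp 5, the box values would typically come from reader.GetCropBox(pageNumber) (its Left and Top), and the rectangle values from the Rectangle built in RenderText.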

Related

Unable find location of ColorSpace objects in PDF document

I want to identify the ColorSpace objects in PDF and fetch their location(coordinates, width and height of the colorspace) in the page. I tried traversing through the BaseDataObject in Contents.ContentContext.Resources.ColorSpaces, I can identify the Pantone Colorspaces in file (as shown in screenshot), but unable to find info regarding the location(x,y,w and h) of the object.
Where can I find the exact location of the visible objects(visible on opening a document) like ColorSpaces and embedded images?
I am using 'pdfclown' library to extract the info about ColorSpaces from PDF. Any inputs will be helpful. Thanks in advance.
ContentScanner cs = new ContentScanner(page);
System.Collections.Generic.List<org.pdfclown.documents.contents.colorSpaces.ColorSpace> list = cs.Contents.ContentContext.Resources.ColorSpaces.Values.ToList();
for (int i = 0; i < list.Count; i++)
{
org.pdfclown.objects.PdfArray array = (org.pdfclown.objects.PdfArray)list[i].BaseDataObject;
foreach (org.pdfclown.objects.PdfObject s in array)
{
//print colorspace and its x,y,w,h
}
}
PDF Document (has CMYK and Pantone Colors)
Screenshot

I want to identify the ColorSpace objects in PDF and fetch their location(coordinates, width and height of the colorspace) in the page.
I assume you mean the squares here:
Beware, these are not PDF ColorSpace objects, these are a number of simple (rectangular) paths filled with distinct colors and with some text drawn upon them.
PDF ColorSpaces are not specific renderings of colored areas, they are abstract color specifications:
Colours may be described in any of a variety of colour systems, or colour spaces. Some colour spaces are related to device colour representation (grayscale, RGB, CMYK), others to human visual perception (CIE-based). Certain special features are also modelled as colour spaces: patterns, colour mapping, separations, and high-fidelity and multitone colour.
(ISO 32000-1, section 8.6 "Colour Spaces")
As you look for something with coordinates, width and height, therefore, you are looking for drawing instructions using those abstract color spaces, not for the plain color spaces.
I tried traversing through the BaseDataObject in Contents.ContentContext.Resources.ColorSpaces, I can identify the Pantone Colorspaces in file (as shown in screenshot), but unable to find info regarding the location(x,y,w and h) of the object.
By looking at cs.Contents.ContentContext.Resources.ColorSpaces you get an enumeration of all special color spaces available for use in the current context but not the actual usages. To get the actual usages, you have to traverse the ContentScanner cs, i.e. you have to inspect the instructions in the current context, e.g. like this:
SeparationColorSpace space = null;
double X = 0, Y = 0, Width = 0, Height = 0;
void ScanForSpecialColorspaceUsage(ContentScanner cs)
{
cs.MoveFirst();
while (cs.MoveNext())
{
ContentObject content = cs.Current;
if (content is CompositeObject)
{
ScanForSpecialColorspaceUsage(cs.ChildLevel);
}
else if (content is SetFillColorSpace _cs)
{
ColorSpace _space = cs.Contents.ContentContext.Resources.ColorSpaces[_cs.Name];
space = _space as SeparationColorSpace;
}
else if (content is SetDeviceCMYKFillColor || content is SetDeviceGrayFillColor || content is SetDeviceRGBFillColor)
{
space = null;
}
else if (content is DrawRectangle _dr)
{
if (space != null)
{
X = _dr.X;
Y = _dr.Y;
Width = _dr.Width;
Height = _dr.Height;
}
}
else if (content is PaintPath _pp)
{
if (space != null && _pp.Filled && (X != 0 || Y != 0 || Width != 0 || Height != 0))
{
String name = ((PdfName)((PdfArray)space.BaseDataObject)[1]).ToString();
Console.WriteLine("Filling rectangle at {0}, {1} with size {2}x{3} using {4}", X, Y, Width, Height, name);
}
X = 0;
Y = 0;
Width = 0;
Height = 0;
}
}
}
BEWARE: This merely is a proof-of-concept, simplified as much as possible to still work in your PDF for the squares in the screen shot above.
For a general solution you will have to extend this considerably:
The code only inspects the given content scanner, i.e. only the content stream it has been initialized for, in your case a page content stream.
From such a content stream other content streams may be referenced, e.g. a form XObject. To catch all the usages of interesting color spaces in a generic document, you have to recursively inspect such dependent content streams, too.
The code ignores the current transformation matrix.
The current transformation matrix can be changed by an instruction to have all the drawings done by following instructions have their coordinates changed according to an affine transformation. To get all coordinates and dimensions right in a generic document, you have to apply the current transformation matrix to them.
The code ignores save-graphics-state/restore-graphics-state instructions.
The current graphics state (including fill color and current transformation matrix) can be stored on a stack and restored from it. To get colors, coordinates and dimensions right in a generic document, you have to keep track of saved and restored graphics states (or use data from the cs.State for color and transformation where PDF Clown does this for you).
The code only looks at Separation color spaces.
If you're interested in other color spaces, too, you have to generalize this.
The code only understands very specific, trivial paths: only paths generated by a single instruction defining a rectangle.
For a generic solution you have to support arbitrary paths.
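As an illustration of the transformation-matrix point above, this sketch applies a PDF-style matrix [a b c d e f] to a rectangle and returns the axis-aligned bounding box of the result. It is plain arithmetic and independent of PDF Clown; the type and method names are hypothetical, and in real code the matrix values would have to be taken from the graphics state (for example from cs.State, as mentioned above).

```csharp
using System;

// Hypothetical value type holding the six numbers of a PDF matrix [a b c d e f].
public readonly struct Affine
{
    public readonly double A, B, C, D, E, F;

    public Affine(double a, double b, double c, double d, double e, double f)
    {
        A = a; B = b; C = c; D = d; E = e; F = f;
    }

    // PDF convention: x' = a*x + c*y + e, y' = b*x + d*y + f.
    public (double X, double Y) Apply(double x, double y)
    {
        return (A * x + C * y + E, B * x + D * y + F);
    }
}

public static class CtmDemo
{
    // Transforms all four corners and returns their bounding box, so the
    // result is still meaningful when the matrix rotates or skews.
    public static (double X, double Y, double Width, double Height) TransformRectangle(
        Affine m, double x, double y, double width, double height)
    {
        var corners = new[]
        {
            m.Apply(x, y),
            m.Apply(x + width, y),
            m.Apply(x, y + height),
            m.Apply(x + width, y + height)
        };

        double minX = double.PositiveInfinity, minY = double.PositiveInfinity;
        double maxX = double.NegativeInfinity, maxY = double.NegativeInfinity;
        foreach (var (cx, cy) in corners)
        {
            minX = Math.Min(minX, cx); maxX = Math.Max(maxX, cx);
            minY = Math.Min(minY, cy); maxY = Math.Max(maxY, cy);
        }
        return (minX, minY, maxX - minX, maxY - minY);
    }
}
```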

How does one set an image as or along a chart axis?

I am trying to use a colored spectrum strip as an axis for a chart. The idea is to match the color on the image with its associated wavelength along the x-axis at the bottom. The strip needs to change in size to match changes of the chart area and expand and contract sections to match scroll-zooming in the chart area.
I have tried using image annotations but as the chart area changes, the annotation dimensions remain fixed. Also, the scroll zooming that focuses in on mouse position obviously has no effect on the annotation.
The approach that came closest was using the image as a background for the chart area. This automatically scaled the image as the chart area changed but scroll-zooming has no effect on the background image. Also, it would be ideal to have the background clear so as to avoid obscuring data plot points. I can edit the image to have a large transparent section and only a colored strip at the bottom but even then, that strip could obscure lower intensity data points.
Spectrum as annotation and background:
Annotation not scaling, background scales well:
Both annotation and background not scaling with zooming:

This is a nice idea.
The simplest way is to draw the image in a Paint event of the Chart, maybe PrePaint.
Let's go to work.. We will use the DrawImage overload that allows us zooming as well as cropping. For this we need two rectangles.
The first challenge is to always get the correct target rectangle.
For this we need to convert the InnerPlotPosition from relative positions to absolute pixels.
These two functions will help:
RectangleF ChartAreaClientRectangle(Chart chart, ChartArea CA)
{
RectangleF CAR = CA.Position.ToRectangleF();
float pw = chart.ClientSize.Width / 100f;
float ph = chart.ClientSize.Height / 100f;
return new RectangleF(pw * CAR.X, ph * CAR.Y, pw * CAR.Width, ph * CAR.Height);
}
RectangleF InnerPlotPositionClientRectangle(Chart chart, ChartArea CA)
{
RectangleF IPP = CA.InnerPlotPosition.ToRectangleF();
RectangleF CArp = ChartAreaClientRectangle(chart, CA);
float pw = CArp.Width / 100f;
float ph = CArp.Height / 100f;
return new RectangleF(CArp.X + pw * IPP.X, CArp.Y + ph * IPP.Y,
pw * IPP.Width, ph * IPP.Height);
}
With these numbers setting the destination rectangle is as simple as:
Rectangle tgtR = Rectangle.Round(new RectangleF(ipr.Left, ipr.Bottom - 15, ipr.Width, 15));
You can choose any height you like.
The next challenge is the source rectangle.
Without zooming it would simply be:
Rectangle srcR = new Rectangle( 0, 0, bmp.Width, bmp.Height);
But for zooming and panning we need to scale it; for this we can use the x-axis and the ScaleView's Minimum and Maximum values.
We calculate factors for the first and last spot on the axis:
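// subtract ax.Minimum so this also works when the axis does not start at 0
double f1 = (ax.ScaleView.ViewMinimum - ax.Minimum) / (ax.Maximum - ax.Minimum);
double f2 = (ax.ScaleView.ViewMaximum - ax.Minimum) / (ax.Maximum - ax.Minimum);
Now we get the source rectangle, maybe like this: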
int x = (int)(bmp.Width * f1);
int xx = (int)(bmp.Width * f2);
Rectangle srcR = new Rectangle( x, 0, xx - x, bmp.Height);
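The mapping from the visible axis range to a horizontal slice of the image can also be isolated into a small function without any chart dependencies, which makes it easy to check. This is only a sketch: the names are hypothetical, it assumes the image spans the full axis range from its minimum to its maximum, and it subtracts the axis minimum so that it also works when the axis does not start at 0 (as with wavelengths).

```csharp
using System;

// Hypothetical helper: plain arithmetic, no dependency on the Chart control.
public static class SpectrumStrip
{
    // Returns the left edge and width (in image pixels) of the part of the
    // image that corresponds to the visible range [viewMin, viewMax].
    public static (int X, int Width) SourceSpan(
        double viewMin, double viewMax,
        double axisMin, double axisMax,
        int imageWidth)
    {
        double range = axisMax - axisMin;
        if (range <= 0)
            throw new ArgumentException("axisMax must be greater than axisMin");

        // fractions of the full axis range at the view's edges
        double f1 = (viewMin - axisMin) / range;
        double f2 = (viewMax - axisMin) / range;

        int x = (int)(imageWidth * f1);
        int xx = (int)(imageWidth * f2);

        // keep the slice inside the image and at least one pixel wide
        x = Math.Max(0, Math.Min(x, imageWidth - 1));
        xx = Math.Max(x + 1, Math.Min(xx, imageWidth));
        return (x, xx - x);
    }
}
```

The result would feed directly into the source rectangle, e.g. new Rectangle(span.X, 0, span.Width, bmp.Height).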
Let's put it together:
private void chart_PrePaint(object sender, ChartPaintEventArgs e)
{
// a few short names
Graphics g = e.ChartGraphics.Graphics;
ChartArea ca = chart.ChartAreas[0];
Axis ax = ca.AxisX;
// pixels of plot area
RectangleF ipr = InnerPlotPositionClientRectangle(chart, ca);
// scaled first and last position
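// (subtract ax.Minimum so this also works when the axis does not start at 0)
double f1 = (ax.ScaleView.ViewMinimum - ax.Minimum) / (ax.Maximum - ax.Minimum);
double f2 = (ax.ScaleView.ViewMaximum - ax.Minimum) / (ax.Maximum - ax.Minimum);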
// actual drawing with the zooming overload
using (Bitmap bmp = (Bitmap)Bitmap.FromFile(imagePath))
{
int x = (int)(bmp.Width * f1);
int xx = (int)(bmp.Width * f2);
Rectangle srcR = new Rectangle( x, 0, xx - x, bmp.Height);
Rectangle tgtR = Rectangle.Round(
new RectangleF(ipr.Left , ipr.Bottom - 15, ipr.Width, 15));
g.DrawImage(bmp, tgtR, srcR, GraphicsUnit.Pixel);
}
}
A few notes:
Of course I would recommend using an Image resource instead of always loading from disk!
The Drawing will always overlay the data points and also the grids. You can either..
choose a different minimum to make room
make the image smaller
move it below the x-axis labels
make the image semi-transparent
make the x-axis so fat that it can hold the image strip : ax.LineWidth = 10
For the latter solution you would want to offset the y-position depending on the zoom state. Quick and dirty: int yoff = (ax.ScaleView.IsZoomed ? 12 : 5);. To avoid black stripes also make the axis Transparent or chart.BackColor..
Update:
You can also revert to using a StripLine. It can scale its BackgroundImage and you would have to create a suitable image whenever changing the scaleview, i.e. when zooming or panning. For this much of the above code would be used to create the new images. See this post for examples of adding and replacing varying NamedImage to a Chart! (The relevant portion is close to the end about the marker images!)
In fact I found that way to be the best solution and have added a second answer.

Alternative and recommended solution:
I dabbled with the last option I mentioned in my other answer and found it to be rather nice; it is similarly extensive, so I decided to post a second answer.
The idea is to use a StripLine with just the right BackgroundImage.
The advantage is that it will display nicely under all chart elements and never draw over the axis, grid, datapoints or conflict with the zoom tools.
Since the StripLine must be updated repeatedly I put it in a function; it makes use of the same two helper functions to calculate pixel positions as the other answer does:
void updateStripLine(Chart chart, ChartArea ca, string name)
{
// find our stripline; one could pass in a class level variable as well
StripLine sl = ca.AxisY.StripLines.Cast<StripLine>()
.Where(s => s.Tag.ToString() == name).FirstOrDefault();
if (sl != null) // either clean-up the resources..
{
var oldni = chart.Images.FindByName(name);
if (oldni != null)
{
oldni.Image.Dispose();
chart.Images.Remove(oldni);
oldni.Dispose();
}
}
else // or, create the line
{
sl = new StripLine();
sl.Tag = name;
ca.AxisY.StripLines.Add(sl);
}
ca.RecalculateAxesScale();
RectangleF ipr = InnerPlotPositionClientRectangle(chart, ca);
Axis ax = ca.AxisX;
Axis ay = ca.AxisY;
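// subtract ax.Minimum so this also works when the axis does not start at 0
double f1 = (ax.ScaleView.ViewMinimum - ax.Minimum) / (ax.Maximum - ax.Minimum);
double f2 = (ax.ScaleView.ViewMaximum - ax.Minimum) / (ax.Maximum - ax.Minimum);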
Bitmap b0 = (Bitmap)chart.Images["spectrum"].Image;
int x = (int)(b0.Width * f1);
int xx = (int)(b0.Width * f2);
Rectangle srcR = new Rectangle( x, 0, xx - x, b0.Height);
Rectangle tgtR = Rectangle.Round(new RectangleF(0,0, ipr.Width , 10));
// create bitmap and namedImage:
Bitmap bmp = new Bitmap( tgtR.Width, tgtR.Height);
using (Graphics g = Graphics.FromImage(bmp))
{ g.DrawImage(b0, tgtR, srcR, GraphicsUnit.Pixel); }
NamedImage ni = new NamedImage(name, bmp);
chart.Images.Add(ni);
sl.BackImageWrapMode = ChartImageWrapMode.Scaled;
sl.StripWidth = ay.PixelPositionToValue(0) - ay.PixelPositionToValue(12);
sl.Interval = 100; // make large enough to avoid another sLine showing up
sl.IntervalOffset = 0;
sl.BackImage = name;
}
Many of the comments and links from the other answer apply, especially with respect to the NamedImage we use for the StripLine.
A few more notes:
I use one of the (four) axis conversion functions, PixelPositionToValue to calculate a pixel height of 12px; the StripLine takes values, so I use two pixel values to get the right difference value.
To identify the StripLine I use the Tag property. Of course the Name property would be much more natural, but it is read-only. No idea why?!
The function is called from the AxisViewChanged event, the Resize event and also the PrePaint event; this makes sure it will always be called when needed. To avoid invalid calls from PrePaint I do it like this: if (ay.StripLines.Count == 0) updateStripLine(chart, ca, "sl"); Of course you should adapt this if you use other StripLines on this axis..
The code makes use of the same image as before; but I have put it into a first NamedImage called spectrum. This would be an option in the 1st answer as well.
NamedImage spectrum = new NamedImage("spectrum", Bitmap.FromFile(imagePath));
chart.Images.Add(spectrum);
It also makes sure to dispose of the old images properly, I hope..

PDFTron: converting pixels to fontsize

I have some text in a pdf that has been OCR'ed.
The OCR returns the bounding boxes of the words to me.
I'm able to draw the bounding boxes (wordRect) on the pdf and everything seems correct.
But when i tell my fontsize to be the height of these bounding boxes,
it all goes wrong. The text appears way smaller than it should be and doesn't match the height.
There's some conversion i am missing. How can i make sure the text is as high as the bounding boxes?
pdftron.PDF.Font font = pdftron.PDF.Font.Create(convertedPdf.GetSDFDoc(), pdftron.PDF.Font.StandardType1Font.e_helvetica);
for (int j = 0; j < ocrStream.pr_WoordList.Count; j++)
{
wordRect = (Rectangle) ocrStream.pr_Rectangles[j];
Element textBegin = elementBuilder.CreateTextBegin();
gStateTextRun = textBegin.GetGState();
gStateTextRun.SetTextRenderMode(GState.TextRenderingMode.e_stroke_text);
elementWriter.WriteElement(textBegin);
fontSize = wordRect.Height;
double descent;
if (hasColorImg)
{
descent = (-1 * font.GetDescent() / 1000d) * fontSize;
textRun = elementBuilder.CreateTextRun((string)ocrStream.pr_WoordList[j], font, fontSize);
//translate the word to its correct position on the pdf
//the bottom line of the wordrectangle is the baseline for the font, that's why we need the descender
textRun.SetTextMatrix(1, 0, 0, 1, wordRect.Left, wordRect.Bottom + descent );

How can i make sure the text is as high as the bounding boxes?
The font_size is just a scaling factor, which in most cases does map to 1/72 inch (pt), but not always.
The transformations are:
GlyphSpace -> TextSpace -> UserSpace (where UserSpace is essentially the page space, and is 1/72 inch)
The glyphs in the font are defined in GlyphSpace, and there is a font matrix that maps to TextSpace. Typically, 1000 units map to 1 unit in text space, but not always.
Then the text matrix (element.SetTextMatrix), the font size (the variable in question here) and some additional parameters transform TextSpace coordinates to UserSpace.
In the end, though, the exact height also depends on the glyph.
This forum post shows how to go from the glyph data, to UserSpace. See ProcessElements
https://groups.google.com/d/msg/pdfnet-sdk/eOATUHGFyqU/6tsUF0BHukkJ
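For the common case described above (1000 glyph units per text space unit, a text matrix without scaling), a rough first approximation is to choose the font size so that the font's ascent-to-descent extent fills the box, instead of using the box height as the font size directly. The sketch below shows only that arithmetic; the names are hypothetical, the ascent and descent are assumed to be given in glyph units (descent negative, as returned by font.GetDescent() in the question), and because OCR boxes usually hug the actual glyphs, the result is a starting point rather than an exact fit.

```csharp
using System;

// Hypothetical helper: plain arithmetic, no dependency on PDFNet.
public static class FontFit
{
    // Font size at which (ascent - descent), given in 1/1000 glyph units,
    // spans exactly boxHeight user space units.
    public static double FontSizeForHeight(double boxHeight, double ascent, double descent)
    {
        double glyphHeight = ascent - descent;   // descent is negative
        if (glyphHeight <= 0)
            throw new ArgumentException("ascent must be greater than descent");
        return boxHeight * 1000.0 / glyphHeight;
    }

    // Distance from the bottom of the box up to the baseline at that size.
    public static double BaselineOffset(double fontSize, double descent)
    {
        return -descent / 1000.0 * fontSize;
    }
}
```

For example, with an ascent of 718 and a descent of -207 (illustrative values), a box 92.5 units high would call for a font size of about 100 rather than 92.5, with the baseline about 20.7 units above the bottom of the box.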

Custom richtextbox control kerning issues

Okay, so I have been working on something for a little while and I have gotten to the point where I am planning the Text rendering part.
I can already draw strings of text in two ways: DrawString and TextRenderer.DrawText. I prefer DrawText since measuring text is more accurate when using TextRenderer.MeasureText.
I have a class:
public class Character
{
public string character {get; set; }
public Font font {get; set; }
public Point position {get; set; }
}
And a list of all characters pressed:
public List<Character> chars = new List<Character>();
Now my problem is that I need to be able to set a different font and color and boldness or italicization to any given selected characters or words at runtime. So I can't just draw a whole string because then there'd be no way for me to set individual font settings for each character the user has selected to change.
So I need to be able to store different font style info for each character and then add them all to a list so I can kinda go through each one and draw each one as it should be drawn (I. E. each char having its own style etc).
This solution works fine for me. And since I've not been able to find any info about this anywhere for months, I'm totally stuck.
My main problem is that because I am drawing char by char, I have no idea how far apart each character should be from the previously drawn character (kerning).
For input (text box) controls, how can we custom draw text and allow the user to make a part of a word blue, and the other half of the word a different size and color and style, for example, while still adhering to proper kerning settings?
How do we know where to draw each character?
People have said just keep restarting the whole string at once. But that doesn't solve my initial problem. I need to be able to draw each char one by one so I can save font info about it.

Kerning and Character Spacing are different and if you want to have complete control over what your code prints you may need to implement both.
Let's look at an example output first :
Image one shows direct output with an extra character spacing of 1 pixel, no kerning.
Image two has some kerning applied, but only for three kerning pairs.
I have tried to make things clearer by also drawing the result of the characterwise text measurements. Also there is a tiled 1 pixel raster as the panel BackgroundImage. (To see it better you may want to download the png files!)
private void panel2_Paint(object sender, PaintEventArgs e)
{
string fullText = "Text;1/2' LTA";
StringFormat strgfmt = StringFormat.GenericTypographic;
Font font = new Font("Times", 60f, FontStyle.Regular);
float x = 0f;
using (SolidBrush brush = new SolidBrush(Color.FromArgb(127, 0, 127, 127)))
{
for (int i = 0; i < fullText.Length; i++)
{
string text = fullText.Substring(i, 1);
SizeF sf = e.Graphics.MeasureString(text, font, 9999, strgfmt );
e.Graphics.FillRectangle(brush, new RectangleF(new PointF(x, 0f), sf));
e.Graphics.DrawString(text, font, Brushes.Black, x, 0, strgfmt );
x += sf.Width + 1; // character spacing = +1
//if (i < fullText.Length - 1) doKerning(fullText.Substring(i, 2), ref x);
}
}
}
void doKerning(string c12, ref float x)
{
if (smallKerningTable.ContainsKey(c12)) x -= smallKerningTable[c12];
}
Dictionary<string, float> smallKerningTable = new Dictionary<string, float>();
void initKerningTable()
{
smallKerningTable.Add("Te", 7f);
smallKerningTable.Add("LT", 8f);
smallKerningTable.Add("TA", 11f);
//..
}
This is how the background is created:
public Form1()
{
InitializeComponent();
Bitmap bmpCheck2 = new Bitmap(2, 2);
bmpCheck2.SetPixel(0, 0, Color.FromArgb(127, 127, 127, 0));
panel2.BackgroundImage = bmpCheck2;
panel2.BackgroundImageLayout = ImageLayout.Tile;
//..
}
If you want to use kerning you will need to build a much longer kerning table.
In real life typographers and font designers do that manually, looking hard at the glyphs, tweaking the kerning until it looks real good.
That is rather expensive and still doesn't cover font mixes.
So you may want to either
not use kerning after all. Make sure to use the StringFormat.GenericTypographic option both for measuring and for drawing the strings!
create a small kerning table for some of the especially problematic characters, like 'L', 'T', 'W', 'V' and 'A'..
write code to create a full kerning table for all pairs you need or..
for all pairs
To write code to create a kerning table you would:
Create a Bitmap for each character
Iterate over all pairs and
move the second bitmap to the left until some non-transparent/black pixels collide.
The move should not go further than, say, half of the width; otherwise the distance should be reset to 0, because some character pairs will not collide at all and should not have any kerning, e.g. '^_' or '.-'
If you want to mix fonts and /or FontStyles the key to the kerning table would have to be expanded to include some ID of the two respective fonts&styles the characters have..
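The table-building steps described above can be sketched without any drawing code by treating each rasterised glyph as a boolean mask (true = ink). This is a simplified illustration, not a production kerning algorithm: the names are hypothetical, the masks are assumed to share the same baseline and row count, and the shift stops when the glyphs touch rather than overlap.

```csharp
using System;

// Hypothetical helper working on glyph masks indexed as [row, column].
public static class KerningSketch
{
    // Returns how many pixels the right glyph may move left before its ink
    // touches the ink of the left glyph, or 0 if the glyphs never meet or
    // the shift would exceed half the right glyph's width.
    public static int KerningFor(bool[,] left, bool[,] right)
    {
        int rows = Math.Min(left.GetLength(0), right.GetLength(0));
        int leftWidth = left.GetLength(1);
        int rightWidth = right.GetLength(1);
        int minGap = int.MaxValue;

        for (int y = 0; y < rows; y++)
        {
            // rightmost ink pixel of the left glyph in this row
            int lastInk = -1;
            for (int x = leftWidth - 1; x >= 0; x--)
                if (left[y, x]) { lastInk = x; break; }

            // leftmost ink pixel of the right glyph in this row
            int firstInk = -1;
            for (int x = 0; x < rightWidth; x++)
                if (right[y, x]) { firstInk = x; break; }

            if (lastInk < 0 || firstInk < 0) continue; // no ink to collide here

            int gap = (leftWidth - 1 - lastInk) + firstInk;
            minGap = Math.Min(minGap, gap);
        }

        if (minGap == int.MaxValue) return 0;  // the pair never collides
        int limit = rightWidth / 2;            // "half of the width" rule
        return minGap > limit ? 0 : minGap;
    }
}
```

The returned values could then be stored in a dictionary keyed by the character pair, as in smallKerningTable above.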

Calculating Text Wrapping in the .NET DrawingContext DrawText method

I'm working on a project that has me approximating text rendered as an image and a DHTML editor for the text. The images are rendered using the .NET 4 DrawingContext object's DrawText method.
The DrawText method will take text along with font information as well as dimensions and calculate the wrapping necessary to get the text to fit as much as possible, placing an ellipsis at the end if the text is too long. So, if I have the following code to draw text in a Rectangle it will abbreviate it:
string longText = @"A choice of five engines, although the 2-liter turbo diesel, supposedly good for 48 m.p.g. highway, is not coming to America, at least for now. A 300-horsepower supercharged gasoline engine will likely be the first offered in the United States. All models will use start-stop technology, and fuel consumption will decrease by an average of 19 percent across the A6 lineup. A 245-horsepower A6 hybrid was also unveiled, but no decision has yet been made as to its North America sales prospects. Figure later in 2012, if sufficient demand is detected.";
var drawing = new DrawingGroup();
using (var context = drawing.Open())
{
var text = new FormattedText(longText,
CultureInfo.CurrentCulture,
FlowDirection.LeftToRight,
new Typeface("Calibri"),
30,
Brushes.Green);
text.MaxTextHeight = myRect.Height;
text.MaxTextWidth = myRect.Width;
context.DrawText(text, new Point(0, 0));
}
var db = new DrawingBrush(drawing);
db.Stretch = Stretch.None;
myRect.Fill = db;
Is there a way to calculate how the text will be wrapped? In this example, the outputted text is wrapped at "2-liter" and "48 m.p.g" etc as seen in the image below:

You can use the Graphics.MeasureString(String, Font, Int32) function. You pass it the string, the font, and the maximum width. It returns a SizeF with the size of the rectangle the text would occupy. You can use this to get the overall height, and thus the number of lines:
Graphics g = ...;
Font f = new Font("Calibri", 30.0f);
SizeF sz = g.MeasureString(longText, f, (int)myRect.Width);
float height = sz.Height;
int lines = (int)Math.Round(height / f.Height); // overall height divided by the line height = number of lines
There are many ways to get a Graphics object, and any will do since you are only using it to measure and not to draw (you may have to account for its DpiX, DpiY, and PageUnit values, since those affect measurements).
Ways to get a Graphics object:
Graphics g = e.Graphics; // in OnPaint, with PaintEventArgs e
Graphics g = x.CreateGraphics(); // where x is any Form or Control
Graphics g = Graphics.FromImage(img); // where img is an Image.

Not sure if you still need a solution or if this particular solution is appropriate for your application, but if you insert the below snippet just after your using block it will show you the text in each line (and therefore where the text was broken for wrapping).
I arrived at this solution using the very ghetto/guerrilla approach of just browsing properties while debugging, looking for the wrapped text segments - I found 'em and they were in accessible properties...so there you go. There very well may be a more proper/direct way.
// Object hierarchy:
// DrawingGroup (whole thing)
// - DrawingGroup (lines)
// - GlyphRunDrawing.GlyphRun.Characters (parts of lines)
// Note, if text is clipped, the ellipsis will be placed in its own
// separate "line" below. Give it a try and you'll see what I mean.
List<DrawingGroup> lines = drawing.Children.OfType<DrawingGroup>().ToList();
foreach (DrawingGroup line in lines)
{
List<char> lineparts = line.Children
.OfType<GlyphRunDrawing>()
.SelectMany(grd => grd.GlyphRun.Characters)
.ToList();
string lineText = new string(lineparts.ToArray());
Debug.WriteLine(lineText);
}
Btw, Hi David. :-)
