I'm developing an application to manipulate images scanned on a wide-image scanner. These images are shown as a ImageBrush on a Canvas.
On this Canvas they can a make Rectangle with the mouse, to define an area to be cropped.
My problem here is to resize the Rectangle according to the original image size, so that it crops the exact area on the original image.
I've tried many things so far and it's just sqeezing my brain, to figure out the right solution.
I know that I need to get the percent that the original image is bigger than the image shown on the canvas.
The dimentions of the original image are:
h: 5606
w: 7677
And when I show the image, they are:
h: 1058,04
w: 1910
Which gives these numbers:
float percentWidth = ((originalWidth - resizedWidth) / originalWidth) * 100;
float percentHeight = ((originalHeight - resizedHeight) / originalHeight) * 100;
percentWidth = 75,12049
percentHeight = 81,12665
From here I can't figure how to resize the Rectangle correctly, to fit the original image.
My last approach was this:
int newRectWidth = (int)((originalWidth * percentWidth) / 100);
int newRectHeight = (int)((originalHeight * percentHeight) / 100);
int newRectX = (int)(rectX + ((rectX * percentWidth) / 100));
int newRectY = (int)(rectY + ((rectY * percentHeight) / 100));
Hopefully someone can lead me in the right direction, because i'm off track here and I can't see what i'm missing.
Solution
private System.Drawing.Rectangle FitRectangleToOriginal(
float resizedWidth,
float resizedHeight,
float originalWidth,
float originalHeight,
float rectWidth,
float rectHeight,
double rectX,
double rectY)
{
// Calculate the ratio between original and resized image
float ratioWidth = originalWidth / resizedWidth;
float ratioHeight = originalHeight / resizedHeight;
// create a new rectagle, by resizing the old values
// by the ratio calculated above
int newRectWidth = (int)(rectWidth * ratioWidth);
int newRectHeight = (int)(rectHeight * ratioHeight);
int newRectX = (int)(rectX * ratioWidth);
int newRectY = (int)(rectY * ratioHeight);
return new System.Drawing.Rectangle(newRectX, newRectY, newRectWidth, newRectHeight);
}
I think the only reliable option is to let your users zoom in to the image (100% or higher zoom level) and make a selection on part of the image. This way they can make an exact pixel-based selection. (Assuming that the purpose of your selection rectangle is to select part of an image.)
Your problem now is that you're using floating-point calculations because of the 75% zoom level and rounding errors will make your selection rectangles inaccurate. No matter what you do, when you try to make a selection on a shrinked image, you're not selecting exact pixels - you're selecting parts of pixels as you resize your rectangle. Since a partial pixel cannot be selected, the selection edges will be rounded up or down so you either select one pixel too many or one pixel too few in a given direction.
Another issue that I just noticed is that you distort your image - horizontally it's 75% zoom, vertically it's 81%. This makes it even harder for users because the image will be smoothed differently in the two directions. Horizontally 4 original pixels will be interpolated on 3 output pixels; vertically 5 original pixels will be interpolated on 4 output pixels.
You are actually doing a form of projection. Don't use percentages, just use the ratio between 5606 and 1058,4 = ~5.30. When the user drags the rectangle, reproject it which is selectedWidth * 5606/1058.4.
Related
I am trying to use a colored spectrum strip as an axis for a chart. The idea is to match the color on the image with its associated wavelength along the x-axis at the bottom. The strip needs to change in size to match changes of the chart area and expand and contract sections to match scroll-zooming in the chart area.
I have tried using image annotations but as the chart area changes, the annotation dimensions remain fixed. Also, the scroll zooming that focuses in on mouse position obviously has no effect on the annotation.
The approach that came closest was using the image as a background for the chart area. This automatically scaled the image as the chart area changed but scroll-zooming has no effect on the background image. Also, it would be ideal to have the background clear so as to avoid obscuring data plot points. I can edit the image to have a large transparent section and only a colored strip at the bottom but even then, that strip could obscure lower intensity data points.
Spectrum as annotation and background:
Annotation not scaling, background scales well:
Both annotation and background not scaling with zooming:
This is a nice idea.
The simplest way is to draw the image in a Paint event of the Chart, maybe PrePaint.
Let's go to work.. We will use the DrawImage overload that allows us zooming as well as cropping. For this we need two rectangles.
The first challenge is to always get the correct target rectangle.
For this we need to convert the InnerPlotPosition from relative positions to absolute pixels.
These two functions will help:
RectangleF ChartAreaClientRectangle(Chart chart, ChartArea CA)
{
RectangleF CAR = CA.Position.ToRectangleF();
float pw = chart.ClientSize.Width / 100f;
float ph = chart.ClientSize.Height / 100f;
return new RectangleF(pw * CAR.X, ph * CAR.Y, pw * CAR.Width, ph * CAR.Height);
}
RectangleF InnerPlotPositionClientRectangle(Chart chart, ChartArea CA)
{
RectangleF IPP = CA.InnerPlotPosition.ToRectangleF();
RectangleF CArp = ChartAreaClientRectangle(chart, CA);
float pw = CArp.Width / 100f;
float ph = CArp.Height / 100f;
return new RectangleF(CArp.X + pw * IPP.X, CArp.Y + ph * IPP.Y,
pw * IPP.Width, ph * IPP.Height);
}
With these numbers setting the destination rectangle is as simple as:
Rectangle tgtR = Rectangle.Round(new RectangleF(ipr.Left, ipr.Bottom - 15, ipr.Width, 15));
You can chose a height as you like..
The next challenge is the source rectangle.
Without zooming it would simply be:
Rectangle srcR = new Rectangle( 0, 0, bmp.Width, bmp.Height);
But for zooming and panning we need to scale it; for this we can use the x-axis and the ScaleView's Minimum and Maximum values.
We calculate factors for the first and last spot on the axis:
double f1 = ax.ScaleView.ViewMinimum / (ax.Maximum - ax.Minimum);
double f2 = ax.ScaleView.ViewMaximum / (ax.Maximum - ax.Minimum);
now we get the source rectangle maybe like this:
int x = (int)(bmp.Width * f1);
int xx = (int)(bmp.Width * f2);
Rectangle srcR = new Rectangle( x, 0, xx - x, bmp.Height);
Let's put it together:
private void chart_PrePaint(object sender, ChartPaintEventArgs e)
{
// a few short names
Graphics g = e.ChartGraphics.Graphics;
ChartArea ca = chart.ChartAreas[0];
Axis ax = ca.AxisX;
// pixels of plot area
RectangleF ipr = InnerPlotPositionClientRectangle(chart, ca);
// scaled first and last position
double f1 = ax.ScaleView.ViewMinimum / (ax.Maximum - ax.Minimum);
double f2 = ax.ScaleView.ViewMaximum / (ax.Maximum - ax.Minimum);
// actual drawing with the zooming overload
using (Bitmap bmp = (Bitmap)Bitmap.FromFile(imagePath))
{
int x = (int)(bmp.Width * f1);
int xx = (int)(bmp.Width * f2);
Rectangle srcR = new Rectangle( x, 0, xx - x, bmp.Height);
Rectangle tgtR = Rectangle.Round(
new RectangleF(ipr.Left , ipr.Bottom - 15, ipr.Width, 15));
g.DrawImage(bmp, tgtR, srcR, GraphicsUnit.Pixel);
}
}
A few notes:
Of course I would recomend to use an Image resource instead of always loading from disk!
The Drawing will always overlay the data points and also the grids. You can either..
choose a different minimum to make room
make the image smaller
move it below the x-axis labels
make the image semi-transparent
make the x-axis so fat that it can hold the image strip : ax.LineWidth = 10
For the latter solution you would want to offset the y-position depending on the zoom state. Quick and dirty: int yoff = (ax.ScaleView.IsZoomed ? 12 : 5);. To avoid black stripes also make the axis Transparent or chart.BackColor..
Update:
You can also revert to using a StripLine. It can scale its BackgroundImage and you would have to create a suitable image whenever changing the scaleview, i.e. when zooming or panning. For this much of the above code would be used to create the new images. See this post for examples of adding and replacing varying NamedImage to a Chart! (The relevant portion is close to the end about the marker images!)
In fact I found that way to be the best solution and have added a second answer.
Alternative and recommended solution:
I dabbled with the last option I mentioned in my other answer and found it to be rather nice; it is similarily extensive, so I decided to post a second answer.
The idea is to use a StripLine with just the right BackgroundImage.
The advantage is that is will display nicely under all chart elements and never draw over the axis, grid, datapoints or conflict with the zoom tools.
Since the StripLine must be updated repeatedly I put it in a function:
Here is the function; it makes use of the same two helper functions to calculate pixel positions as the other answer does..:
void updateStripLine(Chart chart, ChartArea ca, string name)
{
// find our stripline; one could pass in a class level variable as well
StripLine sl = ca.AxisY.StripLines.Cast<StripLine>()
.Where(s => s.Tag.ToString() == name).FirstOrDefault();
if (sl != null) // either clean-up the resources..
{
var oldni = chart.Images.FindByName(name);
if (oldni != null)
{
oldni.Image.Dispose();
chart.Images.Remove(oldni);
oldni.Dispose();
}
}
else // or, create the line
{
sl = new StripLine();
sl.Tag = name;
ca.AxisY.StripLines.Add(sl);
}
ca.RecalculateAxesScale();
RectangleF ipr = InnerPlotPositionClientRectangle(chart, ca);
Axis ax = ca.AxisX;
Axis ay = ca.AxisY;
double f1 = ax.ScaleView.ViewMinimum / (ax.Maximum - ax.Minimum);
double f2 = ax.ScaleView.ViewMaximum / (ax.Maximum - ax.Minimum);
Bitmap b0 = (Bitmap)chart.Images["spectrum"].Image;
int x = (int)(b0.Width * f1);
int xx = (int)(b0.Width * f2);
Rectangle srcR = new Rectangle( x, 0, xx - x, b0.Height);
Rectangle tgtR = Rectangle.Round(new RectangleF(0,0, ipr.Width , 10));
// create bitmap and namedImage:
Bitmap bmp = new Bitmap( tgtR.Width, tgtR.Height);
using (Graphics g = Graphics.FromImage(bmp))
{ g.DrawImage(b0, tgtR, srcR, GraphicsUnit.Pixel); }
NamedImage ni = new NamedImage(name, bmp);
chart.Images.Add(ni);
sl.BackImageWrapMode = ChartImageWrapMode.Scaled;
sl.StripWidth = ay.PixelPositionToValue(0) - ay.PixelPositionToValue(12);
sl.Interval = 100; // make large enough to avoid another sLine showing up
sl.IntervalOffset = 0;
sl.BackImage = name;
}
Much of the comments and links apply, especially wrt to the NamedImage we use for the StripLine.
A few more notes:
I use one of the (four) axis conversion functions, PixelPositionToValue to calculate a pixel height of 12px; the StripLine takes values, so I use two pixel values to get the right difference value.
To identify the StripLine I use the Tag property. Of course the Name property would be much more natural, but it is read-only. No idea why?!
The function is called from the AxisViewChanged, the Resize event and also the the PrePaint event; this makes sure it will always be called when needed. To avoid invalid calls from the PrePaint there I do it like this: if (ay.StripLines.Count == 0) updateStripLine(chart, ca, "sl"); Of course you should adapt if you use other StripLines on this axis..
The code makes use of the same image as before; but I have put it into a first NamedImage called spectrum. This would be an option in the 1st answer as well.
NamedImage spectrum = new NamedImage("spectrum", Bitmap.FromFile(imagePath);
chart.Images.Add(spectrum);
It also makes sure to dispose of the old images properly, I hope..
I am currently implementing Pan/Tilt/Zoom support in my application. I pass in an image and then calculate zoom in the following manner
int widthPercent = width / 100;
int heightPercent = height / 100;
int zoomX = width - (widthPercent * (int)Zoom);
int zoomY = height - (heightPercent * (int)Zoom);
where width is width of original image, height is height of original image and Zoom is a value passed in from the UI, ranging from 0 to 100.
I now wish to implement Pan/Tilt support while an image is zoomed in so that the whole image can still be accessed. Again Pan and Tilt will be controlled from the UI(again 0 to 100) but I want it to stay within the boundaries of the image so that the image does not repeat which it is currently doing.
I am current calculating the Pan and Tilt like so:
// Calculate Pan
int panWidthPercent = width / 100;
int finalPan = (int)Pan * panWidthPercent;
// Calculate Tilt
int tiltHeightPercent = height / 100;
int finalTilt = (int)Tilt * tiltHeightPercent;
This works to an extent however it seems to keep repeating the image after panning for a small time(usually when Pan = 10 or more). I wish it to stop after it renders all the image i.e it reaches the width but adding something like the following doesnt seem to stop this
if(finalPan >= width) finalPan = width;
Once the values are calculate I create a source rectangle using these values
mClippingRectangle = new Int32Rect(finalPan, finalTilt, zoomX, zoomY);
which is then used against the base image for rendering only that rectangle
In short how can I calculate how much I should pan/tilt when I already know the zoom of the image
Managed to accomplish it by checking whether the Pan/Tilt Amount added to the zoom width/height is greater than the overall image width/height then stop it from panning/tilting
// Calculate Pan
double finalPan = Pan * widthPercent;
if ((finalPan + zoomX) >= width)
{
finalPan = width - zoomX;
}
// Calculate Tilt
double finalTilt = Tilt * heightPercent;
if ((finalTilt + zoomY) >= height)
{
finalTilt = height - zoomY;
}
I have images from two camera. Both of them send a picture but the picture of camera 1 is zoomed(Its like that picture of camera 1 is inside of picture of camera 2).
I have the position of a point in picture of camera 1. This position can change in different pictures. Now I want to find that point in picture of camera 2.
Both of cameras images are in 2560X2048 px.
How can I find that x,y in picture 2?
I found the answer. I cropped the unzoomed! picture to be equal to zoomed picture. and save the x,y of cropped picture on unzoomed picture. Than I calculate percent of x,y of that point in zoomed picture. something like this :
double percentXZoom = (I_PLATE_MIN_X * 100) / 2560;
double xCropedImage = xD - xU;
double xDiff = (xCropedImage * percentXZoom) / 100;
double x = xD + Math.Abs(xDiff);
double percentYZoom = (I_PLATE_MIN_Y * 100) / 2560;
double yCropedImage = yD - yU;
double yDiff = (yCropedImage * percentYZoom) / 100;
double y = yD + Math.Abs(yDiff);
"2560" is the size of pixel in pictures.
xD, yD is the start point of cropped picture. and xU, yU is the end point of cropped picture.
Now I have the x,y of that point in unzoomed picture.
I'm trying to figure out a good way to auto-size a Rectangle that has text drawn inside of it. I basically want the size to have a ratio of width/height and then "grow" according to that ratio to fit the text. I've looked at Graphics.MeasureString but I don't think it does what I'm looking for (maybe it does and I'm just using it wrong).
I don't want to specify a specific width of the rectangle to be drawn. Instead I want to say find the smallest width/height to fit this text given a minimum width but the found rectangle must have some specific ratio of width and height.
This doesn't have to be specific to C#, any idea for solving this problem I'm sure can be mapped to C#.
Thanks!
I believe you can use Graphics.MeasureString. This is what I have used in my GUI code to draw rectangles around text. You hand it the text and the font you want to use, it returns to you a rectangle (technically a SizeF object - width and height). Then you can adjust this rectangle by the ratio you want:
Graphics g = CreateGraphics();
String s = "Hello, World!";
SizeF sizeF = g.MeasureString(s, new Font("Arial", 8));
// Now I have a rectangle to adjust.
float myRatio = 2F;
SizeF adjustedSizeF = new SizeF(sizeF.Width * myRatio, sizeF.Height * myRatio);
RectangleF rectangle = new RectangleF(new PointF(0, 0), adjustedSizeF);
Am I understanding your question correctly?
You should use TextRenderer.MeasureText, all controls use TextRenderer to draw text in .NET 2.0 and up.
There is no unambiguous solution to your question, there are many possible ways to fit text in a Rectangle. A wide one that displays just one line is just as valid as a narrow one that displays many lines. You'll have to constrain one of the dimensions. It is a realistic requirement, this rectangle is shown inside some other control and that control has a certain ClientSize. You'll need to decide how you want to lay it out.
On the back of my comment about the System.Windows.Forms.Label, maybe you could have a look at the code driving the painting of a Label? If you use Reflector this should get you part of the way.
There seems to be some methods on there like GetPreferredSizeCore() for example that probably have what you want which I'm sure could be made generic enough given a little work.
I've found my own solution. The following code determines the best rectangle (matching the ratio) to fit the text. It uses divide and conquer to find the closest rectangle (by decrementing the width by some "step"). This algorithm uses a min-width that is always met and I'm sure this could be modified to include a max width. Thoughts?
private Size GetPreferredSize(String text, Font font, StringFormat format)
{
Graphics graphics = this.CreateGraphics();
if (format == null)
{
format = new StringFormat();
}
SizeF textSize = SizeF.Empty;
// The minimum width allowed for the rectangle.
double minWidth = 100;
// The ratio for the height compared to the width.
double heightRatio = 0.61803399; // Gloden ratio to make it look pretty :)
// The amount to in/decrement for width.
double step = 100;
// The new width to be set.
double newWidth = minWidth;
// Find the largest width that the text fits into.
while (true)
{
textSize = graphics.MeasureString(text, font, (int)Math.Round(newWidth), format);
if (textSize.Height <= newWidth * heightRatio)
{
break;
}
newWidth += step;
}
step /= 2;
// Continuously divide the step to adjust the rectangle.
while (true)
{
// Ensure step.
if (step < 1)
{
break;
}
// Ensure minimum width.
if (newWidth - step < minWidth)
{
break;
}
// Try to subract the step from the width.
while (true)
{
// Measure the text.
textSize = graphics.MeasureString(text, font, (int)Math.Round(newWidth - step), format);
// If the text height is going to be less than the new height, decrease the new width.
// Otherwise, break to the next lowest step.
if (textSize.Height < (newWidth - step) * heightRatio)
{
newWidth -= step;
}
else
{
break;
}
}
step /= 2;
}
double width = newWidth;
double height = width * heightRatio;
return new Size((int)Math.Ceiling(width), (int)Math.Ceiling(height));
}
I'm trying to write a program to programmatically determine the tilt or angle of rotation in an arbitrary image.
Images have the following properties:
Consist of dark text on a light background
Occasionally contain horizontal or vertical lines which only intersect at 90 degree angles.
Skewed between -45 and 45 degrees.
See this image as a reference (its been skewed 2.8 degrees).
So far, I've come up with this strategy: Draw a route from left to right, always selecting the nearest white pixel. Presumably, the route from left to right will prefer to follow the path between lines of text along the tilt of the image.
Here's my code:
private bool IsWhite(Color c) { return c.GetBrightness() >= 0.5 || c == Color.Transparent; }
private bool IsBlack(Color c) { return !IsWhite(c); }
private double ToDegrees(decimal slope) { return (180.0 / Math.PI) * Math.Atan(Convert.ToDouble(slope)); }
private void GetSkew(Bitmap image, out double minSkew, out double maxSkew)
{
decimal minSlope = 0.0M;
decimal maxSlope = 0.0M;
for (int start_y = 0; start_y < image.Height; start_y++)
{
int end_y = start_y;
for (int x = 1; x < image.Width; x++)
{
int above_y = Math.Max(end_y - 1, 0);
int below_y = Math.Min(end_y + 1, image.Height - 1);
Color center = image.GetPixel(x, end_y);
Color above = image.GetPixel(x, above_y);
Color below = image.GetPixel(x, below_y);
if (IsWhite(center)) { /* no change to end_y */ }
else if (IsWhite(above) && IsBlack(below)) { end_y = above_y; }
else if (IsBlack(above) && IsWhite(below)) { end_y = below_y; }
}
decimal slope = (Convert.ToDecimal(start_y) - Convert.ToDecimal(end_y)) / Convert.ToDecimal(image.Width);
minSlope = Math.Min(minSlope, slope);
maxSlope = Math.Max(maxSlope, slope);
}
minSkew = ToDegrees(minSlope);
maxSkew = ToDegrees(maxSlope);
}
This works well on some images, not so well on others, and its slow.
Is there a more efficient, more reliable way to determine the tilt of an image?
I've made some modifications to my code, and it certainly runs a lot faster, but its not very accurate.
I've made the following improvements:
Using Vinko's suggestion, I avoid GetPixel in favor of working with bytes directly, now the code runs at the speed I needed.
My original code simply used "IsBlack" and "IsWhite", but this isn't granular enough. The original code traces the following paths through the image:
http://img43.imageshack.us/img43/1545/tilted3degtextoriginalw.gif
Note that a number of paths pass through the text. By comparing my center, above, and below paths to the actual brightness value and selecting the brightest pixel. Basically I'm treating the bitmap as a heightmap, and the path from left to right follows the contours of the image, resulting a better path:
http://img10.imageshack.us/img10/5807/tilted3degtextbrightnes.gif
As suggested by Toaomalkster, a Gaussian blur smooths out the height map, I get even better results:
http://img197.imageshack.us/img197/742/tilted3degtextblurredwi.gif
Since this is just prototype code, I blurred the image using GIMP, I did not write my own blur function.
The selected path is pretty good for a greedy algorithm.
As Toaomalkster suggested, choosing the min/max slope is naive. A simple linear regression provides a better approximation of the slope of a path. Additionally, I should cut a path short once I run off the edge of the image, otherwise the path will hug the top of the image and give an incorrect slope.
Code
private double ToDegrees(double slope) { return (180.0 / Math.PI) * Math.Atan(slope); }
private double GetSkew(Bitmap image)
{
BrightnessWrapper wrapper = new BrightnessWrapper(image);
LinkedList<double> slopes = new LinkedList<double>();
for (int y = 0; y < wrapper.Height; y++)
{
int endY = y;
long sumOfX = 0;
long sumOfY = y;
long sumOfXY = 0;
long sumOfXX = 0;
int itemsInSet = 1;
for (int x = 1; x < wrapper.Width; x++)
{
int aboveY = endY - 1;
int belowY = endY + 1;
if (aboveY < 0 || belowY >= wrapper.Height)
{
break;
}
int center = wrapper.GetBrightness(x, endY);
int above = wrapper.GetBrightness(x, aboveY);
int below = wrapper.GetBrightness(x, belowY);
if (center >= above && center >= below) { /* no change to endY */ }
else if (above >= center && above >= below) { endY = aboveY; }
else if (below >= center && below >= above) { endY = belowY; }
itemsInSet++;
sumOfX += x;
sumOfY += endY;
sumOfXX += (x * x);
sumOfXY += (x * endY);
}
// least squares slope = (NΣ(XY) - (ΣX)(ΣY)) / (NΣ(X^2) - (ΣX)^2), where N = elements in set
if (itemsInSet > image.Width / 2) // path covers at least half of the image
{
decimal sumOfX_d = Convert.ToDecimal(sumOfX);
decimal sumOfY_d = Convert.ToDecimal(sumOfY);
decimal sumOfXY_d = Convert.ToDecimal(sumOfXY);
decimal sumOfXX_d = Convert.ToDecimal(sumOfXX);
decimal itemsInSet_d = Convert.ToDecimal(itemsInSet);
decimal slope =
((itemsInSet_d * sumOfXY) - (sumOfX_d * sumOfY_d))
/
((itemsInSet_d * sumOfXX_d) - (sumOfX_d * sumOfX_d));
slopes.AddLast(Convert.ToDouble(slope));
}
}
double mean = slopes.Average();
double sumOfSquares = slopes.Sum(d => Math.Pow(d - mean, 2));
double stddev = Math.Sqrt(sumOfSquares / (slopes.Count - 1));
// select items within 1 standard deviation of the mean
var testSample = slopes.Where(x => Math.Abs(x - mean) <= stddev);
return ToDegrees(testSample.Average());
}
class BrightnessWrapper
{
byte[] rgbValues;
int stride;
public int Height { get; private set; }
public int Width { get; private set; }
public BrightnessWrapper(Bitmap bmp)
{
Rectangle rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
System.Drawing.Imaging.BitmapData bmpData =
bmp.LockBits(rect,
System.Drawing.Imaging.ImageLockMode.ReadOnly,
bmp.PixelFormat);
IntPtr ptr = bmpData.Scan0;
int bytes = bmpData.Stride * bmp.Height;
this.rgbValues = new byte[bytes];
System.Runtime.InteropServices.Marshal.Copy(ptr,
rgbValues, 0, bytes);
this.Height = bmp.Height;
this.Width = bmp.Width;
this.stride = bmpData.Stride;
}
public int GetBrightness(int x, int y)
{
int position = (y * this.stride) + (x * 3);
int b = rgbValues[position];
int g = rgbValues[position + 1];
int r = rgbValues[position + 2];
return (r + r + b + g + g + g) / 6;
}
}
The code is good, but not great. Large amounts of whitespace cause the program to draw relatively flat line, resulting in a slope near 0, causing the code to underestimate the actual tilt of the image.
There is no appreciable difference in the accuracy of the tilt by selecting random sample points vs sampling all points, because the ratio of "flat" paths selected by random sampling is the same as the ratio of "flat" paths in the entire image.
GetPixel is slow. You can get an order of magnitude speed up using the approach listed here.
If text is left (right) aligned you can determine the slope by measuring the distance between the left (right) edge of the image and the first dark pixel in two random places and calculate the slope from that. Additional measurements would lower the error while taking additional time.
First I must say I like the idea. But I've never had to do this before and I'm not sure what all to suggest to improve reliability. The first thing I can think of this is this idea of throwing out statistical anomalies. If the slope suddenly changes sharply then you know you've found a white section of the image that dips into the edge skewing (no pun intended) your results. So you'd want to throw that stuff out somehow.
But from a performance standpoint there are a number of optimizations you could make which may add up.
Namely, I'd change this snippet from your inner loop from this:
Color center = image.GetPixel(x, end_y);
Color above = image.GetPixel(x, above_y);
Color below = image.GetPixel(x, below_y);
if (IsWhite(center)) { /* no change to end_y */ }
else if (IsWhite(above) && IsBlack(below)) { end_y = above_y; }
else if (IsBlack(above) && IsWhite(below)) { end_y = below_y; }
To this:
Color center = image.GetPixel(x, end_y);
if (IsWhite(center)) { /* no change to end_y */ }
else
{
Color above = image.GetPixel(x, above_y);
Color below = image.GetPixel(x, below_y);
if (IsWhite(above) && IsBlack(below)) { end_y = above_y; }
else if (IsBlack(above) && IsWhite(below)) { end_y = below_y; }
}
It's the same effect but should drastically reduce the number of calls to GetPixel.
Also consider putting the values that don't change into variables before the madness begins. Things like image.Height and image.Width have a slight overhead every time you call them. So store those values in your own variables before the loops begin. The thing I always tell myself when dealing with nested loops is to optimize everything inside the most inner loop at the expense of everything else.
Also... as Vinko Vrsalovic suggested, you may look at his GetPixel alternative for yet another boost in speed.
At first glance, your code looks overly naive.
Which explains why it doesn't always work.
I like the approach Steve Wortham suggested,
but it might run into problems if you have background images.
Another approach that often helps with images is to blur them first.
If you blur your example image enough, each line of text will end up
as a blurry smooth line. You then apply some sort of algorithm to
basically do a regression analisys. There's lots of ways to do
that, and lots of examples on the net.
Edge detection might be useful, or it might cause more problems that its worth.
By the way, a gaussian blur can be implemented very efficiently if you search hard enough for the code. Otherwise, I'm sure there's lots of libraries available.
Haven't done much of that lately so don't have any links on hand.
But a search for Image Processing library will get you good results.
I'm assuming you're enjoying the fun of solving this, so not much in actual implementation detalis here.
Measuring the angle of every line seems like overkill, especially given the performance of GetPixel.
I wonder if you would have better performance luck by looking for a white triangle in the upper-left or upper-right corner (depending on the slant direction) and measuring the angle of the hypotenuse. All text should follow the same angle on the page, and the upper-left corner of a page won't get tricked by the descenders or whitespace of content above it.
Another tip to consider: rather than blurring, work within a greatly-reduced resolution. That will give you both the smoother data you need, and fewer GetPixel calls.
For example, I made a blank page detection routine once in .NET for faxed TIFF files that simply resampled the entire page to a single pixel and tested the value for a threshold value of white.
What are your constraints in terms of time?
The Hough transform is a very effective mechanism for determining the skew angle of an image. It can be costly in time, but if you're going to use Gaussian blur, you're already burning a pile of CPU time. There are also other ways to accelerate the Hough transform that involve creative image sampling.
Your latest output is confusing me a little.
When you superimposed the blue lines on the source image, did you offset it a bit? It looks like the blue lines are about 5 pixels above the centre of the text.
Not sure about that offset, but you definitely have a problem with the derived line "drifting" away at the wrong angle. It seems to have too strong a bias towards producing a horizontal line.
I wonder if increasing your mask window from 3 pixels (centre, one above, one below) to 5 might improve this (two above, two below). You'll also get this effect if you follow richardtallent's suggestion and resample the image smaller.
Very cool path finding application.
I wonder if this other approach would help or hurt with your particular data set.
Assume a black and white image:
Project all black pixels to the right (EAST). This should give a result of a one dimensional array with a size of IMAGE_HEIGHT. Call the array CANVAS.
As you project all the pixels EAST, keep track numerically of how many pixels project into each bin of CANVAS.
Rotate the image an arbitrary number of degrees and re-project.
Pick the result that gives the highest peaks and lowest valleys for values in CANVAS.
I imagine this will not work well if in fact you have to account for a real -45 -> +45 degrees of tilt. If the actual number is smaller(?+/- 10 degrees), this might be a pretty good strategy. Once you have an intial result, you could consider re-running with a smaller increment of degrees to fine tune the answer. I might therefore try to write this with a function that accepted a float degree_tick as a parm so I could run both a coarse and fine pass (or a spectrum of coarseness or fineness) with the same code.
This might be computationally expensive. To optimize, you might consider selecting just a portion of the image to project-test-rotate-repeat on.