I'm looking for a way to calculate a rectangle (x, y, width & height) that can be used to crop an image around the coordinates of a selected face.
I have a 995x1000 image (https://tourspider.blob.core.windows.net/img/artists/original/947a0903-9b64-42a1-8179-108bab2a9e46.jpg) in which the center of the face is located at 492x325. I can find this information using various services, so even for multiple faces in an image I'm able to find the most prominent one - hence a single coordinate.
Now I need to make variously sized cropped images from the source image (200x150, 200x200 & 750x250). I can't seem to work out how best to calculate a rectangle around the center coordinates while taking the edges of the image into account. The face should be as central as possible in the crop.
Even after experimenting with various services (https://www.microsoft.com/cognitive-services/en-us/computer-vision-api) the results are pretty poor, as the face, mainly in the 750x250 crop, is sometimes not even present.
I'm also experimenting with the ImageProcessor (http://imageprocessor.org/) library, which lets you use anchors for resizing, but I can't get the desired result.
Does anybody have an idea on how best to crop around predefined coordinates?
Using ImageProcessor I created the following solution. It is not yet perfect but goes a long way ;)
public static void StoreImage(byte[] image, int destinationWidth, int destinationHeight, Point anchor)
{
    using (var inStream = new MemoryStream(image))
    using (var imageFactory = new ImageFactory())
    {
        // Load the image in the image factory.
        imageFactory.Load(inStream);
        var originalSourceWidth = imageFactory.Image.Width;
        var originalSourceHeight = imageFactory.Image.Height;
        if (anchor.X > originalSourceWidth || anchor.Y > originalSourceHeight)
        {
            throw new Exception($"Invalid anchor point. Image: {originalSourceWidth}x{originalSourceHeight}. Anchor: {anchor.X}x{anchor.Y}.");
        }
        // Resize the image until the shortest side reaches the given dimension.
        // This maintains the aspect ratio of the original image.
        imageFactory.Resize(new ResizeLayer(new Size(destinationWidth, destinationHeight), ResizeMode.Min));
        var resizedSourceWidth = imageFactory.Image.Width;
        var resizedSourceHeight = imageFactory.Image.Height;
        // Adjust the anchor position to the resized image.
        // Use floating-point scaling to avoid integer-division truncation.
        var resizedAnchorX = (int)(anchor.X * ((double)resizedSourceWidth / originalSourceWidth));
        var resizedAnchorY = (int)(anchor.Y * ((double)resizedSourceHeight / originalSourceHeight));
        // Center the crop on the anchor, then clamp it to the image edges.
        var cropX = resizedAnchorX - destinationWidth / 2;
        if (cropX < 0)
            cropX = 0;
        if (cropX > resizedSourceWidth - destinationWidth)
            cropX = resizedSourceWidth - destinationWidth;
        var cropY = resizedAnchorY - destinationHeight / 2;
        if (cropY < 0)
            cropY = 0;
        if (cropY > resizedSourceHeight - destinationHeight)
            cropY = resizedSourceHeight - destinationHeight;
        imageFactory
            .Crop(new Rectangle(cropX, cropY, destinationWidth, destinationHeight))
            .Save($"{Guid.NewGuid()}.jpg");
    }
}
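For reference, the clamping of the crop origin to the image edges can also be written as a small standalone helper, independent of ImageProcessor (a sketch; the method name is illustrative):

// Sketch: compute a crop rectangle of the requested size that keeps the anchor
// as centered as possible while staying inside the source bounds.
private static Rectangle GetCropRectangle(int sourceWidth, int sourceHeight, int cropWidth, int cropHeight, Point anchor)
{
    var x = anchor.X - cropWidth / 2;
    var y = anchor.Y - cropHeight / 2;
    // Clamp so the rectangle never leaves the source image.
    x = Math.Max(0, Math.Min(x, sourceWidth - cropWidth));
    y = Math.Max(0, Math.Min(y, sourceHeight - cropHeight));
    return new Rectangle(x, y, cropWidth, cropHeight);
}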
Related
Hi, I'm struggling with an algorithm that can effectively extract image bounding boxes and arrows from a rasterized document. The arrows and images can change in size, shape, and color. The arrows may not always be arrows but rather plain lines. The images may be just an outline or a full-color picture. This is the code I wrote so far, and it kinda works but not always. I based this algorithm on the excellent paper "Image to CAD: Feature Extraction and Translation of Raster Image of CAD Drawing to DXF CAD Format" by Aditya Intwala.
This is the original image:
For detecting the arrowheads
using var kernel1 = Cv2.GetStructuringElement(MorphShapes.Rect, new Size(2, 2));
using var binary = grayImage.Threshold(0, 255, ThresholdTypes.Binary | ThresholdTypes.Otsu);
using var invertedBinary = grayImage.Threshold(0, 255, ThresholdTypes.BinaryInv | ThresholdTypes.Otsu);
using var blackHatResult = binary.MorphologyEx(MorphTypes.BlackHat, kernel1);
using var solidArrowHeads = invertedBinary - blackHatResult;
using var foundArrowHeads = solidArrowHeads.ToMat();
using var steKernel = Cv2.GetStructuringElement(MorphShapes.Rect, new Size(3, 3));
using var eroded = foundArrowHeads.Erode(steKernel);
using var dialated = eroded.Dilate(steKernel);
dialated.FindContours(out var arrowHeadContours, out var hierarchy, RetrievalModes.External,
ContourApproximationModes.ApproxSimple);
After finding arrowheads I'm doing the following to find the lines that intersect the bounding boxes of the arrowheads; these intersecting lines are what I'm classifying as arrows (a rough sketch of this step is shown below). It does work, but I get a lot of false positives and would like to know how to improve my algorithm or whether there's a better way. Next I'm erasing the arrows by masking them and then using that image for the next step.
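For that line step (the sketch below is mine, not the original code), one way to do it with OpenCvSharp is HoughLinesP plus a bounding-box test; the thresholds are guesses and it assumes using System.Linq and the variables from the snippet above:

// Sketch: detect line segments, then keep the ones that end inside an arrowhead box.
var arrowHeadBoxes = arrowHeadContours.Select(c => Cv2.BoundingRect(c)).ToArray();
// rho = 1 px, theta = 1 degree; threshold, min length and max gap are tunable guesses.
var segments = Cv2.HoughLinesP(invertedBinary, 1, Math.PI / 180, 50, 30, 5);
var arrows = new List<LineSegmentPoint>();
foreach (var segment in segments)
{
    if (arrowHeadBoxes.Any(box => box.Contains(segment.P1) || box.Contains(segment.P2)))
        arrows.Add(segment);
}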
To find the boundaries of the images I've written the following code
using var se1 = Cv2.GetStructuringElement(MorphShapes.Rect, new Size(70, 1));
using var closedImg = grayImage.MorphologyEx(MorphTypes.Close, se1);
Cv2.BitwiseAnd(grayImage, closedImg, grayImage);
using var structuringElement = Cv2.GetStructuringElement(MorphShapes.Rect, new Size(15, 15));
using var blurred = grayImage.MorphologyEx(MorphTypes.Gradient, structuringElement);
using var inverted = blurred.Threshold(197, 255, ThresholdTypes.Binary);
//Lets get bounding boxes for all large contours
Cv2.FindContours(inverted, out var contours, out var hierarchyIndices, RetrievalModes.External,
ContourApproximationModes.ApproxSimple);
for (var i = 0; (i >= 0) && (i < hierarchyIndices.Length); i = hierarchyIndices[i].Next)
{
CT_Assert.True(hierarchyIndices[i].Parent == -1, "Must be a top level contour ?");
var rect = Cv2.BoundingRect(contours[i]);
// save this rect as possible bounding box
}
This again kinda works but not always.
Final output image: black bounding boxes are detected arrowheads, red lines are arrows, and blue boxes are detected image bounding boxes.
In the app I'm trying to develop, a key part is getting the position where the user has touched. At first I thought of using a tap gesture recognizer, but after a quick Google search I learned that it was useless for this (see here for an example).
Then I discovered SkiaSharp, and after learning how to use it, at least somewhat, I'm still not sure how to get the proper coordinates of a touch. Here are the sections of code in my project that are relevant to the problem.
Canvas Touch Function
private void canvasView_Touch(object sender, SKTouchEventArgs e)
{
    // Only carry on with this function if the image is already on screen.
    if (m_isImageDisplayed)
    {
        // Use switch to get what type of action occurred.
        switch (e.ActionType)
        {
            case SKTouchAction.Pressed:
                TouchImage(e.Location);
                // Update simply tries to draw a small square using double for loops.
                m_editedBm = Update(sender);
                // Refresh screen.
                (sender as SKCanvasView).InvalidateSurface();
                break;
            default:
                break;
        }
    }
}
Touch Image
private void TouchImage(SKPoint point)
{
    // Is the point in range of the canvas?
    if (point.X >= m_x && point.X <= (m_editedCanvasSize.Width + m_x) &&
        point.Y >= m_y && point.Y <= (m_editedCanvasSize.Height + m_y))
    {
        // Save the point for later and set the boolean to true so the algorithm can begin.
        m_clickPoint = point;
        m_updateAlgorithm = true;
    }
}
Here I'm just seeing, or TRYING to see, whether the point clicked was in range of the image; I made a separate SKSize variable to help with that. Ignore the boolean, it's not that important.
Update function (function that attempts to draw ON the point pressed so it's the most important)
public SKBitmap Update(object sender)
{
    // Create the default test color to replace current pixel colors in the bitmap.
    SKColor color = new SKColor(255, 255, 255);

    // Create a new surface with the current bitmap.
    using (var surface = new SKCanvas(m_editedBm))
    {
        /* According to this: https://learn.microsoft.com/en-us/xamarin/xamarin-forms/user-interface/graphics/skiasharp/paths/finger-paint ,
           the points I have to start with are in Xamarin.Forms coordinates, but I need to translate them to SkiaSharp coordinates, which are
           in pixels. */
        Point pt = new Point((double)m_touchPoint.X, (double)m_touchPoint.Y);
        SKPoint newPoint = ConvertToPixel(pt);

        // Loop from the touch point up to a certain value (like x + 200) just to get a "block" of altered pixels.
        for (int x = (int)newPoint.X; x < (int)newPoint.X + 200; ++x)
        {
            for (int y = (int)newPoint.Y; y < (int)newPoint.Y + 200; ++y)
            {
                // According to the x and y, change the color.
                m_editedBm.SetPixel(x, y, color);
            }
        }

        return m_editedBm;
    }
}
Here I'm THINKING that it'll start, you know, at the coordinate I pressed (and these coordinates have been confirmed to be within the range of the image thanks to the TouchImage function). And when it does get the correct coordinates (or at least it SHOULD have done that), the square will be drawn one "line" at a time. I have a game programming background, so this sounds simple, but I can't believe I didn't get it right the first time.
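For reference, the conversion described in that comment is done in the linked finger-paint article by scaling with the ratio of the canvas pixel size to the view's device-independent size; a sketch of what a ConvertToPixel helper might look like (assuming canvasView is the SKCanvasView field):

// Sketch: map a Xamarin.Forms point (device-independent units) to SkiaSharp pixels.
SKPoint ConvertToPixel(Point pt)
{
    return new SKPoint(
        (float)(canvasView.CanvasSize.Width * pt.X / canvasView.Width),
        (float)(canvasView.CanvasSize.Height * pt.Y / canvasView.Height));
}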
Also, I have another function that MIGHT prove relevant, because the original image is rotated before being put on screen. Why? By default the image, after being captured and displayed, is rotated to the left. I have no idea why, but I corrected it with the following function:
// Just rotate the image because for some reason it's tilted 90 degrees to the left.
public static SKBitmap Rotate()
{
    using (var bitmap = m_bm)
    {
        // The new one's width IS the old one's height.
        var rotated = new SKBitmap(bitmap.Height, bitmap.Width);

        using (var surface = new SKCanvas(rotated))
        {
            surface.Translate(rotated.Width, 0.0f);
            surface.RotateDegrees(90);
            surface.DrawBitmap(bitmap, 0, 0);
        }

        return rotated;
    }
}
I'll keep reading and looking up stuff on what I'm doing wrong, but if any help is given I'm grateful.
I have an MVC C# application that includes a .NET wrapper for the tesseract-ocr NuGet package. The current version I am using is v4.1.0-beta1. The image that I am trying to scan is shown below.
My aim is to extract the player name and the number just above them to the left.
I tried making the OCR scan the field/pitch area, but the results are way off base. So I decided to section off all player names and all numbers, as seen in the image below: rating areas are marked in blue and player names in red. As you can see, the name and rating are always the same distance apart.
My current code setup is shown below.
public void Get(HttpPostedFileBase file)
{
    using (var engine = new TesseractEngine(Path.Combine(HttpRuntime.AppDomainAppPath, "tessdata"), "eng+deu", EngineMode.Default))
    {
        var bitmap = (Bitmap)Image.FromStream(file.InputStream, true, true);
        using (var img = PixConverter.ToPix(bitmap))
        {
            SetPlayerRatings(engine, img);
        }
    }
}

private void SetPlayerRatings(TesseractEngine engine, Pix img)
{
    var width = 285;
    var height = 76;
    var textPositions = Service.Get<Formation>(this.FormationId).TextPositions.ToList();
    foreach (var textPosition in textPositions)
    {
        var playerRating = GetPlayerData(engine, img, new Rect(textPosition.X, textPosition.Y, width, height));
    }
}
private static PlayerRating GetPlayerData(TesseractEngine engine, Pix img, Rect region)
{
    string playerName;
    using (var page = engine.Process(img, region, PageSegMode.Auto))
    {
        playerName = page.GetText();
    }
    // The rating sits a fixed distance above the name, so offset the region upwards.
    var ratingRegion = new Rect(region.X1, region.Y1 - 52, 80, 50);
    string playerRating;
    using (var page = engine.Process(img, ratingRegion, PageSegMode.Auto))
    {
        playerRating = page.GetText();
    }
    // The original snippet omitted the return; the PlayerRating property names are assumed.
    return new PlayerRating { Name = playerName, Rating = playerRating };
}
This code is producing the correct results for the 1st image.
Is there any way to train the OCR so that I don't have to work out the X and Y coordinates for each player position? I would like to just specify the area of the pitch and have the OCR read the rating followed by the player name.
By specifying coordinates you have already solved several image-processing problems. So if you do not want to specify coordinates, you have to deal with those problems yourself, e.g. removing graphic elements such as the shirts and lines from the OCR area.
Another idea: the Tesseract API has a GetComponentImages option (I expect the C# wrapper provides it too; I am not familiar with C#), so you can iterate over the components it finds.
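In the .NET wrapper, a similar idea can be expressed with the result iterator. A rough sketch (assuming the ResultIterator API of the Tesseract NuGet package) that walks the recognized words and their bounding boxes instead of hard-coded regions:

// Sketch: process the whole image once, then iterate over recognized words.
using (var page = engine.Process(img, PageSegMode.Auto))
using (var iter = page.GetIterator())
{
    iter.Begin();
    do
    {
        var word = iter.GetText(PageIteratorLevel.Word);
        if (iter.TryGetBoundingBox(PageIteratorLevel.Word, out var bounds))
        {
            // bounds gives the word's position on the pitch; a rating could then be
            // matched to the name found just below it instead of using fixed offsets.
        }
    } while (iter.Next(PageIteratorLevel.Word));
}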
I'm trying to build an application that solves a puzzle (trying to develop a graph algorithm), and I don't want to enter sample input by hand all the time.
Edit: I'm not trying to build a game. I'm trying to build an agent which plays the game "SpellSeeker"
Say I have an image (see attachment) on the screen with numbers in it, and I know the locations of the boxes, and I have the exact images for these numbers. What I want to do is simply tell which image (number) is on the corresponding box.
So I guess I need to implement
bool isImageInsideImage(Bitmap numberImage,Bitmap Portion_Of_ScreenCap) or something like that.
What I've tried is (using AForge libraries)
public static bool Contains(this Bitmap template, Bitmap bmp)
{
    const Int32 divisor = 4;
    const Int32 epsilon = 10;

    ExhaustiveTemplateMatching etm = new ExhaustiveTemplateMatching(0.9f);

    TemplateMatch[] tm = etm.ProcessImage(
        new ResizeNearestNeighbor(template.Width / divisor, template.Height / divisor).Apply(template),
        new ResizeNearestNeighbor(bmp.Width / divisor, bmp.Height / divisor).Apply(bmp)
        );

    if (tm.Length == 1)
    {
        Rectangle tempRect = tm[0].Rectangle;

        if (Math.Abs(bmp.Width / divisor - tempRect.Width) < epsilon
            &&
            Math.Abs(bmp.Height / divisor - tempRect.Height) < epsilon)
        {
            return true;
        }
    }

    return false;
}
But it returns false when searching for a black dot in this image.
How can I implement this?
I'm answering my question since I've found the solution:
this worked out for me:
System.Drawing.Bitmap sourceImage = (Bitmap)Bitmap.FromFile(@"C:\SavedBMPs\1.jpg");
System.Drawing.Bitmap template = (Bitmap)Bitmap.FromFile(@"C:\SavedBMPs\2.jpg");
// create template matching algorithm's instance
// (set similarity threshold to 92.1%)
ExhaustiveTemplateMatching tm = new ExhaustiveTemplateMatching(0.921f);
// find all matchings with specified above similarity
TemplateMatch[] matchings = tm.ProcessImage(sourceImage, template);
// highlight found matchings
BitmapData data = sourceImage.LockBits(
    new Rectangle(0, 0, sourceImage.Width, sourceImage.Height),
    ImageLockMode.ReadWrite, sourceImage.PixelFormat);
foreach (TemplateMatch m in matchings)
{
    Drawing.Rectangle(data, m.Rectangle, Color.White);
    MessageBox.Show(m.Rectangle.Location.ToString());
    // do something else with matching
}
sourceImage.UnlockBits(data);
The only problem was that it was finding all 58 boxes for said game. But changing the value from 0.921f to 0.98f made it perfect, i.e. it finds only the specified number's image (template).
Edit: I actually have to use different similarity thresholds for different pictures. I found the optimized values by trial and error; in the end I have a function like
float getSimilarityThreshold(int number)
A better approach is to build a custom class which holds all the information you need instead of relying on the image itself.
For example:
public class MyTile
{
    public Bitmap TileBitmap;
    public Location CurrentPosition;
    public int Value;
}
This way you can "move around" the tile class and read the value from the Value field instead of analyzing the image. You just draw whatever image the class holds at the position it's currently holding.
Your tiles can be held in a list like:
private List<MyTile> MyTiles = new List<MyTile>();
Extend class as needed (and remember to Dispose those images when they are no longer needed).
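For example, drawing the tiles and reading their values might look like this (a sketch; it assumes CurrentPosition exposes X and Y, and that g is a Graphics instance from a Paint handler):

// Sketch: draw each tile at its current position and use Value directly.
foreach (var tile in MyTiles)
{
    g.DrawImage(tile.TileBitmap, tile.CurrentPosition.X, tile.CurrentPosition.Y);
    // Puzzle logic reads tile.Value instead of re-recognizing the bitmap.
}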
If you really want to see whether there is an image inside the image, you can check out this extension I wrote for another post (although it's in VB code):
Vb.Net Check If Image Existing In Another Image
In my site, users can upload photos. I currently compress and resize the photo to make sure they aren't huge files. But this creates photos that are of varying dimensions... which makes the site a bit "ugly" in my opinion.
I'd like to ensure the thumbnails are square images, but not by using padding. It's ok if there is some loss of the photo in the thumbnail. I'd like to keep the fidelity of the photo high, even if it means some cropping needs to occur.
I wrote some code to do this exact thing. I chose to crop because resizing without preserving the aspect ratio looks pretty horrible. I do a crop and then a resize to create the thumbnail image:
public Bitmap CreateThumbnail(Bitmap RawImage)
{
    int width = RawImage.Width;
    int height = RawImage.Height;

    Size ThumbnailDimensions = new Size();
    ThumbnailDimensions.Width = 100;
    ThumbnailDimensions.Height = 100;

    Rectangle cropArea = new Rectangle();
    if (width > height)
    {
        cropArea.Width = height;
        cropArea.Height = height;
        cropArea.X = 0;
        cropArea.Y = 0;
    }
    else if (width < height)
    {
        cropArea.Width = width;
        cropArea.Height = width;
        cropArea.X = 0;
        cropArea.Y = 0;
    }

    // Only crop when the image isn't already square; declare the thumbnail outside
    // the if so it is in scope for the resize below.
    Bitmap thumbnail = RawImage;
    if (width != height)
        thumbnail = CropImage(RawImage, cropArea);
    thumbnail = ResizeImage(thumbnail, ThumbnailDimensions);

    return thumbnail;
}
This just crops from the top left corner then resizes it to my thumbnail dimensions.
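CropImage and ResizeImage aren't shown above; a minimal System.Drawing sketch of what they might look like (signatures inferred from the call sites) is:

// Sketch: crop a region out of the source bitmap.
private static Bitmap CropImage(Bitmap source, Rectangle cropArea)
{
    return source.Clone(cropArea, source.PixelFormat);
}

// Sketch: resize a bitmap to the given dimensions with high-quality interpolation.
private static Bitmap ResizeImage(Bitmap source, Size size)
{
    var resized = new Bitmap(size.Width, size.Height);
    using (var g = Graphics.FromImage(resized))
    {
        g.InterpolationMode = System.Drawing.Drawing2D.InterpolationMode.HighQualityBicubic;
        g.DrawImage(source, 0, 0, size.Width, size.Height);
    }
    return resized;
}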
I would imagine you need to take the shortest dimension (either width or height) and use that as your dimension for creating the cropped image; essentially you can crop and then scale the image. Check out this article as an example of cropping an image. Also check out this Stack Overflow question regarding image quality.
Rather than cropping, I would make the div or whatever you put them in a fixed square size. Scale the image to fit inside that square.
How would you decide to crop it? From the top-left? Bottom-right? Center?
To make a rectangle into a square you need to either pad, resize without preserving the aspect ratio, or crop (or some combination of these).
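For example, a center crop to a square (one of the options above) only takes a few lines with System.Drawing; a sketch:

// Sketch: center-crop a bitmap to a square whose side is the shorter dimension.
public static Bitmap CenterCropToSquare(Bitmap source)
{
    int side = Math.Min(source.Width, source.Height);
    var cropArea = new Rectangle((source.Width - side) / 2, (source.Height - side) / 2, side, side);
    return source.Clone(cropArea, source.PixelFormat);
}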
Here's some code for cropping
http://snippets.dzone.com/posts/show/1484
(I work for Atalasoft.) In our DotImage Photo SDK, it's:
AtalaImage img = new AtalaImage("filename.jpg");
AtalaImage img2 = new CropCommand( /*point and size of crop */).Apply(img).Image;
img2.Save("filename", new JpegEncoder(quality), null);