Basically I'm trying to check whether a face is upside down in an image using this library: https://github.com/takuya-takeuchi/FaceRecognitionDotNet.
Take the example image below.
This is an image that is successfully detected using the FaceRecognition.Net library. The image is upside down. I have marked all face landmarks in the image with blue ellipses.
This is the approach I follow:
// Find the face parts
var faceparts = dparameters._FaceRecognition.FaceLandmark(dparameters.FCImage);
// Draw an ellipse over every point returned for the face parts
foreach (var facepart in faceparts) {
    foreach (var mypoint in facepart.Values) {
        foreach (var x in mypoint) {
            tempg.DrawEllipse(Pens.Blue, x.Point.X, x.Point.Y, 2, 2);
        }
    }
}
Now I'm checking whether the image is rotated by comparing the maximum Y coordinates of the lip points and the eye points:
var temp = faceparts.FirstOrDefault();
IEnumerable<FacePoint> lippoints;
temp.TryGetValue(FacePart.BottomLip, out lippoints);
IEnumerable<FacePoint> eyepoints;
temp.TryGetValue(FacePart.LeftEye, out eyepoints);
var lippoint = lippoints.Max(r => r.Point.Y);
var topeyepoint = eyepoints.Max(r => r.Point.Y);
// Declared outside the if/else so the flag is still usable afterwards
bool isinverted;
if (lippoint > topeyepoint) {
    isinverted = true;
} else {
    isinverted = false;
}
The issue is that even when the image is not upside down, the eye coordinate is less than the face coordinate. This is because a false face is detected, as outlined in the image. How do I get over this issue?
It looks like this library does not provide a confidence value for its results. Otherwise, I would suggest trying both the input and a flipped copy and taking the one with the higher confidence before doing the "eyes over mouth" check.
So maybe what could help is:
using the CNN model; in the original (Python) library it is called with
face_locations = face_recognition.face_locations(image, number_of_times_to_upsample=0, model="cnn")
in the C# port it should be
_FaceRecognition.FaceLocations(image, 0, Model.Cnn)
That should give you a more accurate face bounding box which you can then compare with the bounding box of the landmarks. If you do the same for a flipped copy of the image, you can "emulate" the confidence I mentioned earlier and assume the orientation where the boxes match better. Then you can identify the orientation by the "eyes over mouth" test.
as far as I can tell, the library does not provide pre-trained data, so in order to use the CNN model you need to train it yourself. Selection of the dataset for training is of course very important. If you have already performed the training, more/better training data might improve the accuracy.
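Putting those pieces together, a rough sketch of the flip-and-compare idea could look like the following. Note that FlipVertical is a hypothetical helper you would have to write yourself (FaceRecognitionDotNet does not ship one); the other names are reused from the question.
// Hedged sketch: FlipVertical() is a hypothetical helper returning an upside-down
// copy of the image; you would implement it yourself (e.g. via System.Drawing).
static bool LooksUpright(FaceRecognition recognition, Image image)
{
    var landmarks = recognition.FaceLandmark(image).FirstOrDefault();
    if (landmarks == null ||
        !landmarks.TryGetValue(FacePart.LeftEye, out var eyes) ||
        !landmarks.TryGetValue(FacePart.BottomLip, out var lip))
    {
        return false; // no usable face found in this orientation
    }
    // Image Y grows downward, so on an upright face the eyes sit above (smaller Y than) the bottom lip.
    return eyes.Max(p => p.Point.Y) < lip.Max(p => p.Point.Y);
}

// Prefer the orientation in which a detected face actually looks upright.
bool uprightAsIs = LooksUpright(dparameters._FaceRecognition, dparameters.FCImage);
bool uprightFlipped = LooksUpright(dparameters._FaceRecognition, FlipVertical(dparameters.FCImage));
bool isinverted = !uprightAsIs && uprightFlipped;
If you also compute FaceLocations with Model.Cnn for each orientation, you can additionally check which orientation's face box overlaps the landmark bounding box better, as described above.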
In the app I'm trying to develop, a key part is getting the position where the user has touched. At first I thought of using a tap gesture recognizer, but after a quick Google search I learned that was useless (see here for an example).
Then I discovered SkiaSharp and, after learning how to use it at least somewhat, I'm still not sure how to get the proper coordinates of a touch. Here are the sections of the code in my project that are relevant to the problem.
Canvas Touch Function
private void canvasView_Touch(object sender, SKTouchEventArgs e)
{
    // Only carry on with this function if the image is already on screen.
    if (m_isImageDisplayed)
    {
        // Use switch to get what type of action occurred.
        switch (e.ActionType)
        {
            case SKTouchAction.Pressed:
                TouchImage(e.Location);
                // Update simply tries to draw a small square using double for loops.
                m_editedBm = Update(sender);
                // Refresh screen.
                (sender as SKCanvasView).InvalidateSurface();
                break;
            default:
                break;
        }
    }
}
Touch Image
private void TouchImage(SKPoint point)
{
    // Is the point in range of the canvas?
    if (point.X >= m_x && point.X <= (m_editedCanvasSize.Width + m_x) &&
        point.Y >= m_y && point.Y <= (m_editedCanvasSize.Height + m_y))
    {
        // Save the point for later and set the boolean to true so the algorithm can begin.
        m_clickPoint = point;
        m_updateAlgorithm = true;
    }
}
Here I'm just checking, or TRYING to check, whether the point clicked was in range of the image, and I made a separate SKSize variable to help. Ignore the boolean; it's not that important.
Update function (the function that attempts to draw ON the point pressed, so it's the most important)
public SKBitmap Update(object sender)
{
    // Create the default test color to replace current pixel colors in the bitmap.
    SKColor color = new SKColor(255, 255, 255);
    // Create a new surface with the current bitmap.
    using (var surface = new SKCanvas(m_editedBm))
    {
        /* According to this: https://learn.microsoft.com/en-us/xamarin/xamarin-forms/user-interface/graphics/skiasharp/paths/finger-paint ,
           the points I have to start with are in Xamarin.Forms coordinates, but I need to translate them to SkiaSharp coordinates, which are in
           pixels. */
        Point pt = new Point((double)m_touchPoint.X, (double)m_touchPoint.Y);
        SKPoint newPoint = ConvertToPixel(pt);
        // Loop from the touch point up to a certain value (like x + 200) just to get a "block" of altered pixels.
        for (int x = (int)newPoint.X; x < (int)newPoint.X + 200.0f; ++x)
        {
            for (int y = (int)newPoint.Y; y < (int)newPoint.Y + 200.0f; ++y)
            {
                // According to the x and y, change the color.
                m_editedBm.SetPixel(x, y, color);
            }
        }
        return m_editedBm;
    }
}
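For completeness, ConvertToPixel is not shown above. A minimal sketch of it, following the scaling used in the linked finger-paint article and assuming the page holds its SKCanvasView in a field called canvasView (that field name is my assumption):
// Hedged sketch: scale a Xamarin.Forms point (device-independent units) into
// SkiaSharp pixel coordinates using the ratio between the canvas pixel size
// and the view's Forms size. `canvasView` is an assumed field name.
SKPoint ConvertToPixel(Point pt)
{
    return new SKPoint(
        (float)(canvasView.CanvasSize.Width * pt.X / canvasView.Width),
        (float)(canvasView.CanvasSize.Height * pt.Y / canvasView.Height));
}
Note that, depending on platform and SkiaSharp version, the Location delivered by SKTouchEventArgs may already be in pixel coordinates, in which case converting it again would shift the square; it is worth checking which coordinate space e.Location is actually in.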
Here I'm THINKING that it'll start, you know, at the coordinate I pressed (and these coordinates have been confirmed to be within the range of the image thanks to the function TouchImage). And when it does get the correct coordinates (or at least it SHOULD have done that), the square will be drawn one "line" at a time. I have a game programming background, so this kind of thing sounds simple, but I can't believe I didn't get it right the first time.
Also, I have another function that MIGHT prove worthwhile, because the original image is rotated before being put on screen. Why? Well, by default the image, after taking the picture and then displaying it, is rotated to the left. I have no idea why, but I corrected it with the following function:
// Just rotate the image because for some reason it's tilted 90 degrees to the left.
public static SKBitmap Rotate()
{
    using (var bitmap = m_bm)
    {
        // The new one's width IS the old one's height.
        var rotated = new SKBitmap(bitmap.Height, bitmap.Width);
        using (var surface = new SKCanvas(rotated))
        {
            surface.Translate(rotated.Width, 0.0f);
            surface.RotateDegrees(90);
            surface.DrawBitmap(bitmap, 0, 0);
        }
        return rotated;
    }
}
I'll keep reading and looking up stuff on what I'm doing wrong, but if any help is given I'm grateful.
Actually I'm using AForge.NET to recognize the suit and value from a game card.
Here is a snippet where I decide which suit/value is there:
public static CardTemplateIdentifier GetBestMatchingIdentifier(this List<CardTemplateIdentifier> templates, Bitmap bitmap)
{
    float maxSimilar = 0f;
    CardTemplateIdentifier result = null;
    foreach (var template in templates)
    {
        // Identify similarity between template and bmp
        ExhaustiveTemplateMatching tm = new ExhaustiveTemplateMatching(0);
        TemplateMatch[] matchings = tm.ProcessImage(bitmap, template.Sample);
        // If the currently tested template fits better than the best one so far,
        // set the value as the identified card value
        if (matchings.Length > 0 && matchings[0].Similarity > maxSimilar)
        {
            maxSimilar = matchings[0].Similarity;
            result = template;
        }
    }
    return result;
}
CardTemplateIdentifier contains a list of all possible cards and their comparison samples.
...
Now, when trying to recognize the images, I cannot rely on them being reliably recognized: sometimes the scaling of the images differs from the one I made the sample from, or the font is less thick; see the example below:
I think I'm going about detecting the values from the images the wrong way. Is there a better or more convenient way to solve this problem?
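One thing that might reduce the scale sensitivity is normalizing every candidate bitmap to the same size as the stored samples before matching. Here is a sketch using AForge's ResizeBilinear filter; the 32x32 sample size is only an assumption for illustration:
using AForge.Imaging.Filters;
using System.Drawing;

// Resize the candidate region to the size the template samples were captured at,
// so ExhaustiveTemplateMatching always compares images at a common scale.
public static Bitmap NormalizeForMatching(Bitmap source, int sampleWidth = 32, int sampleHeight = 32)
{
    var resize = new ResizeBilinear(sampleWidth, sampleHeight);
    return resize.Apply(source);
}
You would then call templates.GetBestMatchingIdentifier(NormalizeForMatching(bitmap)). Differences in font thickness may still need extra samples per card or a thresholding step.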
I'm looking for a way to calculate a rectangle (x, y, width & height) which can be used for cropping an image around the coordinates of a selected face.
I have a 995x1000 image (https://tourspider.blob.core.windows.net/img/artists/original/947a0903-9b64-42a1-8179-108bab2a9e46.jpg) in which the center of the face is located at 492x325. I can find this information using various services, so even for multiple faces in an image I'm able to find the most prominent one - hence a single coordinate.
Now I need to make variously sized cropped images from the source image (200x150, 200x200 & 750x250), but I can't seem to work out how best to calculate a rectangle around the center coordinates while taking into account the edges of the image. The face should be as central as possible in the crop.
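In other words, what I'm after is roughly the following, where the crop is centred on the face but clamped at the image edges (a library-agnostic sketch, assuming the source image is at least as large as the crop):
// Sketch: centre the crop on the face and clamp it to the image bounds.
static Rectangle CenteredCrop(int imageWidth, int imageHeight, int faceX, int faceY, int cropWidth, int cropHeight)
{
    int x = Math.Max(0, Math.Min(faceX - cropWidth / 2, imageWidth - cropWidth));
    int y = Math.Max(0, Math.Min(faceY - cropHeight / 2, imageHeight - cropHeight));
    return new Rectangle(x, y, cropWidth, cropHeight);
}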
Even after experimenting with various services (https://www.microsoft.com/cognitive-services/en-us/computer-vision-api), the results are pretty poor, as the face - mainly in the 750x250 crop - is sometimes not even present.
I'm also experimenting with the ImageProcessor library (http://imageprocessor.org/), with which you can use anchors for resizing, but I can't get the desired result.
Does anybody have an idea on how best to crop around predefined coordinates?
Using ImageProcessor I created the following solution. It is not yet perfect, but it goes a long way ;)
public static void StoreImage(byte[] image, int destinationWidth, int destinationHeight, Point anchor)
{
    using (var inStream = new MemoryStream(image))
    using (var imageFactory = new ImageFactory())
    {
        // Load the image in the image factory
        imageFactory.Load(inStream);
        var originalSourceWidth = imageFactory.Image.Width;
        var originalSourceHeight = imageFactory.Image.Height;

        if (anchor.X > originalSourceWidth || anchor.Y > originalSourceHeight)
        {
            throw new Exception($"Invalid anchor point. Image: {originalSourceWidth}x{originalSourceHeight}. Anchor: {anchor.X}x{anchor.Y}.");
        }

        // Resizes the image until the shortest side reaches the given dimension.
        // This will maintain the aspect ratio of the original image.
        imageFactory.Resize(new ResizeLayer(new Size(destinationWidth, destinationHeight), ResizeMode.Min));
        var resizedSourceWidth = imageFactory.Image.Width;
        var resizedSourceHeight = imageFactory.Image.Height;

        // Adjust the anchor position, scaling in floating point to avoid integer truncation
        var resizedAnchorX = (int)(anchor.X * ((double)resizedSourceWidth / originalSourceWidth));
        var resizedAnchorY = (int)(anchor.Y * ((double)resizedSourceHeight / originalSourceHeight));

        // Centre the crop on the anchor and clamp it to the resized image bounds
        var cropX = resizedAnchorX - destinationWidth / 2;
        cropX = Math.Max(0, Math.Min(cropX, resizedSourceWidth - destinationWidth));
        var cropY = resizedAnchorY - destinationHeight / 2;
        cropY = Math.Max(0, Math.Min(cropY, resizedSourceHeight - destinationHeight));

        imageFactory
            .Crop(new Rectangle(cropX, cropY, destinationWidth, destinationHeight))
            .Save($@"{Guid.NewGuid()}.jpg");
    }
}
I am making a program with the Kinect SDK where, when users are detected, the program draws a skeleton for them to follow. I recently saw a game advertised on my Xbox, Nike+ Kinect, and saw how it displays a copy of the character doing something else, like:
http://www.swaggerseek.com/wp-content/uploads/2012/06/fcb69__xboxkinect1.jpg
Or
http://www.swaggerseek.com/wp-content/uploads/2012/06/fcb69__xboxkinect.jpg
Can I create a point-cloud representation of only the person detected (not any of the background)? Thanks in advance!
EDIT
Using this site, I can create point clouds, but still can't crop around the body of the person.
You can do a very simple triangulation of the points.
Check this tutorial:
http://www.riemers.net/eng/Tutorials/XNA/Csharp/Series1/Terrain_basics.php
Check the result:
It doesn't look like they are displaying a complete point cloud, but rather a blue-shaded intensity map. This could be done with the depth image from the Kinect for Windows SDK.
What you are looking for is the player index. It is provided in the low-order bits of each pixel of the depth image. In order to get the player index bits, you also have to enable the skeletal stream in your initialization code.
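For reference, the initialization with both streams enabled looks roughly like this in the v1 SDK (a sketch; the handler name and depth format are placeholders, adapt them to the quickstart's code):
// Both the depth stream and the skeleton stream must be enabled; without
// skeleton tracking the player index bits in the depth pixels stay 0.
KinectSensor sensor = KinectSensor.KinectSensors[0];
sensor.DepthStream.Enable(DepthImageFormat.Resolution320x240Fps30);
sensor.SkeletonStream.Enable();
sensor.DepthFrameReady += SensorDepthFrameReady; // placeholder handler name
sensor.Start();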
So this is how I would do it. I am modifying one of the Kinect for Windows SDK quickstarts found here; load it up and make the following changes:
//Change image type to BGRA32
image1.Source =
BitmapSource.Create(depthFrame.Width, depthFrame.Height,
96, 96, PixelFormats.Bgra32, null, pixels, stride);
//hardcoded locations to Blue, Green, Red, Alpha (BGRA) index positions
const int BlueIndex = 0;
const int GreenIndex = 1;
const int RedIndex = 2;
const int AlphaIndex = 3;
//get player and depth at pixel
int player = rawDepthData[depthIndex] & DepthImageFrame.PlayerIndexBitmask;
int depth = rawDepthData[depthIndex] >> DepthImageFrame.PlayerIndexBitmaskWidth;
//check each pixel for the player; if it is a player pixel, use blue intensity
if (player > 0)
{
    pixels[colorIndex + BlueIndex] = 255;
    pixels[colorIndex + GreenIndex] = intensity;
    pixels[colorIndex + RedIndex] = intensity;
    pixels[colorIndex + AlphaIndex] = 100;
}
else
{
    //if not a player, make it black and transparent
    pixels[colorIndex + BlueIndex] = 000;
    pixels[colorIndex + GreenIndex] = 000;
    pixels[colorIndex + RedIndex] = 000;
    pixels[colorIndex + AlphaIndex] = 0;
}
I like using this example for testing the colors since it still provides you with the depth viewer on the right side. I have attached an image of this effect running below:
The image to the left is the intensity map with slightly colored pixel level intensity data.
Hope that helps
David Bates
This is not possible automatically with the official Kinect SDK, but it is implemented in an alternative SDK called OpenNI, where you can just get the set of points the user consists of. If you don't want to use it, I can suggest a rather easy method of separating the user from the background: since you know the z-position of the user, you can just take the points whose z is between 0 and userZ + some value representing the thickness of the body.
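A sketch of that depth-threshold idea, with an illustrative point type (not a type from either SDK), where userZ comes from a skeleton joint and bodyThickness is a rough guess:
using System.Collections.Generic;
using System.Linq;

public struct Point3D { public float X, Y, Z; }

// Keep only points whose depth lies between the sensor and the estimated back
// of the user's body (userZ from a skeleton joint plus an assumed body thickness).
public static IEnumerable<Point3D> FilterUserPoints(IEnumerable<Point3D> cloud, float userZ, float bodyThickness = 0.3f)
{
    return cloud.Where(p => p.Z > 0f && p.Z <= userZ + bodyThickness);
}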
Another idea is to walk over the point cloud starting from some joint (or joints) and take points only while the distance changes smoothly, because if you step from a body-border point to a background point the jump in distance will be easily noticeable. The problem here is that you will start counting the floor as part of the body, because the transition there is smooth, so you should validate it using the lowest (ankle) joint.
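A sketch of that region-growing idea over a raw depth map (illustrative only; the seed pixel would come from projecting a joint into depth space, and maxStep is a guessed tolerance in millimetres):
using System;
using System.Collections.Generic;

// Grow outward from the seed pixel, accepting neighbours only while the depth
// changes smoothly; a jump to the background stops the growth at the silhouette.
public static bool[,] GrowFromJoint(short[,] depth, int seedX, int seedY, short maxStep = 50)
{
    int w = depth.GetLength(0), h = depth.GetLength(1);
    var inBody = new bool[w, h];
    var queue = new Queue<(int X, int Y)>();
    inBody[seedX, seedY] = true;
    queue.Enqueue((seedX, seedY));

    while (queue.Count > 0)
    {
        var (x, y) = queue.Dequeue();
        foreach (var (nx, ny) in new[] { (x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1) })
        {
            if (nx < 0 || ny < 0 || nx >= w || ny >= h || inBody[nx, ny]) continue;
            if (Math.Abs(depth[nx, ny] - depth[x, y]) <= maxStep)
            {
                inBody[nx, ny] = true;
                queue.Enqueue((nx, ny));
            }
        }
    }
    return inBody;
}
As noted above, you would still have to cut the floor off, for example by rejecting points below the ankle joints.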
Or you can use segmentation in PCL (http://docs.pointclouds.org/trunk/group__segmentation.html), but I don't know whether the feet-floor problem is solved there. It looks like they handle it well (http://pointclouds.org/documentation/tutorials/planar_segmentation.php).
Kinect for Windows SDK v1.5 has a sample that could be modified for this.
Sample names: depth-d3d or depthwithcolor-d3d.
They both do point clouds.
I am creating a Minecraft clone, and whenever I move the camera even a little bit fast there is a big tear between the chunks, as shown here:
Each chunk is 32x32x32 cubes and has a single vertex buffer for each kind of cube, in case it matters. I am drawing 2D text on the screen as well, and I learned that I had to set the graphics device state for each kind of drawing. Here is how I'm drawing the cubes:
GraphicsDevice.Clear(Color.LightSkyBlue);

#region 3D
// Set the device
device.BlendState = BlendState.Opaque;
device.DepthStencilState = DepthStencilState.Default;
device.RasterizerState = RasterizerState.CullCounterClockwise;

// Go through each shader and draw the cubes of that style
lock (GeneratedChunks)
{
    foreach (KeyValuePair<CubeType, BasicEffect> KVP in CubeType_Effect)
    {
        // Iterate through each technique in this effect
        foreach (EffectPass pass in KVP.Value.CurrentTechnique.Passes)
        {
            // Go through each chunk in our chunk map, and pluck out the cubetype we care about
            foreach (Vector3 ChunkKey in GeneratedChunks)
            {
                if (ChunkMap[ChunkKey].CubeType_TriangleCounts[KVP.Key] > 0)
                {
                    pass.Apply(); // assign it to the video card
                    KVP.Value.View = camera.ViewMatrix;
                    KVP.Value.Projection = camera.ProjectionMatrix;
                    KVP.Value.World = worldMatrix;
                    device.SetVertexBuffer(ChunkMap[ChunkKey].CubeType_VertexBuffers[KVP.Key]);
                    device.DrawPrimitives(PrimitiveType.TriangleList, 0, ChunkMap[ChunkKey].CubeType_TriangleCounts[KVP.Key]);
                }
            }
        }
    }
}
#endregion
The world looks fine if I'm standing still. I thought this might be because I'm in windowed mode, but when I toggled full screen the problem persisted. I also assume that XNA is double-buffered by itself? Or so Google has told me.
I had a similar issue - I found that I had to call pass.Apply() after setting all of the Effect's parameters...
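In terms of the loop in the question, that means setting the effect's matrices first and calling pass.Apply() afterwards, so the updated values actually reach the GPU for that draw call (a sketch of the reordering only, not tested against the original project):
foreach (Vector3 ChunkKey in GeneratedChunks)
{
    if (ChunkMap[ChunkKey].CubeType_TriangleCounts[KVP.Key] > 0)
    {
        // Set the effect parameters first...
        KVP.Value.View = camera.ViewMatrix;
        KVP.Value.Projection = camera.ProjectionMatrix;
        KVP.Value.World = worldMatrix;
        // ...then apply the pass so the new matrices are bound before drawing.
        pass.Apply();
        device.SetVertexBuffer(ChunkMap[ChunkKey].CubeType_VertexBuffers[KVP.Key]);
        device.DrawPrimitives(PrimitiveType.TriangleList, 0, ChunkMap[ChunkKey].CubeType_TriangleCounts[KVP.Key]);
    }
}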
The fix so far has been to use 1 giant vertex buffer. I don't like it, but that's all that seems to work.