Best way to detect a person in a WebCamTexture - c#

Soon I will begin development on a mobile application in Unity 5 that will use the device's native camera for the user to view. The big thing here is that while the camera is up, it needs to be able to recognize a person's face and compare it to a stored image of a face already saved on the device, to see if it is a match.
Also, I need to be able to recognize whether the center of the camera is pointing at a part of the person's body.
So all in all it needs to recognize a person's face and run a comparison on it, and if the face is recognized, it will then determine if the center of the camera is pointing at a part of the person's body.
My question is: is there a good plugin for this sort of behavior? I know it will be a bit extensive, but I am wondering how I would get this done. I am more focused on facial recognition than on the body issue, so if that is the easier problem, I am happy to focus on that.

It sounds complicated, and it involves a lot of different processing techniques to achieve reasonable results. If I were you, I would look into the face detection module that comes with iOS and integrate it as a native plugin; it can be used to locate and extract a face from a photo or camera feed.
Once you have a photo, the next step is to prepare it for comparison with other photos, i.e. "normalize" it: rotate and align it as much as possible, and normalize the colors.
The final and most complicated step is to analyse the face and compare it with other faces. I would recommend OpenCV, as already suggested, for that. I would try to analyse the structure of the face by fitting a simple model to the eyes, mouth, chin, forehead and jaw, then calculate the distances between these landmarks. That gives you a way of roughly pre-sorting the images you want to match against, since it doesn't make sense to do further processing on face images that vary too much. As a final step, I would try to detect the eye color, the skin color and various other factors; combined, these give you statistics you can use to judge how similar two photos are.
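As a very rough sketch of just the detection step, here is what it might look like with Emgu CV (the C# OpenCV wrapper) on a single frame; the cascade file and image paths are assumptions for illustration:

```csharp
using System.Drawing;
using Emgu.CV;
using Emgu.CV.Structure;

class FaceDetectSketch
{
    static void Main()
    {
        // Haar cascade that ships with OpenCV; the file path is an assumption.
        var classifier = new CascadeClassifier("haarcascade_frontalface_default.xml");

        // In Unity you would first copy the WebCamTexture pixels into an image;
        // here a test frame is simply loaded from disk.
        using (var frame = new Image<Bgr, byte>("frame.png"))
        using (var gray = frame.Convert<Gray, byte>())
        {
            // Returns candidate face rectangles; tune the scale factor and
            // minimum neighbours for your camera resolution.
            Rectangle[] faces = classifier.DetectMultiScale(gray, 1.1, 4);
            foreach (Rectangle face in faces)
                frame.Draw(face, new Bgr(Color.Red), 2);
            frame.Save("frame_out.png");
        }
    }
}
```

The comparison step (landmark distances, eigenfaces, etc.) is where the real work is; this only gets you the cropped face regions to feed into it.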
Cheers

Not sure whether this is what you want. There is an OpenCV plugin on the Asset Store, at the fairly high price of $95: OpenCV for Unity
And here is a face recognition demo video: https://youtu.be/u5aDbn5nRbY
Hope it helps.

Related

different images from different point of view

I want different images to be displayed depending on the point of view. For the whole concept explanation, please look at the images; they explain my idea/query.
As in the first image, there are three people looking at the monitor from different angles. Now I want the webcam to track the eyes and show a particular predefined image to each user. For example: if the user is at a 45-degree angle, show image1.png.
Depending on the user's perspective, the computer should show the corresponding image.
(The lady is the game character, shown for representation purposes.)
Can you please guide me on what steps can be taken to accomplish this? Is there any plugin available for Unity that tracks faces? Please guide me.
Also, thanks for the compliments on my sketching skills xD
Stack Overflow is not really meant for recommending plugins, since the choice is usually opinion-based, so there is no exact answer.
That being said, one of the most commonly used APIs for computer vision (meaning interpreting images, including face detection) is OpenCV, so that could be a good starting point for you.
And fortunately for you, there is a Unity plugin for OpenCV.
It is too broad to give you more details about how it works here. You should try to make it work, and if you have a problem with your code, open a new question with the code portion that you struggle with.
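To give a rough idea of the Unity-side glue once face detection is working, here is a hypothetical sketch; the field of view, the image set, and the OnFaceDetected callback are all assumptions for illustration, not part of any plugin's API:

```csharp
using UnityEngine;

// Hypothetical glue code: map the horizontal position of a detected face in
// the webcam frame to a viewing angle and show the image prepared for that
// angle. How you obtain faceCenterX (e.g. from OpenCV) is left open.
public class ViewDependentImage : MonoBehaviour
{
    public Texture2D[] imagesByAngle;  // e.g. prepared for -45, 0 and +45 degrees
    public Renderer screen;
    public float horizontalFov = 60f;  // webcam field of view; an assumption

    public void OnFaceDetected(float faceCenterX, float frameWidth)
    {
        // Normalize the face position to -0.5..+0.5 and approximate the angle.
        float normalized = faceCenterX / frameWidth - 0.5f;
        float angle = normalized * horizontalFov;

        // Pick the prepared image whose angle is closest (45-degree steps here).
        int index = Mathf.Clamp(
            Mathf.RoundToInt((angle + 45f) / 45f), 0, imagesByAngle.Length - 1);
        screen.material.mainTexture = imagesByAngle[index];
    }
}
```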
PS: nice sketching skills
Perhaps an easier option would be to use a Kinect
(trying to detect faces or eyes from that far away might be shaky?).
With a Kinect you can get skeletons for multiple people, and getting the angle between the target and those Kinect avatars would be easy.
If there is no space to put the Kinect in a good position, you could consider placing it on the ceiling above (and then use the depth data only to detect people in its view).
The only issue is that Microsoft has apparently stopped Windows Kinect support, so you would need to find second-hand units (the Unity Asset Store still has some Kinect plugins and examples available):
https://www.polygon.com/2018/1/2/16842072/xbox-one-kinect-adapter-out-of-stock-production-ended
Or look for Kinect alternatives that work with Unity, e.g. RealSense cameras:
https://www.intel.sg/content/www/xa/en/architecture-and-technology/realsense-overview.html
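As a rough sketch of the angle computation against the Kinect v1 SDK (the head joint is an assumption; any tracked joint would work):

```csharp
using System;
using Microsoft.Kinect;

// Sketch against the Kinect v1 SDK: estimate the horizontal angle of a
// tracked person relative to the sensor from the head joint position.
// Joint positions are in metres, with Z pointing away from the camera.
static class ViewerAngle
{
    public static double HorizontalAngleDegrees(Skeleton skeleton)
    {
        SkeletonPoint head = skeleton.Joints[JointType.Head].Position;
        // atan2(X, Z) is 0 when the viewer is straight ahead of the sensor.
        return Math.Atan2(head.X, head.Z) * 180.0 / Math.PI;
    }
}
```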

Kinect touchable Surface

I have a programming project using the Kinect for Xbox One sensor. The project is mainly about turning any surface into an interactive touchable screen. I have collected all the hardware, including the projector. In addition, I have done my research and installed the related packages, such as Visual Studio, in order to start coding in C#.
So, my question here:
Is there any library I could use that would help me determine the angles/depth of the surface?
Also, I don't have a full picture of the steps that need to be done next, so I would really appreciate it if anyone could sketch a small roadmap for this project.
If you have trouble getting started with the Kinect, go through this
Quick start series.
You will also want to capture the depth of objects. For that, use the Kinect's depth image stream; the SDK itself does not provide many convenience methods, so you will have to do some image processing on that grayscale depth stream. Then you can find the edges of a single object at different depths.
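For example, a minimal sketch of a depth-band segmentation step, assuming the depth frame has already been copied into a ushort array of millimetre values (as the Kinect v2 SDK's DepthFrame.CopyFrameDataToArray provides):

```csharp
// Sketch: segment everything lying within a depth band, e.g. a hand
// hovering a few centimetres above the projected surface. Assumes the
// depth frame was already copied into a ushort[] of millimetre values,
// as the Kinect v2 SDK's DepthFrame.CopyFrameDataToArray provides.
static bool[] SegmentDepthBand(ushort[] depthMm, ushort nearMm, ushort farMm)
{
    var mask = new bool[depthMm.Length];
    for (int i = 0; i < depthMm.Length; i++)
    {
        ushort d = depthMm[i];
        mask[i] = d >= nearMm && d <= farMm;  // pixel is inside the touch band
    }
    return mask;
}
```

From a mask like this you can find fingertip blobs near the surface; calibrating the projector-to-camera mapping is the other half of the job.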

People Counting System

I want to develop a "People Counting System" using OpenCV (or Emgu CV).
Please guide me on how to implement this, or point me to some examples or open-source projects.
(I have done some work already: extracting the difference between frames, then thresholding to remove the background, using motion history and the like; still no good results.)
Edit 1: I am counting a high flow of people (a dozen of them may come through simultaneously).
Edit 2: It must be at least 80% accurate. People are walking through a door that is almost 5 meters wide. The problem is that I have no control over the position or angle of the camera. The camera is shooting the place from a 10 m distance at a 2.5 m height.
Thank you
If by a people counting system you mean a system that counts the people in a room, then I recommend implementing the hardware with a microcontroller, two lasers (normal laser toys work) and two photoresistors. For the microcontroller I recommend an Arduino. Then make a C# application that has a SerialPort object and reads the data the Arduino sends through USB; the Arduino would send, for example, 1 for "someone entered the room" and 0 for "someone left the room". The logging and statistics can then be done easily in C#.
Arduino site: here
Photoresistor for $1: here
This solution is a lot cheaper and easier to implement than using a camera of fairly good quality.
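For the PC side, a minimal sketch of the serial-reading loop (the port name, baud rate and the ASCII 1/0 protocol are assumptions):

```csharp
using System;
using System.IO.Ports;

class PeopleCounter
{
    static void Main()
    {
        // Port name, baud rate and the ASCII '1'/'0' protocol are assumptions.
        using (var port = new SerialPort("COM3", 9600))
        {
            port.Open();
            int count = 0;
            while (true)
            {
                int b = port.ReadByte();      // blocks until the Arduino sends a byte
                if (b == '1') count++;        // someone entered the room
                else if (b == '0') count--;   // someone left the room
                Console.WriteLine($"People in room: {count}");
            }
        }
    }
}
```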
Hope I helped you.
Check out the HOG pedestrian detector that comes with recent versions of OpenCV (>= 2.2).
See modules/objdetect/src/hog.cpp and samples/cpp/peopledetect.cpp in the OpenCV sources. Unfortunately there is no official documentation about it yet.
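As a rough sketch of what using it from C# looks like via Emgu CV (the .NET wrapper; method signatures vary slightly between Emgu versions):

```csharp
using Emgu.CV;
using Emgu.CV.Structure;

class PedestrianCount
{
    // Counts people in a frame with OpenCV's default HOG people detector.
    static int CountPeople(Mat frame)
    {
        using (var hog = new HOGDescriptor())
        {
            hog.SetSVMDetector(HOGDescriptor.GetDefaultPeopleDetector());
            // Each detection is a rectangle plus a confidence score.
            MCvObjectDetection[] hits = hog.DetectMultiScale(frame);
            return hits.Length;
        }
    }
}
```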
This would help you to count moving things including people: Motion Detection project on CodeProject
Are people the only kind of "entities" in the scene? If not, do you care if some other kind of thing moving through the scene gets counted as a person? Because if people are all you expect, you could just count blobs that enter or leave the scene. It may sound a bit naive, but I would take some kind of motion image and group the motion pixels into clusters by distance. Your distance metric could take some constraints into account, such as that people usually stand upright, so the pixels in a cluster should group around a regression line (a vertical line if the camera is aligned with the floor). It shouldn't be necessary to track them across the scene, just to notice when they enter or leave, though you'd get some issues with, for example, people entering the scene alone and leaving in pairs or groups... Good luck :)
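A minimal sketch of the blob-counting idea, assuming you already have a binary motion mask from your background subtraction:

```csharp
using System.Collections.Generic;

// Counts 4-connected components in a binary motion mask, ignoring
// components smaller than minPixels so noise specks are not counted.
static int CountBlobs(bool[,] mask, int minPixels)
{
    int h = mask.GetLength(0), w = mask.GetLength(1), blobs = 0;
    var seen = new bool[h, w];
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
        {
            if (!mask[y, x] || seen[y, x]) continue;
            // Flood-fill this component and measure its size.
            int size = 0;
            var stack = new Stack<(int, int)>();
            stack.Push((y, x));
            seen[y, x] = true;
            while (stack.Count > 0)
            {
                var (cy, cx) = stack.Pop();
                size++;
                foreach (var (ny, nx) in new[] { (cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1) })
                {
                    if (ny >= 0 && ny < h && nx >= 0 && nx < w && mask[ny, nx] && !seen[ny, nx])
                    {
                        seen[ny, nx] = true;
                        stack.Push((ny, nx));
                    }
                }
            }
            if (size >= minPixels) blobs++;
        }
    return blobs;
}
```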
I think if you have a dense crowd with a lot of occlusions, you have to use a machine learning algorithm; for example, you could use an Implicit Shape Model for the features.
It really depends on the position of the camera. Assuming that you can get front facing profiles of the people in the images:
This problem is basically face detection and recognition.
There are many ways to go about finding faces, but this is the approach that I'm a little more familiar with.
For the face detection you need to do image segmentation on skin tone color. This will extract the skin regions [arms, the chest (for those wearing V-cut tops), face, legs, etc.]. Then you would need to line up the profiles of the skin regions with the profile of your trained faces.
[You'll need to use eigenfaces to create a generic profile of what a face looks like.]
If a skin region lines up and doesn't deviate too far from the profile, it is considered a face. Once the face is confirmed, add it to the eigenfaces data store [for recognition]. To save processing, you might want to limit the search area if you are looking for a previously seen face [given the frame rate and the last time the person was seen].
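A minimal sketch of just the skin-segmentation step via Emgu CV; the HSV thresholds here are rough assumptions and need tuning for your camera and lighting:

```csharp
using Emgu.CV;
using Emgu.CV.Structure;

// Returns a binary mask of pixels whose HSV values fall in a typical
// skin range. The thresholds are rough assumptions; tune them for your
// camera and lighting.
static Image<Gray, byte> SkinMask(Image<Bgr, byte> frame)
{
    using (Image<Hsv, byte> hsv = frame.Convert<Hsv, byte>())
    {
        return hsv.InRange(new Hsv(0, 48, 80), new Hsv(20, 255, 255));
    }
}
```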
If you are referring to "Crowd flow" I think you just mean the density of faces in a crowd.
Once you've confirmed that a moving object in the video is a person, you just need to note that and then make sure you don't count them as a new person again.
This approach really depends on your ability to detect face regions. It may not work if the people in the video are looking down, don't fit the profile of the trained data, etc. It may also be affected if a person puts on sunglasses within the video [they would probably be considered a "new face"].

Capture a single pixel row from each frame of video and compile them together

I'm working on a project where I need to take a single horizontal pixel row (or vertical column, I guess) from each frame of a supplied video file and create an image out of it, basically appending each pixel row onto the image throughout the video. The video file I plan to supply isn't a regular video; it's actually a capture of a panning camera from a video game (Halo: Reach) looking straight down (or as far as the game will let me, which is -85.5°). I'll look down, pan the camera forward over the landscape very slowly, then take a single pixel row from each frame of the captured video file (30 fps) and compile the rows into an image that will effectively (hopefully) reconstruct the landscape into a single image.
I thought about doing this the quick and dirty way, using an AxWindowsMediaPlayer control and locking the form so that it couldn't be moved or resized, then just using a Graphics object to capture the screen. But that wouldn't be fast enough and there would be way too many problems; I need direct access to the frames.
I've heard about FFLib and DirectShow.NET. I actually just installed the Windows SDK but haven't had a chance to mess with any of the DirectX stuff yet (I remember it being very confusing when I tried it a while back). Hopefully someone can give me a pointer in the right direction.
If anyone has any information they think might help, I'd be super grateful for it. Thank you!
You could use a video renderer in renderless mode (e.g. VMR9, EVR), which allows you to process every frame yourself. By using frame-stepping playback, you can step one frame at a time and process each frame.
DirectShow.NET can help you to use managed code where possible, and I can recommend it. It is however only a wrapper to DirectShow, so it might be worthwhile to look for more advanced libraries as well.
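A minimal sketch of the compositing step, assuming you already get each decoded frame as a Bitmap (the DirectShow frame-stepping part is the hard bit and is left out here):

```csharp
using System.Drawing;
using System.Drawing.Drawing2D;
using System.Drawing.Imaging;

// Copies one horizontal pixel row out of each decoded frame and stacks
// the rows into an output image, top to bottom.
class RowStacker
{
    private readonly Bitmap output;
    private int nextRow;

    public RowStacker(int width, int totalFrames)
    {
        output = new Bitmap(width, totalFrames, PixelFormat.Format32bppArgb);
    }

    public void AddFrame(Bitmap frame, int sourceRow)
    {
        using (Graphics g = Graphics.FromImage(output))
        {
            g.InterpolationMode = InterpolationMode.NearestNeighbor;
            // Copy a 1-pixel-high strip from the frame into the next output row.
            g.DrawImage(frame,
                new Rectangle(0, nextRow, output.Width, 1),
                new Rectangle(0, sourceRow, frame.Width, 1),
                GraphicsUnit.Pixel);
        }
        nextRow++;
    }

    public void Save(string path) => output.Save(path);
}
```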
A few side notes: wouldn't you experience issues with lighting that differs from angle to angle? Perhaps it's easier to capture some screenshots and use existing stitching algorithms?

Moving from Wiimote to camera?

I've been doing some Johnny Chung Lee-style Wiimote programming, and am running into problems with the Wiimote's relatively narrow field-of-view and limit of four points. I've bought a Creative Live! camera with an 85-degree field of view and a high resolution.
My prototype application is written in C#, and I'd like to stay there.
So, my question: I'd like to find a C#.Net camera / vision library that lets me track points - probably LEDs - in the camera's field of view. In the future, I'd like to move to R/G/B point tracking so as to allow more points to be tracked and distinguished more easily. Any suggestions?
You could check out the Emgu.CV library which is a .NET (C#) wrapper for OpenCV. OpenCV is considered by many, including myself, to be the best (free) computer vision library.
Check out AForge.NET. It seems to be a powerful library.
With a normal camera, the task of identifying and tracking LEDs is rather more challenging because of all the other objects that are visible.
I suggest you try to maximize the contrast by reducing the exposure (turning off auto-exposure), if that's possible in the driver: aim for a value where your LEDs still have a high intensity in the image (>200) while not being overexposed (<255). You should then be able to threshold your image correctly and get higher-quality results.
If the image is still too cluttered to be analyzed easily and efficiently, you may use infrared LEDs, remove the IR-block filter on the camera (if your camera has one), and maybe add an "infrared-pass / visible-light-blocking" filter: you should then have bright spots only where the LEDs are, but you will not be able to use color. There may be issues with the image quality, though.
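A minimal sketch of the threshold-and-centroid step on a captured frame (GetPixel is used for clarity only; use LockBits for real-time tracking):

```csharp
using System.Drawing;

// Thresholds a frame on brightness and returns the centroid of the
// bright pixels, or null if nothing exceeds the threshold.
static Point? FindBrightSpot(Bitmap frame, int threshold = 200)
{
    long sumX = 0, sumY = 0, n = 0;
    for (int y = 0; y < frame.Height; y++)
        for (int x = 0; x < frame.Width; x++)
        {
            Color c = frame.GetPixel(x, y);
            if ((c.R + c.G + c.B) / 3 >= threshold) { sumX += x; sumY += y; n++; }
        }
    return n == 0 ? (Point?)null : new Point((int)(sumX / n), (int)(sumY / n));
}
```

For R/G/B point tracking you would run the same idea per color channel, thresholding each channel separately so the points can be told apart.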
When tracking things like lights, especially if they are a special color, I recommend applying a blur filter to the footage first. This blends the colors out nicely and, while less accurate, uses less CPU and requires fewer threshold adjustments.
