I'm trying to get started with Kinect, and it has a depth sensing camera, but I've seen no guidance on measuring width/height/lengths.
Is it a matter of working out the distance an object is away from the camera (depth sensor) and at that range the field of view of the Kinect, and then working out how many pixels your object takes up?
I'd like to be able to do a mesh or something from a dot-cloud and I'm having troubles figuring where to start and how to get proper width/height measurements for objects.
This is a rather complex task and cannot be answered with a few paragraphs here on Stackoverflow. The reason is that its a lot of knowledge that builds on other knowledge. I would start by reading up on Linear Algebra using for example the excellent Rorres, et al.
Creating the mesh from the point-cloud is a complex task and there is no defacto algorithm used today. The most popular approach seems to be to first create a discretized Truncated Signed Distance Function (TSDF) and then use e.g. Marching Cubes to get a mesh. Another algorithm is the Delaunay triangulation.
There is also a c# implementation provided by the s-hull project.
In the book Beginning Kinect Programming with the Microsoft Kinect SDK by Jarrett Webb,James Ashley
you have in chapter 3 a sample of how to calculate width and height and distance :
http://books.google.es/books?id=MupB_VAmtdEC&pg=PA69&hl=es&source=gbs_toc_r&cad=4#v=onepage&q&f=false
The code is available to download at apress.com
I am aware this was asked awhile ago, but for what it's worth there is an article on channel9.msdn which shows a simpler method to calculate the user's height. I would imagine you could use a similar technique with HipLeft, HipCenter, and HipRight to estimate the width.
If you follow the project information URL from the link it has more detailed information.
Related
I want different images to be displayed from different point of view. For the whole concept explaination please look at the images. they explain my idea/query!
As in the first image you see that there are three people at different angle looking at the monitor. Now i want the webcam to track the eyes and show the particular defined image to the user> For example: If user is at 45 degree angle then show image1.png
Depending upon the user's prespective of watching. The computer should show the image.
(the lady is the game character for representation purpose)
Can you please guide me on what steps can be taken to accomplish this? Is there any plugin available for unity that tracks faces? Please guide me
Also thanks for the compliments on my sketching skills xD
Stackoverflow is not really meant to recommend plugins, since the choice is usually opinion based so there is no exact answer.
That being said, on of the most common used API for computer vision (meaning interpreting images, including face recognition) is OpenCV, so that could be a good start for you to look at that.
And fortunately for you, there is a Unity plugin for OpenCV
It is too broad to give you more details about how it works here. You should try to make it work, and if you have a problem with your code, open a new question with the code portion that you struggle with.
PS: nice sketching skills
Perhaps easier option would be to use Kinect
(trying to detect face or eyes from that far might be shaky?)
With Kinect you can get skeletons for multiple people, and getting the angle between target and those kinect avatars would be easy.
If there is no space to put kinect in good position,
could consider placing it on the ceiling above (and then use depth data only to detect people in its view)
Only issue is that apparently Microsoft has stopped Windows kinect support,
so you would need to find 2nd hand versions.. (Unity Asset store still has some kinect plugins and examples available)
https://www.polygon.com/2018/1/2/16842072/xbox-one-kinect-adapter-out-of-stock-production-ended
Or look for kinect alternatives that work with unity, try RealSense cameras:
https://www.intel.sg/content/www/xa/en/architecture-and-technology/realsense-overview.html
I'm developing an application for the Kinect for my final year university project, and I have a requirement to develop a number of gesture recognition algorithms. I'd appreciate some advice on this.
My initial algorithm is detecting the users hand moving closer towards the kinect, within a certain time frame. For now i'll say this is an arbitrary 500ms.
My idea is as follows:
Record z-axis position every 100ms and store in List.
Each time a new position is recorded, check the z-position for each of the previous 4 positions in the List.
If the z position has varied by the required distance between any of those individually or collectively, fire off a gesture recognised event.
If gesture recognised, clear List, and start again.
This is the first time that I have tried anything like this, and would like some advise on my initial naive implementation.
Thanks.
Are you going to use the official Kinect SDK or opensource drivers(libfreenect or OpenNI) ?
If you're using the Kinect SDK you can start by having a look at something like:
Kinect SDK Dynamic Time Warping (DTW) Gesture Recognition
Candescent NUI
(Candescent NUI focuses more on finger detection though)
If you're planning to use opensource drivers, try OpenNI and NITE.
NITE comes with hand tracking and gestures(swipe, circle control, 2d sliders, etc.).
The idea is to at least have hand detection and carry on from there. If you've got that, you could implement something like an adaptation of the Unistroke Gesture Recognizer or look into other techniques like Motion Templates/MotionHistory, etc....adapting them to the new data you can play with now.
Goodluck!
If you're just trying to recognise the user swinging her hand towards you, your approach should work (despite being very susceptible to misfiring due to noisy data). What you're trying to do falls very nicely in the field of pattern recognition. For this, and very similar tasks, people very often use hidden Markov models with great success. You might want to check the Wikipedia article. I'm not a C# person, but as far as I know, Microsoft has very nice statistical inference libraries for C#, and they will definitely include HMM implementations.
I want to develop a "People Counting System" using OpenCV (or Emgu CV).
Please guide me on how to implement or lead me to some examples or open source projects.
(I have done some work: extracting diff then threshold to delete background, using motion history and like that; still no good results.)
Edit 1: I am counting a high people flow (a dozen of them may come through simultaneously).
Edit 2: It must be at least 80% accurate. People are walking through a door that is almost 5 meters wide. The problem is I have no control on the position or angle of the camera. Camera is shouting the place from a 10m distance at a 2.5m height.
Thank you
If you call a people counting system a system that counts people that are in a room then I recommend you implement the hardware with a microcontroller with 2 lazers(normal lazer toys work) and 2 photoresistors.For the microcontroller I recomen you use Arduino.And then make an C# application that has a SerialPort object and reads the data that the arduino sends through the USB.The arduino will send 1 for "someone entered the room" and 0 for "someone left the room" for example.Then the logging and statistics can be done easily in C#.
Arduiono Site:here
Photoresistor for $1: here
This solution is alot cheaper and easyer to implement than using a camera that is with a fairly good quality.
Hope I helped you.
Check out the HOG pedestrian detector that comes with recent versions of OpenCV (>= 2.2).
See modules/objdetect/src/hog.cpp and samples/cpp/peopledetect.cpp in the OpenCV sources. Unfortunately there is no official documentation about it yet.
This would help you to count moving things including people: Motion Detection project on CodeProject
Are people the only kind of "entities" in the scene? If this is not the case, do you care about considering a person some other kind of thing that moves through the scene? Because if that is the case, you could just count blobs that come in or come out from the scene. It may sound a bit naive but I will take some kind of motion image, group motion pixels by distance in clusters. Your distance metric could take into account some restrictions, such as that people will "often" stand so pixels in a cluster should group around some kind of regression line (an straight-up line if the camera is aligned with de floor). It shouldn't be necessary to track them in the scene, just noticing when they enter or they leave, though you'd get some issues with, for example, people entering on their own in the scene and leaving in pairs or in groups... Good luck :)
I think if you have dense people crowd with a lot of occlusions you have to use some machine learning algorithm, for example you can use Implicit Shape Model for features.
It really depends on the position of the camera. Assuming that you can get front facing profiles of the people in the images:
This problem is basically face detection and recognition.
There are many ways to go about finding faces, but this is the approach that I'm a little more familiar with.
For the face detection you need to do image segmentation on the skin tone color. This will extract skin regions. [Arms, the chest (for those wearing V cut tops), face, legs, etc] Then you would need to line up the profiles of the skin regions to the profile of your trained faces.
[You'll need to use Eigenfaces to create a generic profile of what a face looks like]
If the skin region lines up and doesn't devate too far from the profile, then it is considered a face. Once the face is confirmed, then add it into the eigenfaces data store [for recognition]. To save processing you might want to consider limiting the search area if you are looking for a previous face. [Given the frame rate, and last time the person was seen]
If you are referring to "Crowd flow" I think you just mean the density of faces in a crowd.
Now you've confirmed that a moving object in the video is a person. Now you just need to note that and then make sure that you don't consider them as a new person again.
This approach: Really depends on your ability to detect face regions. This may not work if the people in the video are looking down, not fitting the profile of the trained data etc. Also it may be effected if a person puts on sunglasses within the video. [Probably would be considered a "new face"]
I want to recognize handwriting shape and figure out which shape it probably is in the set. Simply saying, if I draw a triangle, the application should recognize it as an triangle.
How can I do this using C# or java, any help is appreciated.
Thanks in advance.
These are some of the shapes I need to identify
You can try to use OpenCV for that. EmguCV is a good wrapper to OpenCV for .net. Watch for ShapeDetection demo (included in OpenCV)
If you want to "roll your own" I would suggest the following steps:
First, skeletonize (thin out the image till all the lines are one pixel thick). There are many ways to do this, and it is a well studied problem. Google for more information.
Now, starting at a black pixel, go through and trace out the outline of the image, one pixel at a time. You add each of these segments to a list of segments outlining the shape (each segment will be a simple line from one pixel to its adjacent pixel). Now you have the outline of your shape as a many-sided polygon.
(Possible step at this point: smooth the outline by pulling each vertex closer to the average of its neighbors)
Now, you use a corner detection algorithm to find the corners (take a look here:http://visual.ipan.sztaki.hu/corner/node7.html).
This should be enough to identify the shapes you have listed.
If you want to get smarter, you can also identify the types of edges that exist between corners. If the segment between two corners stays within some threshold of the straight line between them, you treat it as a "straight line" edge. If it doesn't, you treat it as a curving edge.
With corners +straight/curving edge, you probably could detect any of the shapes you are looking for pretty well.
I'd suggest using a neural network.
You could teach it what the shapes look like.
This is one library for example:
Neural Networks on C#
If you are looking for particular shapes inside a larger image then OpenCV is a great alternative. Emgu.CV is a good .Net wrapper for it. See my picture of a SURF implementation for this. Also see other options in OpenCV, it has plenty to offer. Note that this approach requires a lot of processing power.
If you can easily identify the shape you want as a BLOB (that is, give the algorithm a picture of only this shape) you can do a search for "ANN OCR" ("Artificial Neural Networks" and "Optical Character Recognition"). Many (most?) ANN-implementations come with sample code for feeding it shapes (letters) and recognizing closest shape (hand written letters). For example Neural Network OCR. I believe this approach would solve your problem. (Sidenote: I've encountered and tested numerous libs that can do this. It's Neural Networks 101.)
If you need BLOB algorithms for the ANN-OCR OpenCV can provide this.
Both these approaches are farily easy to implement.
There is indeed a vast tree of research in shape recognition.
If your shapes are indeed some what predictable and are basic geometry,
the most straightforward way is to find the edges and apply hough transform.
Some managable reading materials for you to start with,
[1] Google Scholar for Hough Transform Shape Detection
http://scholar.google.com/scholar?q=hough+transform+shape+recognition&hl=en&as_sdt=0&as_vis=1&oi=scholart
[2] Hough Transform # Wiki http://en.wikipedia.org/wiki/Hough_transform
image http://prod.triplesign.com/map.jpg
How can I produce a similar output in C# window forms in the easiest way?
Is there a good library for this purpose?
I just needs to be pointed in the direction of which graphic library is best for this.
You should just roll your own in a 3d graphics library. You could use directx. If using WPF it is built-in, you can lookup viewport3d. http://msdn.microsoft.com/en-us/magazine/cc163449.aspx
In graphics programming what you are building is a very simple version of a heightmap. I think building your own would give your greater flexibility in the long run.
So a best library doesn't exist. There are plenty of them and some are just for different purposes. Here a small list of possibilities:
Tao: Make anything yourself with OpenGL
OpenTK: The successor of the Tao framework
Dundas: One of the best but quite expensive (lacks in real time performance)
Nevron: Quite good, but much cheaper (also has problems with real time data)
National Instruments: Expensive, not the best looking ones, but damn good in real time data.
... Probably someone else made some other experiences.
Checkout Microsoft Chart Controls library.
Here's how I'd implement this using OpenGL.
First up, you will need a wrapper to import the OpenGL API into C#. A bit of Googling led me to this:
CsGL - OpenGL .NET
There a few example programs available to demonstrate how the OpenGL interface works. Play around with them to get an idea of how the system works.
To implement the 3D map:
Create an array of vectors (that's not the std::vector/List type but x,y,z triplets) where x and y are along the horizontal plane and z is the up amount.
Set the Z compare to less-than-or-equal (so the overlaid line segments are visible).
Create a list of quads where the vertices of the quads are taken from the array in (1)
Calculate the colour of the quad. Use a dot-product of the quad's normal and a light source direction to get a value to shade value, i.e. normal.light of 1 is black and -1 is white.
Create a list of line segments, again from the array in (1).
Calculate the screen position of the various projected axes points.
Set up your camera and world->view transform (use the example programs to get an idea of how to do this).
Render the quads and lines, OpenGL will do the transformation from world co-ordinates (the list in (1)) to screen space. Draw the labels, you might not want to do this using OpenGL as the labels shouldn't scale with distance from camera, otherwise they could get too small to read.
Since the above is quite a lot of stuff, there isn't really the space (and time on my part) to post working code (but someone else might add something if you're lucky). You could break the task down and ask questions on the parts you don't quite understand.
Have you tried this... gigasoft data visualization tools (Its not free)
And you can checkout the online wireframe demo here