I'm developing a Windows application that would let the user fully interact with their computer using a Kinect sensor. The user should be able to teach the application their own gestures and assign a Windows event to each of them. After the training phase, the application should track the user's movements, and when it recognizes a known gesture, fire the assigned event.
The crucial part is the custom gesture recognizer. Since the gestures are user-defined, the problem can't be solved by hard-coding all the gestures directly into the application. I've read many articles discussing this problem, but none of them has answered my question: which algorithm is best for learning and recognizing user-defined gestures?
I'm looking for an algorithm that is:
Highly flexible (the gestures can vary from simple hand gestures to whole-body movements)
Fast and efficient (the application might be used alongside video games, so it can't consume all of the CPU capacity)
Able to learn a new gesture from at most 10 repetitions (asking the user to repeat a gesture more than 10 times is, in my opinion, not very user friendly)
Easy to implement (preferably, I want to avoid struggling with two-page equations)
Note that the outcome does not have to be perfect: the algorithm occasionally recognizing the wrong gesture is more acceptable than the algorithm running slowly.
I'm currently deciding between three approaches:
Hidden Markov Models: these seem to be very popular when it comes to gesture recognition, but they also seem pretty hard to understand and implement. Besides, I'm not sure whether HMMs are suitable for what I'm trying to accomplish.
Dynamic Time Warping: I came across a site offering gesture recognition using DTW, but many users complain about its performance.
A $1-recognizer adaptation: I was thinking about adapting the $1 recognizer to 3D space and treating the movement of each joint as a single stroke. I would then simply compare the strokes and pick the most similar gesture from the set of known gestures (sketched below). In this case, though, I'm not sure about the performance, since there are many joints to compare and the recognition has to run in real time.
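To make the third idea concrete, here is a rough sketch of the matching step in C#. It assumes each joint's trajectory has already been resampled to a fixed number of points (the original $1 recognizer resamples to 64), and it normalizes only for position and scale, not rotation; the StrokeMatcher name and the use of System.Numerics are illustrative choices, not part of any SDK.

```csharp
using System.Linq;
using System.Numerics; // any 3-D vector type would do

static class StrokeMatcher
{
    // Average point-to-point distance between two strokes that have
    // already been resampled to the same number of points.
    public static float Distance(Vector3[] a, Vector3[] b)
    {
        Vector3[] na = Normalize(a), nb = Normalize(b);
        float sum = 0;
        for (int i = 0; i < na.Length; i++)
            sum += Vector3.Distance(na[i], nb[i]);
        return sum / na.Length; // lower = more similar
    }

    static Vector3[] Normalize(Vector3[] points)
    {
        // Translate the centroid to the origin for position invariance...
        Vector3 centroid = points.Aggregate(Vector3.Zero, (s, p) => s + p) / points.Length;
        Vector3[] shifted = points.Select(p => p - centroid).ToArray();
        // ...then scale the largest extent to 1 for size invariance.
        float extent = shifted.Max(p => p.Length());
        return extent > 0 ? shifted.Select(p => p / extent).ToArray() : shifted;
    }
}
```

Recognition would sum this distance over all tracked joints for each stored gesture and pick the gesture with the lowest total, rejecting the match if even the best total exceeds some threshold.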
Which of these approaches do you think is most suitable for what I'm trying to do? Or are there any other solutions to this problem? I would appreciate any piece of advice that could move me forward. Thank you.
(I'm using the Kinect SDK.)
I'm totally new to Windows development. I'm coming from Objective-C, but now I want to start developing for Kinect for Windows. I have to choose between C++ and C#. Is one of these languages more appropriate for Kinect development? I'm inclined toward C++, but I don't know whether C# would make everything easier, perhaps with more support for Kinect.
EDIT
Another question: do I need to buy the Kinect for Windows sensor, or can I develop with a standard Xbox Kinect sensor?
Assuming you are using the official Kinect SDK, it supports C++, C# and VB. Use the language which best suits your needs.
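For a feel of the C# side, here is a minimal sketch of opening the skeleton stream with the Kinect for Windows SDK v1.x; the C++ API exposes the same pipeline, so the decision really comes down to which language you are more comfortable in.

```csharp
using System.Linq;
using Microsoft.Kinect;

class SensorDemo
{
    static void Main()
    {
        // Pick the first sensor that is actually connected, if any.
        KinectSensor sensor = KinectSensor.KinectSensors
            .FirstOrDefault(s => s.Status == KinectStatus.Connected);
        if (sensor == null) return;

        sensor.SkeletonStream.Enable();
        sensor.SkeletonFrameReady += (s, e) =>
        {
            using (SkeletonFrame frame = e.OpenSkeletonFrame())
            {
                if (frame == null) return;
                var skeletons = new Skeleton[frame.SkeletonArrayLength];
                frame.CopySkeletonDataTo(skeletons);
                // ...react to the tracked skeletons here...
            }
        };
        sensor.Start();
    }
}
```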
To answer your second question, you can use the Kinect for Windows sensor or the Kinect for Xbox 360 sensor. The choice is yours.
However, there are some notable differences. This blog post does a good job of explaining them. Below are the main features that the Windows sensor offers over the Xbox sensor, taken from the blog in case the link breaks in the future.
Near mode: Enables the camera to see objects as close as 40 centimeters in front of the device without losing accuracy or precision, with graceful degradation out to 3 meters.
Seated or "10-joint" mode: Skeletal tracking that provides the capability to track the head, neck and arms of either a seated or standing user.
USB cable: Ensures reliability across a broad range of computers and improves coexistence with other USB peripherals.
Extended camera settings: Provides extra settings such as brightness, exposure, etc. so you can tune it even more.
Kinect Fusion: Maps the environment to 3D on the fly or lets you use object replacement.
Handgrip: Hand detection enables you to implement gestures like pinch-to-zoom, grab, etc. to improve your apps and build whole new kinds of applications.
Licensing: When you want to release your application publicly, you'll need to use a Kinect for Windows sensor; using the Kinect for Xbox 360 isn't legal for that.
I've done some volunteer game programming in LPC for a MUD in the past, and everything there was easy: if I wanted a new item (for example an NPC), I would just use a function to load it as many times over as I wanted. Now that I want to program my own little game, I cannot for the life of me identify what I even need to do. If I get nothing more than the name of the thing I'm trying to do, so I can carry out my own further research, that would be enough. Having rambled on about all of that, on to my question:
I want to create on-the-fly instances of in-game objects (for example, people), some handled by the computer, others handled by the player. A lot of the help on game programming I've found has been about making sprites move and handling collision detection. That's all great, but I want to code a strategy game, so I'm more interested in creating some sandbox flexibility within my game and coding the AI to provide the interest, rather than swish graphics and awesome sounds. I want to set up the game with a variable number of randomly generated people for the player to interact with. So far I've created a class to represent a person, but now I'm stuck: each instance of the class needs a unique name, and hard-coding the names would remove the randomness from the number of people.
What would I need to look up to achieve what I'm after? What would it be called? Have I even explained myself with any degree of eloquence?
Thank you in advance for any potential help that might come my way.
Matt.
What you're looking for is something like RPG name generators (google it).
Basically, those sites generate their names from a random combination and concatenation of prefixes, syllables and suffixes. I don't know of any research on this topic, but you could start with something like that.
Edit: you could also use Markov chains; they are commonly used for text generation.
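As a starting point, here is a minimal sketch of the prefix/syllable/suffix idea in C#; the word fragments and the NameGenerator name are made up, so substitute your own lists.

```csharp
using System;

class NameGenerator
{
    // Placeholder fragments; real generators use much longer lists.
    static readonly string[] Prefixes  = { "Bal", "Dor", "Mar", "Tha" };
    static readonly string[] Syllables = { "an", "el", "ur", "is" };
    static readonly string[] Suffixes  = { "dor", "ion", "wyn", "eth" };

    readonly Random rng = new Random();

    string Pick(string[] parts) => parts[rng.Next(parts.Length)];

    // Example output: "Baleldor", "Thaurwyn", ...
    public string Next() => Pick(Prefixes) + Pick(Syllables) + Pick(Suffixes);
}
```

If the names must be unique, keep a HashSet&lt;string&gt; of names already issued and re-roll on a collision; with decent fragment lists, collisions will be rare.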
I need to make an application for Windows Phone that uses augmented reality inside a building. It's just for one building. Can anyone tell me whether that is even possible, given that it's indoors and GPS won't work?
I'm thinking of building a matrix where I manually enter all rooms, points of interest and so on (I will need to apply Dijkstra or A* for routing, so the matrix is needed anyway).
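For the routing half, here is a minimal sketch in C# of what a shortest path over such a matrix might look like; the boolean walkability grid is my assumption about the encoding, and with uniform step costs Dijkstra reduces to a plain breadth-first search.

```csharp
using System.Collections.Generic;

static class GridPath
{
    static readonly (int dx, int dy)[] Moves = { (1, 0), (-1, 0), (0, 1), (0, -1) };

    // Returns the number of steps from start to goal, or -1 if unreachable.
    public static int ShortestPath(bool[,] walkable, (int x, int y) start, (int x, int y) goal)
    {
        int w = walkable.GetLength(0), h = walkable.GetLength(1);
        var dist = new Dictionary<(int, int), int> { [start] = 0 };
        var queue = new Queue<(int x, int y)>();
        queue.Enqueue(start);

        while (queue.Count > 0)
        {
            var cur = queue.Dequeue();
            if (cur == goal) return dist[cur];
            foreach (var (dx, dy) in Moves)
            {
                var next = (x: cur.x + dx, y: cur.y + dy);
                if (next.x < 0 || next.y < 0 || next.x >= w || next.y >= h) continue;
                if (!walkable[next.x, next.y] || dist.ContainsKey(next)) continue;
                dist[next] = dist[cur] + 1;
                queue.Enqueue(next);
            }
        }
        return -1;
    }
}
```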
But how can I navigate and use AR with that matrix on Windows Phone? Is it possible? If so, can anyone provide a tutorial or a sample, or some clues to get me in the right direction? Thank you all in advance.
I've written quite a few AR apps for WP, and there is one major problem with this: doing AR at such a small scale, with points of interest inside a single building, requires very high accuracy in both location and orientation to be useful. As you say, GPS doesn't really work indoors, and even if it did, at that scale you would need accuracy down to at least ~1 foot. Second, you need fairly precise angular readings, but the motion sensor in the phone isn't that accurate even outdoors, and it gets much worse inside a building because the metal in the construction can significantly offset the compass (I often see errors >90° indoors).
So unless you find some external location and orientation sensor that can work at that accuracy indoors, I would say it's not possible to make anything that's really useful.
I do have an article on how to at least do AR rendering on Windows Phone on my blog, http://www.sharpgis.net/post/2011/12/07/Building-an-Augmented-Reality-XAML-control.aspx, but again I wouldn't expect that great a result in your case.
I would like to make a 3D animation of a few dozen primitives.
I read the initial position of each atom and, together with some other parameters, I need to draw an animation that describes the ordering. I need to view them from all angles and manipulate the color and alpha channel.
All in all, it narrows down to WPF 3D or XNA.
I would choose WPF because I know it, and it is much easier to add, say, a TextBox to the app. Yet I am afraid that WPF won't handle the animation, or that it will become problematic if I need to add some extra bits.
Which one is better for this? Can you point me to some examples of how to manage the data structures and draw the animation? (I am new to 3D animation.)
EDIT:
If anyone is interested, the code and the app are at http://alloysvisualisation.codeplex.com/. I use a Model3DGroup to build the model from GeometryModel3D objects. As a result, performance is pretty good.
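For anyone curious what that pattern looks like, here is a minimal sketch in C#. CreateSphereMesh is a hypothetical helper, since WPF has no built-in sphere primitive and you have to tessellate a MeshGeometry3D yourself; the colors and radius are placeholder values.

```csharp
using System.Collections.Generic;
using System.Windows.Media;
using System.Windows.Media.Media3D;

static class SceneBuilder
{
    // One Model3DGroup holding a light plus one GeometryModel3D per atom.
    public static Model3DGroup BuildScene(IEnumerable<Point3D> atomPositions)
    {
        var group = new Model3DGroup();
        group.Children.Add(new DirectionalLight(Colors.White, new Vector3D(-1, -1, -1)));

        foreach (Point3D p in atomPositions)
        {
            MeshGeometry3D mesh = CreateSphereMesh(p, radius: 0.5);
            // The alpha channel comes straight from the brush color
            // (here roughly 70% opaque red).
            var material = new DiffuseMaterial(
                new SolidColorBrush(Color.FromArgb(180, 200, 60, 60)));
            group.Children.Add(new GeometryModel3D(mesh, material));
        }
        return group;
    }

    // Hypothetical helper: fill in a standard latitude/longitude tessellation.
    static MeshGeometry3D CreateSphereMesh(Point3D center, double radius)
    {
        throw new System.NotImplementedException();
    }
}
```

The returned group goes into a ModelVisual3D inside a Viewport3D alongside a PerspectiveCamera; animating the camera, or a Transform3D on the group, covers the view-from-all-angles requirement.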
We use both WPF and XNA for 3D rendering in the projects I'm currently working on. The XNA stuff is used from within a WPF application, so there is no problem using those two combined either.
I would use WPF for the 3D rendering if you have a simple scene, since I like the abstraction it gives you: you simply add objects to the scene, set up some cameras and lights, and the rest is handled for you. We use WPF for a visual editor containing up to ~500 cylinders without any performance problems. The one performance issue we have had is that adding and removing objects takes a long time, so WPF is not a good fit if you need to do that a lot.
However, if you have more complex needs or if the application is performance critical I would go with XNA to get closer to the metal. It gives you a more "classic", OpenGL like approach to rendering.
To summarize:
WPF:
Simple to use
Nice abstraction
Ok performance overall
Poor performance when adding/removing objects
XNA:
Close to the metal
Good performance
Feels more like OpenGL if you are used to that
Works with both WinForms and WPF
A resource that we have used for the WPF side of things is Charles Petzold's writings on the subject. We currently use stuff from his library. It got us started quickly, but we've had some issues, so I would recommend some caution. However, the stuff on his site is worth looking at.
XNA + WPF is the solution. This article will guide you through the process: http://www.codeproject.com/Articles/38301/XNA-integration-inside-WPF
Definitely XNA: it is faster and more convenient for extensive 3D work. WPF is fine for 3D, but XNA is easier to debug and offers several nice features, such as additive alpha blending and a Camera object. Also, you can find good libraries to support your work.
If you are not sure, then you should without doubt use XNA to draw the graphics. WPF will provide you with buttons and text, which, in turn, are a pain to do in XNA.
XNA will offer easier debugging of the 3D aspects.
You could follow the WinForms sample to host XNA in a Windows Forms form; then all the familiar controls like drop-downs and buttons will work fine and interact with XNA.
I am a pretty good programmer, and I am working on a Minecraft-like block-building game for Xbox. I have about 10 thousand blocks in my game, but whenever I run it on my Xbox I get some really bad lag. One thing that helped somewhat was setting objects to null after using them, but I am still having issues. How do most game developers solve this problem? I thought of only drawing blocks that are close to the player, but I suspect that looping over every block in the world to find the nearby ones would slow things down even more.
You're on the right track, you definitely only want to be drawing things in the immediate vicinity if at all possible.
Quadtrees and octrees are data structures designed to slice up 2D/3D space respectively to make finding objects in a given area very easy. Sounds like this is what you are looking for.
You could use either, depending on your definition of "nearby". If you wanted to achieve the same as Minecraft: Minecraft displays entire columns of blocks, so you could get away with a quadtree managing the X/Z coordinates and always show everything on the Y axis. If you wanted a 3D definition of nearby, you'd need an octree.
The way these work is by partitioning space using a tree structure. Each branch of the tree represents a quadrant (or octant, in the case of an octree) of the available space, and each subsequent branch is a quadrant of that quadrant; hence, it is very easy to drill down to a specific area. The leaves of the tree hold the actual data, i.e. the blocks that make up your world.
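To make that concrete, here is a minimal octree sketch in C#/XNA; Block is a stand-in for your own block type, and the split threshold is arbitrary.

```csharp
using System.Collections.Generic;
using Microsoft.Xna.Framework;

class Block { public Vector3 Position; /* ...your block data... */ }

class Octree
{
    const int SplitThreshold = 64;     // max blocks before a node subdivides
    readonly BoundingBox bounds;
    readonly List<Block> blocks = new List<Block>();
    Octree[] children;                 // null while this node is a leaf

    public Octree(BoundingBox bounds) { this.bounds = bounds; }

    public void Insert(Block block)
    {
        if (children == null)
        {
            blocks.Add(block);
            if (blocks.Count > SplitThreshold) Split();
            return;
        }
        ChildFor(block.Position).Insert(block);
    }

    // Collect blocks from every node that overlaps the query box.
    public void Query(BoundingBox area, List<Block> results)
    {
        if (!bounds.Intersects(area)) return;
        results.AddRange(blocks);
        if (children != null)
            foreach (var child in children) child.Query(area, results);
    }

    void Split()
    {
        children = new Octree[8];
        Vector3 min = bounds.Min, c = (bounds.Min + bounds.Max) * 0.5f, max = bounds.Max;
        for (int i = 0; i < 8; i++)
        {
            var lo = new Vector3((i & 1) == 0 ? min.X : c.X,
                                 (i & 2) == 0 ? min.Y : c.Y,
                                 (i & 4) == 0 ? min.Z : c.Z);
            var hi = new Vector3((i & 1) == 0 ? c.X : max.X,
                                 (i & 2) == 0 ? c.Y : max.Y,
                                 (i & 4) == 0 ? c.Z : max.Z);
            children[i] = new Octree(new BoundingBox(lo, hi));
        }
        foreach (var b in blocks) ChildFor(b.Position).Insert(b);
        blocks.Clear();
    }

    Octree ChildFor(Vector3 p)
    {
        Vector3 c = (bounds.Min + bounds.Max) * 0.5f;
        int i = (p.X >= c.X ? 1 : 0) | (p.Y >= c.Y ? 2 : 0) | (p.Z >= c.Z ? 4 : 0);
        return children[i];
    }
}
```

Each frame you would call Query with a box around the player (or around the view frustum) and draw only the returned blocks, rather than looping over all ten thousand.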