I've written a program that uses the Depth data from a Kinect, and does blob detection to find a user's hand. However, when using the user's hand to control the mouse, it becomes very jerky, probably because people aren't very good at holding body parts completely still.
I've tried averaging the position based on the last ten positioning's, but that just resulted in lag time without actually preventing jerkiness. The best solution so far that I've used is to not move the cursor if the pixel change is less than 10 in both directions (i.e., a 10 pixel change in either direction results in movement). This is okay, but it's still kinda jerky, and results in a clunky interface because you don't have fine precision.
How can I compensate for the lack of steadiness in the human form so that the mouse isn't so jerky?
This will in any case be a tradeoff between lag and stability.
Check your data. You may find that the jerking is because of low resolution in Kinect. If so the jerking distance will be determined at how close you are to the Kinect cameras. When you are too far away the camera resolution is too low and it will keep bouncing between one or two pixels (stereo cams).
You are thinking in the right direction by calculating average and having a threshold for movement. You say you have calculated average for the last 10 positions, which with a resolution of 30 fps causes a 0,33 second delay.
You may want to average out only the 5 last (experiment), and instead of average calculate the mean value.
Just a thought; movement rarely comes alone, so you could set a threshold for when you decrease the number of samples used for averaging/mean.
What is your sample rate? 10 positions is likely to be just a hundredth of a second. You may want to average the last 10th or 3rd of a second.
Did you try to apply a median filter to the depth map before doing your blob detection? I used that in a finger tracking demo and it greatly improved the steadiness.
A bandwidth between 3 and 5 gave me the best results (5 kills a bit the fps but it's really smooth).
Related
I have 9 stereo camera rigs that are essentially identical. I am calibrating them all with the same methodology:
Capture 25 images of an 8x11 chessboard (the same one for all rigs) in varying positions and orientations
Detect the corners for all images using FindChessboardCorners and refine them using CornerSubPix
Calibrate each camera intrinsics individually using CalibrateCamera
Calibrate the extrinsics using StereoCalibrate passing the CameraMatrix and DistortionCoeffs from #3 and using the FixIntrinsics flag
Compute the rectification transformations using StereoRectify
Then, with a projector using structured light, I place a sphere (the same one for all rigs) of known radius (16 mm) in front of the rigs and measure the sphere using:
Use image processing to match a large number of features between the two cameras in the distorted images
Use UndistortPoints to get their undistorted image locations
Use TriangulatePoints to get the points in homogeneous coordinates
Use ConvertFromHomogeneous to get the points in world coordinates
On two of the rigs, the sphere measurement comes out highly accurate (RMSE 0.034 mm). However, on the other seven rigs, the measurement comes out with an unacceptable RMSE 0.15 mm (5x). Also, the inaccuracy of each of the measurements seems to be skewed vertically. Its as if the sphere is measured "spherical" in the horizontal direction, but slightly skewed vertically with a peak pointing slightly downward.
I have picked my methodology apart for a few weeks and tried almost every variation I can think of. However, after recalibrating the devices multiple times and recapturing sphere measurements multiple times, the same two devices remain spot-on and the other seven devices keep giving the exact same error. Nothing about the calibration results of the 7 incorrect rigs stands out as "erroneous" in comparison to the results of the 2 good rigs other than the sphere measurement. Also, I cannot find anything about the rigs that are significantly different hardware-wise.
I am pulling my hair out at this point and am turning to this fine community to see if anyone notices anything I'm missing in my above described calibration procedure. I've tried every variation I can think of in each step of the above process. However, the process seems valid since it works for 2 of the 9 devices.
Thank you!
In one of my university courses (in Data-structures and Algorithmics), we are given a bonus assignment based on the game Sokoban:
With one Major exception: We only have one crate to push to our goal.
Example input
8 8
MMMMMMMM
M.....?M
M....TTM
M....TTM
M..!...M
M....+.M
M......M
MMMMMMMM
Here the first line gives the dimensions (b x h) of the board (8 by 8 in this case). This is followed up by h lines oh b characters. The meaning of these characters is as follows: . A walkable space, ? the goal (red point in the gif), ! the crate, and + is our position.
We are asked to output the shortest solution to the puzzle. (Note that a puzzle might be unsolveable.) We output this in 2 lines, the first tells us how many moves, and the second tells us the correct path. For the example, this would be:
Example Output
10
WWNNNWNEEE
Now, finding an algorithm that works isn't really an issue. Seeing as we're looking for the shortest path, and the nodes on this specific graph are in essence unweighted, I've implemented a breadth first search. In broad strokes, my current implementation looks like this:
0. Since the maze doesn't change, describe each state as a whole number based on the coordinates
of the crate and the player. - This defines a state uniquely and reduces memory costs.
1. Create a dictionary of visited states.
2. Get the input positions of the goal, crate and player.
3. Set up a Queue of move sequences.
4. Pop a move sequence from the Queue.
5. If this move sequence wins the game, go to step 8.
6. Make new move sequences which are copies of the original, each with a different legal move appended.
7. Append these new move sequences to the Queue.
8. Go to step 4
9. Print the output.
This is, of course a relatively simple algorithm. The problem is that it isn't fast enough. In one of the final test cases, we're thrown a 196 x 22 maze like "level" which has a solution that takes 2300 steps. We're asked to solve this level within 10 seconds, but it takes my algorithm more than 10 minutes.
Because of that, I'm kinda at a loss. I've already managed to increase the algorithm's speed 10 fold, and I still have 2 orders of magnitude to go...
Hence why I'm asking here: What makes this algorithm so slow, and how can I speed it up?
Yes, your comprehensive BFS search will be slow. You spend a large amount of your tree search in moves that are utterly wasted, your player thrashing around the maze area to no avail.
Change the focus of your goal: first, solve the maze for the crate rather than sending the player every which way. Include a heuristic for moving the crate closer to the goal spot. Make sure that the crate moves are possible: that there is a "push from " spot available for each move.
One initial heuristic is to make a maze fill by raw distance to the goal start at either the goal (what I've done here) and increment the steps through the maze, or start at the box and increment from there.
MMMMMMMM
M54321?M
M6543TTM
M7654TTM
M876567M <== crate is on the farther 6
M987678M <== player is on the nearer 7
Ma98789M
MMMMMMMM
Here, you would first try to find legal pushes to move the box along the path 654321?. You can also update this by making a penalty (moving the player without pushing) for any direction change.
These heuristics will give you a very good upper bound for a solution; you can then retrace decision points to try other paths, always keeping your "shortest solution" for any position.
Also keep track of where you've been, so that you don't waste time in position loops: never repeat a move (position and direction).
Does that help you get going?
Instead of using a pure dfs search of the player's movements, consider only the crate moves available to you at the time. For instance, in the very first frame of your gif, at the beginning of the simulation, the only crate move possible is the top one to the right one square.
An analogy would be for a game of chess on the first move, you would not consider any queen or bishop moves since they are all blocked by pawns.
After you've successfully found the sequence of crate moves leading to the solution, come back and trace the player moves necessary to construct the sequence of crate moves.
This improves time complexity because the time complexity will be based on the number of crates present in the map instead of total squares.
I've got two polygons defined as a list of Vectors, I've managed to write routines to transform and intersect these two polygons (seen below Frame 1). Using line-intersection I can figure out whether these collide, and have written a working Collide() function.
This is to be used in a variable step timed game, and therefore (as shown below) in Frame 1 the right polygon is not colliding, it's perfectly normal for on Frame 2 for the polygons to be right inside each other, with the right polygon having moved to the left.
My question is, what is the best way to figure out the moment of intersection? In the example, let's assume in Frame 1 the right polygon is at X = 300, Frame 2 it moved -100 and is now at 200, and that's all I know by the time Frame 2 comes about, it was at 300, now it's at 200. What I want to know is when did it actually collide, at what X value, here it was probably about 250.
I'm preferably looking for a C# source code solution to this problem.
Maybe there's a better way of approaching this for games?
I would use the separating axis theorem, as outlined here:
Metanet tutorial
Wikipedia
Then I would sweep test or use multisampling if needed.
GMan here on StackOverflow wrote a sample implementation over at gpwiki.org.
This may all be overkill for your use-case, but it handles polygons of any order. Of course, for simple bounding boxes it can be done much more efficiently through other means.
I'm no mathematician either, but one possible though crude solution would be to run a mini simulation.
Let us call the moving polygon M and the stationary polygon S (though there is no requirement for S to actually be stationary, the approach should work just the same regardless). Let us also call the two frames you have F1 for the earlier and F2 for the later, as per your diagram.
If you were to translate polygon M back towards its position in F1 in very small increments until such time that they are no longer intersecting, then you would have a location for M at which it 'just' intersects, i.e. the previous location before they stop intersecting in this simulation. The intersection in this 'just' intersecting location should be very small — small enough that you could treat it as a point. Let us call this polygon of intersection I.
To treat I as a point you could choose the vertex of it that is nearest the centre point of M in F1: that vertex has the best chance of being outside of S at time of collision. (There are lots of other possibilities for interpreting I as a point that you could experiment with too that may have better results.)
Obviously this approach has some drawbacks:
The simulation will be slower for greater speeds of M as the distance between its locations in F1 and F2 will be greater, more simulation steps will need to be run. (You could address this by having a fixed number of simulation cycles irrespective of speed of M but that would mean the accuracy of the result would be different for faster and slower moving bodies.)
The 'step' size in the simulation will have to be sufficiently small to get the accuracy you require but smaller step sizes will obviously have a larger calculation cost.
Personally, without the necessary mathematical intuition, I would go with this simple approach first and try to find a mathematical solution as an optimization later.
If you have the ability to determine whether the two polygons overlap, one idea might be to use a modified binary search to detect where the two hit. Start by subdividing the time interval in half and seeing if the two polygons intersected at the midpoint. If so, recursively search the first half of the range; if not, search the second half. If you specify some tolerance level at which you no longer care about small distances (for example, at the level of a pixel), then the runtime of this approach is O(log D / K), where D is the distance between the polygons and K is the cutoff threshold. If you know what point is going to ultimately enter the second polygon, you should be able to detect the collision very quickly this way.
Hope this helps!
For a rather generic solution, and assuming ...
no polygons are intersecting at time = 0
at least one polygon is intersecting another polygon at time = t
and you're happy to use a C# clipping library (eg Clipper)
then use a binary approach to deriving the time of intersection by...
double tInterval = t;
double tCurrent = 0;
int direction = +1;
while (tInterval > MinInterval)
{
tInterval = tInterval/2;
tCurrent += (tInterval * direction);
MovePolygons(tCurrent);
if (PolygonsIntersect)
direction = +1;
else
direction = -1;
}
Well - you may see that it's allways a point of one of the polygons that hits the side of the other first (or another point - but thats after all almost the same) - a possible solution would be to calculate the distance of the points from the other lines in the move-direction. But I think this would end beeing rather slow.
I guess normaly the distances between frames are so small that it's not importand to really know excactly where it hit first - some small intersections will not be visible and after all the things will rebound or explode anyway - don't they? :)
I am developing an application for an oscilloscope in c# .NET, I am drawing different kinds of waves (sine, square etc..) with the help of zedgraph control.
I get values from oscilloscope and stored in a buffer of size 1024(byte array) and have to calculate parameters like time period, Frequency, rise time, fall time etc at run time.
for this purpose i have to extract only a single cycle of whole signal.one more problem is that values are not always rise or fall continuously mean values are stored in buffer like this[0,0,0,1,1,2,3,4,5,5,6,6,6,5,5,4,3,2,1,1,0,0,0..........]. signals are continuously receive from machine.
it is not sure that waves are always oscillating around zero.
Thanks
Regards
Nilesh
You can estimate the frequency a number a of ways. Probably the easiest, if you have a math lib, is to compute the FFT and take the lowest frequency.
Alternatively you can check the zero crossings(around the mean value). The faster it oscillates about 0 the higher its frequency. Similarly the extrema tell you a lot about the frequency(think of a sinusoid whose extrema and zeroes alternate and are evenly spaced).
There is also a transform called the period transform but I don't remember it too much. I saw it in a book about music for finding the tempo of a song.
http://www.cs.berkeley.edu/~vazirani/s09quantum/notes/lecture4.pdf
Another way might be to use the auto-correlation and when it is large it means the function is in "sync" with itself(assuming it doesn't change shape to fast). and it should be easy to calculate the distance between these the maximums.
You could find out the time period between a crest and a trough, which will give you half the wavelength for that particular wave.
For graph 1, the first trough is 2, the first crest is 12. Find out the time taking between these points, and you have half the wavelength.
For graph two, the same principle applies, you can calculate the wavelength (and thus the period) for each section of the graph
I am working on a project with a robot that has to find its way to an object and avoid some obstacles when going to that object it has to pick up.
The problem lies in that the robot and the object the robot needs to pick up are both one pixel wide in the pathfinder. In reality they are a lot bigger. Often the A* pathfinder chooses to place the route along the edges of the obstacles, sometimes making it collide with them, which we do not wish to have to do.
I have tried to add some more non-walkable fields to the obstacles, but it does not always work out very well. It still collides with the obstacles, also adding too many points where it is not allowed to walk, results in that there is no path it can run on.
Do you have any suggestions on what to do about this problem?
Edit:
So I did as Justin L suggested by adding a lot of cost around the obstacles which results in the folling:
Grid with no path http://sogaard.us/uploades/1_grid_no_path.png
Here you can see the cost around the obstacles, initially the middle two obstacles should look just like the ones in the corners, but after running our pathfinder it seems like the costs are overridden:
Grid with path http://sogaard.us/uploades/1_map_grid.png
Picture that shows things found on the picture http://sogaard.us/uploades/2_complete_map.png
Picture above shows what things are found on the picture.
Path found http://sogaard.us/uploades/3_path.png
This is the path found which as our problem also was before is hugging the obstacle.
The grid from before with the path on http://sogaard.us/uploades/4_mg_path.png
And another picture with the cost map with the path on.
So what I find strange is why the A* pathfinder is overriding these field costs, which are VERY high.
Would it be when it evaluates the nodes inside the open list with the current field to see whether the current fields path is shorter than the one inside the open list?
And here is the code I am using for the pathfinder:
Pathfinder.cs: http://pastebin.org/343774
Field.cs and Grid.cs: http://pastebin.org/343775
Have you considered adding a gradient cost to pixels near objects?
Perhaps one as simple as a linear gradient:
C = -mx + b
Where x is the distance to the nearest object, b is the cost right outside the boundary, and m is the rate at which the cost dies off. Of course, if C is negative, it should be set to 0.
Perhaps a simple hyperbolic decay
C = b/x
where b is the desired cost right outside the boundary, again. Have a cut-off to 0 once it reaches a certain low point.
Alternatively, you could use exponential decay
C = k e^(-hx)
Where k is a scaling constant, and h is the rate of decay. Again, having a cut-off is smart.
Second suggestion
I've never applied A* to a pixel-mapped map; nearly always, tiles.
You could try massively decreasing the "resolution" of your tiles? Maybe one tile per ten-by-ten or twenty-by-twenty set of pixels; the tile's cost being the highest cost of a pixel in the tile.
Also, you could try de-valuing the shortest-distance heuristic you are using for A*.
You might try to enlarge the obstacles taking size of the robot into account. You could round the corners of the obstacles to address the blocking problem. Then the gaps that are filled are too small for the robot to squeeze through anyway.
I've done one such physical robot. My solution was to move one step backward whenever there is a left and right turn to do.
The red line is as I understand your problem. The Black line is what I did to resolve the issue. The robot can move straight backward for a step then turn right.