I am trying to create a simple scene where a few objects are placed on a table. Object placement works perfectly, but when I move the device, the objects drift around a bit, which at some point makes the objects placed at the corner feel like they are not on the table but floating in the air.
Even in the sun, moon and earth example from the Unity examples here: https://github.com/googlesamples/tango-examples-unity
The earth and moon drift as you move the device.
Is this a bug or is there any special setting which I'm missing?
The objects drift because as the Tango device moves through space, it is only tracking its own position in 3D space. For objects to remain static in a dynamic environment, the device needs to understand the position of the placed objects in 3D space and their relation to the surroundings in order to anchor the objects and reduce drift.
Luckily, TangoCore has you covered here: the three core technologies of Motion Tracking, Depth Perception and Area Learning all work together to help out.
If I'm not mistaken, the Sun and Moon example is the scene "SimpleAugmentedReality" under tango-examples-unity / UnityExamples / Assets / TangoSDK / Examples / Scenes /
However, if you would like to anchor the objects in 3D space and reduce drift, you'll need to use Area Learning and Depth Perception as well. Area Learning performs loop closures as the device realises it has "seen" an area before, and adjusts the path and markers to give a more accurate position for both the device and the augmented content.
So here is what you can do to learn what you need. Save your current scene, go to Open Scene, follow the path tango-examples-unity / UnityExamples / Assets / TangoSDK / Examples / Scenes /, and load up some of the other scenes to get an understanding of how the technologies intertwine.
For example, you could load up the ExperimentalMeshBuilderWithColour scene to learn how depth processing works programmatically, then load the MotionTracking scene to learn how to access and use Motion Tracking from the TangoManager game object. Finally (and probably most frustratingly difficult), learn how Area Learning is managed with the AreaDescriptionManagement and AreaLearning scenes.
This will not only solve your drift issues, but also give you a much fuller understanding of the capabilities of the Tango technology and allow you to express your ideas much more easily.
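To get you started on the Motion Tracking side, here is a rough sketch of registering for pose updates. The type and method names are recalled from the now-deprecated Tango Unity SDK, so check them against the example scenes in the repo above:

```csharp
using UnityEngine;
using Tango;

// Hedged sketch: register with the TangoApplication (on the Tango Manager
// object) and log each pose update. This is the same kind of callback the
// example scenes use to drive the camera.
public class PoseLogger : MonoBehaviour, ITangoPose
{
    private TangoApplication tangoApplication;

    void Start()
    {
        tangoApplication = FindObjectOfType<TangoApplication>();
        if (tangoApplication != null)
        {
            tangoApplication.Register(this);   // subscribe to pose callbacks
        }
    }

    // Called by the SDK whenever a new device pose is available.
    public void OnTangoPoseAvailable(TangoPoseData poseData)
    {
        Vector3 position = new Vector3(
            (float)poseData.translation[0],
            (float)poseData.translation[1],
            (float)poseData.translation[2]);

        Debug.Log("Device position: " + position);
    }
}
```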
OK, here is what I'm trying to achieve: when the user's feet collide with a grass mesh, the mesh stays secured to the ground at the base but the tops move away, then bounce back when the foot is no longer there:
This example, and everything else I've found, uses billboarding/2D grass. I want 3D, like these:
https://assetstore.unity.com/packages/3d/vegetation/plants/grass-toon-76674
where some sort of collision is occurring. I've also tried experimenting with Unity's trees. I'm trying to find the most CPU/GPU-efficient way to do this (for an Oculus or mobile app).
Is this possible - to have interactive 3D (not 2D) grass?
The only thing I can think of is to model each blade of grass as several bars connected by universal joints (I'm not sure what kind they would be in Unity, but that is the mechanism's term).
Each joint could have a torque proportional to its angle, so the more you bend it, the harder it tries to flick back. Then it's just a matter of adding forces to the blade structures as you bump into them.
This is more of an engineering solution than a programming one, and would require a little bit of math, so it's probably not ideal, but it should work.
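Here is a minimal sketch of that idea using Unity's built-in HingeJoint springs in place of true universal joints (segment counts, masses and spring values are untuned placeholders):

```csharp
using UnityEngine;

// Rough sketch: one grass blade built from a few stacked segments joined by
// spring-loaded hinges, so the top bends away when something pushes it and
// springs back to upright afterwards.
public class GrassBlade : MonoBehaviour
{
    public int segments = 3;
    public float segmentHeight = 0.2f;
    public float springStrength = 10f;   // how hard each joint flicks back
    public float springDamper = 0.5f;    // how quickly the wobble dies down

    void Start()
    {
        Rigidbody previous = null;

        for (int i = 0; i < segments; i++)
        {
            GameObject seg = GameObject.CreatePrimitive(PrimitiveType.Cube);
            seg.transform.SetParent(transform);
            seg.transform.localScale = new Vector3(0.02f, segmentHeight, 0.02f);
            seg.transform.localPosition = new Vector3(0f, (i + 0.5f) * segmentHeight, 0f);

            Rigidbody body = seg.AddComponent<Rigidbody>();
            body.mass = 0.01f;
            body.useGravity = false;     // let the springs, not gravity, set the rest pose

            HingeJoint joint = seg.AddComponent<HingeJoint>();
            joint.axis = Vector3.right;                      // bend forward/back only
            joint.anchor = new Vector3(0f, -0.5f, 0f);       // pivot at the segment's base
            joint.useSpring = true;
            joint.spring = new JointSpring
            {
                spring = springStrength,
                damper = springDamper,
                targetPosition = 0f                          // rest angle: upright
            };

            // The bottom segment is anchored to the world; the rest chain upward.
            joint.connectedBody = previous;
            previous = body;
        }
    }
}
```

For an Oculus or mobile target you would only want to do this for the handful of blades near the player's feet and keep the rest as static meshes, since every extra joint adds physics cost.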
I am working on a Unity Leap Motion desktop project using the latest versions of Orion and the Unity prefabs package. I have created a simple scene pretty much identical to the one in the Desktop Demo scene (a pair of capsule hands and a simple object you can poke and move around).
https://developer.leapmotion.com/documentation/csharp/devguide/Leap_Coordinate_Mapping.html
This article covers all of the issues I am currently facing, but so far I have been unable to implement its solutions in my project.
When I move my hands across the camera's maximum range (for example, left to right or in any other direction), the motion only translates to a portion of the available screen space; in other words, a user will never be able to reach the edges of the screen with their hands. From my understanding, the tracking info provided in millimetres by the camera is somehow translated into units that Unity can understand and process. I want to change that scale.
From the article, "You also have to decide how to scale the Leap Motion coordinates to suit your application (i.e. how many pixels per millimetre in a 2D application). The greater the scale factor, the more affect a small physical movement will have." - This is exactly what I want to do in Unity.
Additionally, even after what I think was a successful attempt at normalising the coordinates using the InteractionBox, I am unsure what to do with the results. How, or rather where, do I pass these values so that the hands are displayed at the updated position?
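For reference, this is roughly the kind of mapping I have in mind. It is a rough sketch against the classic C# API from the linked docs (Controller, Frame, InteractionBox); the handAnchor and playAreaSize names are just placeholders, and you may need to adapt it to however your version of the Unity assets exposes frames:

```csharp
using UnityEngine;
using Leap;

// Sketch: normalize the palm position with the InteractionBox, then re-scale
// the 0..1 result into whatever Unity-world range the hands should cover.
// Increasing playAreaSize is the "pixels per millimetre" scale-factor idea
// from the docs, applied to Unity units instead of pixels.
public class ScaledHandFollower : MonoBehaviour
{
    public Transform handAnchor;                            // object that should follow the hand
    public Vector3 playAreaSize = new Vector3(4f, 3f, 2f);  // Unity units covered edge to edge

    private Controller controller;

    void Start()
    {
        controller = new Controller();
    }

    void Update()
    {
        Frame frame = controller.Frame();
        if (frame.Hands.Count == 0)
            return;

        Hand hand = frame.Hands[0];

        // 0..1 in each axis, clamped to the InteractionBox.
        Vector normalized = frame.InteractionBox.NormalizePoint(hand.PalmPosition, true);

        // Re-centre around 0 and scale: a small physical movement now sweeps a
        // larger portion of the scene, so the user can reach the screen edges.
        Vector3 scaled = new Vector3(
            (normalized.x - 0.5f) * playAreaSize.x,
            (normalized.y - 0.5f) * playAreaSize.y,
            (normalized.z - 0.5f) * playAreaSize.z);

        handAnchor.position = transform.position + scaled;
    }
}
```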
I am trying to make an app in C# that will detect/classify 3 poses of the human body: standing, sitting and lying. I can correctly detect/classify two of them (sitting and standing) with skeleton tracking. When it comes to lying on the floor, the Kinect does not seem to be able to track the skeleton of a human body.
Does anyone have any experience with skeleton tracking in a lying position? As soon as I lie down, I lose the joint positions. Is this task impossible? Thank you.
The Windows SDK for Kinect V1 is not great at recognizing people lying down.
V2 improves a lot on this; I would recommend participating in the beta or waiting for V2.
One possible solution with V1 is to place a second camera vertically near the floor and use seated mode, so it detects based on motion. If you find a person with that camera and not with the other one, then the person is likely lying down.
I have not tested this solution, but in theory it can work - try to set it up yourself and see if it fits your scenario needs.
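Here is a rough, untested sketch of the floor-facing sensor side of that idea (Kinect for Windows SDK v1, polling model). Comparing its result with the upright sensor is left out, since running skeleton tracking on two sensors at once has its own SDK and USB-bandwidth caveats:

```csharp
using System.Linq;
using Microsoft.Kinect;   // Kinect for Windows SDK v1

// Sketch: run the skeleton stream in seated mode (upper-body joints only) on
// the floor-facing sensor and report whether anyone is currently tracked.
class FloorSensorCheck
{
    static void Main()
    {
        KinectSensor floorSensor = KinectSensor.KinectSensors
            .First(s => s.Status == KinectStatus.Connected);

        floorSensor.SkeletonStream.Enable();
        floorSensor.SkeletonStream.TrackingMode = SkeletonTrackingMode.Seated;
        floorSensor.Start();

        while (true)
        {
            using (SkeletonFrame frame = floorSensor.SkeletonStream.OpenNextFrame(100))
            {
                if (frame == null) continue;

                var skeletons = new Skeleton[frame.SkeletonArrayLength];
                frame.CopySkeletonDataTo(skeletons);

                bool tracked = skeletons.Any(s => s.TrackingState == SkeletonTrackingState.Tracked);
                System.Console.WriteLine(tracked
                    ? "Floor camera sees someone (possibly lying down)"
                    : "Floor camera sees no one");
            }
        }
    }
}
```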
I want to develop a "People Counting System" using OpenCV (or Emgu CV).
Please guide me on how to implement or lead me to some examples or open source projects.
(I have done some work: extracting the diff and then thresholding to remove the background, using motion history and the like; still no good results.)
Edit 1: I am counting a high flow of people (a dozen may come through simultaneously).
Edit 2: It must be at least 80% accurate. People are walking through a door that is almost 5 meters wide. The problem is that I have no control over the position or angle of the camera. The camera is shooting the area from a distance of 10 m at a height of 2.5 m.
Thank you
If by a people counting system you mean a system that counts the people in a room, then I recommend implementing the hardware with a microcontroller, two lasers (normal laser toys work) and two photoresistors. For the microcontroller I recommend an Arduino. Then make a C# application that has a SerialPort object and reads the data the Arduino sends over USB. The Arduino would send, for example, 1 for "someone entered the room" and 0 for "someone left the room". The logging and statistics can then be done easily in C#.
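A rough sketch of the C# side could look like this (the port name and baud rate are assumptions, and the Arduino is assumed to send the ASCII characters '1' and '0' as described above):

```csharp
using System;
using System.IO.Ports;

// Sketch: read single characters from the Arduino over a serial port and
// keep a running count of people in the room.
class PeopleCounter
{
    static void Main()
    {
        int peopleInRoom = 0;

        using (var port = new SerialPort("COM3", 9600))
        {
            port.Open();

            while (true)
            {
                int b = port.ReadByte();          // blocks until the Arduino sends a byte

                if (b == '1') peopleInRoom++;     // someone entered
                else if (b == '0' && peopleInRoom > 0) peopleInRoom--;  // someone left

                Console.WriteLine("People in room: " + peopleInRoom);
            }
        }
    }
}
```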
Arduino site: here
Photoresistor for $1: here
This solution is a lot cheaper and easier to implement than using a camera of fairly good quality.
Hope I helped you.
Check out the HOG pedestrian detector that comes with recent versions of OpenCV (>= 2.2).
See modules/objdetect/src/hog.cpp and samples/cpp/peopledetect.cpp in the OpenCV sources. Unfortunately there is no official documentation about it yet.
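As a rough sketch, the Emgu CV wrapper exposes the same detector roughly like this (method names follow the Emgu 2.x-era API and may differ in your version; the file name is just a placeholder):

```csharp
using System.Drawing;
using Emgu.CV;
using Emgu.CV.Structure;

// Sketch: run OpenCV's default HOG people detector on a single frame and
// count the detections.
class HogPeopleCount
{
    static void Main()
    {
        var hog = new HOGDescriptor();
        hog.SetSVMDetector(HOGDescriptor.GetDefaultPeopleDetector());

        using (var frame = new Image<Bgr, byte>("frame.jpg"))
        {
            Rectangle[] people = hog.DetectMultiScale(frame);
            System.Console.WriteLine("Detected " + people.Length + " people");
        }
    }
}
```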
This would help you to count moving things including people: Motion Detection project on CodeProject
Are people the only kind of "entities" in the scene? If not, do you mind counting some other kind of moving thing as a person? Because if people are the only moving things, you could just count the blobs that come into or leave the scene. It may sound a bit naive, but I would take some kind of motion image and group the motion pixels into clusters by distance. Your distance metric could take some constraints into account, such as the fact that people will "often" be standing, so the pixels in a cluster should group around some kind of regression line (a straight vertical line if the camera is aligned with the floor). It shouldn't be necessary to track them through the scene, just to notice when they enter or leave, though you'd get some issues with, for example, people entering the scene on their own and leaving in pairs or groups... Good luck :)
I think if you have a dense crowd with a lot of occlusions, you have to use some machine learning algorithm; for example, you could use an Implicit Shape Model for the features.
It really depends on the position of the camera. Assuming that you can get front facing profiles of the people in the images:
This problem is basically face detection and recognition.
There are many ways to go about finding faces, but this is the approach that I'm a little more familiar with.
For the face detection you need to do image segmentation on skin-tone colour. This will extract the skin regions [arms, the chest (for those wearing V-cut tops), face, legs, etc.]. Then you need to line up the profiles of the skin regions with the profiles of your trained faces.
[You'll need to use Eigenfaces to create a generic profile of what a face looks like]
If the skin region lines up and doesn't deviate too far from the profile, then it is considered a face. Once the face is confirmed, add it to the Eigenfaces data store [for recognition]. To save processing, you might want to limit the search area if you are looking for a previously seen face [given the frame rate and the last time the person was seen].
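As a rough illustration of the skin-tone segmentation step, something like the following (Emgu CV, with generic YCrCb threshold values that you would need to tune for your footage):

```csharp
using Emgu.CV;
using Emgu.CV.Structure;

// Sketch: extract a binary skin mask by thresholding in YCrCb space.
// Skin tones cluster reasonably well in the Cr/Cb plane, which makes the
// threshold less sensitive to overall brightness than plain RGB.
class SkinSegmentation
{
    public static Image<Gray, byte> ExtractSkinMask(Image<Bgr, byte> frame)
    {
        Image<Ycc, byte> ycrcb = frame.Convert<Ycc, byte>();

        Image<Gray, byte> mask = ycrcb.InRange(
            new Ycc(0, 133, 77),      // lower Y, Cr, Cb bound (illustrative values)
            new Ycc(255, 173, 127));  // upper bound

        // Morphological clean-up so the remaining blobs look like skin regions.
        return mask.Erode(2).Dilate(2);
    }
}
```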
If you are referring to "Crowd flow" I think you just mean the density of faces in a crowd.
Now that you've confirmed a moving object in the video is a person, you just need to note that and make sure you don't count them as a new person again.
This approach really depends on your ability to detect face regions. It may not work if the people in the video are looking down, don't fit the profile of the trained data, etc. It may also be affected if a person puts on sunglasses during the video [they would probably be considered a "new face"].
I've been doing some Johnny Chung Lee-style Wiimote programming and am running into problems with the Wiimote's relatively narrow field of view and its limit of four points. I've bought a Creative Live! camera with an 85-degree field of view and a high resolution.
My prototype application is written in C#, and I'd like to stay there.
So, my question: I'd like to find a C#.Net camera / vision library that lets me track points - probably LEDs - in the camera's field of view. In the future, I'd like to move to R/G/B point tracking so as to allow more points to be tracked and distinguished more easily. Any suggestions?
You could check out the Emgu.CV library which is a .NET (C#) wrapper for OpenCV. OpenCV is considered by many, including myself, to be the best (free) computer vision library.
Check out AForge.NET. It seems to be a powerful library.
With a normal camera, the task of identifying and tracking LEDs is considerably more challenging because of all the other objects that are visible.
I suggest that you try to maximise the contrast by reducing the exposure (thus turning off auto-exposure), if that's possible in the driver: you should aim for a value where your LEDs still have a high intensity in the image (>200) while not being overexposed (<255). You should then be able to threshold your image correctly and get higher-quality results.
If the image is still too cluttered to be analysed easily and efficiently, you can use infrared LEDs, remove the IR-block filter on the camera (if your camera has one), and maybe add an "infrared pass / visible light blocking" filter: you should then have bright spots only where the LEDs are, but you will not be able to use colour. There may be issues with image quality, though.
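As a rough sketch of the threshold-and-find-bright-spots step, using the AForge.NET library mentioned in another answer (the threshold of 200 matches the intensity range above but will need tuning):

```csharp
using System;
using System.Drawing;
using AForge.Imaging;
using AForge.Imaging.Filters;

// Sketch: keep only very bright pixels, group them into blobs, and report
// each blob's centre as a candidate LED position.
class LedTracker
{
    public static void FindLeds(Bitmap frame)
    {
        // Convert to 8bpp grayscale, then keep only the very bright pixels.
        Bitmap gray = Grayscale.CommonAlgorithms.BT709.Apply(frame);
        new Threshold(200).ApplyInPlace(gray);

        // Group the remaining pixels into blobs and ignore single-pixel noise.
        var blobCounter = new BlobCounter
        {
            FilterBlobs = true,
            MinWidth = 2,
            MinHeight = 2
        };
        blobCounter.ProcessImage(gray);

        foreach (Blob blob in blobCounter.GetObjectsInformation())
        {
            AForge.Point center = blob.CenterOfGravity;
            Console.WriteLine("Bright spot at " + center.X + ", " + center.Y);
        }
    }
}
```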
When tracking things like lights, especially if they are a distinct colour, I recommend applying a blur filter to the footage first. This blends the colours out nicely and, while less accurate, it uses less CPU and there are fewer threshold adjustments you have to make.