In my application, I want to find, in real time, the exact coordinates of objects detected by a laser scanner mounted on a moving vehicle. So far I have found the local minima of the points in the graph, but this gives all local minima, including unwanted ones such as the one marked 2 in the figure. I want only the main object locations, like the one marked 1 in the figure.
I tried the following methods in C# after searching Google and Stack Overflow.
I applied a moving average to the curve and then found the local minima. The result is okay, but since this is real time I worry that the smoothing may take too much processing time.
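A minimal sketch of what I mean by this approach; the window size and depth threshold are placeholder values that would need tuning against real scan data:

```csharp
using System;
using System.Collections.Generic;

// Smooth the scan with a running-sum moving average (O(n), so cheap even
// at a 50 Hz scan rate), then keep only minima that dip at least 'depth'
// below their neighbourhood, to reject the shallow unwanted dips.
static List<int> FindObjectMinima(double[] ranges, int window = 5, double depth = 0.3)
{
    var smooth = new double[ranges.Length];
    var q = new Queue<double>();
    double sum = 0;
    for (int i = 0; i < ranges.Length; i++)
    {
        q.Enqueue(ranges[i]);
        sum += ranges[i];
        if (q.Count > window) sum -= q.Dequeue();
        smooth[i] = sum / q.Count;
    }

    var minima = new List<int>();
    for (int i = window; i < smooth.Length - window; i++)
        if (smooth[i] < smooth[i - 1] && smooth[i] <= smooth[i + 1]
            && smooth[i - window] - smooth[i] > depth    // deep relative to the left...
            && smooth[i + window] - smooth[i] > depth)   // ...and the right neighbourhood
            minima.Add(i);
    return minima;
}
```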
I also tried computing the slopes at different points on the curve and marking the positions with the maximum and minimum slopes. It works, but it does not find the correct position exactly.
I tried marking the points that satisfy both criteria, i.e. local minima having high slopes, but it is not working as intended.
The last option I have is to take a reference from the first scan and subtract the subsequent object graphs from that reference. Then I can compare the subtracted range against the local minima to find the exact position, i.e. the part marked 1 and the black curve at the bottom.
The scanning frequency is 50 Hz, so if the moving average does not take too much time I will go with the first option. The final algorithm will be coded in C++; I am trying different things in C# since it is easier to view and analyse the graphs.
I have finally found a solution: foreground segmentation followed by blob detection. I referred to this: http://www.v2.nl/lab/projects/laser-measurement-system-object-for-max
I am trying to identify changes on an object. To do this, I take a picture before and after the object is used. At the moment I am working with the absolute difference of the two pictures and taking the contours of the resulting difference image. That works fine as long as the object is positioned and captured exactly as in the before image; even small differences in its position make my method useless.
Does anybody have a different approach using OpenCV or EmguCV? I was thinking of checking whether one of the neighbouring pixels is identical, in which case no change should be detected, but I don't know of an existing performant algorithm.
Example images (the pictures don't match my use case, but they should help illustrate my problem):
Before
After
Yes, there are many ways to do this. I like the following:
Histogram match. Compute a histogram before and after and check for differences. This is sensitive to changes in lighting, but it is a very good method in a controlled lighting setting.
Correlation match. If you use MatchTemplate you can get the "quality" of the match. This can be made less sensitive to lighting, but it is sensitive to rotation differences between the two images.
Try to implement some and let’s see your code.
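As a starting point, here is a minimal sketch of the histogram match in plain C#; the normalized L1 distance and the 0.1 threshold are assumptions you would tune for your lighting:

```csharp
using System;
using System.Drawing;

static class HistogramMatch
{
    // 256-bin grayscale histogram, normalized so image size doesn't matter.
    static double[] GrayHistogram(Bitmap bmp)
    {
        var hist = new double[256];
        for (int y = 0; y < bmp.Height; y++)
            for (int x = 0; x < bmp.Width; x++)
            {
                Color c = bmp.GetPixel(x, y);   // slow but simple; use LockBits in production
                hist[(c.R + c.G + c.B) / 3]++;
            }
        double total = (double)bmp.Width * bmp.Height;
        for (int i = 0; i < 256; i++) hist[i] /= total;
        return hist;
    }

    // Returns true if the histograms differ by more than the threshold.
    public static bool ImagesDiffer(Bitmap before, Bitmap after, double threshold = 0.1)
    {
        double[] h1 = GrayHistogram(before), h2 = GrayHistogram(after);
        double dist = 0;
        for (int i = 0; i < 256; i++) dist += Math.Abs(h1[i] - h2[i]);
        return dist > threshold;
    }
}
```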
Recently I've been messing around with machine learning and I wanted to see if I could create AI for the game I'm currently making. The AI should be able to solve the puzzle for you.
The game currently works as follows. You have a few tiles in a grid; some of them are movable, some aren't. You click on a tile you want to move and drag it in a direction. It then starts moving the tiles and optionally also the player character itself. The end goal is to reach the end tile. Level example, Solving the level
Playing the game yourself:
Whenever you select a tile (you do this by clicking), you hold the mouse button down and drag in the direction you want the tile to move. Once the tiles are done moving, the player object moves one step in the same direction. If the player is on top of a tile that you move, it moves with the tile, and afterwards takes another step in the same direction.
I was wondering if it's possible (and if so, how) for machine learning to define a position on the screen, (optionally) click and then define a movement direction?
Please keep in mind that I'm fairly new to machine learning!
To give some more clarification:
The grid is static for now, to keep it simple for the AI. But later on, the goal is to generate a level randomly and see if it can solve it.
In theory, all the AI should have to do, is select a tile to move (A number between 0 and the width of the grid, and the same for the height). And define a movement direction. Either (0, 1), (0, -1), (1, 0) or (-1, 0).
Falling off the grid results in a reset.
Reaching the end of the grid results in a win.
Moving in an invalid direction results in a reset.
Based on your bullet points, I would honestly suggest just implementing the A* pathfinding algorithm, with some modifications to emulate machine learning. The A* algorithm determines the best path on a grid from point A to point B, and with some clever programming you could achieve the result you want with a reasonable amount of overhead.
Something along the lines of having a list of "do not touch" grid points (death traps, etc.), which gets filled as the AI runs into them, so on the next iteration it knows not to take that path. This is a very basic abstraction of your idea, but it would be quite attainable.
Obviously we cannot write the full code for you; luckily, there are tons of resources on A* pathfinding to help you get started (and a minimal sketch follows the links below)!
Here is a simple tutorial
Here is an implementation that was used in Unity
Here is a code review of someone's implementation
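To make the idea concrete, here is a minimal A* sketch on a 4-connected grid. The representation (0 = walkable, 1 = blocked, including the "do not touch" cells) is an assumption for illustration, and it uses .NET 6's PriorityQueue:

```csharp
using System;
using System.Collections.Generic;

static class AStar
{
    static readonly (int dx, int dy)[] Dirs = { (1, 0), (-1, 0), (0, 1), (0, -1) };

    // Returns the path from start to goal as a list of cells, or null if none exists.
    public static List<(int x, int y)> FindPath(int[,] grid, (int x, int y) start, (int x, int y) goal)
    {
        int w = grid.GetLength(0), h = grid.GetLength(1);
        var open = new PriorityQueue<(int x, int y), int>();
        var gScore = new Dictionary<(int, int), int> { [start] = 0 };
        var cameFrom = new Dictionary<(int, int), (int, int)>();
        open.Enqueue(start, 0);

        while (open.Count > 0)
        {
            var cur = open.Dequeue();
            if (cur == goal)
            {
                var path = new List<(int x, int y)> { cur };
                while (cameFrom.TryGetValue(cur, out var prev)) { cur = prev; path.Add(cur); }
                path.Reverse();
                return path;
            }
            foreach (var (dx, dy) in Dirs)
            {
                var next = (x: cur.x + dx, y: cur.y + dy);
                if (next.x < 0 || next.y < 0 || next.x >= w || next.y >= h) continue;
                if (grid[next.x, next.y] == 1) continue;   // blocked / "do not touch" cell
                int g = gScore[cur] + 1;
                if (!gScore.TryGetValue(next, out int old) || g < old)
                {
                    gScore[next] = g;
                    cameFrom[next] = cur;
                    // f = g + Manhattan-distance heuristic to the goal
                    open.Enqueue(next, g + Math.Abs(goal.x - next.x) + Math.Abs(goal.y - next.y));
                }
            }
        }
        return null;
    }
}
```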
Assuming you actually want to use machine learning and not just a pathing system:
I will lay out some pseudocode that you can use for a basic scenario of the AI learning a static board. There are different ways you could write and implement this; I have only suggested one. But before we get to that, let's first discuss the project overall and some suggestions for it.
Suggestions:
I would say that you will want to measure the game state on the board, and not the mouse movements. So basically the AI is measuring what moves can be made. The mouse movement part is just a way for the player to interact with the board so it is not needed by the AI. It will be simpler to just let the AI make the moves directly.
I don't think Unity is a good platform for this kind of experimentation. You would be better off programming this in a console program, for example using a two-dimensional array (the board) in a Visual Studio C# console program, or in a C console program via the CS50 IDE (comes with free sign-up via edx.org for CS50: https://manual.cs50.net/ide). I suggest these because I think Unity will just add unnecessary layers to a machine learning experiment.
My assumption is that you want to learn machine learning, and not just how to make an AI solve a puzzle in your game. In the latter case, better options would be a proper pathing system, or having the AI brute-force several attempts at the puzzle before moving and selecting the solution with the fewest steps.
Pseudocode:
Now on to some pseudocode for your machine learning program.
Assumptions:
A. You have a board with set dimensions that you can pass to the AI at the start.
B. There are tiles on the board the AI cannot move into (obstacles).
C. The AI should learn to solve the problem, instead of having the answer at the beginning because of good code that we designed (like a decent pathing system).
D. We don't want the AI to brute-force this by trying a billion different combinations before moving, because that would imply perfect understanding of its environment. If the AI has perfect understanding of its environment, then yes, it should use brute force where reasonable.
Coding Logic:
Scenario 1: The AI plays on the same board every time with the same starting conditions.
I. You start by setting a discrete amount of time in which the AI makes a move. For example 1 move every 1 second.
II. Have a counter for the number of moves made to reach the end tile, and record the sequence of moves associated with this counter.
III. If the AI has no history with which to make a move it makes a move in a random direction.
IV. If the move is invalid then the counter increases and the move is recorded, but the AI stays on the same tile.
V. When the AI completes the puzzle, the counter and the sequence of moves are stored for later use.
VI. In subsequent playthroughs the AI always starts by selecting the tried path with the smallest count.
VII. Once the AI begins moving, it has a 1% chance per move to try something different. When the 1% is triggered, the AI has a 50% chance to try each of the following:
a. 50% chance: it checks through all the sequences in its history to see if there is any section of a past sequence where the count between its current tile and the finish tile is smaller than on its current path. If there are multiple, it selects the shortest. When the AI finishes the round, it records the new total sequence taken.
b. 50% chance: the AI makes a move in a random direction. Subsequent moves again follow this logic: 50% chance to check the history, 50% chance to move randomly again. When the puzzle is completed, it again records the sequence of moves.
VIII. You can seed this by making the AI run the puzzle 10,000 times in a few seconds behind the scenes; when you observe it afterwards, it should have selected a reasonable path.
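A minimal, self-contained C# sketch of this loop (simplified: it keeps the 1%-explore rule and the best-run replay, but leaves out the history splicing from VII.a; the 3x3 board layout is a made-up example with 0 = free, 1 = blocked, 2 = goal):

```csharp
using System;
using System.Collections.Generic;

class PuzzleLearner
{
    static readonly int[,] Board = { { 0, 0, 1 }, { 1, 0, 0 }, { 1, 1, 2 } };
    static readonly (int dx, int dy)[] Moves = { (0, 1), (0, -1), (1, 0), (-1, 0) };
    static readonly Random Rng = new Random();
    static List<(int dx, int dy)> best;                   // shortest winning sequence so far

    static void Main()
    {
        for (int episode = 0; episode < 10000; episode++) // seed runs, as in VIII
        {
            var run = PlayEpisode();
            if (best == null || run.Count < best.Count) best = run;
        }
        Console.WriteLine($"Best solution found: {best.Count} moves");
    }

    static List<(int dx, int dy)> PlayEpisode()
    {
        var run = new List<(int dx, int dy)>();
        int x = 0, y = 0;                                 // start tile
        while (Board[y, x] != 2)                          // until the goal tile is reached
        {
            // Replay the best run so far, with a 1% chance per move to explore (VII).
            var m = best != null && run.Count < best.Count && Rng.NextDouble() >= 0.01
                ? best[run.Count]
                : Moves[Rng.Next(Moves.Length)];
            run.Add(m);                                   // every move is recorded (II)
            int nx = x + m.dx, ny = y + m.dy;
            if (nx >= 0 && ny >= 0 && nx < 3 && ny < 3 && Board[ny, nx] != 1)
            {
                x = nx; y = ny;
            }
            // Invalid move: the counter still grew, but the AI stays on its tile (IV).
        }
        return run;
    }
}
```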
If a computer can brute-force a problem in reasonable time, it should start with that. However, bear in mind that machine learning in a computer program where the machine already knows all the variables is different from machine learning in the environment, where, for example, a robot has to navigate an unknown environment. The above approach should work in the latter case. You may also want to investigate the idea of the AI mapping out the entire terrain by trying to move to every tile and forming an understanding of the environment, then just brute-forcing a solution once it understands the variables.
In a non-static environment you will want to enhance the valuation system. This answer is already too long, so I won't go into it.
Short answer to both questions: yes.
You can create an AI that either uses the game state (so it can read the objects/properties of your grid) or uses raw screen input combined with image processing, which is hard to build and computationally expensive to run.
On the Unity forums there are several answers to questions like "How to mimic mouse input". Take a look here:
https://answers.unity.com/questions/564664/how-i-can-move-mouse-cursor-without-mouse-but-with.html
If you are looking for the code for the AI, sadly, you are out of luck. There are lots of AI tutorials online for creating a simple AI for such a game. I would advise not diving head-first into the fancy stuff (like neural networks) and starting simple instead. It would be best, in my opinion, to start by creating a (class) structure for your AI and to learn AI by practice. Start with an "AI" that just randomly returns something, then see what you can learn and manage online, and make other versions.
For one of your first AIs, take a look at goal-driven AIs or state machines. I think they should give nice results, given your GIFs.
I'm sorry if this question is broad, but I have not been able to find any real solutions to the problem I must solve: mapping a user's location onto an image that represents a map (like an amusement park map).
One possible solution would be to assign GPS coordinates to different parts of the image and then snap the user's location to the closest defined location.
Something else I saw was geospatial PDFs, but I couldn't find much on how to read geospatial information from a PDF.
How can I take an image that represents, let's say, a theme park and map a user's location onto it?
Short answer:
You can't, by which I mean you can't just take a regular image and snap co-ordinates to its pixels.
Long answer:
You can, but it takes a lot of work and preparation. Here are the basics of what you need to do.
STEP 1 - Georeference the image
To do this you need some GIS software, and an existing map of the area that's registered in the correct co-ordinate space.
If you have the budget, then you should consider using professional software such as Autodesk Map 3D or the ESRI suite of tools. If you don't have a budget, you can do this using free tools such as QGIS.
I'll assume QGIS for this description.
Load the existing map that you have for the area (the one that's already referenced) into your GIS package. How and where you get this map is entirely up to you; if you're lucky you might have one someone else made, or the builders of your park might have site plans you can use. Without this source map, however, you can forget any chance of matching the image to it unless you have a list of all the points you want to reference. [SIDE NOTE: It's perfectly feasible to go out with a GPS device and record your points manually, especially if the site you're mapping is not too big and you have full access to it. Since you're only referencing an image of your own, and not building anything, super-duper 1000% accuracy is not needed.]
Assuming QGIS, go up to the "Raster" menu and select the "Georeferencer" tool.
Once the tool loads, you'll be presented with a child window that allows you to load your "Un-referenced" map image into it. (The load button is marked with a red arrow)
Once you have your raster image loaded, you then need to use the already-referenced map you loaded into QGIS (sorry, no space to document this part; there are a multitude of ways, depending on what data you have) and pick points from it that match the raster, in your Georeferencer tool.
Make sure the georeferencer tool is in add point mode.
Then click on the image that you loaded into your geo-referencing tool, at the location where you want your first point.
The enter map co-ordinates box will pop open.
If you already know the location of these points (For example because you went out with a GPS, or you have some survey data) then you can simply just type them in. If not, then click the button marked "from map canvas", and the geo-reference tool will switch to the already referenced map you have loaded, and ask you to click on the same location on the referenced map.
Once you click on the referenced map, QGIS will switch back to the Georeferencer tool with the co-ordinates filled in.
At this point, you can click "OK" and the point will be registered on your un-referenced raster image as a referenced point (Small red dot)
Repeat this process, until you have as many locations as you want referenced. You don't have to do everything, but key locations, such as park entrances, corners around the main site outline, centers of prominent buildings and road junctions should be done.
The more points you reference, the more accurate the final referenced raster image will be.
Once you've finished adding the points to your image to reference it, you then need to click the yellow cog wheel, and fill in the options for the output raster, target SRS and other things that will turn this into a map.
Now, at this stage I've not mentioned a VERY, VERY, VERY important concept, and that's the "SRS", otherwise known as the "Spatial Reference System".
You may have noticed in my screenshots above, when the co-ordinates were entered in the dialog by clicking on the map, that they did not look like the usual latitude/longitude pair that a phone or GPS unit might produce.
That's because ALL of my maps are in an SRS known as "OSGB36" (or EPSG:27700), which is the local spatial reference system for the United Kingdom.
I could, if I'd wanted to, have used the standard GPS system (known as WGS84, or 'EPSG:4326'), but because I'm working only within the UK, doing that would actually have caused errors in my calculations.
If you're working with something as small as an amusement park, then for best results you NEED to find out what your local geographic co-ordinate system is; using standard GPS co-ordinates will introduce too many errors, and might even lead to incorrect locations when you finally plot your point on your image.
There's simply far too much info for me to put into an SO post, so I would strongly suggest that you grab a free copy of the EBook I've written on the subject from here:
https://www.syncfusion.com/resources/techportal/details/ebooks/gis
That will fill in a large amount of the background knowledge you need, in order to do this.
Once you've set your settings and added all your reference points, you're then ready to create your referenced raster image by simply clicking on the green triangle.
Once you get to this point and your referenced image is saved, you will have a map/image that is referenced in your local co-ordinates; it can understand a point given to it in the same co-ordinate system and know where that point should be plotted.
That however, is only the start of your journey.
STEP 2 - Build a map server
Once you have the image, you then need to host it in something called a WMS server.
Again, describing how to do this from the ground up in an SO post is simply not practical. What you need is something like GeoServer (a Java-based, easy-to-use map server system), or a bare-bones Linux system with the Apache web server and the MapServer CGI application running under it.
Once you have a map server set up, and serving maps using the WMS protocol, you can then move onto the final stage
STEP 3 - Creating your application to display the map
The final part of the equation is to build an application that displays the map from your WMS server, takes the location of the person or item you want to plot, optionally converts the co-ordinates to the local SRS that matches your image, and then plots the dot over the image in the correct location.
If you're doing this in a web/mobile application, then you'll most likely want to investigate openlayers.js or leaflet.js; if you're doing this in a C# application, then GreatMaps or SharpMap are the toolkits you want to be looking at.
Wrap up
Many folks think that plotting locations onto a map image is quite a straightforward and simple task; I can tell you now it's not.
I've built many GIS systems over the years, and even the simplest of them has taken me over 3 months.
Even a simple idea such as the one you're asking about takes a tremendous amount of planning and analysis; there is no quick way of doing this unless you simply want to host a Google Maps image on your web page and match your device co-ordinates up to that.
The second you start to produce custom maps, you automatically set yourself up for a lot of work that's going to take time and patience to get right.
Pixels in images simply don't match up to real world co-ordinates, and the truth of the matter is simple. There's a reason why mapping and GIS companies charge as much as they do to create systems like this.
References and further reading
http://www.qgistutorials.com/en/docs/georeferencing_basics.html
http://www.digital-geography.com/qgis-tutorial-i-how-to-georeference-a-map/
http://glaikit.org/2011/03/27/image-georeferencing-with-qgis/
http://geoserver.org/
http://mapserver.org/uk/index.html
http://openlayers.org/
All the best with your project, and please know this: you're in for a lot of work, but you're also going to have a lot of fun and learn heaps of new stuff. The world of GIS is by its very nature complicated, but it's also a very fascinating subject, especially when you start drawing your own maps from scratch :-)
Shawty
If your map image represents a reasonably small area, then I would think of it as a rectangle.
It is then just a matter of transforming your lat/long coordinates to (x, y) coordinates inside your image.
Lat2 |----------------------------|
| |
Lat1 |----------------------------|
Long1 Long2
Assign the real world Lat/Long coordinates to each corner of your map:
Bottom Left Corner = Lat1, Long1
Bottom Right Corner = Lat1, Long2
Upper Left Corner = Lat2, Long1
Upper Right Corner = Lat2, Long2
Given the user longitude and latitude and knowing the width and height of your image, you can calculate the transformed (x,y) coordinates over the image:
x = (UserLongitude - Long1) * ImageWidth / (Long2 - Long1)
y = (Lat2 - UserLatitude) * ImageHeight / (Lat2 - Lat1)
(Offset the user's coordinates from the map corner before scaling; y is measured from Lat2 because image y coordinates grow downward from the top.)
You should now be able to put a pin over that (x,y) position.
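A minimal C# sketch of this transform (it assumes a north-up image over a small enough area that the rectangle approximation holds):

```csharp
using System.Drawing;

static class MapProjection
{
    // lat1/long1 = bottom-left corner, lat2/long2 = top-right corner.
    public static Point LatLongToPixel(
        double userLat, double userLong,
        double lat1, double lat2, double long1, double long2,
        int imageWidth, int imageHeight)
    {
        int x = (int)((userLong - long1) * imageWidth / (long2 - long1));
        int y = (int)((lat2 - userLat) * imageHeight / (lat2 - lat1)); // image y grows downward
        return new Point(x, y);
    }
}
```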
My Scenario:
I have a camera focused at a white screen, which is taking a live feed and displaying that feed in a picture box by virtue of a FrameReceived event.
I need to kick off a process to crop the image if something is inserted between the camera and the screen.
This process needs to start when the image first changes so I need to compare one frame with another to see if anything has changed.
My Efforts
I have tried hashing the images and comparing the hashes, which doesn't work as the frames are never exactly the same.
I have tried looping through each pixel, comparing different values such as brightness, hue, etc., but this is too slow.
I have tried looping through with a subsample, but it is either too slow or too unreliable.
I even tried what I like to call the "Twisted Pair Solution", where I inverted one frame, added the two together and checked the result, but this was far too complex and slow.
My Environment
Visual Studio 2012 (2010 is available if necessary)
uEye camera
C#
The images are of type System.Drawing.Bitmap
Notes
The biggest problem seems to be that getting a reliable result takes longer than a reasonable frame rate allows, so the calculation is not finished before a new frame comes in. The variable I use to store the previous image gets overwritten before it stops being used, thread after thread builds up, and it causes a whole lotta shakin'.
I would recommend using some sort of image processing library, because the default .NET image processing tools are limited. You can use a library like AForge.NET: http://www.aforgenet.com/framework/
Then you can, for example, subtract image 1 from image 2 and sum the differences. If the sum is below a threshold (you choose one that fits your needs), the images are considered identical.
Or you can dig deeper and try this: http://thecsharper.com/?p=94
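A minimal sketch of the subtract-and-sum idea using raw pixel access (LockBits) rather than slow GetPixel calls; it assumes both frames are the same size and 24bpp, requires compiling with /unsafe, and the step and threshold values are assumptions to tune against your camera noise:

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;

static class FrameCompare
{
    public static bool FrameChanged(Bitmap prev, Bitmap curr, int step = 4, long threshold = 500_000)
    {
        var rect = new Rectangle(0, 0, prev.Width, prev.Height);
        BitmapData d1 = prev.LockBits(rect, ImageLockMode.ReadOnly, PixelFormat.Format24bppRgb);
        BitmapData d2 = curr.LockBits(rect, ImageLockMode.ReadOnly, PixelFormat.Format24bppRgb);
        long sum = 0;
        unsafe
        {
            for (int y = 0; y < rect.Height; y += step)             // subsample rows
            {
                byte* p1 = (byte*)d1.Scan0 + y * d1.Stride;
                byte* p2 = (byte*)d2.Scan0 + y * d2.Stride;
                for (int x = 0; x < rect.Width * 3; x += 3 * step)  // subsample columns, 3 bytes/pixel
                    sum += Math.Abs(p1[x] - p2[x]);                 // one channel per sampled pixel
            }
        }
        prev.UnlockBits(d1);
        curr.UnlockBits(d2);
        return sum > threshold;
    }
}
```

To avoid the frame-overwrite problem described in the question, compare against a Clone() of the previous frame and skip incoming frames while a comparison is still running, rather than spawning a new thread per frame.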
I want to develop a "People Counting System" using OpenCV (or Emgu CV).
Please guide me on how to implement or lead me to some examples or open source projects.
(I have done some work: extracting the diff and then thresholding to remove the background, using motion history and the like; still no good results.)
Edit 1: I am counting a high flow of people (a dozen may come through simultaneously).
Edit 2: It must be at least 80% accurate. People walk through a door that is almost 5 meters wide. The problem is I have no control over the position or angle of the camera. The camera shoots the scene from a distance of 10 m at a height of 2.5 m.
Thank you
If by a people counting system you mean a system that counts the people currently in a room, then I recommend implementing the hardware with a microcontroller, two lasers (normal laser toys work) and two photoresistors. For the microcontroller I recommend an Arduino. Then make a C# application that has a SerialPort object and reads the data the Arduino sends over USB; the Arduino sends, for example, 1 for "someone entered the room" and 0 for "someone left the room". The logging and statistics can then be done easily in C#.
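A minimal sketch of the C# side ("COM3" and the 9600 baud rate are placeholders; match them to your Arduino sketch, which is assumed to send the characters '1' and '0' as described above):

```csharp
using System;
using System.IO.Ports;

class PeopleCounter
{
    static int count;

    static void Main()
    {
        using var port = new SerialPort("COM3", 9600);
        port.DataReceived += (sender, args) =>
        {
            foreach (char c in port.ReadExisting())
            {
                if (c == '1') count++;        // someone entered
                else if (c == '0') count--;   // someone left
                Console.WriteLine($"People in room: {count}");
            }
        };
        port.Open();
        Console.WriteLine("Listening... press Enter to quit.");
        Console.ReadLine();
    }
}
```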
Arduino site: here
Photoresistor for $1: here
This solution is a lot cheaper and easier to implement than using a camera of fairly good quality.
Hope I helped you.
Check out the HOG pedestrian detector that comes with recent versions of OpenCV (>= 2.2).
See modules/objdetect/src/hog.cpp and samples/cpp/peopledetect.cpp in the OpenCV sources. Unfortunately there is no official documentation about it yet.
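If you want to call it from C#, EmguCV (the .NET wrapper for OpenCV) exposes the same detector. A minimal sketch; the class and method names below follow recent EmguCV versions, so treat them as an assumption and check against your installed version:

```csharp
using Emgu.CV;
using Emgu.CV.Structure;

class PeopleDetect
{
    static void Main()
    {
        using var hog = new HOGDescriptor();
        // Load the built-in SVM trained for pedestrian detection.
        hog.SetSVMDetector(HOGDescriptor.GetDefaultPeopleDetector());

        using var frame = new Mat("frame.jpg");            // placeholder input image
        MCvObjectDetection[] people = hog.DetectMultiScale(frame);

        foreach (var p in people)
            CvInvoke.Rectangle(frame, p.Rect, new MCvScalar(0, 0, 255), 2);
        CvInvoke.Imwrite("out.jpg", frame);
        System.Console.WriteLine($"Detected {people.Length} people");
    }
}
```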
This would help you to count moving things including people: Motion Detection project on CodeProject
Are people the only kind of "entities" in the scene? If not, do you mind occasionally counting some other kind of thing that moves through the scene as a person? If that is acceptable, you could just count blobs that enter or leave the scene. It may sound a bit naive, but I would take some kind of motion image and group the motion pixels into clusters by distance. Your distance metric could take some restrictions into account, such as that people will "often" be standing, so the pixels in a cluster should group around some kind of regression line (a straight vertical line if the camera is aligned with the floor). It shouldn't be necessary to track them through the scene, just to notice when they enter or leave, though you'd get some issues with, for example, people entering the scene on their own and leaving in pairs or groups... Good luck :)
I think if you have a dense crowd with a lot of occlusions, you will have to use some machine learning algorithm; for example, you can use the Implicit Shape Model for features.
It really depends on the position of the camera. Assuming that you can get front facing profiles of the people in the images:
This problem is basically face detection and recognition.
There are many ways to go about finding faces, but this is the approach that I'm a little more familiar with.
For the face detection you need to do image segmentation on skin tone color. This will extract the skin regions [arms, the chest (for those wearing V-cut tops), face, legs, etc.]. Then you would need to line up the profiles of the skin regions with the profile of your trained faces.
[You'll need to use Eigenfaces to create a generic profile of what a face looks like]
If the skin region lines up and doesn't deviate too far from the profile, then it is considered a face. Once the face is confirmed, add it to the eigenfaces data store [for recognition]. To save processing, you might want to consider limiting the search area if you are looking for a previously seen face [given the frame rate and the last time the person was seen].
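A minimal sketch of the skin-tone segmentation step; the RGB thresholds below are a commonly cited heuristic, not a universal rule, so treat them as assumptions to tune for your lighting and camera:

```csharp
using System;
using System.Drawing;

static class SkinSegmentation
{
    // Returns a black/white mask where white marks likely skin pixels.
    public static Bitmap SkinMask(Bitmap src)
    {
        var mask = new Bitmap(src.Width, src.Height);
        for (int y = 0; y < src.Height; y++)
            for (int x = 0; x < src.Width; x++)
            {
                Color c = src.GetPixel(x, y);   // use LockBits for speed in production
                int max = Math.Max(c.R, Math.Max(c.G, c.B));
                int min = Math.Min(c.R, Math.Min(c.G, c.B));
                bool skin = c.R > 95 && c.G > 40 && c.B > 20
                            && max - min > 15
                            && Math.Abs(c.R - c.G) > 15
                            && c.R > c.G && c.R > c.B;
                mask.SetPixel(x, y, skin ? Color.White : Color.Black);
            }
        return mask;
    }
}
```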
If you are referring to "Crowd flow" I think you just mean the density of faces in a crowd.
Now you've confirmed that a moving object in the video is a person. Now you just need to note that and then make sure that you don't consider them as a new person again.
This approach really depends on your ability to detect face regions. It may not work if the people in the video are looking down, don't fit the profile of the trained data, etc. It may also be affected if a person puts on sunglasses within the video (they would probably be considered a "new face").