How to match SURF interest points to a database of images - c#

I am using the SURF algorithm in C# (OpenSurf) to get a list of interest points from an image. Each of these interest points contains a vector of descriptors , an x coordinate (int), an y coordinate (int), the scale (float) and the orientation (float).
Now, i want to compare the interest points from one image to a list of images in a database which also have a list of interest points, to find the most similar image. That is: [Image(I.P.)] COMPARETO [List of Images(I.P.)]. => Best match. Comparing the images on an individual basis yields unsatisfactory results.
When searching stackoverflow or other sites, the best solution i have found is to build an FLANN index while at the same time keeping track of where the interest points comes from. But before implementation, I have some questions which puzzle me:
1) When matching images based on their SURF interest points an algorithm I have found does the matching by comparing their distance (x1,y1->x2,y2) with each other and finding the image with the lowest total distance. Are the descriptors or orientation never used when comparing interest points?
2) If the descriptors are used, than how do i compare them? I can't figure out how to compare X vectors of 64 points (1 image) with Y vectors of 64 points (several images) using a indexed tree.
I would really appreciate some help. All the places I have searched or API I found, only support matching one picture to another, but not to match one picture effectively to a list of pictures.

There are multiple things here.
In order to know two images are (almost) equal, you have to find the homographic projection of the two such that the projection results in a minimal error between the projected feature locations. Brute-forcing that is possible but not efficient, so a trick is to assume that similar images tend to have the feature locations in the same spot as well (give or take a bit). For example, when stitching images, the image to stitch are usually taken only from a slightly different angle and/or location; even if not, the distances will likely grow ("proportionally") to the difference in orientation.
This means that you can - as a broad phase - select candidate images by finding k pairs of points with minimum spatial distance (the k nearest neighbors) between all pairs of images and perform homography only on these points. Only then you compare the projected point-pairwise spatial distance and sort the images by said distance; the lowest distance implies the best possible match (given the circumstances).
If I'm not mistaken, the descriptors are oriented by the strongest angle in the angle histogram. Theat means you may also decide to take the euclidean (L2) distance of the 64- or 128-dimensional feature descriptors directly to obtain the actual feature-space similarity of two given features and perform homography on the best k candidates. (You will not compare the scale in which the descriptors were found though, because that would defeat the purpose of scale invariance.)
Both options are time consuming and direcly depend on the number of images and features; in other word's: stupid idea.
Approximate Nearest Neighbors
A neat trick is to not use actual distances at all, but approximate distances instead. In other words, you want an approximate nearest neighbor algorithm, and FLANN (although not for .NET) would be one of them.
One key point here is the projection search algorithm. It works like this:
Assuming you want to compare the descriptors in 64-dimensional feature space. You generate a random 64-dimensional vector and normalize it, resulting in an arbitrary unit vector in feature space; let's call it A. Now (during indexing) you form the dot product of each descriptor against this vector. This projects each 64-d vector onto A, resulting in a single, real number a_n. (This value a_n represents the distance of the descriptor along A in relation to A's origin.)
This image I borrowed from this answer on CrossValidated regarding PCA demonstrates it visually; think about the rotation as the result of different random choices of A, where the red dots correspond to the projections (and thus, scalars a_n). The red lines show the error you make by using that approach, this is what makes the search approximate.
You will need A again for search, so you store it. You also keep track of each projected value a_n and the descriptor it came from; furthermore you align each a_n (with a link to its descriptor) in a list, sorted by a_n.
To clarify using another image from here, we're interested in the location of the projected points along the axis A:
The values a_0 .. a_3 of the 4 projected points in the image are approximately sqrt(0.5²+2²)=1.58, sqrt(0.4²+1.1²)=1.17, -0.84 and -0.95, corresponding to their distance to A's origin.
If you now want to find similar images, you do the same: Project each descriptor onto A, resulting in a scalar q (query). Now you go to the position of q in the list and take the k surrounding entries. These are your approximate nearest neighbors. Now take the feature-space distance of these k values and sort by lowest distance - the top ones are your best candidates.
Coming back to the last picture, assume the topmost point is our query. It's projection is 1.58 and it's approximate nearest neighbor (of the four projected points) is the one at 1.17. They're not really close in feature space, but given that we just compared two 64-dimensional vectors using only two values, it's not that bad either.
You see the limits there and, similar projections do not at all require the original values to be close, this will of course result in rather creative matches. To accomodate for this, you simply generate more base vectors B, C, etc. - say n of them - and keep track of a separate list for each. Take the k best matches on all of them, sort that list of k*n 64-dimensional vectors according to their euclidean distance to the query vector, perform homography on the best ones and select the one with the lowest projection error.
The neat part about this is that if you have n (random, normalized) projection axes and want to search in 64-dimensional space, you are simply multiplying each descriptor with a n x 64 matrix, resulting in n scalars.

I am pretty sure that the distance is calculated between the descriptors and not their coordinates (x,y). You can compare directly only one descriptor against another. I propose the following possible solution (surely not the optimal)
You can find for each descriptor in the query image the top-k nearest neighbors in your dataset, and later take all top-k lists and finds the most common image there.

Related

DataStructure algorithm for pathfinding in a pointcloud

I was unsure how to phrase the title. I am searching for good data structure / algorithm combination for the following problem:
I have about 20000 objects, each containing a set of values (integers) and a XYZ-Position (doubles). I want to find the object that fits the following conditions best:
1. Needs to be within a maximum distance from a given starting point
2. From a given starting point, it has to be reachable by requiring no "hopping" (i.e. "travelling" from one object with position XYZ to another object's position) with a distance greater than a given threshold
3. It has to have the maximum value (of a given position in the set of integers)
At first I thought about graph theory and pathfinding, but it does not seem to fit well. I have no distinct edges, so I would have to link every point with every other point and use the distance as a weight on the edge. This would result in a lot(!) of edges. Second problem is that pathfinding (if I am not mistaken) only takes one criteria (usually distance or costs) as search criteria. But I would need multiple criteria (distance, hop limit, int value).
Any thoughts and also any good libraries to solve this well?

Matching Polygons

I'm working on a school project and my goal is to recognize objects. I started with taking pictures, applying various filters and doing boundary tracing. Fourier descriptors are to high for me, so I started approximating polygons from my List of Points. Now I have to match those polygons, which all have the same amount of vertices and all sites have the same length. More particular, I have two polygons and now I have to calculate some scale of similarity. This process has to be translation, rotation and scale invariant.
I tried turning and scaling one in different ways and calculate the distance between each pair of vertices, but this is very slow.
I tried turning the polygon in a set of vectors and calculate each angle of the corners and compare them. But this is also a bit slow.
I found an article called Contour Analysis. But i find this a bit difficult. In this article, firstly all vectors of each set are interpreted as complex numbers, so we only have two vectors with complex compounds. Then the cosine of both vectors is calculated. But the cosine is also a complex number and the norm of it is always 1 if both vectors are the same. So how does it make sense to interpret a set of vectors as one vector. I don't understand this practice.
Are there any other ways to compare two polygons or sets of vectors? Or can someone explain my 3rd try or do it with normal vectors?
I hope someone can help me out :-)
If your objects are well separated, you can characterize every contour using Hu's moments.
Description and basic math of image moments is rather simple and would be suitable for school project.

Efficiently find Delaunay Triangulation face, which a given point lies on

Given a Delaunay Triangulation of a point set, how should I index my triangulation to do quick point localization?
I'm currently looping over all the triangles. For each triangle, I'm checking if the given point is within triangle's bounding rectangle. If it is, I then check the triangle using geometry equations.
This is slow. Any ideas of how to make this search more efficient?
Mission accomplished, that's the way I ended up doing it:
1) Check if the point lies within triangle bounding rectangle.
2) Assign the point as the start of a horizontal line, ending at max width.
3) Check intersections from the triangles found in (1) with the line from (2).
4) If triangle intersect, check how many times the horizontal line intersect with the triangle.
5) If intersects 1 time, means point in triangle. Else, not in triangle.
Reference:
Fast generation of points inside triangulated objects obtained by cross-sectional contours
Ranging from quick and practical to theoretically robust, here are three approaches you could use:
Construct a regular grid where each cell contains a list of triangles that intersect it. Given a query point, in constant time determine the cell that contains it, then compare your query point against only those triangles that are in that cell's list.
Construct a quadtree where each leaf cell contains the triangles that intersect it. Localizing the query point to a quadtree leaf takes logtime, but this can be more efficient in both speed and memory overall.
Sweep a horizontal line down across all the triangles. Points in your point sets correspond to events. At each event, some triangles begin intersecting the sweepline, and other triangles stop intersecting the sweepline. You can use an immutable (aka persistent) sorted map data structure to efficiently represent this. map<double, sweepstate>, where the key is the y-intercept of the sweepline at an event and sweepstate is a sorted list of line segment pairs (corresponding to the left and right sides of triangles). Given a query point, you first use its y value to lookup a sweepstate, and then you do a single trapezoid containment test. (Two horizontal sweeplines and two line segments between them form a trapezoid.)
A common approach to solve this point location problem is the efficient Trapezoidal Decomposition. It reduces the query time to O(Log(N)) per point, after O(N.Log(N)) preprocessing time, using O(N) space.
It could also be that the distribution of your query points allows alternative/simpler approaches.
A solution is a hierarchical tree, I.e. dendogram or hierarchical cluster. For example use the euklidian distance:http://en.m.wikipedia.org/wiki/Hierarchical_clustering. Or you can use a metric tree.

Algorithm to generate equally distributed points in a polygon

I am looking for an algorithm to generate equally distributed points inside a polygon.
Here is the scenario:
I have a polygon specified by the coordinates of the points at the corners (x, y) for each point. And I have the number of points to generate inside the polygon.
For example lets say I have a polygon containing 5 points: (1, 1) ; (1, 2) ; (2, 3) ; (3, 2) ; and (3, 1)
And I need to generate 20 equally distanced points inside that polygon.
Note: Some polygons may not support equally distributed points, but I'm looking to distribute the points in a way to cover all the region of the polygon with as much consistency as possible. (what i mean is I don't want a part with a lot more points than another)
Is there an algorithm to do so? or maybe a library
I am working on a C# application, but any language is ok, since I only need the algorithm and I can translate it.
Thanks a lot for any help
The simple approach I use is:
Triangulate the polygon. Ear clipping is entirely adequate, as all you need is a dissection of the polygon into a set of non-overlapping triangles.
Compute the area of each triangle. Sample from each triangle proportionally to the area of that triangle relative to the whole. This costs only a single uniform random number per sample.
Once a point is determined to have come from a given triangle, sample uniformly over the triangle. This is itself easier than you might think.
So really it all comes down to how do you sample within a triangle. This is easily enough done. A triangle is defined by 3 vertices. I'll call them P1, P2, P3.
Pick ANY edge of the triangle. Generate a point (P4) that lies uniformly along that edge. Thus if P1 and P2 are the coordinates of the corresponding end points, then P will be a uniformly sampled point along that edge, if r has uniform distribution on the interval [0,1].
P4 = (1-r)*P1 + r*P2
Next, sample along the line segment between P3 and P4, but do so non-uniformly. If s is a uniform random number on the interval [0,1], then
P5 = (1-sqrt(s))*P3 + sqrt(s)*P4
r and s are independent pseudo-random numbers of course. Then P5 will be randomly sampled, uniform over the triangle.
The nice thing is it needs no rejection scheme to implement, so long, thin polygons are not a problem. And for each sample, the cost is only in the need to generate three random numbers per event. Since ear clipping is rather simply done and an efficient task, the sampling will be efficient, even for nasty looking polygons or non-convex polygons.
An easy way to do this is this:
Calculate the bounding box
Generate points in that box
Discard all points not in the polygon of interest
This approach generates a certain amount of wasted points. For a triangle, it is never more than 50%. For arbitrary polygons this can be arbitrarily high so you need to see if it works for you.
For arbitrary polys you can decompose the polygon into triangles first which allows you to get to a guaranteed upper bound of wasted points: 50%.
For equally distanced points, generate points from a space-filling curve (and discard all points that are not in the polygon).
You can use Lloyd’s algorithm:
https://en.m.wikipedia.org/wiki/Lloyd%27s_algorithm
You can try the {spatialEco} package (https://cran.r-project.org/web/packages/spatialEco/index.html)
and apply the function sample.poly (https://www.rdocumentation.org/packages/spatialEco/versions/1.3-2/topics/sample.poly)
You can try this code:
library(rgeos)
library(spatialEco)
mypoly = readWKT("POLYGON((1 1,5 1,5 5,1 5,1 1))")
plot(mypoly)
points = sample.poly(mypoly, n= 20, type = "regular")
#points2 = sample.poly(mypoly, n= 20, type = "stratified")
#another type which may answer your problem
plot(points, col="red", add=T)
The easy answer comes from an easier question: How to generate a given number of randomly distributed points from the uniform distribution that will all fit inside a given polygon?
The easy answer is this: find the bounding box of your polygon (let's say it's [a,b] x [c,d]), then keep generating pairs of real numbers, one from U(a,b), the other from U(b,c), until you have n coordinate pairs that fit inside your polygon. This is simple to program, but, if your polygon is very jagged, or thin and skewed, very wasteful and slow.
For a better answer, find the smallest rotated rectangular bounding box, and do the above in transformed coordinates.
Genettic algorithms can do it rather quickly
Reffer to GENETIC ALGORITHMS FOR GRAPH LAYOUTS WITH GEOMETRIC CONSTRAINTS
You can use Force-Directed Graph for that...
Look at http://en.wikipedia.org/wiki/Force-based_algorithms_(graph_drawing)
it defiantly can throw you a bone.
I didn't try it ever,
but i remmember there is a possiblity to set a Fix for some Vertices in the Graph
Your Algorithm will eventually be like
Create a Graph G = Closed Path of the Vertices in V
Fix the Vertecies in place
Add N Verticies to the Graph and Fully connect them with Edges with equal tension value 1.0
Run_force_graph(G)
Scale Graph to bounded Box of
Though it wont be absolute because some convex shapes may produce wiered results (take a Star)
LASTLY: didn't read , but it seems relevant by the title and abstract
take a look at Consistent Graph Layout for Weighted Graphs
Hope this helps...
A better answer comes from a better question. Suppose you want to put a set of n watchtowers to cover a polygon. You could see this as an optimization problem: find the 2n coordinates of the n points that will minimize a cost function (or maximize a value function) that fits your goal. One possible cost function could calculate, for each point, the distance to its closest neighbor or the boundary of the polygon, whichever is less, and calculate the variance of this sequence as a measure of "non-uniformity". You could use a random set of n points, obtained as above, as your initial solution.
I've seen such a "watchtower problem" in some book. Algorithms, calculus, or optimization.
#Youssef: sorry about the delay; a friend came, and a network hiccuped.
#others: have some patience, don't be so trigger-happy.

Quickly find and render terrain above a given elevation

Given an elevation map consisting of lat/lon/elevation pairs, what is the fastest way to find all points above a given elevation level (or better yet, just the the 2D concave hull)?
I'm working on a GIS app where I need to render an overlay on top of a map to visually indicate regions that are of higher elevation; it's determining this polygon/region that has me stumped (for now). I have a simple array of lat/lon/elevation pairs (more specifically, the GTOPO30 DEM files), but I'm free to transform that into any data structure that you would suggest.
We've been pointed toward Triangulated Irregular Networks (TINs), but I'm not sure how to efficiently query that data once we've generated the TIN. I wouldn't be surprised if our problem could be solved similarly to how one would generate a contour map, but I don't have any experience with it. Any suggestions would be awesome.
It sounds like you're attempting to create a polygonal representation of the boundary of the high land.
If you're working with raster data (sampled on a rectangular grid), try this.
Think of your grid as an assembly of right triangles.
Let's say you have a 3x3 grid of points
a b c
d e f
g h k
Your triangles are:
abd part of the rectangle abed
bde the other part of the rectangle abed
bef part of the rectangle bcfe
cef the other part of the rectangle bcfe
dge ... and so on
Your algorithm has these steps.
Build a list of triangles that are above the elevation threshold.
Take the union of these triangles to make a polygonal area.
Determine the boundary of the polygon.
If necessary, smooth the polygon boundary to make your layer look ok when displayed.
If you're trying to generate good looking contour lines, step 4 is very hard to to right.
Step 1 is the key to this problem.
For each triangle, if all three vertices are above the threshold, include the whole triangle in your list. If all are below, forget about the triangle. If some vertices are above and others below, split your triangle into three by adding new vertices that lie precisely on the elevation line (by interpolating elevation). Include the one or two of those new triangles in your highland list.
For the rest of the steps you'll need a decent 2d geometry processing library.
If your points are not on a regular grid, start by using the Delaunay algorithm (which you can look up) to organize your pointss in into triangles. Then follow the same algorith I mentioned above. Warning. This is going to look kind of sketchy if you don't have many points.
Assuming you have the lat/lon/elevation data stored in an array (or three separate arrays) you should be able to use array querying techniques to select all of the points where the elevation is above a certain threshold. For example, in python with numpy you can do:
indices = where(array > value)
And the indices variable will contain the indices of all elements of array greater than the threshold value. Similar commands are available in various other languages (for example IDL has the WHERE() command, and similar things can be done in Matlab).
Once you've got this list of indices you could create a new binary array where each place where the threshold was satisfied is set to 1:
binary_array[indices] = 1
(Assuming you've created a blank array of the same size as your original lat/long/elevation and called it binary_array.
If you're working with raster data (which I would recommend for this type of work), you may find that you can simply overlay this array on a map and get a nice set of regions appearing. However, if you need to convert the areas above the elevation threshold to vector polygons then you could use one of many inbuilt GIS methods to convert raster->vector.
I would use a nested C-squares arrangement, with each square having a pre-calculated maximum ground height. This would allow me to scan at a high level, discarding any squares where the max height is not above the search height, and drilling further into those squares where parts of the ground were above the search height.
If you're working to various set levels of search height, you could precalculate the convex hull for the various predefined levels for the smallest squares that you decide to use (or all the squares, for that matter.)
I'm not sure whether your lat/lon/alt points are on a regular grid or not, but if not, perhaps they could be interpolated to represent even 100' ft altitude increments, and uniform
lat/lon divisions (bearing in mind that that does not give uniform distance divisions). But if that would work, why not precompute a three dimensional array, where the indices represent altitude, latitude, and longitude respectively. Then when the aircraft needs data about points at or above an altitude, for a specific piece of terrain, the code only needs to read out a small part of the data in this array, which is indexed to make contiguous "voxels" contiguous in the indexing scheme.
Of course, the increments in longitude would not have to be uniform: if uniform distances are required, the same scheme would work, but the indexes for longitude would point to a nonuniformly spaced set of longitudes.
I don't think there would be any faster way of searching this data.
It's not clear from your question if the set of points is static and you need to find what points are above a given elevation many times, or if you only need to do the query once.
The easiest solution is to just store the points in an array, sorted by elevation. Finding all points in a certain elevation range is just binary search, and you only need to sort once.
If you only need to do the query once, just do a linear search through the array in the order you got it. Building a fancier data structure from the array is going to be O(n) anyway, so you won't get better results by complicating things.
If you have some other requirements, like say you need to efficiently list all points inside some rectangle the user is viewing, or that points can be added or deleted at runtime, then a different data structure might be better. Presumably some sort of tree or grid.
If all you care about is rendering, you can perform this very efficiently using graphics hardware, and there is no need to use a fancy data structure at all, you can just send triangles to the GPU and have it kill fragments above or below a certain elevation.

Categories