I've been reading about using the winged-edge data structure for storing a boundary representation. However, the linked site says that this is one of the oldest data structures for storing B-reps. Are there newer, better ones?
Secondly, is there an implementation of this in C#?
The data structure used for a B-rep is very similar to those used for polygonal modeling: you just replace the edges with curves and the faces with surfaces.
The Wikipedia page on polygonal meshes has several types listed, including winged edge. Personally I like half-edge meshes. The only thing they can't do well is non-manifold topology, which you may or may not need. If you do, look for radial edge topology.
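To make the half-edge idea concrete, here is a minimal sketch in Python (the same layout ports directly to C# classes). It assumes faces arrive as counter-clockwise vertex index loops; the names `HalfEdge`, `build_half_edges` and `face_vertices` are made up for illustration and do not come from any library. For a real B-rep you would additionally hang a curve off each edge and a surface off each face.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class HalfEdge:
    origin: int      # index of the vertex this half-edge leaves
    face: int        # index of the face this half-edge belongs to
    next: int = -1   # next half-edge around the same face
    twin: int = -1   # oppositely directed half-edge, -1 on a boundary


def build_half_edges(faces: List[List[int]]) -> List[HalfEdge]:
    """Build half-edges from faces given as consistently oriented vertex loops."""
    half_edges: List[HalfEdge] = []
    by_endpoints: Dict[Tuple[int, int], int] = {}
    for face_index, loop in enumerate(faces):
        first = len(half_edges)
        for i, origin in enumerate(loop):
            dest = loop[(i + 1) % len(loop)]
            if (origin, dest) in by_endpoints:
                # The same directed edge used twice: this is exactly the
                # non-manifold (or badly oriented) case half-edges cannot store.
                raise ValueError("non-manifold or inconsistently oriented input")
            by_endpoints[(origin, dest)] = len(half_edges)
            half_edges.append(HalfEdge(origin=origin, face=face_index))
        for i in range(len(loop)):
            half_edges[first + i].next = first + (i + 1) % len(loop)
    # Pair every half-edge with the one running in the opposite direction.
    for (origin, dest), index in by_endpoints.items():
        half_edges[index].twin = by_endpoints.get((dest, origin), -1)
    return half_edges


def face_vertices(half_edges: List[HalfEdge], start: int) -> List[int]:
    """Walk the `next` pointers to list the vertices of one face."""
    vertices = []
    current = start
    while True:
        vertices.append(half_edges[current].origin)
        current = half_edges[current].next
        if current == start:
            return vertices
```

With two triangles `[[0, 1, 2], [0, 2, 3]]` sharing the edge between vertices 0 and 2, the two half-edges along that edge become each other's `twin`, and adjacency queries reduce to following `next` and `twin` pointers.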
There's also a freely available B-rep data structure from OpenNurbs (McNeel, the makers of Rhino). That also gets you file IO, which is nice.
Boundary Representation Modelling Techniques by Ian Stroud will give you a survey of ways people have approached B-reps, along with a plethora of diagrams with all the Euler operators, and concrete data structures and algorithms for implementing B-reps imperatively.
Whether you want to move a few characters forward into F# or not, you may glean quite a bit of info from the source code for Wings3d (written in Erlang). Just don't get lost making spaceships and forget you were supposed to be coding!
Also the GML will allow you to investigate interactively what you can do with your B-reps, and the data structure is the code.
Not sure if this will help or not but there are Geometry objects in the XNA library for dealing with 3D Structures and what not. There may be something in there. However my guess is that it will either be Point based or Triangle based vs edge based.
But it might be a place to look.
I'm trying to perform image registration without much luck.
The image below is my 'reference' image. I use a webcam to acquire images of the same object in different orientations and then need to perform a transformation on these images so that they look as close to the reference image as possible.
I've been using both the Aforge.NET and Accord.NET libraries in order to solve this problem.
Feature detection/extraction
So far I've tried the image stitching method used in this article. It works well for certain types of image but unfortunately it doesn't seem to work for my sample images. The object itself is rather bland and doesn't have many features so the algorithm doesn't find many correlation points. I've tried two versions of the above approach, one which uses the Harris corner detector and one which uses SURF, neither of which has provided me with the results I need.
One option might be to 'artificially' add more features to the object (i.e. stickers, markings) but I'd like to avoid this if possible.
Shape detection
I've also tried several variations of the shape detection methods used in this article. Ideally I'd like to detect the four well-defined circles/holes on the object. I could then use the coordinates of these to create a transformation matrix (homography?) that I could use to transform the image.
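For reference, this is roughly what I mean by building the transformation from the four circle centres. It is a minimal sketch in plain Python (not AForge.NET or Accord.NET code) that assumes the four correspondences are already known and that no three points are collinear. The function names are my own, and the homography is normalized so that its bottom-right entry is 1.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_points(src, dst):
    """Estimate the 3x3 homography mapping four src points onto four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence, with h33 fixed to 1.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map one (x, y) point through the homography."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

Here `src` would be the circle centres found in the webcam image and `dst` the corresponding centres in the reference image, listed in the same order. Getting that order right is part of the problem, since the four holes look alike.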
Unfortunately I can't reliably detect all four of the circles. I've tried myriad different ways of pre-processing the image in order to get better circle detection, but can't quite find the perfect sequence. My normal sequence of operations is:
turn image grayscale
apply a filter (Mean, Median, Conservative Smoothing, Adaptive Smoothing, etc)
apply edge detection (Homogenity, Sobel, Difference, Canny, etc)
apply color filtering
run shape/circle detector
I just can't quite find the right series of filters to apply in order to reliably detect the four circles.
Image / Template matching
Again, I'd like to detect the four circles/holes in the object, so I tried an image / template matching technique with little success. I've created a template (small image of one of the circles) and run the Exhaustive Template Matching algorithm, without much success. Usually it detects just one of the holes, usually the one the template was created from!
In summary
I feel like I'm using the correct techniques to solve this problem, I'm just not sure quite where I'm going wrong, or where I should focus my attention further.
Any help or pointers would be most appreciated.
If you add examples of the transformations you're trying to be invariant to, we could be more specific. But generally, you can try using HOG to detect this structure, since it is rather rich in gradients.
HOG is mostly used to detect pedestrians, but it is also good for detecting distinct logos.
I am not sure about HOG's invariance to rotations, but it's pretty robust under different lighting and under moderate perspective distortion. If rotation invariance is important, you can try to train the classifier on rotated versions of the object, although your detector may become less discriminative.
After you have roughly detected the scale and position of your structure, you can try to refine the estimate by detecting the ellipse of its boundary. After that you will have a coarse estimate of the holes, which you can further refine using something like the maximum of normalized cross-correlation in that neighbourhood.
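The last refinement step could look something like the sketch below. It is plain Python on grayscale images stored as lists of rows, and it assumes you already have a coarse top-left position for a hole and a small template of it. `ncc` and `best_match` are names I made up; a real implementation would use an optimized library routine.

```python
import math


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equally sized patches."""
    a = [p for row in patch for p in row]
    b = [t for row in template for t in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0  # flat patches carry no information


def best_match(image, template, top, left, radius):
    """Search around the coarse estimate (top, left) for the NCC maximum.

    Returns (score, row, col) of the best template position.
    """
    th, tw = len(template), len(template[0])
    best = (-2.0, top, left)
    for r in range(max(0, top - radius), min(len(image) - th, top + radius) + 1):
        for c in range(max(0, left - radius),
                       min(len(image[0]) - tw, left + radius) + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, r, c)
    return best
```

Because the score is normalized by the local mean and variance, it is insensitive to uniform brightness and contrast changes, which is why it suits this kind of local refinement.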
I know it's been a while, but here's a short potential solution:
I would just generate a grid of points on the original image (let's say 16x16) and then use a Lucas-Kanade (or some other) feature tracker to find those points in the second image. Of course you likely won't find all the points, but you can sort the matches and choose the best correlations. Let's say the best four? Then you can easily compute a transformation matrix.
Also if you don't get good correlations on your first grid, then you can just make other grids (shifted, etc.) until you find good matches.
Hope that helps anyone.
Despite Googling around a fair amount, the only things that surfaced were on neural networks and using existing APIs to find tags about an image, and on webcam tracking.
What I would like to do is create my own data set for some objects (a database containing the images of a product (or a fingerprint of each image), and manufacturer information about the product), and then use some combination of machine learning and object detection to find if a given image contains any product from the data I've collected.
For example, I would like to take a picture of a chair and compare that to some data to find which chair is most likely in the picture from the chairs in my database.
What would be an approach to tackling this problem? I have already considered using OpenCV, and feel that this is a starting point and probably how I'll detect the object, but I've not found how to use this to solve my problem.
I think in the end it doesn't matter what tool you use to tackle your problem. You will probably need some kind of machine learning. It's hard to say which method would result in the best detection, so I'd recommend using a tool like Weka. It's a collection of multiple machine learning algorithms and lets you easily try out what works best for you.
Before you can start trying out machine learning, you first need to extract some features from your dataset, since you can hardly compare the images pixel by pixel: that would take huge computational effort and would not necessarily give the results you need. Try to extract features which make your images unique, like average colour or brightness, or maybe some shapes or sizes. In the end you feed your algorithm just the features you extracted from your images and not the images themselves.
Which features are good is hard to define; it depends on your particular case. Generally it helps to have not just one but multiple features covering completely different aspects of the image. To extract the features you could use OpenCV, or any other image processing tool you like. Get the features of all images in your dataset and get started with the machine learning.
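As a toy illustration of "features in, label out", here is a sketch in plain Python. The three features (mean brightness, contrast, aspect ratio) and the nearest-neighbour lookup are just assumptions to show the shape of the pipeline; they are far too weak to tell real chairs apart, and the names are invented.

```python
import math


def extract_features(image):
    """Reduce a grayscale image (list of rows, values 0-255) to a small vector."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))
    aspect = len(image[0]) / len(image)  # width divided by height
    return (mean / 255.0, std / 255.0, aspect)


def nearest_label(query_features, labelled_features):
    """Return the label whose stored feature vector is closest to the query."""
    closest = min(labelled_features,
                  key=lambda item: math.dist(item[0], query_features))
    return closest[1]


# Hypothetical two-product database built from one image per product.
database = [
    (extract_features([[10, 20], [30, 40]]), "dark chair"),
    (extract_features([[200, 210], [220, 230]]), "bright chair"),
]
guess = nearest_label(extract_features([[190, 205], [215, 225]]), database)
```

Swapping in better features or a different classifier leaves the structure unchanged, which is what makes it easy to experiment.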
From what I understood, you want to build a Content Based Image Retrieval system.
There are plenty of methods to do this. What defines the best method to solve your problem has to do with:
the type of objects you want to recognize,
the type of images that will be introduced to search the objects,
the priorities of your system (efficiency, robustness, etc.).
You gave the example of recognizing chairs. In your system, what would be the determining factor for selecting the most similar chair? The color of the chair? The shape of the chair? These are typical questions that you have to answer before choosing the method.
Either way, one of the most used methods to solve such problems is the Bag-of-Words model (also referred to as Bag of Features). I wish I could help more, but for that I need you to explain the final goals of your work or project more clearly.
OK. This is part of a (non-English) OCR project. I have already completed preprocessing steps like deskewing, grayscaling, segmentation of glyphs etc. and am now stuck at the most important step: identification of a glyph by comparing it against a database of glyph images. I thus need to devise a robust and efficient perceptual image hashing algorithm.
For many reasons, the function I require won't be as complicated as required by the generic image comparison problem. For one, my images are always grayscale (or even B&W if that makes the task of identification easier). For another, those glyphs are more "stroke-oriented" and have simpler structure than photographs.
I have tried some of my own and some borrowed ideas for defining a good similarity metric. One method was to divide the image into a grid of M x N cells and take average "blackness" of each cell to create a hash for that image, and then take Euclidean distance of the hashes to compare the images. Another was to find "corners" in each glyph and then compare their spatial positions. None of them have proven to be very robust.
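To be concrete about the first method, this is roughly what my grid hash does, rewritten as a minimal Python sketch (my actual code is C#). It assumes a grayscale glyph stored as a list of rows with values 0-255 and an image at least as large as the grid. The function names are only for illustration.

```python
import math


def grid_hash(image, rows, cols):
    """Average blackness (0 = white, 1 = black) of each cell in a rows x cols grid."""
    height, width = len(image), len(image[0])
    cells = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            block = [image[y][x]
                     for y in range(top, bottom)
                     for x in range(left, right)]
            cells.append(1.0 - sum(block) / (255.0 * len(block)))
    return cells


def hash_distance(hash_a, hash_b):
    """Euclidean distance between two grid hashes of the same grid size."""
    return math.dist(hash_a, hash_b)
```

The weakness is visible in the construction: shifting a stroke across a cell boundary moves its ink from one cell to the next, so a small translation can change the hash noticeably.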
I know there are stronger candidates like SIFT and SURF out there, but I have 3 good reasons not to use them. One is that I guess they are proprietary (or somehow patented) and cannot be used in commercial apps. Second is that they are very general purpose and would probably be overkill for my somewhat simpler domain of images. Third is that there are no implementations available (I'm using C#). I have even tried to convert the pHash library to C# but was unsuccessful.
So I'm finally here. Does anyone know of code (C# or C++ or Java or VB.NET, but it shouldn't require any dependencies that cannot be used in the .NET world), a library, algorithm, method or idea for creating a robust and efficient hashing algorithm that could survive minor visual defects like translation, rotation, scaling, blur, spots etc.?
It looks like you've already tried something similar to this, but it may still be of some use:
https://www.memonic.com/user/aengus/folder/coding/id/1qVeq
I was hoping that I could get some guidance from the stackoverflow community regarding a dilemma I have run into for my senior project. First off, I want to state that I am a novice programmer, and I'm sure some of you will quickly tell me this project was way over my head. I've quickly become well aware that this is probably true.
Now that that's out of the way, let me give some definitions:
Project Goal:
The goal of the project, like many others have sought to achieve in various SO questions (many of which have been very helpful to me in the course of this effort), is to detect whether a parking space is full or available, eventually reporting such back to the user (ideally via an iPhone or Droid or other mobile app for ease of use -- this aspect was quickly deemed outside the scope of my efforts due to time constraints).
Tools in Use:
I have made heavy use of the resources of the AForge.Net library, which has provided me with all of the building blocks for bringing the project together in terms of capturing video from an IP camera, applying filters to images, and ultimately completing the goal of detection. As a result, you will know that I have selected to program in C#, mainly due to ease-of-use for beginners. Other options included MATLAB/C++, C++ with OpenCV, and other alternatives.
The Problem
Here is where I have run into issues. Below is linked an image that has been pre-processed in the AForge Image Processing Lab. The sequence of filters and processes used was: Grayscale, Histogram Equalization, Sobel Edge Detection and finally Otsu Thresholding (though I'm not convinced the final step is needed).
http://i.stack.imgur.com/u6eqk.jpg
As you can tell from the image with the naked eye of course, there are sequences of detected edges which clearly are parked cars in the spaces I am monitoring with the camera. These cars are clearly defined by the pattern of brightened wheels, the sort of "double railroad track" pattern that essentially represents the outer edging of the side windows, and even the outline of the license plate in this instance. Specifically though, in a continuation of the project the camera chosen would be a PTZ to cover as much of the block as possible, and thus I'd just like to focus on the side features of the car (eliminating factors such as license plate). Features such as a rectangle for a sunroof may also be considered, but obviously this is not a universal feature of cars, whereas the general window outline is.
We can all see that there are differences to these patterns, varying of course with car make and model. But, generally this sequence not only results in successful retrieval of the desired features, but also eliminates the road from view (important as I intend to use road color as a "first litmus test" if you will for detecting an empty space... if I detect a gray level consistent with data for the road, especially if no edges are detected in a region, I feel I can safely assume an empty space). My question is this, and hopefully it is generic enough to be practically beneficial to others out there on the site:
Focused Question:
Is there a way to take an image segment (via cropping) and then compare the detected edge sequence with future new frames from the camera? More specifically, is there a way to do this while allowing leeway/essentially creating a tolerance threshold for minor differences in edges?
Personal Thoughts/Brainstorming on The Question:
-- I'm sure there's a way to literally compare pixel-by-pixel -- crop to just the rectangle around your edges and then slide your cropped image through the new processed frame for comparison pixel-by-pixel, but that wouldn't help particularly unless you had an exact match to your detected edges.
All help is appreciated, and I'm more than happy to clarify as needed as well.
Let me give it a shot.
You have two images. Let's call them BeforePic and AfterPic. For each of these two pictures you have a ROI (region of interest), AKA a cropped segment.
You want to see if AfterPic.ROI is very different from BeforePic.ROI. By "very different" I mean that the difference is greater than some threshold.
If this is indeed your problem, then it should be split into three parts:
get BeforePic and AfterPic (and the ROI for each).
Translate the abstract concept of picture\edge difference into a numerical one.
compare the difference to some threshold.
The first part isn't really a part of your question, so I'll ignore it.
The last part is basically about finding the right threshold. Again, that's out of the scope of the question.
The second part is what I think is the heart of the question (I hope I'm not completely off here). For this I would use the algorithm ShapeContext (In the PDF, it'll be best for you to implement it up to section 3.3, as it gets too robust for your needs from 3.4 and on).
Shape Context is an image matching algorithm that works on image edges and has great success rates.
Implementing this was my finals project, and it seems like a perfect match (no pun intended) for you. If your edges are good and your ROI is accurate, it won't fail you.
It may take some time to implement, but if done correctly, this will work perfectly for you.
Bear in mind that a poor implementation might run slowly; I've seen a worst case of 5 seconds per image. A good (yet not perfect) implementation, on the other hand, will take less than 0.1 seconds per image.
Hope this helps, and good luck!
Edit: I found an implementation of ShapeContext in C# on CodeProject, if it's of any interest.
I take on a fair number of machine vision problems in my work and the most important thing I can tell you is that simpler is better. The more complex the approach, the more likely it is for unanticipated boundary cases to create failures. In industry, we usually address this by simplifying conditions as much as possible, imposing rigid constraints that limit the number of things we need to consider. Granted, a student project is different than an industry project, as you need to demonstrate an understanding of particular techniques, which may well be more important than whether it is a robust solution to the problem you've chosen to take on.
A few things to consider:
Are there pre-defined parking spaces on the street? Do you have the option to manually pre-define the parking regions that will be observed by the camera? This can greatly simplify the problem.
Are you allowed to provide incorrect results when cars are parked illegally (taking up more than one spot, for instance)?
Are you allowed to provide incorrect results when there are unexpected environmental conditions, such as trash, pot holes, pooled water or snow in the space?
Do you need to support all categories of vehicles (cars, flat-bed trucks, vans, delivery trucks, motorcycles, mini electric cars, tripod vehicles, etc.)?
Are you allowed to take a baseline snapshot of the street with no cars present?
As to comparing two sets of edges, probably the most robust approach is known as geometric model finding (describing the edges of interest mathematically as a series of 'edgels', combining them into chains and comparing the geometry), but this is overkill for your application. I would look more toward thresholding the count of 'edge pixels' present in a parking region, or differencing from a baseline image. (You need to be careful of image shift with the latter, however, since material expansion from outdoor temperature changes may cause the field of view to change slightly due to the camera mechanically moving.)
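Both simple checks fit in a few lines. The sketch below is plain Python on images stored as lists of rows (not AForge.NET code), with manually pre-defined parking regions given as `(top, left, bottom, right)` rectangles where bottom and right are exclusive. The function names and the 5% edge threshold are placeholders you would have to tune on real footage.

```python
def edge_fraction(edge_image, region):
    """Fraction of pixels in the region that are edge pixels (non-zero)."""
    top, left, bottom, right = region
    total = (bottom - top) * (right - left)
    edges = sum(1
                for y in range(top, bottom)
                for x in range(left, right)
                if edge_image[y][x])
    return edges / total


def mean_abs_difference(frame, baseline, region):
    """Mean absolute gray-level difference from the empty-street baseline."""
    top, left, bottom, right = region
    total = (bottom - top) * (right - left)
    diff = sum(abs(frame[y][x] - baseline[y][x])
               for y in range(top, bottom)
               for x in range(left, right))
    return diff / total


def is_occupied(edge_image, region, min_edge_fraction=0.05):
    """Call a space occupied when enough of its pixels are edges (assumed threshold)."""
    return edge_fraction(edge_image, region) >= min_edge_fraction
```

Neither check needs to recognise a car shape at all, which is the point: fewer assumptions means fewer ways for unexpected vehicles to break it.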
This is a totally unfamiliar area for me. Can anyone point me in the right direction on how to create a social graph and the best way to represent it? I'm building a website in C#/ASP.NET and need to create a "friends" feature... Is this type of thing usually stored entirely in the DB? If so, how?
Is your primary concern painting a picture of the social network or storing the data?
For storage you might consider a graph database. However, the most mature product in this space is neo4j, which, as the name suggests, is written in Java. This SO discussion lists some alternative approaches for .Net.
edit
You are still not being clear whether you need design advice or code samples. Andrew Siemer wrote a two-part article which outlines the issues and then presents some ASP.net code. I don't think it's by any means a complete solution but it could give you a steer in the right direction.
Your question is rather open-ended. For drawing complex graphs, one of my favorite tools is Graphviz. Graphviz can work with directed or non-directed graphs. It can take the input as a simple text file, and then output the graph in a variety of formats.
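As a small sketch of what that text input looks like, the Python function below turns a list of friend pairs into an undirected graph in Graphviz's DOT language. The function name is invented, and it assumes the user names contain no double quotes (they are not escaped here).

```python
def friends_to_dot(friendships, name="friends"):
    """Render friend pairs as an undirected Graphviz graph in DOT syntax."""
    lines = [f"graph {name} {{"]
    # Sort each pair so (a, b) and (b, a) collapse into a single edge.
    unique_pairs = sorted({tuple(sorted(pair)) for pair in friendships})
    for a, b in unique_pairs:
        lines.append(f'    "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)


dot_text = friends_to_dot([("bob", "alice"), ("alice", "bob"), ("alice", "carol")])
```

Saving `dot_text` to `friends.dot` and running `dot -Tpng friends.dot -o friends.png` should then produce the picture.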
So your problem is primarily a data storage issue, and how to store and retrieve edges in your graph. Applying some simple graph terms to your problem:
Node/Vertex: In your case each person will represent a node.
Edge/Link: The relationship between nodes, in this case 'friends', will create an undirected edge between two nodes.
So you will need to maintain a data structure in your DB that allows you to resolve the edge relationships between friends.
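One common way to do that in a relational database is a single edge table with one row per friendship. The sketch below uses Python's built-in `sqlite3` module only so that it is self-contained; the same table works in SQL Server behind ASP.NET. The table and column names are assumptions, and storing the smaller user id first is just one convention for keeping an undirected edge unique.

```python
import sqlite3


def create_schema(conn):
    """Create the edge table: one row per undirected friendship."""
    conn.execute("""
        CREATE TABLE friendship (
            user_a INTEGER NOT NULL,
            user_b INTEGER NOT NULL,
            PRIMARY KEY (user_a, user_b),
            CHECK (user_a < user_b)
        )""")


def add_friendship(conn, x, y):
    """Store the edge once, with the smaller id first."""
    if x == y:
        raise ValueError("a user cannot befriend themselves")
    a, b = min(x, y), max(x, y)
    conn.execute(
        "INSERT OR IGNORE INTO friendship (user_a, user_b) VALUES (?, ?)",
        (a, b))


def friends_of(conn, user):
    """Resolve the edge in both directions to list a user's friends."""
    rows = conn.execute(
        "SELECT user_b FROM friendship WHERE user_a = ? "
        "UNION SELECT user_a FROM friendship WHERE user_b = ?",
        (user, user))
    return sorted(row[0] for row in rows)
```

After adding friendships (2, 1) and (1, 3), `friends_of(conn, 1)` returns `[2, 3]`. The trade-off of the ordered-pair convention is that every lookup has to check both columns, so you would want an index on `user_b` as well.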
Some useful information can probably be found in this question:
challenge-how-to-implement-an-algorithm-for-six-degree-of-separation
Also, something you should consider when deciding how to store your edge list is how many edges you think your site will generate. This will probably affect the storage mechanism you decide on.
Hope those pointers help.