Despite Googling around a fair amount, the only things that surfaced were on neural networks and using existing APIs to find tags about an image, and on webcam tracking.
What I would like to do is create my own data set for some objects (a database containing the images of a product (or a fingerprint of each image), and manufacturer information about the product), and then use some combination of machine learning and object detection to find if a given image contains any product from the data I've collected.
For example, I would like to take a picture of a chair and compare that to some data to find which chair is most likely in the picture from the chairs in my database.
What would be an approach to tackling this problem? I have already considered using OpenCV, and feel that this is a starting point and probably how I'll detect the object, but I've not found how to use this to solve my problem.
I think in the end it doesn't matter what tool you use to tackle your problem. You will probably need some kind of machine learning. It's hard to say which method would result in the best detection, for this I'd recommend to use a tool like weka. It's a collection of multiple machine learning algorithms and lets you easily try out what works best for you.
Before you can start trying out the machine learning you will first need to extract some features out of your dataset. Since you can hardly compare the images pixel by pixel which would result in huge computational effort and does not even necessarily provide the needed results. Try to extract features which make your images unique, like average colour or brightness, maybe try to extract some shapes or sizes out of the image. So in the end you will feed your algorithm just with the features you extracted out of your images and not the images itself.
Which are good features is hard to define, it depends on your special case. Generally it helps to have not just one but multiple features covering completely different aspects of the image. To extract the features you could use openCV, or any other image processing tool you like. Get the features of all images in your dataset and get started with the machine learning.
From what I understood, you want to build a Content Based Image Retrieval system.
There are plenty of methods to do this. What defines the best method to solve your problem has to do with:
the type of objects you want to recognize,
the type of images that will be introduced to search the objects,
the priorities of your system (efficiency, robustness, etc.).
You gave the example of recognizing chairs. In your system which would be the determining factor for selecting the most similar chair? The color of the chair? The shape of the chair? These are typical question that you have to answer before choosing the method.
Either way one of the most used methods to solve such problems is the Bag-of-Words model (also Referred the Bag of Features). I wish I could help more but for that I need that you explain it better which are the final goals of your work / project.
Related
I am new to unity, but I know C# very well.
I am working on a game similar to eu4.
When area is conquered, it should change color.
I have no idea how to do it, or what to search in the internet for a solution.
Here is the map:
(The borders separate the areas)
Any help please?
It's very simple, you just have sprites for each "country".
(So, in the dark color.)
Simply turn them on or off as you wish. That's all there is to it.
Another approach is you could learn how to make and use a typical "flood fill algorithm". But that is far beyond the scope of this question and is a general computer science issue. (You might look on, say, gamedev for starter tips on this. Additionally you'll have to become expert at generating textures dynamically in Unity.)
Check EU4 and CK2 map files. It might give you many ideas how to handle grand strategy maps.
You got some work to do to achieve this. But basically, there is a .bmp file in game files which every province is painted with a unique color. Here is an example of small portion which consist Italy:
Once you've done that, you will need some sort of data file (.csv in eu4's case) that includes which color includes which province, which provinces are neighboring, etc... In EU4, color info and adjadency info are in seperate files, but of course you're free to approach however you want.
Once you are done with "Data" part, you will need an algorithm that scans these files and distinguish each province in seperate game objects. Mainly, you must write a method that removes every color in an image except a certain one, instantiate it to the map and repeat process for each province. And you're done.
Edit: You can, of course, do these manually. But automated process is always better especially when you a lot of povinces to sepearate. Also making tweaks in map will be much easier that way, or you'll have a hard time recreating every single sprite for even small changes.
Another advantage is, you can easily add different mapmodes like heightmap, rivermap or such...
As your question doesn't give much in terms of your structure it's hard to give you examples.
The most straightforward for you probably is to separate your map into nodes (your bordered areas) and display them all as unique gameObjects. You can then access things like their Renderer to give them different colour.
We can give much better answers if you give us more information on how your data/graphics are structured.
I'm trying to perform image registration without much luck.
The image below is my 'reference' image. I use a webcam to acquire images of the same object in different orientations and then need to perform a transformation on these images so that they look as close to the reference image as possible.
I've been using both the Aforge.NET and Accord.NET libraries in order to solve this problem.
Feature detection/extraction
So far I've tried the image stitching method used in this article. It works well for certain types of image but unfortunately it doesn't seem to work for my sample images. The object itself is rather bland and doesn't have many features so the algorithm doesn't find many correlation points. I've tried two versions of the above approach, one which uses the Harris corner detector and one which uses SURF, neither of which has provided me with the results I need.
One option might be to 'artificially' add more features to the object (i.e. stickers, markings) but I'd like to avoid this if possible.
Shape detection
I've also tried several variations of the shape detection methods used in this article. Ideally I'd like to detect the four well-defined circles/holes on the object. I could then use the coordinates of these to create a transformation matrix (homography?) that I could use to transform the image.
Unfortunately I can't reliably detect all four of the circles. I've tried myriad different ways of pre-processing the image in order to get better circle detection, but can't quite find the perfect sequence. My normal operations is:
turn image grayscale
apply a filter (Mean, Median, Conservative Smoothing, Adaptive Smoothing, etc)
apply edge detection (Homogenity, Sobel, Difference, Canny, etc)
apply color filtering
run shape/circle detector
I just can't quite find the right series of filters to apply in order to reliably detect the four circles.
Image / Template matching
Again, I'd like to detect the four circles/holes in the object, so I tried an image / template matching technique with little success. I've created a template (small image of one of the circles) and run the Exhaustive Template Matching algorithm, without much success. Usually it detects just one of the holes, usually the one the template was created from!
In summary
I feel like I'm using the correct techniques to solve this problem, I'm just not sure quite where I'm going wrong, or where I should focus my attention further.
Any help or pointers would be most appreciated.
If you've added examples of transformations you're trying to be invariant to - we could be more specific. But generally, you can try to use HOG for detecting this structure, since it is rather rich in gradients.
HOG is mostly used to detect pedestrians, besides it is good for detecting distinct logos.
I am not sure about HOG's invariance to rotations, but it's pretty robust under different lighting and under moderate perspective distortion. If rotation invariance is important, you can try to train the classifier on rotated version of object, although your detector may become less discriminative.
After you have roughly detected the scale and position of your structure - you can try to refine it, by detecting ellipse of it's boundary. After that you will have a coarse estimate of holes, which you can further refine using something like maximum of normalized cross correlation in this neighbourhood.
I know it's been awhile but just a short potential solution:
I would just generate a grid of points on the original image (let's say, 16x16) and then use a Lucas-Kanade (or some other) feature detector to find those points on second image. Of course you likely won't find all the points but you can sort and choose the best correlations. Let's say, the best four? Then you can easily compute a transformation matrix.
Also if you don't get good correlations on your first grid, then you can just make other grids (shifted, etc.) until you find good matches.
Hope that helps anyone.
OK. This is part of an (non-English) OCR project. I have already completed preprocessing steps like deskewing, grayscaling, segmentation of glyphs etc and am now stuck at the most important step: Identifcation of a glyph by comparing it against a database of glyph images, and thus need to devise a robust and efficient perceptual image hashing algorithm.
For many reasons, the function I require won't be as complicated as required by the generic image comparison problem. For one, my images are always grayscale (or even B&W if that makes the task of identification easier). For another, those glyphs are more "stroke-oriented" and have simpler structure than photographs.
I have tried some of my own and some borrowed ideas for defining a good similarity metric. One method was to divide the image into a grid of M x N cells and take average "blackness" of each cell to create a hash for that image, and then take Euclidean distance of the hashes to compare the images. Another was to find "corners" in each glyph and then compare their spatial positions. None of them have proven to be very robust.
I know there are stronger candidates like SIFT and SURF out there, but I have 3 good reasons not to use them. One is that I guess they are proprietary (or somehow patented) and cannot be used in commercial apps. Second is that they are very general purpose and would probably be an overkill for my somewhat simpler domain of images. Third is that there are no implementations available (I'm using C#). I have even tried to convert pHash library to C# but remained unsuccessful.
So I'm finally here. Does anyone know of a code (C# or C++ or Java or VB.NET but shouldn't require any dependencies that cannot be used in .NET world), library, algorithm, method or idea to create a robust and efficient hashing algorithm that could survive minor visual defects like translation, rotation, scaling, blur, spots etc.
It looks like you've already tried something similar to this, but it may still be of some use:
https://www.memonic.com/user/aengus/folder/coding/id/1qVeq
I've been reading about using the winged-edge data structure for storing a boundary representation. However, the linked site says that this is one of the oldest data structres for storing b-reps, are there newer better ones?
Secondly, is there an implementation of this in C#?
The datastructure used for a B-rep is very similar to those used for polygonal modeling - you just replace the edges with curves and the faces with surfaces.
The wikipedia page on polygonal meshes has several types listed, including winged edge. Personally I like half-edge meshes. The only thing they can't do well is non-manifold topology, which you may or may not need. If you do, look for radial edge topology.
There's also a freely available B-rep datastructure from OpenNurbs (McNeel, the makers of Rhino). That also gets you file IO, which is nice.
Boundary Representation Modelling Techniques by Ian Stroud will give you a survey of ways people have approached B-reps, along with a plethora of diagrams with all the Euler operators, and concrete data structures and algorithms for implementing B-reps imperatively.
Whether you want to move a few characters forward into F# or not, you may glean quite a bit of info from the source code for Wings3d (written in Erlang). Just don't get lost making spaceships and forget you were supposed to be coding!
Also the GML will allow you to investigate interactively what you can do with your B-reps, and the data structure is the code.
Not sure if this will help or not but there are Geometry objects in the XNA library for dealing with 3D Structures and what not. There may be something in there. However my guess is that it will either be Point based or Triangle based vs edge based.
But it might be a place to look.
this is a totally unfamiliar area for me. can anyone point me in the right direction on how to create a social graph and the best way to represent it? i'm building a website in C#/asp net and need to create a "friends" feature... is this type of thing usually stored entirely in the DB? if so, how?
Is your primary concern painting a picture of the social network or storing the data?
For storage you might consider a graph database. However, the most mature product in this space is neo4j, which has the name suggests is written in Java. This SO discussion list some alternative approaches for .Net.
edit
You are still not being clear whether you need design advice or code samples. Andrew Siemer wrote a two-part article which outlines the issues and then presents some ASP.net code. I don't think it's by any means a complete solution but it could give you a steer in the right direction.
Your question is rather open-ended. For drawing complex graphs, one of my favorite tools is Graphviz. Graphviz can work with directed or non-directed graphs. It can take the input as a simple text file, and then output the graph in a variety of formats.
So your problem is primarily a data storage issue, and how to store and retrieve edges in your graph. Applying some simple graph terms to your problem:
Node/Vertex: In your case each person will represent a node.
Edge/Link: The relationship between nodes, in this case 'friends', will create an undirected edge between two nodes.
So you will need to maintain a data structure in your DB that allows you to resolve the edge relationships between friends.
Some useful information can probably be found in this question:
challenge-how-to-implement-an-algorithm-for-six-degree-of-separation
Also, something you should consider when deciding how to store your edge list is how many edges you think your site will generate. This will probably effect the storage mechanism you decide on.
Hope those pointers help.