Image analysis in C# with ML.Net

I have thousands of jpegs in a folder structure. These images are a snapshot of my driveway in 2560 x 1440 and are taken and stored every 60 seconds.
I'd like to create a program that can detect, by analyzing an image, whether I or my wife was home at that particular time. I have a red car, she has a bright yellow car, so a simple color threshold should probably suffice. Another clear distinction is that we each have our own spot and never park in the other's. Also, other people don't use the driveway (and if they do, I don't mind a false positive). One minor complication is that the cameras switch to black/white in the dark (but that may be when the parking spot, rather than the color, comes in handy).
So I was hoping I could use ML.Net and train a model with some hand-annotated images, where I tag each image with whether I see my car or her car in the driveway. I was thinking of annotating maybe 100 to a couple of hundred images for daytime and another set for nighttime, feeding all these images to ML.Net to train it, then having it analyze a few hundred more images whose results I can manually check and correct, creating a sort of feedback loop to train on a few hundred more images.
Once the training is complete I'd like to analyze all images currently stored and each new image as it comes in to generate some data on when I'm (or my wife is) home, away etc.
My problem is (and this is probably going to be the reason the question gets closed as "too broad" or something): I have no clue how to do this. I have seen awesome tutorials that make it all seem like child's play, but when I try to do this in C# (my language of choice) and look for ML.Net how-tos, I can't seem to find anything that points me in the right direction.
For example: Train a machine learning model with data that's not in a text file. I'm a competent programmer, so it's peanuts to create a CSV file / database / whatever that holds data like 1.jpg -> rob home, wife not home. But the how-to doesn't explain how to feed the image into ML.Net, and I haven't been able to find anything that does. The most probable cause is that I'm new to ML(.Net), and probably that I'm too stubborn to give up on accomplishing this in C#, but the information available is, weird as it sounds, both overwhelming and scarce. It usually leads me down some rabbit hole, only to find out after way too long that it's not what I want, or I can't find anything that hints I'm going in the right direction.
So long story short; tl;dr:
How do I feed images into ML.Net, how do I tell ML.Net that my car or her car is in the driveway for any given image (training), and how do I get ML.Net to tell me whether it thinks I'm home or my wife is home for a given image? Or is this not (currently) possible? I'm NOT looking for complete code but for pointers, hints, links, tutorials, examples or whatever may help me in the right direction.

You might find something useful here: Image recognition/classification using Microsoft ML.net 0.2 (Machine learning).
However, I would encourage you to consider Python as the weapon of choice for your task.
There you would just store the images in different folders according to the label (you home, your wife home, both home, no car in the driveway, other),
and you are ready to go.
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
It probably won't take you more than a weekend, and that's including learning the basics of Python.
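To make the folder-per-label idea concrete, here is a minimal sketch of how the training label falls straight out of the directory layout. The folder names and paths are illustrative, not from the question:

```python
# The label for each image is simply its parent directory's name, so
# building a labelled index is a one-liner. Paths are hypothetical.
from pathlib import Path

def label_for(path):
    """E.g. 'data/train/rob_home/0001.jpg' -> 'rob_home'."""
    return Path(path).parent.name

def build_index(paths):
    """Map each image path to its label, ready to hand to a trainer."""
    return {p: label_for(p) for p in paths}
```

With this layout, adding a new training example is just dropping a file into the right folder, so no separate CSV of annotations is needed.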
Edit:
It seems ML.Net still does not support training image classification models: "Again, note that this sample only uses/consumes a pre-trained TensorFlow model with ML.NET API. Therefore, it does not train any ML.NET model. Currently, TensorFlow is only supported in ML.NET for scoring/predicting with existing TensorFlow trained models."
There is a thread about it here: https://github.com/dotnet/docs/issues/5379.
What you could try is Emgu CV (http://www.emgu.com/wiki/index.php/Main_Page), a .NET wrapper for OpenCV. This https://www.geeksforgeeks.org/opencv-python-program-vehicle-detection-video-frame/ is an example in Python, but it should translate well to C++ or C# using Emgu. Once the car is detected, check its position and color. This approach would probably also avoid having to label any data.
Alternatively, use a pre-trained model (an .h5 file), load it into ML.Net, and then check the position and mean color to determine whose car it is.
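Since the two parking spots are fixed and the cars are red and yellow, the position-plus-mean-colour check can be sketched without any ML at all. This is a hypothetical sketch: the ROI coordinates and colour thresholds are made up and would need tuning against the real camera frames:

```python
# Classify which car occupies a parking spot by the mean colour inside a
# fixed region of interest (ROI). Pixels are a row-major grid of (r, g, b)
# tuples; thresholds below are illustrative assumptions, not measured values.

def mean_rgb(pixels, x0, y0, x1, y1):
    """Average (r, g, b) over the ROI [x0:x1) x [y0:y1)."""
    total_r = total_g = total_b = count = 0
    for y in range(y0, y1):
        for x in range(x0, x1):
            r, g, b = pixels[y][x]
            total_r += r; total_g += g; total_b += b
            count += 1
    return (total_r / count, total_g / count, total_b / count)

def classify_spot(pixels, roi):
    """Very rough colour threshold: red car vs yellow car vs empty spot."""
    r, g, b = mean_rgb(pixels, *roi)
    if r > 150 and g < 100 and b < 100:
        return "red car"      # the asker's car
    if r > 150 and g > 150 and b < 100:
        return "yellow car"   # the wife's car
    return "empty"
```

For the night-time black/white frames the colour test would fail, so there the question of *which* spot is occupied (brightness change inside each ROI) would have to carry the decision instead.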

Related

Path Finding Using Building Maps

Hi everyone,
I'm working on a path finding application in C# and I've run into a problem before I even start looking into coding the path finding aspect. The application will allow a user to place a marker on the map of the building then show the user the nearest exit from that position. I have the maps of the building I need but I'm not sure if I can use them straight away as jpeg images.
Would I be able to use the maps as they are, or would it be better to remake them in a grid format so it's all split up into squares? I'm thinking it may be easier to code the path finding aspect if the maps were made up of squares in a grid, but it may take some time to remake the maps in this format.
Any advice is greatly appreciated, I do have experience in C# but path finding is a fairly new subject to me so I'm not sure of the best format for the maps to be in.
Thanks in advance!
Well, if you can afford some manual data processing, the best way would be to simply build a graph of the aisles and store it. You can then use simple graph search algorithms to find the nearest exit. Even splitting it into a grid is overkill.
I've used this before to build graphs of aisles and stores in commerce buildings, and it's very useful and simple to implement.
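The graph-of-aisles idea above can be sketched in a few lines: nodes are aisle junctions, edges are walkable connections, and a breadth-first search finds the nearest exit. The graph shape and node names here are invented for illustration; use Dijkstra with real edge lengths if hop count isn't a good enough proxy for distance:

```python
# Breadth-first search from the user's marker to the nearest exit node.
from collections import deque

def nearest_exit(graph, start, exits):
    """graph: {node: [neighbours]}; returns (exit_node, path) or None."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node in exits:
            return node, path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None  # no exit reachable from this position
```

The jpeg map then only serves as a backdrop for drawing the marker and the returned path; the search itself never touches pixels.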

How does the GMap.NET cache work?

I'm using the GMap.NET library in a project and have found it to be a powerful tool. Its cache facility has been a real asset to my project. Anyway, I need someone to tell me a little bit about how that cache works. As far as I've tested it, I can see that it pre-allocates space (in my case about 200 MB in an SQLite file), so I started running some tests to see how it worked. It turns out it works really well, but in some cases I have been viewing maps that haven't been cached. I don't know if I have to spend some time at a position on the map for it to be cached, or something like that. Does the tile cache file size increase over time, or does it just keep the pre-allocated size?
Thanks in advance for any possible answer.
I've been doing some research on this same topic. I learned that the application by default has a gmdb of 256 MB. This doesn't mean that there is 256 MB of cached map; it just creates room for the cache data. Look at this post by Radio man for more clarity. Hope this helps. If you find more info, post it for others, because there is a lot of missing info on this topic.
http://greatmaps.codeplex.com/discussions/274628

Video frame by frame Search Engine

So here is the thing: I have my final year project coming up, and I have this idea for a video search engine.
It will do the following: take the user's query, or whatever he/she wants to search for, and then search the video frame by frame. I know it might take a lot of time.
There will actually be two steps, starting with a pre-processing stage where an algorithm runs that puts tags on videos, like YouTube does, only this time the tagging will be done by an algorithm, which I don't know how to write.
I just need an initial push to start.
Is there any algorithm which will give the result I want?
PS: This will only work for video lectures. If there are any other ideas, please do tell!
You need to break the problem into its component parts first, as there will be no single solution or algorithm to do what you want (otherwise your senior project would be done for you already).
From what I can tell, here are the parts:
1. Get a video stream.
2. Split the video stream into relevant chunks to process in detail (look for more than, say, a 30% change in a short time span, like a blackboard being erased).
3. Process each chunk in detail, either passing it to the next step or splitting it into two smaller chunks (maybe look for a smaller change over a longer time span).
4. OCR the text.
5. Detect whether the previous chunk has the same text; if so, throw the current chunk out (you split too finely in step 2 or 3).
6. Store the OCR data in a database of some sort, indexed by the time at which the text appears.
7. Build a program to query that database for student use.
Each of those steps will have sub steps to them that you can use the same technique of divide and conquer to figure out how to do that step.
If you need any help doing one of those singular steps let us know in a new question (one topic per question please).
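The chunk-splitting step above can be sketched simply: measure what fraction of pixels changed between consecutive frames and cut wherever it spikes. Frames here are simplified to flat lists of grayscale values, and the 30% figure is the rough threshold suggested above; both the tolerance and the threshold are assumptions to tune:

```python
# Split a frame sequence into chunks wherever consecutive frames differ by
# more than a threshold fraction of pixels (e.g. a blackboard being erased).

def changed_fraction(frame_a, frame_b, tol=10):
    """Fraction of pixels whose grayscale value moved by more than tol."""
    changed = sum(1 for a, b in zip(frame_a, frame_b) if abs(a - b) > tol)
    return changed / len(frame_a)

def split_into_chunks(frames, threshold=0.30):
    """Start a new chunk whenever consecutive frames differ too much."""
    chunks, current = [], [frames[0]]
    for prev, frame in zip(frames, frames[1:]):
        if changed_fraction(prev, frame) > threshold:
            chunks.append(current)
            current = []
        current.append(frame)
    chunks.append(current)
    return chunks
```

Each resulting chunk is then a candidate "one blackboard's worth" of content to hand to the OCR step.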

Comparing two images (book spines) and recognizing the book

I am creating an Android application to recognize books in a library. I take an image of a book's spine and send it to a server, which does the image processing, recognizes the book against a database, and sends the details about the book back to the phone; if the book is not in the database, it recognizes the optical characters and sends those to the mobile application instead. I am hoping to do the image processing in C#. The book recognition is done by comparing template images in the database with the sent image. So I need some help figuring out the best approach. I have already researched methods such as
Template matching
Pattern recognition
Feature recognition
I want to know, when it comes to images like book spines, which method I should follow, and whether there are any good APIs for this. I have researched OpenCV but want to know if there are better APIs, and how I can use OCR when recognizing the book. I want the application to be fast. Normally, when comparing two book spines (template and image), if I get 60% similarity I can assume it's the same book. So I am searching for the optimal way! Help me out with this!
While I have limited knowledge in the field of image processing, there is a library which offers such facilities: AForge.NET. That might be good as a starting reference.
EDIT: for an introductory explanation of the theory behind image processing, this may also offer some guidance: http://www.societyofrobots.com/programming_computer_vision_tutorial.shtml
I understand that you are looking for some API or "already-built" image processing library to help you with this, but this answer might help you in a way, or other people who want to pursue something like this.
There are some pretty helpful research papers (including tests from successful implementations) on this Mobile Visual Search page at Stanford. Check out the heading "Book Spine Recognition for Asset Tracking" on that page.
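As a concrete baseline for the "60% similarity" idea in the question, here is one simple, fast comparison: histogram intersection on grayscale histograms. This is an illustrative sketch, not one of the Stanford methods; it ignores spatial layout entirely, so it only makes sense as a coarse first filter before template or feature matching:

```python
# Compare two images by the overlap of their grayscale histograms.
# Pixels are flat lists of 0-255 values; bin count is an assumption.

def histogram(pixels, bins=16):
    """Normalised grayscale histogram of a flat list of 0-255 values."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]

def similarity(pixels_a, pixels_b):
    """Histogram intersection in [0, 1]; 1.0 means identical distributions."""
    ha, hb = histogram(pixels_a), histogram(pixels_b)
    return sum(min(a, b) for a, b in zip(ha, hb))

def same_book(pixels_a, pixels_b, threshold=0.60):
    return similarity(pixels_a, pixels_b) >= threshold
```

Because spines with similar colour schemes will score high, anything passing this filter should still be confirmed with feature matching or OCR on the spine text.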

Detecting forged images with C#?

One of my friends came up with an interesting problem: assume that we have a set of images in the system. Now, someone might submit a new image created by slightly modifying one of the images already submitted, and in that case the system should report that the submitted image is a forged image.
I can think about two solutions.
Solution 1 - Do an image comparison (bitmap based) of each input image against the images in the database, probably after converting them to grayscale to counter color-changing tricks and resizing them to a standard size.
Solution 2 - Create a Self-Organizing Map, train it with all the existing images, and if a submitted image has a close match, report it as forged.
It might not be possible to have a system with more than 90% accuracy. Please share your thoughts/suggestions/solutions.
Edit after going through few answers: I already have a backprop neural network and an xml based language to train neural networks here - http://www.codeproject.com/KB/dotnet/neuralnetwork.aspx
I'm looking forward for specific answers for the problem I described above.
Thanks
Good question, but it depends on how much code you want to write. What if I mirror/flip an image, or cut & paste within images? When you solve this problem, you'll have cracked most CAPTCHAs too.
If you have a lot of horsepower and programming man-hours, you might want to look at Fourier transforms and histograms to find matches. This would identify flip/mirror and copy/paste.
Then create lots of small test fragments, like unit tests, for things like "can this bit of image be found in the source", "can this bit, when hue-rotated, be found", etc.
Very open ended problem
Guess you can start with Image Recognition with Neural Networks.
Basically I think it covers your Solution 2 approach. At least you'll find useful guidance for Neural Networks and how to train them.
There is certainly a trade-off between performance and accuracy here. You could use neural networks but may need some pre-transformations first: e.g. http://en.wikipedia.org/wiki/Image_registration.
There are several more efficient algorithms, like histogram comparison. Check the segmentation article on Wikipedia: http://en.wikipedia.org/wiki/Segmentation_%28image_processing%29
I think the simplest solution would be to simply invisibly digitally watermark images that are already in the system, and new images as they are added.
As new images are added, simply check for traces of the digital watermark.
No offense, but this might be a "if you only have a hammer, every problem looks like a nail" type of situation. Artificial neural networks aren't a good solution for everything. If you simply calculated a pixel-by-pixel mean squared difference between the stored images and the "forge candidate", you could probably judge image similarity more reliably.
I'd also suggest resizing all images to e.g. 50x50 pixels and performing a histogram equalization before comparing them. That way you could ignore image resizing and global brightness contrast changes.
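The mean-squared-difference comparison suggested above is tiny to sketch. This assumes the resize to a common size and the histogram equalization have already happened, so both images arrive as equal-length grayscale pixel lists; the threshold value is an invented placeholder to tune:

```python
# Pixel-by-pixel mean squared difference between two preprocessed images.

def mse(pixels_a, pixels_b):
    """Mean squared difference between two flat grayscale pixel lists."""
    return sum((a - b) ** 2 for a, b in zip(pixels_a, pixels_b)) / len(pixels_a)

def looks_forged(candidate, stored_images, threshold=100.0):
    """Flag the candidate if it is near-identical to any stored image."""
    return any(mse(candidate, img) < threshold for img in stored_images)
```

Downscaling to something like 50x50 first keeps this a cheap linear scan even over a large image set.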
After some research, I've decided that the best way is to use the Self organizing maps (SOM) approach.
The idea is to initially train the SOM network with the available/valid images, and then, when a new image is inserted, find the nearest images; if a match is found within a threshold, report it.
AForge is an excellent library with SOM support (http://code.google.com/p/aforge/)
Information on basic SOM here
A good read on SOM here