Hi all
I need help studying object recognition in video, as this will be my new project at my faculty.
I have previously studied computer vision.
I'm looking for suggestions (good books, web resources, tutorials, and so on) that will help me with the project.
The project will be in C# or MATLAB.
Thanks
Just a simple suggestion here: break the problem down into small, manageable chunks. Since you're studying object recognition, you will probably want to get most of the software off the shelf so you can focus on studying and not debugging.
If you are using MATLAB, maybe you should look at this:
http://www.mathworks.com/products/image/
For C#, I would assume there are some awesome image processing libraries now, thanks to the Xbox Kinect:
http://www.codeproject.com/Articles/148251/How-to-Successfully-Install-Kinect-on-Windows-Open.aspx
Another technology that is good for image processing is LabVIEW. If your faculty has licences and people who know it well enough to help you, it may be another option.
http://www.ni.com/labview/whatis/?nipkw=LabVIEW&nicam=OceaniaZA-VI2009&nigrp=labview&nisrc=Google&niurl=&ninet=search
It's not a simple subject, so you are unlikely to find books or resources specific to it. As Oli said, your tutor is a specialist and should be able to point you to guides and reference material.
Check out scientific journals in the area of image processing such as Pattern Recognition, Pattern Recognition Letters, Computer Vision and Image Understanding, International Journal of Computer Vision etc, and books such as Computer Vision And Image Processing.
Related
I have thousands of jpegs in a folder structure. These images are a snapshot of my driveway in 2560 x 1440 and are taken and stored every 60 seconds.
I'd like to create a program that can detect, by analyzing an image, whether I or my wife was home at that particular time. I have a red car; she has a bright yellow car, so a simple color threshold should probably suffice. Another clear distinction is that we each have our own spot and never park in the other's. Also, other people don't use the driveway (and if they do, I don't mind a false positive). One minor complication is that the camera switches to black and white in the dark (but that may be where the parking spot, rather than the color, comes in handy).
So I was hoping I could use ML.Net and train a model with some hand-annotated images, where I tag each image with whether I see my car or hers in the driveway. I was thinking of annotating maybe one hundred to a couple of hundred images for day and another set for night, feeding all of these to ML.Net to train it, and then having it analyse a few hundred images where I can manually check the results and correct any mistakes. That would create a sort of feedback loop to train on a few hundred more images.
Once the training is complete I'd like to analyze all images currently stored and each new image as it comes in to generate some data on when I'm (or my wife is) home, away etc.
My problem is (and this is probably going to be the reason for the question being closed as "too broad" or something): I have no clue on how to do this. I have seen awesome tutorials that all make it seem like child's play but when I then try to do this in C# (my language of choice) and look for ML.Net Howto's I can't seem to find anything that helps me in the right direction.
For example: Train a machine learning model with data that's not in a text file. I'm a competent programmer, so it's peanuts to create a CSV file / database / whatever that holds "1.jpg -> rob home, wife not home" data. But the "How To" doesn't explain how to feed the image into ML.Net, and I haven't been able to find anything that does. The most probable cause is that I'm new to ML(.Net), and probably that I'm too stubborn to give up trying to accomplish this in C#, but the information available is, weird as it sounds, overwhelming yet scarce. It usually leads me down some rabbit hole, only for me to find out after way too long that it's not what I want, or I can't find anything that hints I'm going in the right direction.
So long story short; tl;dr:
How do I feed images into ML.Net, how do I tell ML.Net that my/her car is in the driveway for any given image (training) and how do I get ML.Net to tell me whether it thinks I'm / my wife is home or not for a given image? Or is this not possible (currently)? I'm NOT looking for complete code but for pointers, hints, links, tutorials, examples or whatever may help me in the right direction.
You might find something useful here: "Image recognition/classification using Microsoft ML.net 0.2 (Machine learning)".
However, I would encourage you to consider Python as the weapon of choice for your task.
There you would just store the images in different folders according to their label (you home, your wife home, both home, no car in the driveway, other),
and you are ready to go.
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
It probably won't take you more than a weekend, and that's including learning the basics of Python.
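To make the folder-per-label idea concrete, here is a minimal sketch in plain Python (standard library only) that walks such a folder tree and pairs every image with the name of the folder it sits in. The function name, the folder names and the list of file extensions are my own assumptions for illustration, not something prescribed by Keras or ML.Net.

```python
from pathlib import Path

# Hypothetical set of extensions; adjust to whatever the camera actually writes.
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def list_labelled_images(root):
    """Return sorted (path, label) pairs; the label is the parent folder's name."""
    root = Path(root)
    samples = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for image in sorted(class_dir.iterdir()):
            # Skip anything that is not an image (notes, thumbnails databases, ...).
            if image.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((image, class_dir.name))
    return samples
```

Calling `list_labelled_images("driveway")` on a tree such as `driveway/rob_home/1.jpg` would give you the (file, label) list that a training script needs as input, whichever library ends up doing the training.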
Edit:
It seems ML.Net still does not support training image classification models: "Again, note that this sample only uses/consumes a pre-trained TensorFlow model with ML.NET API. Therefore, it does not train any ML.NET model. Currently, TensorFlow is only supported in ML.NET for scoring/predicting with existing TensorFlow trained models."
There is a thread about it here: https://github.com/dotnet/docs/issues/5379
What you could try is Emgu CV (http://www.emgu.com/wiki/index.php/Main_Page), a .NET wrapper for OpenCV. This example, https://www.geeksforgeeks.org/opencv-python-program-vehicle-detection-video-frame/, is in Python, but it should translate well to C++ or to C# using Emgu. Once the car is detected, check its position and color. This approach would probably also avoid labeling any data.
Alternatively, take a pre-trained model (an h5 file), load it into ML.Net, and then check the position and mean color to work out whose car it is.
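The "check the mean color" step can be sketched without any imaging library. The snippet below assumes you have already cropped one parking spot and have its pixels as (r, g, b) tuples in the 0-255 range; how you load and crop the JPEG is left out. The hue ranges and the saturation/brightness cut-offs are guesses that would need tuning against real driveway images.

```python
import colorsys


def mean_rgb(pixels):
    """Average a non-empty iterable of (r, g, b) tuples with 0-255 channels."""
    pixels = list(pixels)
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def classify_spot(pixels, min_saturation=0.4, min_value=0.3):
    """Guess 'red', 'yellow' or 'unknown' from the mean colour of one parking spot."""
    r, g, b = mean_rgb(pixels)
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    if s < min_saturation or v < min_value:
        # Grey tarmac or a black-and-white night image: colour says nothing.
        return "unknown"
    hue_deg = h * 360
    if hue_deg < 20 or hue_deg >= 340:  # assumed hue band for the red car
        return "red"
    if 40 <= hue_deg <= 70:  # assumed hue band for the yellow car
        return "yellow"
    return "unknown"
```

For night images the function deliberately returns "unknown", which is the point where the fixed parking position would have to take over from the color test.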
I am creating an Android application to recognize books in a library. I take an image of a book's spine and send it to a server, which does the image processing, recognizes the book from a database, and sends the book's details back to the phone. If the book is not in the database, the server recognizes the characters on the spine (OCR) and sends that text to the mobile application instead. I am hoping to do the image processing in C#. The recognition works by comparing the submitted image with template images stored in the database, so I need some help figuring out the best approach. I have already researched some methods, such as:
Template matching
Pattern recognition
Feature recognition
I want to know which method is recommended for images like book spines. Are there any good APIs for this? I have researched OpenCV but want to know if there are better ones. And how can I use OCR when recognizing the book? I want the application to be fast. Normally, when comparing two book spines (template and image), if I get 60% similarity I can assume it's the same book. So I am searching for the optimal way. Please help me out with this!
While I have limited knowledge in the field of image processing, there is a library which offers such facilities: AForge.NET. That might be good as a starting reference.
EDIT: for an introductory explanation of the theory behind image processing, this may also offer some guidance: http://www.societyofrobots.com/programming_computer_vision_tutorial.shtml
I understand that you are looking for some API or "already-built" image processing library to help you with this, but this answer might help you in a way, or other people who want to pursue something like this.
There are some pretty helpful research papers (including tests from successful implementations) on this Mobile Visual Search page at Stanford. Check out the heading "Book Spine Recognition for Asset Tracking" on that page.
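On the OCR part of the question, one cheap option is to compare the text read from the spine with the titles in the database and accept the best match only above a threshold. This is a rough sketch in Python using the standard library's `difflib`; it applies the 60% figure from the question to text similarity rather than to image templates, which is my substitution, and the function name and sample titles are made up for illustration.

```python
from difflib import SequenceMatcher


def best_match(ocr_text, titles, threshold=0.6):
    """Return (title, score) for the closest title, or (None, score) below the threshold."""

    def normalise(text):
        # Lower-case and collapse whitespace so layout differences do not count.
        return " ".join(text.lower().split())

    query = normalise(ocr_text)
    best_title, best_score = None, 0.0
    for title in titles:
        score = SequenceMatcher(None, query, normalise(title)).ratio()
        if score > best_score:
            best_title, best_score = title, score
    if best_score < threshold:
        return None, best_score
    return best_title, best_score
```

A linear scan like this is fine for a few thousand titles; for a large catalogue you would probably want to narrow the candidates first (for example with the database's own full-text index) and only score the shortlist.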
Hi, can someone point me in the right direction? I would like to stream video to the internet from the Glory TV satellite and have no idea where to start.
I'm learning C#, ASP.NET and Silverlight, but need to know how to stream video from a satellite. I'd like the site to be something like Hulu. I don't have a clue where to start, but I know the technologies I want to use.
I've already looked at some of the free open-source Silverlight video players I could use.
Are there similar sites, service APIs, or even books that I can look at to get started or learn how to implement this? Any help would be greatly appreciated.
Lanesa
Check these out:
http://forums.silverlight.net/forums/p/101837/237098.aspx
http://forums.silverlight.net/forums/p/21970/77051.aspx
http://www.learn-silverlight-tutorial.com/StreamingMediaUsingSilverlight.cfm
My final-year project is on speech recognition, but I don't have any idea how to start. I will use C#. Can anyone guide me on how to start? What should the first step be?
Thanks
You probably want to start with the wikipedia entry on speech recognition here: http://en.wikipedia.org/wiki/Speech_recognition - at the end of that article, there are a bunch of useful links to papers and software on the topic.
Another thing you will want to do is talk to the professor who is coordinating this project. He or she will know about other resources and can probably point you in a good direction.
Also, whenever you embark on a project you know nothing about, Google is your friend.
Speech recognition is really fuzzy pattern-matching, so how about looking into artificial neural networks as they're extremely good at pattern matching. Ensure that the audio's in a nice simple format and trim to syllables/words. Train the network on these files and then find a way to split the files you record in code. It may be simplest to start with a very limited vocabulary (individual letters maybe) as a proof of concept. Be prepared to run computers overnight to train the networks and try to get access to a high performance cluster.
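The "trim to syllables/words" step can be approximated by cutting the recording wherever it goes quiet for a while. Below is a minimal sketch in Python (not C#), assuming the audio is already available as a list of signed integer samples. The frame size, the loudness threshold and the number of quiet frames that end a segment are arbitrary starting values, not recommendations.

```python
def split_on_silence(samples, frame_size=160, threshold=500.0, min_silence_frames=3):
    """Return (start, end) sample indices of the stretches louder than threshold."""
    segments = []
    start = None      # first sample of the segment currently being collected
    silent_run = 0    # consecutive quiet frames seen inside that segment
    n_frames = len(samples) // frame_size
    for f in range(n_frames):
        frame = samples[f * frame_size:(f + 1) * frame_size]
        energy = sum(abs(s) for s in frame) / frame_size  # mean absolute amplitude
        if energy >= threshold:
            if start is None:
                start = f * frame_size
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                # Close the segment at the end of the last loud frame.
                segments.append((start, (f - silent_run + 1) * frame_size))
                start = None
                silent_run = 0
    if start is not None:
        # Recording ended mid-segment; drop any trailing quiet frames.
        segments.append((start, (n_frames - silent_run) * frame_size))
    return segments
```

Each returned pair can then be sliced out of the sample list and saved as one training example for the network.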
I would start by researching some libraries and reading up on these subjects:
http://www.microsoft.com/speech/evaluation/thirdparty/engines.mspx
http://www.codeproject.com/KB/audio-video/TTSinVBpackage.aspx
http://blogs.msdn.com/coding4fun/archive/2006/10/31/909044.aspx
http://www.c-sharpcorner.com/UploadFile/ssrinivas/SpeeechRecognitionusingCSharp11222005054918AM/SpeeechRecognitionusingCSharp.aspx
You can look at the .NET System.Speech.Recognition namespace:
http://msdn.microsoft.com/en-us/library/system.speech.recognition.aspx
Take a look at this MSDN article which describes the Speech libraries built into Windows Vista and Windows 7: http://msdn.microsoft.com/en-us/magazine/cc163663.aspx
Is it possible to build an application using the .NET speech recognition classes, pass in a WAV file, and have it go through the audio and create a text representation of it? For example, this is what I'm trying to do:
We have a QA department at my office that has to listen to hundreds of calls a day, which is quite impossible, and there aren't enough people to keep up with everything. What I want to do is have the audio file uploaded to our server and have the server parse it and create a transcript. It doesn't matter if it's not perfect; it just needs to be a base, since skimming a couple of dozen lines of text is easier than listening to a 2-hour recording.
Based on a saved transcript I can implement full-text search in the database and also run checks against the transcript to see whether someone is saying something that's a misrepresentation.
So, is it possible to create an application using the .NET speech recognition classes and just pass the WAV file to it and it spit out a rough transcript?
I've only dug around the Speech classes on MSDN briefly while thinking up the idea, so I don't know yet whether it can be done.
If it is possible, I would appreciate any examples in C#. Topic 1055347 is similar to my question and was given links, the most specific of which is in C++. I'm not a C++ developer, nor have I ever gone to school for programming; I'm entirely self-taught in C#, so I would like to stay in the language that I know.
Thanks in advance!
This sounds like you've got a call center type of application. Microsoft Speech Server has an SR engine optimized for telephony (8000 Hz sample rate), which will generate much better recognitions than the desktop SR engine. However, the engine isn't really designed for transcription (although it can do it), and the transcriptions definitely need to be reviewed before further processing occurs. Microsoft Exchange Unified Communications uses the SR engine to generate transcripts of voice mail, and while it's better than nothing, it often generates amusing nonsense.
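Since the sample rate decides which engine fits, it is worth checking what your recordings actually contain before choosing one. Here is a small sketch using Python's standard `wave` module (Python rather than C#, purely for brevity); it only handles uncompressed PCM WAV files, and the function name is made up.

```python
import wave


def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return rate, wav.getnchannels(), frames / rate
```

If the rate comes back as 8000, the recordings are telephony-grade audio and a desktop engine tuned for higher-quality microphone input is likely to do worse on them.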
With areas like speech recognition you are likely to find either a standalone EXE or an API in C/C++.
For the links in the other topic, you can use a tool like the P/Invoke Interop Assistant to generate C# code. The C# code acts as a wrapper around the unmanaged DLL, so you can call it from C#.
This is likely to be the best way to get the functionality you are looking for.
Yes.
I did such an application a few years ago on the Tablet PC; you can read about it at http://web.archive.org/web/20060615192119/www.devx.com/TabletPC/Article/30761 (At the time, I spoke of using Interop to access the libraries, but I believe that the programming model has remained the same, just with a managed wrapper.)
At the time, the results were very poor, but maybe for your use-case better than nothing.
How about routing the calls to Google Voice? I'm sure there are similar services. I have been amazed at its accuracy so far, plus you can click and listen to the recording if required. Google Voice will forward voice calls to SMS or email.
UPDATE: On rereading, since you are recording calls this may not work, as I use it for the voice messages people leave.
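Whichever engine produces the transcript, the check for misrepresentations mentioned in the question is the easy part. A minimal sketch in Python (standard library only), assuming the transcript is a plain string and the QA team maintains a list of phrases to flag; the function name and the sample phrases are invented for illustration.

```python
import re


def find_flagged_phrases(transcript, phrases):
    """Return (phrase, character_offset) pairs for whole-word, case-insensitive hits."""
    hits = []
    for phrase in phrases:
        # re.escape keeps punctuation in a phrase from being read as regex syntax.
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(transcript):
            hits.append((phrase, match.start()))
    return sorted(hits, key=lambda hit: hit[1])
```

For example, `find_flagged_phrases(text, ["risk free", "guaranteed returns"])` returns the hits in the order they occur, so a reviewer can jump straight to those places in the call. Because machine transcripts are rough, exact phrase matching will miss misheard words, so treat an empty result as "nothing found" rather than "nothing said".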