Image recognition to read a 7-segment display - C#

I am using SURF with a set of templates to detect digits inside a seven-segment display, but it is not working well. Is there an approach that may be slower but is more effective? I am using the Emgu wrapper for OpenCV.

I would suggest you don't use SURF and instead look into using Tesseract for character recognition.
SURF is really good for recognising patterns such as logos and images, but for characters Tesseract will not only produce better results, it's also easier to implement!
You can create your own custom fonts to look for if the digits you are trying to read are non-standard.
https://www.youtube.com/watch?v=RqvvXJXuRYY&list=UUxAnMtjN08ryThpgYTBmILg
Try following this tutorial; it's really helpful for getting started with OCR.
It's in VB but it won't be hard to write in C# once you have got the logic down.
Hope this helps!

Related

Parse JPEG binary file

I have been searching the internet for two days and still can't find a good starting point for this. I want to write C# code that takes a .jpeg binary file, decodes it, and displays the image. Everywhere I look there are lots of explanations of the JPEG algorithm, but I still can't find a good explanation of how to parse and decode the file itself. For example, how can I know which byte values mark the start and the end of the Huffman DC table?
I would appreciate it if someone could point me to an explanation of how to parse a binary JPEG file.
Thank you, and sorry for my English.
Trust me, it isn't something you can do. I wouldn't touch it with a pole several meters long...
http://ijg.org/
This is the site of the IJG:
IJG is an informal group that writes and distributes a widely used free library for JPEG image compression. The first version was released on 7-Oct-1991.
The source code for libjpeg is available there.
If you just want to take a look, http://elm-chan.org/fsw/tjpgd/00index.html has the source of TJpgDec:
TJpgDec is a generic JPEG image decompressor module that highly optimized for small embedded systems.
It is even:
Platform independent. Written in ANSI-C.
Being tiny, it will probably be easy to reimplement in C# :-)
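On the specific question of where the Huffman tables start and end: a JPEG file is a sequence of marker segments, each introduced by 0xFF plus a marker byte, and most segments carry a two-byte big-endian length that includes the length field itself. Huffman tables live in DHT segments (marker 0xC4). Below is a minimal sketch of walking the segments, written in Python for brevity since the logic ports directly to C#. The function names iter_segments and parse_dht are made up for this illustration, and the sketch stops at the start-of-scan marker without decoding any image data.

```python
import struct

# Markers that stand alone and carry no length field: TEM and RST0..RST7.
STANDALONE = {0x01} | set(range(0xD0, 0xD8))


def iter_segments(data):
    """Yield (marker, payload) for each segment, up to and including SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos < len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        # A marker may be preceded by any number of 0xFF fill bytes.
        while pos < len(data) and data[pos] == 0xFF:
            pos += 1
        if pos >= len(data):
            return
        marker = data[pos]
        pos += 1
        if marker == 0xD9:  # EOI: end of image
            return
        if marker in STANDALONE:
            yield marker, b""
            continue
        (length,) = struct.unpack(">H", data[pos:pos + 2])
        yield marker, data[pos + 2:pos + length]
        pos += length
        if marker == 0xDA:  # SOS: entropy-coded data follows, stop here
            return


def parse_dht(payload):
    """Split a DHT payload into its tables (one segment may hold several)."""
    tables = []
    pos = 0
    while pos < len(payload):
        table_class = payload[pos] >> 4  # 0 = DC table, 1 = AC table
        table_id = payload[pos] & 0x0F
        counts = list(payload[pos + 1:pos + 17])  # codes of length 1..16
        total = sum(counts)
        symbols = list(payload[pos + 17:pos + 17 + total])
        tables.append({
            "class": "DC" if table_class == 0 else "AC",
            "id": table_id,
            "counts": counts,
            "symbols": symbols,
        })
        pos += 17 + total
    return tables
```

To list the tables in a file, read it as bytes and call parse_dht on the payload of every segment whose marker is 0xC4. Each table therefore has no fixed start or end value: its position comes from the segment length and from the sum of the sixteen count bytes.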

What should I use for monospaced digits recognition?

I have to recognize digits within an image from a video stream, and there are several things that should make recognition easier:
1) it is fixed font 6x8, all symbols are equal width
2) I know the exact positions of the digits; they are always rectangular and are not rotated/skewed/scaled, but there may be some distortion because of over-the-air transmission glitches.
3) It is only digits and .
4) digit background is semi black (50% opaque)
I've tried Tesseract v2 and v3, but the .NET wrappers aren't perfect and the recognition error rate was very high, even after training with a custom font. As far as I understand, that is because of the low resolution.
I wrote a very simple algorithm myself: it converts the image to black and white and counts matching pixels between the original font image and the image from the stream. It performs better than Tesseract, but I think a more sophisticated algorithm would do better.
I've tried to train AForge using ActivationNetwork with BackPropagationLearning, and it fails to converge (following the first part of this article, since I don't need scaling or several fonts: http://www.codeproject.com/Articles/11285/Neural-Network-OCR; as I understand it, the code in the article is for an older version of AForge). The bad thing is that this project is not supported anymore: the forum is closed and, as I understand it, so is the Google group.
I know there's an OpenCV port to .NET. As far as I can see, it has different network approaches than AForge, so the question is which approach would fit best.
So is there any .NET framework that can help me with this, and if it supports more than one neural network implementation, which one would fit best?
For fixed size fonts at a fixed magnification, you can probably get away with a less-sophisticated OCR approach based on template matching. See here for an example of how to do template matching using OpenCV (not .NET, but hopefully enough to get you started.) The basic idea is that you create a template for each digit, then try matching all templates at your target location, choosing the one with the highest match score. Because you know where the digits are located, you can search over a very small area for each digit. For more information on the theory behind template-matching, see this wiki article on Cross-correlation.
This is actually the basis for simplified OCR applications (usually for recognizing special OCR fonts, like the SEMI standard fonts used for printing serial numbers on silicon wafers.) The production-grade algorithms can also support tolerance for scaling, rotation and translation, but the underlying techniques are pretty much the same.
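The template-matching idea can be sketched in a few lines. The example below is an illustration only, written in dependency-free Python for brevity (the logic ports directly to C#): it scores each template against a cropped patch with zero-mean normalized cross-correlation and picks the best label. The 3x5 glyphs and the names glyph, ncc, classify, crop, and read_digits are assumptions made up for the sketch; a real setup would use the actual 6x8 font bitmaps as templates.

```python
import math


def glyph(rows):
    """Build a grayscale template from ASCII art ('#' = lit pixel)."""
    return [[255 if c == "#" else 0 for c in row] for row in rows]


# Hypothetical 3x5 templates, one per character to recognize.
TEMPLATES = {
    "0": glyph(["###", "#.#", "#.#", "#.#", "###"]),
    "1": glyph(["..#", "..#", "..#", "..#", "..#"]),
    "7": glyph(["###", "..#", "..#", "..#", "..#"]),
    "8": glyph(["###", "#.#", "###", "#.#", "###"]),
}


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-sized 2D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    if len(a) != len(b):
        raise ValueError("patch and template must have the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    # A completely flat patch has no contrast to correlate against.
    return num / den if den else 0.0


def classify(patch, templates):
    """Return the label whose template correlates best with the patch."""
    return max(templates, key=lambda label: ncc(patch, templates[label]))


def crop(image, x, y, w, h):
    """Cut a w-by-h patch whose top-left corner is at (x, y)."""
    return [row[x:x + w] for row in image[y:y + h]]


def read_digits(image, positions, templates, w, h):
    """Classify the patch at each known (x, y) position, left to right."""
    return "".join(classify(crop(image, x, y, w, h), templates)
                   for x, y in positions)
```

Because the score subtracts the mean and divides by the spread, a uniform change in brightness or contrast (such as a semi-transparent background) does not change the ranking, which is one reason correlation tends to be more robust than counting matching pixels after thresholding.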
Try looking at this project and this project too. Both projects explain how OCR works and show you how to implement it in C# and .NET.
If you are not in an absolute hurry, I would advise you to first look for a method that solves the problem. I've had good experiences with WEKA. Using WEKA you can test a bunch of algorithms pretty fast.
Once you have found the algorithm that solves your problem, you can either port it to .NET, build a wrapper, look for an existing implementation, or (if it's an easy algorithm) rebuild it in .NET.

OCR engine to capture characters from images

I'm using the C# tessnet2 wrapper for the Tesseract OCR engine to capture characters from image files. I have been searching everywhere to find out whether tessnet2 has any built-in functions to overwrite certain characters and save them into the same image file it is reading, but I have not found anything. So what I'm thinking of doing is creating a new image file based on what I receive from tessnet2; the new image needs to be exactly the same as the original, with just a few things changed. I'm not sure if I'm using the correct methodology, or if there are other C# assemblies out there that let you read characters from an image file and also manipulate them as needed.
Good luck--but tess has no way of replacing in the proper font. Raster graphics don't generally store glyph information. Even if it did, you would potentially be in violation of licenses and/or copyrights surrounding the fonts you'd be writing in. I'm not an expert in OCR, but I will confidently say that this is something not readily available out there in the wild.
To expand on Brian's answer:
You will need to do this yourself. I have not worked with Tesseract, but I have used the Nuance OCR engine. It will return you font information as well as coordinates for the character it has recognized (note that you will most likely have to compute the actual image coordinate as the OCR engine will have deskewed the image before performing the recognition). Once you get the coordinates and the deskew so that you can compute the actual coordinate, you can then use any image manipulation library (Leadtools, Accusoft, etc) or just straight GDI+ functions to clear the character, then using the font info and size info create a new character and merge it into the image. This is not trivial but certainly doable.
Edit:
It was late when I wrote the initial answer, and I wanted to clarify what is meant by font information. The OCR engine will give you information regarding the point size, whether it's bold/italicized, and the font family (Serif, etc.). I do not know of one that will tell you the exact font that the document is in. If you have a sample of the documents that you will process, then you can make a good guess based on the info the OCR engine gives you.

Image straightening algorithm

I am looking for a way to auto-straighten my images, and I was wondering if anyone has come across any algorithms to do this. I realize that the ability to do this depends on the content of the image, but any known algorithms would be a start.
I am looking to eventually implement this in C# or PHP, however, I am mainly after the algorithm right now.
Is this possible with OpenCV? ImageMagick? Others?
Many thanks,
Brett
Here is my idea:
1) edge detection (Sobel, Prewitt, Canny, ...)
2) Hough transform (horizontal lines +/- 10 degrees)
3) straighten the image according to the longest/strongest line
This is obviously not going to work for every type of image. It is just meant to fuel the discussion.
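A rough sketch of the Hough step, in dependency-free Python for brevity (it ports directly to C# or PHP). It assumes edge detection has already produced a list of (x, y) edge pixels, and the function name estimate_skew and its parameters are made up for this illustration. Each candidate angle within +/- 10 degrees of horizontal gets a set of bins keyed by the line's offset; the angle owning the fullest bin is taken as the skew.

```python
import math
from collections import Counter


def estimate_skew(edge_points, max_angle=10.0, step=0.5):
    """Return the angle (degrees) of the strongest near-horizontal line.

    The angle is measured in image coordinates (x to the right, y
    downward), so a positive value means the line drops toward the right.
    """
    votes = Counter()
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        theta = math.radians(angle)
        s, c = math.sin(theta), math.cos(theta)
        for x, y in edge_points:
            # Points on one line at this angle share the same offset rho,
            # so they all land in the same (angle, rho) bin.
            rho = round(y * c - x * s)
            votes[(angle, rho)] += 1
    (best_angle, _), _ = votes.most_common(1)[0]
    return best_angle
```

Rotating the image by the negative of the returned angle would then level the strongest line. The sketch votes with every point at every angle, which is fine for a few thousand edge pixels; for large images a library Hough implementation would be the more practical choice.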
Most OCR programs straighten the scanned image prior to running recognition. You can probably find good code in one of the many open source OCR programs, such as Tesseract.
Of course this does depend on what type of images you want to straighten, but there seem to be some resources available for automatic straightening of text scans.
One post I found mentioned 3 programs that could do auto-straightening:
TechSoft's PixEdit 7.0.11
Mystik Media's AutoImager 3.03
Spicer's Imagenation 7.50
If manual straightening is acceptable, there are many tutorials out there for how to straighten them manually using Photoshop; just google "image straightening"
ImageMagick has the -deskew option. This will simply rotate the image to be straight.
Most commercial OCR engines like ABBYY FineReader and Nuance OmniPage do this automatically.
The Leptonica research library has a command line tool called skewtest which will rotate the image.
I have not found a library which can take an image which has been distorted in any other way (like pin cushion or if it has been moved during a scanning operation, or removing the warp at the edge of a book). I am looking for a library or tool that can do this, but cannot find one.
Patrick.

Parse text from a screen grab

I'm not sure of the best way to explain this, but I'll give it a shot. I'm trying to find a way to parse text/numbers from a screen grab in either C# or Java, whichever provides the easiest way, but preferably Java.
An example would be as follows. You have a website/document/application with a block of text. You can take a screenshot of the specific area which contains this text. Once the screenshot has been taken you can extract a string from it containing the relevant characters.
Any feedback is appreciated. Thanks
You could try Tessnet2, a .NET 2.0 open source OCR assembly that uses the Tesseract engine.
If you are taking a screenshot, it will be saved into an image file. You can use OCR image tools to convert it to text, but the conversion is not always 100% accurate. Parsing text and/or numbers out of the result is just basic code in C# or Java.
FreeOCR is the best of all the OCRs I've tried.
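As an illustration of that last parsing step, here is a minimal sketch that pulls the numbers out of a string returned by an OCR engine. It is shown in Python for brevity; the same regular expression works with Java's java.util.regex or .NET's Regex. The function name extract_numbers is made up for the example, and the pattern only handles plain integers and decimals with an optional minus sign.

```python
import re


def extract_numbers(ocr_text):
    """Return every integer or decimal number found in the text as floats."""
    return [float(m) for m in re.findall(r"-?\d+(?:\.\d+)?", ocr_text)]
```

Since OCR output often confuses characters such as O and 0 or l and 1, it is usually worth validating the extracted values against what the screen is expected to show.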
