Best Jpeg Encoder for Silverlight 4.0 - c#

I want to convert a WriteableBitmap to a JPEG stream. There appears to be no platform support for this, but I can see a number of open-source encoder libraries on the web. Which one would you recommend in terms of performance and reliability?

I have had good experiences with FJCore.
I also blogged about it a while ago http://kodierer.blogspot.com/2009/11/convert-encode-and-decode-silverlight.html
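For context, the part of this job that does not depend on any particular encoder is unpacking the bitmap's pixels. The sketch below assumes the pixels arrive as packed 32-bit ARGB integers in row-major order (which is how WriteableBitmap.Pixels exposes them in Silverlight) and splits them into three planes indexed [x, y]. That planar layout is, as far as I recall, what FJCore expects as input, but treat that as an assumption and check it against the library source. The class and method names are made up for illustration, and the actual encoder call is deliberately left out.

```csharp
using System;

public static class PixelPlanes
{
    // Splits packed ARGB pixels (row-major) into three planes indexed [x, y]:
    // plane 0 = red, plane 1 = green, plane 2 = blue. Alpha is discarded.
    public static byte[][,] SplitArgb(int[] pixels, int width, int height)
    {
        if (pixels == null) throw new ArgumentNullException("pixels");
        if (width <= 0 || height <= 0 || pixels.Length != width * height)
            throw new ArgumentException("pixels.Length must equal width * height.");

        var planes = new byte[3][,];
        for (int c = 0; c < 3; c++)
            planes[c] = new byte[width, height];

        for (int y = 0; y < height; y++)
        {
            int rowStart = y * width;
            for (int x = 0; x < width; x++)
            {
                int argb = pixels[rowStart + x];
                planes[0][x, y] = (byte)((argb >> 16) & 0xFF); // red
                planes[1][x, y] = (byte)((argb >> 8) & 0xFF);  // green
                planes[2][x, y] = (byte)(argb & 0xFF);         // blue
            }
        }
        return planes;
    }
}
```

With a WriteableBitmap named bmp, the call would be PixelPlanes.SplitArgb(bmp.Pixels, bmp.PixelWidth, bmp.PixelHeight); the resulting planes are then handed to whichever encoder you pick.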

I've spent quite a bit of time with both FJCore and LibJpeg.Net. FJCore is easier to use, since it was ported over from Java, and has an object model that vaguely resembles what you'd expect to see in C#. However, LibJpeg.NET is by far the more complete library (it's based on the informally canonical libjpeg), and it's significantly faster as well. To give one example, FJCore uses a naive implementation of an inverse discrete cosine transform that involves something like 1024 multiplications and an additional 1024 additions for each 8x8 block. In contrast, LibJpeg.NET uses the high performance AAN algorithm which only takes 144 multiplications and 464 additions (see http://datasheets.chipdb.org/Intel/x86/MMX/MMX/AP528.HTM#AAN Algorithm). In addition, FJCore is fairly inefficient in how it uses memory, constantly recreating objects that could easily be re-used. At the same time, because FJCore has fewer optimizations, it's significantly easier to hack.
For my current project (which involves writing a video codec for Silverlight), I used FJCore as a starting point, fixed a whole bunch of its inefficiencies, replaced its IDCT algorithm with the one from LibJpeg.NET, and ended up with something that gave me about 10x the original performance.
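To make the operation count above concrete, here is a minimal sketch of a naive row-column 8x8 IDCT. It is not FJCore's actual code (I am only assuming it has roughly this shape), and the class and method names are invented. Each output sample takes 8 multiply-accumulate steps, there are 64 samples per pass and two passes, which is where the figure of about 1024 multiplications per block comes from.

```csharp
using System;

public static class NaiveIdct
{
    const int N = 8;

    // Basis[n, k] = s(k) * cos((2n + 1) * k * pi / 16), computed once.
    static readonly double[,] Basis = BuildBasis();

    static double[,] BuildBasis()
    {
        var t = new double[N, N];
        for (int n = 0; n < N; n++)
            for (int k = 0; k < N; k++)
            {
                double scale = k == 0 ? Math.Sqrt(1.0 / N) : Math.Sqrt(2.0 / N);
                t[n, k] = scale * Math.Cos((2 * n + 1) * k * Math.PI / (2 * N));
            }
        return t;
    }

    // Inverse 2D DCT of an 8x8 coefficient block, rows first, then columns.
    // 'multiplications' reports how many multiplies were performed.
    public static double[,] Inverse(double[,] coeffs, out int multiplications)
    {
        var tmp = new double[N, N];
        var result = new double[N, N];
        int count = 0;

        for (int r = 0; r < N; r++)          // 1D IDCT on every row
            for (int n = 0; n < N; n++)
            {
                double sum = 0;
                for (int k = 0; k < N; k++) { sum += Basis[n, k] * coeffs[r, k]; count++; }
                tmp[r, n] = sum;
            }

        for (int c = 0; c < N; c++)          // 1D IDCT on every column
            for (int n = 0; n < N; n++)
            {
                double sum = 0;
                for (int k = 0; k < N; k++) { sum += Basis[n, k] * tmp[k, c]; count++; }
                result[n, c] = sum;
            }

        multiplications = count;
        return result;
    }
}
```

A factored algorithm such as AAN gets its savings by sharing intermediate products between outputs instead of recomputing each sum from scratch.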

Ken, why don't you submit your updated code to the FJCore project?
http://code.google.com/p/fjcore/

Related

BinDCT implementation for a 32x32 matrix

So I am playing a bit with DCT implementations and noticed they are relatively slow because of all the multiplications involved.
After googling a bit, I came across BinDCT, which gives very good approximations of the DCT and only uses bit shifts and additions.
While scanning papers about it (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.7.834&rep=rep1&type=pdf and http://www.docstoc.com/docs/130118150/Image-Compression-Using-BinDCT) and reading some code I found on Ohloh (http://code.ohloh.net/file?fid=vz-HijUWVLFS65NRaGZpLZwZFq8&cid=mt_ZjvIU0Us&s=&fp=461906&projSelected=true#L0), I noticed there are only implementations for an 8x8 matrix.
I am looking for an implementation of BinDCT for a 32x32 matrix so I can use it in a faster variation of the perceptual hash algorithm (pHash).
I am no mathematician, and although I tried to understand what's going on in the paper and the C code I found, I just can't wrap my head around how to adapt this implementation to a 32x32 matrix.
Has anyone ever written one? Is it even possible?
I understand that extending the implementation requires a lot more bit shifting and tmp variables. But although I could try with trial and error, I don't even understand the theory, so I would never know if I get the correct result.
I am writing this in C#, but any language would suffice as it's all basic operations and can be easily translated.
1. You have a fixed input size,
so you multiply by the same weights every time.
Precompute them once and reuse them;
this eliminates all the sin/cos operations.
2. A 2D DCT can be computed as a series of 1D DCTs (as with the FFT):
first do the DCT on the rows,
then on the columns of the row-transformed result,
and multiply by the normalization constant.
This reduces the complexity from O(N^4) to O(N^3).
3. Use a fast DCT.
This one is very tricky.
The fast algorithm is a fusion of the (I)DST and the (I)DCT.
There are a few papers about it,
but they are vague (the equations differ from paper to paper and are incomplete).
I have actually never seen a working equation or program for it.
The only almost-working approach is to go through the FFT,
but for small N there is no gain because of the switch to the complex domain,
and the values are not really a DCT but a close approximation of it.
Of course I am no expert in this field, so I may have overlooked something
in all those hundreds of pages of equations.
Anyway, with a fast 1D (I)DCT combined with bullet 2,
the complexity of the 2D transform is around O((N^2).log(N)).
4. Ditching the FPU multiplications.
You can take all the weights and convert them to a1 = a0*1024
(or any other power-of-two scale),
so:
x*a0 = (x*a1)/1024 = (x*a1)>>10
The same can be done for the input data,
so only integer operations remain.
On modern machines this approach can be slower than using the FPU (it depends on platform and implementation).
5. Ditching the integer multiplications.
You can replace all multiplications with shift-and-add operations (look up binary multiplication),
but on modern machines this will actually slow things down.
Of course, if you are wiring this on some logic board/IO, then it has its merit.
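Bullets 1, 2 and 4 can be sketched roughly as follows. This is a plain orthonormal DCT-II for a 32x32 block with a precomputed weight table, applied to rows and then columns, plus a 1D integer variant that uses weights scaled by 1024. It is not BinDCT and not a fast algorithm, just the straightforward O(N^3) version; the class name, method names and the choice of a 10-bit shift are assumptions for illustration.

```csharp
using System;

public static class Dct32
{
    public const int N = 32;
    const int Shift = 10; // fixed-point scale of 2^10 = 1024

    // Weights[k, n] = s(k) * cos((2n + 1) * k * pi / (2N)), computed once (bullet 1).
    static readonly double[,] Weights = BuildWeights();
    // The same table scaled by 1024 and rounded to integers (bullet 4).
    static readonly int[,] FixedWeights = BuildFixedWeights();

    static double[,] BuildWeights()
    {
        var w = new double[N, N];
        for (int k = 0; k < N; k++)
        {
            double scale = k == 0 ? Math.Sqrt(1.0 / N) : Math.Sqrt(2.0 / N);
            for (int n = 0; n < N; n++)
                w[k, n] = scale * Math.Cos((2 * n + 1) * k * Math.PI / (2 * N));
        }
        return w;
    }

    static int[,] BuildFixedWeights()
    {
        var w = new int[N, N];
        for (int k = 0; k < N; k++)
            for (int n = 0; n < N; n++)
                w[k, n] = (int)Math.Round(Weights[k, n] * (1 << Shift));
        return w;
    }

    // 2D DCT as 1D DCTs over the rows, then over the columns (bullet 2).
    public static double[,] Forward(double[,] block)
    {
        var tmp = new double[N, N];
        var result = new double[N, N];
        for (int r = 0; r < N; r++)
            for (int k = 0; k < N; k++)
            {
                double sum = 0;
                for (int n = 0; n < N; n++) sum += Weights[k, n] * block[r, n];
                tmp[r, k] = sum;
            }
        for (int c = 0; c < N; c++)
            for (int k = 0; k < N; k++)
            {
                double sum = 0;
                for (int n = 0; n < N; n++) sum += Weights[k, n] * tmp[n, c];
                result[k, c] = sum;
            }
        return result;
    }

    // 1D DCT using only integer arithmetic: x*a0 is approximated by (x*a1) >> 10.
    public static int[] Forward1DFixed(int[] samples)
    {
        var output = new int[N];
        for (int k = 0; k < N; k++)
        {
            long acc = 0;
            for (int n = 0; n < N; n++) acc += (long)FixedWeights[k, n] * samples[n];
            output[k] = (int)(acc >> Shift); // one shift at the end keeps rounding error small
        }
        return output;
    }
}
```

For a pHash-style hash you would only need the low-frequency corner of the result, so the column pass can be restricted to the first few values of k, which cuts the work further.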
My only experience with matrices is in manipulating 3D vectors, so I can't answer your question directly. But in looking around, I did find this link to a blog where your specific issue is addressed. The comments at the bottom come from a number of people with knowledge in this area who could be a good pool of resources to talk to. Also, if you follow the links there is a lot of good image compression info.
The author appears to be heavily involved in photo forensics. He explains how pHash is more robust than the average hash and mentions using a 32 x 32 matrix.
This could be a really good starting point. Take care.
http://www.hackerfactor.com/blog/?/archives/432-Looks-Like-It.html

Perceptual image hashing

OK. This is part of a (non-English) OCR project. I have already completed preprocessing steps like deskewing, grayscaling, segmentation of glyphs etc. and am now stuck at the most important step: identification of a glyph by comparing it against a database of glyph images. I therefore need to devise a robust and efficient perceptual image hashing algorithm.
For many reasons, the function I require won't be as complicated as required by the generic image comparison problem. For one, my images are always grayscale (or even B&W if that makes the task of identification easier). For another, those glyphs are more "stroke-oriented" and have simpler structure than photographs.
I have tried some of my own and some borrowed ideas for defining a good similarity metric. One method was to divide the image into a grid of M x N cells and take average "blackness" of each cell to create a hash for that image, and then take Euclidean distance of the hashes to compare the images. Another was to find "corners" in each glyph and then compare their spatial positions. None of them have proven to be very robust.
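For reference, the grid-based method described above can be sketched as below. This is only one reading of the description, not the asker's actual code: it assumes an 8-bit grayscale image stored row-major with 0 as black, and the names are made up. Each hash entry is the mean "blackness" of one cell, normalized to the range 0..1.

```csharp
using System;

public static class GridHash
{
    // Computes the mean blackness (0 = white, 1 = black) of each cell
    // in a cols x rows grid laid over a row-major 8-bit grayscale image.
    public static double[] Compute(byte[] gray, int width, int height, int cols, int rows)
    {
        if (gray == null) throw new ArgumentNullException("gray");
        if (gray.Length != width * height)
            throw new ArgumentException("gray.Length must equal width * height.");

        var hash = new double[cols * rows];
        for (int cy = 0; cy < rows; cy++)
            for (int cx = 0; cx < cols; cx++)
            {
                // Integer cell bounds; cells tile the image without gaps.
                int x0 = cx * width / cols, x1 = (cx + 1) * width / cols;
                int y0 = cy * height / rows, y1 = (cy + 1) * height / rows;
                long sum = 0;
                int count = 0;
                for (int y = y0; y < y1; y++)
                    for (int x = x0; x < x1; x++)
                    {
                        sum += 255 - gray[y * width + x];
                        count++;
                    }
                hash[cy * cols + cx] = count == 0 ? 0.0 : sum / (255.0 * count);
            }
        return hash;
    }

    // Euclidean distance between two hashes of equal length.
    public static double Distance(double[] a, double[] b)
    {
        if (a.Length != b.Length) throw new ArgumentException("Hashes must have equal length.");
        double sumSquares = 0;
        for (int i = 0; i < a.Length; i++)
        {
            double d = a[i] - b[i];
            sumSquares += d * d;
        }
        return Math.Sqrt(sumSquares);
    }
}
```

Because each cell is tied to a fixed position, a glyph shifted by even a fraction of a cell changes several entries at once, which is one plausible reason this kind of hash is not robust to translation.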
I know there are stronger candidates like SIFT and SURF out there, but I have three good reasons not to use them. One is that I believe they are proprietary (or somehow patented) and cannot be used in commercial apps. The second is that they are very general-purpose and would probably be overkill for my somewhat simpler domain of images. The third is that I could not find any implementations for my platform (I'm using C#). I have even tried to port the pHash library to C#, but without success.
So I'm finally here. Does anyone know of any code (C#, C++, Java or VB.NET, as long as it doesn't require dependencies that cannot be used in the .NET world), library, algorithm, method or idea for creating a robust and efficient hashing algorithm that can survive minor visual defects like translation, rotation, scaling, blur, spots etc.?
It looks like you've already tried something similar to this, but it may still be of some use:
https://www.memonic.com/user/aengus/folder/coding/id/1qVeq

H.264 (or similar) encoder in C#?

Does anyone know of an open source H.264 encoder in C# (or any other managed language)? I might be able to make do with a python implementation as well.
The libraries that I’ve found (e.g. x264) are written in pretty low-level C (procedural with lots of macros) and assembly. Tweaking them is turning out to be far more complex than I'd thought. My project has no concern for performance or compatibility. We just want to test how some ideas will affect the perceived quality of the output video.
We’d be willing to pay for or license the code if need be.
Thanks in advance!
Edit - Some important points:
I don't care about performance (e.g. real time encoding) at all. It could take 10 days to encode for all I care.
A wrapper isn't helpful since I want to actually modify the encoder itself.
No one would likely spend the time to develop the codec in those languages because they would be hopelessly slow for actual encoding. However, the reference implementation should be less optimized and more useful to you. It is still in C.
http://iphome.hhi.de/suehring/tml/download/
I don't think the port you need exists yet. You'll find wrappers for any language, but a pure managed implementation does not have the critical mass. I'd recommend porting it yourself, documenting your port well, and then starting to tweak it.
If you'd want to port some encoder to C#, this should be easier to start with - about 8k LOC: http://sourceforge.net/projects/fevh264/
How about http://www.ffmpeg-csharp.com/?

Accessing math coprocessor from C#

How can I access the math coprocessor from C# code? I would like to perform some calculations on integers as fast as possible. I know C++ compilers allow inline assembler code, but what about .NET?
The JIT compiler knows about the math coprocessor and will use it. What you really want is to use the SIMD engine, not the math coprocessor. This was part of the promise of JIT-compilation, that the runtime could pick the fastest hardware acceleration available on each computer, but I don't think .NET actually does that, at least in v4.
Or are you using the term "math coprocessor" to mean something other than the x87 FPU? There are some FPGA boards marketed as accelerator/coprocessor systems. If that's what you mean, you'll need to consult the programming manual that comes with the particular product. There are no special CPU instructions for accessing those, so inline assembler wouldn't be helpful in this case.
For example, the GPU is even faster at math on large datasets than the CPU's SIMD engine, and you can access that from .NET using DirectX Compute Shaders (or p/invoking OpenCL), no assembler required.
I don't think that this would be possible to do directly from managed code. You could still call unmanaged code which does those calculations but whether the cost of interop marshaling is worth it is difficult to say. You will have to minimize it as much as possible and do all the calculations in unmanaged code and do only a single call to minimize overhead.
No, you cannot directly use inline assembler in C# managed code.
Your best bet is to make sure your general approach/algorithm is clean and efficient, and your math operations are clean and efficient, and then rely on the compiler to make efficient use of the available coprocessor.
This is not natively supported by C# as a language, nor .NET as a framework.
If you need that kind of speed or prowess, use something else altogether.
I know this is an old post, but this is for those who come here for a similar reason: speeding up maths operations, for example a large number of vector operations.
To get the greatest speed from C# in maths you should convert your formulae to their logarithmic equivalents. This takes some practice, but once you have the idea you can do it with every formula. Then keep your values in log form, converting to human-readable form only for those values the user needs to see.
The reason logs work faster is that multiplication and division become addition and subtraction (subtraction just being the addition of a complement), which your processor can do in large numbers with ease.
If you have not done this sort of maths before there are lessons online that will lead you through it, it has a learning curve but for maths/graphics programmers the learning curve is worth it.

C# - Default library has better performance?

Earlier today I made myself a lightweight memory stream, which basically writes to a byte array. I thought I'd benchmark the two of them to see if there's any difference, and there was:
(writing 1 byte to the array)
MemoryStream: 1.0001ms
mine: 3.0004ms
Everyone tells me that MemoryStream basically provides a byte array and a bunch of methods to work with it.
My question: does the default .NET library perform slightly better than the code we write ourselves? (Maybe because it runs in release rather than debug mode?)
The .NET implementation was probably a bit better than your own, but also, how did you benchmark? A couple of million iterations, or just a few? Remember that you need to use a large test base so that you can eliminate some data (CPU being called away for a moment, etc) that will give false results.
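A minimal sketch of what a more trustworthy measurement might look like: run the operation once as a warm-up so JIT compilation is not timed, then time a large number of iterations with Stopwatch. The helper names and the iteration counts in the usage note are arbitrary choices for illustration, not anything from the question.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class StreamBenchmark
{
    // Times 'iterations' runs of 'action' after one untimed warm-up run.
    public static TimeSpan Measure(Action action, int iterations)
    {
        action(); // warm-up: forces JIT compilation before the clock starts

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        watch.Stop();
        return watch.Elapsed;
    }

    // The operation under test: write 'count' single bytes to a MemoryStream.
    public static long WriteBytes(int count)
    {
        using (var stream = new MemoryStream())
        {
            for (int i = 0; i < count; i++)
                stream.WriteByte((byte)(i & 0xFF));
            return stream.Length;
        }
    }
}
```

Something like StreamBenchmark.Measure(() => StreamBenchmark.WriteBytes(1000), 100000) gives a total you can divide by the iteration count; repeat the same call with your own stream class, in a release build without a debugger attached, before comparing the two.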
The folks at Microsoft are much smarter than you and I and most likely have written a better optimized wrapper over Byte[], much better than something that you or I would implement.
If you are curious, I would suggest that you disassemble the types that you have recreated to see how exactly Microsoft has implemented them. In some of the more important areas of the framework (such as this I would imagine) you will find that the BCL calls out to unmanaged code to accomplish its goals.
Unmanaged code has a much better chance of outperforming managed code in cases like this since you can freely work with arrays without the overhead of a managed runtime (for things like bounds checking and such).
Many of the framework assemblies are NGENed, which may give them a small boost by bypassing the initial JIT time. This is unlikely to be the cause of a 2ms difference, especially if you'd already warmed up your methods before starting the stopwatch, but I mention it for completeness.
Also, yes, the framework assemblies are built in "release" mode (optimisations on and checks off), not "debug."
You probably used Array.Copy() instead of the faster Buffer.BlockCopy(). The fastest way is to use unsafe code with pointers. Check out how they do this in the Mono project (search for memcpy).
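As an illustration of the Buffer.BlockCopy suggestion, here is a small sketch of a growable byte writer that uses it both for appending and for resizing. The asker's own class is not shown, so the class name and its members are hypothetical, and whether BlockCopy actually beats Array.Copy on a given runtime is something to measure rather than assume.

```csharp
using System;

public sealed class ByteWriter
{
    private byte[] _buffer = new byte[16];
    private int _length;

    public int Length { get { return _length; } }

    // Appends 'count' bytes from 'source', starting at 'offset'.
    public void Write(byte[] source, int offset, int count)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (offset < 0 || count < 0 || offset + count > source.Length)
            throw new ArgumentOutOfRangeException("count");

        EnsureCapacity(_length + count);
        Buffer.BlockCopy(source, offset, _buffer, _length, count);
        _length += count;
    }

    // Returns a copy of the bytes written so far.
    public byte[] ToArray()
    {
        var copy = new byte[_length];
        Buffer.BlockCopy(_buffer, 0, copy, 0, _length);
        return copy;
    }

    // Doubles the buffer (at least) so that repeated writes stay amortized O(1).
    private void EnsureCapacity(int required)
    {
        if (required <= _buffer.Length) return;
        int newSize = Math.Max(required, _buffer.Length * 2);
        var grown = new byte[newSize];
        Buffer.BlockCopy(_buffer, 0, grown, 0, _length);
        _buffer = grown;
    }
}
```

The doubling strategy matters more than the copy primitive for single-byte writes: a writer that reallocates on every write will be slow regardless of how each copy is performed.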
I'd wager that Microsoft's implementation is a wee bit better than yours. ;)
Did you check the source?
