Faster Image Compression/Decompression - Latency Compared to Microsoft's JPEG - C#

May be a bit subjective.
But quite a straightforward question.
What is the fastest image compression/decompression (both together)?
And I mean available in C#.
I am pretty sure myself that it's JPEG.
But then again, JPEG has followed a standard for many years, and must abide by certain rules so it doesn't break compatibility.
So perhaps there is something better that I don't know of?
And when I say fastest, I mean latency and performance.
Meaning, say PNG, for example, takes 1 sec to compress a 1080p frame and 30 ms to decompress it; then from the source bitmap to the second bitmap there will be a 1.030 sec delay.
JPEG is a lot faster than PNG for many reasons, and it's extremely fast on decompression as well. And as with many other things, the encoder/decoder does most of the job, meaning a bad encoder will produce worse results even if the standard itself allows for much better.
I am currently limited to the built-in JPEG encoder/decoder, as I have not fully grasped how to P/Invoke other encoders/decoders (libjpeg etc.), but that's off topic here.
So hopefully this is a valid question, though I think it may be on the edge of that.
EDIT: I noticed that I had asked this before, but phrased differently. Now I have written about it more specifically, but I think it's pretty much a duplicate.
I leave it in your hands, moderators.

PNG is extremely slow, as you say. For a 10,000 x 10,000 pixel RGB image I see:
$ time vips copy wtc.jpg x.jpg
real 0m0.915s
user 0m1.652s
sys 0m0.052s
$ time vips copy wtc.png x.png
real 0m28.808s
user 0m32.448s
sys 0m0.272s
That's the time to decompress and recompress; real is wall-clock time, so PNG is about 30x slower.
Most of that is spent in deflate decompression and recompression. PNG has an option to set the compression level, with 0 being no compression, i.e. deflate turned off. It's still far slower than JPEG.
$ time vips copy wtc0.png x.png[compression=0]
real 0m6.552s
user 0m8.528s
sys 0m0.440s
About 7x slower, and of course with compression turned off the file will be much larger. I've no idea why libpng is so incredibly slow; it would be great if someone could make a libpng-turbo.
TIFF is probably the fastest widely-used format. I see:
$ time vips copy wtc.tif x.tif
real 0m0.637s
user 0m0.432s
sys 0m0.344s
So about 50% faster than JPEG, though again the file on disc will be much larger since the image is not compressed.
Formats like PPM are even faster. They are a simple dump of the image data with a small header giving dimensions. I see:
$ time vips copy wtc.ppm x.ppm
real 0m0.336s
user 0m0.196s
sys 0m0.296s
So almost 3x faster than JPEG. Again, the downside is that the file will be huge, since there's no compression.
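For a sense of how trivial the PPM container is, here is a minimal C# sketch (my illustration, not part of the original answer) that dumps an interleaved RGB buffer as a binary P6 PPM; the class and parameter names are made up for the example:

using System.IO;
using System.Text;

static class PpmWriter
{
    // Writes interleaved RGB bytes (3 bytes per pixel) as a binary P6 PPM.
    public static void Write(string path, byte[] rgb, int width, int height)
    {
        using (var fs = new FileStream(path, FileMode.Create))
        {
            // The entire "format" is this small ASCII header...
            byte[] header = Encoding.ASCII.GetBytes($"P6\n{width} {height}\n255\n");
            fs.Write(header, 0, header.Length);
            // ...followed by a raw dump of the pixel data, which is why it is so fast.
            fs.Write(rgb, 0, rgb.Length);
        }
    }
}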

Related

Compressing and decompressing very large files using System.IO.Compression.GzipStream

My problem can be described with the following statements:
I would like my program to be able to compress and decompress selected files
I have very large files (20 GB+). It is safe to assume that the size will never fit into memory
Even after compression, the compressed file might still not fit into memory
I would like to use System.IO.Compression.GzipStream from the .NET Framework
I would like my application to be parallel
As I am a newbie to compression/decompression, I had the following idea on how to do it:
I could split the files into chunks and compress each of them separately, then merge them back into a whole compressed file.
Question 1 about this approach - Is compressing multiple chunks and then merging them back together going to give me the proper result i.e. if I were to reverse the process (starting from compressed file, back to decompressed) will I receive the same original input?
Question 2 about this approach - Does this approach make sense to you? Perhaps you could direct me towards some good lecture about the topic? Unfortunately I could not find anything myself.
You do not need to chunk the compression just to limit memory usage. gzip is designed to be a streaming format, and requires on the order of 256KB of RAM to compress. The size of the data does not matter. The input could be one byte, 20 GB, or 100 PB -- the compression will still only need 256KB of RAM. You just read uncompressed data in, and write compressed data out until done.
The only reason to chunk the input as you describe is to make use of multiple cores for compression, which is a perfectly good reason for your amount of data. Then you can do exactly what you describe: so long as you combine the output in the correct order, the decompression will reproduce the original input. You can always concatenate valid gzip streams to make a valid gzip stream. I would recommend that you make the chunks relatively large, e.g. megabytes, so that the compression is not noticeably impacted by the chunking.
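A minimal C# sketch of that approach, assuming the built-in System.IO.Compression.GZipStream; the chunk size and class name are placeholders, and a production version would bound how many chunks are held in memory at once:

using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Threading.Tasks;

static class ChunkedGzip
{
    const int ChunkSize = 8 * 1024 * 1024; // large chunks so the ratio barely suffers

    public static void Compress(string inputPath, string outputPath)
    {
        // Split the input into fixed-size chunks. (A real version would limit how many
        // chunks are in flight at once to cap memory use; this sketch keeps it simple.)
        List<byte[]> chunks = ReadChunks(inputPath).ToList();

        // Compress each chunk independently, using multiple cores.
        var compressed = new byte[chunks.Count][];
        Parallel.For(0, chunks.Count, i =>
        {
            using (var ms = new MemoryStream())
            {
                using (var gz = new GZipStream(ms, CompressionLevel.Optimal, leaveOpen: true))
                {
                    gz.Write(chunks[i], 0, chunks[i].Length);
                }
                compressed[i] = ms.ToArray();
            }
        });

        // Write the gzip members back out in their original order; the concatenation
        // of valid gzip streams is itself a valid gzip stream.
        using (var output = File.Create(outputPath))
        {
            foreach (var member in compressed)
                output.Write(member, 0, member.Length);
        }
    }

    static IEnumerable<byte[]> ReadChunks(string path)
    {
        using (var input = File.OpenRead(path))
        {
            var buffer = new byte[ChunkSize];
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                var chunk = new byte[read];
                Array.Copy(buffer, chunk, read);
                yield return chunk;
            }
        }
    }
}

One caveat: gunzip and zlib happily decompress concatenated gzip members, but older versions of .NET's GZipStream may stop after the first member, so test the round trip on your target runtime.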
Decompression cannot be chunked in this way, but it is much faster so there would be little to no benefit even if you could. The decompression is usually i/o bound.

FileStream.SetLength(long length) too slow when length is in gigabytes

I need to write a small tool to eat up a disk's free space (just leaving a few kilobytes) to test some "low disk space" use cases. The code:
new FileStream(filename, FileMode.Create).SetLength(remainingFreeBytes - 1024); // leaving 1 KB of free space
But FileStream.SetLength(long length) is too slow if the length is in gigabytes; it is just as slow as copying big HD movies to the disk. (Edit: Sorry, I just realized that I experienced this only when writing to removable flash drives; if I write to other local drives, the speed is fast enough.)
So I wonder, is there a faster way to create blank files (that is, filled with zeros)?
Or is there another trick I can use to test the "low disk space" cases without having to write blank files?
You can create large files without writing to them via the Windows API call SetFileValidData.
However, note that it will NOT fill the file with zeros (which is why it is faster). Also, READ CAREFULLY the documentation for that function, since there are security implications.
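For reference, a rough sketch of calling it from C# via P/Invoke (my own illustration, not part of the original answer); the class name and sizes are placeholders, and the call only succeeds if the process has the SE_MANAGE_VOLUME_NAME privilege enabled (typically an elevated administrator token):

using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

static class FastBigFile
{
    // Extends the "valid data length" so Windows skips zero-filling the allocated clusters.
    // Read the SetFileValidData docs carefully: stale on-disk data can become readable.
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool SetFileValidData(SafeFileHandle hFile, long validDataLength);

    public static void Create(string path, long sizeInBytes)
    {
        using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            fs.SetLength(sizeInBytes);                       // reserve the space
            if (!SetFileValidData(fs.SafeFileHandle, sizeInBytes))
                throw new Win32Exception(Marshal.GetLastWin32Error());
        }
    }
}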
After a google search I was directed to this SO question:
Creating big file on Windows
That answers my question.

Simple 2-color differential image compression

Is there an efficient, quick and simple example of doing differential b/w image compression? Or even better, some simple (but lossless - jagged 1bpp images don't look very convincing when compressed using lossy compression) streaming technique which could accept a number of frames as input?
I have a simple b/w image (320x200) stream, displaying something similar to a LED display, which is updated about once a second using AJAX. Images are pretty similar most of the time, so if I subtracted them, result would compress pretty well (even with simple RLE). Is something like this available?
I don't know of any library that already exists that can do what you're asking other than just running it through gzip or some other lossless compression algorithm. However, since you know that the frames are highly correlated, you could XOR the frames like Conspicuous Compiler suggested and then run gzip on that. If there are few changes between frames, the result of the XOR should have a great deal less entropy than the original frame. This will allow gzip or another lossless compression algorithm to achieve a higher compression ratio.
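As a rough sketch of that approach (not a library, just an illustration), assuming both frames arrive as packed 1bpp buffers of equal length and using the built-in GZipStream as the lossless back end:

using System.IO;
using System.IO.Compression;

static class FrameDiff
{
    // XOR the new frame against the previous one; unchanged pixels become 0,
    // so the result is mostly zero bytes and compresses very well.
    public static byte[] EncodeDelta(byte[] previous, byte[] current)
    {
        var delta = new byte[current.Length];
        for (int i = 0; i < current.Length; i++)
            delta[i] = (byte)(previous[i] ^ current[i]);

        using (var ms = new MemoryStream())
        {
            using (var gz = new GZipStream(ms, CompressionLevel.Optimal, leaveOpen: true))
                gz.Write(delta, 0, delta.Length);
            return ms.ToArray();
        }
    }

    // Decompress the delta and XOR it onto the previous frame to rebuild the current one.
    public static byte[] DecodeDelta(byte[] previous, byte[] compressedDelta)
    {
        var frame = new byte[previous.Length];
        using (var gz = new GZipStream(new MemoryStream(compressedDelta), CompressionMode.Decompress))
        {
            int offset = 0, read;
            while ((read = gz.Read(frame, offset, frame.Length - offset)) > 0)
                offset += read;
        }
        for (int i = 0; i < frame.Length; i++)
            frame[i] ^= previous[i];
        return frame;
    }
}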
You would also want to send a key(non-differential) frame every once in a while so you can resynchronize in the event of errors.
If you are just interested in learning about compression, you could try implementing a bit-level RLE after XORing the frames, where each byte stores a 7-bit run length and a one-bit value. It should be pretty easy to implement, and it can achieve a best-case compression ratio of 128/8 = 16 if there are no changes between frames.
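A sketch of that RLE scheme (my own illustration): each output byte packs the pixel value in the high bit and a run length of 1-128 in the low seven bits, operating on the XORed frame as a flat bit array:

using System.Collections.Generic;

static class BitRle
{
    // bits[i] is one pixel of the XORed frame (true = changed).
    // Each output byte: high bit = pixel value, low 7 bits = run length - 1 (runs of 1..128).
    public static byte[] Encode(bool[] bits)
    {
        var output = new List<byte>();
        int i = 0;
        while (i < bits.Length)
        {
            bool value = bits[i];
            int run = 1;
            while (i + run < bits.Length && bits[i + run] == value && run < 128)
                run++;
            output.Add((byte)((value ? 0x80 : 0x00) | (run - 1)));
            i += run;
        }
        return output.ToArray();
    }

    public static bool[] Decode(byte[] encoded, int pixelCount)
    {
        var bits = new bool[pixelCount];
        int pos = 0;
        foreach (byte b in encoded)
        {
            bool value = (b & 0x80) != 0;
            int run = (b & 0x7F) + 1;
            for (int j = 0; j < run && pos < pixelCount; j++)
                bits[pos++] = value;
        }
        return bits;
    }
}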
Another thought is that if there are very few changes, you may want to just encode the bit positions that flipped between frames. You could address the 320x200 image with a 16-bit integer. For instance, if only 100 pixels change, you can just store 100 16 bit integers representing those positions (1600 bits) where the RLE discussed above would take 64000/16=4000 bits at the minimum (it would probably be quite a bit higher). You could actually switch between this method and RLE depending on the frame content.
If you wanted to go beyond simple methods, I would suggest using variable-length codes to represent the possible runs during the run-length encoding. You could then assign shorter codes to the runs with the highest probability. This would be similar to the RLE used in JPEG or MPEG after the lossy part of the compression is performed (DCT and quantization).

Image resizing efficiency in C# and .NET 3.5

I have written a web service to resize user uploaded images and all works correctly from a functional point of view, but it causes CPU usage to spike every time it is used. It is running on Windows Server 2008 64 bit. I have tried compiling to 32 and 64 bit and get about the same results.
The heart of the service is this function:
private Image CreateReducedImage(Image imgOrig, Size NewSize)
{
    var newBM = new Bitmap(NewSize.Width, NewSize.Height);
    using (var newGraphics = Graphics.FromImage(newBM))
    {
        // Favour speed for compositing and smoothing...
        newGraphics.CompositingQuality = CompositingQuality.HighSpeed;
        newGraphics.SmoothingMode = SmoothingMode.HighSpeed;
        // ...but use the most expensive resampling filter for the actual resize.
        newGraphics.InterpolationMode = InterpolationMode.HighQualityBicubic;
        newGraphics.DrawImage(imgOrig, new Rectangle(0, 0, NewSize.Width, NewSize.Height));
    }
    return newBM;
}
I put a profiler on the service and it seemed to indicate the vast majority of the time is spent in the GDI+ library itself and there is not much to be gained in my code.
Questions:
Am I doing something glaringly inefficient in my code here? It seems to conform to the example I have seen.
Are there gains to be had in using libraries other than GDI+? The benchmarks I have seen seem to indicate that GDI+ compares well to other libraries, but I didn't find enough of these to be confident.
Are there gains to be had by using "unsafe code" blocks?
Please let me know if I have not included enough of the code...I am happy to put as much up as requested but don't want to be obnoxious in the post.
Image processing is usually an expensive operation. You have to remember that a 32-bit color image is expanded in memory to 4 bytes * pixel width * pixel height before your app even starts any kind of processing. A spike is definitely to be expected, especially when doing any kind of pixel processing.
That being said, the only place I can see you being able to speed up the process or lower the impact on your processor is to try a lower-quality interpolation mode.
You could try
newGraphics.InterpolationMode = InterpolationMode.Low;
as HighQualityBicubic will be the most processor-intensive of the resampling operations, but of course you will then lose image quality.
Apart from that, I can't really see anything that can be done to speed up your code. GDI+ will almost certainly be the fastest on a Windows machine (no code written in C# is going to surpass a pure C library), and using other image libraries carries the potential risk of unsafe and/or buggy code.
The bottom line is, resizing an image is an expensive operation no matter what you do. The simplest solution in your case might simply be to replace your server's CPU with a faster model.
I know that the DirectX being released with Windows 7 is said to provide 2D hardware acceleration. Whether this implies it will beat out GDI+ on this kind of operation, I don't know. MS has a pretty unflattering description of GDI here which implies it is slower than it should be, among other things.
If you really want to try doing this kind of stuff yourself, there is a great GDI tutorial that shows it. The author makes use of both SetPixel and "unsafe blocks" in different parts of his tutorials.
As an aside, multi-threading will probably help you here, assuming your server has more than one CPU. That is, you can process more than one image at once and probably get faster results.
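A minimal sketch of that, reusing the CreateReducedImage method from the question; the ResizeBatch name and the list of uploaded paths are hypothetical, and MaxDegreeOfParallelism is capped so the web server keeps a core free:

using System;
using System.Drawing;
using System.IO;
using System.Threading.Tasks;

// Resizes a batch of uploaded images concurrently.
void ResizeBatch(string[] uploadedPaths, Size newSize, string outputDir)
{
    var options = new ParallelOptions
    {
        MaxDegreeOfParallelism = Math.Max(1, Environment.ProcessorCount - 1)
    };
    Parallel.ForEach(uploadedPaths, options, path =>
    {
        using (var original = Image.FromFile(path))
        using (var reduced = CreateReducedImage(original, newSize))
        {
            reduced.Save(Path.Combine(outputDir, Path.GetFileName(path)));
        }
    });
}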
When you write
I have written a web service to resize
user uploaded images
It sounds to me like the user uploads an image to a (web?) server, and the server then calls a web service to do the scaling?
If that is the case, I would simply move the scaling directly to the server. IMHO, scaling an image doesn't justify its own web service. And you get quite a bit of unnecessary traffic going from the server to the web service and back, in particular because the image is probably base64-encoded, which makes the data traffic even bigger.
But I'm just guessing here.
P.S. Unsafe blocks in themselves don't give any gain; they just allow unsafe code to be compiled. So unless you write your own scaling routine, an unsafe block isn't going to help.
You may want to try ImageMagick. It's free, and there are .NET wrappers available for it as well.
Or you can invoke its command-line tools from a shell.
We have used ImageMagick on Windows Servers now and then, for batch processing and sometimes for a more flexible image conversion.
Of course, there are commercial components as well, like those by Leadtools and Atalasoft. We have never tried those.
I suspect the spike is because you have the interpolation mode cranked right up. All interpolation modes work per pixel, and HighQualityBicubic is about as high as you can go with GDI+, so I suspect the per-pixel calculations are chewing up your CPU.
As a test, try dropping the interpolation mode down to InterpolationMode.NearestNeighbor and see if the CPU spike drops - if so, then that's your culprit.
If so, then do some trial and error for cost vs. quality; chances are you might not need HighQualityBicubic to get decent results.

Compressing/decompressing audio data

I am using the Win32 waveform APIs in a C# app to make a VoIP system. All is going well; however, I need some way of compressing the audio data on the fly.
So basically the audio data comes into a 'record' buffer of 150 bytes, then this buffer is sent over UDP, and at the remote end the 150 bytes are received and put into a 'play' buffer.
So I need some way of compressing/decompressing the data just before the UDP send and just after the UDP receive. Normal compression algorithms don't work well with audio, including the .NET GZip class.
Does anyone know of a library I can use that will help me do this?
Thanks in advance...
150 bytes is an unbelievably small buffer for audio data - less than 5 milliseconds for, e.g., 16 kHz mono. I'm no expert, but I think regardless of the compression scheme you choose, your compression ratio will suffer greatly for using such a small buffer. Besides that, there is significant overhead for each packet you send.
That said, if you are sending speech data, take a look at Speex for lossy compression (I have found it very effective at compressing speech, but the sound quality is terrible for music.)
I would think you'd want to batch up those 150-byte chunks to get better compression.
Although, even at small buffer sizes like that, you can still get some compression.
If the built-in GZipStream isn't working you could try the GZipStream that is included in DotNetZip. There is also a ZlibCodec class available in DotNetZip that implements the Codec pattern - this may facilitate compressing in 150-byte blocks.
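As a rough illustration of the batching idea (shown here with the built-in System.IO.Compression.GZipStream purely for the sketch; the DotNetZip classes are used much the same way, and a real speech codec such as Speex will compress voice far better), accumulate several 150-byte buffers and compress them as one block before the UDP send:

using System.IO;
using System.IO.Compression;

class AudioBatcher
{
    const int BatchChunks = 8;                 // 8 x 150 bytes per block; more = better ratio, more latency
    readonly MemoryStream _pending = new MemoryStream();

    // Returns a compressed packet once enough audio has accumulated, otherwise null.
    public byte[] Add(byte[] recordBuffer)     // the 150-byte chunk from the wave-in callback
    {
        _pending.Write(recordBuffer, 0, recordBuffer.Length);
        if (_pending.Length < BatchChunks * recordBuffer.Length)
            return null;

        using (var output = new MemoryStream())
        {
            using (var gz = new GZipStream(output, CompressionMode.Compress, leaveOpen: true))
                gz.Write(_pending.GetBuffer(), 0, (int)_pending.Length);
            _pending.SetLength(0);             // start the next batch
            return output.ToArray();           // send this over UDP
        }
    }
}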
The component you're looking for is more well-known as a coder/decoder, or codec, and there are many options when it comes to picking one.
As suggested above, I'd look into Speex. It's well supported, and now the de facto standard for Flash Player.
I assume from the size you are setting your buffers to that latency is an issue (the bigger the buffer, the bigger the latency), so don't go for a codec that has a large decompressed frame size, because it introduces high latency. This more or less rules out MP3: for voice at a 5 kHz output sample rate (it wouldn't serve much purpose going higher), the minimum decompressed frame size is 576 samples, or ~100 ms of data that must be encoded prior to sending. This means a round-trip latency of over 200 ms before you've even considered the network part of the problem.
