Huffman algorithm to decompress already compressed bytes - C#

I have a byte array in the database which has been compressed using the Huffman algorithm in Delphi. Now I need to decompress it in SSIS (SQL Server Integration Services).
I need some C#/.NET code, or a tool, to decompress the bytes into a file based on the Huffman algorithm. There are a few tools on the market that unpack a compressed file, but I don't have the file with the full header for LHA compression. I need a way to convert a series of bytes to decompressed data and then save it to a file.

With Huffman compression, each character in your input file maps to a sequence of bits. If you have access to the Delphi source code, you can try to figure out how the frequency and character mapping information was stored. Is the mapping included in the compressed data? You will need that mapping in order to write the code to decompress it.
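As a rough illustration of why the mapping is required, here is a minimal decoding sketch. Everything about the format is an assumption: the code table is hypothetical, bits are read most significant bit first, and the number of symbols is passed in separately. The real Delphi output may differ on all three points.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class HuffmanSketch
{
    // Decodes a bit stream (assumed MSB-first) using a prefix-code table.
    // symbolCount tells the decoder when to stop, so padding bits are ignored.
    public static byte[] Decode(byte[] data, int symbolCount, Dictionary<string, byte> codeToSymbol)
    {
        var output = new List<byte>(symbolCount);
        var current = new StringBuilder();
        foreach (byte b in data)
        {
            for (int bit = 7; bit >= 0 && output.Count < symbolCount; bit--)
            {
                current.Append(((b >> bit) & 1) == 1 ? '1' : '0');
                // A prefix code guarantees that the first match is the right one.
                if (codeToSymbol.TryGetValue(current.ToString(), out byte symbol))
                {
                    output.Add(symbol);
                    current.Clear();
                }
            }
        }
        return output.ToArray();
    }

    static void Main()
    {
        // Hypothetical table: 'a' = 0, 'b' = 10, 'c' = 11.
        var table = new Dictionary<string, byte>
        {
            ["0"] = (byte)'a',
            ["10"] = (byte)'b',
            ["11"] = (byte)'c',
        };
        // Bits 0 10 11 0 followed by two padding bits: 0101 1000 = 0x58.
        byte[] compressed = { 0x58 };
        byte[] decoded = Decode(compressed, 4, table);
        Console.WriteLine(Encoding.ASCII.GetString(decoded)); // prints "abca"
    }
}
```

Without the table (or the frequencies needed to rebuild it), the same bytes could decode to anything, which is why the Delphi source or a format description matters.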

Related

How to deflate minecraft chunk in c#?

So I am trying to make my own Minecraft map editor in Unity3D, which uses C#. I am using this as a reference: https://minecraft.gamepedia.com/Region_file_format. I am able to parse the region file's chunk table, and I can then use the table to find the compressed chunks starting with the zlib magic bytes 78 9C. My idea is to use System.IO.Compression.DeflateStream to decompress this chunk of data, but when I do this I get an error, IOException: Corrupted data ReadInternal, coming from the CheckResult method of DeflateStreamNative. If I export the chunk to a temporary file and use Node.js zlib to inflate it, then it works, but I think it is skipping the checksum because no matter how much data I give it, it works and gives me something back.
I can't imagine that the .NET inflator is broken so I must be doing something wrong. The compressed chunk length I parse before the compressed data seems to be wrong because my first chunk is said to be 644 bytes but when I look at the file in a hexdump I have 1,230 bytes before the zero padding. Unfortunately the 3 bit header of the zlib block indicates that the block is dynamically compressed so I cannot easily determine the EOB code and I really don't understand RFC 1951 so figuring out where the deflate buffer ends and the checksum starts is beyond me. And again nodejs zlib doesn't care if I give it 644 bytes or 1,230, but to get the full decompressed chunk I assume I need to give it all the compressed data.
Can anyone who has decompressed Minecraft chunks embedded in Anvil region files give me some insight?
So I compiled the puff.c example from GitHub, which is meant to demonstrate inflating zlib buffers, and noticed that the parser was not parsing the first two 'file magic' bytes of the buffer. I don't understand how it knows the window size, since that is encoded in those bytes. Anyway, I got the idea that the .NET inflater also doesn't want these bytes. And bam! It parses. So, some tips for anyone wanting to inflate Minecraft chunks: compute your seek offset into the .mca file as sectorIndex * 4096 + 7, not sectorIndex * 4096 + 5. And for anyone getting the checksum error from System.IO.Compression.DeflateStream, make sure you are sending it an RFC 1951 buffer and not an RFC 1950 buffer.
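The header-skipping trick can be sketched as follows. This is only an illustration under assumptions: the helper name InflateZlib is made up, and the test buffer is built in memory (header 78 9C plus raw deflate data, without the Adler-32 trailer) instead of being read from a real .mca file.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class ZlibSketch
{
    // Inflates an RFC 1950 (zlib) buffer by skipping its 2-byte header and
    // handing the rest to DeflateStream, which expects raw RFC 1951 data.
    // The Adler-32 checksum is not verified in this sketch.
    public static byte[] InflateZlib(byte[] zlibData)
    {
        using (var input = new MemoryStream(zlibData, 2, zlibData.Length - 2))
        using (var inflater = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            inflater.CopyTo(output);
            return output.ToArray();
        }
    }

    static void Main()
    {
        // Stand-in for a chunk: zlib header bytes followed by raw deflate data.
        byte[] original = Encoding.ASCII.GetBytes("hello hello hello");
        byte[] zlibData;
        using (var buffer = new MemoryStream())
        {
            buffer.WriteByte(0x78);
            buffer.WriteByte(0x9C);
            using (var deflater = new DeflateStream(buffer, CompressionMode.Compress, leaveOpen: true))
            {
                deflater.Write(original, 0, original.Length);
            }
            zlibData = buffer.ToArray();
        }

        Console.WriteLine(Encoding.ASCII.GetString(InflateZlib(zlibData))); // prints "hello hello hello"
    }
}
```

On .NET 6 or later, System.IO.Compression.ZLibStream reads the RFC 1950 wrapper itself, so the manual skip should not be needed there; whether that class is available in a given Unity runtime is something to check.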

How to decompress bytes from a zip file in C#

My program handles zip files with encrypted headers, it decrypts the headers and shows the info. Now I want to view the pictures within the zip file in a picturebox so I have to decompress the files into a memorystream.
I have all the bytes of the compressed files, which means: header, compressed data, extra length.
How can I decompress these bytes so I can view the file?
You should use the ZipArchive class to read the compressed data, since it appears you're reading valid zip files.
If you're using .NET 4 or older, you'll have to use a third-party library, like DotNetZip and its ZipInputStream class.
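A minimal sketch of the ZipArchive approach, assuming the bytes form a complete, unencrypted zip archive (including its central directory) by the time they reach this code. The helper name ExtractEntry and the entry name picture.txt are placeholders, and the archive is built in memory just so the example runs on its own.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class ZipSketch
{
    // Copies one entry of an in-memory zip archive into a MemoryStream.
    public static MemoryStream ExtractEntry(byte[] zipBytes, string entryName)
    {
        using (var archiveStream = new MemoryStream(zipBytes))
        using (var archive = new ZipArchive(archiveStream, ZipArchiveMode.Read))
        {
            ZipArchiveEntry entry = archive.GetEntry(entryName);
            if (entry == null)
                throw new FileNotFoundException("Entry not found in archive.", entryName);

            var result = new MemoryStream();
            using (Stream entryStream = entry.Open())
            {
                entryStream.CopyTo(result);
            }
            result.Position = 0; // rewind so the caller can read from the start
            return result;
        }
    }

    static void Main()
    {
        // Build a small archive in memory as a stand-in for the real file.
        byte[] zipBytes;
        using (var buffer = new MemoryStream())
        {
            using (var archive = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
            using (var writer = new StreamWriter(archive.CreateEntry("picture.txt").Open()))
            {
                writer.Write("stand-in for image bytes");
            }
            zipBytes = buffer.ToArray();
        }

        using (MemoryStream picture = ExtractEntry(zipBytes, "picture.txt"))
        using (var reader = new StreamReader(picture))
        {
            Console.WriteLine(reader.ReadToEnd()); // prints "stand-in for image bytes"
        }
    }
}
```

In a Windows Forms application the returned stream could then be handed to something like Image.FromStream for display. If only the bare local header plus compressed data of a single entry is available, ZipArchive is unlikely to accept it as is.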

Read and Encryption of a PDF file in c#

I want to read the content of a PDF file, encrypt the content using AES-256 encryption, and post the encrypted content as a Base64 string.
For that I have two possible solutions:
1) Read the content using a stream reader (PDF-formatted data), encrypt the content, Base64-encode it, and finally send the encrypted string.
2) Read the PDF content, convert it into text, then encrypt it and send it.
Which is the better method? If I use the first method, is there anything that could cause it to fail?
I need your opinion. Please help me.
Your first method seems absolutely fine, and I would certainly go for that approach. What you are essentially doing is simply transmitting a file from one machine to another.
If you consider this without encryption, all you should be aiming to do is send the file stream exactly as you read it. This ensures the receiver gets the file in its original state and can reliably open it, because it will be in the exact same format as it started.
Now, when we consider adding encryption, all we are doing is changing the raw binary data of the file. As long as we decrypt the file at the other end using the same key parameters, we can be sure that we will still have the same original raw file data we started with (assuming we don't get any data loss during the connection; you could add a hash check for this if desired, for example).
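The first method could look roughly like the sketch below. The helper names are made up, the eight PDF bytes are a stand-in for File.ReadAllBytes on the real file, and key exchange is out of scope: the sketch assumes both sides already share the 32-byte key and that the IV is sent along with the payload. It uses the defaults of Aes.Create(), which are CBC mode with PKCS7 padding.

```csharp
using System;
using System.Security.Cryptography;

static class AesSketch
{
    // Encrypts raw file bytes with AES-256 and returns the ciphertext as Base64.
    public static string EncryptToBase64(byte[] fileBytes, byte[] key, out byte[] iv)
    {
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;      // a 32-byte key selects AES-256
            aes.GenerateIV();   // fresh IV per message
            iv = aes.IV;
            using (ICryptoTransform encryptor = aes.CreateEncryptor())
            {
                byte[] cipher = encryptor.TransformFinalBlock(fileBytes, 0, fileBytes.Length);
                return Convert.ToBase64String(cipher);
            }
        }
    }

    // Reverses EncryptToBase64 using the same key and IV.
    public static byte[] DecryptFromBase64(string base64, byte[] key, byte[] iv)
    {
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            aes.IV = iv;
            using (ICryptoTransform decryptor = aes.CreateDecryptor())
            {
                byte[] cipher = Convert.FromBase64String(base64);
                return decryptor.TransformFinalBlock(cipher, 0, cipher.Length);
            }
        }
    }

    static void Main()
    {
        // Stand-in for the PDF file: the bytes of "%PDF-1.4".
        byte[] pdfBytes = { 0x25, 0x50, 0x44, 0x46, 0x2D, 0x31, 0x2E, 0x34 };

        byte[] key = new byte[32];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(key);
        }

        string payload = EncryptToBase64(pdfBytes, key, out byte[] iv);
        byte[] roundTrip = DecryptFromBase64(payload, key, iv);

        // The receiver gets back exactly the bytes that were read from the file.
        Console.WriteLine(BitConverter.ToString(roundTrip) == BitConverter.ToString(pdfBytes)); // prints "True"
    }
}
```

Because the file is treated as opaque bytes, nothing in this path depends on the PDF being parseable, which is the main advantage over converting it to text first.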

Need help manipulating WAV (RIFF) Files at a byte level

I'm writing an application in C# that will record audio files (*.wav) and automatically tag and name them. Wave files are RIFF files (like AVI) which can contain metadata chunks in addition to the waveform data chunks. So now I'm trying to figure out how to read and write the RIFF metadata to and from recorded wave files.
I'm using NAudio for recording the files, and asked on their forums as well on SO for way to read and write RIFF tags. While I received a number of good answers, none of the solutions allowed for reading and writing RIFF chunks as easily as I would like.
But more importantly I have very little experience dealing with files at a byte level, and think this could be a good opportunity to learn. So now I want to try writing my own class(es) that can read in a RIFF file and allow meta data to be read, and written from the file.
I've used streams in C#, but always with the entire stream at once. So now I'm a little lost, since I have to consider a file byte by byte. Specifically, how would I go about removing or inserting bytes in the middle of a file? I've tried reading a file through a FileStream into a byte array (byte[]) as shown in the code below.
System.IO.FileStream waveFileStream = System.IO.File.OpenRead(@"C:\sound.wav");
byte[] waveBytes = new byte[waveFileStream.Length];
waveFileStream.Read(waveBytes, 0, waveBytes.Length);
And I could see through the Visual Studio debugger that the first four bytes are the RIFF header of the file.
But arrays are a pain to deal with when performing actions that change their size, like inserting or removing values. So I was thinking I could then convert the byte[] into a List like this.
List<byte> list = waveBytes.ToList<byte>();
Which would make any manipulation of the file byte by byte a whole lot easier, but I'm worried I might be missing something like a class in the System.IO name-space that would make all this even easier. Am I on the right track, or is there a better way to do this? I should also mention that I'm not hugely concerned with performance, and would prefer not to deal with pointers or unsafe code blocks like this guy.
If it helps at all here is a good article on the RIFF/WAV file format.
I do not write C#, but I can point out some things that look bad from my point of view:
1) Do not read whole WAV files into memory unless the files are your own and are known to be small.
2) There is no need to insert data in memory. You can, for example, do roughly the following: analyze the source file, store the offsets of the chunks, and read the metadata into memory; present the metadata for editing in a dialog; when saving, write the RIFF-WAV header and the fmt chunk, transfer the audio data from the source file (by reading and writing blocks), add the metadata, and update the RIFF-WAV header.
3) Try to save the metadata at the tail of the file. That way, altering only the tags will not require rewriting the whole file.
It seems some sources regarding working with RIFF files in C# are present here.
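The "analyze the source file and store offsets of chunks" step could be sketched like this. It is a simplified illustration: the helper name ListChunks is made up, only top-level chunks are listed (a LIST chunk's sub-chunks are not descended into), and the test data is a tiny hand-built RIFF/WAVE buffer, not a real recording.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class RiffSketch
{
    // Returns the ID, data offset and data size of each top-level chunk,
    // seeking past the chunk bodies instead of loading them into memory.
    public static List<(string Id, long Offset, uint Size)> ListChunks(Stream stream)
    {
        var chunks = new List<(string Id, long Offset, uint Size)>();
        using (var reader = new BinaryReader(stream, Encoding.ASCII, leaveOpen: true))
        {
            string riff = Encoding.ASCII.GetString(reader.ReadBytes(4));
            reader.ReadUInt32(); // overall RIFF size (little-endian), not needed here
            string wave = Encoding.ASCII.GetString(reader.ReadBytes(4));
            if (riff != "RIFF" || wave != "WAVE")
                throw new InvalidDataException("Not a RIFF/WAVE stream.");

            while (stream.Position + 8 <= stream.Length)
            {
                string id = Encoding.ASCII.GetString(reader.ReadBytes(4));
                uint size = reader.ReadUInt32();
                chunks.Add((id, stream.Position, size));
                // Chunk data is padded to an even number of bytes.
                stream.Seek(size + (size & 1), SeekOrigin.Current);
            }
        }
        return chunks;
    }

    static void Main()
    {
        // Hand-built stand-in file: fmt (16 bytes), data (4 bytes), LIST (4 bytes).
        var file = new MemoryStream();
        using (var w = new BinaryWriter(file, Encoding.ASCII, leaveOpen: true))
        {
            w.Write(Encoding.ASCII.GetBytes("RIFF")); w.Write((uint)52);
            w.Write(Encoding.ASCII.GetBytes("WAVE"));
            w.Write(Encoding.ASCII.GetBytes("fmt ")); w.Write((uint)16); w.Write(new byte[16]);
            w.Write(Encoding.ASCII.GetBytes("data")); w.Write((uint)4); w.Write(new byte[4]);
            w.Write(Encoding.ASCII.GetBytes("LIST")); w.Write((uint)4);
            w.Write(Encoding.ASCII.GetBytes("INFO"));
        }
        file.Position = 0;

        foreach (var chunk in ListChunks(file))
        {
            Console.WriteLine($"{chunk.Id} at offset {chunk.Offset}, {chunk.Size} bytes");
        }
    }
}
```

With the offsets in hand, saving becomes a matter of copying the fmt and data chunks block by block into a new file, appending the edited metadata chunk, and patching the RIFF size field, so no List<byte> of the whole file is needed.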

Computing MD5SUM of large files in C#

I am using the following code to compute the MD5SUM of a file:
byte[] b = System.IO.File.ReadAllBytes(file);
string sum = BitConverter.ToString(new MD5CryptoServiceProvider().ComputeHash(b));
This works fine normally, but if I encounter a large file (~1GB) - e.g. an iso image or a DVD VOB file - I get an Out of Memory exception.
Though, I am able to compute the MD5SUM in cygwin for the same file in about 10secs.
Please suggest how can I get this to work for big files in my program.
Thanks
I suggest using the alternate method:
MD5CryptoServiceProvider.ComputeHash(Stream)
and just pass in an input stream opened on your file. This method will almost certainly not read in the whole file in memory in one go.
I would also note that in most implementations of MD5 it's possible to add byte[] data into the digest function a chunk at a time, and then ask for the hash at the end.
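Both suggestions can be sketched as below. The helper names are made up, and a MemoryStream holding "abc" stands in for the large file; in practice the stream would come from something like File.OpenRead(file). The chunked variant uses TransformBlock and TransformFinalBlock, which HashAlgorithm exposes for feeding data incrementally.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

static class Md5Sketch
{
    static string ToHex(byte[] hash)
    {
        return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
    }

    // Lets the hash algorithm pull data from the stream itself.
    public static string HashStream(Stream input)
    {
        using (MD5 md5 = MD5.Create())
        {
            return ToHex(md5.ComputeHash(input));
        }
    }

    // Feeds the digest one buffer at a time, then asks for the hash at the end.
    public static string HashInChunks(Stream input, int bufferSize = 81920)
    {
        using (MD5 md5 = MD5.Create())
        {
            var buffer = new byte[bufferSize];
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                md5.TransformBlock(buffer, 0, read, null, 0);
            }
            md5.TransformFinalBlock(buffer, 0, 0); // finalize with no extra data
            return ToHex(md5.Hash);
        }
    }

    static void Main()
    {
        byte[] data = Encoding.ASCII.GetBytes("abc");

        using (var first = new MemoryStream(data))
        using (var second = new MemoryStream(data))
        {
            Console.WriteLine(HashStream(first));           // prints "900150983cd24fb0d6963f7d28e17f72"
            Console.WriteLine(HashInChunks(second, 2));     // same hash, fed two bytes at a time
        }
    }
}
```

Either way only one buffer's worth of the file is held in memory at a time, so the size of the ISO or VOB file no longer matters.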
