How to inflate (decompress) a Minecraft chunk in C#? - c#

I am trying to make my own Minecraft map editor in Unity3D, which uses C#. I am using https://minecraft.gamepedia.com/Region_file_format as a reference. I can parse the region file's chunk table and use it to find the compressed chunks, each starting with the zlib magic bytes 78 9C. My idea is to use System.IO.Compression.DeflateStream to decompress this data, but when I do I get the error "IOException: Corrupted data ReadInternal", coming from the CheckResult method of DeflateStreamNative. If I export the chunk to a temporary file and inflate it with Node.js zlib, it works, but I think Node.js is skipping the checksum, because no matter how much data I give it, it gives me something back.
I can't imagine that the .NET inflater is broken, so I must be doing something wrong. The compressed chunk length that I parse just before the compressed data seems to be wrong: my first chunk is said to be 644 bytes, but in a hexdump of the file I see 1,230 bytes before the zero padding. Unfortunately the 3-bit header of the first deflate block indicates dynamic Huffman compression, so I cannot easily find the end-of-block (EOB) code, and I don't understand RFC 1951 well enough to work out where the deflate data ends and the checksum starts. Again, Node.js zlib doesn't care whether I give it 644 bytes or 1,230, but I assume I need to give it all of the compressed data to get the full decompressed chunk.
Can anyone who has decompressed Minecraft chunks embedded in Anvil region files give me some insight?

I compiled the puff.c example from GitHub, which is meant to demonstrate inflating zlib buffers, and noticed that it does not parse the first two 'file magic' bytes of the buffer. I don't understand how it knows the window size, since that is encoded in those bytes. Anyway, it gave me the idea that the .NET inflater doesn't want these bytes either. And bam! It parses. So, some tips for anyone wanting to inflate Minecraft chunks: seek into the .mca file at sectorIndex * 4096 + 7, not sectorIndex * 4096 + 5. And if you are getting this error from System.IO.Compression.DeflateStream, make sure you are sending it a raw deflate buffer (RFC 1951) and not a zlib-wrapped buffer (RFC 1950).
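The tip above can be sketched as follows. This is a minimal illustration, not the asker's actual code: the class and method names (ChunkInflater, ReadChunk, InflateZlib) are hypothetical, and it assumes the layout described on the region file format page, where each chunk starts with a 4-byte big-endian length, then a 1-byte compression type (2 meaning zlib), then length - 1 bytes of compressed data. The sketch also drops the 4-byte Adler-32 trailer that ends a zlib stream, so the checksum is not verified.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class ChunkInflater
{
    // Reads the chunk stored at the given sector of an .mca stream and inflates it.
    public static byte[] ReadChunk(Stream region, int sectorIndex)
    {
        region.Seek((long)sectorIndex * 4096, SeekOrigin.Begin);

        byte[] header = new byte[5];
        ReadFully(region, header);
        // 4-byte big-endian length; it counts the compression-type byte too.
        int length = (header[0] << 24) | (header[1] << 16) | (header[2] << 8) | header[3];
        if (header[4] != 2)
            throw new NotSupportedException("Only zlib chunks (type 2) are handled in this sketch.");

        byte[] zlibData = new byte[length - 1];
        ReadFully(region, zlibData);
        return InflateZlib(zlibData);
    }

    // Inflates a complete zlib (RFC 1950) buffer with DeflateStream (RFC 1951).
    public static byte[] InflateZlib(byte[] zlibData)
    {
        if (zlibData == null || zlibData.Length < 6)
            throw new ArgumentException("Too short to be a complete zlib stream.");

        // Skip the 2-byte zlib header (78 9C) and leave out the 4-byte Adler-32 trailer.
        using (var input = new MemoryStream(zlibData, 2, zlibData.Length - 6))
        using (var inflater = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            inflater.CopyTo(output);
            return output.ToArray();
        }
    }

    // Stream.Read may return fewer bytes than requested, so loop until the buffer is full.
    private static void ReadFully(Stream stream, byte[] buffer)
    {
        int offset = 0;
        while (offset < buffer.Length)
        {
            int n = stream.Read(buffer, offset, buffer.Length - offset);
            if (n == 0) throw new EndOfStreamException();
            offset += n;
        }
    }
}
```

On .NET 6 and later, System.IO.Compression.ZLibStream reads the RFC 1950 wrapper directly; whether it is available depends on the runtime your Unity version targets.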

Related

C# Compress from memory buffer to another memory buffer in Zip format

I have a byte buffer (between 3000 and 50000 bytes) that I would like to send over a serial COM port from a PC to a µ-controller.
Since this transfer is slow (115200 bytes/s), I would like to compress the buffer in ZIP format in memory before sending it.
I don't want GZIP, as the destination controller can natively uncompress ZIP.
System.IO.Compression.ZipArchive seems to be meant for files, but I just want to stay in memory, using an input buffer and an output buffer.
How can I do that, please?
Thanks
David
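One possible approach, sketched under assumptions: ZipArchive accepts any Stream, so it can write into a MemoryStream instead of a file. The class name InMemoryZip and the entry name passed in are hypothetical, and whether the controller's unzip code accepts the archive this produces is not something the question lets us confirm.

```csharp
using System.IO;
using System.IO.Compression;

public static class InMemoryZip
{
    // Wraps payload in a one-entry ZIP archive held entirely in memory.
    public static byte[] Compress(byte[] payload, string entryName)
    {
        using (var zipBuffer = new MemoryStream())
        {
            // leaveOpen: true keeps zipBuffer usable after the archive is disposed.
            // The archive must be disposed before calling ToArray(), because the
            // central directory is only written on dispose.
            using (var archive = new ZipArchive(zipBuffer, ZipArchiveMode.Create, true))
            {
                ZipArchiveEntry entry = archive.CreateEntry(entryName, CompressionLevel.Optimal);
                using (Stream entryStream = entry.Open())
                {
                    entryStream.Write(payload, 0, payload.Length);
                }
            }
            return zipBuffer.ToArray();
        }
    }
}
```

A call such as InMemoryZip.Compress(buffer, "data.bin") returns the bytes to send over the serial port.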

Huffman algorithm to decompress already-compressed bytes

I have a byte array in the database which has been compressed using the Huffman algorithm in Delphi. Now I need to decompress it in SSIS (SQL Server Integration Services).
I need some C#/.NET code, or a tool, to decompress the bytes into a file based on the Huffman algorithm. There are a few tools on the market that unpack a compressed file, but I don't have the file with the full header for LHA compression. I need a way to convert a series of bytes into decompressed data and then save it to a file.
With Huffman compression, each character in your input file will map to a sequence of bits. If you have access to the Delphi source code, you can try to figure out how the frequency and character mapping information was done. Is the mapping included in the compressed file? You will need that mapping in order to write the code to decompress the file.
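To illustrate why the mapping is needed, here is a minimal decoding sketch. Everything about the data layout is an assumption, since the Delphi format is unknown: the code table is supplied by the caller as bit strings, bits are read most significant bit first, and the number of symbols is known in advance. The names HuffmanSketch and Decode are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class HuffmanSketch
{
    // codeTable maps a bit string such as "10" to the byte it stands for.
    // Huffman codes are prefix-free, so the first match while reading bits is the right one.
    public static byte[] Decode(byte[] packed, IDictionary<string, byte> codeTable, int symbolCount)
    {
        var output = new List<byte>(symbolCount);
        var current = new StringBuilder();

        for (int bitIndex = 0; bitIndex < packed.Length * 8 && output.Count < symbolCount; bitIndex++)
        {
            // Assumption: bits are packed most significant bit first.
            int bit = (packed[bitIndex / 8] >> (7 - bitIndex % 8)) & 1;
            current.Append(bit == 1 ? '1' : '0');

            byte symbol;
            if (codeTable.TryGetValue(current.ToString(), out symbol))
            {
                output.Add(symbol);
                current.Clear();
            }
        }

        if (output.Count != symbolCount)
            throw new InvalidDataException("Ran out of bits before decoding all symbols.");
        return output.ToArray();
    }
}
```

The real Delphi code may pack bits in the opposite order or store the table inside the blob, which is why its source is the thing to look at first.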

Is it possible to decompress a zip file while maintaining hierarchy using just .NET or some other built-in Windows API?

I have a zip file that contains folder hierarchies and files.
\images\
\images\1.jpg
\images\2.jpg
\something\something\a.exe
\something\something\b.exe
1.txt
I need to decompress the contents of this zip file to a location. I also need to preserve the structure of the zip file.
I've read about .NET's GZipStream and DeflateStream, but I am of the opinion that they are too "complicated" for my purpose.
I've also used DotNetZip and SharpZipLib in the past for personal projects but since this is work related and I'm working at a huge company, I would have a hard time convincing legal to use these libraries.
Question:
Is it possible to decompress a zip file while maintaining hierarchy using just .NET or some other built-in Windows API?
PS: I've also read this but I think it's hacky because you'll need to produce another executable just to hide the progress dialog.
Thanks!
Check out if Ionic Zip helps?
DotNetZip would do what you want, but I understand your concerns about legal approval.
On a side note, It might be good for you to navigate the legal jungle associated with getting an open-source library approved for use in the company, just to understand what's involved. But I'll leave that up to you.
Getting back to rolling your own...
DotNetZip is pretty full featured, and it handles a number of scenarios you probably don't care about. Like Unicode filenames and comments, setting windows timestamps and permissions of extracted files, getting timestamps of zip files created on old unix systems, split archives, Encrypted archives, files over 2gb, or self-extracting archives, etc etc etc. Many zip files use none of those things.
Also DotNetZip does eventing and zip updates and zip creation - all the code associated with these things is probably not of interest to you, if you confine yourself just to the requirements you described in your question.
You could, though, grab the DotNetZip code and use it to help you roll your own solution. If you constrain yourself to JUST reading zip files and not dealing with all the possible special cases, the zip format is not difficult to parse.
Here's how to do it:
1) Open the zip file using new FileStream() or File.Open. You want a FileStream object.
2) Read 4 bytes. Verify that they are the zip entry header signature (0x04034b50). In the file, these bytes appear in the order 50 4b 03 04. If you find a match, you're in business.
3) Read the entry header fields. Offsets are from the start of the entry, and multi-byte values use the same byte ordering as above:
   at offset 14, the 4-byte CRC. Get it.
   at offset 18, the 4-byte length of the compressed blob. Get it. (N)
   at offset 22, the 4-byte length of the UNcompressed blob. Get it. (U)
   at offset 26, the 2-byte length of the filename. Get it. (L)
   at offset 28, the 2-byte length of the "extra field". Get it. (E)
   at offset 30, the actual filename. Read L bytes for the filename, and call System.Text.Encoding.ASCII.GetString(). The result will include a directory path, with the backslashes replaced with slashes (unix style). String.Replace() the slashes.
   After the filename comes the extra field. Seek E bytes to get beyond it; you can mostly ignore it. This is where the compressed data starts.
4) Open a System.IO.Compression.DeflateStream on the zip FileStream, using CompressionMode.Decompress, with the FileStream positioned at the current offset. Open a new FileStream for output, with the file path you read in step 3. In a loop, call inflater.Read() and output.Write() to write the decompressed output of the DeflateStream to a filesystem file with the correct name. Stop reading from the DeflateStream when you have read exactly U (uncompressed) bytes.
5) Check the uncompressed size (U) against the amount of data you actually wrote out from the DeflateStream (after decompression). They should match.
6) If you are fancy, you can check the CRC of the output against what was in the header.
7) Go to step 2, to look for the next entry in the file.
The most complicated part is step 3. Working code for that is easily found in this source module, look for the ReadHeader method.
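The steps above can be sketched roughly as follows. This is an illustration under assumptions, not DotNetZip's code: the names SimpleUnzip and ExtractAll are hypothetical, only stored (method 0) and deflated (method 8) entries are handled, and archives that put sizes in a trailing data descriptor (flag bit 3), ZIP64 archives, and encrypted entries are rejected or not considered. It also deviates from step 4 in one respect: it reads the N compressed bytes into memory first, because DeflateStream reads ahead in its own buffer and would otherwise leave the FileStream past the start of the next entry.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class SimpleUnzip
{
    private const uint LocalHeaderSignature = 0x04034b50;

    public static void ExtractAll(string zipPath, string targetDir)
    {
        string root = Path.GetFullPath(targetDir).TrimEnd(Path.DirectorySeparatorChar)
                      + Path.DirectorySeparatorChar;
        using (var zip = new FileStream(zipPath, FileMode.Open, FileAccess.Read))
        using (var reader = new BinaryReader(zip)) // BinaryReader is little-endian, like zip
        {
            // The loop ends at the first non-entry signature (the central directory).
            while (zip.Position + 30 <= zip.Length && reader.ReadUInt32() == LocalHeaderSignature)
            {
                reader.ReadUInt16();                         // offset 4: version needed
                ushort flags = reader.ReadUInt16();          // offset 6: general purpose flags
                ushort method = reader.ReadUInt16();         // offset 8: 0 = stored, 8 = deflate
                reader.ReadUInt32();                         // offset 10: DOS time and date
                reader.ReadUInt32();                         // offset 14: CRC (not checked here)
                uint compressedSize = reader.ReadUInt32();   // offset 18: N
                uint uncompressedSize = reader.ReadUInt32(); // offset 22: U
                ushort nameLength = reader.ReadUInt16();     // offset 26: L
                ushort extraLength = reader.ReadUInt16();    // offset 28: E
                string name = Encoding.ASCII.GetString(reader.ReadBytes(nameLength));
                zip.Seek(extraLength, SeekOrigin.Current);   // skip the extra field

                if ((flags & 0x08) != 0)
                    throw new NotSupportedException("Sizes stored in a data descriptor.");
                byte[] compressed = reader.ReadBytes((int)compressedSize);

                string path = Path.GetFullPath(
                    Path.Combine(root, name.Replace('/', Path.DirectorySeparatorChar)));
                if (!path.StartsWith(root, StringComparison.Ordinal))
                    throw new InvalidDataException("Entry escapes the target directory: " + name);
                if (name.EndsWith("/")) { Directory.CreateDirectory(path); continue; }

                Directory.CreateDirectory(Path.GetDirectoryName(path));
                if (method == 0) File.WriteAllBytes(path, compressed);
                else if (method == 8) File.WriteAllBytes(path, Inflate(compressed, uncompressedSize));
                else throw new NotSupportedException("Compression method " + method);
            }
        }
    }

    // Inflates raw deflate data and insists on getting exactly U bytes back (step 5).
    private static byte[] Inflate(byte[] compressed, uint uncompressedSize)
    {
        var result = new byte[uncompressedSize];
        using (var inflater = new DeflateStream(new MemoryStream(compressed), CompressionMode.Decompress))
        {
            int total = 0;
            while (total < result.Length)
            {
                int n = inflater.Read(result, total, result.Length - total);
                if (n == 0) break;
                total += n;
            }
            if (total != result.Length)
                throw new InvalidDataException("Uncompressed size does not match the header.");
        }
        return result;
    }
}
```

If the project can target .NET 4.5 or later, System.IO.Compression.ZipFile.ExtractToDirectory does the whole job with the framework alone and handles the cases this sketch skips.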
Maybe the full feature set of GZipStream is a bit complicated, but note that the sample on the MSDN page is exactly what you need. I mean this MSDN page (the 4.0 version), not the one you linked in the question.
http://msdn.microsoft.com/en-us/library/system.io.compression.gzipstream.aspx#Y2750

Need help manipulating WAV (RIFF) Files at a byte level

I'm writing an application in C# that will record audio files (*.wav) and automatically tag and name them. Wave files are RIFF files (like AVI), which can contain metadata chunks in addition to the waveform data chunks. So now I'm trying to figure out how to read and write the RIFF metadata to and from recorded wave files.
I'm using NAudio for recording the files, and asked on their forums as well as on SO for a way to read and write RIFF tags. While I received a number of good answers, none of the solutions allowed for reading and writing RIFF chunks as easily as I would like.
More importantly, I have very little experience dealing with files at the byte level, and I think this could be a good opportunity to learn. So now I want to try writing my own class(es) that can read in a RIFF file and allow metadata to be read from and written to the file.
I've used streams in C#, but always with the entire stream at once, so I'm a little lost now that I have to consider a file byte by byte. Specifically, how would I go about removing or inserting bytes in the middle of a file? I've tried reading a file through a FileStream into a byte array (byte[]), as shown in the code below.
System.IO.FileStream waveFileStream = System.IO.File.OpenRead(@"C:\sound.wav");
byte[] waveBytes = new byte[waveFileStream.Length];
waveFileStream.Read(waveBytes, 0, waveBytes.Length);
And I could see through the Visual Studio debugger that the first four bytes are the RIFF header of the file.
But arrays are a pain to deal with when performing actions that change their size, like inserting or removing values. So I was thinking I could then convert the byte[] into a List like this.
List<byte> list = waveBytes.ToList<byte>();
Which would make any manipulation of the file byte by byte a whole lot easier, but I'm worried I might be missing something like a class in the System.IO name-space that would make all this even easier. Am I on the right track, or is there a better way to do this? I should also mention that I'm not hugely concerned with performance, and would prefer not to deal with pointers or unsafe code blocks like this guy.
If it helps at all here is a good article on the RIFF/WAV file format.
I don't write C#, but I can point out some things that look wrong from my point of view:
1) Do not read whole WAV files into memory unless the files are your own and are known to be small.
2) There is no need to insert data in memory. You can, for example, do roughly the following: analyze the source file, store the offsets of its chunks, and read the metadata into memory; present the metadata for editing in a dialog; when saving, write the RIFF-WAV header and the fmt chunk, transfer the audio data from the source file (by reading and writing blocks), add the metadata, and then update the RIFF-WAV header.
3) Try to save the metadata at the tail of the file. Then altering only the tags will not require rewriting the whole file.
It seems some sources on working with RIFF files in C# are available here.
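The "analyze the source file and store the offsets of chunks" part of point 2 might look like the sketch below. It assumes the standard RIFF layout (a 12-byte "RIFF" + size + "WAVE" header, then chunks made of a 4-byte ASCII id, a 4-byte little-endian size, and data padded to an even length) and a seekable stream. The names RiffChunkInfo, RiffScanner, and ListChunks are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public sealed class RiffChunkInfo
{
    public string Id;        // four-character chunk id, e.g. "fmt ", "data", "LIST"
    public long DataOffset;  // position of the chunk's data within the stream
    public uint Size;        // data size in bytes, excluding any pad byte
}

public static class RiffScanner
{
    // Walks the top-level chunks of a WAV stream without loading the audio data.
    public static List<RiffChunkInfo> ListChunks(Stream wav)
    {
        var reader = new BinaryReader(wav); // not disposed, so the caller keeps the stream
        string riff = Encoding.ASCII.GetString(reader.ReadBytes(4));
        uint riffSize = reader.ReadUInt32();
        string form = Encoding.ASCII.GetString(reader.ReadBytes(4));
        if (riff != "RIFF" || form != "WAVE")
            throw new InvalidDataException("Not a RIFF/WAVE stream.");

        // riffSize counts everything after the first 8 bytes.
        long end = Math.Min(wav.Length, 8 + (long)riffSize);
        var chunks = new List<RiffChunkInfo>();
        while (wav.Position + 8 <= end)
        {
            string id = Encoding.ASCII.GetString(reader.ReadBytes(4));
            uint size = reader.ReadUInt32();
            chunks.Add(new RiffChunkInfo { Id = id, DataOffset = wav.Position, Size = size });
            // Chunk data is padded to an even number of bytes.
            wav.Seek((long)size + (size & 1), SeekOrigin.Current);
        }
        return chunks;
    }
}
```

With the offsets in hand, saving becomes a matter of copying the "fmt " and "data" chunks block by block into a new file and writing the edited metadata chunk after them.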

Computing MD5SUM of large files in C#

I am using the following code to compute the MD5SUM of a file:
byte[] b = System.IO.File.ReadAllBytes(file);
string sum = BitConverter.ToString(new MD5CryptoServiceProvider().ComputeHash(b));
This works fine normally, but if I encounter a large file (~1GB) - e.g. an iso image or a DVD VOB file - I get an Out of Memory exception.
Though, I am able to compute the MD5SUM in cygwin for the same file in about 10secs.
Please suggest how I can get this to work for big files in my program.
Thanks
I suggest using the alternate method:
MD5CryptoServiceProvider.ComputeHash(Stream)
and just pass in an input stream opened on your file. This method will almost certainly not read in the whole file in memory in one go.
I would also note that in most implementations of MD5 it's possible to add byte[] data into the digest function a chunk at a time, and then ask for the hash at the end.
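Both suggestions can be sketched as below. The class and method names are hypothetical, and the sketch uses MD5.Create() rather than constructing MD5CryptoServiceProvider directly; both derive from HashAlgorithm, so ComputeHash(Stream), TransformBlock, and TransformFinalBlock are available either way. The 80 KB buffer size is an arbitrary choice.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class FileHasher
{
    // Hashes a file by streaming it, so memory use does not grow with file size.
    public static string ComputeMd5(string path)
    {
        using (var md5 = MD5.Create())
        using (var stream = File.OpenRead(path))
        {
            return ToHex(md5.ComputeHash(stream));
        }
    }

    // Same result, feeding the digest one chunk at a time.
    public static string ComputeMd5Chunked(Stream stream)
    {
        using (var md5 = MD5.Create())
        {
            var buffer = new byte[81920];
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                // A null output buffer is allowed; we only want the running hash.
                md5.TransformBlock(buffer, 0, read, null, 0);
            }
            md5.TransformFinalBlock(buffer, 0, 0);
            return ToHex(md5.Hash);
        }
    }

    private static string ToHex(byte[] hash)
    {
        return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
    }
}
```

The chunked variant is useful when you also want to report progress or hash data that does not come from a single stream.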
