Unzip internal ZIP file to path

Unzip internal ZIP file to path - c#

I have an application in which I want to copy directories from an internal ZIP to a path.
Did some searching and found this: Decompress byte array to string via BinaryReader yields empty string. However, the result is simply bytes. I haven't a clue about how to translate this back into folders that can then be moved to a path. (Working with just bytes is confusing to me)
Doing some more searching on here pointed me to the .NET 4.5 feature:
https://learn.microsoft.com/en-us/dotnet/standard/io/how-to-compress-and-extract-files
There's one complication: I don't have a zip path, but rather an array of bytes from the zip, kept internally inside my application. Keeping this in mind, how would I go about using this ZipFile feature with an array of bytes as input instead?
Some other things I've looked at:
Compress a single file using C#
https://msdn.microsoft.com/en-us/library/system.io.compression.zipfile%28v=vs.110%29.aspx
How to extract zip file contents into a folder in .NET 4.5
Note: for this particular application, I'd like to refrain from using external DLLs. A portable CLI executable is what I'm aiming for.

To deal with having only bytes and still unzip them (without using MemoryBuffer, as that still makes no sense to me), I ended up creating a temporary folder, creating an empty file in that folder, filling it with the bytes of the zipped file, and then using ZipFile.ExtractToDirectory() to extract it to the final destination.
It may not be the most efficient, but it works quite well.
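For reference, a minimal sketch of both routes, assuming .NET 4.5 or later with references to System.IO.Compression and System.IO.Compression.FileSystem. The class and method names (ZipBytes, ExtractViaTempFile, ExtractInMemory) are made up for illustration. The first method follows the temporary-file approach described above (using a temporary file rather than a temporary folder); the second wraps the bytes in a MemoryStream and hands that to ZipArchive, so no temporary file is needed.

```csharp
using System.IO;
using System.IO.Compression;

static class ZipBytes
{
    // Temporary-file route: write the bytes to disk, then let ZipFile do the work.
    public static void ExtractViaTempFile(byte[] zipBytes, string destination)
    {
        string tempZip = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        try
        {
            File.WriteAllBytes(tempZip, zipBytes);
            ZipFile.ExtractToDirectory(tempZip, destination);
        }
        finally
        {
            File.Delete(tempZip); // does nothing if the file was never created
        }
    }

    // In-memory route: ZipArchive accepts any seekable stream, not only a file.
    public static void ExtractInMemory(byte[] zipBytes, string destination)
    {
        using (var stream = new MemoryStream(zipBytes))
        using (var archive = new ZipArchive(stream, ZipArchiveMode.Read))
        {
            // Extension method from ZipFileExtensions; recreates the folder structure.
            archive.ExtractToDirectory(destination);
        }
    }
}
```

Either method can be called as ZipBytes.ExtractInMemory(bytes, targetPath), where bytes is the embedded ZIP and targetPath is the folder to extract into.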

Related

How do I get the file version from bytes or stream?

I get the file version this way:
var fileVersion = FileVersionInfo.GetVersionInfo(path).FileVersion
But this option is not suitable for me, since I have to use a non-native tool to get the file that returns the stream. Can I get the file version from this stream or from an array of bytes?

Unfortunately, you can't do this directly. You should:
1. Write the file to disk in some sort of temporary location.
2. Read the version from the file on disk.
3. Delete the file.

In short, no, what you want is not possible with the current tools. The problem is that, as you've noticed, FileVersionInfo.GetVersionInfo relies on a physical file to be present on disk. If you look at its internals, you'll see that all it really does is to delegate to the Windows API which does the real work, precisely in the GetFileVersionInfo function, which in turn also takes a file name as parameter, so it's only designed to operate from the filesystem.
A possible workaround would be to drop a temp file with the binary you got from your stream, get the version info you need, then delete the file.
Another option would be to look for a library that can parse in-memory exe/dll files and extract the relevant details directly from there.
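The temp-file workaround could look roughly like this. It is only a sketch: the class and method names (VersionFromStream, GetFileVersion) are hypothetical, and it assumes the stream is positioned at the start of the binary.

```csharp
using System.Diagnostics;
using System.IO;

static class VersionFromStream
{
    // Copies the stream to a temporary file, reads the version, then cleans up.
    public static string GetFileVersion(Stream source)
    {
        string tempPath = Path.GetTempFileName();
        try
        {
            using (var file = File.Create(tempPath))
            {
                source.CopyTo(file);
            }
            // May be null if the file carries no version resource.
            return FileVersionInfo.GetVersionInfo(tempPath).FileVersion;
        }
        finally
        {
            File.Delete(tempPath);
        }
    }

    // Convenience overload for a byte array.
    public static string GetFileVersion(byte[] bytes)
    {
        using (var stream = new MemoryStream(bytes))
        {
            return GetFileVersion(stream);
        }
    }
}
```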

Unpacking tar/BZ2 files using C#

I have a tar.bz2 file and I want to extract it to a directory. In the examples I only see options to compress or decompress; however, what I actually want is to extract or unpack.
I also tried ICSharpCode.SharpZipLib.BZip2 but didn't find an option to unpack.

While you use a ZipInputStream for .zip files, you should use a BZip2InputStream for .bz2 files (and GZipInputStream for .gz files etc.).
Taken from:
How to decompress .bz2 file in C#?

Decompressing and unpacking are two different operations. A foo.tar.bz2 file is actually a foo.tar file which was then compressed using bz2.
So to get the individual files you have to go in the opposite direction: first decompress it (which you managed to do with SharpZipLib). The result of this decompression then has to be untarred (which can also be done with SharpZipLib); see the docs for details.
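The two stages can be chained so that the decompressed tar data never has to be written to disk. The sketch below assumes the SharpZipLib package is referenced; TarBz2 and Extract are illustrative names. Newer SharpZipLib releases mark the single-argument CreateInputTarArchive overload as obsolete in favour of one that also takes a name encoding, so check the version you are using.

```csharp
using System.IO;
using ICSharpCode.SharpZipLib.BZip2;
using ICSharpCode.SharpZipLib.Tar;

static class TarBz2
{
    public static void Extract(string archivePath, string destination)
    {
        using (Stream file = File.OpenRead(archivePath))
        // Stage 1: undo the bz2 compression, yielding the raw tar stream.
        using (Stream bzip2 = new BZip2InputStream(file))
        // Stage 2: unpack the tar stream into files and folders.
        using (TarArchive tar = TarArchive.CreateInputTarArchive(bzip2))
        {
            tar.ExtractContents(destination);
        }
    }
}
```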

packaging files to be read without need for extraction

In my current project I'm dealing with a huge number of small files (tens of billions of files, each between 1 and 30 KB) as resources, and copying them for my customer is a time-consuming job. I'm searching for a packaging mechanism that lets me package every 1,000 or 10,000 of them into a single file, which should speed up copying because I'd be dealing with far fewer files. Reading them from my application should not require any extraction, and there should be no compression when I write or change them (because of performance and the nature of the application, which is distributed, with resources shared between clients). I have searched and I know about the following ZIP libraries:
SharpZipLib
DotNetZip
System.IO.Packaging
But it seems the above libraries need to at least iterate through the files to access a file in the zip or package without extraction. I need to access the files via their address (folder structure hierarchy) in the zip or package file! The following links are similar questions, which are answered by iterating through the zip file:
how-to-read-data-from-a-zip-file-without-having-to-unzip-the-entire-file
content-inside-zip-file
Does anyone have an idea or solution for this issue?
By the way, I'm coding in C# and the project is Windows Forms-based.

I would create my own package format, with GZipStream or something else. Compress each file with GZipStream, take the resulting bytes, and create a header in your package format that contains, for each file, its name, starting position and length. This header will probably sit at the beginning of your package. With it you can look up the information for the file you want, seek to the position of its compressed data, and read a byte array of the specified length.
But if you modify one file, you will need to recalculate the index entries of all the files after it.
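A minimal sketch of such a format follows. The layout (entry count, then name/offset/length per entry, then the compressed blobs) and the names SimplePackage, Write and Read are assumptions made for illustration, not an established format. Offsets are stored relative to the end of the header, so the header size does not need to be known in advance.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;

static class SimplePackage
{
    // Layout: entry count, then (name, offset, length) per entry, then the blobs.
    public static void Write(Stream output, IDictionary<string, byte[]> files)
    {
        var writer = new BinaryWriter(output, Encoding.UTF8);
        var blobs = new List<byte[]>();
        long offset = 0;
        writer.Write(files.Count);
        foreach (var file in files)
        {
            byte[] blob = Compress(file.Value);
            blobs.Add(blob);
            writer.Write(file.Key);    // name, e.g. "images/1.jpg"
            writer.Write(offset);      // start, relative to the end of the header
            writer.Write(blob.Length); // compressed length
            offset += blob.Length;
        }
        foreach (byte[] blob in blobs) writer.Write(blob);
        writer.Flush();
    }

    // Reads one entry by name: scan the header, seek, decompress only that blob.
    public static byte[] Read(Stream input, string name)
    {
        input.Position = 0;
        var reader = new BinaryReader(input, Encoding.UTF8);
        int count = reader.ReadInt32();
        long foundOffset = -1;
        int foundLength = 0;
        for (int i = 0; i < count; i++)
        {
            string entryName = reader.ReadString();
            long offset = reader.ReadInt64();
            int length = reader.ReadInt32();
            if (entryName == name) { foundOffset = offset; foundLength = length; }
        }
        if (foundOffset < 0) throw new FileNotFoundException(name);
        input.Position += foundOffset; // the header ends at the current position
        byte[] blob = reader.ReadBytes(foundLength);
        using (var gzip = new GZipStream(new MemoryStream(blob), CompressionMode.Decompress))
        using (var result = new MemoryStream())
        {
            gzip.CopyTo(result);
            return result.ToArray();
        }
    }

    static byte[] Compress(byte[] data)
    {
        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            return buffer.ToArray(); // ToArray still works after the stream is closed
        }
    }
}
```

In a real application the header would be read once into a dictionary instead of being scanned on every lookup. If compression is unwanted, the same index works with the raw bytes stored as-is.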

How can I extract a multi-volume zip file using SharpZipLib in C#?

I have a single file, Setup1.cab, which is split up into Setup1.zip.001 and Setup1.zip.002 that I used 7zip to archive. Once those volumes reach their destination, I'd like to be able to use C# to extract that file from both archives into the same directory where they will reside. Is this something that SharpZipLib is capable of, or should I be using another tool?
Otherwise, is there a way to combine the two using C# (or another tool - I'm open!) into one zip file, THEN extract it using SharpZipLib?
Thanks!
EDIT: 7zip will not be installed on the destination machines. Also, I'm open to using a different method of archiving the original file; I just need it to be in chunks of under 500MB, and the original file is 570MB.

I would take a look at the SevenZipSharp library and actually use 7zip via C# to handle the decompression.
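As for combining the volumes first: if the .001 and .002 files are plain byte-wise splits (which is what 7-Zip's "split to volumes" option produces), concatenating them in order should restore the original archive, which can then be extracted normally. A sketch under that assumption, with SplitVolumes and Join as illustrative names:

```csharp
using System.IO;

static class SplitVolumes
{
    // Joins archivePath.001, archivePath.002, ... into archivePath.
    // Returns the number of volumes that were joined.
    public static int Join(string archivePath)
    {
        int joined = 0;
        using (var output = File.Create(archivePath))
        {
            for (int i = 1; ; i++)
            {
                string part = archivePath + "." + i.ToString("000");
                if (!File.Exists(part)) break; // stop at the first missing volume
                using (var input = File.OpenRead(part))
                {
                    input.CopyTo(output);
                }
                joined++;
            }
        }
        return joined;
    }
}
```

For the files in the question, SplitVolumes.Join("Setup1.zip") would read Setup1.zip.001 and Setup1.zip.002 and write Setup1.zip.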

Is it possible to decompress a zip file while maintaining hierarchy using just .NET or some other built-in Windows API?

I have a zip file that contains folder hierarchies and files.
\images\
\images\1.jpg
\images\2.jpg
\something\something\a.exe
\something\something\b.exe
1.txt
I need to decompress the contents of this zip file to a location. I also need to preserve the structure of the zip file.
I've read about .NET's GZipStream and DeflateStream but I am of the opinion that it is too "complicated" for my purpose.
I've also used DotNetZip and SharpZipLib in the past for personal projects but since this is work related and I'm working at a huge company, I would have a hard time convincing legal to use these libraries.
Question:
Is it possible to decompress a zip file while maintaining hierarchy using just .NET or some other built-in Windows API?
PS: I've also read this but I think it's hacky because you'll need to produce another executable just to hide the progress dialog.
Thanks!

Check out if Ionic Zip helps?

DotNetZip would do what you want, but I understand your concerns about legal approval.
On a side note, It might be good for you to navigate the legal jungle associated with getting an open-source library approved for use in the company, just to understand what's involved. But I'll leave that up to you.
Getting back to rolling your own...
DotNetZip is pretty full featured, and it handles a number of scenarios you probably don't care about. Like Unicode filenames and comments, setting windows timestamps and permissions of extracted files, getting timestamps of zip files created on old unix systems, split archives, Encrypted archives, files over 2gb, or self-extracting archives, etc etc etc. Many zip files use none of those things.
Also DotNetZip does eventing and zip updates and zip creation - all the code associated with these things is probably not of interest to you, if you confine yourself just to the requirements you described in your question.
You could, though, grab the DotNetZip code and use it to help you roll your own solution. If you constrain yourself to JUST reading zip files and not dealing with all the possible special cases, the zip format is not difficult to parse.
Here's how to do it:
1. Open the zip file using new FileStream() or File.Open. You want a FileStream object.
2. Read 4 bytes. Verify that it is the zip-entry-header signature (0x04034b50). In the file, the order you will find these bytes is 50 4b 03 04. If you find a match, you're in business.
3. Read the rest of the entry header. All offsets are counted from the start of the header:
   at offset 14 is a 4-byte CRC. Get it. (Same byte ordering as above.)
   at offset 18 is the 4-byte length of the compressed blob. Get it. (N)
   at offset 22 is the 4-byte length of the UNcompressed blob. Get it. (U)
   at offset 26 is the 2-byte length of the filename. Get it. (L)
   at offset 28 is the 2-byte length of the "extra field". Get it. (E)
   at offset 30, right after the extra-field length, is the actual filename. Read L bytes for the filename, and call System.Text.Encoding.ASCII.GetString(). The result will include a directory path, with the backslashes replaced with slashes (unix style). String.Replace() the slashes.
4. After the filename comes the extra field. Seek E bytes to get beyond it. You can mostly ignore it. This is where the compressed data starts.
5. Open a System.IO.Compression.DeflateStream on the zip FileStream, using CompressionMode.Decompress, and using the current offset of the FileStream as input. Open a new FileStream, for output, with the file path you read in step 3. In a loop, call inflater.Read() and output.Write(), to write the decompressed output of the DeflateStream to a filesystem file with the correct name. You will need to stop reading from the DeflateStream when you have read exactly U (uncompressed) bytes.
6. Check the uncompressed size (U) against the data you actually wrote out from the DeflateStream (after decompression). They should match.
7. If you are fancy, you can check the CRC of the output against what was in the header.
8. Go to step 2, to look for the next entry in the file.
The most complicated part is step 3. Working code for that is easily found in this source module, look for the ReadHeader method.
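The steps above can be sketched as follows. This is a simplified illustration, not the DotNetZip code: MiniUnzip and ExtractAll are made-up names, and it only handles unencrypted entries that are stored or deflated and that carry their sizes in the local header. One deliberate difference from step 5: the N compressed bytes are read into memory first and inflated from there, because DeflateStream reads ahead in blocks and would otherwise leave the zip stream positioned somewhere past the end of the entry.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class MiniUnzip
{
    const uint LocalHeaderSignature = 0x04034b50;

    public static void ExtractAll(Stream zip, string destination)
    {
        char sep = Path.DirectorySeparatorChar;
        string root = Path.GetFullPath(destination).TrimEnd(sep) + sep;
        var reader = new BinaryReader(zip); // reads little-endian, as zip requires

        // Step 2: stop at the first record that is not a local entry header.
        while (zip.Position + 30 <= zip.Length && reader.ReadUInt32() == LocalHeaderSignature)
        {
            // Step 3: fixed-size header fields, in file order.
            reader.ReadUInt16();                    // offset 4: version needed
            ushort flags = reader.ReadUInt16();     // offset 6: general-purpose flags
            ushort method = reader.ReadUInt16();    // offset 8: 0 = stored, 8 = deflate
            reader.ReadUInt32();                    // offset 10: modification time and date
            reader.ReadUInt32();                    // offset 14: CRC (not verified here)
            uint compressed = reader.ReadUInt32();  // offset 18: N
            reader.ReadUInt32();                    // offset 22: U (not verified here)
            ushort nameLength = reader.ReadUInt16();  // offset 26: L
            ushort extraLength = reader.ReadUInt16(); // offset 28: E
            string name = Encoding.ASCII.GetString(reader.ReadBytes(nameLength));
            zip.Seek(extraLength, SeekOrigin.Current); // step 4: skip the extra field

            if ((flags & 0x09) != 0)
                throw new NotSupportedException("Encrypted or data-descriptor entry: " + name);

            byte[] data = reader.ReadBytes((int)compressed);
            string target = Path.GetFullPath(Path.Combine(root, name.Replace('/', sep)));
            if (!target.StartsWith(root, StringComparison.Ordinal))
                throw new IOException("Entry would be written outside the destination: " + name);

            if (name.EndsWith("/")) { Directory.CreateDirectory(target); continue; }
            Directory.CreateDirectory(Path.GetDirectoryName(target));

            // Step 5: write the entry, inflating it if necessary.
            using (var output = File.Create(target))
            {
                if (method == 0)
                    output.Write(data, 0, data.Length);
                else if (method == 8)
                    using (var inflater = new DeflateStream(new MemoryStream(data), CompressionMode.Decompress))
                        inflater.CopyTo(output);
                else
                    throw new NotSupportedException("Compression method " + method);
            }
        }
    }
}
```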

Maybe the full feature set of GZipStream is a bit complicated, but note that the sample on the MSDN page is exactly what you need. I mean this MSDN page (the 4.0 version), not the one you supplied in the question.
http://msdn.microsoft.com/en-us/library/system.io.compression.gzipstream.aspx#Y2750
