How to unzip files present inside another zip file in vb.net

I have a folder containing zip files. I want to unzip them, then check whether the extracted contents include any further zip files and, if so, unzip those as well. The depth of nesting is not known in advance. How can I unzip all the zip files in the subdirectories?

It sounds like a fundamentally recursive operation. As Tim indicated above, we can't really give specifics without knowing the library you're using (personally, I'm a fan of Ionic's library), but it would go something like this:
' Pseudocode: ZipLibrary and File stand in for whichever zip library you use.
Sub Unzip(file As File)
    Dim zipfile = ZipLibrary.Load(file)
    For Each innerfile As File In zipfile.Files
        If innerfile.Name.EndsWith(".zip") Then
            Unzip(innerfile)
        End If
    Next
End Sub
Of course, as with any form of recursion like this, you can potentially save on stack space by building a list of files to be unpacked, appending to it as new archives turn up, and iterating through it rather than making the recursive call. You could also use the zip library itself to check whether a file is a valid zip file if you are uncertain whether it will have the correct extension.
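As a rough illustration of the list-based variant, here is a minimal C# sketch. It assumes the built-in System.IO.Compression classes (.NET 4.5 or later, with a reference to System.IO.Compression.FileSystem on .NET Framework) rather than any particular third-party library. The class name NestedZip, the method name ExtractAll, and the convention of unpacking "a.zip" into a sibling folder "a" are all made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class NestedZip
{
    // Extracts every .zip found under rootFolder, then every .zip that those
    // extractions produced, until no unprocessed archive is left.
    // Returns the number of archives extracted.
    public static int ExtractAll(string rootFolder)
    {
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var pending = new Queue<string>();

        foreach (string zip in Directory.EnumerateFiles(
                     rootFolder, "*.zip", SearchOption.AllDirectories))
        {
            if (seen.Add(zip)) pending.Enqueue(zip);
        }

        int extracted = 0;
        while (pending.Count > 0)
        {
            string zipPath = pending.Dequeue();

            // Assumed convention: unpack "a.zip" into a sibling folder "a".
            string target = Path.Combine(
                Path.GetDirectoryName(zipPath),
                Path.GetFileNameWithoutExtension(zipPath));

            ZipFile.ExtractToDirectory(zipPath, target);
            extracted++;

            // Queue any archives that were inside the one just unpacked.
            foreach (string inner in Directory.EnumerateFiles(
                         target, "*.zip", SearchOption.AllDirectories))
            {
                if (seen.Add(inner)) pending.Enqueue(inner);
            }
        }
        return extracted;
    }
}
```

Calling NestedZip.ExtractAll on the top-level folder would process the outer archives first and the inner ones as they appear. ZipFile.ExtractToDirectory throws if a file it wants to write already exists, so this sketch expects to run against folders that have not been unpacked before.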

Related

Add Files to Tar Archive Without Copying Parent File Structure

I'm creating a game in which a user can create custom content. There are two files associated with each custom creation: an .ogg file and a .xml file. Previously, I had a folder that contained all of the associated files, but I'd like to wrap all the associated files within a .tar file instead.
Using the following code, I can create a .tar archive (with the custom extension ".krs"):
FileInfo[] filesInDirectory = folder.GetFiles();
string tarArchiveName = @"C:\Users\me\Desktop\UserData\songs\songName.krs";
using (Stream targetStream = new GZipOutputStream(File.Create(tarArchiveName)))
using (TarArchive tarArchive = TarArchive.CreateOutputTarArchive(targetStream, TarBuffer.DefaultBlockFactor))
{
    foreach (FileInfo file in filesInDirectory)
    {
        TarEntry entry = TarEntry.CreateEntryFromFile(file.FullName);
        tarArchive.WriteEntry(entry, false);
    }
}
This doesn't give any errors, but when I open the .krs file as a .tar using 7zip, the files are buried underneath ALL the parent directories of the original files that were copied to the archive. For example, the path of the "data.xml" file within the .tar file is "C:\Users\me\Desktop\UserData\songs\songName\data.xml".
I want to open the .tar file and find no top-level directory, just the two files. For example, the data.xml file within the .tar archive should be simply "data.xml".
I know this is achievable because I can do it manually using 7zip. How can I do this using the SharpZipLib library in C#? I found this answer that seems to address my problem, but it's written in Python, a language I have no understanding of.
EDIT: I did some more searching and found this answer. I tried the solution, and it took away everything except the first parent directory of the files (".tar : parentFolder\data.xml"). Is it possible to remove that as well to avoid having to do any digging when I extract these files later?

When testing the answer I posted in my edit, I found that when extracting the files, they come out by themselves and do not come in a folder. This is the answer I was looking for.
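For readers who land here without the linked answer, one commonly used approach is to overwrite each entry's name with the bare file name before writing it. The sketch below reuses the calls from the question and adds only the entry.Name assignment. It is not necessarily what the linked answer did, the class and method names (FlatTar, Create) are invented for the example, and it assumes a SharpZipLib version where these overloads are still available.

```csharp
using System.IO;
using ICSharpCode.SharpZipLib.GZip;
using ICSharpCode.SharpZipLib.Tar;

public static class FlatTar
{
    // Writes every file in 'folder' to a gzip-compressed tar at 'archivePath',
    // storing each entry under its file name only (no parent directories).
    public static void Create(DirectoryInfo folder, string archivePath)
    {
        using (Stream targetStream = new GZipOutputStream(File.Create(archivePath)))
        using (TarArchive tarArchive = TarArchive.CreateOutputTarArchive(
                   targetStream, TarBuffer.DefaultBlockFactor))
        {
            foreach (FileInfo file in folder.GetFiles())
            {
                TarEntry entry = TarEntry.CreateEntryFromFile(file.FullName);

                // Replace the full path with just "data.xml", "song.ogg", etc.
                entry.Name = file.Name;

                tarArchive.WriteEntry(entry, false);
            }
        }
    }
}
```

The entry still reads its contents from the original file on disk; only the name recorded in the archive changes.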

Unzip internal ZIP file to path

I have an application that needs to copy directories from an internal ZIP to a path.
Some searching turned up this: Decompress byte array to string via BinaryReader yields empty string. However, the result is simply bytes, and I have no idea how to translate those back into folders that can then be moved to a path. (Working with raw bytes is confusing to me.)
Doing some more searching on here pointed me to the .NET 4.5 feature:
https://learn.microsoft.com/en-us/dotnet/standard/io/how-to-compress-and-extract-files
There's one complication: I don't have a path to a zip file, but rather an array of bytes from a zip kept inside my application. With that in mind, how would I go about using this ZipFile feature with an array of bytes as input?
Some other things I've looked at:
Compress a single file using C#
https://msdn.microsoft.com/en-us/library/system.io.compression.zipfile%28v=vs.110%29.aspx
How to extract zip file contents into a folder in .NET 4.5
Note: for this particular application, I'd like to refrain from using external DLLs. A portable CLI executable is what I'm aiming for.

To satisfy both constraints (I have only bytes, and I need to unzip them without using MemoryBuffer, which still makes no sense to me), I ended up creating a temporary folder, creating an empty file in that folder, filling it with the bytes of the zipped file, and then using ZipFile.ExtractToDirectory() to extract it to the final destination.
It may not be the most efficient, but it works quite well.
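For comparison, the temporary file can probably be avoided by wrapping the bytes in a MemoryStream and handing that to ZipArchive, both of which ship with .NET 4.5, so no external DLL is involved. This is only a sketch: EmbeddedZip and ExtractBytes are invented names, and on .NET Framework the ExtractToDirectory extension method needs a reference to System.IO.Compression.FileSystem.

```csharp
using System.IO;
using System.IO.Compression;

public static class EmbeddedZip
{
    // Extracts a zip archive held in memory to destinationFolder,
    // recreating the folder structure stored in the archive.
    public static void ExtractBytes(byte[] zipBytes, string destinationFolder)
    {
        // MemoryStream simply lets the byte array be read like a file.
        using (var stream = new MemoryStream(zipBytes))
        using (var archive = new ZipArchive(stream, ZipArchiveMode.Read))
        {
            archive.ExtractToDirectory(destinationFolder);
        }
    }
}
```

The byte array could come from an embedded resource or anywhere else; the method does not care how it was obtained.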

Move Files That Do Not Exist

I need to copy files from my local hard drive to an external hard drive, and I only want to copy the files that do not already exist there. I am sure there is a much easier way to do this, but this is where my mind went first.
My thoughts on how to accomplish this:
1) Get a list of all files on my C: drive and write to a text file
2) Get a list of all files on my L: drive (backup) and write to a text file
3) Compare C: drive text file to L: drive text file to find the files that do not exist
4) Write results of the files that do not exist to an array
5) Iterate through the newly created array and copy the files to the L: drive
Is there a more effective/time efficient way to accomplish this task?

For sure you don't want to create text files listing file names, and then compare them. That will be inefficient and clunky. The way to do this is to walk through the source directories looking for all the files. As you go, you'll be creating a matching destination path for each file. Just before you copy the file you need to decide whether or not to copy it. If a file exists at the destination path, skip copying.
Some enhancements on that might include skipping copying only if the file exists and the last modified date/time and file size match. And so on, I'm sure you can imagine variants on this.
One thing that you might not want to do is build a list of all the files first, and then start copying. It may very well be more efficient to copy files as you are iterating over the source directory. For example you could use Directory.EnumerateFiles to do this in an efficient way.
Of course, you don't need to write a program to do this. Thousands already exist, some of which are quite effective.
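If you do want to write it yourself, the walk-and-check approach described above might look roughly like this in C#. The names MissingFileCopier and CopyMissing are made up for the example, and the sketch only checks whether the destination file exists.

```csharp
using System;
using System.IO;

public static class MissingFileCopier
{
    // Walks sourceRoot and copies each file to the matching path under
    // destRoot, skipping files that already exist at the destination.
    // Returns the number of files copied.
    public static int CopyMissing(string sourceRoot, string destRoot)
    {
        string root = Path.GetFullPath(sourceRoot);
        int copied = 0;

        // EnumerateFiles yields paths lazily, so copying starts right away
        // instead of waiting for a complete file list.
        foreach (string sourcePath in Directory.EnumerateFiles(
                     root, "*", SearchOption.AllDirectories))
        {
            // Path of the file relative to the source root.
            string relative = sourcePath.Substring(root.Length)
                .TrimStart(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
            string destPath = Path.Combine(destRoot, relative);

            if (File.Exists(destPath))
                continue; // already backed up

            Directory.CreateDirectory(Path.GetDirectoryName(destPath));
            File.Copy(sourcePath, destPath);
            copied++;
        }
        return copied;
    }
}
```

To also recopy changed files, the File.Exists check could be extended to compare FileInfo.Length and FileInfo.LastWriteTimeUtc for the two paths. Note that enumerating an entire system drive this way is likely to hit folders the process cannot read, which would need extra handling.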

Can you pre-compress data files to be inserted into a zip file at a later time to improve performance?

As part of our installer build, we have to zip thousands of large data files into about ten or twenty 'packages' with a few hundred (or even thousands of) files in each which are all dependent on being kept with the other files in the package. (They are versioned together if you will.)
Then during the actual install, the user selects which packages they want included on their system. This also lets them download updates to the packages from our site as one large, versioned file rather than asking them to download thousands of individual ones which could also lead to them being out of sync with others in the same package.
Since these are data files, some of them change regularly during the design and coding stages, meaning we then have to re-compress all files in that particular zip package, even if only one file has changed. This makes the packaging step of our installer build take well over an hour each time, with most of that going to re-compressing things that we haven't touched.
We've looked into leaving the zip packages alone, then replacing specific files inside them, but inserting and removing large files from the middle of a zip doesn't give us that much of a performance boost. (A little, but not enough that it's worth it.)
I'm wondering if it's possible to pre-process files down into a cached raw 'compressed state' that matches how they would be written to the zip package, but only the data itself, not the zip header info, etc.
My thinking is that, if this is possible, during our build step we would first look for any data file that doesn't have a compressed cache associated with it. For each such file, we would compress it and write the result to the cache.
Next we would simply append all of the caches together in a file stream, adding any appropriate zip header needed for the files.
This would mean we are still recreating the entire zip during each build, but we are only recompressing data that has changed. The rest would just be written as-is which is very fast since it is a straight write-to-disk. And if a data file changes, its cache is destroyed, so next build-pass it would be recreated.
However, I'm not sure such a thing is possible. Is it, and if so, is there any documentation to show how one would go about attempting this?

Yes, that's possible. The most straightforward approach would be to zip each file individually into its own associated zip archive with one entry. When any file is modified, you replace its associated zip file to keep all of those up to date. Then you can write a simple program to take a set of those single entry zip files and merge them into a single zip file. You will need to refer to the documentation in the PKZip appnote. Take a look at that.
Now that you've read the appnote, what you need to do is use the local header, data, and central header from each individual zip file, write the local header and data as is sequentially to the new zip file, and save the central header and the offsets of the local headers in the new file. Then at the end of the new file save the current offset, write a new central directory using the central headers you saved, updating the offsets appropriately, and ending with a new end of central directory record with the offset of the start of the central directory.
Update:
I decided this was a useful enough thing to write. You can get it here.
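To make the merge step concrete, here is a simplified C# sketch of the procedure described above. It is not the tool linked in the update. It assumes each input is a single-entry zip with no Zip64 structures, that the merged archive stays under 4 GB and 65,535 entries, and that the code runs on a little-endian machine (BitConverter uses the machine's byte order). ZipMerger and Merge are invented names. The field offsets come from the appnote: in the end of central directory record, the entry count is at byte 10 and the central directory offset at byte 16; in a central header, the three variable lengths are at bytes 28, 30 and 32, and the local header offset at byte 42.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ZipMerger
{
    const uint EndOfCentralDirSig = 0x06054b50;
    const uint CentralHeaderSig = 0x02014b50;

    // Merges single-entry zip files into one archive, copying each entry's
    // local header and compressed data as-is (no recompression).
    public static void Merge(IEnumerable<string> singleEntryZips, string outputPath)
    {
        var centralHeaders = new List<byte[]>();
        using (var output = File.Create(outputPath))
        {
            foreach (string path in singleEntryZips)
            {
                byte[] zip = File.ReadAllBytes(path);

                // Find the end of central directory record, scanning backwards.
                int eocd = zip.Length - 22;
                while (eocd >= 0 && BitConverter.ToUInt32(zip, eocd) != EndOfCentralDirSig)
                    eocd--;
                if (eocd < 0 || BitConverter.ToUInt16(zip, eocd + 10) != 1)
                    throw new InvalidDataException(path + " is not a single-entry zip.");

                int cdOffset = (int)BitConverter.ToUInt32(zip, eocd + 16);
                if (BitConverter.ToUInt32(zip, cdOffset) != CentralHeaderSig)
                    throw new InvalidDataException(path + " has no central header.");

                // Fixed 46 bytes plus file name, extra field and comment.
                int cdLength = 46
                    + BitConverter.ToUInt16(zip, cdOffset + 28)
                    + BitConverter.ToUInt16(zip, cdOffset + 30)
                    + BitConverter.ToUInt16(zip, cdOffset + 32);
                int localOffset = (int)BitConverter.ToUInt32(zip, cdOffset + 42);

                // Save the central header, patched with the entry's new offset.
                byte[] central = new byte[cdLength];
                Array.Copy(zip, cdOffset, central, 0, cdLength);
                BitConverter.GetBytes((uint)output.Position).CopyTo(central, 42);
                centralHeaders.Add(central);

                // With one entry, everything between the local header and the
                // central directory is that entry's header and data.
                output.Write(zip, localOffset, cdOffset - localOffset);
            }

            // Write the new central directory after all entries.
            long cdStart = output.Position;
            foreach (byte[] central in centralHeaders)
                output.Write(central, 0, central.Length);
            long cdSize = output.Position - cdStart;

            // Write the new end of central directory record (22 bytes).
            using (var writer = new BinaryWriter(output))
            {
                writer.Write(EndOfCentralDirSig);
                writer.Write((ushort)0);                    // number of this disk
                writer.Write((ushort)0);                    // disk holding the directory
                writer.Write((ushort)centralHeaders.Count); // entries on this disk
                writer.Write((ushort)centralHeaders.Count); // entries in total
                writer.Write((uint)cdSize);
                writer.Write((uint)cdStart);
                writer.Write((ushort)0);                    // no archive comment
            }
        }
    }
}
```

Reading each input fully into memory keeps the sketch short; for very large data files a streamed copy of the same byte ranges would be the more realistic choice.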

You could zip each file beforehand, and then "zip" them together with no compression at the end to quickly aggregate them into a distributable package. It won't be as efficient as compressing all the data at once, but it should be faster to make modifications.
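A minimal sketch of that two-step idea, assuming the built-in System.IO.Compression classes (with a reference to System.IO.Compression.FileSystem on .NET Framework). PackageBuilder, CompressToCache and Bundle are invented names.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class PackageBuilder
{
    // Step 1: compress one data file into its own cached single-entry zip.
    // Only needs to be rerun when that data file changes.
    public static void CompressToCache(string dataFile, string cachedZip)
    {
        using (ZipArchive archive = ZipFile.Open(cachedZip, ZipArchiveMode.Create))
        {
            archive.CreateEntryFromFile(
                dataFile, Path.GetFileName(dataFile), CompressionLevel.Optimal);
        }
    }

    // Step 2: bundle the cached zips into one package without trying to
    // compress them again.
    public static void Bundle(IEnumerable<string> cachedZips, string packagePath)
    {
        using (ZipArchive package = ZipFile.Open(packagePath, ZipArchiveMode.Create))
        {
            foreach (string zip in cachedZips)
            {
                package.CreateEntryFromFile(
                    zip, Path.GetFileName(zip), CompressionLevel.NoCompression);
            }
        }
    }
}
```

The result is a zip of zips, so the installer has to unpack in two stages. ZipFile.Open in Create mode throws if the target file already exists, so a stale cache file or package would need to be deleted first.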

I cannot seem to locate an actual exe that implements this type of functionality. Most of the existing tools I've tried that can merge or update archives appear to reprocess (recompress) the data stream, as you have already seen.
However it seems what you describe can be done if you or someone wants to write it. If you take a look at this link for the ZIP file format specification, you can get an overview of the structure you would have to parse out and process. It looks like you can pretty quickly go from file to file gathering up and discarding the files of interest, then merging in your new/updated files. You would still need to rebuild a new central directory (refer to section 4.3.6 of the above linked document) within your new destination archive.
After a little more digging, the DotNetZip Library forum has a message asking about the same type of functionality which also gives a description just like I described above. It also links to this document which seems to indicate that support for that may be added to the DotNetZip library for you to further experiment with.

SharpZipLib is only compressing some of the directory's sub-directories

I'm using SharpZipLib to create a zip file of a directory in a .NET 3.5 project, and I'm creating the archive like this:
fastZip.CreateZip(Server.MapPath(zipToPath), Server.MapPath(zipFromPath), true, null);
That call sets neither a file filter nor a folder filter.
The problem is that the resulting zip file contains only some of the sub-directories in that directory. For example, if the directory I want to compress has 3 sub-directories, the zip file has only one of them.
Any ideas why this is happening?

A couple of possibilities:
Permissions - Since you're using Server.MapPath(), I'm assuming this is a website. In a partial-trust environment the website code has very few permissions, and the library may be swallowing any permissions errors that occur during the zip process.
Filenames - Could be a problem with filename length, spaces in the filenames, etc. Since you haven't provided any examples of the file/directory names, there's no way to narrow it down.

After some debugging I've found the problem.
The cause of the issue is that another process accesses the zip file while files are still being added to it. This makes SharpZipLib throw an exception and stop, leaving the zip file with only some of the files.
For more please read my How to know which processes is using a file under ASP.NET? question.
