I have a gzipped tar file called ZippedXmls.tar.gz which contains 2 XML files.
I need to extract this file programmatically so that the 2 XML files end up in a folder.
How do I achieve this using C#?
I've used .NET's built-in GZipStream for gzipping byte streams and it works just fine. I suspect that your files are tarred first, before being gzipped.
You've asked for code, so here's a sample, assuming you have a single gzipped file whose compressed bytes are already in a byte array named bytes:
// requires System.IO and System.IO.Compression
using (var input = new MemoryStream(bytes)) // the compressed bytes
using (var gzip = new GZipStream(input, CompressionMode.Decompress))
using (var output = new FileStream("output.xml", FileMode.Create)) // this is the output
{
gzip.CopyTo(output); // read the decompressed data and write it to the file
}
Edit:
You've changed your question so that the file is a tar.gz file - technically my answer is not applicable to your situation, but I'll leave it here for folks who want to handle .gz files.
SharpZipLib should be able to do this.
I know this question is ancient, but search engines redirect here for how to extract gzip in C#, so I thought I'd provide a slightly more recent example:
using (var inputFileStream = new FileStream("c:\\myfile.xml.gz", FileMode.Open))
using (var gzipStream = new GZipStream(inputFileStream, CompressionMode.Decompress))
using (var outputFileStream = new FileStream("c:\\myfile.xml", FileMode.Create))
{
await gzipStream.CopyToAsync(outputFileStream);
}
For what should be the simpler question of how to untar see: Decompress tar files using C#
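To tie the two steps together for the original .tar.gz case, here is a minimal sketch that feeds the GZipStream output straight into a tar reader. It assumes .NET 7 or later, where System.Formats.Tar is part of the framework; on older runtimes a library such as SharpZipLib would be needed instead. The class name TarGzExtractor and its parameters are made up for illustration.

```csharp
using System.Formats.Tar;    // assumption: .NET 7 or later
using System.IO;
using System.IO.Compression;

public static class TarGzExtractor
{
    // Extracts every entry of a .tar.gz archive into destinationFolder.
    public static void Extract(string tarGzPath, string destinationFolder)
    {
        // ExtractToDirectory expects the destination folder to exist.
        Directory.CreateDirectory(destinationFolder);

        using (var fileStream = File.OpenRead(tarGzPath))
        using (var gzipStream = new GZipStream(fileStream, CompressionMode.Decompress))
        {
            // The tar data is read sequentially from the decompressed stream,
            // so no intermediate .tar file is written to disk.
            TarFile.ExtractToDirectory(gzipStream, destinationFolder, overwriteFiles: true);
        }
    }
}
```

Calling TarGzExtractor.Extract("ZippedXmls.tar.gz", "output") should leave the XML files in the output folder.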
I'm trying to copy a file to a rar archive.
It works with this code:
using (FileStream fStream = File.Open(dest, FileMode.Create))
{
GZipStream obj = new GZipStream(fStream, CompressionMode.Compress);
byte[] bt = File.ReadAllBytes(src);
obj.Write(bt, 0, bt.Length);
obj.Close();
obj.Dispose();
}
But I need to choose the name/extension of the file inside the archive independently.
What do I need to do?
That's not possible, simply because you're not creating a rar archive at all. What you're creating is a gzip file, which is a very simple compressed stream with very little extra metadata, unlike more serious formats like rar, zip or 7z.
You'll most likely want to control the name of the compressed file by adding a .gz extension. For example, if the original is "SomeText.txt", output it as "SomeText.txt.gz".
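If choosing the stored file name independently is the real requirement, a zip archive (not rar) is one option, because each zip entry carries its own name. The following is only a sketch, assuming .NET 4.5 or later with System.IO.Compression and System.IO.Compression.FileSystem available; NamedZipWriter and its parameters are hypothetical names.

```csharp
using System.IO;
using System.IO.Compression;

public static class NamedZipWriter
{
    // Compresses src into a zip archive at dest, storing it under entryName.
    public static void Compress(string src, string dest, string entryName)
    {
        // ZipArchiveMode.Create refuses to overwrite an existing file,
        // so remove any previous archive first.
        if (File.Exists(dest))
        {
            File.Delete(dest);
        }

        using (ZipArchive archive = ZipFile.Open(dest, ZipArchiveMode.Create))
        {
            // The entry name is independent of the source file name.
            archive.CreateEntryFromFile(src, entryName);
        }
    }
}
```

For example, NamedZipWriter.Compress("SomeText.txt", "archive.zip", "renamed.dat") stores the content of SomeText.txt under the name renamed.dat.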
I have a large zip file (let's say 10 GB), to which I want to add a single small file (let's say 50 KB). I'm using the following code:
using System.IO.Compression;
using (var targetZip = ZipFile.Open(largeZipFilePath, ZipArchiveMode.Update))
{
targetZip.CreateEntryFromFile(smallFilePath, "foobar");
}
While this works (eventually), it takes a very long time and consumes a ludicrous amount of memory. It seems to extract and recompress the whole archive.
How can I improve this in .NET 4.7? A solution without external dependencies is preferred, but not required if that is impossible.
Use the Visual Studio NuGet Package Manager to install DotNetZip:
Install-Package DotNetZip -Version 1.11.0
using (ZipFile zip = new ZipFile())
{
zip.AddFile("ReadMe.txt"); // no password for this one
zip.Password= "123456!";
zip.AddFile("7440-N49th.png");
zip.Password= "!Secret1";
zip.AddFile("2005_Annual_Report.pdf");
zip.Save("Backup.zip");
}
https://www.nuget.org/packages/DotNetZip/
Since you are on .NET 4.5 or later, you can use the ZipArchive class (System.IO.Compression) to achieve this. See the MSDN documentation for details.
Here is their example. It just writes text, but you could read in a .csv file and write it out to your new entry. To just copy the file in, you would use CreateEntryFromFile, which is an extension method for ZipArchive.
using (FileStream zipToOpen = new FileStream(@"c:\users\exampleuser\release.zip", FileMode.Open))
{
using (ZipArchive archive = new ZipArchive(zipToOpen, ZipArchiveMode.Update))
{
ZipArchiveEntry readmeEntry = archive.CreateEntry("Readme.txt");
using (StreamWriter writer = new StreamWriter(readmeEntry.Open()))
{
writer.WriteLine("Information about this package.");
writer.WriteLine("========================");
}
}
}
Check this: https://stackoverflow.com/a/22339337/9912441
https://learn.microsoft.com/en-us/dotnet/standard/io/how-to-compress-and-extract-files
I found the reason for this behaviour in another Stack Overflow answer: Out of memory exception while updating zip in c#.net.
The gist of it is that this takes a long time because ZipArchiveMode.Update caches the zip file into memory. The suggestion for avoiding this caching behaviour is to create a new archive, and copy the old archive contents along with the new file to it.
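A rough sketch of that suggestion, using only System.IO.Compression, might look like the following. The names ZipAppender and CopyWithExtraFile are made up for illustration. Note that each entry is still decompressed and recompressed, so this takes CPU time on a 10 GB archive, but the data is streamed entry by entry instead of being held in memory.

```csharp
using System.IO;
using System.IO.Compression;

public static class ZipAppender
{
    // Writes a new archive at targetZipPath that contains every entry of
    // sourceZipPath plus one additional file stored as extraEntryName.
    public static void CopyWithExtraFile(
        string sourceZipPath, string targetZipPath,
        string extraFilePath, string extraEntryName)
    {
        using (ZipArchive source = ZipFile.OpenRead(sourceZipPath))
        using (var targetStream = new FileStream(targetZipPath, FileMode.Create))
        using (var target = new ZipArchive(targetStream, ZipArchiveMode.Create))
        {
            foreach (ZipArchiveEntry entry in source.Entries)
            {
                ZipArchiveEntry copy = target.CreateEntry(entry.FullName);
                // Set the timestamp before any data is written to the entry.
                copy.LastWriteTime = entry.LastWriteTime;

                using (Stream input = entry.Open())
                using (Stream output = copy.Open())
                {
                    input.CopyTo(output); // streamed, not buffered whole
                }
            }

            target.CreateEntryFromFile(extraFilePath, extraEntryName);
        }
    }
}
```

Once the new archive has been written successfully, it can replace the original file.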
See the MSDN documentation which explains how ZipArchiveMode.Update behaves:
This is my first question on here, so bear with me.
What I'm aiming to do is just create a basic .zip archive in C#. I have tried using the built-in GZipStream class of .NET and have managed to accomplish this, but then I have the problem that I cannot name the file "usercode.zip" without the archived file losing its extension. Due to constraints I cannot make my program create these files as "usercode.trf.zip", which is the only way I've found of leaving the filename's extension intact inside the archive.
I've tried using a number of other zipping libraries and I can't seem to manage getting them working properly or the way I want it to.
I came upon the SevenZipHelper library that provides some nifty functions to use the LZMA (or 7-zip) library to compress a file.
The code I'm using looks as follows:
//Take the BF file and zip it, using 7ZipHelper
BinaryReader bReader = new BinaryReader(File.Open(pFileName, FileMode.Open));
byte[] InBuf = new byte[Count];
bReader.Read(InBuf, 0, InBuf.Length);
Console.WriteLine("ZIP: read for buffer length:" + InBuf.Length.ToString());
byte[] OutBuf = SevenZip.Compression.LZMA.SevenZipHelper.Compress(InBuf);
FileStream BZipFile = new FileStream(pZipFileName, FileMode.OpenOrCreate, FileAccess.Write);
BZipFile.Seek(0, SeekOrigin.Begin);
BZipFile.Write(OutBuf, 0, OutBuf.Length);
BZipFile.Close();
This creates a compressed file neatly, using the 7-zip algorithm. Problem is I can't guarantee that the clients using this program will have access to 7-zip, so the file has to be in normal zip algorithm. I've gone through the helper- as well as the 7-zip libraries and it seems it is possible to use this library to compress a file with the normal "ZIP" algorithm. I just cannot seem to figure out how to do this. I've noticed properties settings in a few places, but I cannot find any documentation or googling to tell me where to set this.
I realize there's probably better ways to do this and that I'm just missing something, but I can't sit and struggle with such a supposedly easy task forever. Any help would be greatly appreciated.
If you want, you can take a look at this library. I've used it before and it's pretty simple to use: DotNetZip
EDIT(example):
using (ZipFile zip = new ZipFile())
{
foreach (String filename in FilesList)
{
Console.WriteLine("Adding {0}...", filename);
ZipEntry e = zip.AddFile(filename,"");
e.Comment = "file " +filename+" added "+DateTime.Now;
}
Console.WriteLine("Done adding files to zip:" + zipName);
zip.Comment = String.Format("This zip archive was created by '{0}' on '{1}'",
System.Net.Dns.GetHostName(), DateTime.Now);
zip.Save(zipName);
Console.WriteLine("Zip made:" + zipName);
}
I have a .tar file containing multiple compressed .gz files. I have no issue iterating through the .tar file and creating each .gz file in a destination directory. I'd like to skip writing the .gz files altogether and instead decompress each one straight from the TarEntry/TarArchive, writing its contents on the fly via the .NET native GZipStream. I'm not even sure this is possible.
Here is my current code that writes each gzipped file out. I'm not sure what to modify to get where I need to be.
using (FileStream _fsIn = new FileStream(@"F:\data\abc.tar", FileMode.Open, FileAccess.Read))
{
using (TarInputStream _tarIn = new TarInputStream(_fsIn))
{
TarEntry _tarEntry;
while ((_tarEntry = _tarIn.GetNextEntry()) != null)
{
string _archiveName = _tarEntry.Name;
using (FileStream _outStr = new FileStream(@"F:\data\" + _archiveName, FileMode.Create))
{
_tarIn.CopyEntryContents(_outStr);
}
}
}
}
I'm not sure what you want to do. Maybe you can clarify your aim. SharpZipLib is not as well documented as I expected it to be.
I've iterated through a tar archive and pushed the content of a file into a new Stream; maybe you can use this as a starting point. Have a look at this StackOverflow article
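As one possible starting point, the sketch below copies each tar entry into a MemoryStream instead of a .gz file and then decompresses it with GZipStream. It assumes the SharpZipLib package used in the question (newer releases may prefer a TarInputStream constructor that also takes a name encoding), and it assumes every entry is small enough to buffer in memory. The class name and the rule of dropping the trailing ".gz" from the entry name are hypothetical.

```csharp
using System.IO;
using System.IO.Compression;
using ICSharpCode.SharpZipLib.Tar; // assumption: SharpZipLib, as in the question

public static class TarOfGzipExtractor
{
    // Decompresses every gzipped entry of a .tar file into destinationFolder
    // without writing the intermediate .gz files to disk.
    public static void Extract(string tarPath, string destinationFolder)
    {
        Directory.CreateDirectory(destinationFolder);

        using (var fsIn = new FileStream(tarPath, FileMode.Open, FileAccess.Read))
        using (var tarIn = new TarInputStream(fsIn))
        {
            TarEntry tarEntry;
            while ((tarEntry = tarIn.GetNextEntry()) != null)
            {
                if (tarEntry.IsDirectory)
                {
                    continue;
                }

                // Buffer the compressed entry in memory rather than on disk.
                using (var compressed = new MemoryStream())
                {
                    tarIn.CopyEntryContents(compressed);
                    compressed.Position = 0;

                    // Hypothetical naming rule: "abc.xml.gz" becomes "abc.xml".
                    string outputName = Path.GetFileNameWithoutExtension(tarEntry.Name);
                    string outputPath = Path.Combine(destinationFolder, outputName);

                    using (var gzip = new GZipStream(compressed, CompressionMode.Decompress))
                    using (var outStr = new FileStream(outputPath, FileMode.Create))
                    {
                        gzip.CopyTo(outStr);
                    }
                }
            }
        }
    }
}
```

For very large entries, buffering in memory may not be acceptable, in which case writing the .gz to a temporary file remains the simpler option.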
How can I get the content names of a zipped folder in C#, i.e. the names of the files and folders inside the compressed folder?
I want to decompress the zip by using GZipStream only.
thanks,
kapil
You can't do this using GZipStream only. You will need an implementation of the ZIP standard such as #ziplib. Quote from MSDN:
Compressed GZipStream objects written to a file with an extension of .gz can be decompressed using many common compression tools; however, this class does not inherently provide functionality for adding files to or extracting files from .zip archives.
Example with #ziplib:
using (var stream = File.OpenRead("test.zip"))
using (var zipStream = new ZipInputStream(stream))
{
ZipEntry entry;
while ((entry = zipStream.GetNextEntry()) != null)
{
// entry.IsDirectory, entry.IsFile, ...
Console.WriteLine(entry.Name);
}
}
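On .NET 4.5 or later the same listing can also be done without an external library. This is a minimal sketch using ZipFile and ZipArchive from System.IO.Compression (on .NET Framework, reference System.IO.Compression.FileSystem as well); ZipLister and GetEntryNames are illustrative names.

```csharp
using System.Collections.Generic;
using System.IO.Compression;

public static class ZipLister
{
    // Returns the full path of every entry stored in the zip archive.
    public static List<string> GetEntryNames(string zipPath)
    {
        var names = new List<string>();
        using (ZipArchive archive = ZipFile.OpenRead(zipPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                // Explicit folder entries, when present, end with "/".
                names.Add(entry.FullName);
            }
        }
        return names;
    }
}
```

A call such as ZipLister.GetEntryNames("test.zip") returns the entry paths in the order they are stored in the archive.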