I have an LZH archive (.lzh or .lha extension) and need to extract a file from it in .NET Framework 4. Does .NET Framework 4 have a built-in toolset for this?
I ported an LHA Java decompression library to .NET called LHA Decompressor.
Thanks very much to the author above. I have posted a simple implementation of it for reference.
// Extracts all files in the .lzh archive.
List<string> extractedFileList = new List<string>();
LhaFile lhaFile = new LhaFile(filePath, Encoding.UTF7);
try
{
    IEnumerator<LhaEntry> enumerator = lhaFile.GetEnumerator();
    while (enumerator.MoveNext())
    {
        string fileName = enumerator.Current.GetPath();
        LhaEntry lhaEntry = lhaFile.GetEntry(fileName);
        byte[] dest = lhaFile.GetEntryBytes(lhaEntry);
        // Build the target path once and reuse it for writing and for the result list.
        string fullPath = Path.Combine(extractionPath, fileName);
        File.WriteAllBytes(fullPath, dest);
        extractedFileList.Add(fullPath);
    }
}
finally
{
    // Close the archive even if an entry fails to extract.
    lhaFile.Close();
}
I have millions of doc files which need to be converted to docx. I am currently using the below method to convert each file in the specified directory. How can I effectively multithread this process?
static void ConvertDocToDocx(string path)
{
Application word = new Application();
var sourceFile = new FileInfo(path);
var document = word.Documents.Open(sourceFile.FullName);
string newFileName = sourceFile.FullName.Replace(".doc", ".docx");
document.SaveAs2(newFileName, WdSaveFormat.wdFormatXMLDocument,
CompatibilityMode: WdCompatibilityMode.wdWord2010);
word.ActiveDocument.Close();
word.Quit();
//File.Delete(path);
}
My current approach is to use Directory.GetFiles to create a list of files which are in my path, then use Parallel.ForEach to convert the files. Here's my code:
string[] filesList = Directory.GetFiles(path);
Parallel.ForEach(filesList, new ParallelOptions { MaxDegreeOfParallelism = 20 }, file =>
{
if (file.Contains(".doc"))
{
ConvertDocToDocx(file);
}
});
However, this doesn't seem to increase performance. Am I misunderstanding the use of Parallel.ForEach?
You are using Word via automation, which is equivalent to opening the files manually one by one and saving them. There is one possible performance gain with this method: there is no need to create a new Word instance for each file; just reuse the first instance.
...
var wordInstance = new Application();
try
{
var fileNameList = Directory.GetFiles(path);
foreach(var fileName in fileNameList)
{
if (fileName.Contains(".doc"))
{
ConvertDocToDocx(wordInstance, fileName);
}
}
}
finally
{
wordInstance.Quit();
}
...
static void ConvertDocToDocx(Application wordInstance, string path)
{
var sourceFile = new FileInfo(path);
var newFileName = sourceFile.FullName.Replace(".doc", ".docx");
var document = wordInstance.Documents.Open(sourceFile.FullName);
document.SaveAs2(
newFileName,
WdSaveFormat.wdFormatXMLDocument,
CompatibilityMode: WdCompatibilityMode.wdWord2010);
// Close the document that was just opened rather than whichever one is active.
document.Close();
//File.Delete(path);
}
But, as others have already mentioned, that is the limit of this approach.
You should look at solutions that work directly with the file format, such as NPOI. It is a C# port of the popular Apache POI package, so if you search for "POI convert doc to docx" and find Java code, do not be put off: almost the same code will compile in C# with the NPOI package, and in most cases only minor syntax changes are required.
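One detail in the code above is worth tightening whichever conversion route is used: a check like Contains(".doc") also matches .docx files, and Replace(".doc", ".docx") rewrites every occurrence in the path, including directory names. Below is a minimal sketch of a stricter filter. The class name DocPaths and the helpers IsLegacyDoc and TargetPath are hypothetical names chosen for illustration, not part of the original answer.

```csharp
using System;
using System.IO;

static class DocPaths
{
    // True only when the extension is exactly ".doc" (case-insensitive),
    // so "report.docx" is skipped.
    public static bool IsLegacyDoc(string path)
    {
        return string.Equals(Path.GetExtension(path), ".doc", StringComparison.OrdinalIgnoreCase);
    }

    // Swaps only the extension and leaves the rest of the path untouched.
    public static string TargetPath(string path)
    {
        return Path.ChangeExtension(path, ".docx");
    }
}
```

In the loop, the Contains check would become DocPaths.IsLegacyDoc(fileName), and the new file name would come from DocPaths.TargetPath(sourceFile.FullName).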
I am developing UWP and Windows phone 8.1 in the same solution.
On both projects I need a functionality of compressing a whole folder to one gzip file (in order to send it to server).
Libraries I've tried and encountered issues with:
SharpZipLib - uses System.ICloneable, which I cannot reference in my PCL project
DotNetZip - does not support PCL/UWP
System.IO.Compression - works only with Stream, cannot compress a whole folder
I can split the implementation for each platform (although that is not ideal), but I still haven't found anything that can be used in UWP.
Any help will be appreciated.
OK, so I found this project called SharpZipLib.Portable, which is also open source.
Github : https://github.com/ygrenier/SharpZipLib.Portable
Really nice :)
Working on a UWP library you will have to use the Stream subsystem of the System.IO.Compression. There are many such limitations when you need a PCL version of .NET Framework. Live with that.
In your context that is not much of a trouble.
The required usings are:
using System;
using System.IO;
using System.IO.Compression;
Then the methods...
private void CreateArchive(string iArchiveRoot)
{
using (MemoryStream outputStream = new MemoryStream())
{
using (ZipArchive archive = new ZipArchive(outputStream, ZipArchiveMode.Create, true))
{
//Pick all the files you need in the archive.
string[] files = Directory.GetFiles(iArchiveRoot, "*", SearchOption.AllDirectories);
foreach (string filePath in files)
{
FileAppend(iArchiveRoot, filePath, archive);
}
}
}
}
private void FileAppend(
string iArchiveRootPath,
string iFileAbsolutePath,
ZipArchive iArchive)
{
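//Has to return something like "dir1/dir2/part1.txt".
//The path is taken from the archive root to the file; the root should end with a directory separator.
string fileRelativePath = MakeRelativePath(iArchiveRootPath, iFileAbsolutePath);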
ZipArchiveEntry clsEntry = iArchive.CreateEntry(fileRelativePath, CompressionLevel.Optimal);
Stream entryData = clsEntry.Open();
//Write the file data to the ZipArchiveEntry.
entryData.Write(...);
}
//http://stackoverflow.com/questions/275689/how-to-get-relative-path-from-absolute-path
private string MakeRelativePath(
string fromPath,
string toPath)
{
if (String.IsNullOrEmpty(fromPath)) throw new ArgumentNullException("fromPath");
if (String.IsNullOrEmpty(toPath)) throw new ArgumentNullException("toPath");
Uri fromUri = new Uri(fromPath);
Uri toUri = new Uri(toPath);
if (fromUri.Scheme != toUri.Scheme) { return toPath; } // path can't be made relative.
Uri relativeUri = fromUri.MakeRelativeUri(toUri);
String relativePath = Uri.UnescapeDataString(relativeUri.ToString());
if (toUri.Scheme.Equals("file", StringComparison.OrdinalIgnoreCase))
{
relativePath = relativePath.Replace(Path.AltDirectorySeparatorChar, Path.DirectorySeparatorChar);
}
return relativePath;
}
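Zip entry names conventionally use forward slashes, so it can be simpler to build the entry name with plain string handling instead of Uri. Below is a minimal sketch under these assumptions: the file really lies under the root, and path comparison is case-insensitive as on Windows. The class ZipEntryNames and the method ToEntryName are hypothetical names, not part of the original answer.

```csharp
using System;

static class ZipEntryNames
{
    // Turns ("C:\data", "C:\data\dir1\part1.txt") into "dir1/part1.txt".
    public static string ToEntryName(string rootPath, string filePath)
    {
        // Normalize the root so it never ends with a separator.
        string root = rootPath.TrimEnd('\\', '/');

        // The file must start with the root AND be followed by a separator,
        // otherwise "C:\database\x.txt" would wrongly match root "C:\data".
        bool underRoot = filePath.Length > root.Length + 1
            && filePath.StartsWith(root, StringComparison.OrdinalIgnoreCase)
            && (filePath[root.Length] == '\\' || filePath[root.Length] == '/');
        if (!underRoot)
            throw new ArgumentException("filePath is not located under rootPath.", "filePath");

        // Drop the root plus one separator, then switch to forward slashes.
        return filePath.Substring(root.Length + 1).Replace('\\', '/');
    }
}
```

The result can be passed directly to iArchive.CreateEntry in place of the MakeRelativePath result.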
I want to zip one CSV file into a zip file using C#. I have written the code below, and it does create "Data1.zip", but when I extract the archive manually the extracted file should have the ".csv" extension and it does not.
FileStream sourceFile = File.OpenRead(@"C:\Users\Rav\Desktop\rData1.csv");
FileStream destFile = File.Create(@"C:\Users\Rav\Desktop\Data1.zip");
GZipStream compStream = new GZipStream(destFile, CompressionMode.Compress,false);
try
{
int theByte = sourceFile.ReadByte();
while (theByte != -1)
{
compStream.WriteByte((byte)theByte);
theByte = sourceFile.ReadByte();
}
}
finally
{
compStream.Dispose();
}
http://msdn.microsoft.com/en-us/library/system.io.compression.gzipstream.aspx
This is gzip compression, and apparently it only compresses a stream, which when decompressed takes the name of the archive without the .gz extension. I don't know if I'm right here though. You might as well experiment with the code from MSDN, see if it works.
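To get a real .zip container whose entry keeps its ".csv" name, the archive has to record an entry name, which a bare GZipStream does not do. Below is a minimal sketch, assuming .NET Framework 4.5 or later with a reference to the System.IO.Compression assembly. The class CsvZipper and the method ZipSingleFile are hypothetical names used only for illustration.

```csharp
using System.IO;
using System.IO.Compression;

static class CsvZipper
{
    // Writes sourcePath into a new zip at zipPath as a single entry
    // named after the source file (for example "rData1.csv").
    public static void ZipSingleFile(string sourcePath, string zipPath)
    {
        using (FileStream zipStream = File.Create(zipPath))
        using (ZipArchive archive = new ZipArchive(zipStream, ZipArchiveMode.Create))
        {
            // The entry name is what the extracted file will be called.
            ZipArchiveEntry entry = archive.CreateEntry(Path.GetFileName(sourcePath));
            using (Stream entryStream = entry.Open())
            using (FileStream source = File.OpenRead(sourcePath))
            {
                source.CopyTo(entryStream);
            }
        }
    }
}
```

Alternatively, if gzip is really what is wanted, naming the output "rData1.csv.gz" lets most tools restore the ".csv" name on extraction.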
I used ZipLib for zip compression. It also supports Bz2, which is a good compression algorithm.
Use ICSharpCode.SharpZipLib (you can download it) and do the following:
private void CreateZipFile(string l_sFolderToZip)
{
FastZip z = new FastZip();
z.CreateEmptyDirectories = true;
z.CreateZip(l_sFolderToZip + ".zip", l_sFolderToZip, true, "");
if (Directory.Exists(l_sFolderToZip))
Directory.Delete(l_sFolderToZip, true);
}
private void ExtractFromZip(string l_sFolderToExtract)
{
string l_sZipPath = "your folder path" + ".zip";
string l_sDestPath = "your location" + l_sFolderToExtract;
FastZip z = new FastZip();
z.CreateEmptyDirectories = true;
z.ExtractZip(l_sZipPath, l_sDestPath, "");
if (File.Exists(l_sZipPath))
File.Delete(l_sZipPath);
}
Hope it helps...
Use one of these libraries:
http://www.icsharpcode.net/opensource/sharpziplib/
http://dotnetzip.codeplex.com/
I prefer #ziplib, but both are well documented and widely spread.
Since .NET Framework 4.5, you can use the built-in ZipFile class (In the System.IO.Compression namespace).
public void ZipFiles(string[] filePaths, string zipFilePath)
{
// 'using' disposes the archive even if adding an entry throws.
using (ZipArchive zipArchive = ZipFile.Open(zipFilePath, ZipArchiveMode.Create))
{
foreach (string file in filePaths)
{
zipArchive.CreateEntryFromFile(file, Path.GetFileName(file), CompressionLevel.Optimal);
}
}
}
Take a look at the FileSelectionManager library here: www.fileselectionmanager.com
First you have to add File Selection Manager DLL to your project
Here is an example for zipping:
class Program
{
static void Main(string[] args)
{
String directory = @"C:\images";
String destinationDirectory = @"c:\zip_files";
String zipFileName = "container.zip";
Boolean recursive = true;
Boolean overWrite = true;
String condition = "Name Contains \"uni\"";
FSM FSManager = new FSM();
/* creates zipped file containing selected files */
FSManager.Zip(directory,recursive,condition,destinationDirectory,zipFileName,overWrite);
Console.WriteLine("Involved Files: {0} - Affected Files: {1} ",
FSManager.InvolvedFiles,
FSManager.AffectedFiles);
foreach(FileInfo file in FSManager.SelectedFiles)
{
Console.WriteLine("{0} - {1} - {2} - {3} - {4} Bytes",
file.DirectoryName,
file.Name,
file.Extension,
file.CreationTime,
file.Length);
}
}
}
Here is an example for unzipping:
class Program
{
static void Main(string[] args)
{
String destinationDirectory = @"c:\zip_files";
String zipFileName = "container.zip";
Boolean unZipWithDirectoryStructure = true;
FSM FSManager = new FSM();
/* Unzips files with or without their directory structure */
FSManager.Unzip(zipFileName,
destinationDirectory,
unZipWithDirectoryStructure);
}
}
Hope it helps.
I use the FileSelectionManager DLL to compress and decompress files and folders, and it has worked properly in my project. You can see examples on their website: http://www.fileselectionmanager.com/#Zipping and Unzipping files
and the documentation: http://www.fileselectionmanager.com/file_selection_manager_documentation
I use DotNetZip in my project.
using (var zip = new ZipFile())
{
zip.ProvisionalAlternateEncoding = System.Text.Encoding.GetEncoding(866);
zip.AddFile(filename, "directory\\in\\archive");
zip.Save("archive.zip");
}
All is OK, but when I use the AddDirectoryByName method I get bad directory names.
A universal way that works for all cases is:
zip.AlternateEncoding = Encoding.UTF8;
zip.ProvisionalAlternateEncoding = Encoding.GetEncoding(Console.OutputEncoding.CodePage);
zip.AlternateEncodingUsage = ZipOption.AsNecessary;
In the new version, this works for me:
zip.AlternateEncodingUsage = ZipOption.Always;
zip.AlternateEncoding = Encoding.GetEncoding(866);
You may Peek Definition first.
Then you will find this:
public ZipFile(Encoding encoding);
So you can use this:
using (ZipFile zip = new ZipFile(Encoding.UTF8))
How can I list the contents of a zipped folder in C#? For example how to know how many items are contained within a zipped folder, and what is their name?
.NET 4.5 or newer finally has built-in capability to handle generic zip files with the System.IO.Compression.ZipArchive class (http://msdn.microsoft.com/en-us/library/system.io.compression.ziparchive%28v=vs.110%29.aspx) in assembly System.IO.Compression. No need for any 3rd party library.
string zipPath = @"c:\example\start.zip";
using (ZipArchive archive = ZipFile.OpenRead(zipPath))
{
foreach (ZipArchiveEntry entry in archive.Entries)
{
Console.WriteLine(entry.FullName);
}
}
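The question also asks how many items the archive contains. Below is a minimal sketch that returns the entry names, so the count is just the size of the list. It assumes .NET Framework 4.5 or later, and the class ZipListing and the method ListEntryNames are hypothetical names. Note that an archive may store directories as their own entries (names ending in "/"), which are included here.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

static class ZipListing
{
    // Returns the full name of every entry stored in the zip stream.
    public static List<string> ListEntryNames(Stream zipStream)
    {
        var names = new List<string>();
        // leaveOpen: true, so the caller stays responsible for the stream.
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, true))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                names.Add(entry.FullName);
            }
        }
        return names;
    }
}
```

For a file on disk, open it with File.OpenRead(zipPath) inside a using block and pass the stream in; the Count property of the returned list gives the number of items.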
DotNetZip - Zip file manipulation in .NET languages
DotNetZip is a small, easy-to-use class library for manipulating .zip files. It can enable .NET applications written in VB.NET, C#, any .NET language, to easily create, read, and update zip files.
sample code to read a zip:
using (var zip = ZipFile.Read(PathToZipFolder))
{
int totalEntries = zip.Entries.Count;
foreach (ZipEntry e in zip.Entries)
{
e.FileName ...
e.CompressedSize ...
e.LastModified...
}
}
If you are using .NET Framework 3.0 or later, check out the System.IO.Packaging namespace. This will remove your dependency on an external library.
Specifically check out the ZipPackage Class.
Check into SharpZipLib
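using (ZipInputStream inStream = new ZipInputStream(File.OpenRead(fileName)))
{
    // GetNextEntry returns null after the last entry; call it once per iteration
    // so that no entry is skipped.
    ZipEntry entry;
    while ((entry = inStream.GetNextEntry()) != null)
    {
        //write out your entry's filename
    }
}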
Ick - that code using the J# runtime is hideous! And I don't agree that it is the best way - J# is out of support now. And it is a HUGE runtime, if all you want is ZIP support.
How about this - it uses DotNetZip (Free, MS-Public license)
using (ZipFile zip = ZipFile.Read(zipfile) )
{
bool header = true;
foreach (ZipEntry e in zip)
{
if (header)
{
System.Console.WriteLine("Zipfile: {0}", zip.Name);
if ((zip.Comment != null) && (zip.Comment != ""))
System.Console.WriteLine("Comment: {0}", zip.Comment);
System.Console.WriteLine("\n{1,-22} {2,9} {3,5} {4,9} {5,3} {6,8} {0}",
"Filename", "Modified", "Size", "Ratio", "Packed", "pw?", "CRC");
System.Console.WriteLine(new System.String('-', 80));
header = false;
}
System.Console.WriteLine("{1,-22} {2,9} {3,5:F0}% {4,9} {5,3} {6:X8} {0}",
e.FileName,
e.LastModified.ToString("yyyy-MM-dd HH:mm:ss"),
e.UncompressedSize,
e.CompressionRatio,
e.CompressedSize,
(e.UsesEncryption) ? "Y" : "N",
e.Crc32);
if ((e.Comment != null) && (e.Comment != ""))
System.Console.WriteLine(" Comment: {0}", e.Comment);
}
}
I'm relatively new here so maybe I'm not understanding what's going on. :-)
There are currently 4 answers on this thread where the two best answers have been voted down. (Pearcewg's and cxfx's) The article pointed to by pearcewg is important because it clarifies some licensing issues with SharpZipLib.
We recently evaluated several .NET compression libraries, and found that DotNetZip is currently the best alternative.
Very short summary:
System.IO.Packaging is significantly slower than DotNetZip.
SharpZipLib is GPL - see article.
So for starters, I voted those two answers up.
Kim.
If you are like me and do not want to use an external component, here is some code I developed last night using .NET's ZipPackage class.
var zipFilePath = "c:\\myfile.zip";
var tempFolderPath = "c:\\unzipped";
using (Package package = ZipPackage.Open(zipFilePath, FileMode.Open, FileAccess.Read))
{
foreach (PackagePart part in package.GetParts())
{
var target = Path.GetFullPath(Path.Combine(tempFolderPath, part.Uri.OriginalString.TrimStart('/')));
var targetDir = target.Remove(target.LastIndexOf('\\'));
if (!Directory.Exists(targetDir))
Directory.CreateDirectory(targetDir);
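using (Stream source = part.GetStream(FileMode.Open, FileAccess.Read))
using (Stream destination = File.OpenWrite(target))
{
// Dispose the output stream too, so the file is flushed and released.
source.CopyTo(destination);
}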
}
}
Things to note:
The ZIP archive MUST have a [Content_Types].xml file in its root. This was a non-issue for my requirements, as I control the zipping of any ZIP files that get extracted through this code. For more information on the [Content_Types].xml file, please refer to "A New Standard For Packaging Your Data". There is an example file below Figure 13 of the article.
This code uses the Stream.CopyTo method in .NET 4.0
The best way is to use the .NET built in J# zip functionality, as shown in MSDN: http://msdn.microsoft.com/en-us/magazine/cc164129.aspx. In this link there is a complete working example of an application reading and writing to zip files. For the concrete example of listing the contents of a zip file (in this case a Silverlight .xap application package), the code could look like this:
ZipFile package = new ZipFile(packagePath);
java.util.Enumeration entries = package.entries();
//We have to use Java enumerators because we
//use java.util.zip for reading the .zip files
while ( entries.hasMoreElements() )
{
ZipEntry entry = (ZipEntry) entries.nextElement();
if (!entry.isDirectory())
{
string name = entry.getName();
Console.WriteLine("File: " + name + ", size: " + entry.getSize() + ", compressed size: " + entry.getCompressedSize());
}
else
{
// Handle directories...
}
}
Aydsman had the right pointer, but there are problems. Specifically, you might have issues opening existing zip files, but it is a valid solution if you only intend to create packages. ZipPackage implements the abstract Package class and allows manipulation of zip files. There is a sample of how to do it in MSDN: http://msdn.microsoft.com/en-us/library/ms771414.aspx. Roughly, the code would look like this:
string packageRelationshipType = @"http://schemas.microsoft.com/opc/2006/sample/document";
string resourceRelationshipType = @"http://schemas.microsoft.com/opc/2006/sample/required-resource";
// Open the Package.
// ('using' statement ensures that 'package' is
// closed and disposed when it goes out of scope.)
foreach (string packagePath in downloadedFiles)
{
Logger.Warning("Analyzing " + packagePath);
using (Package package = Package.Open(packagePath, FileMode.Open, FileAccess.Read))
{
Logger.OutPut("package opened");
PackagePart documentPart = null;
PackagePart resourcePart = null;
// Get the Package Relationships and look for
// the Document part based on the RelationshipType
Uri uriDocumentTarget = null;
foreach (PackageRelationship relationship in
package.GetRelationshipsByType(packageRelationshipType))
{
// Resolve the Relationship Target Uri
// so the Document Part can be retrieved.
uriDocumentTarget = PackUriHelper.ResolvePartUri(
new Uri("/", UriKind.Relative), relationship.TargetUri);
// Open the Document Part, write the contents to a file.
documentPart = package.GetPart(uriDocumentTarget);
//ExtractPart(documentPart, targetDirectory);
string stringPart = documentPart.Uri.ToString().TrimStart('/');
Logger.OutPut(" Got: " + stringPart);
}
// Get the Document part's Relationships,
// and look for required resources.
Uri uriResourceTarget = null;
foreach (PackageRelationship relationship in
documentPart.GetRelationshipsByType(
resourceRelationshipType))
{
// Resolve the Relationship Target Uri
// so the Resource Part can be retrieved.
uriResourceTarget = PackUriHelper.ResolvePartUri(
documentPart.Uri, relationship.TargetUri);
// Open the Resource Part and write the contents to a file.
resourcePart = package.GetPart(uriResourceTarget);
//ExtractPart(resourcePart, targetDirectory);
string stringPart = resourcePart.Uri.ToString().TrimStart('/');
Logger.OutPut(" Got: " + stringPart);
}
}
}
The best way seems to use J#, as shown in MSDN: http://msdn.microsoft.com/en-us/magazine/cc164129.aspx
There are pointers to more c# .zip libraries with different licenses, like SharpNetZip and DotNetZip in this article: how to read files from uncompressed zip in c#?. They might be unsuitable because of the license requirements.