How do I extract a SubDirectory using C# DotNetZip? - c#

I have MyFile.zip that has a main directory "MyMainFolder", and several SubDirectories inside of that, one of which I want to extract (MySubFolder)...with all of its subdirs and contents.
I am trying to figure out how to 'step into' MyMainFolder so that I can extract 'MySubFolder'.
I have some code that will extract a folder as long as the folder I am looking for exists as the main folder in the zip, and I can detect whether the main folder is called "MyMainFolder", so that the code knows to look inside it and extract from there rather than looking in the zip root for MySubFolder.
using (ZipFile zip1 = ZipFile.Read(fileName))
{
    var result = zip1.Any(entry => entry.FileName.Contains("MySubFolder"));
    if (result == false)
    {
        MessageBox.Show("MyMainFolder detected....Extracting from MyMainFolder...");
        // something here that will extract JUST MySubFolder and contents
    }
    else
    {
        // build the selection before looping over it
        var selection = (from entry in zip1.Entries where entry.FileName.Contains("MySubFolder") select entry);
        foreach (var e in selection)
        {
            e.Extract(outputDirectory);
        }
    }
}
So far, I have tried putting a separate using inside each part of the if-else, and I tried creating a separate selectionX in which I tried to force the root-folder name (which will always be 'MyMainFolder' for this experiment) to be part of what it looked through, thinking I could then extract MySubFolder, but I couldn't get that to work either. I also tried to incorporate several other methods I found on Stack Overflow and elsewhere, such as parts of 'how to extract files, but ignoring the path in the zipfile' and similar posts, to find a way to 'skip' over that main root folder when extracting, so that it gets ONLY 'MySubFolder' (and contents) and extracts to outputDirectory (not MyMainFolder\MySubFolder...).
Any help is appreciated.
Thanks!!

Enumerating through the entire contents until I came across what I was looking for worked, but just as an experiment, I wanted to see if it could be done another way.
Since I was unable to check the names of the subfolders inside a root folder, I figured I could just match what I was looking for as I was parsing through it, extracting only what I wanted to, and then just change the output path.
using (ZipFile zip1 = ZipFile.Read(fileName))
{
    var result = zip1.Any(entry => entry.FileName.Contains("MySubFolder"));
    if (result == false)
    {
        // something here that will extract JUST MySubFolder and content
        foreach (var e in zip1.Entries)
        {
            string TestX = Path.GetDirectoryName(e.FileName);
            string MyNewPath = outputDirectory + @"\" + TestX;
            e.Extract(MyNewPath);
        }
    }
    else
    {
        var selection = (from entry in zip1.Entries where entry.FileName.Contains("MySubFolder") select entry);
        foreach (var e in selection)
        {
            e.Extract(outputDirectory);
        }
    }
}
Something like that..
Not very useful, but interesting and helped me learn a little.
(if nothing else, an example of how NOT to do things..hehe)
Thanks
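For comparison, here is a minimal sketch of one way to drop the root folder while extracting, under a few assumptions: entry names in the archive use forward slashes (for example "MyMainFolder/MySubFolder/file.txt"), and the class name SubFolderExtractor and its parameters are made up for illustration. Instead of letting Extract(directory) rebuild the full stored path, it computes the target path itself and writes each entry to a stream.

```csharp
using System;
using System.IO;
using Ionic.Zip; // DotNetZip

static class SubFolderExtractor
{
    // Hypothetical helper: extracts rootFolder/subFolder/** from the zip
    // into outputDirectory/subFolder/**, leaving rootFolder out of the path.
    public static void ExtractSubFolder(string zipPath, string rootFolder,
                                        string subFolder, string outputDirectory)
    {
        // Assumption: entry names use '/' as the separator.
        string prefix = rootFolder + "/" + subFolder + "/";

        using (ZipFile zip = ZipFile.Read(zipPath))
        {
            foreach (ZipEntry entry in zip.Entries)
            {
                if (!entry.FileName.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
                    continue;

                // Strip "rootFolder/" so the path starts at subFolder.
                string relative = entry.FileName.Substring(rootFolder.Length + 1);

                // Skip suspicious names that try to climb out of the output folder.
                if (relative.Contains("..")) continue;

                string target = Path.Combine(
                    outputDirectory, relative.Replace('/', Path.DirectorySeparatorChar));

                if (entry.IsDirectory)
                {
                    Directory.CreateDirectory(target);
                    continue;
                }

                Directory.CreateDirectory(Path.GetDirectoryName(target));
                using (FileStream output = File.Create(target))
                {
                    entry.Extract(output); // writes the entry's content to the stream
                }
            }
        }
    }
}
```

A call such as SubFolderExtractor.ExtractSubFolder(fileName, "MyMainFolder", "MySubFolder", outputDirectory) would then be expected to produce outputDirectory\MySubFolder\... rather than outputDirectory\MyMainFolder\MySubFolder\....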

Related

VisualStudio Express 2012: StreamReader gives [System.UnauthorizedAccessException] error

I have read a lot of answers on this issue, but none of them helps for me.
Now, it's been 5 years since I last used C# and apparently I've forgotten it all. But I'd like to get into the language again to use it for automation. So, here is the bit of code I already have:
{
string path = @"C:\Users\decraiec\Documents\Client Automated";
//In this folder I will find all my XML files that I just want to load in a textbox
public Form1()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
//create a way to read and write the files
//go get the files from my harddrive
StreamReader FileReader = new StreamReader(path);
//make something readable for what you have fetched
StreamWriter FileWriter = new StreamWriter(textBox1.ToString());
int c = 0;
while (c == FileReader.Read())
{
string load = FileReader.ReadToEnd();//read every xmlfile up to the end
string stream = FileWriter.ToString();//make something readable
}
try
{
textBox1.Text = FileWriter.ToString();//what you have made readable, show it in the textbox
FileWriter.Close();
}
finally
{
if (FileReader != null)
{ FileReader.Close(); }
}
if (FileWriter != null)
{ FileWriter.Close(); }
}
}
If I run this code like this I'll get:
An unhandled exception of type 'System.UnauthorizedAccessException' occurred in mscorlib.dll
Additional information: Access to the path 'C:\Users\decraiec\Documents\Atrias Automated' is denied.
While I was hoping to see all the XML files in the textbox listed and clickable ( - although I need to insert the clickable code yet )
I've been looking in my folder and subfolder and files and I do have admin rights on everything. About the [ mscorlib.dll ] I have no clue where to find this.
Now if I wrap the StreamReader in a using (var ...) statement, VS doesn't recognize it (red lines under the words), saying that I'm missing an instance of an object or something else (I'm just trying to glue the things together).
Could someone try to get me in the right direction please?
I think your path is a directory, not a file. Almost the exact same issue was addressed here: Question: Using Windows 7, Unauthorized Access Exception when running my application
What you can do is create a DirectoryInfo object on the path and then call GetFiles on it. For example:
DirectoryInfo di = new DirectoryInfo(directoryPath);
foreach (var file in di.GetFiles())
{
string pathToUseWithStreamReader = file.FullName;
}
You need to use Directory.GetFiles to get any files residing in your "Client Automated" folder, then loop through them and load every single file into the stream.
var files = Directory.GetFiles(path);
foreach (var file in files)
{
var content = File.ReadAllText(file);
}
You can read more on it here:
https://msdn.microsoft.com/en-us/library/07wt70x2(v=vs.110).aspx
Also - in general, when working with files or directories like this, it's a good idea to programmatically check if they exist before working with them. You can do it like so:
if (Directory.Exists(path))
{
...
}
Or with files:
if (File.Exists(path))
{
...
}
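Putting the two answers together, a minimal sketch of loading every XML file from the folder into one string might look like the following. The class name XmlFolderLoader, the method LoadAll, and the "====" header lines are illustrative assumptions, not part of the original code.

```csharp
using System;
using System.IO;
using System.Text;

static class XmlFolderLoader
{
    // Hypothetical helper: returns the text of every *.xml file directly inside
    // folderPath, each preceded by a header line showing its file name.
    public static string LoadAll(string folderPath)
    {
        // The path must be a directory; passing it to StreamReader is what
        // caused the UnauthorizedAccessException in the question.
        if (!Directory.Exists(folderPath))
            throw new DirectoryNotFoundException(folderPath);

        var sb = new StringBuilder();
        foreach (string file in Directory.GetFiles(folderPath, "*.xml"))
        {
            sb.AppendLine("==== " + Path.GetFileName(file) + " ====");
            sb.AppendLine(File.ReadAllText(file));
        }
        return sb.ToString();
    }
}
```

In the button handler this could be used as textBox1.Text = XmlFolderLoader.LoadAll(path); with no StreamReader or StreamWriter to close by hand.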

Fastest way to delete million of files

I have a list of strings which are relative paths. I also have a string which contains the root path for those files. Now I am deleting them like this:
foreach (var rawDocumentPath in documents.Select(x => x.RawDocumentPath))
{
if (string.IsNullOrEmpty(rawDocumentPath))
{
continue;
}
string fileName = Path.Combine(storagePath, rawDocumentPath);
File.Delete(fileName);
}
The problem is that I call Path.Combine for every file, and that is slow.
How can I speed up this code? I can't delete whole folders, I cannot change current directory (because it affects a whole program)...
I need something like a class which can delete fast several files in specified directory.
If your disk can handle it, parallelizing should help a lot:
documents.AsParallel().ForAll(
document =>
{
if (!string.IsNullOrEmpty(document.RawDocumentPath))
{
string fileName = Path.Combine(storagePath, document.RawDocumentPath);
File.Delete(fileName);
}
});
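Whether the disk "can handle it" depends on the hardware, so it may help to cap the number of concurrent deletes and measure. The following is a sketch of that idea with Parallel.ForEach; the names BulkDelete, DeleteAll and maxParallel are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

static class BulkDelete
{
    // Hypothetical helper: deletes storagePath/relativePath for every entry,
    // running at most maxParallel deletes at the same time.
    public static void DeleteAll(string storagePath,
                                 IEnumerable<string> relativePaths,
                                 int maxParallel)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(relativePaths, options, relativePath =>
        {
            // Same guard as the original loop: skip null or empty paths.
            if (string.IsNullOrEmpty(relativePath)) return;

            // File.Delete does not throw when the file itself is already gone.
            File.Delete(Path.Combine(storagePath, relativePath));
        });
    }
}
```

A call could look like BulkDelete.DeleteAll(storagePath, documents.Select(x => x.RawDocumentPath), 4); the value 4 is only a starting point to tune against the actual storage.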

Faulty file in use error

This code is the first code in my Form_Load method:
DirectoryInfo dir =new DirectoryInfo("d:\\themes.thumb");
string[] animals = new string []
{
"Snakes",
"SnowyOwls",
"Tigers",
"TropicalFish",
"WildBeauty",
"Wolves"
};
foreach (FileInfo fil in dir.GetFiles())
{
for(int ii=0;ii<animals.Length;ii++)
{
if (fil.Name.StartsWith(animals[ii]))
{
try
{
fil.Replace(fil.FullName,fil.FullName.Replace(fil.Name,"Animals-" + fil.Name));
}
catch
{
}
}
}
}
and I'm getting the following error whenever if (fil.Name.StartsWith(animals[ii])) is true:
The process cannot access the file because it is being used by another process.
What is wrong as I have not opened any files before this code?
You should separate your reading logic from your update logic.
for example:
var replacements = dir.GetFiles()
.Where(file => animals.Any(animal => file.Name.StartsWith(animal)))
.Select(file => new
{
OldFullName = file.FullName,
NewFullName = file.FullName.Replace(file.Name, "Animals-" + file.Name)
})
.ToList();
foreach (var replacement in replacements)
{
File.Move(replacement.OldFullName, replacement.NewFullName);
}
Your replace logic has some subtle bugs (what happens with files that are in a folder called "Wolves", for example?), so you may want to work that out.
It looks like you are misunderstanding how to use the FileInfo.Replace method.
fil.Replace(fil.FullName,fil.FullName.Replace(fil.Name,"Animals-" + fil.Name));
Here you are actually trying to overwrite fil's contents with itself. That explains the error message.
You might want to read the documentation a bit more closely.
EDIT:
To be absolutely clear: FileInfo.Replace is not meant to be used to perform file renames. It's meant to replace file contents. To perform a rename, you use FileInfo.MoveTo.
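A rename along those lines could be sketched as follows. The class name ThemeRenamer and the method AddPrefix are assumptions for illustration; the target path is built from the directory and the file name separately, which also sidesteps the string-replace issue with a folder named like one of the prefixes.

```csharp
using System;
using System.IO;
using System.Linq;

static class ThemeRenamer
{
    // Hypothetical helper: prepends prefix to every file in folder whose name
    // starts with one of the given names. Returns the number of renamed files.
    public static int AddPrefix(string folder, string[] names, string prefix)
    {
        int renamed = 0;

        // GetFiles() returns an array, so the listing is complete
        // before any file is renamed.
        foreach (FileInfo file in new DirectoryInfo(folder).GetFiles())
        {
            if (!names.Any(n => file.Name.StartsWith(n, StringComparison.OrdinalIgnoreCase)))
                continue;

            string target = Path.Combine(file.DirectoryName, prefix + file.Name);
            if (File.Exists(target)) continue; // do not clobber an existing file

            file.MoveTo(target); // MoveTo within the same directory is a rename
            renamed++;
        }
        return renamed;
    }
}
```

For the question's data this might be called as ThemeRenamer.AddPrefix("d:\\themes.thumb", animals, "Animals-").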
Get LockHunter. It's a free tool which shows you which process is holding onto a particular file or folder. I found it really useful.
Microsoft Process Explorer is also free and can also find open handles (Ctrl+F) by name.

Need elegant way to move orphan files from folder after paired files are moved with FileInfo class

I have a folder from which I'm moving pairs of related files (xml paired with pdf). Additional files could be deposited into this folder at any time, but the utility runs every 10 minutes or so. We could use the FileSystemWatcher class but for internal reasons we don't for this utility.
I'm using the System.IO.FileInfo class to read all the files in the folder (will only be xml and pdf) during each run. Once I have the files in the FileInfo object, I iterate through the files, moving matches to a working folder. Once that is done, I want to move any files that were not paired, but are in the FileInfo object, to a failure folder.
Since I can't seem to remove items from the FileInfo object (or I am missing something), would it be easier to (1) use a string array from Directory class .GetFiles, (2) create a Dictionary from the FileInfo object and remove values from that during iteration, or (3) is there a more elegant approach using LINQ or something else?
Here is the code so far:
internal static bool CompareXMLandPDFFileNames(FileInfo[] xmlFiles, FileInfo[] pdfFiles, string xmlFilePath)
{
string workingFilePath = xmlFilePath + @"\WORKING";
if (xmlFiles.Length > 0)
{
foreach (var xmlFile in xmlFiles)
{
string xfn = xmlFile.Name; //xml file name
string pdfName = xfn.Substring(0,xfn.IndexOf('_')) + ".pdf"; //parsed pdf file name contained in xml file name
foreach (var pdfFile in pdfFiles)
{
string pfn = pdfFile.Name; //pdf file name
if (pfn == pdfName)
{
//move xml and pdf files to working folder...
FileInfo xmlInfo = new FileInfo(xmlFilePath + xfn);
FileInfo pdfInfo = new FileInfo(xmlFilePath + pfn);
if (!File.Exists(workingFilePath + xfn))
{
xmlInfo.MoveTo(workingFilePath + xfn);
}
if (!File.Exists(workingFilePath + pfn))
{
pdfInfo.MoveTo(workingFilePath + pfn);
}
}
}
}
//all files in the file objects should now be moved to working folder, if not, fix orphans...
}
return true;
}
To be honest I think the question is a bit poor. The problem is stated in a very complicated fashion. I think the workflow could be designed to be more robust and deterministic (e.g. why not upload file pairs in zipped sets in the first place?).
(And no "Someone" most likely "must not have been here before")
Here are some random improvements:
using System;
using System.Linq;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using System.Collections.Generic;
namespace O
{
static class X
{
private static readonly Regex _xml2pdf = new Regex(@"(_.*)\.xml$", RegexOptions.Compiled | RegexOptions.IgnoreCase);
internal static void MoveFileGroups(string uploadFolder)
{
string workingFilePath = Path.Combine(uploadFolder, "PROGRESS");
var groups = new DirectoryInfo(uploadFolder)
.GetFiles()
.GroupBy(fi => _xml2pdf.Replace(fi.Name, ".pdf"), StringComparer.InvariantCultureIgnoreCase)
.Where(group => group.Count() >1);
foreach (var group in groups)
{
if (!group.Any(fi => File.Exists(Path.Combine(workingFilePath, fi.Name))))
foreach (var file in group)
file.MoveTo(Path.Combine(workingFilePath, file.Name));
}
}
public static void Main(string[]args)
{
}
}
}
- Use readable names (say what you mean).
- IndexOf returns -1 if the filename contains no "_"; random upload filenames could make the procedure fail.
- Handle filenames case-insensitively on Windows.
- Don't concatenate paths manually (you could accidentally manufacture UNC paths, and your code is less portable).
- Don't assume one xml will map to one pdf: the naming scheme implies that many xmls can map to the same pdf name. This implementation allows that (or you could detect the situation by rejecting groups.Where(g => g.Count()>2)).
- Move groups atomically only (!): if any one of the files in a group exists in the target dir, don't move any. Otherwise you have a race condition where part of a group gets moved before the last file was (completely) uploaded, and that last file never gets moved because the group is no longer detected.
Other items (todo):
- Don't pass redundant parameters. You might pass a FileInfo[] instead of the raw GetFiles() call if you want filtering.
- Do error handling, notably:
  - handle IO exceptions
  - locking errors are to be expected while uploads are in progress (test it or end up with corrupted files); you need to handle these atomically (i.e. not move any files in a group unless all could be moved; this will be somewhat tricky)
- Test your code (none of my sample was tested; it just compiled on Linux with Mono).
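The sample above only moves complete groups; the orphan handling asked about in the question could be sketched with the same grouping, taking the groups that contain a single file. The class name OrphanMover, the method MoveOrphans and the failureFolder parameter are assumptions, and the sketch ignores the upload-in-progress problem described above (a file whose partner is still uploading looks like an orphan).

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

static class OrphanMover
{
    // Maps "name_anything.xml" to "name.pdf", as in the grouping above.
    private static readonly Regex XmlToPdf =
        new Regex(@"(_.*)\.xml$", RegexOptions.IgnoreCase);

    // Hypothetical helper: moves every file that has no partner in
    // uploadFolder to failureFolder. Returns the number of moved files.
    public static int MoveOrphans(string uploadFolder, string failureFolder)
    {
        Directory.CreateDirectory(failureFolder);

        // Materialize the list first so moving files cannot disturb the query.
        var orphans = new DirectoryInfo(uploadFolder)
            .GetFiles()
            .GroupBy(fi => XmlToPdf.Replace(fi.Name, ".pdf"),
                     StringComparer.OrdinalIgnoreCase)
            .Where(group => group.Count() == 1)
            .Select(group => group.Single())
            .ToList();

        int moved = 0;
        foreach (FileInfo orphan in orphans)
        {
            string target = Path.Combine(failureFolder, orphan.Name);
            if (File.Exists(target)) continue; // leave name clashes for manual review
            orphan.MoveTo(target);
            moved++;
        }
        return moved;
    }
}
```

Running this only after the paired files have been moved, and perhaps only for files older than one polling interval, would reduce the chance of treating a half-uploaded pair as a failure.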

How do I compare one collection of files to another in c#?

I am just learning C# (have been fiddling with it for about 2 days now) and I've decided that, for learning purposes, I will rebuild an old app I made in VB6 for syncing files (generally across a network).
When I wrote the code in VB 6, it worked approximately like this:
1. Create a Scripting.FileSystemObject
2. Create directory objects for the source and destination
3. Create file listing objects for the source and destination
4. Iterate through the source object, and check to see if each file exists in the destination
   - if not, create it
   - if so, check to see if the source version is newer/larger, and if so, overwrite the other
So far, this is what I have:
private bool syncFiles(string sourcePath, string destPath) {
DirectoryInfo source = new DirectoryInfo(sourcePath);
DirectoryInfo dest = new DirectoryInfo(destPath);
if (!source.Exists) {
LogLine("Source Folder Not Found!");
return false;
}
if (!dest.Exists) {
LogLine("Destination Folder Not Found!");
return false;
}
FileInfo[] sourceFiles = source.GetFiles();
FileInfo[] destFiles = dest.GetFiles();
foreach (FileInfo file in sourceFiles) {
// check exists on file
}
if (optRecursive.Checked) {
foreach (DirectoryInfo subDir in source.GetDirectories()) {
// create-if-not-exists destination subdirectory
syncFiles(sourcePath + subDir.Name, destPath + subDir.Name);
}
}
return true;
}
I have read examples that seem to advocate using the FileInfo or DirectoryInfo objects to do checks with the "Exists" property, but I am specifically looking for a way to search an existing collection/list of files, and not live checks to the file system for each file, since I will be doing so across the network and constantly going back to a multi-thousand-file directory is slow slow slow.
Thanks in Advance.
The GetFiles() method will only get you files that do exist. It doesn't make up random files that don't exist. So all you have to do is check whether each file exists in the other list.
Something along the lines of this could work:
var sourceFiles = source.GetFiles();
var destFiles = dest.GetFiles();
foreach (var file in sourceFiles)
{
if(!destFiles.Any(x => x.Name == file.Name))
{
// Do whatever
}
}
Note: You have of course no guarantee that something hasn't changed after you have done the calls to GetFiles(). For example, a file could have been deleted or renamed if you try to copy it later.
Could perhaps be done nicer somehow by using the Except method or something similar. For example something like this:
var sourceFiles = source.GetFiles();
var destFiles = dest.GetFiles();
var sourceFilesMissingInDestination = sourceFiles.Except(destFiles, new FileNameComparer());
foreach (var file in sourceFilesMissingInDestination)
{
// Do whatever
}
Where the FileNameComparer is implemented like so:
public class FileNameComparer : IEqualityComparer<FileInfo>
{
public bool Equals(FileInfo x, FileInfo y)
{
return Equals(x.Name, y.Name);
}
public int GetHashCode(FileInfo obj)
{
return obj.Name.GetHashCode();
}
}
Untested though :p
One little detail, instead of
sourcePath + subDir.Name
I would use
System.IO.Path.Combine(sourcePath, subDir.Name)
Path does reliable, OS independent operations on file- and foldernames.
Also I notice optRecursive.Checked popping out of nowhere. As a matter of good design, make that a parameter:
bool syncFiles(string sourcePath, string destPath, bool checkRecursive)
And since you mention it may be used for large numbers of files, keep an eye out for .NET 4: it has an IEnumerable-based replacement for GetFiles() (DirectoryInfo.EnumerateFiles()) that will let you process this in a streaming fashion.
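As a rough sketch of how the pieces above could fit together for large directories: list the destination once, keep the names in a HashSet so each lookup avoids rescanning the list, and stream the source with EnumerateFiles(). The class name FolderDiff and the method MissingInDestination are assumptions, and the case-insensitive comparison assumes Windows-style file names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class FolderDiff
{
    // Hypothetical helper: returns the names of files that exist in sourcePath
    // but not in destPath, hitting each directory listing only once.
    public static List<string> MissingInDestination(string sourcePath, string destPath)
    {
        // One pass over the destination; lookups in the set are O(1) on average,
        // unlike destFiles.Any(...), which rescans the array for every source file.
        var destNames = new HashSet<string>(
            new DirectoryInfo(destPath).EnumerateFiles().Select(f => f.Name),
            StringComparer.OrdinalIgnoreCase);

        return new DirectoryInfo(sourcePath)
            .EnumerateFiles()                           // streamed, .NET 4 and later
            .Where(f => !destNames.Contains(f.Name))
            .Select(f => f.Name)
            .ToList();
    }
}
```

The same set could be turned into a Dictionary<string, FileInfo> if the newer/larger comparison from the VB6 version is needed, since the cached FileInfo already carries Length and LastWriteTime from the listing.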
