File content comparison issue in C#

My requirement is to move files from one directory to another at a fixed interval. A basic copy works, but on each subsequent interval I want to move only the new files.
My approach is as follows:
I create a file list for both the source and the target directory in the target location. The idea is to diff these two lists and copy only the files that are new since the last iteration.
On the first iteration it creates a blank file in the target, indicating that everything should be copied. But my file comparison is hitting issues here: with the logic in the code below, it produces this exception: "System.Linq.Enumerable+d__99`1[System.String]"
Here's the code:
static void Main(string[] args)
{
    create_source_fileList();
    string source_dir = System.Configuration.ConfigurationSettings.AppSettings["SourceDir"];
    string target_dir = System.Configuration.ConfigurationSettings.AppSettings["TargetDir"];
    string dpath = target_dir + "Diff" + ".txt";
    TextWriter df = new StreamWriter(dpath);
    DirectoryInfo sourceinfo = new DirectoryInfo(source_dir);
    DirectoryInfo targetinfo = new DirectoryInfo(target_dir);
    string[] source_f_list = File.ReadAllLines(target_dir + "Source_File_List.txt");
    string[] target_f_list = File.ReadAllLines(target_dir + "Target_File_List.txt");
    IEnumerable<String> file_list_diff = source_f_list.Except(target_f_list);
    df.WriteLine(file_list_diff);
    df.Close();
    if (!Directory.Exists(targetinfo.FullName))
    {
        Directory.CreateDirectory(targetinfo.FullName);
    }
    foreach (FileInfo fi in sourceinfo.GetFiles())
    {
        fi.CopyTo(Path.Combine(targetinfo.ToString(), fi.Name), true);
    }
    create_target_fileList();
}
I need help fixing this issue. Also, will this approach work later on in a loop where I iterate only over the names in the diff file?
Thanks!!

It's not creating an exception; rather, it's writing the result of calling ToString on an IEnumerable<string> instance. You need to replace:
df.WriteLine(file_list_diff);
with
df.WriteLine(string.Join(Environment.NewLine, file_list_diff));
or, prior to .NET 4:
df.WriteLine(string.Join(Environment.NewLine, file_list_diff.ToArray()));
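As for looping over the diff later: here is a minimal sketch of the diff step on its own, assuming both list files hold one file name per line. FileListDiff, NewNames and WriteDiff are hypothetical names, and the case-insensitive comparison is my assumption (Windows-style file names), not something stated in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class FileListDiff
{
    // Returns the names present in sourceNames but absent from targetNames.
    // ToList() runs the lazy Except query once, so the result is a concrete list.
    // Assumption: names are compared case-insensitively.
    public static List<string> NewNames(IEnumerable<string> sourceNames, IEnumerable<string> targetNames)
    {
        return sourceNames.Except(targetNames, StringComparer.OrdinalIgnoreCase).ToList();
    }

    // Writes one name per line to the diff file (hypothetical helper).
    public static void WriteDiff(string diffPath, IEnumerable<string> names)
    {
        File.WriteAllLines(diffPath, names);
    }
}
```

The returned list can be iterated directly to copy only the new files, so the diff file is needed only if you want a record on disk.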

Related

Error while trying to rename files using C#

I have the following code to rename the files in the tree below, numbering them from 00000001.pdf up to the last file with 8-character zero padding, e.g. 00000100.pdf
Folder1
    subfolder1
        childfolder1
            pdffile1
            pdffile2
        childfolder2
            pdffile3
            pdffile4
    subfolder2
        childfolder3
            pdffile5
            pdffile6
But for some reason, in some of those child folders it keeps renaming files without end.
Sometimes it just jumps to another number, as if it were an async operation. If I stop and start again, it goes fine until the next folder or the one after it, where it messes up again.
But this error only happened within 19 folders.
Indeed their pdf names are different from the others, but I don't see how it is related.
The other files were named something like "DOCUMENT_01" and so on, but these are:
0000000100000001.pdf
0000000200000001.pdf
0000000300000001.pdf
etc
static void Main(string[] args)
{
    Console.WriteLine("Digite a pasta 'pai' onde serão buscados pdfs dentro das pastas 'filhas':");
    string path = Console.ReadLine();
    foreach (string dir in Directory.EnumerateDirectories(path))
    {
        foreach (string subdir in Directory.EnumerateDirectories(dir))
        {
            Console.WriteLine($"{dir} - {subdir}");
            int n = 1;
            foreach (string pdffile in Directory.EnumerateFiles(subdir, "*.pdf", SearchOption.AllDirectories))
            {
                Console.WriteLine(n.ToString().PadLeft(8, '0') + " " + new FileInfo(pdffile).Length);
                File.Move(pdffile, subdir + $"\\{n.ToString().PadLeft(8, '0')}.pdf");
                n++;
            }
            Console.WriteLine("\n\n");
        }
    }
}
What could be going wrong?
It should wait for File.Move to finish, then increment n, and then move on to the next pdffile, as a synchronous operation. So why does it jump numbers after a random time, and why does it keep going forever at other times?
And just as a reminder: if I stop the program, start it again and put the folder that was messed up first, it goes fine, and only when it reaches the next folder, or the one after that, does it start giving me this error again.
Hope that I could make myself clear... Thanks for your attention!
EDIT: will try using FileInfo class to give me the parent folder with the SearchOption.AllDirectories option and exclude this 3 stage loop plus actually working for any kind of tree structure
EDIT2: Tried it. It works as a "tree-independent" script, but I get the same result with the file names after the first folder... And it's really fast: in 3 seconds it goes from 00000169.pdf to 00006239.pdf in a folder with just 330 items.
As commented already, it is not a good idea to move or rename files while the code is enumerating through the list of those files, as the posted code appears to do. This causes problems; instead, record the files somehow, then come back later and rename or move them.
More importantly, the symptoms of renaming/moving files during enumeration are exactly what you describe: the errors are erratic and inconsistent, which makes them very difficult to trace. The problems you describe are classic trademarks of moving/renaming files while enumerating through those files.
With that said, the best and easiest way to traverse an unknown number of folder levels from a starting folder is recursion. In a lot of cases recursion can be avoided with some well-thought-out loops. However, when we do not know how many levels of folders there are, a simple loop or foreach paradigm may be doable, but you will most likely be adding variables and code that only make it more complex. This shows in the current code with the addition of the dir variable to keep track of when a different folder is used. Recursion is ideally suited to this situation.
In this case, the recursive method will be called once for each folder and subfolder under a given “starting” folder. This means that each call to the method marks the beginning of a different folder being processed. So n always starts at 1, and we do not need to keep track of the current folder's path.
The method takes a DirectoryInfo object as the “starting” folder. First we create some variables: a FileInfo array pdffiles to hold the pdf files in the given folder, a DirectoryInfo array foldersInThisFolder to hold all the other folders in this starting folder, and an int n to index the files as the posted code does.
Next we get all the pdf files in this “starting” folder. If there are pdf files in this folder, we loop through them and process them. Then we get all the other folders in this “starting” folder and loop through each one. For each folder in this collection we make the recursive call back to this method, using that folder as the “starting” folder, and the whole process continues until the loop through those folders ends.
static void TraverseDirectoryTree(DirectoryInfo startingFolder) {
    FileInfo[] pdffiles = null;
    DirectoryInfo[] foldersInThisFolder = null;
    int n = 1;
    Console.WriteLine(startingFolder.FullName);
    // get all the pdf files in this folder
    try {
        pdffiles = startingFolder.GetFiles("*.pdf");
    }
    catch (Exception e) {
        // you may want to catch specific exceptions
        // however in this example we do not care what
        // the exception is, we will simply ignore this.
        // in most cases pdffiles will be null if an exception is thrown
        Console.WriteLine(e.Message);
    }
    if (pdffiles != null) {
        foreach (FileInfo pdffile in pdffiles) {
            Console.WriteLine(pdffile.FullName + " -> " + n.ToString().PadLeft(8, '0') + " " + pdffile.Length);
            //File.Move(pdffile.FullName, pdffile.DirectoryName + $"\\{n.ToString().PadLeft(8, '0')}.pdf");
            // add file path to a list of files to rename later?
            n++;
        }
        // start over with the sub folders in this folder
        foldersInThisFolder = startingFolder.GetDirectories();
        foreach (DirectoryInfo dirInfo in foldersInThisFolder) {
            TraverseDirectoryTree(dirInfo);
        }
    }
}
Usage…
Console.WriteLine("Type the folder you want to start with:");
string path = Console.ReadLine();
DirectoryInfo di = new DirectoryInfo(path);
TraverseDirectoryTree(di);
Edit… after further testing it appears that what you want to do is simply “rename” the pdf files. As suggested, a simple solution is to save the files we want to rename and, once they are all collected, loop through them and rename them. This should eliminate the problems caused by renaming files while enumerating through the files collection.
To help, I created a Dictionary<string, int> called filesToRename. While recursively looping through all the folders, we add the full path of each pdf file we want to rename as the Key and the int value n as the Value. After the dictionary is filled we simply loop through it and rename the files.
private static Dictionary<string, int> filesToRename = new Dictionary<string, int>();
Then replace the commented-out line in the recursive method TraverseDirectoryTree…
//File.Move(pdffile.FullName, pdffile.DirectoryName + $"\\{n.ToString().PadLeft(8, '0')}.pdf");
With…
filesToRename.Add(pdffile.FullName, n);
Then after the dictionary is filled we would loop through it and rename the files, something like…
DirectoryInfo di = new DirectoryInfo(path);
TraverseDirectoryTree(di);
foreach (KeyValuePair<string, int> kvp in filesToRename) {
    int index = kvp.Key.ToString().LastIndexOf(@"\");
    string dir = kvp.Key.ToString().Substring(0, index);
    File.Move(kvp.Key, dir + $"\\{kvp.Value.ToString().PadLeft(8, '0')}.pdf");
}
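As a side note, the folder part of each key can also be taken with the Path class instead of LastIndexOf/Substring. A small sketch, where RenameHelper and NumberedPath are hypothetical names of mine and the "D8" format string is just another way to get the 8-digit zero padding:

```csharp
using System.IO;

static class RenameHelper
{
    // Builds the path of the numbered PDF in the same folder as the original file.
    // Assumption: originalPath includes a folder part; for a bare root,
    // Path.GetDirectoryName returns null and Path.Combine would throw.
    public static string NumberedPath(string originalPath, int n)
    {
        string dir = Path.GetDirectoryName(originalPath);
        return Path.Combine(dir, n.ToString("D8") + ".pdf");
    }
}
```

Inside the loop over filesToRename this would be used as File.Move(kvp.Key, RenameHelper.NumberedPath(kvp.Key, kvp.Value)). Keep in mind that File.Move throws if the target name already exists, which can happen when a folder already contains files named like 00000001.pdf.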
I am hoping this makes sense…
Answer: as Klaus Gütter helped me see, I just added .ToList() to Directory.EnumerateFiles so that it builds a fixed list first, and then ran the foreach over each file.
It will rename every pdf within the folder and its subfolders.
Console.WriteLine("Type the folder you want to start with:");
string path = Console.ReadLine();
string dir = "";
int n = 1;
foreach (string pdffile in Directory.EnumerateFiles(path, "*.pdf", SearchOption.AllDirectories).ToList())
{
    FileInfo fi = new FileInfo(pdffile);
    if (fi.DirectoryName == dir)
    {
        Console.WriteLine("\t" + n.ToString().PadLeft(8, '0'));
        File.Move(pdffile, dir + $"\\{n.ToString().PadLeft(8, '0')}.pdf");
        n++;
    }
    else
    {
        n = 1;
        dir = fi.DirectoryName;
        Console.WriteLine("\n\n" + dir);
        File.Move(pdffile, dir + $"\\{n.ToString().PadLeft(8, '0')}.pdf");
        Console.WriteLine("\t" + n.ToString().PadLeft(8, '0'));
        n++;
    }
}

Quote-Enclosing 600+ CSV files in a directory c#

I currently have this method that can successfully quote-enclose a single CSV file but I am trying to loop through 600+ CSV files in a directory and perform the Quote Enclose method on each one. I am unsure how to do this effectively. Any feedback is appreciated.
Below is my code:
public void QuoteEnclosingCSV()
{
    string fileNamePath = Path.GetTempPath() + @"\Reports\*.csv";
    var stringBuilder = new StringBuilder();
    foreach (var line in File.ReadAllLines(fileNamePath))
    {
        stringBuilder.AppendLine(string.Format("\"{0}\"", string.Join("\",\"", line.Split(','))));
    }
    File.WriteAllText(string.Format(fileNamePath, Path.GetDirectoryName(fileNamePath)), stringBuilder.ToString());
}
string marFolder = Path.GetTempPath() + @"\Reports\";
var dir = new DirectoryInfo(marFolder);
foreach (var file in dir.EnumerateFiles("*.csv"))
{
    QuoteEnclosingCSV();
}
Below is the error I'm receiving:
Illegal characters in path.
My first step in unraveling this conundrum would be to guess what the error message is trying to tell me. My first guess would be that it's trying to say that the path has illegal characters in it. Did you stop to check what characters were in the path that you get the error on?
I'll show you:
C:\Users\YoungStamos\AppData\Local\Temp\\Reports\*.csv
That's the path you pass to File.ReadAllLines(). The single argument to that method is a path to one single file. You can't have an asterisk (*) in a filename in Windows, because it's a wildcard.
What you seem to be trying to do is pass a parameter to QuoteEnclosingCSV(). In this loop, you carefully list each file, but you never tell QuoteEnclosingCSV() about any of them.
foreach (var file in dir.EnumerateFiles("*.csv"))
{
    QuoteEnclosingCSV();
}
This is more like what you want:
public void QuoteEnclosingCSV(string fileNamePath)
{
    var stringBuilder = new StringBuilder();
    foreach (var line in File.ReadAllLines(fileNamePath))
    {
        stringBuilder.AppendLine(string.Format("\"{0}\"", string.Join("\",\"", line.Split(','))));
    }
    // I don't know what string.Format() is meant to do here; I'm guessing your guess is
    // as good as mine, so I'm eliminating it.
    //File.WriteAllText(string.Format(fileNamePath, Path.GetDirectoryName(fileNamePath)), stringBuilder.ToString());
    File.WriteAllText(fileNamePath, stringBuilder.ToString());
}
And then call it like this:
string marFolder = Path.Combine(Path.GetTempPath(), "Reports");
var dir = new DirectoryInfo(marFolder);
foreach (var fileInfo in dir.EnumerateFiles("*.csv"))
{
    QuoteEnclosingCSV(fileInfo.FullName);
}

Trouble specifying destination filename for use in FileInfo.Copy example from MSDN

I'm using two DateTimePickers to specify a date range, then I'm using a CheckedListBox to specify some strings for filenames with wildcards to enumerate in each day's subdirectory contained within a system environment variable path. I want to copy from that source to a destination using FileInfo.Copy.
I have my code already creating the necessary directories. But I'm having trouble specifying the destination filenames -- they are not being specified at all with how I have this written.
I was thinking of using regular expressions, but after some digging I found this MSDN article that seems to do what I want already. I think I need to alter my code in order to use it. I could use some assistance fitting what I already have into what MSDN shows in its example.
I have been on this part of my program for a month now, which has led me to learn quite a bit about c#, parallel programming, async, lambda expressions, background workers, etc. What seems should be simple has become a big rabbit hole for me. For this question I just need a nudge in the right direction, and I will greatly appreciate it!
Here is my code as it stands:
private async void ProcessFiles()
{
    // create a list of topics
    var topics = topicsBox.CheckedItems.Cast<string>().ToList();
    // create a list of source directories based on date range
    var directories = new List<string>();
    var folders = new List<string>();
    for (DateTime date = dateTimePicker1.Value.Date;
         date.Date <= dateTimePicker2.Value.Date;
         date = date.AddDays(1))
    {
        directories.Add(_tracePath + @"\" + date.ToString("yyyy-MM-dd") + @"\");
        folders.Add(@"\" + date.ToString("yyyy-MM-dd") + @"\");
    }
    // create a list of source files to copy and destination
    // revise based on https://msdn.microsoft.com/en-us/library/kztecsys.aspx?f=255&MSPPError=-2147217396
    foreach (var path in directories)
    {
        var path1 = path;
        try
        {
            foreach (var files2 in folders)
            {
                // create the target directory
                var destPath = textBox1.Text + @"\" + textBox4.Text + files2;
                Console.WriteLine("Target directory is {0}", destPath);
                Console.WriteLine("Destination filename is {0}", files2);
                foreach (var files in topics)
                {
                    foreach (string sourcePath in Directory.EnumerateFiles(path1, files + "*.*", SearchOption.AllDirectories))
                    {
                        // copy the files to the temp folder asynchronously
                        Console.WriteLine("Copy {0} to {1}", sourcePath, destPath);
                        Directory.CreateDirectory(sourcePath.Replace(sourcePath, destPath));
                    }
                }
            }
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
So sourcePath contains the source path and filename. You can easily construct the destination path from that, like so:
// Get just the filename of the source file.
var filename = Path.GetFileName(sourcePath);
// Construct a full path to the destination by combining the destination path and the filename.
var fullDestPath = Path.Combine(destPath, filename);
// Ensure the destination directories exist. Don't pass in the filename to CreateDirectory!
Directory.CreateDirectory(destPath);
Then you can copy the file (synchronously) like this:
File.Copy(sourcePath, fullDestPath);
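Put together, those steps could be wrapped in a small helper along these lines. This is only a sketch: CopyHelper, DestinationFor and CopyInto are hypothetical names, and passing overwrite: true is my assumption about what you want when the file already exists at the destination.

```csharp
using System.IO;

static class CopyHelper
{
    // Combines the destination folder with the file name taken from sourcePath.
    public static string DestinationFor(string sourcePath, string destFolder)
    {
        return Path.Combine(destFolder, Path.GetFileName(sourcePath));
    }

    // Creates the destination folder if needed, then copies the file into it.
    // Assumption: an existing file with the same name should be overwritten.
    public static void CopyInto(string sourcePath, string destFolder)
    {
        Directory.CreateDirectory(destFolder);
        File.Copy(sourcePath, DestinationFor(sourcePath, destFolder), overwrite: true);
    }
}
```

In the innermost loop of the question this would replace the Directory.CreateDirectory(...) line with CopyHelper.CopyInto(sourcePath, destPath).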

Moving files based on name to the corresponding folder

Hello everyone and well met! I have tried a lot of different methods/programs to try and solve my problem. I'm a novice programmer and have taken a Visual Basic Class and Visual C# class.
I'm working with this in C#
I started off by making a very basic move file program and it worked fine for one file but as I mentioned I will be needing to move a ton of files based on name
What I am trying to do is move .pst (for example dave.pst) files from my exchange server based on username onto a backup server in the users folder (folder = dave) that has the same name as the .pst file
The ideal program would be:
Get files from the folder with the .pst extension
Move files to appropriate folder that has the same name in front of the .pst file extension
Update:
// String pstFileFolder = @"C:\test\";
// var searchPattern = "*.pst";
// var extension = ".pst";
// var serverFolder = @"C:\test3\";
// String filename = System.IO.Path.GetFileNameWithoutExtension(pstFileFolder);
// Searches the directory for *.pst
DirectoryInfo sourceDirectory = new DirectoryInfo(@"C:\test\");
String strTargetDirectory = (@"C:\test3\");
Console.WriteLine(sourceDirectory);
Console.ReadKey(true);
foreach (FileInfo file in sourceDirectory.GetFiles()) {
    Console.WriteLine(file);
    Console.ReadKey(true);
    // Try to create the directory.
    System.IO.Directory.CreateDirectory(strTargetDirectory);
    file.MoveTo(strTargetDirectory + "\\" + file.Name);
}
This is just a simple copy procedure, I'm completely aware. The lines
Console.WriteLine(file);
Console.ReadKey(true);
are there for verification right now, to make sure I'm getting the proper files, and I am. Now I just need to find the folder based on the name of the .pst file (the users' folders are already created), make a folder (say 0304 for the year), then copy that .pst based on the name.
Thanks a ton for your help guys. @yuck, thanks for the code.
Have a look at the File and Directory classes in the System.IO namespace. You could use the Directory.GetFiles() method to get the names of the files you need to transfer.
Here's a console application to get you started. Note that there isn't any error checking and it makes some assumptions about how the files are named (e.g. that they end with .pst and don't contain that elsewhere in the name):
private static void Main() {
    var pstFileFolder = @"C:\TEMP\PST_Files\";
    var searchPattern = "*.pst";
    var extension = ".pst";
    var serverFolder = @"\\SERVER\PST_Backup\";
    // Searches the directory for *.pst
    foreach (var file in Directory.GetFiles(pstFileFolder, searchPattern)) {
        // Exposes file information like Name
        var theFileInfo = new FileInfo(file);
        // Gets the user name based on file name
        // e.g. DaveSmith.pst would become DaveSmith
        var userName = theFileInfo.Name.Replace(extension, "");
        // Sets up the destination location
        // e.g. \\SERVER\PST_Backup\DaveSmith\DaveSmith.pst
        var destination = serverFolder + userName + @"\" + theFileInfo.Name;
        File.Move(file, destination);
    }
}
System.IO is your friend in this case ;)
First, determine the file name with:
String filename = System.IO.Path.GetFileNameWithoutExtension(SOME_PATH);
To build the path to the new folder, use Path.Combine:
String targetDir = Path.Combine(SOME_ROOT_DIR, filename);
Next, create the folder named after the file:
System.IO.Directory.CreateDirectory(targetDir);
Ah! You also need the name of the file, but with the extension this time. Use Path.GetFileName:
String fileNameWithExtension = System.IO.Path.GetFileName(SOME_PATH);
And then you can move the file (with File.Move) into it:
System.IO.File.Move(SOME_PATH, Path.Combine(targetDir, fileNameWithExtension));
Laster already showed you how to get the file list of a folder.
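The steps above can be sketched as one method. PstMover and MoveToNamedFolder are hypothetical names, and the sketch assumes the target folder may or may not exist yet (Directory.CreateDirectory does nothing if it is already there):

```csharp
using System.IO;

static class PstMover
{
    // Moves e.g. C:\test\dave.pst to <rootDir>\dave\dave.pst and returns the new path.
    public static string MoveToNamedFolder(string filePath, string rootDir)
    {
        // "dave" for "dave.pst"
        string name = Path.GetFileNameWithoutExtension(filePath);
        string targetDir = Path.Combine(rootDir, name);
        // Creates the folder (and any missing parents); no-op if it already exists.
        Directory.CreateDirectory(targetDir);
        string destination = Path.Combine(targetDir, Path.GetFileName(filePath));
        File.Move(filePath, destination);
        return destination;
    }
}
```

Called once per file returned by Directory.GetFiles(folder, "*.pst"), this covers both points of the ideal program listed in the question.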
I personally prefer DirectoryInfo because it is more object-oriented.
DirectoryInfo sourceDirectory = new DirectoryInfo(@"C:\MySourceDirectoryPath");
String strTargetDirectory = @"C:\MyTargetDirectoryPath";
foreach (FileInfo file in sourceDirectory.GetFiles())
{
    file.MoveTo(strTargetDirectory + "\\" + file.Name);
}

How do I compare one collection of files to another in c#?

I am just learning C# (I have been fiddling with it for about 2 days now) and I've decided that, for learning purposes, I will rebuild an old app I made in VB6 for syncing files (generally across a network).
When I wrote the code in VB 6, it worked approximately like this:
Create a Scripting.FileSystemObject
Create directory objects for the source and destination
Create file listing objects for the source and destination
Iterate through the source object, and check to see if it exists in the destination
if not, create it
if so, check to see if the source version is newer/larger, and if so, overwrite the other
So far, this is what I have:
private bool syncFiles(string sourcePath, string destPath) {
    DirectoryInfo source = new DirectoryInfo(sourcePath);
    DirectoryInfo dest = new DirectoryInfo(destPath);
    if (!source.Exists) {
        LogLine("Source Folder Not Found!");
        return false;
    }
    if (!dest.Exists) {
        LogLine("Destination Folder Not Found!");
        return false;
    }
    FileInfo[] sourceFiles = source.GetFiles();
    FileInfo[] destFiles = dest.GetFiles();
    foreach (FileInfo file in sourceFiles) {
        // check exists on file
    }
    if (optRecursive.Checked) {
        foreach (DirectoryInfo subDir in source.GetDirectories()) {
            // create-if-not-exists destination subdirectory
            syncFiles(sourcePath + subDir.Name, destPath + subDir.Name);
        }
    }
    return true;
}
I have read examples that seem to advocate using the FileInfo or DirectoryInfo objects to do checks with the "Exists" property, but I am specifically looking for a way to search an existing collection/list of files, and not live checks to the file system for each file, since I will be doing so across the network and constantly going back to a multi-thousand-file directory is slow slow slow.
Thanks in Advance.
The GetFiles() method only returns files that exist. It doesn't make up random files that don't exist. So all you have to do is check whether each file is in the other list.
Something along these lines could work:
var sourceFiles = source.GetFiles();
var destFiles = dest.GetFiles();
foreach (var file in sourceFiles)
{
    if (!destFiles.Any(x => x.Name == file.Name))
    {
        // Do whatever
    }
}
Note: You have of course no guarantee that something hasn't changed after you have done the calls to GetFiles(). For example, a file could have been deleted or renamed if you try to copy it later.
Could perhaps be done nicer somehow by using the Except method or something similar. For example something like this:
var sourceFiles = source.GetFiles();
var destFiles = dest.GetFiles();
var sourceFilesMissingInDestination = sourceFiles.Except(destFiles, new FileNameComparer());
foreach (var file in sourceFilesMissingInDestination)
{
    // Do whatever
}
Where the FileNameComparer is implemented like so:
public class FileNameComparer : IEqualityComparer<FileInfo>
{
    public bool Equals(FileInfo x, FileInfo y)
    {
        return Equals(x.Name, y.Name);
    }
    public int GetHashCode(FileInfo obj)
    {
        return obj.Name.GetHashCode();
    }
}
Untested though :p
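Since the question also asks about overwriting when the source is newer or larger, here is a rough sketch that indexes the destination listing once in a dictionary and then decides per source file in memory. SyncPlanner and FilesToCopy are hypothetical names; the case-insensitive name match and the exact "newer or larger" test are my assumptions about the intended rule.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class SyncPlanner
{
    // Returns the source files that are missing from the destination listing,
    // or whose source copy is newer or larger than the destination copy.
    public static List<FileInfo> FilesToCopy(IEnumerable<FileInfo> sourceFiles, IEnumerable<FileInfo> destFiles)
    {
        // Index the destination listing once, keyed by file name.
        // Assumption: names are matched case-insensitively.
        var destByName = new Dictionary<string, FileInfo>(StringComparer.OrdinalIgnoreCase);
        foreach (FileInfo d in destFiles)
        {
            destByName[d.Name] = d;
        }

        var result = new List<FileInfo>();
        foreach (FileInfo s in sourceFiles)
        {
            FileInfo d;
            bool missing = !destByName.TryGetValue(s.Name, out d);
            if (missing || s.LastWriteTimeUtc > d.LastWriteTimeUtc || s.Length > d.Length)
            {
                result.Add(s);
            }
        }
        return result;
    }
}
```

It would be called as SyncPlanner.FilesToCopy(source.GetFiles(), dest.GetFiles()), so each directory is listed once per sync pass rather than once per file.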
One little detail, instead of
sourcePath + subDir.Name
I would use
System.IO.Path.Combine(sourcePath, subDir.Name)
Path performs reliable, OS-independent operations on file and folder names.
Also I notice optRecursive.Checked popping out of nowhere. As a matter of good design, make that a parameter:
bool syncFiles(string sourcePath, string destPath, bool checkRecursive)
And since you mention it may be used for large numbers of files, keep an eye out for .NET 4: it has DirectoryInfo.EnumerateFiles(), an IEnumerable replacement for GetFiles() that lets you process the listing in a streaming fashion.
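For illustration, the streaming variant might look like the sketch below, assuming .NET 4 or later. Listing and FileNames are hypothetical names.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class Listing
{
    // Lazily yields the file names in a folder; directory entries are read
    // as the caller iterates instead of being loaded into an array up front.
    public static IEnumerable<string> FileNames(string folder)
    {
        return new DirectoryInfo(folder).EnumerateFiles().Select(f => f.Name);
    }
}
```

One caveat with the lazy form: avoid renaming or moving files in the folder while you are still iterating over it; materialize the sequence with ToList() first if you need to do that.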
