Check inside a loop if a *.txt file has been created - C#

My code checks inside a loop whether a *.txt file has been created.
If the file has not been created after x time, I will throw an exception.
Here is my code:
var AnswerFile = @"C:\myFile.txt";
for (int i = 0; i <= 30; i++)
{
    if (File.Exists(AnswerFile))
        break;
    await Task.Delay(100);
}
if (File.Exists(AnswerFile))
{
}
else
{
}
After the loop I check whether the file has been created or not. The loop times out after about 3 seconds (100 ms × 30 iterations).
My code is working; I am just asking about the performance and quality of it. Is there a better approach than mine? For example, should I use the FileInfo class instead, like this?
var fi1 = new FileInfo(AnswerFile);
if(fi1.Exists)
{
}
Or should I use the FileSystemWatcher class?

You should perhaps use a FileSystemWatcher for this and decouple the process of creating the file from the process of reacting to its presence. If the file must be generated within a certain time because it has some expiry, you could make the expiry datetime part of the file name, so that if it appears after that time you know it has expired. A note of caution with the FileSystemWatcher: it can sometimes miss something (the documentation says events can be missed if large numbers are generated in a short time).
In the past I've used this for watching for files being uploaded via FTP. As soon as the file-created notification appears, I put the file into a list and check it periodically to see if it is still growing. You can either watch the FileSystemWatcher's change events and the file's LastWriteTime, or directly compare the size of the file now vs. some time ago; in either approach it's probably easiest to use a dictionary to track each file and its previous size / most recent LastWriteTime.
After a minute of no growth I consider the file uploaded completely and I process it. It might be wise for you to implement a similar delay if you use a FileSystemWatcher and the files arrive by some slow generating method.
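As a rough illustration of that dictionary-based tracking, here is a minimal sketch; the folder path, the *.txt filter, the 10-second poll, the one-minute quiet period and the ProcessFile handler are all assumptions for the example, not part of the original answer:
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

class IncomingFileMonitor
{
    // Last observed size and the time it last changed, per file.
    private readonly ConcurrentDictionary<string, (long Size, DateTime SeenAt)> _pending =
        new ConcurrentDictionary<string, (long Size, DateTime SeenAt)>();

    private FileSystemWatcher _watcher;
    private Timer _timer;

    public void Start()
    {
        _watcher = new FileSystemWatcher(@"C:\incoming", "*.txt"); // hypothetical folder/filter
        _watcher.Created += (s, e) => _pending[e.FullPath] = (-1L, DateTime.UtcNow);
        _watcher.EnableRaisingEvents = true;

        // Poll every 10 seconds; a file that hasn't grown for a minute is considered complete.
        _timer = new Timer(_ => CheckPending(), null, TimeSpan.Zero, TimeSpan.FromSeconds(10));
    }

    private void CheckPending()
    {
        foreach (var entry in _pending)
        {
            long currentSize;
            try { currentSize = new FileInfo(entry.Key).Length; }
            catch (IOException) { continue; } // still locked or gone; look again next tick

            if (currentSize != entry.Value.Size)
                _pending[entry.Key] = (currentSize, DateTime.UtcNow); // still growing
            else if (DateTime.UtcNow - entry.Value.SeenAt > TimeSpan.FromMinutes(1)
                     && _pending.TryRemove(entry.Key, out _))
                ProcessFile(entry.Key); // quiet for a minute: treat as fully uploaded
        }
    }

    private void ProcessFile(string path) { /* your processing here */ }
}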

Why don't you retrieve a list of file names and then search in that list? You can use Directory.GetFiles to get the list of files inside a directory, then search within it.
This would be more flexible for you, since you create the list once and reuse it across the application instead of calling File.Exists for each file.
Example :
var path = @"C:\folder\"; // the folder that contains all the answer files
var ext = "*.txt";        // the file extension to look for
// Get the file names (bare names, no path or extension) and make them all lowercase.
var files = Directory.GetFiles(path, ext)
                     .Select(x => Path.GetFileNameWithoutExtension(x).Trim().ToLower())
                     .ToList();
// The file name to search for
var search = "myFile";
// Check
if (files.Contains(search.ToLower()))
{
    Console.WriteLine($"File: {search} already exists.");
}
else
{
    Console.WriteLine($"File: {search} was not found.");
}

Related

Increment Counter

In my application there is an option where the user creates a simple txt file containing some data. I would like to name the files in sequential order, like ST1, ST2, ... This sequence must stay the same across all users: if user1 creates a file, the system should name it ST100, and then if user2 creates a file, the system should name it ST101.
I can't use the application scope, as it is read-only and can't be changed at run time, whereas the user scope only affects an individual user, not the whole application.
I was wondering whether there is any other solution to achieve this apart from using a database table and tracking the sequence there.
Thanks
You could use a loop and File.Exists:
var dir = @"C:\SampleFolder";
int number = 100; // you want to start at 100
string fileName = String.Format("ST{0}.txt", number.ToString("D3"));
while (File.Exists(Path.Combine(dir, fileName)))
    fileName = String.Format("ST{0}.txt", (++number).ToString("D3"));
Finally you will have a new file name, and you can build the full path:
string path = Path.Combine(dir, fileName);
If your application is always on, you can use a global variable for the entire system holding the numeric value of the next entry.
If not, you can use a text file as a small database holding the current file number. The problem with this solution is synchronization: if two users perform the same operation at the same time it can cause problems, so it is best to use a database.
I hope it helps you.
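If you do go the text-file route, here is a minimal sketch of the synchronization idea, assuming all users run the application on the same machine; the mutex name and the counter file path are made up for the example:
using System;
using System.IO;
using System.Threading;

static class SequentialFileNamer
{
    // Reads, increments and writes back the counter under a machine-wide named mutex,
    // so two users generating a file at the same time cannot get the same number.
    public static string NextFileName(string counterPath)
    {
        using (var mutex = new Mutex(false, @"Global\MyAppStCounter")) // hypothetical name
        {
            mutex.WaitOne();
            try
            {
                int number = File.Exists(counterPath)
                    ? int.Parse(File.ReadAllText(counterPath))
                    : 100;                                   // start at ST100
                File.WriteAllText(counterPath, (number + 1).ToString());
                return $"ST{number}.txt";
            }
            finally
            {
                mutex.ReleaseMutex();
            }
        }
    }
}
As the answer notes, if users run the application from different machines this breaks down, and a database (or another central service) really is the better choice.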

Add Files Into Existing Zip - performance issue

I have a WCF web service that saves files to a folder (about 200,000 small files).
After that, I need to move them to another server.
The solution I found was to zip them and then move them.
When I adopted this solution, I tested it with 20,000 files; zipping 20,000 files took only about 2 minutes, and moving the zip is really fast.
But in production, zipping 200,000 files takes more than 2 hours.
Here is my code to zip the folder:
using (ZipFile zipFile = new ZipFile())
{
    zipFile.UseZip64WhenSaving = Zip64Option.Always;
    zipFile.CompressionLevel = CompressionLevel.None;
    zipFile.AddDirectory(this.SourceDirectory.FullName, string.Empty);
    zipFile.Save(DestinationCurrentFileInfo.FullName);
}
I want to modify the WCF web service so that, instead of saving to a folder, it saves directly into the zip.
I use the following code to test:
var listAes = Directory.EnumerateFiles(myFolder, "*.*", SearchOption.AllDirectories)
                       .Where(s => s.EndsWith(".aes"))
                       .Select(f => new FileInfo(f));

foreach (var additionFile in listAes)
{
    using (var zip = ZipFile.Read(nameOfExistingZip))
    {
        zip.CompressionLevel = Ionic.Zlib.CompressionLevel.None;
        zip.AddFile(additionFile.FullName);
        zip.Save();
    }
    file.WriteLine("Delay for adding a file : " + sw.Elapsed.TotalMilliseconds);
    sw.Restart();
}
The first file added to the zip takes only 5 ms, but the 10,000th file takes 800 ms.
Is there a way to optimize this? Or do you have other suggestions?
EDIT
The example shown above is only a test; in the WCF web service, I will have different requests sending files that I need to add to the zip file.
Since WCF is stateless, I will have a new instance of my class with each call, so how can I keep the zip file open to add more files?
I've looked at your code and immediately spotted problems. The problem with a lot of software developers nowadays is that they don't understand how stuff works, which makes it impossible to reason about it. In this particular case you don't seem to know how ZIP files work, so I would suggest you first read up on how they work and attempt to break down what happens under the hood.
Reasoning
Now that we're all on the same page about how ZIP files work, let's start the reasoning by breaking down how this works using your source code; we'll continue from there:
var listAes = Directory.EnumerateFiles(myFolder, "*.*", SearchOption.AllDirectories)
                       .Where(s => s.EndsWith(".aes"))
                       .Select(f => new FileInfo(f));

foreach (var additionFile in listAes)
{
    // (1)
    using (var zip = ZipFile.Read(nameOfExistingZip))
    {
        zip.CompressionLevel = Ionic.Zlib.CompressionLevel.None;
        // (2)
        zip.AddFile(additionFile.FullName);
        // (3)
        zip.Save();
    }
    file.WriteLine("Delay for adding a file : " + sw.Elapsed.TotalMilliseconds);
    sw.Restart();
}
(1) opens the ZIP file. You're doing this for every file you attempt to add.
(2) adds a single file to the ZIP file.
(3) saves the complete ZIP file.
On my computer this takes about an hour.
Now, not all of the file format details are relevant. We're looking for the part that gets increasingly worse in your program.
Skimming over the file format specification, you'll notice that compression is based on Deflate, which doesn't require information about the other files being compressed. Moving on, we'll notice how the 'file table' is stored in the ZIP file:
You'll notice that there's a 'central directory' which lists the files in the ZIP archive. It's basically stored as a 'list'. Using this information, we can reason about the trivial way to update it when implementing steps (1)-(3) in this order:
Open the zip file, read the central directory
Append data for the (new) compressed file, store the pointer along with the filename in the new central directory.
Re-write the central directory.
Think about it for a moment: for file #1 you need 1 write operation; for file #2, you need to read (1 item), append (in memory) and write (2 items); for file #3, you need to read (2 items), append (in memory) and write (3 items); and so on. In total that is 1 + 2 + ... + N = N(N+1)/2 entry operations, which is quadratic in the number of files. This basically means that your performance will go down the drain as you add more files. You've already observed this; now you know why.
A possible solution
In the solution shown further below ("The easiest solution is the most practical"), all files are added at once. That might not work in your use case. Another solution is to implement a merge that basically merges 2 ZIP files together each time. This is more convenient if you don't have all the files available when you start the compression process.
Basically the algorithm then becomes:
Add a few (say, 16) files. You can toy with this number. Store the result in, say, 'file16.zip'.
Add more files. When you hit 16 files again, you have to merge the two files of 16 items into a single file of 32 items.
Merge files until you cannot merge anymore. Basically, every time you have two files of N items, you create a new file of 2*N items.
Go to (2).
Again, we can reason about it. The first 16 files aren't a problem; we've already established that.
We can also reason about what will happen in our program. Because we're merging 2 files into 1 file, we don't have to do as many reads and writes. In fact, if you reason about it, you'll see that you have a file of 32 entries after 2 merges, 64 after 4 merges, 128 after 8 merges, 256 after 16 merges... hey, wait, we know this sequence, it's 2^N. Again, reasoning about it, we'll find that we need approximately 500 merges, which is much better than the 200,000 operations that we started with.
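To make the merge step concrete, here is a minimal sketch. It uses System.IO.Compression's ZipArchive rather than the DotNetZip (Ionic) library used elsewhere in this thread, and it assumes entry names do not collide between the two archives:
using System.IO.Compression; // note: this ZipFile is System.IO.Compression.ZipFile, not Ionic.Zip.ZipFile

// Copies every entry of sourceZipPath into targetZipPath; the target's central
// directory is rewritten only once, when the archive is disposed.
static void MergeZips(string targetZipPath, string sourceZipPath)
{
    using (var target = ZipFile.Open(targetZipPath, ZipArchiveMode.Update))
    using (var source = ZipFile.Open(sourceZipPath, ZipArchiveMode.Read))
    {
        foreach (var entry in source.Entries)
        {
            var copy = target.CreateEntry(entry.FullName, CompressionLevel.NoCompression);
            using (var from = entry.Open())
            using (var to = copy.Open())
                from.CopyTo(to);
        }
    }
}
With DotNetZip the equivalent would be reading both archives and re-adding the source's entries to the target; either way, each merge pays the cost of rewriting the target once rather than once per file.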
Hacking in the ZIP file
Yet another solution that might come to mind is to overallocate the central directory, creating slack space for future entries. However, this probably requires you to hack into the ZIP code and create your own ZIP file writer. The idea is that you basically overallocate the central directory for 200K entries before you get started, so that you can simply append in place.
Again, we can reason about it: adding a file now means adding the file data and updating some headers. It won't be as fast as the original solution because you'll need random disk IO, but it'll probably work fast enough.
I haven't worked this out, but it doesn't seem overly complicated to me.
The easiest solution is the most practical
What we haven't discussed so far is the easiest possible solution: simply add all files at once, which we can again reason about.
Implementation is quite easy, because now we don't have to do any fancy things; we can simply use the ZIP handler (I use DotNetZip/Ionic) as-is:
static void Main()
{
    try { File.Delete(@"c:\tmp\test.zip"); }
    catch { }

    var sw = Stopwatch.StartNew();

    using (var zip = new ZipFile(@"c:\tmp\test.zip"))
    {
        zip.UseZip64WhenSaving = Zip64Option.Always;
        zip.CompressionLevel = Ionic.Zlib.CompressionLevel.None;

        for (int i = 0; i < 200000; ++i)
        {
            string filename = "foo" + i.ToString() + ".txt";
            byte[] contents = Encoding.UTF8.GetBytes("Hello world!");
            zip.AddEntry(filename, contents);
        }

        zip.Save();
    }

    Console.WriteLine("Elapsed: {0:0.00}s", sw.Elapsed.TotalSeconds);
    Console.ReadLine();
}
Whop; that finishes in 4.5 seconds. Much better.
I can see that you just want to group the 200,000 files into one big single file, without compression (like a tar archive).
Two ideas to explore:
Experiment with file formats other than ZIP, as ZIP may not be the fastest. Tar (tape archive) comes to mind (with natural speed advantages due to its simplicity); it even has an append mode, which is exactly what you are after to ensure O(1) operations. SharpCompress is a library that will allow you to work with this format (and others).
If you have control over your remote server, you could implement your own file format. The simplest I can think of would be to zip each new file separately (to store the file metadata such as name, date, etc. in the file content itself), and then to append each such zipped file to a single raw-bytes container file. You would just need to store the byte offsets (for example comma-separated in another txt file) to allow the remote server to split the huge file back into the 200,000 zipped files and then unzip each of them to get the metadata. I guess this is also roughly what tar does behind the scenes :).
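A minimal sketch of that second idea, with made-up container and index file names; each already-zipped payload is appended to the container and its offset and length are recorded so the receiving side can split it again:
using System.IO;

// Appends one already-zipped file to a single container file and records the offset
// and length in a side-car index (name,offset,length per line).
static void AppendToContainer(string containerPath, string indexPath, string zippedFilePath)
{
    byte[] payload = File.ReadAllBytes(zippedFilePath);
    using (var container = new FileStream(containerPath, FileMode.Append, FileAccess.Write))
    {
        long offset = container.Position; // append mode: position starts at end of file
        container.Write(payload, 0, payload.Length);
        File.AppendAllText(indexPath,
            $"{Path.GetFileName(zippedFilePath)},{offset},{payload.Length}\n");
    }
}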
Have you tried zipping to a MemoryStream rather than to a file, only flushing to a file when you are done for the day? Of course for back-up purposes your WCF service would have to keep a copy of the received individual files until you are sure they have been "committed" to the remote server.
If you do need compression, 7-Zip (and fiddling with the options) is well worth a try.
You are opening the zip file repeatedly; why not loop through and add all the files to one zip, then save it once?
var listAes = Directory.EnumerateFiles(myFolder, "*.*", SearchOption.AllDirectories)
                       .Where(s => s.EndsWith(".aes"))
                       .Select(f => new FileInfo(f));

using (var zip = ZipFile.Read(nameOfExistingZip))
{
    zip.CompressionLevel = Ionic.Zlib.CompressionLevel.None;
    foreach (var additionFile in listAes)
    {
        zip.AddFile(additionFile.FullName);
    }
    zip.Save();
}
If the files aren't all available right away, you could at least batch them together. So if you're expecting 200k files but have only received 10 so far, don't open the zip, add one, and close it; wait for a few more to come in and add them in batches.
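As a sketch of that batching idea, using the same DotNetZip calls as above; the batch size of 500 and the class shape are arbitrary, and it assumes the target zip already exists (as in the question):
using System.Collections.Generic;
using Ionic.Zip;

// Touches the zip once per batch of files instead of once per file.
class ZipBatcher
{
    private readonly string _zipPath;
    private readonly List<string> _buffer = new List<string>();
    private const int BatchSize = 500;

    public ZipBatcher(string zipPath) { _zipPath = zipPath; }

    public void Add(string filePath)
    {
        _buffer.Add(filePath);
        if (_buffer.Count >= BatchSize)
            Flush();
    }

    public void Flush()
    {
        if (_buffer.Count == 0) return;
        using (var zip = ZipFile.Read(_zipPath))
        {
            zip.CompressionLevel = Ionic.Zlib.CompressionLevel.None;
            foreach (var file in _buffer)
                zip.AddFile(file);
            zip.Save(); // one central-directory rewrite per batch
        }
        _buffer.Clear();
    }
}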
If you are OK with the performance of 100 × 20,000 files, can't you simply partition your large ZIP into 100 "small" ZIP files? For simplicity, create a new ZIP file every minute and put a time stamp in the name.
You can zip all the files using the .NET TPL (Task Parallel Library) like this:
// Fragment from the linked article: CompressStreamP, listOfMemStream, eventSignal,
// tasks and sliceBytes are defined there.
while (0 != (read = sourceStream.Read(bufferRead, 0, sliceBytes)))
{
    tasks[taskCounter] = Task.Factory.StartNew(() =>
        CompressStreamP(bufferRead, read, taskCounter, ref listOfMemStream, eventSignal));
    eventSignal.WaitOne(-1);
    taskCounter++;
    bufferRead = new byte[sliceBytes];
}
Task.WaitAll(tasks);
There is a compiled library and source code here:
http://www.codeproject.com/Articles/49264/Parallel-fast-compression-unleashing-the-power-of

Get attributes of all files under a directory while accessing the directory only

I'm trying to write a function in C# that gets a directory path as a parameter and returns a dictionary where the keys are the files directly under that directory and the values are their last modification times.
This is easy to do with Directory.GetFiles() and then File.GetLastWriteTime(). However, this means that every file must be accessed, which is too slow for my needs.
Is there a way to do this while accessing just the directory? Does the file system even support this kind of query?
Edit, after reading some answers:
Thank you guys, you are all saying pretty much the same thing: use the FileInfo object. Still, it is just as slow to use Directory.GetFiles() (or Directory.EnumerateFiles()) to get those objects, and I suspect that getting them requires access to every file. If the file system keeps the last modification time of its files only in the files themselves, there can't be a way to extract that info without file access. Is this the case here? Do GetFiles() and EnumerateFiles() of DirectoryInfo access every file, or do they get their info from the directory entry? I know that if I wanted just the file names, I could do this with the Directory class without accessing every file. But getting attributes seems trickier...
Edit, following Henk's response:
It seems that it really is faster to use the FileInfo object. I created the following test:
static void Main(string[] args)
{
    Console.WriteLine(DateTime.Now);

    foreach (string file in Directory.GetFiles(@"\\169.254.78.161\dir"))
    {
        DateTime x = File.GetLastWriteTime(file);
    }

    Console.WriteLine(DateTime.Now);

    DirectoryInfo dirInfo2 = new DirectoryInfo(@"\\169.254.78.161\dir");
    var files2 = from f in dirInfo2.EnumerateFiles()
                 select f;
    foreach (FileInfo file in files2)
    {
        DateTime x = file.LastWriteTime;
    }

    Console.WriteLine(DateTime.Now);
}
For about 800 files, I usually get something like:
31/08/2011 17:14:48
31/08/2011 17:14:51
31/08/2011 17:14:52
I didn't do any timings but your best bet is:
DirectoryInfo di = new DirectoryInfo(myPath);
FileInfo[] files = di.GetFiles();
I think all the FileInfo attributes are available in the directory's file records, so this should (could) require the minimum I/O.
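For the original question (a dictionary of file name to last write time), a minimal sketch along those lines; it assumes you only want files directly under the directory, and relies on the FileInfo objects returned by the enumeration already carrying the timestamp (as the timing test above suggests), so no extra File.GetLastWriteTime call is made per file:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static Dictionary<string, DateTime> GetLastWriteTimes(string directory)
{
    // One directory enumeration; LastWriteTime comes from the returned FileInfo objects.
    return new DirectoryInfo(directory)
        .EnumerateFiles()
        .ToDictionary(f => f.Name, f => f.LastWriteTime);
}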
The only other thing I can think of is using the FileInfo class. As far as I can see, this might help you, or it might end up reading each file as well (read permissions are required).

C# - Waiting for a copy operation to complete

I have a program that runs as a Windows Service which is processing files in a specific folder.
Since it's a service, it constantly monitors a folder for new files that have been added. Part of the program's job is to perform comparisons of files in the target folder and flag non-matching files.
What I would like to do is detect a running copy operation and when it has completed, so that a file does not get prematurely flagged if its matching file has not been copied over to the target folder yet.
What I was thinking of doing was using the FileSystemWatcher to watch the target folder and see if a copy operation is occurring. If there is, I put my program's main thread to sleep until the copy operation has completed, then proceed to perform the operation on the folder like normal.
I just wanted to get some insight on this approach and see if it is valid. If anyone else has any other unique approaches to this problem, it would be greatly appreciated.
UPDATE:
I apologize for the confusion. When I say target directory, I mean the source folder containing all the files I want to process. Part of the function of my program is to copy the directory structure of the source directory to a destination directory and copy all valid files to that destination directory, preserving the directory structure of the original source directory; i.e. a user may copy folders containing files into the source directory. I want to prevent errors by ensuring that if a new set of folders containing more subfolders and files is copied to the source directory for processing, my program will not start operating on it until the copy process has completed.
Yup, use a FileSystemWatcher, but instead of watching for the Created event, watch for the Changed event. After every trigger, try to open the file. Something like this:
var watcher = new FileSystemWatcher(path, filter);
watcher.Changed += (sender, e) =>
{
    FileStream file = null;
    try
    {
        Thread.Sleep(100); // hack for timing issues
        file = File.Open(
            e.FullPath,
            FileMode.Open,
            FileAccess.Read,
            FileShare.Read);
    }
    catch (IOException)
    {
        // We couldn't open the file.
        // This is probably because the copy operation is not done yet,
        // so just swallow the exception and wait for the next Changed event.
        return;
    }

    // Now we have a handle to the file.
};
This is about the best that you can do, unfortunately. There is no clean way to know that the file is ready for you to use.
What you are looking for is a typical producer/consumer scenario. What you need to do is outlined in the 'Producer/consumer queue' section on this page. This will allow you to use multi-threading (maybe spawn a BackgroundWorker) to copy files, so you don't block the main service thread from listening to system events and can perform more meaningful tasks there, like checking for new files and updating the queue. So on the main thread check for new files, and on background threads perform the actual copying. From personal experience (having implemented this task) there is not too much performance gain from this approach unless you are running on a multi-CPU machine, but the process is very clean and smooth, and the code is logically separated nicely.
In short, what you have to do is have an object like the following:
public class File
{
    public string FullPath { get; internal set; }
    public bool CopyInProgress { get; set; } // flag so the file isn't processed while still being copied
    // .. other properties if desired
}
Then, following the tutorial posted above, take a lock on the File object and on the queue when updating and copying. Using this approach, you avoid constantly monitoring for file-copy completion.
The important point to realize here is that your service has only one instance of the File object per actual physical file; just make sure you (1) lock your queue when adding and removing and (2) lock the actual File object when initiating an update.
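As a rough sketch of that queue, using BlockingCollection from System.Collections.Concurrent instead of the hand-rolled locked queue in the linked tutorial; the class is named FileItem here only to avoid colliding with System.IO.File, and FileQueueService is a made-up wrapper for the example:
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class FileItem
{
    public string FullPath { get; set; }
    public bool CopyInProgress { get; set; }
}

public class FileQueueService
{
    private readonly BlockingCollection<FileItem> _queue = new BlockingCollection<FileItem>();

    // Producer side: called from the watcher/polling thread.
    public void Enqueue(string fullPath) =>
        _queue.Add(new FileItem { FullPath = fullPath, CopyInProgress = true });

    // Consumer side: one background task drains the queue and does the copying.
    public Task StartConsumer() => Task.Run(() =>
    {
        foreach (var item in _queue.GetConsumingEnumerable())
        {
            // ... copy item.FullPath here, then mark it done
            lock (item) { item.CopyInProgress = false; }
        }
    });
}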
EDIT: Above, where I say "there is not too much performance gain from this approach unless...", I am referring to doing this in a single thread. Compared to Jason's suggestion, this approach should be noticeably faster, because Jason's solution performs very expensive IO operations which will fail in most cases. I haven't tested this, but I'm quite sure, as my approach does not require repeated IO operations: open (once only), stream (once only) and close the file (once only). Jason's approach implies multiple open, open, open, open operations which will all fail except the last one.
One approach is to attempt to open the file and see if you get an error. The file will be locked if it is being copied. This opens the file in shared mode, so it will conflict with a write lock already held on the file:
using (System.IO.File.Open("file", FileMode.Open, FileAccess.Read, FileShare.Read)) { }
Another is to check the file size. It will keep changing over time if the file is still being copied to.
It is also possible to get a list of all applications that have opened a certain file, but I don't know the API for this.
I know this is an old question, but here's an answer I spun up after searching for a solution to just this problem. This had to be tweaked a lot to remove some of the proprietary bits from what I was working on, so it may not compile directly, but it will give you an idea. It is working great for me:
void BlockingFileCopySync(FileInfo original, FileInfo copyPath)
{
    FileSystemWatcher watcher = new FileSystemWatcher();
    watcher.NotifyFilter = NotifyFilters.LastWrite;
    watcher.Path = copyPath.Directory.FullName;
    watcher.Filter = "*" + copyPath.Extension;
    watcher.EnableRaisingEvents = true;

    bool fileReady = false;
    bool firsttime = true;
    DateTime previousLastWriteTime = new DateTime();

    // modify this as you think you need to...
    int waitTimeMs = 100;

    watcher.Changed += (sender, e) =>
    {
        // Get the time the file was modified,
        // check it again in 100 ms,
        // and when it has gone a while without modification, it's done.
        while (!fileReady)
        {
            // We need to initialize for the "first time",
            // i.e. when the file was just created.
            // (Really, this could probably be initialized off the
            // time of the copy, now that I'm thinking of it.)
            if (firsttime)
            {
                previousLastWriteTime = System.IO.File.GetLastWriteTime(copyPath.FullName);
                firsttime = false;
                System.Threading.Thread.Sleep(waitTimeMs);
                continue;
            }

            DateTime currentLastWriteTime = System.IO.File.GetLastWriteTime(copyPath.FullName);
            bool fileModified = (currentLastWriteTime != previousLastWriteTime);

            if (fileModified)
            {
                previousLastWriteTime = currentLastWriteTime;
                System.Threading.Thread.Sleep(waitTimeMs);
                continue;
            }
            else
            {
                fileReady = true;
                break;
            }
        }
    };

    System.IO.File.Copy(original.FullName, copyPath.FullName, true);

    // This guy here chills out until the FileSystemWatcher
    // tells him the file isn't being written to anymore.
    while (!fileReady)
    {
        System.Threading.Thread.Sleep(waitTimeMs);
    }
}

Wait for a TXT file to be readable - C#

My application uses FileSystemWatcher() to raise an event when a TXT file is created by an "X" application, and then reads its content.
The "X" application creates the file (my application detects it successfully), but it takes some time to fill it with data, so the txt file cannot be read at creation time. I am looking for a way to wait until the txt file becomes available for reading; not a static delay, but something tied to that file.
Any help? Thanks.
Create the file like this:
myfile.tmp
Then when it's finished, rename it to
myfile.txt
and have your FileSystemWatcher watch for the .txt extension.
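A minimal sketch of that idea, assuming you control the writing application; the paths are made up for the example:
using System.IO;

// Writer side: write to a .tmp name, then rename when done,
// so the watcher only ever sees finished .txt files.
string tmpPath = @"C:\out\myfile.tmp";
string finalPath = @"C:\out\myfile.txt";
File.WriteAllText(tmpPath, "the data");
File.Move(tmpPath, finalPath);

// Reader side: only .txt files raise events, and they are complete by then.
var watcher = new FileSystemWatcher(@"C:\out", "*.txt");
watcher.Created += (s, e) => { /* safe to read e.FullPath here */ };
watcher.EnableRaisingEvents = true;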
The only way I have found to do this is to put the attempt to read the file in a loop, and exit the loop when I don't get an exception. Hopefully someone else will come up with a better way...
bool FileRead = false;
while (!FileRead)
{
    try
    {
        // code to read the file, which you already know
        FileRead = true;
    }
    catch (Exception)
    {
        // do nothing, or optionally sleep for a second or two before retrying
    }
}
You could track the file's Changed event, and see if it's available for opening on change. If the file is still locked, just watch for the next change event.
You can open and read a locked file like this:
using (var stream = new FileStream(@"c:\temp\file.txt", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
    using (var file = new StreamReader(stream))
    {
        while (!file.EndOfStream)
        {
            var line = file.ReadLine();
            Console.WriteLine(line);
        }
    }
}
However, make sure your file writer flushes, otherwise you may not see any changes.
Application X should lock the file until it closes it. Is application X also a .NET application, and can you modify it? In that case you can simply use the FileInfo class with the proper value for FileShare (in this case FileShare.Read).
If you have no control over application X, the situation becomes a little more complex. But then you can always attempt to open the file exclusively via the same FileInfo.Open method. Provide FileShare.None in that case; it will attempt to open the file exclusively and will fail if the file is still in use. You can perform this action inside a loop until the file is closed by application X and is ready to be read.
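A minimal sketch of that loop; the 500 ms retry interval and 60-attempt cap are arbitrary choices for the example:
using System.IO;
using System.Threading;

// Polls until the file can be opened exclusively, i.e. the writer has closed it.
static bool WaitUntilReleased(string path)
{
    for (int attempt = 0; attempt < 60; attempt++)
    {
        try
        {
            using (new FileInfo(path).Open(FileMode.Open, FileAccess.Read, FileShare.None))
                return true;   // exclusive access obtained, the writer is done
        }
        catch (IOException)
        {
            Thread.Sleep(500); // still locked by the writer, try again shortly
        }
    }
    return false;
}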
We have a virtual printer for creating PDF documents, and I do something like this to access the document after it has been sent to the printer:
using (FileSystemWatcher watcher = new FileSystemWatcher(folder))
{
    if (!File.Exists(docname))
        for (int i = 1; i <= 3; i++)
            watcher.WaitForChanged(WatcherChangeTypes.Created, i * 1000);
}
So I wait for a total of 6 seconds (some documents can take a while to print, but most come very fast, hence the increasing wait time) before deciding that something has gone awry.
After this, I also read in a for loop, in just the same way that I wait for the file to be created. I do this just in case the document has been created but not yet released by the printer, which happens nearly every time.
You can use the same class to be notified when the file changes.
The Changed event is raised when changes are made to the size, system attributes, last write time, last access time, or security permissions of a file or directory in the directory being monitored.
So I think you can use that event to check whether the file is readable, and open it when it is.
If you have a DB at your disposal, I would recommend using a DB table as a queue with the file names and then monitoring that instead. Nice and transactional.
You can check whether the file's size has changed, although this will require you to poll its value with some frequency.
Also, if you want to get the data sooner, the writer can call .Flush() while writing and should .Close() the stream as soon as it finishes writing.
