I created a service that moves certain file types from one directory to another. This works fine locally and is reasonably fast over my own network. On a different network, though, it is incredibly slow (a 500 MB file takes 6.5 minutes), yet the same file copied and pasted into the folder via Explorer completes in about 30-40 seconds.
Here is the snippet where the file move happens:
currentlyProcessing.Add(currentFile.FullName);
try
{
    eventMsg("Attempting to move file", "DEBUG");
    File.Move(oldFilePath, newFilePath);
    eventMsg("File Moved successfully", "DEBUG");
}
catch (Exception ex)
{
    eventMsg("Cannot Move File another resource is using it", "DEBUG");
    eventMsg("Move File Exception : " + ex, "DEBUG");
}
finally
{
    if (File.Exists(currentFile.FullName + ".RLK"))
    {
        try
        {
            File.Delete(currentFile.FullName + ".RLK");
        }
        catch (IOException e)
        {
            eventMsg("File Exception : " + e, "DEBUG");
        }
    }
    currentlyProcessing.Remove(oldFilePath);
}
I fear the code is fine (it works as expected on the other network), so the problem is probably the network in some shape or form. Has anyone got any common things to check? The service runs as Local System (or Network Service) and there doesn't seem to be an access problem. What other factors (other than network/hardware) could affect this?
What I would like is a transfer speed similar to what I've seen in Explorer. Any pointers greatly appreciated.
First of all, ping one machine from the other to check latency, so you can see whether the problem is the network in general or something in your software:
ping 192.168.1.12 -n 10
If the problem is on the network, check the following:
Restart your hub/router.
Is one of the PCs using WiFi with a weak signal?
Is there an antivirus running that is monitoring network activity?
If none of the above solves your problem, then try using Wireshark to investigate the issue further.
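If you want the service itself to log latency rather than running ping by hand, a minimal sketch using the System.Net.NetworkInformation.Ping class (the address below is just a placeholder) could look like this:
using System;
using System.Net.NetworkInformation;

class LatencyCheck
{
    static void Main()
    {
        // Placeholder address - substitute the machine that hosts the target share.
        const string host = "192.168.1.12";

        using (Ping ping = new Ping())
        {
            for (int i = 0; i < 10; i++)
            {
                PingReply reply = ping.Send(host, 1000); // 1 second timeout
                Console.WriteLine(reply.Status == IPStatus.Success
                    ? "Reply from " + host + ": " + reply.RoundtripTime + " ms"
                    : "Ping failed: " + reply.Status);
            }
        }
    }
}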
For files of that size I would suggest zipping them up if possible, especially since network latency is always a big factor in uploading/downloading files over a network.
You can zip your files with System.IO.Packaging. Have a look at "Using System.IO.Packaging to generate a ZIP file", specifically:
using (Package zip = System.IO.Packaging.Package.Open(zipFilename, FileMode.OpenOrCreate))
{
    string destFilename = ".\\" + Path.GetFileName(fileToAdd);
    Uri uri = PackUriHelper.CreatePartUri(new Uri(destFilename, UriKind.Relative));
    if (zip.PartExists(uri))
    {
        zip.DeletePart(uri);
    }
    PackagePart part = zip.CreatePart(uri, System.Net.Mime.MediaTypeNames.Application.Zip, CompressionOption.Normal);
    using (FileStream fileStream = new FileStream(fileToAdd, FileMode.Open, FileAccess.Read))
    {
        using (Stream dest = part.GetStream())
        {
            CopyStream(fileStream, dest);
        }
    }
}
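CopyStream isn't defined in that snippet; a minimal helper along these lines (just a straightforward buffered copy, my own sketch rather than the linked answer's exact code) would do - on .NET 4 and later you could use Stream.CopyTo instead:
private static void CopyStream(Stream source, Stream target)
{
    byte[] buffer = new byte[4096]; // 4 KB buffer
    int bytesRead;
    while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
    {
        target.Write(buffer, 0, bytesRead);
    }
}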
Also, you could use FTP (or SFTP), as another user mentioned, if that's an option. Look at the Renci SSH.NET library on CodePlex.
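For example, a minimal SFTP upload sketch with SSH.NET (note it speaks SFTP rather than plain FTP; the host, credentials, and paths below are placeholders) might look like:
using System.IO;
using Renci.SshNet;

class SftpUploadExample
{
    static void UploadFile(string localPath, string remotePath)
    {
        // Placeholder connection details - replace with your own.
        using (var sftp = new SftpClient("sftp.example.com", "username", "password"))
        {
            sftp.Connect();
            using (FileStream fileStream = File.OpenRead(localPath))
            {
                sftp.UploadFile(fileStream, remotePath);
            }
            sftp.Disconnect();
        }
    }
}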
Related
It was clearly stated here that File.Move is an atomic operation: Atomicity of File.Move.
But the following code snippet appears to show the same file being moved successfully multiple times.
Does anyone know what is wrong with this code?
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

namespace FileMoveTest
{
    class Program
    {
        static void Main(string[] args)
        {
            string path = "test/" + Guid.NewGuid().ToString();
            CreateFile(path, new string('a', 10 * 1024 * 1024));

            var tasks = new List<Task>();
            for (int i = 0; i < 10; i++)
            {
                var task = Task.Factory.StartNew(() =>
                {
                    try
                    {
                        string newPath = path + "." + Guid.NewGuid();
                        File.Move(path, newPath);
                        // this line does NOT solve the issue
                        if (File.Exists(newPath))
                            Console.WriteLine(string.Format("Moved {0} -> {1}", path, newPath));
                    }
                    catch (Exception e)
                    {
                        Console.WriteLine(string.Format(" {0}: {1}", e.GetType(), e.Message));
                    }
                });
                tasks.Add(task);
            }
            Task.WaitAll(tasks.ToArray());
        }

        static void CreateFile(string path, string content)
        {
            string dir = Path.GetDirectoryName(path);
            if (!Directory.Exists(dir))
            {
                Directory.CreateDirectory(dir);
            }
            using (FileStream f = new FileStream(path, FileMode.OpenOrCreate))
            {
                using (StreamWriter w = new StreamWriter(f))
                {
                    w.Write(content);
                }
            }
        }
    }
}
The paradoxical output is below. It seems the file was moved multiple times to different locations, yet only one of them is present on disk. Any thoughts?
Moved test/eb85560d-8c13-41c1-926a-6871be030742 -> test/eb85560d-8c13-41c1-926a-6871be030742.0018d317-ed7c-4732-92ac-3bb974d29017
Moved test/eb85560d-8c13-41c1-926a-6871be030742 -> test/eb85560d-8c13-41c1-926a-6871be030742.3965dc15-7ef9-4f36-bdb7-94a5939b17db
Moved test/eb85560d-8c13-41c1-926a-6871be030742 -> test/eb85560d-8c13-41c1-926a-6871be030742.fb66306a-5a13-4f26-ade2-acff3fb896be
Moved test/eb85560d-8c13-41c1-926a-6871be030742 -> test/eb85560d-8c13-41c1-926a-6871be030742.c6de8827-aa46-48c1-b036-ad4bf79eb8a9
System.IO.FileNotFoundException: Could not find file 'C:\file-move-test\test\eb85560d-8c13-41c1-926a-6871be030742'.
System.IO.FileNotFoundException: Could not find file 'C:\file-move-test\test\eb85560d-8c13-41c1-926a-6871be030742'.
System.IO.FileNotFoundException: Could not find file 'C:\file-move-test\test\eb85560d-8c13-41c1-926a-6871be030742'.
System.IO.FileNotFoundException: Could not find file 'C:\file-move-test\test\eb85560d-8c13-41c1-926a-6871be030742'.
System.IO.FileNotFoundException: Could not find file 'C:\file-move-test\test\eb85560d-8c13-41c1-926a-6871be030742'.
System.IO.FileNotFoundException: Could not find file 'C:\file-move-test\test\eb85560d-8c13-41c1-926a-6871be030742'.
The resulting file is here: eb85560d-8c13-41c1-926a-6871be030742.fb66306a-5a13-4f26-ade2-acff3fb896be
UPDATE: I can confirm that checking File.Exists also does NOT solve the issue - it can still report that a single file was moved to several different locations.
SOLUTION: The solution I ended up with is the following: prior to operating on the source file, create a special "lock" file. If that succeeds, we can be sure that only this thread has exclusive access to the file and it is safe to do whatever we want. Below is the right set of parameters to create such a "lock" file.
File.Open(lockPath, FileMode.CreateNew, FileAccess.Write);
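Put together, a minimal sketch of that pattern (the ".lock" naming convention and the cleanup policy are my own choices, not from the original post) could look like this - the thread that wins the FileMode.CreateNew race is the only one that performs the move:
using System;
using System.IO;

static class ExclusiveMove
{
    // Returns true if this thread won the race and performed the move.
    static bool TryMoveExclusive(string path, string newPath)
    {
        string lockPath = path + ".lock"; // hypothetical lock-file naming convention
        try
        {
            // FileMode.CreateNew fails for every thread except the first one.
            using (File.Open(lockPath, FileMode.CreateNew, FileAccess.Write))
            {
                File.Move(path, newPath);
                return true;
            }
        }
        catch (IOException)
        {
            // Another thread already owns the lock, or the move itself failed.
            return false;
        }
        finally
        {
            // Best-effort cleanup; a delete attempt on a lock still held elsewhere simply fails.
            try { if (File.Exists(lockPath)) File.Delete(lockPath); } catch (IOException) { }
        }
    }
}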
Does anyone know what is wrong with this code?
I guess that depends on what you mean by "wrong".
The behavior you're seeing is not IMHO unexpected, at least if you're using NTFS (other file systems may or may not behave similarly).
The documentation for the underlying OS API (MoveFile() and MoveFileEx() functions) is not specific, but in general the APIs are thread-safe, in that they guarantee the file system will not be corrupted by concurrent operations (of course, your own data could be corrupted, but it will be done in a file-system-coherent way).
Most likely what is occurring is that as the move-file operation proceeds, it does so by first getting the actual file handle from the given directory link to it (in NTFS, all "file names" that you see are actually hard links to an underlying file object). Having obtained that file handle, the API then creates a new file name for the underlying file object (i.e. as a hard link), and then deletes the previous hard link.
Of course, as this progresses, there is a window during the time between a thread having obtained the underlying file handle but before the original hard link has been deleted. This allows some but not all of the other concurrent move operations to appear to succeed. I.e. eventually the original hard link doesn't exist and further attempts to move it won't succeed.
No doubt the above is an oversimplification. File system behaviors can be complex. In particular, your stated observation is that you only wind up with a single instance of the file when all is said and done. This suggests that the API does also somehow coordinate the various operations, such that only one of the newly-created hard links survives, probably by virtue of the API actually just renaming the associated hard link after retrieving the file object handle, as opposed to creating a new one and deleting the old one (implementation detail).
At the end of the day, what's "wrong" with the code is that it is intentionally attempting to perform concurrent operations on a single file. While the file system itself will ensure that it remains coherent, it's up to your own code to ensure that such operations are coordinated so that the results are predictable and reliable.
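For example, when all the competing threads live in the same process, the coordination can be as simple as claiming each source path once under a lock. This is a sketch of one option, not the only approach; cross-process scenarios need something like the asker's lock-file solution above:
using System.Collections.Generic;
using System.IO;

static class SingleMoveCoordinator
{
    static readonly object moveGate = new object();
    static readonly HashSet<string> claimedPaths = new HashSet<string>();

    // Returns true only for the one caller that gets to move the file.
    public static bool TryMoveOnce(string path, string newPath)
    {
        lock (moveGate)
        {
            // The entry is kept after the move so the same source path
            // is never claimed (and therefore never moved) twice.
            if (!claimedPaths.Add(path))
                return false;
        }
        File.Move(path, newPath);
        return true;
    }
}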
I'm currently building an application that is, among other things, going to download large files from a FTP server. Everything works fine for small files (< 50 MB) but the files I'm downloading are way bigger, mainly over 2 GB.
I've been trying with a WebClient using DownloadFileAsync() and a list-based queue, as I'm downloading these files one after the other due to their size.
DownloadClient.DownloadProgressChanged += new DownloadProgressChangedEventHandler(DownloadProgress);
DownloadClient.DownloadFileCompleted += new AsyncCompletedEventHandler(DownloadCompleted);

private void FileDownload()
{
    DownloadClient.DownloadFileAsync(new Uri(@"ftp://" + RemoteAddress + FilesToDownload[0]), LocalDirectory + FilesToDownload[0]);
}

private void DownloadProgress(object sender, DownloadProgressChangedEventArgs e)
{
    // Handle progress
}

private void DownloadCompleted(object sender, AsyncCompletedEventArgs e)
{
    FilesToDownload.RemoveAt(0);
    FileDownload();
}
It works absolutely fine this way for small files: they are all downloaded one by one, the progress is reported, and DownloadCompleted fires after each file. The issue I'm facing with big files is that it launches the first download without any problem but doesn't do anything after that. The DownloadCompleted event never fires, for some reason. It looks like the WebClient doesn't know that the file has finished downloading, which is an issue as I'm using this event to launch the next download in the FilesToDownload list.
I've also tried to do this synchronously using WebClient.DownloadFile and a for loop to cycle through my FilesToDownload list. It downloads the first file correctly, then I get an exception when the second download should start: "The underlying connection was closed: An unexpected error occurred on a receive".
Finally, I've tried to go through this via FTP using edtFTPnet, but I'm facing download speed issues (i.e. my download goes at full speed with the WebClient but only about a third of that speed with the edtFTPnet library).
Any thoughts? I have to admit that I'm running out of ideas here.
public string GetRequest(Uri uri, int timeoutMilliseconds)
{
    var request = System.Net.WebRequest.Create(uri);
    request.Timeout = timeoutMilliseconds;
    using (var response = request.GetResponse())
    using (var stream = response.GetResponseStream())
    using (var reader = new System.IO.StreamReader(stream))
    {
        return reader.ReadToEnd();
    }
}
Forgot to update this thread, but I figured out how to sort this out a while ago.
The issue was that the data connection opened for a file transfer randomly times out for some reason, or is closed by the server before the transfer ends. I haven't been able to figure out why, however, as there are a load of local and external network interfaces between my computer and the remote server. As it's totally random (i.e. the transfer works fine for five files in a row, times out for one file, works fine for the following files, etc.), the issue may be server- or network-related.
I'm now catching any FTP exception raised by the FTP client object during the download and issuing a REST command with an offset equal to the position in the data stream where the transfer stopped (i.e. the number of bytes already downloaded to the local file). Doing so retrieves the remaining bytes that are missing from the local file.
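For reference, with the built-in FtpWebRequest the same resume-from-offset idea can be expressed via the ContentOffset property, which is sent to the server as a REST command. A minimal sketch (URI, credentials, and buffer size are placeholders, and it assumes a partial local file already exists):
using System.IO;
using System.Net;

static class FtpResume
{
    static void ResumeDownload(string ftpUri, string localPath, string user, string password)
    {
        // Resume from the length of the partially downloaded local file.
        long alreadyDownloaded = new FileInfo(localPath).Length;

        FtpWebRequest request = (FtpWebRequest)WebRequest.Create(ftpUri);
        request.Method = WebRequestMethods.Ftp.DownloadFile;
        request.Credentials = new NetworkCredential(user, password);
        request.ContentOffset = alreadyDownloaded; // translated into a REST command

        using (FtpWebResponse response = (FtpWebResponse)request.GetResponse())
        using (Stream ftpStream = response.GetResponseStream())
        using (FileStream local = new FileStream(localPath, FileMode.Append, FileAccess.Write))
        {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = ftpStream.Read(buffer, 0, buffer.Length)) > 0)
            {
                local.Write(buffer, 0, read);
            }
        }
    }
}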
I tried to use FileInfo.CreationTime, but it doesn't represent the time the copy finished.
I am trying to get a list of files in a directory. The problem is that the call also returns files which are not yet finished copying.
If I try to use one of those files, I get an error stating that the file is in use.
How can I query only for files that are fully copied?
As the code below shows, Directory.GetFiles() also returns files which are not yet finished copying.
My test file size is over 200 MB.
if (String.IsNullOrEmpty(strDirectoryPath))
{
    txtResultPrint.AppendText("ERROR : Wrong Directory Name! ");
}
else
{
    string[] newFiles = Directory.GetFiles(strDirectoryPath, "*.epk");
    _epkList.PushNewFileList(newFiles);
    if (_epkList.IsNewFileAdded())
    {
        foreach (var fileName in _epkList.GetNewlyAddedFile())
        {
            txtResultPrint.AppendText(DateTime.Now.Hour + ":" + DateTime.Now.Minute + ":" + DateTime.Now.Second + " => ");
            txtResultPrint.AppendText(fileName + Environment.NewLine);
            this.Visible = true;
            notifyIconMain.Visible = true;
        }
    }
}
If performance and best-practices aren't huge concerns then you could simply wrap the failing file operation in an inner-scoped try/catch.
using System.IO;

string[] files = Directory.GetFiles("pathToFiles");
foreach (string file in files)
{
    FileStream fs = null;
    try
    {
        // try to open the file for exclusive access
        fs = new FileStream(
            file,
            FileMode.Open,
            FileAccess.Read, // we might not have Read/Write privileges
            FileShare.None   // request exclusive (non-shared) access
        );
    }
    catch (IOException ioe)
    {
        // File is in use by another process, or doesn't exist
    }
    finally
    {
        if (fs != null)
            fs.Close();
    }
}
This isn't really the best design advice as you shouldn't be relying on exception handling for this sort of thing, but if you're in a pinch and it's not code for a client or for your boss then this should work alright until a better solution is suggested or found.
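If you need to wait until the copy has finished rather than just skip busy files, a small polling helper along these lines is a common workaround (the retry count and delay parameters are arbitrary choices of mine, not part of the answer above):
using System.IO;
using System.Threading;

static class FileReadiness
{
    // Returns true once the file can be opened exclusively, false after maxAttempts.
    public static bool WaitUntilReady(string path, int maxAttempts, int delayMilliseconds)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            try
            {
                using (new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None))
                {
                    return true; // exclusive open succeeded - the copy has finished
                }
            }
            catch (IOException)
            {
                Thread.Sleep(delayMilliseconds); // still being written - wait and retry
            }
        }
        return false;
    }
}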
Do you have the ability to change the copying itself?
If yes (and if you can guarantee that your program will always execute on NTFS on Windows Vista or newer), you can use Transactional NTFS to wrap the copy in a single transaction. File(s) being copied will only become visible to the rest of the world after you commit the transaction, so you'll never even see the partially copied files.
Unfortunately, Transactional NTFS is not accessible directly from the .NET Framework - you'll need to P/Invoke into Win32 API functions such as CreateTransaction, CommitTransaction, RollbackTransaction, and CopyFileTransacted (and the other *Transacted functions).
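A rough P/Invoke sketch of that approach (signatures simplified, RollbackTransaction and most error handling omitted; note that Microsoft has since deprecated TxF, so treat this as a sketch rather than a recommendation):
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

static class TransactedCopy
{
    [DllImport("KtmW32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern IntPtr CreateTransaction(IntPtr securityAttributes, IntPtr uow,
        uint createOptions, uint isolationLevel, uint isolationFlags, uint timeout, string description);

    [DllImport("KtmW32.dll", SetLastError = true)]
    static extern bool CommitTransaction(IntPtr transaction);

    [DllImport("Kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern bool CopyFileTransacted(string existingFileName, string newFileName,
        IntPtr progressRoutine, IntPtr data, IntPtr cancel, uint copyFlags, IntPtr transaction);

    [DllImport("Kernel32.dll", SetLastError = true)]
    static extern bool CloseHandle(IntPtr handle);

    public static void Copy(string source, string destination)
    {
        IntPtr tx = CreateTransaction(IntPtr.Zero, IntPtr.Zero, 0, 0, 0, 0, "file copy");
        if (tx == new IntPtr(-1)) // INVALID_HANDLE_VALUE
            throw new Win32Exception(Marshal.GetLastWin32Error());

        try
        {
            // The destination only becomes visible to other readers once the transaction commits.
            if (!CopyFileTransacted(source, destination, IntPtr.Zero, IntPtr.Zero, IntPtr.Zero, 0, tx))
                throw new Win32Exception(Marshal.GetLastWin32Error());

            if (!CommitTransaction(tx))
                throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        finally
        {
            CloseHandle(tx);
        }
    }
}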
I have a program that does different things; my question is related to accessing files on a network mapped drive or a shared folder.
The program can run an msi/exe file from the network (network mapped drive or shared folder).
The program can copy a file from the network (network mapped drive or shared folder).
How can I check that the files are accessible before I try to run or copy them (in case of a network disconnection or any other network problem)?
Is it enough to use File.Exists()?
Here is an example of my code:
public static bool FileIsOk(string path)
{
    try
    {
        FileInfo finfo = new FileInfo(path);
        if (finfo.Exists)
        {
            return true;
        }
        MessageBox.Show("file does not exist, or there is a problem with the network preventing access to the file!");
        return false;
    }
    catch (Exception e)
    {
        MessageBox.Show(e.Message);
    }
    return false;
}
thanks
File.Exists() should be fine, but if you start a large copy operation, there's not a lot you can do if the connection goes down during that process, so you'll want to make sure you code for that.
You should trap the IOException and handle it as you see fit.
EDIT: code to trap IOException:
try
{
    File.Copy(myLocalFile, myNetworkFile);
}
catch (IOException ioEx)
{
    Debug.Write(myLocalFile + " failed to copy! Try again or copy later?");
}
Don't. Just attempt the operation. It will fail just as fast, and you won't be introducing a timing-window problem. You have to cope with that failure anyway; why code it twice?
The best idea, of course, would be to create a local cache of the setup. You cannot trust network connections; they may slow down or break mid-operation. Running everything from the network is, I would say, definitely not a safe idea.
But as far as the technical question is concerned, File.Exists should be fine. A more descriptive way to check for the existence of a file has already been discussed; read here:
FileInfo fi = new FileInfo(@"\\server\share\file.txt");
bool exists = fi.Exists;
I am writing a simple web service using .NET. One method is used to send a chunk of a file from the client to the server; the server opens a temp file and appends the chunk. The files are quite large (80 MB). The network IO seems fine, but the append write to the local file slows down progressively as the file gets larger.
The following is the code that slows down, running on the server, where aFile is a string and aData is a byte[]:
using (StreamWriter lStream = new StreamWriter(aFile, true))
{
    BinaryWriter lWriter = new BinaryWriter(lStream.BaseStream);
    lWriter.Write(aData);
}
Debugging this process I can see that exiting the using statement is slower and slower.
If I run this code in a simple standalone test application, the writes take the same time every run, about 3 ms. Note the buffer (aData) is always the same size, about 0.5 MB.
I have tried all sorts of experiments with different writers and with system copies to append scratch files; all of them slow down when running under the web service.
Why is this happening? I suspect the web service is trying to cache access to local file system objects; how can I turn this off for specific files?
More information:
If I hard-code the path, the speed is fine, like so:
using (StreamWriter lStream = new StreamWriter("c:\\test.dat", true))
{
    BinaryWriter lWriter = new BinaryWriter(lStream.BaseStream);
    lWriter.Write(aData);
}
But then it is slow copying this scratch file to the final destination later on:
File.Copy("c:\\test.dat", aFile);
If I use any variable in the path, it gets slow again, for example:
using (StreamWriter lStream = new StreamWriter("c:\\test" + someVariable, true))
{
    BinaryWriter lWriter = new BinaryWriter(lStream.BaseStream);
    lWriter.Write(aData);
}
It has been commented that I should not use StreamWriter; note that I tried many ways to open the file using FileStream, none of which made any difference when the code runs under the web service. I also tried WriteThrough, etc.
It's the strangest thing. I even tried this:
Write the data to file a.dat
Spawn system "cmd" "copy /b b.dat + a.dat b.dat"
Delete a.dat
This slows down the same way????
This makes me think the web server is running in some protected file IO environment that intercepts all file operations in this process and its child processes. I could understand this if I were generating a file that might later be served to a client, but I am not; what I am doing is storing large binary blobs on disk, with an index/pointer to them stored in a database. If I comment out the write to the file, the whole process flies with no performance issues at all.
I started reading about web server caching strategies, which makes me wonder: is there a web.config setting to mark a folder as uncached? Or am I completely barking up the wrong tree?
A long shot: is it possible that you need to close some resources when you have finished?
If the file is binary, then why are you using a StreamWriter, which is derived from TextWriter? Just use a FileStream.
Also, BinaryWriter implements IDisposable; you need to put it into a using block.
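Putting both suggestions together, a minimal sketch of the append (reusing the question's aFile and aData names) might be:
// Append the binary chunk via a FileStream; both writer and stream get disposed.
using (FileStream stream = new FileStream(aFile, FileMode.Append, FileAccess.Write))
using (BinaryWriter writer = new BinaryWriter(stream))
{
    writer.Write(aData);
}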
Update: I replicated the basic code (no database, kept it simple) and it seems to work fine, so I suspect there is another reason. I will rest on it over the weekend.
Here is the replicated server code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.Services;
using System.IO;

namespace TestWS
{
    /// <summary>
    /// Summary description for Service1
    /// </summary>
    [WebService(Namespace = "http://tempuri.org/")]
    [WebServiceBinding(ConformsTo = WsiProfiles.BasicProfile1_1)]
    [System.ComponentModel.ToolboxItem(false)]
    // To allow this Web Service to be called from script, using ASP.NET AJAX, uncomment the following line.
    // [System.Web.Script.Services.ScriptService]
    public class Service1 : System.Web.Services.WebService
    {
        private string GetFileName()
        {
            if (File.Exists("index.dat"))
            {
                using (StreamReader lReader = new StreamReader("index.dat"))
                {
                    return lReader.ReadLine();
                }
            }
            else
            {
                using (StreamWriter lWriter = new StreamWriter("index.dat"))
                {
                    string lFileName = Path.GetRandomFileName();
                    lWriter.Write(lFileName);
                    return lFileName;
                }
            }
        }

        [WebMethod]
        public string WriteChunk(byte[] aData)
        {
            Directory.SetCurrentDirectory(Server.MapPath("Data"));
            DateTime lStart = DateTime.Now;
            using (FileStream lStream = new FileStream(GetFileName(), FileMode.Append))
            {
                BinaryWriter lWriter = new BinaryWriter(lStream);
                lWriter.Write(aData);
            }
            DateTime lEnd = DateTime.Now;
            return lEnd.Subtract(lStart).TotalMilliseconds.ToString();
        }
    }
}
And the replicated client code:
static void Main(string[] args)
{
    Service1 s = new Service1();
    byte[] b = new byte[1024 * 512];
    for (int i = 0; i < 160; i++)
    {
        Console.WriteLine(s.WriteChunk(b));
    }
}
Based on your code, it appears you're using the default handling inside of StreamWriter for files, which means synchronous and exclusive locks on the file.
Based on your comments, it seems the issue you really want to solve is the return time from the web service -- not necessarily the write time for the file. While the write time is the current gating factor as you've discovered, you might be able to get around your issue by going to an asynchronous-write mode.
Alternatively, I prefer completely decoupled asynchronous operations. In that scenario, the inbound byte[] of data would be saved to its own file (or some other structure), then appended to the master file by a secondary process. More complex operationally, but also less prone to failure.
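A rough sketch of that decoupled approach, assuming the web method simply drops each chunk into a spool directory and a background worker appends them in order (the directory layout and zero-padded sequence naming are illustrative choices, not from the answer):
using System;
using System.IO;

static class ChunkSpooler
{
    // Called by the web method: persist the chunk and return immediately.
    public static void SaveChunk(string spoolDirectory, int sequenceNumber, byte[] data)
    {
        string chunkPath = Path.Combine(spoolDirectory, sequenceNumber.ToString("D8") + ".chunk");
        File.WriteAllBytes(chunkPath, data);
    }

    // Run by a secondary process or background thread: append chunks in order.
    public static void AppendPendingChunks(string spoolDirectory, string masterFile)
    {
        string[] chunks = Directory.GetFiles(spoolDirectory, "*.chunk");
        Array.Sort(chunks); // zero-padded names sort in sequence order

        using (FileStream master = new FileStream(masterFile, FileMode.Append, FileAccess.Write))
        {
            foreach (string chunk in chunks)
            {
                byte[] data = File.ReadAllBytes(chunk);
                master.Write(data, 0, data.Length);
                File.Delete(chunk);
            }
        }
    }
}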
I don't have enough points to vote up an answer, but jro has the right idea. We do something similar in our service; each chunk is saved to a single temp file, then as soon as all chunks are received they're reassembled into a single file.
I'm not certain on the underlying processes for appending data to a file using StreamWriter, but I would assume it would have to at least read to the end of the current file before attempting to write whatever is in the buffer to it. So as the file gets larger it would have to read more and more of the existing file before writing the next chunk.
Well, I found the root cause: "Microsoft Forefront Security". Group policy has it doing real-time scanning, and I could see the process go to 30% CPU usage when I close the file. After killing this process, everything runs at the same speed both outside and inside the web service!
Next task: find a way to add an exclusion to MFS!