What is the best way to download all files in a remote directory using C# and FTP and save them to a local directory?
Thanks.
Downloading all files in a specific folder seems to be an easy task. However, there are some issues which have to be solved. To name a few:
How to get the list of files (System.Net.FtpWebRequest gives you an unparsed list, and the directory list format is not standardized in any RFC)?
What if the remote directory has both files and subdirectories? Do we have to dive into the subdirectories and download their content?
What if some of the remote files already exist on the local computer? Should they be overwritten? Skipped? Should we overwrite older files only?
What if the local file is not writable? Should the whole transfer fail? Should we skip the file and continue to the next?
How to handle files on a remote disk which are unreadable because we don’t have sufficient access rights?
How are symlinks, hard links and junction points handled? Links can easily be used to create an infinite recursive directory tree structure. Consider folder A with subfolder B, which in fact is not a real folder but a *nix hard link pointing back to folder A. The naive approach will end in an application which never ends (at least if nobody manages to pull the plug).
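To make the "overwrite, skip, or overwrite older only" question concrete, here is a minimal sketch of such a decision as a pure function. The names `ExistingFileAction` and `TransferPolicy` are hypothetical and not taken from any FTP library, and the remote timestamp is assumed to have been obtained from the server already.

```csharp
using System;
using System.IO;

// Hypothetical policy names; they do not come from any particular FTP library.
public enum ExistingFileAction { Overwrite, Skip, OverwriteIfOlder }

public static class TransferPolicy
{
    // Decides whether a remote file should be downloaded to localPath.
    // remoteModifiedUtc is assumed to come from the server (listing or MDTM).
    public static bool ShouldDownload(
        string localPath, DateTime remoteModifiedUtc, ExistingFileAction action)
    {
        if (!File.Exists(localPath))
        {
            return true; // nothing to overwrite
        }

        switch (action)
        {
            case ExistingFileAction.Overwrite:
                return true;
            case ExistingFileAction.Skip:
                return false;
            case ExistingFileAction.OverwriteIfOlder:
                // Overwrite only when the local copy is older than the remote one
                return File.GetLastWriteTimeUtc(localPath) < remoteModifiedUtc;
            default:
                throw new ArgumentOutOfRangeException(nameof(action));
        }
    }
}
```

A download loop could call `TransferPolicy.ShouldDownload(...)` before opening the local file and simply continue with the next entry when it returns false.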
A decent third-party FTP component should have a method for handling those issues. The following code uses our Rebex FTP for .NET.
using (Ftp client = new Ftp())
{
    // connect and login to the FTP site
    client.Connect("mirror.aarnet.edu.au");
    client.Login("anonymous", "my@password");

    // download all files
    client.GetFiles(
        "/pub/fedora/linux/development/i386/os/EFI/*",
        "c:\\temp\\download",
        FtpBatchTransferOptions.Recursive,
        FtpActionOnExistingFiles.OverwriteAll
    );

    client.Disconnect();
}
The code is taken from my blog post available at blog.rebex.net. The blog post also references a sample which shows how to ask the user how to handle each problem (e.g. Overwrite/Overwrite older/Skip/Skip all).
Using C# FtpWebRequest and FtpWebResponse, you can use the following recursion (make sure the folder strings terminate in '\'):
public void GetAllDirectoriesAndFiles(string getFolder, string putFolder)
{
    List<string> dirItems = DirectoryListing(getFolder);
    foreach (var item in dirItems)
    {
        // Names containing a dot are treated as files, everything else as a folder
        if ( item.Contains('.') )
        {
            GetFile(getFolder + item, putFolder + item);
        }
        else
        {
            var subDirPut = new DirectoryInfo(putFolder + "\\" + item);
            subDirPut.Create();
            GetAllDirectoriesAndFiles(getFolder + item + "\\", subDirPut.FullName + "\\");
        }
    }
}
The "item.Contains('.')" is a bit primitive, but has worked for my purposes. Post a comment if you need an example of the methods:
GetFile(string getFileAndPath, string putFileAndPath)
or
DirectoryListing(getFolder)
For the FTP protocol you can use the FtpWebRequest class from .NET Framework, though it does not have any explicit support for recursive file operations (including downloads). You have to implement the recursion yourself:
List the remote directory
Iterate the entries, downloading files and recursing into subdirectories (listing them again, etc.)
The tricky part is to tell files from subdirectories. There's no way to do that in a portable way with FtpWebRequest. Unfortunately, FtpWebRequest does not support the MLSD command, which is the only portable way to retrieve a directory listing with file attributes in the FTP protocol. See also Checking if object on FTP server is file or directory.
Your options are:
Do an operation on a file name that is certain to fail for files and succeed for directories (or vice versa). For example, you can try to download the "name". If that succeeds, it's a file; if that fails, it's a directory. But that can become a performance problem when you have a large number of entries.
You may be lucky, and in your specific case you can tell a file from a directory by the file name (e.g. all your files have an extension, while subdirectories do not).
Use a long directory listing (LIST command = ListDirectoryDetails method) and try to parse a server-specific listing. Many FTP servers use a *nix-style listing, where you identify a directory by the d at the very beginning of the entry. But many servers use a different format. The following example uses this approach (assuming the *nix format):
void DownloadFtpDirectory(
    string url, NetworkCredential credentials, string localPath)
{
    // List the remote directory using the long (LIST) format
    FtpWebRequest listRequest = (FtpWebRequest)WebRequest.Create(url);
    listRequest.UsePassive = true;
    listRequest.Method = WebRequestMethods.Ftp.ListDirectoryDetails;
    listRequest.Credentials = credentials;

    List<string> lines = new List<string>();

    using (WebResponse listResponse = listRequest.GetResponse())
    using (Stream listStream = listResponse.GetResponseStream())
    using (StreamReader listReader = new StreamReader(listStream))
    {
        while (!listReader.EndOfStream)
        {
            lines.Add(listReader.ReadLine());
        }
    }

    foreach (string line in lines)
    {
        // *nix format: permissions, links, owner, group, size, month, day, time/year, name
        string[] tokens =
            line.Split(new[] { ' ' }, 9, StringSplitOptions.RemoveEmptyEntries);
        string name = tokens[8];
        string permissions = tokens[0];

        string localFilePath = Path.Combine(localPath, name);
        string fileUrl = url + name;

        if (permissions[0] == 'd')
        {
            // Directory: create it locally and recurse
            Directory.CreateDirectory(localFilePath);
            DownloadFtpDirectory(fileUrl + "/", credentials, localFilePath);
        }
        else
        {
            var downloadRequest = (FtpWebRequest)WebRequest.Create(fileUrl);
            downloadRequest.UsePassive = true;
            downloadRequest.UseBinary = true;
            downloadRequest.Method = WebRequestMethods.Ftp.DownloadFile;
            downloadRequest.Credentials = credentials;

            // Dispose the response as well, so the connection is released
            using (WebResponse response = downloadRequest.GetResponse())
            using (Stream ftpStream = response.GetResponseStream())
            using (Stream fileStream = File.Create(localFilePath))
            {
                ftpStream.CopyTo(fileStream);
            }
        }
    }
}
The url must be like:
ftp://example.com/ or
ftp://example.com/path/
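The line parsing in the example above assumes that every line is a regular entry. As a hedged sketch (not part of the original answer), the same split can be wrapped in a helper that also skips summary lines such as "total 12", ignores the "." and ".." entries, and strips the " -> target" suffix that *nix listings append to symbolic links. The class name `UnixListing` is hypothetical, and the helper still assumes the *nix format.

```csharp
using System;

public static class UnixListing
{
    // Tries to split one line of a *nix-style LIST response into type and name.
    // Returns false for lines that are not usable entries ("total 12", ".", "..").
    public static bool TryParse(string line, out bool isDirectory, out string name)
    {
        isDirectory = false;
        name = null;

        if (string.IsNullOrWhiteSpace(line))
        {
            return false;
        }

        // permissions, links, owner, group, size, month, day, time/year, name
        string[] tokens =
            line.Split(new[] { ' ' }, 9, StringSplitOptions.RemoveEmptyEntries);
        if (tokens.Length < 9)
        {
            return false;
        }

        string permissions = tokens[0];
        string entryName = tokens[8];

        // Symbolic links are listed as "name -> target"; keep only the name
        if (permissions[0] == 'l')
        {
            int arrow = entryName.IndexOf(" -> ", StringComparison.Ordinal);
            if (arrow >= 0)
            {
                entryName = entryName.Substring(0, arrow);
            }
        }

        if (entryName == "." || entryName == "..")
        {
            return false;
        }

        isDirectory = permissions[0] == 'd';
        name = entryName;
        return true;
    }
}
```

In the download loop, lines for which `UnixListing.TryParse` returns false would simply be skipped. Whether to follow symbolic links is left to the caller; not following them avoids the infinite recursion problem.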
Or use a 3rd party library that supports recursive downloads.
For example, with the WinSCP .NET assembly you can download a whole directory with a single call to Session.GetFiles:
// Setup session options
SessionOptions sessionOptions = new SessionOptions
{
    Protocol = Protocol.Ftp,
    HostName = "example.com",
    UserName = "user",
    Password = "mypassword",
};

using (Session session = new Session())
{
    // Connect
    session.Open(sessionOptions);

    // Download files
    session.GetFiles("/home/user/*", @"d:\download\").Check();
}
Internally, WinSCP uses the MLSD command, if supported by the server. If not, it uses the LIST command and supports dozens of different listing formats.
(I'm the author of WinSCP)
You could use System.Net.WebClient.DownloadFile(), which supports FTP. See the MSDN documentation for details.
You can use FTPClient from laedit.net. It's under the Apache license and easy to use.
It uses FtpWebRequest:
First, use WebRequestMethods.Ftp.ListDirectoryDetails to get a detailed listing of the folder.
Then, for each file, use WebRequestMethods.Ftp.DownloadFile to download it to a local folder.
Related
I'm trying to download multiple files from an SFTP server and save them to the install path (or actually, ANY path at the moment, just to get it working). However, I get an UnauthorizedAccessException no matter where I try to save the files.
As far as I was aware, there are no special permissions required to save files to the install directory (hence why I chose this folder).
Thread myThread = new Thread(delegate() {
    string host;
    string username;
    string password;

    // Path to folder on SFTP server
    string pathRemoteDirectory = "public_html/uploads/17015/";
    // Path where the file should be saved once downloaded (locally)
    StorageFolder localFolder = Windows.ApplicationModel.Package.Current.InstalledLocation;
    string pathLocalDirectory = localFolder.Path.ToString();

    var methods = new List<AuthenticationMethod>();
    methods.Add(new PasswordAuthenticationMethod(username, password));
    //TODO - Add SSH Key auth

    var con = new ConnectionInfo(host, 233, username, methods.ToArray());
    using (SftpClient sftp = new SftpClient(con))
    {
        try
        {
            sftp.Connect();
            var files = sftp.ListDirectory(pathRemoteDirectory);

            // Iterate over them
            foreach (SftpFile file in files)
            {
                Console.WriteLine("Downloading {0}", file.FullName);
                using (Stream fileStream = File.OpenWrite(Path.Combine(pathLocalDirectory, file.Name)))
                {
                    sftp.DownloadFile(file.FullName, fileStream);
                    Debug.WriteLine(fileStream);
                }
            }

            sftp.Disconnect();
        }
        catch (Exception er)
        {
            Console.WriteLine("An exception has been caught " + er.ToString());
        }
    }
});
Connection to the server is all fine, the exception occurs on this line.
using (Stream fileStream = File.OpenWrite(Path.Combine(pathLocalDirectory, file.Name)))
I must be missing something obvious here, but it's worth noting that I've also tried writing to special folders like the Desktop, the user's Documents folder and also directly to the C:/ drive, all with the same exception. I'm also running with Administrator privileges and I have the correct permissions set on the folders.
It turns out that the SFTP listing includes '.' and '..' as entries and the code was trying to download those, when obviously '.' is the current SFTP folder and '..' is the parent folder. This was causing a permissions exception, though I'm not 100% sure why. Simply checking each entry to make sure it is not named '.' or '..' (and is not a directory or symbolic link) fixed the issue. Code below.
sftp.Connect();
var files = sftp.ListDirectory(pathRemoteDirectory);

// Iterate over them
foreach (SftpFile file in files)
{
    // Skip the current and parent directory entries
    if (file.Name == "." || file.Name == "..")
    {
        continue;
    }

    if (file.IsSymbolicLink)
    {
        // Concatenate instead of passing a second string argument, which
        // Debug.WriteLine would treat as a category rather than a format argument
        Debug.WriteLine("Symbolic link ignored: " + file.FullName);
    }
    else if (file.IsDirectory)
    {
        Debug.WriteLine("Directory ignored: " + file.FullName);
    }
    else
    {
        using (Stream fileStream = File.OpenWrite(Path.Combine(pathLocalDirectory, file.Name)))
        {
            sftp.DownloadFile(file.FullName, fileStream);
            Debug.WriteLine(pathLocalDirectory);
        }
    }
}

sftp.Disconnect();
You have multiple problems here. The parent folder ("..") reference you answered is one blocker, but that doesn't address the deeper problem that the InstalledLocation is read-only.
UWP apps do not have direct access to most file system locations. By default they can read and write to their ApplicationData directory and they can read from (but not write to) the InstalledLocation. The failures you saw for Desktop, Documents, and C:\ are all expected.
Other locations (including Desktop, Documents, and C:) may be granted access by the user either explicitly or via the app's declared capabilities. They can be accessed via the file broker through the StorageFile object.
See the UWP File access permissions documentation:
The app's install directory is a read-only location. You can't gain
access to the install directory through the file picker.
For the long term you'll want to download your files somewhere else: probably into one of the ApplicationData folders. These folders are the only ones with no special permission requirements for UWP apps.
So why does this work for you now?
You're running into a debugging quirk where your app is not fully installed but is staged from your VS project directory. This allows the app to write to the staged install directory, but once it is properly deployed into Program Files\WindowsApps writing to the InstalledLocation will fail.
Try Path.GetTempPath();. You should have permission there.
When it says you don't have permission, you don't. 8-)
Also, there's no such thing as "no special permissions". Everything requires some level of permission for access.
I have searched and searched and cannot find a way to do this. I have files in a directory I want to upload. The file names change constantly so I cannot upload by file name. Here is what I have tried.
using (WebClient client = new WebClient())
{
    client.Credentials = new NetworkCredential("User", "Password");
    foreach (var filePath in files)
        client.UploadFile("ftp://site.net//PICS_CAM1//", "STOR", @"PICS_CAM1\");
}
But I am getting a compiler error:
The name 'files' does not exist in the current context
Everything I have researched says this should work.
Does anyone have a good way to upload a directory of files via WebClient?
You have to define and set the files. If you wanted to upload all files in a certain local directory, use for example Directory.EnumerateFiles.
Also the address argument of WebClient.UploadFile has to be a full URL to a target file, not just a URL to a target directory.
IEnumerable<string> files = Directory.EnumerateFiles(@"C:\local\folder");

using (WebClient client = new WebClient())
{
    client.Credentials = new NetworkCredential("username", "password");

    foreach (string file in files)
    {
        client.UploadFile(
            "ftp://example.com/remote/folder/" + Path.GetFileName(file), file);
    }
}
For a recursive upload, see:
Recursive upload to FTP server in C#
I think your web client upload will work fine. Your issue is that your variable files is not in scope.
You need to post more of your code so we can see better
How to get a Word file from server in C#? I use following code:
static void Main(string[] args)
{
    Word._Application application = new Word.Application();
    object fileformat = Word.WdSaveFormat.wdFormatXMLDocument;
    //
    DirectoryInfo directory = new DirectoryInfo(@"http://www.sample.com/image/");
    foreach (FileInfo file in directory.GetFiles("*.doc", SearchOption.AllDirectories))
    {
        if (file.Extension.ToLower() == ".doc")
        {
            object filename = file.FullName;
            object newfilename = file.FullName.ToLower().Replace(".doc", ".docx");
            Word._Document document = application.Documents.Open(filename);
            document.Convert();
            document.SaveAs(newfilename, fileformat);
            document.Close();
            document = null;
        }
    }
    application.Quit();
    application = null;
}
This code works fine when I use it to get files from the local machine or the desktop, but not with the server URL.
Please tell me how to do this.
You can't use DirectoryInfo with a URL.
By design, this class only takes a local (or mapped network) path in its constructor.
You need to use the System.Net.HttpWebRequest class to get the file from a URL. Since the file is located on a server on the internet, the only way to retrieve it is to download it via HTTP.
Edit:
Based on your comments, you are looking to process 1 million files on a server you have access to. There are many ways to handle this.
You can use a network path to the server, such as
var di = new DirectoryInfo(@"\\servername\path\filename.doc");
You can create your application as a C# console application and use a local path. This is what I call a utility. It would be the faster method, since it processes everything locally and avoids network traffic.
var di = new DirectoryInfo(@"c:\your-folder\your-doc-file.doc");
Since you would run the C# console app directly on the server, the above would work.
DirectoryInfo is just an object that contains information about a directory entry in your file system. It doesn't download a file, which I presume is what you want to do.
The code example at http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.connection(v=vs.110).aspx is, I think, similar to what you want.
DirectoryInfo is for accessing local files or UNC paths. You cannot use it to access a page addressed via HTTP. You first need to download the file, e.g. using HttpWebRequest.
In Windows you can add a FTP site as a named Network Location using the "Add Network Location Wizard". For instance, a user can add a location called "MyFtp".
In .Net, how can I access (list, read and write) files in that location? Does Windows abstract away the implementation (WebDAV, FTP or else) and make it look like a local folder to my .Net program? If that's the case, how do I specify the path parameter in File.WriteAllText(path, content)? If not, how can I access the files?
No, Windows only handles that in Explorer. (They might have removed this in newer versions of Windows.) You will have to use some built in classes or implement FTP, WebDav and any other protocol yourself.
The MyFtp shortcut in the Network Locations is a shortcut to the FTP Folders shell namespace extension. If you want to use it, you would have to bind to the shortcut target (via the shell namespace) and then navigate via methods like IShellFolder::BindToObject or IShellItem::BindToHandler. This is very advanced stuff and I don't think there is anything built into C# to make it easier. Here are some references to get you started.
Introduction to the Shell Namespace
Scriptable Shell Objects
Navigating the Shell Namespace
You can try this to read/write the content of a file at the network location
// to read a file
string fileContent = System.IO.File.ReadAllText(@"\\MyNetworkPath\ABC\testfile1.txt");
// and to write a file
string content = "123456";
System.IO.File.WriteAllText(@"\\MyNetworkPath\ABC\testfile1.txt", content);
But you need to provide read/write permissions for network path to the principal on which the application is running.
You can use the FtpWebRequest class.
Here is some sample code (from MSDN), which uses WebClient for the download:
public static bool DisplayFileFromServer(Uri serverUri)
{
    // The serverUri parameter should start with the ftp:// scheme.
    if (serverUri.Scheme != Uri.UriSchemeFtp)
    {
        return false;
    }
    // Get the object used to communicate with the server.
    WebClient request = new WebClient();

    // This example assumes the FTP site uses anonymous logon.
    request.Credentials = new NetworkCredential ("anonymous","janeDoe@contoso.com");
    try
    {
        byte [] newFileData = request.DownloadData (serverUri.ToString());
        string fileString = System.Text.Encoding.UTF8.GetString(newFileData);
        Console.WriteLine(fileString);
    }
    catch (WebException e)
    {
        Console.WriteLine(e.ToString());
    }
    return true;
}
I'm using the System.Net.FtpWebRequest class and my code is as follows:
FtpWebRequest request = (FtpWebRequest)WebRequest.Create("ftp://example.com/folder");
request.Method = WebRequestMethods.Ftp.ListDirectory;
request.Credentials = new NetworkCredential("username", "password");
FtpWebResponse response = (FtpWebResponse)request.GetResponse();
Stream responseStream = response.GetResponseStream();
StreamReader reader = new StreamReader(responseStream);
string names = reader.ReadToEnd();
reader.Close();
response.Close();
This is based off of the examples provided on MSDN but I couldn't find anything more detailed.
I'm storing all the filenames in the folder in names but how can I now iterate through each of those and retrieve their dates? I want to retrieve the dates so I can find the newest files. Thanks.
This seems to work just fine
http://msdn.microsoft.com/en-us/library/system.net.ftpwebresponse.lastmodified(v=VS.90).aspx
FtpWebRequest request = (FtpWebRequest)WebRequest.Create (serverUri);
request.Method = WebRequestMethods.Ftp.GetDateTimestamp;
FtpWebResponse response = (FtpWebResponse)request.GetResponse ();
Console.WriteLine ("{0} {1}",serverUri,response.LastModified);
WebRequestMethods.Ftp.ListDirectory returns a "short listing" of all the files in an FTP directory. This type of listing is only going to provide file names - not additional details on the file (like permissions or last modified date).
Use WebRequestMethods.Ftp.ListDirectoryDetails instead. This method will return a long listing of files on the FTP server. Once you've retrieved this list into the names variable, you can split the names variable into an array based on the end-of-line character. Each array element will then be one file (or directory) entry that includes the permissions, last modified date, owner, etc.
At this point, you can iterate over this array, examine the last modified date for each file, and decide whether to download the file.
I hope this helps!!
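For a *nix-style listing, the date columns of each entry come in two shapes: recent files show a time ("May 8 17:48") and older files show a year ("Oct 18 2009"). Below is a rough sketch of turning those three columns into a DateTime so the entries can be compared. The class name `UnixListingDate` is hypothetical, the current time is passed in as a parameter, and the year inference for recent files is an assumption that ignores time zone differences between client and server.

```csharp
using System;
using System.Globalization;

public static class UnixListingDate
{
    // Parses the date columns of a *nix-style listing entry.
    // yearOrTime is either "HH:mm" (recent files) or "yyyy" (older files).
    // 'now' is passed in so the year of recent entries can be inferred.
    public static DateTime Parse(
        string month, string day, string yearOrTime, DateTime now)
    {
        CultureInfo culture = CultureInfo.InvariantCulture;

        if (yearOrTime.Contains(":"))
        {
            DateTime result = DateTime.ParseExact(
                $"{month} {day} {now.Year} {yearOrTime}", "MMM d yyyy H:mm", culture);

            // The year is omitted only for recent files, so a date in the
            // future is assumed to belong to the previous year.
            return result > now.AddDays(1) ? result.AddYears(-1) : result;
        }

        return DateTime.ParseExact(
            $"{month} {day} {yearOrTime}", "MMM d yyyy", culture);
    }
}
```

With the line split into columns as in a *nix listing, the month, day and time/year are the sixth, seventh and eighth columns. Other listing formats (for example DOS/Windows style) need a different parser.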
Unfortunately, there's no really reliable and efficient way to retrieve timestamps using the features offered by .NET Framework, as it does not support the FTP MLSD command. The MLSD command provides a listing of a remote directory in a standardized machine-readable format. The command and the format are standardized by RFC 3659.
Alternatives you can use, that are supported by .NET framework:
ListDirectoryDetails method (the FTP LIST command) to retrieve details of all files in a directory and then you deal with FTP server specific format of the details (*nix format similar to the ls *nix command is the most common, a drawback is that the format may change over time, as for newer files "May 8 17:48" format is used and for older files "Oct 18 2009" format is used).
DOS/Windows format: C# class to parse WebRequestMethods.Ftp.ListDirectoryDetails FTP response
*nix format: Parsing FtpWebRequest ListDirectoryDetails line
GetDateTimestamp method (the FTP MDTM command) to individually retrieve timestamps for each file. An advantage is that the response is standardized by RFC 3659 to YYYYMMDDHHMMSS[.sss]. A disadvantage is that you have to send a separate request for each file, which can be quite inefficient.
const string uri = "ftp://example.com/remote/path/file.txt";
FtpWebRequest request = (FtpWebRequest)WebRequest.Create(uri);
request.Method = WebRequestMethods.Ftp.GetDateTimestamp;
FtpWebResponse response = (FtpWebResponse)request.GetResponse();
Console.WriteLine("{0} {1}", uri, response.LastModified);
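FtpWebResponse.LastModified already exposes the parsed value, so the following is only needed when you have the raw MDTM time value in hand, for example from another client or a log. It is a small sketch of parsing the YYYYMMDDHHMMSS[.sss] format mentioned above; the class name `MdtmTimestamp` is hypothetical. RFC 3659 defines the value as UTC, so the result is marked accordingly.

```csharp
using System;
using System.Globalization;

public static class MdtmTimestamp
{
    // Parses the time value of an MDTM reply, e.g. "20240508174800" or
    // "20240508174800.123". The value is interpreted as UTC.
    public static DateTime Parse(string value)
    {
        string[] formats = { "yyyyMMddHHmmss", "yyyyMMddHHmmss.FFFFFFF" };

        return DateTime.ParseExact(
            value.Trim(),
            formats,
            CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal);
    }
}
```

The caller is expected to strip the "213 " reply code first. Call ToLocalTime() on the result if you need to compare it with local file times.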
Alternatively you can use a 3rd party FTP client implementation that supports the modern MLSD command.
For example WinSCP .NET assembly supports that.
There's even an example for your specific task: Downloading the most recent file.
The example is for PowerShell and the SFTP, but translates to C# and the FTP easily:
// Setup session options
SessionOptions sessionOptions = new SessionOptions
{
    Protocol = Protocol.Ftp,
    HostName = "example.com",
    UserName = "username",
    Password = "password",
};

using (Session session = new Session())
{
    // Connect
    session.Open(sessionOptions);

    // Get list of files in the directory
    string remotePath = "/remote/path/";
    RemoteDirectoryInfo directoryInfo = session.ListDirectory(remotePath);

    // Select the most recent file
    RemoteFileInfo latest =
        directoryInfo.Files
            .OrderByDescending(file => file.LastWriteTime)
            .First();

    // Download the selected file
    string localPath = @"C:\local\path\";
    string sourcePath = RemotePath.EscapeFileMask(remotePath + latest.Name);
    session.GetFiles(sourcePath, localPath).Check();
}
(I'm the author of WinSCP)
First you will need to break apart the names using String.Split on the line delimiter that separates the file names. Then iterate through all of the resulting strings and navigate the directories.
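A minimal sketch of that split, assuming the names string holds the text returned by a ListDirectory request with one name per line. The class name `FtpNameList` is hypothetical.

```csharp
using System;

public static class FtpNameList
{
    // Splits the text returned by a ListDirectory request into names.
    // Servers normally end lines with CRLF; a bare LF is accepted as well.
    public static string[] SplitNames(string listing)
    {
        return listing.Split(
            new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

Each returned name can then be appended to the directory URL for a follow-up request such as GetDateTimestamp.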