I'm creating a C# application that needs to FTP to a directory to retrieve a file list. The following code works just fine. However, the folder that I'm FTPing to contains around 92,000 files. This code will not work in the way that I want it to for a file list of that size.
I'm looking only for files that begin with the string "c-". After doing some research, I'm not even sure how to begin trying to solve this issue. Is there any way I can modify this existing code for it to retrieve only those files?
public string[] getFileList() {
string[] downloadFiles;
StringBuilder result = new StringBuilder();
FtpWebRequest reqFTP;
try {
reqFTP = (FtpWebRequest)FtpWebRequest.Create(new Uri(ftpHost));
reqFTP.UseBinary = true;
reqFTP.Credentials = new NetworkCredential(ftpUser, ftpPass);
reqFTP.Method = WebRequestMethods.Ftp.ListDirectory;
WebResponse response = reqFTP.GetResponse();
StreamReader reader = new StreamReader(response.GetResponseStream());
string line = reader.ReadLine();
while (line != null) {
result.Append(line);
result.Append("\n");
line = reader.ReadLine();
}
// to remove the trailing '\n'
result.Remove(result.ToString().LastIndexOf('\n'), 1);
reader.Close();
response.Close();
return result.ToString().Split('\n');
}
catch (Exception ex) {
System.Windows.Forms.MessageBox.Show(ex.Message);
downloadFiles = null;
return downloadFiles;
}
}
I don't think LIST supports wildcard searches; support varies between FTP server platforms and depends on which commands each one implements.
You will need to download all the file names in the FTP directory using LIST, probably asynchronously, and filter them on the client.
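Filtering on the client could be sketched as follows. This is only a minimal sketch, not the asker's code: FtpListingFilter and FilterByPrefix are hypothetical names, and it assumes the server returns one entry per line, as a ListDirectory response normally does. It streams the listing line by line instead of buffering every name first.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the question's code.
public static class FtpListingFilter
{
    // Reads a name-only listing (one entry per line) and yields only the
    // entries whose file name starts with the given prefix.
    public static IEnumerable<string> FilterByPrefix(TextReader listing, string prefix)
    {
        string line;
        while ((line = listing.ReadLine()) != null)
        {
            // Some servers return "dir/name"; compare against the last segment only.
            string name = line.Substring(line.LastIndexOf('/') + 1);
            if (name.StartsWith(prefix, StringComparison.Ordinal))
            {
                yield return name;
            }
        }
    }
}
```

With the question's code, the StreamReader created over response.GetResponseStream() could be passed in together with the prefix "c-", and the result copied into a List<string> or array.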
Here is an alternative implementation along similar lines. I've tested this with as many as 1000 FTP files, so it might work for you. Complete source code can be found here.
public List<ftpinfo> browse(string path) //eg: "ftp.xyz.org", "ftp.xyz.org/ftproot/etc"
{
FtpWebRequest request=(FtpWebRequest)FtpWebRequest.Create(path);
request.Method=WebRequestMethods.Ftp.ListDirectoryDetails;
List<ftpinfo> files=new List<ftpinfo>();
//request.Proxy = System.Net.WebProxy.GetDefaultProxy();
//request.Proxy.Credentials = CredentialCache.DefaultNetworkCredentials;
request.Credentials = new NetworkCredential(_username, _password);
Stream rs=(Stream)request.GetResponse().GetResponseStream();
OnStatusChange("CONNECTED: " + path, 0, 0);
StreamReader sr = new StreamReader(rs);
string strList = sr.ReadToEnd();
string[] lines=null;
if (strList.Contains("\r\n"))
{
lines=strList.Split(new string[] {"\r\n"},StringSplitOptions.None);
}
else if (strList.Contains("\n"))
{
lines=strList.Split(new string[] {"\n"},StringSplitOptions.None);
}
//now decode this string array
if (lines==null || lines.Length == 0)
return null;
foreach(string line in lines)
{
if (line.Length==0)
continue;
//parse line
Match m= GetMatchingRegex(line);
if (m==null) {
//failed
throw new ApplicationException("Unable to parse line: " + line);
}
ftpinfo item=new ftpinfo();
item.filename = m.Groups["name"].Value.Trim('\r');
item.path = path;
item.size = Convert.ToInt64(m.Groups["size"].Value);
item.permission = m.Groups["permission"].Value;
string _dir = m.Groups["dir"].Value;
if(_dir.Length>0 && _dir != "-")
{
item.fileType = directoryEntryTypes.directory;
}
else
{
item.fileType = directoryEntryTypes.file;
}
try
{
item.fileDateTime = DateTime.Parse(m.Groups["timestamp"].Value);
}
catch
{
item.fileDateTime = DateTime.MinValue; //null;
}
files.Add(item);
}
return files;
}
In my project I am downloading a few files from an FTP site hosted on IIS7, and also from one on a Linux server, and saving them to my AppData/Roaming folder. The problem occurs when I either modify the content of the CSV file or simply delete the old file and replace it with a new file that has the same name but modified content.
Every time, I have to rename the file, and downloading the renamed file works. This suggests that it is downloading some cached copy of the file, which I am unable to locate either on my local system or on the FTP server.
public static bool FTPFileDownload(string strFolderName, string
pathToStore, bool blIsSingleFile = true, string strFileType = "")
{
try
{
if (!Directory.Exists(pathToStore))
{
// Try to create the directory.
DirectoryInfo di = Directory.CreateDirectory(pathToStore);
}
FtpWebRequest request = (FtpWebRequest)WebRequest.Create(ConfigurationManager.AppSettings["FTPUrl"].ToString() + strFolderName);
request.Credentials = new NetworkCredential(ConfigurationManager.AppSettings["FTPUser"].ToString(), ConfigurationManager.AppSettings["FTPPassword"].ToString());
request.Method = WebRequestMethods.Ftp.ListDirectory;
request.Proxy = null;
FtpWebResponse response = (FtpWebResponse)request.GetResponse();
StreamReader streamReader = new StreamReader(response.GetResponseStream());
System.Collections.Generic.List<string> directories = new System.Collections.Generic.List<string>();
string line = streamReader.ReadLine();
while (!string.IsNullOrEmpty(line))
{
//If extension is available match with extension and add.
bool blAddFile = false;
if (!String.IsNullOrEmpty(strFileType))
{
string strExt = Path.GetExtension(ConfigurationManager.AppSettings["FTPUrl"].ToString() + line).Remove(0, 1);
if (strExt.ToLower() == strFileType.ToLower())
blAddFile = true;
}
else
blAddFile = true;
if (blAddFile)
{
directories.Add(line);
}
line = streamReader.ReadLine();
}
streamReader.Close();
using (WebClient ftpClient = new WebClient())
{
ftpClient.Credentials = new System.Net.NetworkCredential(ConfigurationManager.AppSettings["FTPUser"].ToString(), ConfigurationManager.AppSettings["FTPPassword"].ToString());
for (int i = 0; i <= directories.Count - 1; i++)
{
if (directories[i].Contains("."))
{
string path = ConfigurationManager.AppSettings["FTPUrl"].ToString() + strFolderName
+ (blIsSingleFile ? "" : "/" + directories[i].ToString());
string trnsfrpth = pathToStore + directories[i].ToString();
ftpClient.DownloadFile(path, trnsfrpth);
}
}
return true;
}
}
catch (Exception ex)
{
FileLogger.logMessage(ex.Message);
if (FileLogger.IsDebuggingLogEnabled)
{
FileLogger.HandleError("FTPFileDownload", ex, "Common Helper Error 4:");
}
return false;
}
}
I don't know what is going wrong. Either my code is wrong, or the problem lies in the settings or environment of the FTP server.
Please suggest.
I have a requirement to read the first 100 files from an FTP directory and process them after downloading.
I can't rely on the whole list of files obtained in the first call, because files will be added and removed during processing. My program is expected to keep running while there are new files in the directory.
I wrote the following code to read all the files and take only the first 100 of them.
List<FileHolder> list = new List<FileHolder>();
int filesToRead = 100;
FtpWebRequest ftpRequest = GetFtpRequest(directoryPath);
ftpRequest.Method = WebRequestMethods.Ftp.ListDirectoryDetails;
FtpWebResponse ftpResponse = (FtpWebResponse)ftpRequest.GetResponse();
using (Stream responseStream = ftpResponse.GetResponseStream())
{
if (responseStream != null)
{
using (StreamReader reader = new StreamReader(responseStream))
{
var line = reader.ReadLine();
while (line != null)
{
var split = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
if (split.Length > 3)
{
var fileName = split[split.Length-1];
if (!string.IsNullOrEmpty(fileName) && split[2].ToLower() != "<dir>" && line.Contains(".xml"))
{
var ftpFile = new FileHolder
{
FileName = fileName
};
list.Add(ftpFile);
//break on desired max number of files
if (list.Count == filesToRead)
{
break;
}
}
}
line = reader.ReadLine();
}
}
}
}
ftpResponse.Close();
Is there any other way or specific FTP method to get only top N files because I have to call this method iteratively.
No. You have to retrieve whole directory listing and select your "top 100 files" afterwards. Exactly as you are doing already.
So, the html data I'm looking at is:
Action.log<br> 6/8/2015 3:45 PM
From this I need to extract either instance of Action.log.
My problem is I've been over a ton of regex tutorials and I still can't seem to brain up a pattern to extract it. I guess I'm lacking some fundamental understanding of regex, but any help would be appreciated.
Edit:
internal string[] ParseFolderIndex_Alpha(string url, WebDirectory directory)
{
try
{
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.Timeout = 3 * 60 * 1000;
request.KeepAlive = true;
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
if (response.StatusCode == HttpStatusCode.OK)
{
List<string> fileLocations = new List<string>(); string line;
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
while ((line = reader.ReadLine()) != null)
{
int index = line.IndexOf("<a href=");
if (index >= 0)
{
string[] segments = line.Substring(index).Split('\"');
///Can Parse File Size Here: Add todo
if (!segments[1].Contains("/"))
{
fileLocations.Add(segments[1]);
UI.UpdatePatchNotes("Web File Found: " + segments[1]);
UI.UpdateProgressBar();
}
else
{
if (segments[1] != @"../")
{
directory.SubDirectories.Add(new WebDirectory(url + segments[1], this));
UI.UpdatePatchNotes("Web Directory Found: " + segments[1].Replace("/", string.Empty));
}
}
}
else if (line.Contains("</pre")) break;
}
}
response.Dispose(); /// After ((line = reader.ReadLine()) != null)
return fileLocations.ToArray<string>();
}
else return new string[0]; /// !(HttpStatusCode.OK)
}
catch (Exception e)
{
LogHandler.LogErrors(e.ToString(), this);
LogHandler.LogErrors(url, this);
return null;
}
}
That's what I was doing. The problem is that I changed servers, and the HTML that IIS displays is different, so I have to write new logic.
Edit / Conclusion:
First of all, I'm sorry I even mentioned regex :P Secondly, each platform will have to be handled individually, depending on the environment.
This is how I'm currently gathering the file names.
internal string[] ParseFolderIndex(string url, WebDirectory directory)
{
try
{
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.Timeout = 3 * 60 * 1000;
request.KeepAlive = true;
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
bool endMet = false;
if (response.StatusCode == HttpStatusCode.OK)
{
List<string> fileLocations = new List<string>(); string line;
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
while (!endMet)
{
line = reader.ReadLine();
if (line != null && line != "" && line.IndexOf("</A>") >= 0)
{
if (line.Contains("</html>")) endMet = true;
string[] segments = line.Replace("\\", "").Split('\"');
List<string> paths = new List<string>();
List<string> files = new List<string>();
for (int i = 0; i < segments.Length; i++)
{
if (!segments[i].Contains('<'))
paths.Add(segments[i]);
}
paths.RemoveAt(0);
foreach (String s in paths)
{
string[] secondarySegments = s.Split('/');
if (s.Contains(".") || s.Contains("Verinfo"))
files.Add(secondarySegments[secondarySegments.Length - 1]);
else
{
directory.SubDirectories.Add(new WebDirectory
(url + "/" + secondarySegments[secondarySegments.Length - 2], this));
UI.UpdatePatchNotes("Web Directory Found: " + secondarySegments[secondarySegments.Length - 2]);
}
}
foreach (String s in files)
{
if (!String.IsNullOrEmpty(s) && !s.Contains('%'))
{
fileLocations.Add(s);
UI.UpdatePatchNotes("Web File Found: " + s);
UI.UpdateProgressBar();
}
}
if (line.Contains("</pre")) break;
}
}
}
response.Dispose(); /// After ((line = reader.ReadLine()) != null)
return fileLocations.ToArray<string>();
}
else return new string[0]; /// !(HttpStatusCode.OK)
}
catch (Exception e)
{
LogHandler.LogErrors(e.ToString(), this);
LogHandler.LogErrors(url, this);
return null;
}
}
Regex for this is overkill.
It's too heavy, and considering the format of the string will always be the same, you're going to find it easier to debug and maintain using splitting and substrings.
class Program {
static void Main(string[] args) {
String s = "Action.log<br> 6/8/2015 3:45 PM ";
String[] t = s.Split('"');
String fileName = String.Empty;
//To get the entire file name and path....
fileName = t[1].Substring(0, (t[1].Length));
//To get just the file name (Action.log in this case)....
fileName = t[1].Substring(0, (t[1].Length)).Split('/').Last();
}
}
Try matching the following pattern:
<A HREF="(?<url>.*)">
Then get the group called url from the match results.
Working example: https://regex101.com/r/hW8iH6/1
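As a rough illustration of reading the named group, here is a minimal sketch. HrefExtractor and FirstFileName are hypothetical names. The sketch also swaps the greedy .* for [^"]* so that a match stops at the first closing quote, which is a deliberate change from the pattern above rather than a copy of it.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class HrefExtractor
{
    // [^"]* stops at the first closing quote, so several links on one line
    // are not swallowed by a single greedy match.
    private static readonly Regex Href =
        new Regex("<A HREF=\"(?<url>[^\"]*)\">", RegexOptions.IgnoreCase);

    // Returns the last path segment of the first link, or null if no link is found.
    public static string FirstFileName(string html)
    {
        Match m = Href.Match(html);
        if (!m.Success)
        {
            return null;
        }
        string url = m.Groups["url"].Value;
        return url.Substring(url.LastIndexOf('/') + 1);
    }
}
```

Any input passed to FirstFileName needs to contain the anchor markup itself; a line with no A HREF tag yields null.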
string text = @"Action.log<br> 6/8/2015 3:45 PM";
var match = Regex.Match(text, @"^(.*).*$");
var result = match.Groups[1].Value;
Try http://regexr.com/ or Regexbuddy!
This is my program, and it works correctly if I put in the username and password:
try
{
var url = @"https://mail.google.com/mail/feed/atom";
var User = username;
var Pasw = password;
var encoded = TextToBase64(User + ":" + Pasw);
var myweb = HttpWebRequest.Create(url) as HttpWebRequest;
myweb.Method = "POST";
myweb.ContentLength = 0;
myweb.Headers.Add("Authorization", "Basic " + encoded);
var response = myweb.GetResponse();
var stream = response.GetResponseStream();
textBox1.Text += ("Connection established with" + User + Pasw);
}
catch (Exception ex)
{
textBox1.Text += ("Error connection. Original error: " + ex.Message);
}
Now I want to read the lines of a text file, split each one, and read the username and password from this format: username:password. Here is my code at the moment:
Stream myStream = null;
OpenFileDialog openFileDialog1 = new OpenFileDialog();
openFileDialog1.InitialDirectory = "c:\\";
openFileDialog1.Filter = "txt files (*.txt)|*.txt|All files (*.*)|*.*";
openFileDialog1.FilterIndex = 2;
openFileDialog1.RestoreDirectory = true;
string file_name = "";
file_name = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments) + file_name;
if (openFileDialog1.ShowDialog() == DialogResult.OK)
{
try
{
if ((myStream = openFileDialog1.OpenFile()) != null)
{
using (StringReader reader = new StringReader(file_name))
{
// Loop over the lines in the string.
int count = 0;
string line;
while ((line = reader.ReadLine()) != null)
{
string[] data = line.Split(':');
string username = data[0].Trim();
string password = data[1].Trim();
count++;
/* Console.WriteLine("Line {0}: {1}", count, line); */
}
reader.Close();
}
}
}
catch (Exception ex)
{
MessageBox.Show("Error: Could not read file from disk. Original error: " + ex.Message);
}
You open the file selected by the user, but then try to read from a variable file_name that is not the name of a file but the name of a well-known folder. Perhaps you want this:
try
{
if (openFileDialog1.FileName != string.Empty)
{
using (StreamReader reader = new StreamReader(openFileDialog1.FileName))
{
....
}
}
}
In this same code you use a StringReader, but you need a StreamReader to read from a file. A StringReader takes the string passed to its constructor and returns that text from the ReadLine call. You then split that line at the colon, but of course it is not the content of your file.
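The difference between the two readers can be seen in a small sketch. ReaderDemo and its two methods are hypothetical names used only for this illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper contrasting the two reader types.
public static class ReaderDemo
{
    // StringReader treats its argument as the text to read, not as a path.
    public static string FirstLineOfString(string text)
    {
        using (var reader = new StringReader(text))
        {
            return reader.ReadLine();
        }
    }

    // StreamReader opens the file at the given path and reads its contents.
    public static string FirstLineOfFile(string path)
    {
        using (var reader = new StreamReader(path))
        {
            return reader.ReadLine();
        }
    }
}
```

Given a path to a file, FirstLineOfString simply hands the path text back, while FirstLineOfFile returns the first line stored in that file.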
There are other problems in your code. For example, what do you do with the username and password loaded from the line? They are declared as local variables and not used anywhere, so at the next loop they are overwritten and lost.
So, a UserData class could be a possible answer
public class UserData
{
public string UserName {get; set;}
public string Password {get; set;}
}
and declare, at the form level, a
List<UserData> data = new List<UserData>();
and in your loop
public void button1_Click(object sender, EventArgs e)
{
try
{
if (openFileDialog1.FileName != string.Empty)
{
using (StreamReader reader = new StreamReader(openFileDialog1.FileName))
{
int count = 0;
string line;
while ((line = reader.ReadLine()) != null)
{
UserData d = new UserData();
string[] parts = line.Split(':');
d.UserName = parts[0].Trim();
d.Password = parts[1].Trim();
data.Add(d);
}
// At the loop end you could use the List<UserData> like a normal array
foreach(UserData ud in data)
{
Console.WriteLine("User=" + ud.UserName + " with password=" + ud.Password);
}
}
}
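}
catch (Exception ex)
{
// A try block needs a catch (or finally) to compile; report read errors here.
MessageBox.Show("Error: Could not read file from disk. Original error: " + ex.Message);
}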
}
public void button2_Click(object sender, EventArgs e)
{
try
{
if(data.Count() == 0)
{
MessageBox.Show("Load user info first");
return;
}
var url = @"https://mail.google.com/mail/feed/atom";
var encoded = TextToBase64(data[0].UserName + ":" + data[0].Password);
.....
A word of warning. Of course, this is just demo code. Remember that in a real scenario, saving passwords in clear text is a big security concern. The impact depends on the context of your application, but it should not be downplayed. A better course of action is to store a hash of each password and apply the same hashing function when you need to compare passwords.
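One more detail about the loading loop: it assumes every line contains a colon, so parts[1] throws on a blank or malformed line. A more defensive parse might look like the sketch below. CredentialLine and TryParse are hypothetical names, and splitting at the first colon only (so that a password may itself contain colons) is an assumption about the file format.

```csharp
using System;

// Hypothetical helper for parsing one "username:password" line.
public static class CredentialLine
{
    // Returns false for blank lines, lines without a colon,
    // and lines with an empty user name.
    public static bool TryParse(string line, out string userName, out string password)
    {
        userName = null;
        password = null;
        if (string.IsNullOrWhiteSpace(line))
        {
            return false;
        }
        int colon = line.IndexOf(':');
        if (colon <= 0)
        {
            return false;
        }
        // Split at the first colon only, so the password keeps any later colons.
        userName = line.Substring(0, colon).Trim();
        password = line.Substring(colon + 1).Trim();
        return true;
    }
}
```

Inside the while loop, a UserData item would then be added only when TryParse returns true.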
You are creating the StringReader from the file_name variable, which is (according to your code)
string file_name = "";
file_name = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments) + file_name;
and points to nowhere.
Also, you create a stream for the file selected in the open file dialog, but you never use this stream.
I am having a problem reading a file with StreamReader and, while line != null, adding each line to textBox1.
Code:
using(StreamReader reader = new StreamReader("lastupdate.txt"))
{
string line;
while((line = reader.ReadLine()) != null)
{
textBox1.Text = line;
}
reader.Close();
}
It's not working and I don't know why. I tried to use using StreamReader, I download the file from the URL and I can see in the folder that the file is downloaded. The lastupdate.txt is 1KB in size.
This is my current working code with MessageBox. If I remove the MessageBox, the code doesn't work. It needs some kind of wait or I don't know:
WebClient client = new WebClient();
client.DownloadFileAsync(new Uri(Settings.Default.patchCheck), "lastupdate.txt"); // ok
if(File.Exists("lastupdate.txt"))
{
MessageBox.Show("Lastupdate.txt exist");
using(StreamReader reader = new StreamReader("lastupdate.txt"))
{
string line;
while((line = reader.ReadLine()) != null)
{
textBox1.Text = line;
MessageBox.Show(line.ToString());
}
reader.Close();
}
File.Delete("lastupdate.txt");
}
Try :
StringBuilder sb = new StringBuilder();
using (StreamReader sr = new StreamReader("lastupdate.txt"))
{
while (sr.Peek() >= 0)
{
sb.AppendLine(sr.ReadLine()); // AppendLine keeps the line breaks that ReadLine strips
}
}
textBox1.Text = sb.ToString();
If you want the text in the text box it would be much more effective to read all of it and then put it into the text box:
var lines = File.ReadAllLines("lastupdate.txt");
textBox1.Lines = lines; //assuming multi-line text box
or:
textBox1.Text = File.ReadAllText("lastupdate.txt");
Edit:
After latest update - you are downloading the file asynchronously - it might not even be there, only partially there or in a state in-between when your code executes.
If you just want the text string in the file don't download it, use DownloadString instead:
string text = "";
using (WebClient wc = new WebClient())
{
text = wc.DownloadString(new Uri(Settings.Default.patchCheck));
}
textBox1.Text = text;
Try this :
using(StreamReader reader = new StreamReader(Path))
{
string line = reader.ReadLine();
while(line != null)
{
textBox1.Text += line;
line = reader.ReadLine();
}
reader.Close();
}
WebClient has a rather bizarre DownloadFileAsync method. The return type is void, so it is not awaitable. That also means we do not even get a Task, so ContinueWith is not possible. That leaves us with using the DownloadFileCompleted event.
const string FileName = "lastupdate.txt";
private void DownloadLastUpdate() {
var client = new WebClient();
client.DownloadFileCompleted += ( s, e ) => {
this.UpdateTextBox( e.Error );
client.Dispose();
};
client.DownloadFileAsync( new Uri( Settings.Default.patchCheck ), FileName );
}
I went with an optional exception parameter to relay any exception messages. Feel free to refactor as desired. File.ReadLines yields text line by line, so large files should not use very much memory.
private void UpdateTextBox( Exception exception = null ) {
textBox1.Text = string.Empty;
if ( exception != null ) {
textBox1.Text = exception.Message;
return;
}
if ( !File.Exists( FileName ) ) {
textBox1.Text = string.Format( "File '{0}' does not exist.", FileName );
return;
}
var lines = File.ReadLines( FileName );
textBox1.Text = string.Join( Environment.NewLine, lines );
}
The answer given above is correct, but in your piece of code, just change one line:
textBox1.Text += line;