Listing top N files from FTP directory - c#

I have a requirement to read the first 100 files from an FTP directory and process them after downloading.
I can't rely on the whole list of files obtained in the first call, because files will be added and removed during processing. My program is expected to keep running while there are new files in the directory.
I wrote the following code to read all files and take only the first 100 of them.
List<FileHolder> list = new List<FileHolder>();
int filesToRead = 100;
FtpWebRequest ftpRequest = GetFtpRequest(directoryPath);
ftpRequest.Method = WebRequestMethods.Ftp.ListDirectoryDetails;
FtpWebResponse ftpResponse = (FtpWebResponse)ftpRequest.GetResponse();
using (Stream responseStream = ftpResponse.GetResponseStream())
{
if (responseStream != null)
{
using (StreamReader reader = new StreamReader(responseStream))
{
var line = reader.ReadLine();
while (line != null)
{
var split = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
if (split.Length > 3)
{
var fileName = split[split.Length-1];
if (!string.IsNullOrEmpty(fileName) && split[2].ToLower() != "<dir>" && line.Contains(".xml"))
{
var ftpFile = new FileHolder
{
FileName = fileName
};
list.Add(ftpFile);
//break on desired max number of files
if (list.Count == filesToRead)
{
break;
}
}
}
line = reader.ReadLine();
}
}
}
}
ftpResponse.Close();
Is there any other way, or a specific FTP method, to get only the top N files? I have to call this method iteratively.

No. You have to retrieve the whole directory listing and select your "top 100 files" afterwards, exactly as you are doing already.
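As a rough sketch of that selection step, the filtering can be pulled into a small function that works on the listing lines. It assumes the IIS/Windows-style listing format that the question's code already parses; `FtpListing` and `TakeTopXmlFiles` are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FtpListing
{
    // Hypothetical helper: picks the first `count` .xml file names out of raw
    // LIST lines. Assumes the name is the last space-separated token, as in the
    // question's code, so names containing spaces are not handled.
    public static List<string> TakeTopXmlFiles(IEnumerable<string> listingLines, int count)
    {
        return listingLines
            .Select(l => l.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            .Where(parts => parts.Length > 3)
            .Where(parts => !parts[2].Equals("<dir>", StringComparison.OrdinalIgnoreCase))
            .Select(parts => parts[parts.Length - 1])
            .Where(name => name.EndsWith(".xml", StringComparison.OrdinalIgnoreCase))
            .Take(count) // lazy: stops consuming lines once `count` names are found
            .ToList();
    }
}
```

The server still produces the full listing. `Take` only stops the client from reading further lines once enough names have been collected.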

Related

C# code fetching cached version of data from FTP

In my project I am downloading a few files from an FTP server hosted on IIS7, and also from a Linux server, and saving them to my AppData/Roaming folder. The problem occurs when I either modify the content of the CSV file or delete the old file and replace it with a new file that has the same name but modified content.
Every time, I have to rename the file, and downloading the renamed file works. This suggests that a cached copy of the file is being downloaded, but I am unable to locate it either on my local system or on the FTP server.
public static bool FTPFileDownload(string strFolderName, string
pathToStore, bool blIsSingleFile = true, string strFileType = "")
{
try
{
if (!Directory.Exists(pathToStore))
{
// Try to create the directory.
DirectoryInfo di = Directory.CreateDirectory(pathToStore);
}
FtpWebRequest request = (FtpWebRequest)WebRequest.Create(ConfigurationManager.AppSettings["FTPUrl"].ToString() + strFolderName);
request.Credentials = new NetworkCredential(ConfigurationManager.AppSettings["FTPUser"].ToString(), ConfigurationManager.AppSettings["FTPPassword"].ToString());
request.Method = WebRequestMethods.Ftp.ListDirectory;
request.Proxy = null;
FtpWebResponse response = (FtpWebResponse)request.GetResponse();
StreamReader streamReader = new StreamReader(response.GetResponseStream());
System.Collections.Generic.List<string> directories = new System.Collections.Generic.List<string>();
string line = streamReader.ReadLine();
while (!string.IsNullOrEmpty(line))
{
//If extension is available match with extension and add.
bool blAddFile = false;
if (!String.IsNullOrEmpty(strFileType))
{
string strExt = Path.GetExtension(ConfigurationManager.AppSettings["FTPUrl"].ToString() + line).Remove(0, 1);
if (strExt.ToLower() == strFileType.ToLower())
blAddFile = true;
}
else
blAddFile = true;
if (blAddFile)
{
directories.Add(line);
}
line = streamReader.ReadLine();
}
streamReader.Close();
using (WebClient ftpClient = new WebClient())
{
ftpClient.Credentials = new System.Net.NetworkCredential(ConfigurationManager.AppSettings["FTPUser"].ToString(), ConfigurationManager.AppSettings["FTPPassword"].ToString());
for (int i = 0; i <= directories.Count - 1; i++)
{
if (directories[i].Contains("."))
{
string path = ConfigurationManager.AppSettings["FTPUrl"].ToString() + strFolderName
+ (blIsSingleFile ? "" : "/" + directories[i].ToString());
string trnsfrpth = pathToStore + directories[i].ToString();
ftpClient.DownloadFile(path, trnsfrpth);
}
}
return true;
}
}
catch (Exception ex)
{
FileLogger.logMessage(ex.Message);
if (FileLogger.IsDebuggingLogEnabled)
{
FileLogger.HandleError("FTPFileDownload", ex, "Common Helper Error 4:");
}
return false;
}
}
I don't know what is going wrong. Either my code is wrong, or the problem lies in the settings or environment of the FTP server.
Please suggest.

Why am I receiving "The process cannot access the file because it is being used by another process"?

I'm trying to process a set of files. I have a given number of txt files, which I'm currently joining into one txt file to apply filters to. Creating the single file from multiple files works great, but I have one error I can't seem to get around and one question.
1 - I'm getting an error when I try to read the newly created file so I can apply the filters: "The process cannot access the file because it is being used by another process."
2 - Am I approaching this in the correct or most efficient way? By that I mean: can the reading and filtering be applied before creating the concatenated file? I still need to create a new file, but it would be nice to apply everything before creating it, so that the file is already cleaned and ready for use outside the application.
Here is the current code that has the issue, including the one commented-out line that was my other attempt at releasing the file:
private DataTable processFileData(string fname, string locs2 = "0", string effDate = "0", string items = "0")
{
DataTable dt = new DataTable();
string fullPath = fname;
try
{
using (StreamReader sr = new StreamReader(File.OpenRead(fullPath)))
//using (StreamReader sr = new StreamReader(File.Open(fullPath,FileMode.Open,FileAccess.Read, FileShare.Read)))
{
while (!sr.EndOfStream)
{
string line = sr.ReadLine();
if (!String.IsNullOrWhiteSpace(line))
{
string[] headers = line.ToUpper().Split('|');
while (dt.Columns.Count < headers.Length)
{
dt.Columns.Add();
}
string[] rows = line.ToUpper().Split('|');
DataRow dr = dt.NewRow();
for (int i = 0; i < rows.Count(); i++)
{
dr[i] = rows[i];
}
dt.Rows.Add(dr);
}
}
//sr.Close();
sr.Dispose();
}
string cls = String.Format("Column6 NOT LIKE ('{0}')", String.Join("','", returnClass()));
dt.DefaultView.RowFilter = cls;
return dt;
}
catch (IOException ex)
{
Console.WriteLine(ex.Message);
return dt;
}
}
Here is the concatenation method:
private void Consolidate(string fileType)
{
string sourceFolder = @"H:\Merchant\Strategy\Signs\BACKUP TAG DATA\Wave 6\" + sfld;
string destinationFile = @"H:\Merchant\Strategy\Signs\BACKUP TAG DATA\Wave 6\" + sfld + @"\"+ sfld + @"_consolidation.txt";
// Specify wildcard search to match TXT files that will be combined
string[] filePaths = Directory.GetFiles(sourceFolder, fileType);
StreamWriter fileDest = new StreamWriter(destinationFile, true);
int i;
for (i = 0; i < filePaths.Length; i++)
{
string file = filePaths[i];
string[] lines = File.ReadAllLines(file);
if (i > 0)
{
lines = lines.Skip(1).ToArray(); // Skip header row for all but first file
}
foreach (string line in lines)
{
fileDest.WriteLine(line);
}
}
if (sfld == "CLR")
{
clrFilter(destinationFile);
}
if (sfld == "UPL")
{
uplFilter(destinationFile);
}
if (sfld == "HD")
{
hdFilter(destinationFile);
}
if (sfld == "PD")
{
pdFilter(destinationFile);
}
fileDest.Close();
fileDest.Dispose();
}
What I'm trying to accomplish is reading a minimum of 2 or 3 and as many as 13 txt files and applying some filtering. But I'm getting this error:
"The process cannot access the file because it is being used by another process."
You're disposing the stream reader explicitly with the following line:
sr.Dispose();
A using statement already disposes the reader when execution leaves the block, so remove the explicit Dispose line.
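One more thing may be worth checking, going only by the code shown in the question: Consolidate calls the filter methods while fileDest is still open for writing, and File.OpenRead asks for read-only sharing, which should fail while another handle has the file open for writing. The sketch below closes the writer before anything reads the file back. `ConsolidateSketch` and `Merge` are hypothetical names, and the sketch overwrites the destination instead of appending, which is an assumption about the intended behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ConsolidateSketch
{
    // Hypothetical helper: merges the source files into destinationFile,
    // skipping the header row of every file but the first. The using block
    // disposes the writer before the method returns, so a later
    // File.OpenRead on destinationFile does not collide with it.
    public static void Merge(IReadOnlyList<string> sourceFiles, string destinationFile)
    {
        // Assumption: overwrite (append = false) rather than append as in the question.
        using (var writer = new StreamWriter(destinationFile, false))
        {
            for (int i = 0; i < sourceFiles.Count; i++)
            {
                IEnumerable<string> lines = File.ReadLines(sourceFiles[i]);
                if (i > 0)
                {
                    lines = lines.Skip(1); // skip header row for all but the first file
                }
                foreach (string line in lines)
                {
                    writer.WriteLine(line);
                }
            }
        } // writer is closed here; run any filter that reads the file after this point
    }
}
```

With this shape, calls such as clrFilter(destinationFile) would go after Merge returns rather than before fileDest.Close().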

FileInfo remove file from list

I have a method in C# which gets files in a directory this way:
FileInfo[] fileInfo = new DirectoryInfo(mypath).GetFiles();
Some of the files in the directory are not the ones we need to process (the only way to know is by its content, not the file extension) so we would like to remove them from the FileInfo list (not from disk).
I was searching for a simple way to exclude a file in the FileInfo array but there seems not to be a way.
Here's the whole code which checks the files we only need in the directory the user selects:
int number_of_files = fileInfo.Length;
for (int i = 0; i < number_of_files ; ++i)
{
string file= fileInfo[i].FullName;
BinaryReader br = new BinaryReader(new FileStream(file, FileMode.Open, FileAccess.Read), Encoding.ASCII);
byte[] preamble = new byte[132];
br.Read(preamble, 0, 132);
if (preamble[128] != 'D' || preamble[129] != 'I' || preamble[130] != 'C' || preamble[131] != 'M')
{
if (preamble[0] + preamble[1] != 0008)
{
return; //Rather than return, remove the file in question from the list....
}
}
br.Dispose();
}
Any ideas how I can do this?
Instead of removing the file from the FileInfo[] array, consider just creating a separate list that collects all files that you do want to keep:
FileInfo[] files = new DirectoryInfo(mypath).GetFiles();
List<FileInfo> filteredFiles = new List<FileInfo>();
foreach (FileInfo file in files)
{
    using (var stream = new FileStream(file.FullName, FileMode.Open, FileAccess.Read))
    using (var br = new BinaryReader(stream, Encoding.ASCII))
    {
        byte[] preamble = new byte[132];
        br.Read(preamble, 0, 132);
        if (preamble[128] != 'D' || preamble[129] != 'I' || preamble[130] != 'C' || preamble[131] != 'M')
        {
            if (preamble[0] + preamble[1] != 0008)
            {
                // skip this file
                continue;
            }
        }
        // keep the file
        filteredFiles.Add(file);
        // do something else with the file
    }
}
You should think about whether reading the files just to filter them is really worth the effort, though. If you later end up processing the filtered files too, you should really consider doing that at the same time, so you don't have to open each file twice (once to figure out that you want to keep it, and once to actually process it). That way, you could also get rid of the filteredFiles list, since you can just skip the files you are not interested in and process the other ones.
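A minimal sketch of that single-pass idea follows. `DicomScan`, `HasDicmMarker` and `ProcessMatching` are hypothetical names, and the sketch keeps only the "DICM" check at bytes 128 to 131; the question's second check on the first two bytes is left out for brevity.

```csharp
using System;
using System.IO;

public static class DicomScan
{
    // True when bytes 128..131 spell "DICM" (the first check used in the question).
    public static bool HasDicmMarker(byte[] preamble)
    {
        return preamble.Length >= 132
            && preamble[128] == 'D' && preamble[129] == 'I'
            && preamble[130] == 'C' && preamble[131] == 'M';
    }

    // Hypothetical single-pass loop: each file is opened once, its header is
    // checked, and matching files are handed to `process` on the same stream.
    // Returns the number of files that were processed.
    public static int ProcessMatching(string directory, Action<FileInfo, Stream> process)
    {
        int processed = 0;
        foreach (FileInfo file in new DirectoryInfo(directory).GetFiles())
        {
            using (FileStream stream = file.OpenRead())
            {
                byte[] preamble = new byte[132];
                int read = stream.Read(preamble, 0, preamble.Length);
                if (read < preamble.Length || !HasDicmMarker(preamble))
                {
                    continue; // not a file we want; the using block still closes it
                }
                stream.Position = 0; // rewind so the callback sees the whole file
                process(file, stream);
                processed++;
            }
        }
        return processed;
    }
}
```

The callback receives the already opened stream, so no file is opened a second time.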

Retrieve Specific Files from Directory Using FTP

I'm creating a C# application that needs to FTP to a directory to retrieve a file list. The following code works just fine. However, the folder that I'm FTPing to contains around 92,000 files. This code will not work in the way that I want it to for a file list of that size.
I'm looking only for files that begin with the string "c-". After doing some research, I'm not even sure how to begin trying to solve this issue. Is there any way I can modify this existing code for it to retrieve only those files?
public string[] getFileList() {
string[] downloadFiles;
StringBuilder result = new StringBuilder();
FtpWebRequest reqFTP;
try {
reqFTP = (FtpWebRequest)FtpWebRequest.Create(new Uri(ftpHost));
reqFTP.UseBinary = true;
reqFTP.Credentials = new NetworkCredential(ftpUser, ftpPass);
reqFTP.Method = WebRequestMethods.Ftp.ListDirectory;
WebResponse response = reqFTP.GetResponse();
StreamReader reader = new StreamReader(response.GetResponseStream());
string line = reader.ReadLine();
while (line != null) {
result.Append(line);
result.Append("\n");
line = reader.ReadLine();
}
// to remove the trailing '\n'
result.Remove(result.ToString().LastIndexOf('\n'), 1);
reader.Close();
response.Close();
return result.ToString().Split('\n');
}
catch (Exception ex) {
System.Windows.Forms.MessageBox.Show(ex.Message);
downloadFiles = null;
return downloadFiles;
}
}
I think LIST doesn't support wildcard search; in fact, this may vary between FTP server platforms and depends on which commands each one supports.
You will need to download all the file names in the FTP directory using LIST, probably asynchronously.
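A minimal sketch of the client-side filtering this implies: read the name listing line by line and keep only the names with the wanted prefix, instead of building one large string first. `FtpNameFilter` and `ReadNamesWithPrefix` are hypothetical names; the reader is assumed to wrap the response stream of a ListDirectory request such as the one in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class FtpNameFilter
{
    // Hypothetical helper: reads one name per line and keeps only those
    // that start with `prefix` (for example "c-").
    public static List<string> ReadNamesWithPrefix(TextReader reader, string prefix)
    {
        var matches = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Some servers return "dir/name"; compare against the last segment only.
            string name = line.Substring(line.LastIndexOf('/') + 1);
            if (name.StartsWith(prefix, StringComparison.Ordinal))
            {
                matches.Add(name);
            }
        }
        return matches;
    }
}
```

In the question's getFileList, this could be called as `ReadNamesWithPrefix(new StreamReader(response.GetResponseStream()), "c-")`. The full listing is still transferred; only the in-memory result shrinks.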
Here is an alternative implementation along similar lines. I've tested this with as many as 1000 FTP files; it might work for you. Complete source code can be found here.
public List<ftpinfo> browse(string path) //eg: "ftp.xyz.org", "ftp.xyz.org/ftproot/etc"
{
FtpWebRequest request=(FtpWebRequest)FtpWebRequest.Create(path);
request.Method=WebRequestMethods.Ftp.ListDirectoryDetails;
List<ftpinfo> files=new List<ftpinfo>();
//request.Proxy = System.Net.WebProxy.GetDefaultProxy();
//request.Proxy.Credentials = CredentialCache.DefaultNetworkCredentials;
request.Credentials = new NetworkCredential(_username, _password);
Stream rs=(Stream)request.GetResponse().GetResponseStream();
OnStatusChange("CONNECTED: " + path, 0, 0);
StreamReader sr = new StreamReader(rs);
string strList = sr.ReadToEnd();
string[] lines=null;
if (strList.Contains("\r\n"))
{
lines=strList.Split(new string[] {"\r\n"},StringSplitOptions.None);
}
else if (strList.Contains("\n"))
{
lines=strList.Split(new string[] {"\n"},StringSplitOptions.None);
}
//now decode this string array
if (lines==null || lines.Length == 0)
return null;
foreach(string line in lines)
{
if (line.Length==0)
continue;
//parse line
Match m= GetMatchingRegex(line);
if (m==null) {
//failed
throw new ApplicationException("Unable to parse line: " + line);
}
ftpinfo item=new ftpinfo();
item.filename = m.Groups["name"].Value.Trim('\r');
item.path = path;
item.size = Convert.ToInt64(m.Groups["size"].Value);
item.permission = m.Groups["permission"].Value;
string _dir = m.Groups["dir"].Value;
if(_dir.Length>0 && _dir != "-")
{
item.fileType = directoryEntryTypes.directory;
}
else
{
item.fileType = directoryEntryTypes.file;
}
try
{
item.fileDateTime = DateTime.Parse(m.Groups["timestamp"].Value);
}
catch
{
item.fileDateTime = DateTime.MinValue; //null;
}
files.Add(item);
}
return files;
}

c# Remove rows from csv

I have two CSV files. The first file contains a list of users, and the second contains a list of duplicate users. I'm trying to remove the rows in the first file that also appear in the second file.
Here's the code I have so far:
StreamWriter sw = new StreamWriter(path3);
StreamReader sr = new StreamReader(path2);
string[] lines = File.ReadAllLines(path);
foreach (string line in lines)
{
string user = sr.ReadLine();
if (line != user)
{
sw.WriteLine(line);
}
}
File 1 example:
Modify,ABAMA3C,Allpay - Free State - HO,09072701
Modify,ABCG327,Processing Centre,09085980
File 2 Example:
Modify,ABAA323,Group HR Credit Risk & Finance
Modify,ABAB959,Channel Sales & Service,09071036
Any suggestions?
Thanks.
All you'd have to do is change the following file paths in the code below and you will get a file back (file one) without the duplicate users from file 2. This code was written with the idea in mind that you want something that is easy to understand. Sure there are other more elegant solutions, but I wanted to make it as basic as possible for you:
(Paste this in the main method of your program)
string line;
StreamReader sr = new StreamReader(@"C:\Users\J\Desktop\texts\First.txt");
StreamReader sr2 = new StreamReader(@"C:\Users\J\Desktop\texts\Second.txt");
List<String> fileOne = new List<string>();
List<String> fileTwo = new List<string>();
while (sr.Peek() >= 0)
{
line = sr.ReadLine();
if(line != "")
{
fileOne.Add(line);
}
}
sr.Close();
while (sr2.Peek() >= 0)
{
line = sr2.ReadLine();
if (line != "")
{
fileTwo.Add(line);
}
}
sr2.Close();
var t = fileOne.Except(fileTwo);
StreamWriter sw = new StreamWriter(@"C:\Users\justin\Desktop\texts\First.txt");
foreach(var z in t)
{
sw.WriteLine(z);
}
sw.Flush();
sw.Close();
If this is not homework, but a production thing, and you can install assemblies, you'll save 3 hours of your life if you swallow your pride and use a piece of the VB library:
CSV has many special cases (a CR/LF inside a quoted field is legal, there are different types of quotes, and so on). This will handle anything Excel will export or import.
Sample code to load a 'Person' class pulled from a program I used it in:
Using Reader As New Microsoft.VisualBasic.FileIO.TextFieldParser(CSVPath)
Reader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
Reader.Delimiters = New String() {","}
Reader.TrimWhiteSpace = True
Reader.HasFieldsEnclosedInQuotes = True
While Not Reader.EndOfData
Try
Dim st2 As New List(Of String)
st2.addrange(Reader.ReadFields())
If iCount > 0 Then ' ignore first row = field names
Dim p As New Person
p.CSVLine = st2
p.FirstName = st2(1).Trim
If st2.Count > 2 Then
p.MiddleName = st2(2).Trim
Else
p.MiddleName = ""
End If
p.LastNameSuffix = st2(0).Trim
If st2.Count > 5 Then ' index 5 needs at least 6 fields
p.TestCase = st2(5).Trim
End If
If st2(3) > "" Then
p.AccountNumbersFromCase.Add(st2(3))
End If
While p.CSVLine.Count < 15
p.CSVLine.Add("")
End While
cases.Add(p)
End If
Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message & " is not valid and will be skipped.")
End Try
iCount += 1
End While
End Using
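Since the question is in C#, here is a rough C# sketch of the same parser setup. It assumes the project references the Microsoft.VisualBasic assembly; `CsvSketch` and `ParseCsv` are hypothetical names, and the Person mapping from the VB sample is left out.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvSketch
{
    // Hypothetical helper: reads every record from `input` into a string array,
    // letting TextFieldParser deal with quoted fields and embedded commas.
    public static List<string[]> ParseCsv(TextReader input)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(input))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            parser.TrimWhiteSpace = true;
            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

For a file on disk, `ParseCsv(new StreamReader(path))` would be one way to call it. Disposing the parser also closes the reader passed in.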
Use this to close the streams properly:
using(var sw = new StreamWriter(path3))
using(var sr = new StreamReader(path2))
{
string[] lines = File.ReadAllLines(path);
foreach (string line in lines)
{
string user = sr.ReadLine();
if (line != user)
{
sw.WriteLine(line);
}
}
}
For help with the actual removal or comparison logic, answer El Ronnoco's comment above.
You need to close the streams or use a using statement:
sw.Close();
using(StreamWriter sw = new StreamWriter(@"c:\test3.txt"))
You can use LINQ...
class Program
{
static void Main(string[] args)
{
var fullList = "TextFile1.txt".ReadAsLines();
var removeThese = "TextFile2.txt".ReadAsLines();
//Change this line if you need to change the filter results.
//Note: this assume you are wanting to remove results from the first
// list when the entire record matches. If you want to match on
// only part of the list you will need to split/parse the records
// and then filter your results.
var cleanedList = fullList.Except(removeThese);
cleanedList.WriteAsLinesTo("result.txt");
}
}
public static class Tools
{
public static IEnumerable<string> ReadAsLines(this string filename)
{
using (var reader = new StreamReader(filename))
while (!reader.EndOfStream)
yield return reader.ReadLine();
}
public static void WriteAsLinesTo(this IEnumerable<string> lines, string filename)
{
using (var writer = new StreamWriter(filename) { AutoFlush = true, })
foreach (var line in lines)
writer.WriteLine(line);
}
}
using (var sw = new StreamWriter(path3))
using (var sr = new StreamReader(path))
{
    // Collect the user IDs (second column) that should be removed.
    string[] arrRemove = File.ReadAllLines(path2);
    var listRemove = new HashSet<string>();
    foreach (string s in arrRemove)
    {
        string[] sa = s.Split(',');
        if (sa.Length < 2) continue;
        listRemove.Add(sa[1].ToUpper());
    }
    // Copy every line whose user ID is not in the removal set.
    string line = sr.ReadLine();
    while (line != null)
    {
        string[] sa = line.Split(',');
        if (sa.Length < 2 || !listRemove.Contains(sa[1].ToUpper()))
            sw.WriteLine(line);
        line = sr.ReadLine();
    }
}
