LINQ or Lambda for two for loops - c#

The code I have written works fine; this question is purely for educational purposes. I want to know how others would do this better and more cleanly. I especially hate the way I use two nested loops to get the data. There has to be a more efficient way.
I tried to do it with LINQ, but one of the collections holds class instances and the other is just a string[], so I couldn't figure out how to use it.
I have a document name table in my SQL database and files in a content folder.
I have two lists: ListOfFileNamesSavedInTheDB and ListOfFileNamesInTheFolder.
Basically, I get all the file names saved in the database and check whether each one exists in the folder; if it does not, I delete the file name from the database.
var clientDocList = documentRepository.Documents.Where(c => c.ClientID == clientID).ToList();
if (Directory.Exists(directoryPath))
{
    string[] fileList = Directory.GetFiles(directoryPath).Select(Path.GetFileName).ToArray();
    foreach (var clientDoc in clientDocList)
    {
        bool fileNotExist = true;
        foreach (var file in fileList)
        {
            if (clientDoc.DocFileName.Trim().ToUpper() == file.ToUpper().Trim())
            {
                fileNotExist = false;
                break;
            }
        }
        if (fileNotExist)
        {
            documentRepository.Delete(clientDoc);
        }
    }
}

I am not exactly sure of how you want your code to work but I believe you need something like this
//string TextResult = "";
ClientDocList documentRepository = GetClientDocList();
var directoryPath = "";
var clientID = 1;
var clientDocList = documentRepository.Documents.Where(c => c.ClientID == clientID).ToList();
if (Directory.Exists(directoryPath) || true) // I need to pass your condition
{
string[] files = new string[] { "file1", "file5", "file6" };
List<string> fileList = files.Select(x => x.Trim().ToUpper()).ToList(); // I like working with lists, if you want an array it's ok
foreach (var clientDoc in clientDocList.Where(c => !fileList.Contains(c.DocFileName.Trim().ToUpper())))
{
//TextResult += $" {clientDoc.DocFileName} does not exists so you have to delete it from db";
documentRepository.Delete(clientDoc);
}
}
//Console.WriteLine(TextResult);
To be honest, I really don't like this line
fileList = files.Select(x => x.Trim().ToUpper()).ToList()
so I would suggest you add a helper function comparing the list of file names to the specific file name
public static bool TrimContains(List<string> names, string name)
{
return names.Any(x => x.Trim().Equals(name.Trim(), StringComparison.InvariantCultureIgnoreCase));
}
and your final code would become
List<string> fileList = new List<string>() { "file1", "file5", "file6" };
foreach (var clientDoc in clientDocList.Where(c => !TrimContains(fileList, c.DocFileName)))
{
//TextResult += $" {clientDoc.DocFileName} does not exists so you have to delete it from db";
documentRepository.Delete(clientDoc);
}

Instead of retrieving all the documents from the database and doing the check in memory, I suggest finding the documents that don't exist in the folder in a single query:
if (Directory.Exists(directoryPath))
{
var fileList = Directory.GetFiles(directoryPath).Select(Path.GetFileName);
var clientDocList = documentRepository.Documents.Where(c => c.ClientID == clientID && !fileList.Contains(c.DocFileName.Trim())).ToList();
documentRepository.Documents.RemoveRange(clientDocList);
}
Note: this is just a sample to demonstrate the idea; it may contain syntax errors since I don't have an IDE with me at the moment, but the idea is there.
This code is not only shorter but also more efficient, since it uses a single query to retrieve only the documents to delete from the database. I assume the number of files in the folder is small enough for EF to translate the list into SQL.
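If the comparison has to stay in memory, the inner loop can also be replaced by a set lookup. The following is only a minimal sketch: FindMissingFiles is a hypothetical helper name, and the two arrays stand in for the names that would really come from documentRepository.Documents and Directory.GetFiles(directoryPath).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns the database file names that have no matching file on disk.
// Names are trimmed and compared case-insensitively, as in the question.
static List<string> FindMissingFiles(IEnumerable<string> dbFileNames, IEnumerable<string> folderFileNames)
{
    // Build the lookup once; each Contains call is then O(1) on average.
    var onDisk = new HashSet<string>(
        folderFileNames.Select(f => f.Trim()),
        StringComparer.OrdinalIgnoreCase);

    return dbFileNames.Where(name => !onDisk.Contains(name.Trim())).ToList();
}

// Stand-in data (assumption); the real names would come from the repository and the folder.
var dbNames = new[] { "Report.pdf ", "invoice.docx", "old-scan.png" };
var folderNames = new[] { "REPORT.PDF", "Invoice.docx" };

foreach (var name in FindMissingFiles(dbNames, folderNames))
{
    Console.WriteLine($"{name} is missing from the folder");
}
```

With a set, the work grows roughly with the sum of the two list sizes rather than their product, which matters once the folder holds many files.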

Related

Get files with same/ similar name from array

I have multiple objects in an array whose names have the format:
id_name_date_filetype.
I need to take all the objects with, let's say, the same id or the same name and insert them into a new array.
With the GetFiles method I already have all the objects in one array and I have their names, but I don't know how to differentiate them.
I have a foreach in which I go through all the objects, but I'm kind of stuck.
Any hints as to what I should do?
//Process the files
string[] filelist = Directory.GetFiles(SourceDirectory, "*.tsv*", SearchOption.TopDirectoryOnly).Select(filename => Path.GetFullPath(filename)).Distinct().ToArray();
foreach (string file in filelist)
{
string[] fileNameSplit = file.Split('_');
switch (fileNameSplit.Last().ToLower())
{
case "assets.tsv":
assets = ReadDataFromCsv<Asset>(file);
break;
case "financialaccounts.tsv":
financialAccounts = ReadDataFromCsv<FinancialAccount>(file);
break;
case "households.tsv":
households = ReadDataFromCsv<Household>(file);
break;
case "registrations.tsv":
registrations = ReadDataFromCsv<Registration>(file);
break;
case "representatives.tsv":
representatives = ReadDataFromCsv<Representative>(file);
break;
}
}
// Find all files from one firm and insert them in a list
foreach (string file in filelist)
{
}
Here is a LINQ approach, as I proposed in my comment:
First get all distinct IDs from your filelist:
string[] allDistinctIDs = filelist.Select(x => x.Split('_').First()).Distinct().ToArray();
Now you can iterate through the list of IDs and compare each value:
for (int i = 0; i < allDistinctIDs.Length; i++)
{
    string[] allSameIDStrings = filelist.Where(x => x.Split('_').First() == allDistinctIDs[i]).ToArray();
}
Basically, you split every item by '_' and compare the first part (the ID) of the string with each item from your list of distinct IDs.
Another approach would be to use GroupBy.
// example input
string[] filelist = {
    "123_Name1_xxx_Assets.tsv",
    "456_Name2_xxx_Assets.tsv",
    "123_Name3_xxx_Households.tsv",
    "456_Name4_xxx_Households.tsv"};
IEnumerable<IGrouping<string, string>> ID_Groups = filelist.GroupBy(x=>x.Split('_').First());
This gives you a collection of all file names grouped by ID:
each element of ID_Groups is a group of items with the same ID. You can filter them by file name:
foreach (var id_group in ID_Groups)
{
assets = ReadDataFromCsv<Asset>(id_group.FirstOrDefault(x=>x.ToLower().Contains("assets.tsv")));
// and so on
households = ReadDataFromCsv<Household>(id_group.FirstOrDefault(x=>x.ToLower().Contains("households.tsv")));
}
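The grouping idea extends to grouping by name as well as by ID. The sketch below uses ToLookup so that a group can be fetched by key; Part is a hypothetical helper, and the sample file names are made up for illustration. It applies Path.GetFileName first because the question's list holds full paths, and a directory name could itself contain an underscore.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper: returns the underscore-separated part at the given index,
// or an empty string when the file name has too few parts.
static string Part(string path, int index)
{
    var parts = Path.GetFileName(path).Split('_');
    return index < parts.Length ? parts[index] : string.Empty;
}

// Made-up sample names in the id_name_date_filetype format.
string[] fileList =
{
    "123_Name1_xxx_Assets.tsv",
    "456_Name2_xxx_Assets.tsv",
    "123_Name1_xxx_Households.tsv",
    "456_Name4_xxx_Households.tsv"
};

// One lookup per criterion: index 0 is the ID, index 1 is the name.
var byId = fileList.ToLookup(f => Part(f, 0));
var byName = fileList.ToLookup(f => Part(f, 1));

foreach (var group in byId)
{
    Console.WriteLine($"{group.Key}: {string.Join(", ", group)}");
}

// All files that share the name "Name1", as a new array.
string[] sameName = byName["Name1"].ToArray();
Console.WriteLine(sameName.Length);
```

Asking a lookup for a key that does not exist returns an empty sequence rather than throwing, which keeps the calling code simple.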
You gotta define what is "Similar" to you. It could be the initial letter of the file name? Half of it? Whole filename?
This function should do more or less what you want without using Linq or something more complex than loops.
var IDOffileNameIWant = object.GetFiles()[0].id;
List<string> arrayThatContainsSimilar = new List<string>();
foreach(var file in object.GetFiles())
{
if(file.Name.Split('_')[0].Contains(IDOffileNameIWant))
{
arrayThatContainsSimilar.Add(file.Name);
}
}
It's very basic and can be refined, but you gotta give more details on what is the exact result you want to obtain.
Since you're still struggling, here's a working example:
List<string> files = new List<string>() {
"123_novica_file1", "123_novica_file3", "123_novica_file2", "456_myfilename_file1",
"789_myfilename_file1", "101_novica_file2", "102_novica_file3"};
List<string> filesbyID = new List<string>();
List<string> filesbyName = new List<string>();
string theIDPattern = "123";
string theFileNamePattern = "myfilename";
foreach(var file in files)
{
//splitting the filename and checking by ID
if(file.Split('_')[0].Contains(theIDPattern))
{
filesbyID.Add(file);
}
//splitting the filename and checking by name
if (file.Split('_')[1].Contains(theFileNamePattern))
{
filesbyName.Add(file);
}
}
Result:
files by id:
123_novica_file1
123_novica_file3
123_novica_file2
files by name:
456_myfilename_file1
789_myfilename_file1

How to get query names that reference a particular iteration path

Using the selected project name, I have loaded the iteration paths. Now I need to get the names of the queries that reference the selected iteration path.
Code to load the iteration paths for a given project name:
private void LoadIterationPaths(string projectName)
{
var tfs = TfsTeamProjectCollectionFactory.GetTeamProjectCollection(_tfs.Uri);
var wiStore = tfs.GetService<WorkItemStore>();
var projCollections = wiStore.Projects;
var detailsOfTheSelectedProject = projCollections.Cast<Project>().Where(project => !String.IsNullOrEmpty(_selectedTeamProject.Name))
.FirstOrDefault(project => project.Name.Contains(_selectedTeamProject.Name));
var iterationPathsList = GetIterationPaths(detailsOfTheSelectedProject);
foreach (var iterationPath in iterationPathsList.Where(iterationPath => iterationPath.Contains(projectName)))
{
cmbIterationPath.Items.Add(iterationPath);
}
cmbIterationPath.Enabled = cmbIterationPath.Items.Count > 0;
}
Now I need to get the list of query names that reference the selected iteration path. Thanks.
Note: I am able to get all the query names in a project, but that is not what I need.
For that I used the code below:
foreach (StoredQuery qi in detailsOfTheSelectedProject.StoredQueries)
{
cmbQueries.Items.Add(qi.Name);
}
Your code should look like this:
string selectedIterationPath = ...
foreach (StoredQuery qi in detailsOfTheSelectedProject.StoredQueries) {
    if (qi.QueryText.Contains(selectedIterationPath)) {
        cmbQueries.Items.Add(qi.Name);
    }
}
This is what Beytan Kurt and I suggested in the comments.
Instead of a dumb Contains, you should use a regular expression to avoid false positives and negatives.
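A regular expression check could look roughly like the sketch below. It assumes the query text compares [System.IterationPath] against a single-quoted path using = or UNDER; that shape is an assumption about the WIQL in your stored queries, and ReferencesIterationPath is a hypothetical helper name, so adjust the pattern to the queries you actually have.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: true when the query text compares [System.IterationPath]
// against exactly the given path (using = or UNDER), ignoring case.
static bool ReferencesIterationPath(string queryText, string iterationPath)
{
    // Regex.Escape keeps backslashes and other special characters in the path literal.
    string pattern =
        @"\[System\.IterationPath\]\s*(=|under)\s*'" + Regex.Escape(iterationPath) + "'";
    return Regex.IsMatch(queryText, pattern, RegexOptions.IgnoreCase);
}

// Made-up query text for illustration.
string wiql = @"SELECT [System.Id] FROM WorkItems WHERE [System.IterationPath] UNDER 'MyProject\Sprint 10'";

Console.WriteLine(ReferencesIterationPath(wiql, @"MyProject\Sprint 10")); // True
Console.WriteLine(ReferencesIterationPath(wiql, @"MyProject\Sprint 1"));  // False, although Contains would match
```

Requiring the closing quote right after the path is what rules out the "Sprint 1" versus "Sprint 10" false positive.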

How to efficiently compare two lists with 500k objects and strings

So I have a main directory with subfolders and around 500k images. I know a lot of these images do not exist in my database, and I want to know which ones so that I can delete them.
This is the code I have so far:
var listOfAdPictureNames = ImageDB.GetAllAdPictureNames();
var listWithFilesFromImageFolder = ImageDirSearch(adPicturesPath);
var result = listWithFilesFromImageFolder.Where(p => !listOfAdPictureNames.Any(q => p.FileName == q));
var differenceList = result.ToList();
listOfAdPictureNames is of type List<string>.
Here is the model that I'm returning from ImageDirSearch:
public class CheckNotUsedAdImagesModel
{
public List<ImageDirModel> ListWithUnusedAdImages { get; set; }
}
public class ImageDirModel
{
public string FileName { get; set; }
public string Path { get; set; }
}
And here is the recursive method to get all images from my folder:
private List<ImageDirModel> ImageDirSearch(string path)
{
string adPicturesPath = ConfigurationManager.AppSettings["AdPicturesPath"];
List<ImageDirModel> files = new List<ImageDirModel>();
try
{
foreach (string f in Directory.GetFiles(path))
{
var model = new ImageDirModel();
model.Path = f.ToLower();
model.FileName = Path.GetFileName(f.ToLower());
files.Add(model);
}
foreach (string d in Directory.GetDirectories(path))
{
files.AddRange(ImageDirSearch(d));
}
}
catch (System.Exception excpt)
{
throw new Exception(excpt.Message);
}
return files;
}
The problem I have is that this line:
var result = listWithFilesFromImageFolder.Where(p => !listOfAdPictureNames.Any(q => p.FileName == q));
takes over an hour to complete. I want to know if there is a better way to check whether my images folder contains images that don't exist in my database.
Here is the method that gets all the image names from my database layer:
public static List<string> GetAllAdPictureNames()
{
List<string> ListWithAllAdFileNames = new List<string>();
using (var db = new DatabaseLayer.DBEntities())
{
ListWithAllAdFileNames = db.ad_pictures.Select(b => b.filename.ToLower()).ToList();
}
if (ListWithAllAdFileNames.Count < 1)
return new List<string>();
return ListWithAllAdFileNames;
}
Perhaps Except is what you're looking for. Something like this:
var filesInFolderNotInDb = listWithFilesFromImageFolder.Select(p => p.FileName).Except(listOfAdPictureNames).ToList();
Should give you the files that exist in the folder but not in the database.
Instead of repeating a linear search of the list for every file, sort the second list, listOfAdPictureNames, once (any O(n log n) sort will do) and then check for existence with a binary search. Each lookup then costs O(log n), whereas the current approach scans the whole list for every file and is quadratic overall.
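The sort-then-search idea can be sketched as follows. FindUnusedFiles is a hypothetical helper, the arrays are made-up sample data, and the case-insensitive comparer is an assumption based on the ToLower calls in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns the folder file names that are not in the database list.
static List<string> FindUnusedFiles(IEnumerable<string> folderFileNames, IEnumerable<string> dbFileNames)
{
    var comparer = StringComparer.OrdinalIgnoreCase;

    // Sort a copy of the database names once.
    var sortedDbNames = dbFileNames.ToList();
    sortedDbNames.Sort(comparer);

    // BinarySearch returns a negative number when the item is absent.
    // The same comparer must be used for sorting and searching.
    return folderFileNames
        .Where(name => sortedDbNames.BinarySearch(name, comparer) < 0)
        .ToList();
}

// Made-up sample data.
var inDatabase = new[] { "b.jpg", "a.jpg", "d.jpg" };
var inFolder = new[] { "A.JPG", "c.jpg", "d.jpg", "e.jpg" };

Console.WriteLine(string.Join(", ", FindUnusedFiles(inFolder, inDatabase))); // c.jpg, e.jpg
```

A hash-based lookup such as Except or a HashSet<string> avoids the sort altogether, so it is worth timing both on the real data.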
As I said in my comment, you seem to have recreated the FileInfo class. You don't need to do this, so your ImageDirSearch can become the following:
private IEnumerable<string> ImageDirSearch(string path)
{
return Directory.EnumerateFiles(path, "*.jpg", SearchOption.TopDirectoryOnly);
}
There doesn't seem to be much gained by returning the whole file info when you only need the file name. Note that this only finds .jpg files in the top directory, but that can be changed.
The ToLower calls are quite expensive and a bit pointless, and so is the ToList when you are planning on querying again, so you can get rid of both and return an IEnumerable again (this is in the GetAllAdPictureNames method).
Then your comparison can use Equals and ignore case. EnumerateFiles returns full paths, so take the file name first:
!listOfAdPictureNames.Any(q => Path.GetFileName(p).Equals(q, StringComparison.InvariantCultureIgnoreCase));
One more thing that will probably help is removing items from the list of file names as they are found; this should make searching the list quicker each time one is removed, since there is less to iterate through.

How to delete back up files

I need to delete files with ".bak" and ".csv.bak" extensions. I use .NET with C#.
I tried it like this:
string srcDir = @"D:\Backup";
string[] bakList = Directory.GetFiles(srcDir,".bak");
if (Directory.Exists(srcDir))
{
foreach (string f in bakList)
{
File.Delete(f);
}
}
But when debugging, the bakList array is empty.
Directory.GetFiles() is not loading the file names into the array. I can't figure out what is wrong with my code.
You need to add * before .bak in the GetFiles() call:
string srcDir = @"D:\Backup";
// Check before calling GetFiles, which throws if the directory does not exist.
if (Directory.Exists(srcDir))
{
    string[] bakList = Directory.GetFiles(srcDir, "*.bak");
    foreach (string f in bakList)
    {
        File.Delete(f);
    }
}
If you need to search for both types, maybe this works better:
var files = Directory.GetFiles(srcDir, "*.*")
.Where(s => s.EndsWith(".bak"));
If your file name is
"Data Logger[2].csv.bak",
go to the properties and check the type of file. It will be something like
"1 File (.1)". The file has a number as its final extension, so I used this:
string[] bk = Directory.GetFiles(srcDir, "*.bak.*");
foreach (string f in bk)
{
File.Delete(f);
}
It's working.
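Putting the pieces together, a defensive version might look like the sketch below. DeleteBackups is a hypothetical helper name, and the demo runs in a throwaway temp folder rather than D:\Backup. The pattern "*.bak" already covers ".csv.bak" files, since those names also end in ".bak".

```csharp
using System;
using System.IO;

// Hypothetical helper: deletes every file in the folder whose name ends in ".bak"
// and returns how many files were deleted.
static int DeleteBackups(string folder)
{
    // Check first: GetFiles throws DirectoryNotFoundException for a missing folder.
    if (!Directory.Exists(folder))
        return 0;

    int deleted = 0;
    foreach (string file in Directory.GetFiles(folder, "*.bak"))
    {
        // On .NET Framework a three-character extension pattern can also match
        // longer extensions that start with it, so re-check the ending.
        if (file.EndsWith(".bak", StringComparison.OrdinalIgnoreCase))
        {
            File.Delete(file);
            deleted++;
        }
    }
    return deleted;
}

// Demo in a temporary folder with made-up file names.
string demoDir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
Directory.CreateDirectory(demoDir);
File.WriteAllText(Path.Combine(demoDir, "data.bak"), "");
File.WriteAllText(Path.Combine(demoDir, "log.csv.bak"), "");
File.WriteAllText(Path.Combine(demoDir, "keep.csv"), "");

Console.WriteLine(DeleteBackups(demoDir)); // 2
Directory.Delete(demoDir, recursive: true);
```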

Get list of titles from xml files

I am trying to get the titles of XML files from a folder called "bugs".
My code:
public virtual List<IBug> FillBugs()
{
string folder = xmlStorageLocation + "bugs" + Path.DirectorySeparatorChar;
List<IBug> bugs = new List<IBug>();
foreach (string file in Directory.GetFiles(folder, "*.xml", SearchOption.TopDirectoryOnly))
{
var q = from b in bugs
select new IBug
{
Title = b.Title,
Id = b.Id,
};
return q.ToList();
}
return bugs;
}
But I'm not getting the titles from all the XML files in the folder "bugs".
The biggest problem is getting each file as a single string and not a string[].
Your code as written doesn't make any sense. Perhaps you meant something more like this:
public virtual List<IBug> FillBugs()
{
// is this actually correct or did you mix up the concatenation order?
// either way, I suggest Path.Combine() instead
string folder = xmlStorageLocation + "bugs" + Path.DirectorySeparatorChar;
List<IBug> bugs = new List<IBug>();
foreach (string file in Directory.GetFiles(folder, "*.xml",
SearchOption.TopDirectoryOnly))
{
// i guess IBug is not actually an interface even though it starts
// with "I" since you made one in your code
bugs.Add(new IBug {
Title = file, Id = 0 /* don't know where you get an ID */ });
}
return bugs;
}
"from b in bugs" selects from an empty list. you need to initialize bugs from the file at the start of your foreach loop
Do you need a backslash (Path.DirectorySeparatorChar) between xmlStorageLocation and "bugs"?
You don't use file anywhere in your loop. Is that correct, or did you forget to add it to the collection?
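To actually read a title out of each file, the file has to be loaded inside the loop. The sketch below assumes each file looks like <bug><id>7</id><title>...</title></bug>; that layout is a guess, since the question does not show the XML, and ReadBugs is a hypothetical helper that returns tuples instead of the IBug type.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Linq;

// Hypothetical helper: loads every *.xml file in the folder and reads its <id> and <title>.
// The element names are assumptions about the file layout.
static List<(int Id, string Title)> ReadBugs(string folder)
{
    var bugs = new List<(int Id, string Title)>();
    foreach (string file in Directory.GetFiles(folder, "*.xml", SearchOption.TopDirectoryOnly))
    {
        XElement root = XDocument.Load(file).Root;

        // The explicit casts yield null when the element is missing.
        int id = (int?)root.Element("id") ?? 0;
        string title = (string)root.Element("title") ?? Path.GetFileNameWithoutExtension(file);

        bugs.Add((id, title));
    }
    return bugs;
}

// Demo in a temporary folder with one made-up bug file.
string demoDir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
Directory.CreateDirectory(demoDir);
File.WriteAllText(Path.Combine(demoDir, "bug7.xml"), "<bug><id>7</id><title>Crash on save</title></bug>");

foreach (var bug in ReadBugs(demoDir))
{
    Console.WriteLine($"{bug.Id}: {bug.Title}"); // 7: Crash on save
}
Directory.Delete(demoDir, recursive: true);
```

Falling back to the file name when <title> is missing is just one possible choice; throwing or skipping the file may suit the real data better.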
